Add durable execution for AgentOperator & @task.agent #64199
Merged
kaxil merged 6 commits on Mar 25, 2026
When `durable=True`, model responses and tool results are cached step-by-step via ObjectStorage. On retry, cached steps replay instead of re-executing LLM calls and tool operations.

Architecture:
- CachingModel(WrapperModel) intercepts model.request() calls
- CachingToolset(WrapperToolset) intercepts tool.call_tool() calls
- DurableStepCounter provides shared monotonic step indexing
- DurableStorage persists all steps in a single JSON file on ObjectStorage (configured via [common.ai] durable_cache_path)
- Cache file is deleted on successful task completion

Bumps pydantic-ai-slim to >=1.34.0 for WrapperModel/WrapperToolset.
- Rewrite docs section: clarify config errors, production storage, retries requirement, side effects/idempotency, cache cleanup
- Move per-step durable logs to DEBUG, add single INFO summary line showing replayed vs new steps after agent run completes
- Remove noisy storage-level log lines (save/hit/cleanup)
- Swap toolset wrapping order so CachingToolset sits inside LoggingToolset (cache logs appear within tool call log groups)
- Track replay/cache stats on DurableStepCounter
- Catch json.JSONDecodeError in _load_cache so a truncated cache file (from a crash mid-write) is treated as empty instead of crashing all subsequent retries
- Wrap save_tool_result json.dumps with clear TypeError message when a custom tool returns non-JSON-serializable values
- Fix docs: per-step logs are DEBUG, only the summary is INFO
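The truncated-cache handling described in this commit can be illustrated with a minimal, standalone sketch. The function name `load_cache` and the file layout are assumptions made for illustration; this is not the provider's actual `_load_cache`.

```python
import json
import tempfile
from pathlib import Path


def load_cache(path: Path) -> dict:
    """Return cached steps, or an empty dict if the file is missing or corrupt."""
    try:
        return json.loads(path.read_text())
    except FileNotFoundError:
        return {}
    except json.JSONDecodeError:
        # A crash mid-write can leave truncated JSON behind. Treating it as an
        # empty cache lets the retry start fresh instead of failing every time.
        return {}


with tempfile.TemporaryDirectory() as tmp:
    truncated = Path(tmp) / "cache.json"
    # Hypothetical cache content, cut off partway through the second step.
    truncated.write_text('{"0": {"kind": "tool", "result": "ok"}, "1": {"ki')
    recovered = load_cache(truncated)
    missing = load_cache(Path(tmp) / "absent.json")

print(recovered, missing)  # {} {}
```

The trade-off in this sketch is that a corrupt file discards all previously cached steps, so the retry pays for them again, but it can no longer get stuck on an unreadable cache.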
… handling
- Raise ValueError if both durable=True and enable_hitl_review=True are set (HITL regeneration bypasses durable model wrapping)
- Skip caching with a warning for non-serializable tool results (e.g. BinaryContent from MCP tools) instead of failing the task
- Document the BinaryContent limitation in agent.rst
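The two behaviors in this commit can be sketched in isolation as follows. The helper names `validate_options` and `try_cache_tool_result`, the error message, and the use of `bytes` as a stand-in for binary tool output are all hypothetical; the real checks live inside the operator and its caching toolset.

```python
import json
import logging

log = logging.getLogger("durable_sketch")  # hypothetical logger name


def validate_options(durable: bool, enable_hitl_review: bool) -> None:
    """Reject the option combination that the commit describes as unsupported."""
    if durable and enable_hitl_review:
        # The message text is illustrative, not the provider's wording.
        raise ValueError("durable=True cannot be combined with enable_hitl_review=True")


def try_cache_tool_result(steps: dict, step_index: int, result: object) -> bool:
    """Store the result if it is JSON-serializable; otherwise warn and skip."""
    try:
        encoded = json.dumps(result)
    except TypeError:
        log.warning("Step %d: tool result is not JSON-serializable; not cached", step_index)
        return False
    steps[str(step_index)] = encoded
    return True


steps = {}
cached_rows = try_cache_tool_result(steps, 0, {"rows": 3})
cached_binary = try_cache_tool_result(steps, 1, b"\x89PNG")  # bytes stand in for binary content

try:
    validate_options(durable=True, enable_hitl_review=True)
    rejected = False
except ValueError:
    rejected = True
```

In this sketch a skipped step is simply absent from the cache, so on retry that tool call would run again while its neighbors replay.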
Addressed both review comments in 35d5833:
- HITL + durable
- BinaryContent from MCP tools
gopidesupavan approved these changes on Mar 25, 2026
(Part of AIP-99 Common Data Access Pattern + AI)
Adds a `durable=True` parameter to `AgentOperator` and `@task.agent` that caches each LLM response and tool result to ObjectStorage as the agent runs. On retry, cached steps replay instantly -- no repeated LLM calls, no repeated tool execution, no repeated cost.
How it works
Architecture:
- CachingModel(WrapperModel) intercepts model.request() calls
- CachingToolset(WrapperToolset) intercepts tool.call_tool() calls
Why ObjectStorage instead of XCom?
Airflow 3.x clears XCom for a task on retry. ObjectStorage (file://, s3://, gs://, etc.) survives retry clearing. The cache file is deleted on successful completion.
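To make the replay idea concrete, here is a minimal sketch using only the standard library. It assumes a local JSON file in place of ObjectStorage, and `StepCache` and `run_agent` are hypothetical names, not the provider's `DurableStorage` or `DurableStepCounter`. It shows a shared monotonic step index, persistence after every step, replay on a second attempt, and cleanup on success.

```python
import json
import tempfile
from pathlib import Path


class StepCache:
    """Hypothetical step store: one JSON file keyed by a monotonic step index."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.steps = json.loads(path.read_text()) if path.exists() else {}
        self.next_step = 0  # shared by model and tool steps
        self.replayed = 0
        self.executed = 0

    def run_step(self, fn):
        """Return the cached result for this step, or execute and persist it."""
        key = str(self.next_step)
        self.next_step += 1
        if key in self.steps:
            self.replayed += 1
            return self.steps[key]
        result = fn()
        self.steps[key] = result
        self.path.write_text(json.dumps(self.steps))
        self.executed += 1
        return result

    def cleanup(self) -> None:
        """Delete the cache file once the run has succeeded."""
        self.path.unlink(missing_ok=True)


def run_agent(cache_path: Path, calls: list, fail_on_query: bool) -> StepCache:
    """Simulate one task attempt: one model request followed by three tool calls."""
    cache = StepCache(cache_path)

    def step(name: str, fail: bool = False):
        def fn():
            if fail:
                raise RuntimeError("transient failure")
            calls.append(name)  # records real executions only
            return f"{name}-result"
        return fn

    cache.run_step(step("model_request"))
    cache.run_step(step("list_tables"))
    cache.run_step(step("get_schema"))
    cache.run_step(step("query", fail=fail_on_query))
    cache.cleanup()
    return cache


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "durable.json"
    calls = []
    try:
        run_agent(cache_file, calls, fail_on_query=True)  # attempt 1 fails on query
    except RuntimeError:
        pass
    cache_survived_failure = cache_file.exists()
    second = run_agent(cache_file, calls, fail_on_query=False)  # attempt 2
    cache_left_behind = cache_file.exists()

print(second.replayed, second.executed)  # 3 1
```

Because the file lives outside the task's own state, it is still there when the second attempt starts, which is the property the text above attributes to ObjectStorage and not to XCom.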
Demo
Attempt 1 -- agent runs normally, caching each step. A transient failure occurs after 3 tool calls:
Attempt 2 -- cached steps replay instantly (no LLM calls, no tool execution), then the agent continues from where it left off:
Summary line at INFO level shows how many steps were replayed vs executed fresh:
Per-step detail is available at DEBUG level:
Try it yourself
1. Create a demo database and connection
2. Configure durable cache path
Add to `airflow.cfg`:
3. Create the demo DAG
Save to your dags folder (uses a `FlakyToolset` that fails once on `query` when a flag file exists):
4. Run the demo
Attempt 1 will run `list_tables`, `get_schema`, etc. (all cached), then fail on `query`. The flag file is removed on failure. Attempt 2 retries automatically -- cached steps replay instantly, then `query` succeeds.
Usage
Requires `[common.ai] durable_cache_path` in `airflow.cfg`:
Side effects
Durable execution caches return values, not side effects. Read-only tools (SQLToolset, HookToolset) replay safely. Tools with side effects (file writes, API calls) won't re-execute on replay -- the cached return value is used instead. See the docs for details on idempotency considerations.
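A small illustration of the point above, under the assumption of an in-memory cache and a hypothetical `send_email` tool: the first call performs the side effect and caches the return value, and the replayed call returns the cached value without repeating the effect.

```python
sent_emails = []  # stands in for an external side effect
cache = {}        # step index -> cached return value


def send_email(to: str) -> str:
    """Hypothetical tool with a side effect."""
    sent_emails.append(to)
    return f"sent to {to}"


def call_tool_durably(step: int, tool, *args):
    """Replay the cached return value if present; otherwise execute and cache."""
    if step in cache:
        return cache[step]
    cache[step] = tool(*args)
    return cache[step]


first = call_tool_durably(0, send_email, "ops@example.com")
replayed = call_tool_durably(0, send_email, "ops@example.com")

print(first == replayed, len(sent_emails))  # True 1
```

Whether skipping the effect on replay is correct depends on the tool: it suits an action that should happen once, but a tool whose effect must be visible in the retried attempt needs to be idempotent in its own right.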
Dependencies
Bumps pydantic-ai-slim to >=1.34.0 for WrapperModel/WrapperToolset base classes.