Python: Fix state snapshot to use deepcopy so nested mutations are detected in durable workflow activities#4518
Python: Fix state snapshot to use deepcopy so nested mutations are detected in durable workflow activities#4518moonbox3 wants to merge 6 commits intomicrosoft:mainfrom
Conversation
…#4500) Replace dict() shallow copy with copy.deepcopy() when snapshotting workflow state before activity execution. The shallow copy shared references to nested objects (dicts, lists), so in-place mutations by executors were reflected in both the snapshot and live state, producing an empty diff and preventing state updates from propagating to downstream activities. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tected in durable workflow activities Fixes microsoft#4500
moonbox3
left a comment
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 92%
✗ Correctness
This diff adds a reproduction report markdown file to the repository root. It contains a hardcoded absolute local filesystem path (/Users/evmattso/...) that leaks personal environment details into the shared repo, and the 'Full Analysis' section contains unformatted, concatenated stream-of-consciousness notes that appear to be raw LLM output pasted verbatim. More fundamentally, this file appears to be a temporary investigation artifact rather than something that belongs in the repository—it should either be excluded from the commit or placed in an issue/PR comment instead.
✗ Security Reliability
This diff adds a markdown reproduction report to the repository. It contains no executable code, so there are no injection, deserialization, or resource-leak risks. However, it embeds a developer-local absolute filesystem path that leaks internal environment details (username, directory layout, worktree structure) into the public repository. The report also appears to be a one-off debugging artifact with stream-of-consciousness notes rather than a permanent project file.
✓ Test Coverage
This diff adds only a REPRODUCTION_REPORT.md file documenting a shallow-copy bug. It contains no production code changes and no test code changes. The report references a test file (test_shallow_copy_bug.py) but that file is not included in this diff, so there is nothing to evaluate for test coverage. The report also contains raw stream-of-consciousness analysis text that appears to be unedited LLM output, but that is a content quality concern, not a test coverage issue.
✗ Design Approach
This PR commits a transient reproduction/debugging report as a permanent markdown file in the repository root. It does not fix the identified bug (shallow copy via
dict()at line 303 of_app.py). The file contains local filesystem paths, stream-of-consciousness LLM debug output, and no actual code change. This is a debugging artifact that should not be checked into the repository — the right action is to fix the bug itself (replacedict()withcopy.deepcopy()) and add the reproduction test to the test suite, not to commit investigation notes.
Flagged Issues
- REPRODUCTION_REPORT.md contains a hardcoded absolute local path (/Users/evmattso/git/agent-framework-2/.worktrees/agent/fix-4500-1/...) on line 14, leaking personal environment information into the repository.
- This file appears to be a transient investigation artifact (with raw debug narration in the 'Full Analysis' section) and should not be committed to the repository root. Consider moving these findings into the GitHub issue or PR description instead.
- REPRODUCTION_REPORT.md embeds an absolute local filesystem path (
/Users/evmattso/git/agent-framework-2/.worktrees/agent/fix-4500-1/…) which leaks the developer's username and internal directory structure into the repository. This should not be committed to a public/shared repo. - This file appears to be a transient debugging/investigation artifact (contains raw stream-of-consciousness analysis notes) rather than a permanent project document. It should not be checked in to the repository root.
- This PR commits a debugging artifact (REPRODUCTION_REPORT.md) to the repository root instead of actually fixing the identified bug. The report itself identifies the fix — replace
dict(deserialized_state)withcopy.deepcopy(deserialized_state)at line 303 of_app.py— but does not apply it. Remove this file and submit the actual fix instead. - The file contains a hardcoded local filesystem path (
/Users/evmattso/git/agent-framework-2/.worktrees/agent/fix-4500-1/...) which is developer-specific and should never be committed to the repository.
Suggestions
- If the report must be committed, replace the absolute path with a repository-relative path and clean up the 'Full Analysis' section, which contains duplicated stream-of-consciousness sentences concatenated without separators.
- If a reproduction report is needed, use a relative path (e.g.,
python/packages/azurefunctions/tests/test_shallow_copy_bug.py) instead of an absolute local path. - Consider whether this report belongs in the repo at all — the findings should live in the GitHub issue or PR description, not as a committed markdown file.
- The referenced test file
python/packages/azurefunctions/tests/test_shallow_copy_bug.pyis not included in this diff. If it is meant to be committed alongside this report, it should be added to the PR so reviewers can verify the reproduction tests are correct and assertions are meaningful. - The report contains an absolute local file path (
/Users/evmattso/git/agent-framework-2/.worktrees/agent/fix-4500-1/...) on line 14 which is not portable and should be replaced with a repository-relative path. - The 'Full Analysis' section (lines 29-35) contains raw, unedited stream-of-consciousness text with repeated false-start paragraphs. Consider cleaning it up or removing it before merging.
- If you want to preserve reproduction evidence, add the reproduction test (
test_shallow_copy_bug.py) to the test suite as a proper regression test rather than committing a prose report. - The 'Full Analysis' section is raw, unedited stream-of-consciousness output with repeated sentences and no paragraph breaks. If a report file were appropriate (it isn't here), it would need proper editing.
Automated review by moonbox3's agents
There was a problem hiding this comment.
Pull request overview
Fixes durable workflow activity state propagation in the Azure Functions integration by ensuring the pre-execution state snapshot is fully independent of the live state, so in-place mutations to nested structures are detected and forwarded to downstream activities.
Changes:
- Replace shallow snapshot (
dict(...)) withcopy.deepcopy(...)when capturing the pre-execution shared state in the activity wrapper. - Add tests intended to cover nested dict/list mutation diff behavior.
- Add a reproduction report documenting the original bug and analysis.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
python/packages/azurefunctions/agent_framework_azurefunctions/_app.py |
Uses copy.deepcopy for the pre-execution snapshot to detect nested in-place mutations in state diffing. |
python/packages/azurefunctions/tests/test_app.py |
Adds a new test suite covering nested mutation diff scenarios (currently not exercising the production code path directly). |
REPRODUCTION_REPORT.md |
Adds a generated reproduction/analysis report for issue #4500 (contains local path + verbose agent-like output). |
You can also share your feedback on Copilot code review. Take the survey.
…#4500) - Delete REPRODUCTION_REPORT.md (debugging artifact with local paths and raw LLM output) - Extract _create_state_snapshot() and _compute_state_updates() as module-level helpers in _app.py so tests exercise the production code path - Update TestStateSnapshotDiff to import and use production helpers instead of reimplementing snapshot/diff logic locally Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…microsoft#4500) Add two additional tests to TestStateSnapshotDiff: - test_shallow_copy_would_miss_nested_mutations: reproduces the original bug by demonstrating that dict() (shallow copy) misses nested mutations - test_create_state_snapshot_isolates_nested_objects: verifies the production _create_state_snapshot helper creates a true deep copy These tests ensure a regression back to shallow copy would be caught. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address PR review comment: add test_executor_activity_detects_nested_state_mutations that captures the actual executor_activity function from _setup_executor_activity and verifies it detects in-place nested mutations. This test would fail if _app.py line 314 regressed from _create_state_snapshot() back to dict(). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| from __future__ import annotations | ||
|
|
||
| import asyncio | ||
| import copy |
There was a problem hiding this comment.
| import copy | |
| from copy import deepcopy |
|
|
||
| def _create_state_snapshot(state: dict[str, Any]) -> dict[str, Any]: | ||
| """Create a deep copy of the deserialized state for later diffing.""" | ||
| return copy.deepcopy(state) |
There was a problem hiding this comment.
| return copy.deepcopy(state) | |
| return deepcopy(state) |
|
|
||
| def _compute_state_updates(original_snapshot: dict[str, Any], current_state: dict[str, Any]) -> dict[str, Any]: | ||
| """Compute state updates by comparing current state against the original snapshot.""" | ||
| return {k: v for k, v in current_state.items() if k not in original_snapshot or original_snapshot[k] != v} |
There was a problem hiding this comment.
not sure how big the snapshots will becomes, but this is potentially a expensive operation, could you do a comparison of set(original.keys()) with set(current_state.keys()) first?
Motivation and Context
When a durable workflow activity mutates nested objects (dicts, lists) inside shared state, the changes were not propagated to the next activity. This is because the pre-execution snapshot shared references with the live state, making the diff comparison always see them as equal.
Fixes #4500
Description
The root cause was that
original_snapshot = dict(deserialized_state)only performed a shallow copy, so nested mutable objects (dicts, lists) in the snapshot pointed to the same objects as the live state. When an activity mutated those nested values in-place, the snapshot reflected the same changes, causing the post-execution diff to miss them entirely. The fix replacesdict(...)withcopy.deepcopy(...)so the snapshot is fully independent. Regression tests verify that in-place mutations to nested dicts, lists, and new keys are all correctly detected in the state diff.Contribution Checklist