Fix flaky Cursor/OpenCode E2E behavior and transcript prep timing #923
Merged
gtrrz-victor merged 10 commits into main on Apr 11, 2026
Match Cursor interactive completion prompts with current UI text and let OpenCode honor per-prompt timeout overrides so the affected interactive tests and multi-session flow are less timing-sensitive.

Entire-Checkpoint: 13ad540ecefa

Resolve Cursor transcript path before polling in prepareTranscriptForState. Previously, PrepareTranscript was called with the stored flat path (.../id.jsonl) before re-resolution to the correct nested path (.../id/id.jsonl), wasting the entire timeout polling a nonexistent file.

Increase Cursor PrepareTranscript timeout from 3s to 5s to provide margin for slow IDE flushes.

Increase Cursor E2E startup timeouts from 30s to 45s to handle slow trust dialog and initialization.

Fix TestWriteTemporary_PathNormalizationAndSkipping by resolving macOS /var -> /private/var symlink on temp directories so absolute paths match git's resolved repo root.

Entire-Checkpoint: 1bd0dcfecd83
Pull request overview
This PR reduces E2E flakiness for Cursor and OpenCode agents and improves transcript preparation reliability in the manual-commit strategy, aiming to avoid missed transcript writes and timeout-bound failures.
Changes:
- Adds per-prompt timeout overrides in `TestMultiSessionSequential` and makes OpenCode read those overrides.
- Broadens Cursor CLI “ready” prompt matching and increases startup/trust wait windows.
- Re-resolves transcript paths before transcript preparation waits to handle Cursor’s flat→nested relocation behavior; hardens a macOS path normalization test via symlink resolution.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| e2e/tests/multi_session_test.go | Passes per-prompt timeout overrides to reduce sequential multi-prompt failures. |
| e2e/agents/opencode.go | Applies per-prompt timeout configuration when running OpenCode prompts. |
| e2e/agents/cursor_cli.go | Expands prompt readiness regex and increases wait windows to reduce Cursor CLI flakiness. |
| cmd/entire/cli/strategy/common.go | Re-resolves transcript paths before preparation waits to avoid polling stale locations. |
| cmd/entire/cli/checkpoint/checkpoint_test.go | Stabilizes macOS path normalization test by resolving temp dir symlinks. |
| cmd/entire/cli/agent/cursor/cursor.go | Increases Cursor transcript preparation wait window. |
Comments suppressed due to low confidence (1)
e2e/agents/opencode.go:105
- In openCodeAgent.RunPrompt, the per-prompt timeout option (cfg.PromptTimeout) is overridden by the E2E_TIMEOUT env var unconditionally. That means callers (like tests) cannot actually override the timeout when E2E_TIMEOUT is set, which conflicts with the intended “per-prompt override” behavior. Consider making the precedence explicit (e.g., cfg.PromptTimeout > env > default, or at least document that env always wins).
    timeout := a.timeout
    if cfg.PromptTimeout > 0 {
        timeout = cfg.PromptTimeout
    }
    if envTimeout := os.Getenv("E2E_TIMEOUT"); envTimeout != "" {
        if parsed, err := time.ParseDuration(envTimeout); err == nil {
            timeout = parsed
        }
    }
Entire-Checkpoint: 70e55c90815e
Entire-Checkpoint: b03141021f3d
Entire-Checkpoint: 5775134d9525
Entire-Checkpoint: 43afb2e95802
Entire-Checkpoint: 8fdf9b8eda51
Bugbot run
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 957579f.
When the context deadline is shorter than maxWait, the warning log now reports the actual timeout used instead of the constant maxWait. Entire-Checkpoint: 268ae794a8c6
gtrrz-victor approved these changes on Apr 11, 2026
Summary
`TestMultiSessionSequential` prompt timeouts to reduce timeout-bound failures.

Validation
Note
Low Risk
Low risk: changes are limited to transcript polling behavior and test/e2e timeouts/prompt matching, with minimal impact on core business logic. Main risk is masking genuine failures by increasing waits or broadening readiness regexes.
Overview
Hardens Cursor transcript preparation by extending the wait window and making polling respect `ctx` deadlines/cancellation using a timer-based sleep.

Stabilizes E2E agent runs by broadening Cursor CLI ready-state prompt matching, increasing startup/trust-dialog wait timeouts, and making OpenCode honor `WithPromptTimeout` over the `E2E_TIMEOUT` env var.

Reduces test flakiness across platforms by resolving macOS temp-dir symlinks in a checkpoint path-normalization test, and by raising timeouts in `TestMultiSessionSequential` via per-prompt overrides.