feat: split triage and pr-fixer into separate GHA workflows #1177
Gkrumbach07 merged 2 commits into main
- triage.yml: daily cron (8am UTC weekdays), discovers untriaged Jira/GH issues, creates Implement sessions (max 5/cycle)
- pr-fixer.yml: 30 min cron, manages all ai-managed PRs with session reuse, change detection, and circuit breaker
- fix-single: triggered only by @ambient-fix comment
- fix-batch: scheduled + manual dispatch
- Both use ambient-action@v0.0.3 with inactivity timeout (60s)
- Unified label: ai-managed (was agent-managed)
- Removed old matrix-based batch pattern
- Skip Jira gracefully if MCP tool unavailable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
The PR fixer workflow was reworked to run fixes from issue comments and from a single orchestrator batch job, using session-frontmatter prompts and Claude Opus; batching now uses an orchestrator prompt with retry/circuit-breaker logic. A scheduled Issue Triage workflow was added that queries Jira/GitHub and spawns child investigation/fix sessions.
Sequence Diagram(s)
sequenceDiagram
participant IssueComment as Issue Comment Event
participant GitHubAPI as GitHub API
participant PRBody as PR Body Parser
participant AmbientAction as ambient-action v0.0.3
participant Claude as Claude Opus Model
participant SessionMgmt as Session Management
IssueComment->>GitHubAPI: event payload (issue_comment)
GitHubAPI-->>PRBody: fetch PR/issue body
PRBody->>PRBody: extract acp:session_id and frontmatter
PRBody-->>AmbientAction: invoke fix-single with session context
AmbientAction->>Claude: run session-driven fix prompt
Claude->>SessionMgmt: read/create session state
Claude-->>AmbientAction: return fix operations & updated frontmatter
AmbientAction->>GitHubAPI: apply PR body updates/comments
GitHubAPI-->>AmbientAction: confirm update
sequenceDiagram
participant Schedule as Scheduler (cron/dispatch)
participant AmbientAction as ambient-action v0.0.3
participant Jira as Jira (MCP)
participant GitHub as GitHub Issues
participant Claude as Claude Opus Model
participant ChildSession as Child Session Creator
Schedule->>AmbientAction: trigger triage job
AmbientAction->>Jira: query untriaged items (max 5)
alt Jira available
Jira-->>AmbientAction: return items
else Jira unavailable
AmbientAction->>GitHub: query untriaged issues
GitHub-->>AmbientAction: return items
end
AmbientAction->>Claude: run triage orchestrator prompt
Claude->>GitHub: check for ai-managed PRs / existing sessions
Claude->>ChildSession: create investigation/fix child sessions as needed
ChildSession-->>Claude: return session name & UID
Claude->>GitHub: add `ai-triaged` label and comment with session link
GitHub-->>AmbientAction: confirm updates
AmbientAction->>Schedule: write session summary to step summary
🚥 Pre-merge checks: ✅ 6 passed
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/pr-fixer.yml:
- Around line 50-61: The existing session lookup step (id: existing) writes
session_id and source to GITHUB_OUTPUT but those outputs are never consumed;
update later steps to read steps.existing.outputs.session_id and
steps.existing.outputs.source where session continuity is needed, and stop
unconditionally resetting retry_count to 0 — instead only initialize retry_count
when no session_id was found (e.g., if: steps.existing.outputs.session_id ==
''), or pass the previous retry_count through the workflow when a session
exists; in short, wire the existing step's outputs into the logic that decides
session continuation and remove the hard-reset of retry_count so an `@ambient-fix`
resumes the prior session/retry state.
- Around line 8-13: The scheduled workflow can overlap with long-running runs;
update the workflow to prevent concurrent runs by adding a top-level concurrency
block (e.g., concurrency: group: "pr-fixer-${{ github.ref }}"
cancel-in-progress: false) so new triggers (from schedule or workflow_dispatch)
will queue until the current run finishes, or alternatively change the schedule
cron to an interval >= the max run time (e.g., hourly) to avoid overlap; apply
this change near the existing schedule and workflow_dispatch entries so all
triggers are serialized.
- Around line 141-173: The workflow currently increments retry_count
unconditionally after sending a message ("After sending: increment
retry_count..."), which treats normal churn as failure; change this so
retry_count is incremented only when the agent fails to make progress (no new
external commits, no new/updated review comments, CI still failing, merge
conflicts unresolved) or when the agent explicitly reports a failure state, and
reset or leave retry_count unchanged when the agent's action produced a
meaningful change; update the logic around the session management step that
references retry_count/last_action/ai-needs-human (the "After sending" block and
any code that updates frontmatter like <!-- acp:session_id=... last_action=...
retry_count=... -->) to base increments on a success flag or a comparison of
repo state before/after the agent run rather than incrementing unconditionally.
- Around line 68-71: The workflow currently uses the mutable tag "uses:
ambient-code/ambient-action@v0.0.3" in two places (the "Fix PR" step and the
other step later) which weakens supply-chain security; update both "uses:
ambient-code/ambient-action@v0.0.3" occurrences to pin the action to the
corresponding full commit SHAs (replace `@v0.0.3` with @<commit-sha> for each
occurrence), ensuring you fetch the exact commit SHAs from the
ambient-code/ambient-action repository and update both instances consistently.
- Around line 131-133: The gh CLI call that lists PRs uses the default of 30
results; update the Run command that invokes `gh pr list --repo ${{
github.repository }} --state open --label ai-managed --search "draft:false"` to
include an explicit limit (e.g. add `--limit 1000`) so all matching PRs are
returned; modify the `gh pr list` invocation to `gh pr list ... --search
"draft:false" --limit 1000` (or another appropriate numeric limit) to ensure no
PRs are silently omitted.
In @.github/workflows/triage.yml:
- Around line 3-10: Add a top-level concurrency guard to the triage workflow to
serialize scheduled and manual runs: insert a concurrency block (e.g.,
concurrency: group: 'triage-${{ github.workflow }}' cancel-in-progress: false)
so that runs triggered by schedule and workflow_dispatch cannot overlap and will
queue instead of racing; update the .github/workflows/triage.yml file around the
existing on: schedule and workflow_dispatch sections to include this concurrency
configuration.
- Around line 20-23: The workflow step named "Run triage orchestrator" (id:
session) currently pins the action as ambient-code/ambient-action@v0.0.3;
replace the tag with the specific full commit SHA for
ambient-code/ambient-action (e.g.,
ambient-code/ambient-action@<full-commit-sha>) so the action is immutably
pinned; update the uses value in that step accordingly and commit the change.
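
The `gh pr list` limit item in the prompt above can be sketched as a small wrapper. This is only an illustration: the function names `list_ai_managed_prs` and `may_be_truncated` are hypothetical, and the default of 200 is the value mentioned in the follow-up commit rather than a recommendation. The `gh pr list` flags themselves are standard.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around `gh pr list`; the function name and the default
# limit are assumptions made for illustration, not code from pr-fixer.yml.
list_ai_managed_prs() {
  local repo="$1"
  local limit="${2:-200}"   # gh returns only 30 results unless --limit is given
  gh pr list \
    --repo "$repo" \
    --state open \
    --label ai-managed \
    --search "draft:false" \
    --limit "$limit" \
    --json number \
    --jq '.[].number'
}

# Hypothetical check: when the number of results reaches the limit,
# the list may have been cut off and the limit should be raised.
may_be_truncated() {
  local count="$1" limit="$2"
  [ "$count" -ge "$limit" ]
}
```

A caller could count the returned PR numbers and log a warning when `may_be_truncated` succeeds, so that omitted PRs do not go unnoticed.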
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ae0efb6c-68d1-4b14-bfb5-4c0fbac18bc4
📒 Files selected for processing (2)
.github/workflows/pr-fixer.yml
.github/workflows/triage.yml
.github/workflows/pr-fixer.yml
Outdated
    - name: Check for existing session
      if: steps.fork_check.outputs.skip != 'true'
      id: existing
      env:
        GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      run: |
        # Read PR body and extract session_id from frontmatter
        BODY=$(gh pr view ${{ steps.pr.outputs.number }} --repo "${{ github.repository }}" --json body --jq '.body')
        SESSION_ID=$(echo "$BODY" | grep -oP 'acp:session_id=\K[^ ]+' | head -1 || echo "")
        SOURCE=$(echo "$BODY" | grep -oP 'source=\K[^ ]+' | head -1 || echo "")
        echo "session_id=$SESSION_ID" >> $GITHUB_OUTPUT
        echo "source=$SOURCE" >> $GITHUB_OUTPUT
The existing-session lookup currently has no effect.
Line 60 stores session_id, but nothing later reads steps.existing.outputs.session_id. Line 89 also hard-resets retry_count=0, so an @ambient-fix on an already-managed PR loses its prior session/retry state instead of continuing it.
Also applies to: 81-90
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/pr-fixer.yml around lines 50 - 61, The existing session
lookup step (id: existing) writes session_id and source to GITHUB_OUTPUT but
those outputs are never consumed; update later steps to read
steps.existing.outputs.session_id and steps.existing.outputs.source where
session continuity is needed, and stop unconditionally resetting retry_count to
0 — instead only initialize retry_count when no session_id was found (e.g., if:
steps.existing.outputs.session_id == ''), or pass the previous retry_count
through the workflow when a session exists; in short, wire the existing step's
outputs into the logic that decides session continuation and remove the
hard-reset of retry_count so an `@ambient-fix` resumes the prior session/retry
state.
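
A minimal sketch of what resuming the prior session state could look like, assuming the frontmatter shape quoted in the workflow prompt (`<!-- acp:session_id=... source=... last_action=... retry_count=... -->`) and values that contain no spaces. The helper names `frontmatter_field` and `starting_retry_count` are hypothetical and do not appear in the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names are illustrative, not taken from pr-fixer.yml.
# Assumed frontmatter shape, as written by the session prompt:
#   <!-- acp:session_id=NAME source=KEY last_action=TIME retry_count=N -->

# Print the value of one frontmatter key (session_id, source, last_action,
# retry_count) from a PR body, or nothing if the key is absent.
frontmatter_field() {
  local body="$1" key="$2"
  printf '%s\n' "$body" \
    | { grep -o '<!-- acp:session_id=[^>]*-->' || true; } \
    | head -n 1 \
    | tr ' ' '\n' \
    | sed -n "s/^acp:${key}=//p; s/^${key}=//p" \
    | head -n 1
}

# Retry count to start from: the stored value when a session already exists,
# and 0 only when the PR has no session yet.
starting_retry_count() {
  local body="$1" session_id stored
  session_id=$(frontmatter_field "$body" session_id)
  stored=$(frontmatter_field "$body" retry_count)
  if [ -n "$session_id" ] && [ -n "$stored" ]; then
    printf '%s\n' "$stored"
  else
    printf '0\n'
  fi
}
```

Restricting the lookup to the comment that carries `acp:session_id` also avoids matching a stray `source=` elsewhere in the PR body, which a body-wide `grep` could pick up.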
      - name: Fix PR
        if: steps.fork_check.outputs.skip != 'true'
        id: session
    -   uses: ambient-code/ambient-action@v0.0.2
    +   uses: ambient-code/ambient-action@v0.0.3
🧩 Analysis chain
🏁 Script executed:
cd /dev/null 2>&1; pwd > /dev/null; cat -n ".github/workflows/pr-fixer.yml" | head -130
Repository: ambient-code/platform
Length of output: 6302
Pin both ambient-code/ambient-action references to full commit SHAs instead of @v0.0.3.
Semantic version tags are mutable and weaken supply-chain security for a workflow that modifies PR state and accesses secrets. Replace with pinned commit SHAs at lines 71 and 123.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/pr-fixer.yml around lines 68 - 71, The workflow currently
uses the mutable tag "uses: ambient-code/ambient-action@v0.0.3" in two places
(the "Fix PR" step and the other step later) which weakens supply-chain
security; update both "uses: ambient-code/ambient-action@v0.0.3" occurrences to
pin the action to the corresponding full commit SHAs (replace `@v0.0.3` with
@<commit-sha> for each occurrence), ensuring you fetch the exact commit SHAs
from the ambient-code/ambient-action repository and update both instances
consistently.
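
One way to find the commit SHA behind a tag is to parse `git ls-remote` output. The sketch below assumes that output format (SHA, tab, ref name) and uses a hypothetical helper name, `tag_commit_sha`; it does not claim to know the actual SHA for `v0.0.3`, which has to be looked up in the ambient-code/ambient-action repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: choose the commit SHA for a tag from `git ls-remote`
# output read on stdin. For an annotated tag, the peeled "<ref>^{}" entry is
# the commit and the plain entry is the tag object, so the peeled entry wins
# when it is present.
tag_commit_sha() {
  local tag="$1"
  awk -v ref="refs/tags/$tag" '
    $2 == (ref "^{}") { peeled = $1 }
    $2 == ref         { direct = $1 }
    END {
      if (peeled != "")      print peeled
      else if (direct != "") print direct
      else                   exit 1
    }'
}

# One way to feed it (requires network access; the URL is assumed from the
# action name used in the workflow):
#   git ls-remote https://github.com/ambient-code/ambient-action 'refs/tags/v0.0.3*' \
#     | tag_commit_sha v0.0.3
# The printed SHA would then replace the tag, for example:
#   uses: ambient-code/ambient-action@<full-commit-sha>  # v0.0.3
```

Keeping the tag name in a trailing comment next to the pinned SHA preserves readability while the reference itself stays immutable.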
.github/workflows/pr-fixer.yml
Outdated
    ## 2. Circuit breaker
    If retry_count >= 3: comment "AI was unable to resolve after 3 attempts. Needs human attention.",
    add ai-needs-human label, remove ai-managed label. Skip.

    ## 3. Check for changes since last_action
    Ignore commits authored by the bot. Only look for:
    - New commits by someone other than the bot
    - New or updated review comments
    - New CI failures
    - Merge conflicts from base branch changes

    If nothing changed → skip entirely.

    ## 4. Something changed — act
    - CI failing → send message to existing session with CI logs
    - New review comments → send message with the comments
    - Merge conflicts → send message to rebase
    - New external commits → send message to review and ensure CI passes

    ## 5. Session management
    Before sending a message, check session status:
    - Running → send the message
    - Stopped → restart (reuse), then send
    - Not found → create new session with this prompt:
      "You are maintaining an open pull request.
      PR: <URL> Source issue: <KEY> (if known)
      1. Check out the PR branch.
      2. Resolve merge conflicts, fix CI failures, address review comments.
      3. Push fixes. Do not merge/close/force-push.
      4. Write frontmatter: <!-- acp:session_id=$AGENTIC_SESSION_NAME source=<KEY> last_action=<NOW> retry_count=0 -->
      5. If broken beyond repair, comment and stop."

    After sending: increment retry_count and update last_action in frontmatter.
The circuit breaker currently counts normal churn as failure.
Line 173 increments retry_count after every message, regardless of outcome. Three ordinary rounds of new comments/CI churn will trip ai-needs-human even if the agent is making progress.
Suggested prompt tweak
- After sending: increment retry_count and update last_action in frontmatter.
+ After sending:
+ - If the PR is still broken after your attempted fix, increment retry_count.
+ - If the PR is healthy again, reset retry_count to 0.
+ - Update last_action in frontmatter.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/pr-fixer.yml around lines 141 - 173, The workflow
currently increments retry_count unconditionally after sending a message ("After
sending: increment retry_count..."), which treats normal churn as failure;
change this so retry_count is incremented only when the agent fails to make
progress (no new external commits, no new/updated review comments, CI still
failing, merge conflicts unresolved) or when the agent explicitly reports a
failure state, and reset or leave retry_count unchanged when the agent's action
produced a meaningful change; update the logic around the session management
step that references retry_count/last_action/ai-needs-human (the "After sending"
block and any code that updates frontmatter like <!-- acp:session_id=...
last_action=... retry_count=... -->) to base increments on a success flag or a
comparison of repo state before/after the agent run rather than incrementing
unconditionally.
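
The outcome-based counting described above can be sketched as two small functions. This is an illustration under stated assumptions: the names `next_retry_count` and `breaker_tripped` and the `healthy`/`broken` state labels are hypothetical, while the threshold of 3 comes from the prompt's `retry_count >= 3` rule.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names and the "healthy"/"broken" labels are
# illustrative and do not come from pr-fixer.yml.
MAX_RETRIES=3   # threshold from the prompt's "retry_count >= 3" rule

# Print the retry count to store after an attempted fix:
# reset to 0 when the PR is healthy, increment when it is still broken.
next_retry_count() {
  local current="$1" pr_state="$2"
  case "$pr_state" in
    healthy) printf '0\n' ;;
    broken)  printf '%s\n' "$((current + 1))" ;;
    *)       printf 'unknown PR state: %s\n' "$pr_state" >&2; return 2 ;;
  esac
}

# Succeed when the circuit breaker should trip and a human should take over.
breaker_tripped() {
  [ "$1" -ge "$MAX_RETRIES" ]
}
```

With this shape, three consecutive failed attempts trip the breaker, whereas rounds of review comments or CI churn that end with a healthy PR reset the count and never reach the threshold.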
    - name: Run triage orchestrator
      id: session
      uses: ambient-code/ambient-action@v0.0.3
      with:
🧩 Analysis chain
🏁 Script executed:
cd /
find . -name "triage.yml" -type f 2>/dev/null | head -20
Repository: ambient-code/platform
Length of output: 113
🏁 Script executed:
cat -n .github/workflows/triage.yml
Repository: ambient-code/platform
Length of output: 4347
🏁 Script executed:
# Check if there are any exposed secrets in the workflow
rg -n "(GITHUB_TOKEN|secrets\.|env\.)" .github/workflows/triage.yml -A 2
Repository: ambient-code/platform
Length of output: 351
Pin ambient-action to a full commit SHA.
Version tags like @v0.0.3 are mutable and can be retagged, allowing behavior changes without triggering another review. Replace with the commit SHA. Secrets and permissions are correctly scoped.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/triage.yml around lines 20 - 23, The workflow step named
"Run triage orchestrator" (id: session) currently pins the action as
ambient-code/ambient-action@v0.0.3; replace the tag with the specific full
commit SHA for ambient-code/ambient-action (e.g.,
ambient-code/ambient-action@<full-commit-sha>) so the action is immutably
pinned; update the uses value in that step accordingly and commit the change.
- Add concurrency guards to prevent overlapping runs
- Remove unused existing-session lookup step from fix-single
- Add --limit 200 to gh pr list in batch orchestrator
- Fix circuit breaker: only increment retry_count on failure, reset to 0 when PR is healthy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- … pr-fixer.yml into two focused workflows with distinct cadences
- triage.yml: daily issue discovery and session creation
- pr-fixer.yml: 30 min PR management with session reuse and circuit breaker

Changes

triage.yml (new)
- … ai-triaged to issues, child sessions add ai-managed to PRs

pr-fixer.yml (rewritten)
- … @ambient-fix PR comment (was also label + manual)
- … ai-managed PRs
- … last_action (ignores bot commits)
- … ai-needs-human label
- … fix-batch/fix-each pattern
- … ai-managed (was agent-managed)

Both workflows
- … ambient-action@v0.0.3 with inactivity timeout (timeout: '60' = 60s idle auto-stop)
- … claude-opus-4-6

Test plan
- … triage.yml: verify it discovers issues and creates sessions
- … @ambient-fix on a PR: verify fix-single triggers
- … pr-fixer.yml (no PR number): verify batch orchestrator runs
- … ai-managed label triggers batch pickup on next cycle

🤖 Generated with Claude Code
Summary by CodeRabbit