Bound the AI Moderator rate-limit pre-check pagination — it walks the entire run history and is cancelled by the 15-min job timeout on every invocation.
Severity: P1 — 100% failure; the pre_activation job is cancelled before the agent ever activates. 3/3 invocations in the last 6h.
Problem statement
The Check user rate limit step in the pre_activation job never finishes, and the job is cancelled at ~15 minutes with ##[error]The operation was canceled. The last log line is Fetching page 205 (up to 100 runs per page)... The script, check_rate_limit.cjs, paginates the full workflow run history (200+ pages), skipping runs "created before threshold" one page at a time, and does not finish within the 15-min job timeout. This is neither an intentional rate-limit block nor an API/script error; it is unbounded pagination.
Affected workflows and run IDs
- AI Moderator (.github/workflows/ai-moderator.lock.yml)
Probable root cause
check_rate_limit.cjs enumerates the run history page by page and filters by the created-at threshold in JS, instead of stopping pagination once it crosses the time window. As repo run volume grows, the page count (now 205+) exceeds what fits in the 15-min job budget, so the job is cancelled.
Proposed remediation
- Stop paginating as soon as the first run older than the rate-limit window is seen (results are already time-ordered), instead of scanning all pages.
- Prefer the GitHub search/list API created:>=<threshold> server-side filter, or cap per_page × max-pages to a small bounded value.
- Add a hard internal timeout/short-circuit so the check degrades to "allow" (or a fast deterministic verdict) rather than letting the whole job hit the 15-min cancel.
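The three remediations above could be combined roughly as follows. This is a minimal sketch, not the actual contents of check_rate_limit.cjs (which are not shown in this issue). It assumes runs arrive newest first. The names countRecentRuns, fetchPage, createdFilter and verdict, and the maxPages and deadlineMs defaults, are hypothetical and chosen for illustration only.

```javascript
'use strict';

/**
 * Build a value for the `created` query parameter of the REST
 * "list workflow runs" endpoint, e.g. ">=2024-01-01T00:00:00Z".
 * Assumption: milliseconds are stripped to keep the timestamp simple.
 */
function createdFilter(thresholdMs) {
  const iso = new Date(thresholdMs).toISOString().replace(/\.\d{3}Z$/, 'Z');
  return `>=${iso}`;
}

/**
 * Count one user's runs inside the rate-limit window with bounded work.
 * `fetchPage(page)` is a hypothetical injected function that resolves to an
 * array of runs, newest first, each with `created_at` and `actor.login`.
 */
async function countRecentRuns(fetchPage, opts) {
  const { login, windowMs, maxPages = 5, deadlineMs = 30000, now = Date.now } = opts;
  const startedAt = now();
  const threshold = startedAt - windowMs;
  let count = 0;

  for (let page = 1; page <= maxPages; page++) {
    // Hard internal budget: stop long before the job-level timeout.
    if (now() - startedAt > deadlineMs) {
      return { count, complete: false, reason: 'deadline' };
    }
    const runs = await fetchPage(page);
    if (runs.length === 0) {
      return { count, complete: true, reason: 'exhausted' };
    }
    for (const run of runs) {
      if (Date.parse(run.created_at) < threshold) {
        // Newest-first ordering: every later run is older still, so stop.
        return { count, complete: true, reason: 'crossed-window' };
      }
      if (run.actor && run.actor.login === login) count++;
    }
  }
  // Page cap reached without crossing the window.
  return { count, complete: false, reason: 'page-cap' };
}

/**
 * Fail-open verdict: the count is a lower bound even for an incomplete
 * scan, so block only when the limit is already provably reached.
 */
function verdict(result, limit) {
  return result.count >= limit ? 'block' : 'allow';
}
```

In a real script, fetchPage would wrap whatever client call the script already makes, and createdFilter(threshold) could be passed as the `created` parameter so the server does most of the filtering. The early stop and page cap then act as a safety net if the filter is ignored or the ordering assumption does not hold.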
Success criteria / verification
- AI Moderator pre_activation completes in seconds and the Check user rate limit step no longer paginates past a bounded number of pages.
- Three consecutive AI Moderator triggers complete without a cancelled pre_activation job.
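The second criterion can be checked mechanically once job data for recent runs is available. The sketch below assumes each run has already been reduced to an object of the form { id, jobs: [{ name, conclusion }] }, listed newest first. That shape and the helper name preActivationHealthy are hypothetical. The 'cancelled' value matches the job conclusion reported for the analyzed runs.

```javascript
'use strict';

/**
 * Returns true when the latest `required` runs each contain the named job
 * and none of those jobs ended with conclusion 'cancelled'.
 * `runs` is assumed to be ordered newest first.
 */
function preActivationHealthy(runs, required = 3, jobName = 'pre_activation') {
  if (runs.length < required) return false; // not enough evidence yet
  return runs.slice(0, required).every((run) => {
    const job = run.jobs.find((j) => j.name === jobName);
    return job !== undefined && job.conclusion !== 'cancelled';
  });
}

// The three analyzed runs from this issue, all with a cancelled pre_activation job.
const analyzed = [27559255648, 27553836400, 27552434310].map((id) => ({
  id,
  jobs: [{ name: 'pre_activation', conclusion: 'cancelled' }],
}));
console.log(preActivationHealthy(analyzed)); // false
```

A run with no pre_activation job is treated as unverified rather than healthy, so the check cannot pass by accident if the workflow layout changes.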
Parent: #39344. Analyzed runs: 27559255648, 27553836400, 27552434310.
Related to #39344
Generated by 🔍 [aw] Failure Investigator (6h) · 572.8 AIC · ⌖ 11.7 AIC · ⊞ 4.5K · ◷