diff --git a/errors/known-unsolved/npm-registry-e429-e403-rate-limit-github-actions-ci.yml b/errors/known-unsolved/npm-registry-e429-e403-rate-limit-github-actions-ci.yml new file mode 100644 index 0000000..23e9dd9 --- /dev/null +++ b/errors/known-unsolved/npm-registry-e429-e403-rate-limit-github-actions-ci.yml @@ -0,0 +1,118 @@ +id: known-unsolved-114 +title: 'npm Registry Intermittent 429/403 Rate-Limit Errors During GitHub Actions CI (E429 / E403)' +category: known-unsolved +severity: warning +tags: + - npm + - registry + - rate-limit + - 429 + - E429 + - E403 + - intermittent + - network +patterns: + - regex: 'npm error code E429|npm error 429 Too Many Requests.*registry\.npmjs\.org|E429.*registry\.npmjs|registry\.npmjs.*429' + flags: 'i' + - regex: 'npm error code E403.*registry\.npmjs\.org|npm.*E403.*registry\.npmjs' + flags: 'i' + - regex: 'npm error 429.*npmjs|npmjs\.org.*429.*Too Many Requests' + flags: 'i' +error_messages: + - 'npm error code E429' + - 'npm error 429 Too Many Requests - GET https://registry.npmjs.org/' + - 'npm error code E403' + - 'npm error 403 Forbidden - GET https://registry.npmjs.org/' +root_cause: | + GitHub-hosted runners share IP address ranges in NAT pools. When many + concurrent workflows from different users trigger npm installs through the + same egress NAT IP, the npm registry rate-limits that IP, returning + HTTP 429 Too Many Requests or 403 Forbidden. This is an npm infrastructure + policy, not a GitHub Actions bug. + + The issue manifests intermittently and typically correlates with: + - npm registry incident windows (see status.npmjs.org) + - High concurrency on GitHub-hosted runners sharing IP pools + - Periods when npm temporarily tightens rate limits + + The August 2025 outage (documented in nodejs/node#59620 with 20 reactions) + showed that major npm registry incidents can affect GitHub Actions runners + broadly.
Similar E429/E403 spikes appear in GitHub Community discussion + #198588 (June 10, 2026), where multiple users reported simultaneous failures. + + npm does not expose per-runner or per-organization quotas, and GitHub has + no control over npm registry rate limits. The only reliable workarounds are + retry logic, caching, or using an alternative registry mirror. + + Source: nodejs/node#59620 (20 reactions, Aug 2025); GitHub Community #198588 (Jun 2026). +fix: | + No permanent fix is available: the rate limit is imposed by the npm registry, which is external to GitHub. + + Workarounds (in order of effectiveness): + 1. **Cache the npm cache (~/.npm) or node_modules with actions/cache**: reduces + npm registry hits on repeat runs. + 2. **Retry the install and pass --prefer-offline**: lets npm reuse cached + package data instead of re-requesting it from the registry. + 3. **Use an npm mirror or Verdaccio proxy**: route installs through a self-hosted + npm proxy that caches packages. + 4. **Switch to pnpm or yarn with a lock file**: these package managers may + retry failed requests more aggressively. + 5. **Re-run the failed job**: the rate-limit window is typically short (seconds + to minutes). + 6. **Monitor npm registry status**: check https://status.npmjs.org/ when CI + failures are widespread.
+fix_code: + - language: yaml + label: 'Cache npm downloads to reduce rate-limit exposure' + code: | + jobs: + install: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + + - name: Setup Node.js with npm cache + uses: actions/setup-node@v5 + with: + node-version: '20' + cache: 'npm' # caches ~/.npm; avoids re-downloading on cache hit + + - name: Install with retry + run: | + for delay in 15 30 0; do npm ci && exit 0; sleep "$delay"; done + echo "npm ci failed after 3 attempts" >&2 + exit 1 + - language: yaml + label: 'Use .npmrc to configure retry and offline mode fallback' + code: | + # .npmrc in repo root + # fetch-retry-mintimeout=20000 + # fetch-retry-maxtimeout=120000 + # fetch-retries=5 + # prefer-offline=false # set to true after first install to use cache + + jobs: + install: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: actions/setup-node@v5 + with: + node-version: '20' + cache: 'npm' + - run: npm ci --prefer-offline +prevention: + - 'Always cache node_modules or npm cache in CI to reduce registry dependency' + - 'Monitor https://status.npmjs.org/ during widespread CI failures before investigating code' + - 'Implement retry loops for npm install steps; transient 429s typically resolve within seconds' + - 'Consider GitHub Packages or a self-hosted Verdaccio registry as a stable npm mirror' + - 'Use lock files (package-lock.json / yarn.lock / pnpm-lock.yaml) to ensure deterministic installs' +docs: + - url: 'https://github.com/nodejs/node/issues/59620' + label: 'nodejs/node#59620: GitHub Actions failing due to npm 403/429 responses (20 reactions, Aug 2025)' + - url: 'https://github.com/orgs/community/discussions/198588' + label: 'GitHub Community #198588: Frequent NPM Network Errors with GitHub Actions (Jun 2026)' + - url: 'https://status.npmjs.org/' + label: 'npm registry status page' + - url: 'https://docs.npmjs.com/cli/v10/using-npm/config#prefer-offline' + label: 'npm docs: prefer-offline configuration' \ No newline at end of file diff --git 
a/errors/permissions-auth/github-token-missing-vulnerability-alerts-permission-dependabot-api-403.yml b/errors/permissions-auth/github-token-missing-vulnerability-alerts-permission-dependabot-api-403.yml new file mode 100644 index 0000000..58a96d4 --- /dev/null +++ b/errors/permissions-auth/github-token-missing-vulnerability-alerts-permission-dependabot-api-403.yml @@ -0,0 +1,103 @@ +id: permissions-auth-110 +title: 'GITHUB_TOKEN Missing vulnerability-alerts Permission Returns 403 on Dependabot Alerts API' +category: permissions-auth +severity: error +tags: + - GITHUB_TOKEN + - dependabot + - vulnerability-alerts + - permissions + - 403 + - security-events + - dependabot-api +patterns: + - regex: 'vulnerability-alerts.*read|dependabot.*alerts.*403|403.*dependabot.*alerts|vulnerability.alerts.*permission' + flags: 'i' + - regex: 'Resource not accessible by integration.*dependabot|HttpError.*403.*dependabot.*alerts' + flags: 'i' +error_messages: + - 'HttpError: Resource not accessible by integration' + - '403 - {"message":"Resource not accessible by integration","documentation_url":"https://docs.github.com/rest/dependabot/alerts"}' + - 'RequestError: Resource not accessible by integration' +root_cause: | + The GitHub REST API endpoint GET /repos/{owner}/{repo}/dependabot/alerts + (and related Dependabot alerts endpoints) requires the `vulnerability-alerts` + permission scope on the GITHUB_TOKEN. This scope was historically only + available as a GitHub App permission; as of April 2026, it is also available + as a native GITHUB_TOKEN workflow permission. + + Workflows that call the Dependabot alerts API (directly or via octokit/ + actions/github-script) without explicitly granting `vulnerability-alerts: read` + receive a 403 "Resource not accessible by integration" response, even when + the token has `contents: read` or `security-events: read` — those scopes + do not cover Dependabot alert data. 
+ + The GitHub Community discussion #60612 (19 upvotes) highlights this as a + common pain point: developers assume `security-events` covers Dependabot + alerts, but it does not. The `security-events` scope covers code scanning + results; `vulnerability-alerts` is a separate scope for Dependabot. + + Source: GitHub Community #60612; gh-aw PR #27668 (Apr 2026). +fix: | + Add `vulnerability-alerts: read` to the `permissions:` block of your workflow + or job. This scope was added to GITHUB_TOKEN in April 2026. + + Note: If your repository uses the default GITHUB_TOKEN permissions ("read + all, write none" or "write all"), you may still need an explicit permissions + block since `vulnerability-alerts` is not included in legacy shorthand configs. +fix_code: + - language: yaml + label: 'Fix: add vulnerability-alerts: read permission to the job' + code: | + jobs: + check-dependabot: + runs-on: ubuntu-latest + permissions: + contents: read + vulnerability-alerts: read # required for Dependabot alerts API + steps: + - uses: actions/github-script@v7 + with: + script: | + const alerts = await github.rest.dependabot.listAlertsForRepo({ + owner: context.repo.owner, + repo: context.repo.repo, + state: 'open' + }); + console.log(`Open Dependabot alerts: ${alerts.data.length}`); + - language: yaml + label: 'Workflow-level permissions block (applies to all jobs)' + code: | + name: Security Check + on: + schedule: + - cron: '0 8 * * 1' + + permissions: + contents: read + vulnerability-alerts: read # new scope — NOT included in security-events + + jobs: + audit: + runs-on: ubuntu-latest + steps: + - name: List open Dependabot alerts + env: + GH_TOKEN: ${{ github.token }} + run: | + gh api repos/${{ github.repository }}/dependabot/alerts \ + --jq '.[] | select(.state=="open") | .security_advisory.summary' +prevention: + - 'Do not confuse security-events (code scanning) with vulnerability-alerts (Dependabot) — they are separate scopes' + - 'Always run workflows with minimum-privilege 
permissions; add vulnerability-alerts: read only when accessing Dependabot endpoints' + - 'Check the GitHub Docs permissions reference before calling any REST API endpoint from a workflow' + - 'Use the gh cli (gh api repos/.../dependabot/alerts) with GH_TOKEN to test permissions locally' +docs: + - url: 'https://github.com/orgs/community/discussions/60612' + label: 'GitHub Community #60612: Can''t access Dependabot alerts API via default token (19 upvotes)' + - url: 'https://docs.github.com/en/actions/security-for-github-actions/security-guides/automatic-token-authentication#permissions-for-the-github_token' + label: 'GitHub Docs: GITHUB_TOKEN permissions reference' + - url: 'https://docs.github.com/en/rest/dependabot/alerts' + label: 'GitHub REST API: Dependabot alerts' + - url: 'https://github.com/github/gh-aw/pull/27668' + label: 'gh-aw PR #27668: Add vulnerability-alerts as GITHUB_TOKEN permission scope (Apr 2026)' \ No newline at end of file diff --git a/errors/runner-environment/macos-15-arm64-python-asyncio-multiprocessing-22x-slowdown-image-20260527.yml b/errors/runner-environment/macos-15-arm64-python-asyncio-multiprocessing-22x-slowdown-image-20260527.yml new file mode 100644 index 0000000..3d3bb33 --- /dev/null +++ b/errors/runner-environment/macos-15-arm64-python-asyncio-multiprocessing-22x-slowdown-image-20260527.yml @@ -0,0 +1,82 @@ +id: runner-environment-428 +title: 'macOS-15 arm64 Python asyncio + multiprocessing 22x Performance Regression Since Image 20260527.0100.1' +category: runner-environment +severity: error +tags: + - macos-15 + - arm64 + - python + - asyncio + - multiprocessing + - performance + - regression + - runner-images +patterns: + - regex: 'macos-15.*arm64.*slow|asyncio.*macos.*regression|macos.*asyncio.*slow|python.*multiprocessing.*macos.*stall' + flags: 'i' +error_messages: + - 'macos-15-arm64 20260527.0100.1' + - 'Test execution time increased from 65s to 1411s' + - 'asyncio multiprocessing spawn subprocess 22x slowdown' 
+root_cause: | + The macos-15-arm64 runner image version 20260527.0100.1 (released May 27, 2026) + introduced a severe performance regression in Python asyncio workloads that spawn + subprocess or multiprocessing child processes. Test suites using asyncio with + per-test subprocess isolation see a ~22x slowdown: a suite completing in 65s on + 20260520.0085.1 takes ~1411s (about 23.5 minutes) on 20260527.0100.1. + + The regression affects only 20260527.0100.1+ on macos-15-arm64. The install + phase is unaffected (15-16s unchanged); the slowdown is entirely in test + execution. The likely root cause is a change in the sandboxing, security policy, + or process spawning path for arm64 macOS that increases per-subprocess launch + overhead. The exact changed component (Rosetta, SIP policy, XPC, launchd change) + had not been identified at the time of filing. + + Source: actions/runner-images#14181 (open June 3, 2026). +fix: | + No complete fix is available while the root cause is under investigation. + + 1. Pin to the prior image version (20260520.0085.1) if your runner setup supports + image pinning; standard GitHub-hosted runner labels generally do not allow selecting an image version. + 2. Track actions/runner-images#14181 for a patched image release. + 3. Reduce per-test subprocess spawning: batch tests into fewer subprocess + invocations, or use in-process test isolation rather than spawning a new + Python interpreter per test. + 4. Use `macos-14` instead of `macos-15`/`macos-latest` temporarily. +fix_code: + - language: yaml + label: 'Temporary workaround: fall back to macos-14 runner' + code: | + jobs: + test: + # Temporarily use macos-14 until runner-images#14181 is resolved + runs-on: macos-14 # was: macos-latest or macos-15 (the macos-15-arm64 image) + steps: + - uses: actions/checkout@v4 + - uses: actions/setup-python@v6 + with: + python-version: '3.13' + - run: pip install -e . 
+ - run: pytest + - language: yaml + label: 'Matrix that includes fallback for debugging' + code: | + jobs: + test: + strategy: + matrix: + os: [macos-14, macos-15] # compare side by side; macos-15 is the runs-on label for the macos-15-arm64 image + runs-on: ${{ matrix.os }} + steps: + - uses: actions/checkout@v4 + - run: python -m pytest +prevention: + - 'Pin to specific macOS runner image versions during stability-sensitive periods' + - 'Monitor actions/runner-images issues for performance regressions before migrating to new image versions' + - 'Add test duration assertions to CI to catch slowdowns automatically' + - 'Prefer in-process test isolation over per-test subprocess spawning for large test suites' +docs: + - url: 'https://github.com/actions/runner-images/issues/14181' + label: 'runner-images#14181: macos-15-arm64 22x performance regression since 20260527.0100.1' + - url: 'https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources' + label: 'GitHub Docs: Supported runners and hardware resources' \ No newline at end of file diff --git a/errors/runner-environment/setup-dotnet-v530-global-json-short-version-rollforward-typeerror.yml b/errors/runner-environment/setup-dotnet-v530-global-json-short-version-rollforward-typeerror.yml new file mode 100644 index 0000000..527a911 --- /dev/null +++ b/errors/runner-environment/setup-dotnet-v530-global-json-short-version-rollforward-typeerror.yml @@ -0,0 +1,88 @@ +id: runner-environment-429 +title: 'setup-dotnet v5.3.0 Crashes With TypeError on global.json Using Short SDK Version (e.g. 
"8.0")' +category: runner-environment +severity: error +tags: + - setup-dotnet + - global-json + - rollForward + - dotnet + - version + - TypeError +patterns: + - regex: 'Cannot read properties of undefined.*substring|setup-dotnet.*global.*json.*TypeError|global\.json.*rollForward.*undefined' + flags: 'i' + - regex: "setup-dotnet.*5\\.3\\.0.*global|global-json.*8\\.0.*undefined" + flags: 'i' +error_messages: + - 'Error: Cannot read properties of undefined (reading ''substring'')' + - 'Error: Cannot read properties of undefined (reading ''substring'') at getVersionFromGlobalJson' +root_cause: | + actions/setup-dotnet v5.3.0 (released May 2026) added support for global.json + `rollForward` policies (latestFeature, latestMinor, latestMajor, etc.) via PR #538. + The implementation parses the version string by splitting on "." and reading + the feature-band component via `.substring()`. When `global.json` specifies + a short SDK version like `"8.0"` (missing the feature-band/patch component, + e.g. `8.0.100`), the feature-band part is `undefined`, and calling + `.substring()` on it throws: + + TypeError: Cannot read properties of undefined (reading 'substring') + + This was not an error in v5.2.x or earlier because those versions did not + parse the rollForward policy and did not access feature-band components. + + The global.json `sdk.version` field officially requires a fully qualified SDK + version (e.g. `8.0.100`), but many repositories historically used short + versions like `"8.0"` that worked by accident with older action versions. + + Source: actions/setup-dotnet#739 (open May 28, 2026). +fix: | + Update `global.json` to use a fully-qualified SDK version including the + feature-band and patch components: + + Wrong: `"version": "8.0"` + Right: `"version": "8.0.100"` (or the exact SDK version you want) + + You can find available SDK versions at https://dotnet.microsoft.com/download/dotnet + or by running `dotnet --list-sdks` locally. 
+ + Alternatively, pin setup-dotnet to v5.2.0 until a v5.3.x release improves + the error message for invalid version strings. +fix_code: + - language: yaml + label: 'Fixed: global.json with fully-qualified SDK version' + code: | + # global.json — BEFORE (breaks with setup-dotnet v5.3.0+) + # { + # "sdk": { + # "version": "8.0", + # "rollForward": "latestFeature" + # } + # } + + # global.json — AFTER (correct fully-qualified version) + # { + # "sdk": { + # "version": "8.0.100", + # "rollForward": "latestFeature" + # } + # } + - language: yaml + label: 'Temporary pin to setup-dotnet v5.2.0 while updating global.json' + code: | + - name: Setup .NET + uses: actions/setup-dotnet@v5.2.0 # pin until global.json is fixed + with: + global-json-file: global.json +prevention: + - 'Always use fully-qualified .NET SDK versions in global.json (e.g. "8.0.100" not "8.0")' + - 'Validate global.json against the official spec before upgrading setup-dotnet' + - 'Pin action versions in production workflows and review release notes before upgrading' + - 'Use "dotnet --list-sdks" to confirm the exact version string to put in global.json' +docs: + - url: 'https://github.com/actions/setup-dotnet/issues/739' + label: 'setup-dotnet#739: 5.3.0 breaks global.json with latestFeature' + - url: 'https://github.com/actions/setup-dotnet/pull/538' + label: 'setup-dotnet PR #538: Support global.json rollForward latest* variants' + - url: 'https://learn.microsoft.com/en-us/dotnet/core/tools/global-json#version' + label: 'Microsoft Docs: global.json version field spec' \ No newline at end of file diff --git a/errors/runner-environment/windows-11-arm64-az-module-12-to-15-three-major-version-jump.yml b/errors/runner-environment/windows-11-arm64-az-module-12-to-15-three-major-version-jump.yml new file mode 100644 index 0000000..0c5f39f --- /dev/null +++ b/errors/runner-environment/windows-11-arm64-az-module-12-to-15-three-major-version-jump.yml @@ -0,0 +1,96 @@ +id: runner-environment-430 +title: 
'windows-11-arm64 Image 20260608 — Az PowerShell Module Jumps Three Major Versions (12.5.0 → 15.6.1)' +category: runner-environment +severity: warning +tags: + - windows-11-arm64 + - az-module + - powershell + - breaking-change + - major-version-bump + - runner-images +patterns: + - regex: 'Az.*12\..*to.*15\.|Az.*15\..*windows.*arm|windows-11-arm.*Az.*major' + flags: 'i' + - regex: 'The term.*is not recognized|Az\..*not.*loaded|Import-Module.*Az\.' + flags: 'i' +error_messages: + - 'Az 12.5.0 → 15.6.1 (windows-11-arm64 image 20260608.69.1)' + - 'BreakingChangeAttributeException' + - 'The term ''*-Az*'' is not recognized as a name of a cmdlet' +root_cause: | + The windows-11-arm64 runner image updated from version 20260525.56.1 to + 20260608.69.1 (released June 8, 2026), jumping the Az PowerShell module from + 12.5.0 directly to 15.6.1 — a span of three major versions. + + All other runners (ubuntu-*, windows-2022, windows-2025, macos-*) went from + Az 14.6.0 → 15.6.1 (one major version bump). Only windows-11-arm64 had this + anomalous three-major-version skip, because the prior Az 13.x and 14.x + versions were never shipped to that runner image, leaving a gap that + accumulated three cycles of breaking changes. + + Breaking changes accumulated across Az 13, 14, and 15 include: + - Az.Compute: renamed parameters in Set-AzVMDiskEncryptionExtension + - Az.Network: removed deprecated Load Balancer SKU aliases + - Az.Storage: changed default TLS version for storage accounts + - Az.Accounts: updated Connect-AzAccount interactive flow + - Az.Resources: renamed New-AzResourceGroupDeployment parameters + + Workflows that passed on other runners (which had incremental upgrades) may + fail immediately on windows-11-arm64 due to the compounded breaking changes. + + Source: actions/runner-images#14207 (June 9, 2026). +fix: | + 1. 
**Pin the Az module version** at the start of your job to avoid the + system-installed version: + Install-Module -Name Az -RequiredVersion 14.6.0 -Force -AllowClobber -Scope CurrentUser + 2. **Test your workflow on windows-11-arm64 specifically**: don't assume + it passes because it passes on windows-latest or ubuntu-latest. + 3. **Update Az usage** to the new cmdlet names and parameters per the + Az 13, 14, and 15 migration guides. + 4. **Switch to a different runner** (windows-2025/windows-latest) for Az-heavy + workflows if arm64 is not required, as those runners had gradual upgrades. +fix_code: + - language: yaml + label: 'Workaround: pin Az module version to avoid three-major-jump' + code: | + jobs: + deploy: + runs-on: windows-11-arm # runs-on label for the windows-11-arm64 image + steps: + - name: Pin Az module version + shell: pwsh + run: | + # Pin to the Az version other runners shipped before the update (this image previously had 12.5.0) + Install-Module -Name Az -RequiredVersion 14.6.0 ` + -Force -AllowClobber -Scope CurrentUser -Repository PSGallery + Import-Module Az -RequiredVersion 14.6.0 + + - name: Deploy + shell: pwsh + run: | + Connect-AzAccount -Identity # assumes a managed identity is available; otherwise authenticate another way + # ... 
your Az commands + - language: yaml + label: 'Alternative: use windows-latest instead of windows-11-arm64 for Az workflows' + code: | + jobs: + deploy: + # Use windows-latest (gradual Az upgrades) instead of windows-11-arm64 + runs-on: windows-latest # was: windows-11-arm (the windows-11-arm64 image) + steps: + - uses: azure/login@v2 # needs auth inputs (e.g. client-id, tenant-id, subscription-id) plus enable-AzPSSession: true for Az cmdlets + - run: Get-AzResourceGroup + shell: pwsh +prevention: + - 'Always specify exact Az module versions in workflows rather than relying on runner-baked versions' + - 'Test on windows-11-arm64 separately; it may lag behind other runners on tool versions' + - 'Subscribe to runner-images release notes (GitHub RSS or Watch) to track breaking tool updates' + - 'Use Install-Module with -RequiredVersion in CI to decouple from image tool versions' +docs: + - url: 'https://github.com/actions/runner-images/issues/14207' + label: 'runner-images#14207: Windows Desktop 11 Arm64 20260608 Image Update' + - url: 'https://learn.microsoft.com/en-us/powershell/azure/migrate-az-14.0.0' + label: 'Microsoft Docs: Migration guide from Az 13.x to Az 14.x' + - url: 'https://learn.microsoft.com/en-us/powershell/azure/migrate-az-15.0.0' + label: 'Microsoft Docs: Migration guide from Az 14.x to Az 15.x' \ No newline at end of file