Merged
Original file line number Diff line number Diff line change
@@ -0,0 +1,118 @@
id: known-unsolved-114
title: 'npm Registry Intermittent 429/403 Rate-Limit Errors During GitHub Actions CI (E429 / E403)'
category: known-unsolved
severity: warning
tags:
- npm
- registry
- rate-limit
- 429
- E429
- E403
- intermittent
- network
patterns:
- regex: 'npm error code E429|npm error 429 Too Many Requests.*registry\.npmjs\.org|E429.*registry\.npmjs|registry\.npmjs.*429'
flags: 'i'
- regex: 'npm error code E403.*registry\.npmjs\.org|npm.*E403.*registry\.npmjs'
flags: 'i'
- regex: 'npm error 429.*npmjs|npmjs\.org.*429.*Too Many Requests'
flags: 'i'
error_messages:
- 'npm error code E429'
- 'npm error 429 Too Many Requests - GET https://registry.npmjs.org/'
- 'npm error code E403'
- 'npm error 403 Forbidden - GET https://registry.npmjs.org/'
root_cause: |
GitHub-hosted runners share egress IP addresses through NAT pools. When many
concurrent workflows from different users run npm installs through the same
egress IP, the npm registry (registry.npmjs.org) rate-limits that IP and returns
HTTP 429 Too Many Requests or 403 Forbidden. This is npm infrastructure
policy, not a GitHub Actions bug.

The issue manifests intermittently and typically correlates with:
- npm registry incident windows (see https://status.npmjs.org/)
- High concurrency on GitHub-hosted runners sharing IP pools
- Periods when npm temporarily tightens rate limits

The August 2025 outage (documented in nodejs/node#59620 with 20 reactions)
showed that major npm registry incidents can affect GitHub Actions runners
broadly. Similar E429/E403 spikes appear in GitHub Community discussion
#198588 (June 10, 2026) where multiple users reported simultaneous failures.

npm does not expose per-runner or per-organization quotas, and GitHub has
no control over npm registry rate limits. The only reliable workarounds are
retry logic, caching, or using an alternative registry mirror.

Source: nodejs/node#59620 (20 reactions, Aug 2025); GitHub Community #198588 (Jun 2026).
fix: |
No permanent fix is available: the rate limit is imposed by the npm registry, which is outside GitHub's control.

Workarounds (in order of effectiveness):
1. **Cache the npm download cache (~/.npm) with actions/cache or setup-node's `cache: npm`**: reduces
registry hits on repeat runs. Caching node_modules does not help `npm ci`, which
removes node_modules before installing.
2. **Add retries and `--prefer-offline`**: lets npm use packages already in its
cache without revalidating them against the registry.
3. **Use an npm mirror or Verdaccio proxy**: route installs through a self-hosted
proxy that caches packages.
4. **Switch to pnpm or yarn with a lock file**: their retry and caching behavior
differs from npm's and may tolerate transient errors better; verify this for your setup.
5. **Re-run the failed job**: the rate-limit window is typically short (seconds
to minutes).
6. **Monitor the npm status page**: check https://status.npmjs.org/ when CI
failures are widespread.
fix_code:
- language: yaml
label: 'Cache npm downloads and retry installs to reduce rate-limit exposure'
code: |
jobs:
  install:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js with npm cache
        uses: actions/setup-node@v5
        with:
          node-version: '20'
          cache: 'npm' # caches ~/.npm; avoids re-downloading on cache hit

      - name: Install with retry
        run: |
          # Retry up to 3 times with a growing delay; fail the step if every attempt fails.
          for i in 1 2 3; do
            npm ci && exit 0
            echo "npm ci failed (attempt $i of 3)"
            if [ "$i" -lt 3 ]; then sleep $((i * 15)); fi
          done
          exit 1
- language: yaml
label: 'Use .npmrc to configure retry and offline mode fallback'
code: |
# .npmrc in repo root
# fetch-retries=5
# fetch-retry-mintimeout=20000
# fetch-retry-maxtimeout=120000
# ; set prefer-offline to true after the first install to use the cache
# prefer-offline=false

jobs:
  install:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v5
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci --prefer-offline
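# The mirror/proxy workaround listed under `fix` can be sketched as follows.
# This is a minimal illustration, not a tested setup: the URL
# https://npm.internal.example.com/ is a hypothetical placeholder for your own
# caching proxy (for example a Verdaccio instance), and NPM_CONFIG_REGISTRY is
# the environment-variable form of npm's `registry` config key.
- language: yaml
label: 'Sketch: route installs through a caching registry proxy (placeholder URL)'
code: |
jobs:
  install:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v5
        with:
          node-version: '20'
          cache: 'npm'
      - name: Install via proxy
        env:
          # Hypothetical proxy URL; replace with your own mirror.
          NPM_CONFIG_REGISTRY: https://npm.internal.example.com/
        run: npm ci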
prevention:
- 'Always cache the npm download cache (~/.npm) in CI to reduce registry dependency; npm ci discards any restored node_modules'
- 'Monitor https://status.npmjs.org/ during widespread CI failures before investigating code'
- 'Implement retry loops for npm install steps — transient 429s typically resolve within seconds'
- 'Consider GitHub Packages or a self-hosted Verdaccio registry as a stable npm mirror'
- 'Use lock files (package-lock.json / yarn.lock / pnpm-lock.yaml) to ensure deterministic installs'
docs:
- url: 'https://github.com/nodejs/node/issues/59620'
label: 'nodejs/node#59620: GitHub Actions failing due to npm 403/429 responses (20 reactions, Aug 2025)'
- url: 'https://github.com/orgs/community/discussions/198588'
label: 'GitHub Community #198588: Frequent NPM Network Errors with GitHub Actions (Jun 2026)'
- url: 'https://status.npmjs.org/'
label: 'npm registry status page'
- url: 'https://docs.npmjs.com/cli/v10/using-npm/config#prefer-offline'
label: 'npm docs: prefer-offline configuration'
@@ -0,0 +1,103 @@
id: permissions-auth-110
title: 'GITHUB_TOKEN Missing vulnerability-alerts Permission Returns 403 on Dependabot Alerts API'
category: permissions-auth
severity: error
tags:
- GITHUB_TOKEN
- dependabot
- vulnerability-alerts
- permissions
- 403
- security-events
- dependabot-api
patterns:
- regex: 'vulnerability-alerts.*read|dependabot.*alerts.*403|403.*dependabot.*alerts|vulnerability.alerts.*permission'
flags: 'i'
- regex: 'Resource not accessible by integration.*dependabot|HttpError.*403.*dependabot.*alerts'
flags: 'i'
error_messages:
- 'HttpError: Resource not accessible by integration'
- '403 - {"message":"Resource not accessible by integration","documentation_url":"https://docs.github.com/rest/dependabot/alerts"}'
- 'RequestError: Resource not accessible by integration'
root_cause: |
The GitHub REST API endpoint GET /repos/{owner}/{repo}/dependabot/alerts
(and related Dependabot alerts endpoints) requires the `vulnerability-alerts`
permission scope on the GITHUB_TOKEN. This scope was historically only
available as a GitHub App permission; as of April 2026, it is also available
as a native GITHUB_TOKEN workflow permission.

Workflows that call the Dependabot alerts API (directly, via Octokit, or via
actions/github-script) without explicitly granting `vulnerability-alerts: read`
receive a 403 "Resource not accessible by integration" response, even when
the token has `contents: read` or `security-events: read`; those scopes
do not cover Dependabot alert data.

The GitHub Community discussion #60612 (19 upvotes) highlights this as a
common pain point: developers assume `security-events` covers Dependabot
alerts, but it does not. The `security-events` scope covers code scanning
results; `vulnerability-alerts` is a separate scope for Dependabot.

Source: GitHub Community #60612; gh-aw PR #27668 (Apr 2026).
fix: |
Add `vulnerability-alerts: read` to the `permissions:` block of your workflow
or job. This scope was added to GITHUB_TOKEN in April 2026.

Note: If your repository uses the default GITHUB_TOKEN permissions ("read
all, write none" or "write all"), you may still need an explicit permissions
block since `vulnerability-alerts` is not included in legacy shorthand configs.
fix_code:
- language: yaml
label: 'Fix: add vulnerability-alerts: read permission to the job'
code: |
jobs:
  check-dependabot:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      vulnerability-alerts: read # required for Dependabot alerts API
    steps:
      - uses: actions/github-script@v7
        with:
          script: |
            // Paginate so the count is not capped at the first page of results.
            const alerts = await github.paginate(github.rest.dependabot.listAlertsForRepo, {
              owner: context.repo.owner,
              repo: context.repo.repo,
              state: 'open'
            });
            console.log(`Open Dependabot alerts: ${alerts.length}`);
- language: yaml
label: 'Workflow-level permissions block (applies to all jobs)'
code: |
name: Security Check
on:
  schedule:
    - cron: '0 8 * * 1'

permissions:
  contents: read
  vulnerability-alerts: read # new scope; NOT included in security-events

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - name: List open Dependabot alerts
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          gh api repos/${{ github.repository }}/dependabot/alerts \
            --jq '.[] | select(.state=="open") | .security_advisory.summary'
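# A hedged sketch of a preflight check for the missing-permission case described
# in `root_cause`. It assumes the `github`, `context`, and `core` objects that
# actions/github-script provides to inline scripts; the job name and the failure
# message are illustrative, not taken from the cited discussions.
- language: yaml
label: 'Sketch: fail with an actionable message when the token lacks vulnerability-alerts'
code: |
jobs:
  preflight:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      vulnerability-alerts: read
    steps:
      - uses: actions/github-script@v7
        with:
          script: |
            try {
              // Request a single alert just to confirm the token can read the endpoint.
              await github.rest.dependabot.listAlertsForRepo({
                owner: context.repo.owner,
                repo: context.repo.repo,
                per_page: 1
              });
              core.info('Token can read Dependabot alerts.');
            } catch (error) {
              if (error.status === 403) {
                core.setFailed('403 from the Dependabot alerts API: add "vulnerability-alerts: read" to permissions.');
              } else {
                throw error;
              }
            }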
prevention:
- 'Do not confuse security-events (code scanning) with vulnerability-alerts (Dependabot) — they are separate scopes'
- 'Always run workflows with minimum-privilege permissions; add vulnerability-alerts: read only when accessing Dependabot endpoints'
- 'Check the GitHub Docs permissions reference before calling any REST API endpoint from a workflow'
- 'Use the gh CLI (gh api repos/.../dependabot/alerts) with GH_TOKEN to test permissions locally'
docs:
- url: 'https://github.com/orgs/community/discussions/60612'
label: 'GitHub Community #60612: Can''t access Dependabot alerts API via default token (19 upvotes)'
- url: 'https://docs.github.com/en/actions/security-for-github-actions/security-guides/automatic-token-authentication#permissions-for-the-github_token'
label: 'GitHub Docs: GITHUB_TOKEN permissions reference'
- url: 'https://docs.github.com/en/rest/dependabot/alerts'
label: 'GitHub REST API: Dependabot alerts'
- url: 'https://github.com/github/gh-aw/pull/27668'
label: 'gh-aw PR #27668: Add vulnerability-alerts as GITHUB_TOKEN permission scope (Apr 2026)'
@@ -0,0 +1,82 @@
id: runner-environment-428
title: 'macOS-15 arm64 Python asyncio + multiprocessing 22x Performance Regression Since Image 20260527.0100.1'
category: runner-environment
severity: error
tags:
- macos-15
- arm64
- python
- asyncio
- multiprocessing
- performance
- regression
- runner-images
patterns:
- regex: 'macos-15.*arm64.*slow|asyncio.*macos.*regression|macos.*asyncio.*slow|python.*multiprocessing.*macos.*stall'
flags: 'i'
error_messages:
- 'macos-15-arm64 20260527.0100.1'
- 'Test execution time increased from 65s to 1411s'
- 'asyncio multiprocessing spawn subprocess 22x slowdown'
root_cause: |
The macos-15-arm64 runner image version 20260527.0100.1 (released May 27, 2026)
introduced a severe performance regression in Python asyncio workloads that spawn
child processes through subprocess or multiprocessing. Test suites using asyncio with
per-test subprocess isolation see a ~22x slowdown: a suite completing in 65s on
20260520.0085.1 takes ~1411s (about 23.5 minutes) on 20260527.0100.1.

The regression affects only 20260527.0100.1+ on macos-15-arm64. The install
phase is unaffected (15-16s unchanged); the slowdown is entirely in test
execution. The likely root cause is a change in the sandboxing, security policy,
or process spawning path for arm64 macOS that increases per-subprocess launch
overhead. The exact changed component (Rosetta, SIP policy, XPC, or launchd)
had not been identified at time of filing.

Source: actions/runner-images#14181 (open June 3, 2026).
fix: |
No complete fix available while the root cause is under investigation.

1. Fall back to a `macos-14` runner instead of `macos-15`/`macos-latest` until a
patched image ships. Standard GitHub-hosted labels resolve to whichever image
version GitHub is currently rolling out, so an earlier image version cannot be
selected through `runs-on`.
2. Track actions/runner-images#14181 for a patched image release.
3. Reduce per-test subprocess spawning: batch tests into fewer subprocess
invocations, or use in-process test isolation rather than spawning a new
Python interpreter per test.
fix_code:
- language: yaml
label: 'Temporary workaround: fall back to macos-14 runner'
code: |
jobs:
  test:
    # Temporarily use macos-14 until runner-images#14181 is resolved
    runs-on: macos-14 # was: macos-latest or macos-15 (the macos-15-arm64 image)
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v6
        with:
          python-version: '3.13'
      - run: pip install -e .
      - run: pytest
- language: yaml
label: 'Matrix that includes fallback for debugging'
code: |
jobs:
  test:
    strategy:
      matrix:
        # compare side by side; the macos-15 label runs the macos-15-arm64 image
        os: [macos-14, macos-15]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - run: python -m pytest
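# A minimal sketch of a duration guard that surfaces a slowdown like this one
# as a fast failure instead of a long-running job. It assumes the suite normally
# finishes in about a minute (the 65s baseline reported above); the 10-minute
# budget and the step name are illustrative choices, not values from the issue.
- language: yaml
label: 'Sketch: fail fast when the test step runs far longer than its usual duration'
code: |
jobs:
  test:
    runs-on: macos-15
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v6
        with:
          python-version: '3.13'
      - run: pip install -e .
      - name: Run tests with a duration budget
        # Cancels the step (and fails the job) once the budget is exceeded.
        timeout-minutes: 10
        run: pytest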
prevention:
- 'Pin to an explicit macOS runner label (e.g. macos-14) rather than macos-latest during stability-sensitive periods'
- 'Monitor actions/runner-images issues for performance regressions before migrating to new image versions'
- 'Add test duration assertions to CI to catch slowdowns automatically'
- 'Prefer in-process test isolation over per-test subprocess spawning for large test suites'
docs:
- url: 'https://github.com/actions/runner-images/issues/14181'
label: 'runner-images#14181: macos-15-arm64 22x performance regression since 20260527.0100.1'
- url: 'https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources'
label: 'GitHub Docs: Supported runners and hardware resources'
@@ -0,0 +1,88 @@
id: runner-environment-429
title: 'setup-dotnet v5.3.0 Crashes With TypeError on global.json Using Short SDK Version (e.g. "8.0")'
category: runner-environment
severity: error
tags:
- setup-dotnet
- global-json
- rollForward
- dotnet
- version
- TypeError
patterns:
- regex: 'Cannot read properties of undefined.*substring|setup-dotnet.*global.*json.*TypeError|global\.json.*rollForward.*undefined'
flags: 'i'
- regex: "setup-dotnet.*5\\.3\\.0.*global|global-json.*8\\.0.*undefined"
flags: 'i'
error_messages:
- 'Error: Cannot read properties of undefined (reading ''substring'')'
- 'Error: Cannot read properties of undefined (reading ''substring'') at getVersionFromGlobalJson'
root_cause: |
actions/setup-dotnet v5.3.0 (released May 2026) added support for global.json
`rollForward` policies (latestFeature, latestMinor, latestMajor, etc.) via PR #538.
The implementation parses the version string by splitting on "." and reading
the feature-band component via `.substring()`. When `global.json` specifies
a short SDK version like `"8.0"` (missing the feature-band/patch component,
e.g. `8.0.100`), the feature-band part is `undefined`, and calling
`.substring()` on it throws:

TypeError: Cannot read properties of undefined (reading 'substring')

This was not an error in v5.2.x or earlier because those versions did not
parse the rollForward policy and did not access feature-band components.

The global.json `sdk.version` field officially requires a fully qualified SDK
version (e.g. `8.0.100`), but many repositories historically used short
versions like `"8.0"` that worked by accident with older action versions.

Source: actions/setup-dotnet#739 (open May 28, 2026).
fix: |
Update `global.json` to use a fully-qualified SDK version including the
feature-band and patch components:

Wrong: `"version": "8.0"`
Right: `"version": "8.0.100"` (or the exact SDK version you want)

You can find available SDK versions at https://dotnet.microsoft.com/download/dotnet
or by running `dotnet --list-sdks` locally.

Alternatively, pin setup-dotnet to v5.2.0 until a v5.3.x release improves
the error message for invalid version strings.
fix_code:
- language: yaml
label: 'Fixed: global.json with fully-qualified SDK version'
code: |
# global.json: BEFORE (breaks with setup-dotnet v5.3.0+)
# {
#   "sdk": {
#     "version": "8.0",
#     "rollForward": "latestFeature"
#   }
# }

# global.json: AFTER (correct fully-qualified version)
# {
#   "sdk": {
#     "version": "8.0.100",
#     "rollForward": "latestFeature"
#   }
# }
- language: yaml
label: 'Temporary pin to setup-dotnet v5.2.0 while updating global.json'
code: |
- name: Setup .NET
  uses: actions/setup-dotnet@v5.2.0 # pin until global.json is fixed
  with:
    global-json-file: global.json
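# A minimal sketch of a guard step that rejects short SDK versions before
# setup-dotnet runs. Assumptions: global.json sits in the repository root and is
# plain JSON without comments (jq cannot parse comments), and the runner
# provides jq and bash, as GitHub-hosted Ubuntu runners do. The step name and
# error text are illustrative.
- language: yaml
label: 'Sketch: validate sdk.version in global.json before running setup-dotnet'
code: |
- name: Check global.json SDK version
  shell: bash
  run: |
    version="$(jq -r '.sdk.version // empty' global.json)"
    # Accept only major.minor.patch (e.g. 8.0.100); skip when no version is set.
    if [[ -n "$version" && ! "$version" =~ ^[0-9]+\.[0-9]+\.[0-9]+ ]]; then
      echo "::error file=global.json::sdk.version '$version' is not a full SDK version such as 8.0.100"
      exit 1
    fi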
prevention:
- 'Always use fully-qualified .NET SDK versions in global.json (e.g. "8.0.100" not "8.0")'
- 'Validate global.json against the official spec before upgrading setup-dotnet'
- 'Pin action versions in production workflows and review release notes before upgrading'
- 'Use "dotnet --list-sdks" to confirm the exact version string to put in global.json'
docs:
- url: 'https://github.com/actions/setup-dotnet/issues/739'
label: 'setup-dotnet#739: 5.3.0 breaks global.json with latestFeature'
- url: 'https://github.com/actions/setup-dotnet/pull/538'
label: 'setup-dotnet PR #538: Support global.json rollForward latest* variants'
- url: 'https://learn.microsoft.com/en-us/dotnet/core/tools/global-json#version'
label: 'Microsoft Docs: global.json version field spec'