Skip to content

Monthly Pathfinder gap-analysis automation#96

Open
jpr5 wants to merge 3 commits into
mainfrom
blitz/pathfinder-gaps/integration-automation
Open

Monthly Pathfinder gap-analysis automation#96
jpr5 wants to merge 3 commits into
mainfrom
blitz/pathfinder-gaps/integration-automation

Conversation

@jpr5
Copy link
Copy Markdown
Contributor

@jpr5 jpr5 commented Jun 6, 2026

Summary

  • Adds a monthly GitHub Actions workflow that pulls Pathfinder analytics, clusters unanswered / empty-result queries into actionable documentation gaps (deterministic clustering + optional LLM classification), and reports high-severity gaps to Slack with a Notion report.
  • Durable month-over-month state via workflow artifacts (not cache, which evicts at 7 days idle) so the high-severity dedup survives the monthly cadence; persist-before-alert ordering prevents repeat-alert storms.

Required configuration (before enabling)

  • New repo secrets: PATHFINDER_ANALYTICS_TOKEN (analytics API bearer), ANTHROPIC_API_KEY (optional — LLM classification; falls back to deterministic clustering if unset), NOTION_TOKEN (report publishing).
  • Reuses the existing SLACK_WEBHOOK_OSS_ALERTS org secret.
  • Workflow grants actions: read (cross-run artifact download for the prior-state baseline).

Notes

  • scripts/gap-analysis/ is excluded from the npm tarball (.npmignore); @anthropic-ai/sdk is a devDependency (only this CI script uses it, dynamically imported).
  • Typechecked via tsconfig.scripts.json; 93 unit tests cover clustering, LLM-output recovery, dedup/cap, the dry-run contract, and prompt-injection escaping of untrusted query text.
  • Deferred (non-blocking) robustness follow-ups tracked for a follow-up PR: cold-start state-chain log nuance, additional LLM-output recovery edge-cases, Notion >100-block batching, CJK query normalization.

Test plan

  • CI green (typecheck-scripts, prettier, vitest).
  • Provision the 3 secrets, then a manual workflow_dispatch dry-run to verify analytics fetch + report generation without alerting.

jpr5 added 3 commits June 6, 2026 13:34
…ctionable doc gaps

Analyzes the trailing 30 days of analytics queries, clusters low-confidence
and unanswered queries by topic, and uses the Anthropic SDK to summarize each
cluster into a named knowledge gap. Hardened against malformed LLM output:
recovers object-wrapped and multi-array gap shapes order-independently, caps
and de-duplicates same-key gaps so alerts stay bounded, escapes untrusted
query text in the prompt, persists gap state before alerting, and scopes the
state lineage to the default scheduled branch with broken-chain detection.
Includes the cluster helper and full unit-test coverage.
Adds the monthly-gap-analysis GitHub Action (30-day lookback, analytics-API
based) that runs the tool on a cron and alerts on detected gaps. Extends the
static-quality workflow to prettier-check the gap-analysis scripts and adds a
typecheck-scripts job, since the root tsconfig excludes scripts/ and never
type-checked the shipped scheduled scripts.
Adds tsconfig.scripts.json, which type-checks scripts/gap-analysis (and its
tests) without emitting, since the root tsconfig is scoped to src/. Adds the
Anthropic SDK as a devDependency (it is only used by the scheduled script, not
the published runtime) and excludes all tsconfig*.json from the npm tarball so
the new scripts config does not ship.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant