
feat(layer): add Vercel markdown-rewrite module #32

Merged
amondnet merged 4 commits into main from worktree-agent-a1a3a89e87faf9e6d
May 28, 2026


amondnet (Contributor) commented May 28, 2026

Summary

Adds packages/layer/modules/markdown-rewrite.ts — a build-time Nuxt module that injects rewrite rules into Vercel's build-output config.json so AI agents get raw markdown instead of the SPA shell.

Ports docus upstream commits:

  • 6fd8686b feat(llms): redirect homepage to /llms.txt
  • 9ceafe6f feat(llms): add docs page redirection to raw markdown for agents

See docs/docus-upstream-changes.md item #9.

Behaviour

  • No-op on every non-Vercel preset. The check is preset.startsWith('vercel'), so it covers vercel, vercel-edge, vercel-static, etc.
  • On Vercel: read <output.publicDir>/../config.json (the Vercel build-output config), confirm llms.txt was emitted, then unshift route pairs onto routes so they fire before the SPA fallback.
  • Rules emitted when the request carries Accept: text/markdown or User-Agent: curl/*:
    • ^/$ → /llms.txt
    • ^/<locale>/?$ → /llms.txt (one per runtimeConfig.public.i18n.locales entry)
    • ^<page>$ → /raw<page>.md (one per /raw/...md link discovered in llms.txt)
  • Vercel's has array is AND-ed, so OR semantics between the Accept and User-Agent matchers require emitting two rule entries per src → dest pair.
  • Locale codes are regex-escaped before being joined into the alternation, so an exotic code can't break the pattern.
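
The rule-pair and escaping behaviour above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the type and function names (`Route`, `routePair`, `localeHomeSrc`, `escapeRegex`) are hypothetical, and joining all locale codes into a single alternation is an assumption. The route shape mirrors the JSON shown under Verification.

```typescript
// Hypothetical types; field names mirror the config.json output in this PR.
interface HeaderHas { type: 'header', key: string, value: string }
interface Route {
  src: string
  dest: string
  headers: Record<string, string>
  has: HeaderHas[]
  continue: boolean
}

// Entries inside one `has` array are AND-ed, so each matcher gets its own rule.
const AGENT_MATCHERS: HeaderHas[] = [
  { type: 'header', key: 'accept', value: '(.*)text/markdown(.*)' },
  { type: 'header', key: 'user-agent', value: 'curl/.*' },
]

// Escape regex metacharacters so a locale code is matched literally.
function escapeRegex(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Emit two rules per src -> dest pair: one for Accept, one for User-Agent.
function routePair(src: string, dest: string): Route[] {
  return AGENT_MATCHERS.map(matcher => ({
    src,
    dest,
    headers: { 'content-type': 'text/markdown; charset=utf-8' },
    has: [matcher],
    continue: true,
  }))
}

// Assumption: all locale codes are joined into one alternation pattern.
function localeHomeSrc(codes: string[]): string {
  return `^/(${codes.map(escapeRegex).join('|')})/?$`
}
```

Under these assumptions, `routePair('^/$', '/llms.txt')` yields the two homepage entries, and `localeHomeSrc(['en', 'ko'])` yields `^/(en|ko)/?$`.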

Conventions

  • Module style follows the existing packages/layer/modules/{config,shadcn}.ts (defineNuxtModule + named module).
  • TypeScript only; an inline VercelBuildOutputConfig / VercelRoute / VercelHeaderHas interface describes the parts we touch — no any.
  • Module is registered in packages/layer/nuxt.config.ts immediately after ./modules/shadcn.
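
As a rough illustration of the inline interfaces and of placing the new rules ahead of the existing ones, consider the sketch below. The field lists are assumptions covering only what the output in this PR shows, and `prependRoutes` is a hypothetical helper name, not the module's real function.

```typescript
// Hypothetical shapes: only the parts of config.json this PR touches are typed.
interface VercelHeaderHas { type: 'header', key: string, value: string }

interface VercelRoute {
  src?: string
  dest?: string
  headers?: Record<string, string>
  has?: VercelHeaderHas[]
  continue?: boolean
  [key: string]: unknown // other route fields pass through untouched
}

interface VercelBuildOutputConfig {
  version?: number
  routes?: VercelRoute[]
  [key: string]: unknown
}

// Put the injected rules first so they are evaluated before the SPA fallback.
function prependRoutes(
  config: VercelBuildOutputConfig,
  injected: VercelRoute[],
): VercelBuildOutputConfig {
  return { ...config, routes: [...injected, ...(config.routes ?? [])] }
}
```

The PR describes an in-place `unshift`; this sketch returns a copy instead, purely to keep the example side-effect free.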

Verification

bun install
cd packages/layer && bun typecheck      # no errors in markdown-rewrite.ts
cd ../.. && bun lint                     # clean

# Vercel build
NITRO_PRESET=vercel bun --filter @pleaseai/docs-site build
jq '.routes[] | select(.has // empty)' apps/docs/.vercel/output/config.json

Output (homepage rules only, because the current nuxt-llms config emits links to canonical /docs/... URLs rather than /raw/...md — the per-page rules will start being emitted as soon as llms.txt carries raw-md links):

{
  "src": "^/$",
  "dest": "/llms.txt",
  "headers": { "content-type": "text/markdown; charset=utf-8" },
  "has": [{ "type": "header", "key": "accept", "value": "(.*)text/markdown(.*)" }],
  "continue": true
}
{
  "src": "^/$",
  "dest": "/llms.txt",
  "headers": { "content-type": "text/markdown; charset=utf-8" },
  "has": [{ "type": "header", "key": "user-agent", "value": "curl/.*" }],
  "continue": true
}

Default (cloudflare) build is unaffected — the module bails silently and dist/ is produced as before.

Notes

  • Per-page rules require llms.txt to enumerate /raw/...md URLs. The current site config doesn't, so only the homepage routes are emitted today. This matches the upstream behaviour and avoids accidentally rewriting asset URLs. Wiring nuxt-llms to also emit raw-md URLs is tracked separately.
  • Runtime e2e is Vercel-only (build-output rewrites apply at the edge), so verification here is limited to inspecting the generated config.json.
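
To make the first note concrete, here is a minimal sketch of how `/raw/...md` links might be discovered in `llms.txt`. It assumes links use Markdown `[title](url)` syntax; `extractRawMarkdownPaths` and the placeholder base URL are hypothetical and not taken from the module.

```typescript
// Collect unique pathnames of Markdown links that point at /raw/...md files.
function extractRawMarkdownPaths(llmsTxt: string): string[] {
  const linkTarget = /\]\(([^)\s]+)\)/g
  const found = new Set<string>()
  let match: RegExpExecArray | null
  while ((match = linkTarget.exec(llmsTxt)) !== null) {
    const target = match[1]
    if (!target)
      continue
    let pathname: string
    try {
      // The base is only used to resolve site-relative links.
      pathname = new URL(target, 'https://placeholder.invalid').pathname
    }
    catch {
      continue // skip targets that are not parseable URLs
    }
    if (pathname.startsWith('/raw/') && pathname.endsWith('.md'))
      found.add(pathname)
  }
  return Array.from(found)
}
```

With an `llms.txt` that only links to canonical `/docs/...` URLs, this returns an empty array, which is consistent with only the homepage rules being emitted today.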

Follow-up to #27.

Adds a Nuxt module that injects Vercel build-output rewrite rules so
that AI agents (anything sending `Accept: text/markdown` or
`User-Agent: curl/*`) get served raw markdown instead of the SPA shell.

Behaviour:
- No-op on every non-Vercel preset (matches `vercel`, `vercel-edge`,
  `vercel-static`, etc. via `preset.startsWith('vercel')`).
- On Vercel: read `<output.publicDir>/../config.json`, validate that
  `llms.txt` was emitted, and unshift route pairs onto `routes` so
  they fire before the SPA fallback.
- Rules:
  - `/`         -> `/llms.txt`
  - `/<locale>` -> `/llms.txt`   (per `runtimeConfig.public.i18n.locales`)
  - `/<page>`   -> `/raw<page>.md` for every `/raw/...md` link found
    in `llms.txt`
- Vercel's `has` array is AND-ed, so OR semantics between the
  `Accept` and `User-Agent` matchers require emitting two rules per
  `src` -> `dest` pair.
- Locale codes are regex-escaped before being joined into the
  alternation so exotic codes can't break the pattern.

Ports upstream docus commits `6fd8686b` and `9ceafe6f` -- see
`docs/docus-upstream-changes.md` item #9.

Verification:
- `bun lint`        -> clean
- `bun typecheck`   -> no new errors in `markdown-rewrite.ts`
- `NITRO_PRESET=vercel bun --filter @pleaseai/docs-site build`
  injects two routes (homepage / Accept + User-Agent) into
  `.vercel/output/config.json`.
- Default (cloudflare) build is unchanged; module bails silently.

gemini-code-assist (Bot) left a comment


Code Review

This pull request introduces a new Nuxt module, markdown-rewrite.ts, which automatically injects Vercel rewrite routes to serve raw markdown files (such as llms.txt and raw documentation pages) to AI agents requesting markdown content via specific headers or user agents. The feedback suggests improving the robustness of these routes by decoding URL-encoded characters in the paths using decodeURIComponent and allowing optional trailing slashes in the regex patterns for docs pages.

Comment thread packages/layer/modules/markdown-rewrite.ts
Comment thread packages/layer/modules/markdown-rewrite.ts Outdated

cubic-dev-ai (Bot) left a comment


2 issues found across 2 files

Architecture diagram
sequenceDiagram
    participant Nuxt as Nuxt Build Process
    participant MDModule as markdown-rewrite Module
    participant FS as File System
    participant Vercel as Vercel Build Output Config

    Note over Nuxt,Vercel: Build-time Module Flow (Vercel Preset Only)

    Nuxt->>MDModule: nitro:init hook
    MDModule->>MDModule: Check preset startsWith('vercel')
    alt Non-Vercel preset
        MDModule-->>Nuxt: No-op (return early)
    else Vercel preset
        MDModule->>FS: Read <publicDir>/../config.json
        alt Read fails
            FS-->>MDModule: Error
            MDModule->>MDModule: log.warn (skip rewrites)
        else Parse fails
            MDModule->>MDModule: log.warn (skip rewrites)
        else Success
            FS-->>MDModule: raw JSON
            MDModule->>FS: Read <publicDir>/llms.txt
            alt llms.txt missing
                FS-->>MDModule: Error
                MDModule->>MDModule: log.warn (skip rewrites)
            else llms.txt found
                FS-->>MDModule: llms.txt content
                MDModule->>MDModule: Build route definitions
                Note over MDModule: For each src→dest pair,<br/>create 2 routes (Accept AND curl)
                MDModule->>MDModule: Rule 1: ^/$ → /llms.txt
                MDModule->>MDModule: Rule 2: ^/(locale)/?$ → /llms.txt (per locale)
                MDModule->>MDModule: Rule 3: ^<page>$ → /raw<page>.md (from llms.txt links)
                MDModule->>MDModule: Escape regex in locale codes & page paths
                MDModule->>Vercel: unshift routes into vcConfig.routes
                alt Success
                    Vercel-->>MDModule: Updated config
                    MDModule->>FS: Write config.json
                    FS-->>MDModule: Done
                    MDModule->>MDModule: log.info (injected N routes)
                end
            end
        end
    end
    MDModule-->>Nuxt: Module setup complete


Comment thread packages/layer/modules/markdown-rewrite.ts Outdated
Comment thread packages/layer/modules/markdown-rewrite.ts Outdated

cloudflare-workers-and-pages Bot commented May 28, 2026

Deploying docs-please with Cloudflare Pages

Latest commit: c2d844b
Status:⚡️  Build in progress...


@amondnet amondnet self-assigned this May 28, 2026
amondnet added 3 commits May 29, 2026 00:48
Apply review feedback from cubic-dev-ai and gemini-code-assist on
the markdown-rewrite module:

- Read locale config from `nuxt.options.i18n` instead of
  `runtimeConfig.public.i18n` (matching the pattern used by the
  `nitro:config` hook in nuxt.config.ts). The i18n module does
  not always populate `runtimeConfig.public.i18n`, so the previous
  source could silently miss locale routes. (cubic)
- Decode the URL pathname with `decodeURIComponent` so that paths
  with URL-encoded characters (e.g. `%20`) match Vercel's router,
  which compares against decoded request pathnames. (gemini)
- Allow an optional trailing slash on docs page route patterns
  (`/?$` instead of `$`) so requests like
  `/en/getting-started/installation/` are matched consistently
  with the per-locale homepage routes. (cubic, gemini)
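
A rough sketch of how the second and third points might combine when turning a `/raw/...md` path into a rule. `pageRule` and `escapeRegex` are hypothetical names, and leaving `dest` exactly as it appears in `llms.txt` is an assumption.

```typescript
// Escape regex metacharacters so the page path is matched literally.
function escapeRegex(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// '/raw/en/my%20page.md' -> src '^/en/my page/?$', dest '/raw/en/my%20page.md'
function pageRule(rawPath: string): { src: string, dest: string } {
  const page = decodeURIComponent(rawPath)
    .replace(/^\/raw/, '') // drop the /raw prefix
    .replace(/\.md$/, '') // drop the .md extension
  // `/?$` tolerates an optional trailing slash on the request path.
  return { src: `^${escapeRegex(page)}/?$`, dest: rawPath }
}
```

For example, `pageRule('/raw/en/getting-started/installation.md').src` is `^/en/getting-started/installation/?$`, which matches the path both with and without the trailing slash.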

Resolve modules-array conflict in packages/layer/nuxt.config.ts by
keeping both new module registrations from #29/#30/#31 and this PR's
markdown-rewrite module.

Resolve modules-array conflict in packages/layer/nuxt.config.ts (#33
landed @nuxtjs/mcp-toolkit; keep ./modules/markdown-rewrite from this
branch alongside it).
@amondnet amondnet merged commit 7074801 into main May 28, 2026
1 of 2 checks passed
@amondnet amondnet deleted the worktree-agent-a1a3a89e87faf9e6d branch May 28, 2026 16:05
@sonarqubecloud

Quality Gate failed

Failed conditions
1 Security Hotspot

See analysis details on SonarQube Cloud
