This repository is the AG-Claw research workspace. It combines a clean-room backend seed, a temporary web shell, planning documents, and a non-authoritative reference runtime tree used for behavior study and migration analysis.
- `backend/` is the clean-room implementation surface.
- `web/` is the temporary AG-Claw operator shell.
- root `src/` is a reference artifact for study and parity analysis, not the product foundation.
- `mcp-server/` is a research utility for browsing the reference source tree safely.
Read these first before extending the system:
- `docs/agclaw-clean-room-boundary.md`
- `docs/agclaw-capability-gap-matrix.md`
- `docs/agclaw-local-runbook.md`
- `docs/agclaw-mobile-audit.md`
- `docs/agclaw-subsystem-migration-matrix.md`
- `docs/agclaw-replacement-backlog.md`
- `docs/agclaw-naming-inventory.md`
- `docs/agclaw-pretext-spike.md`
- `docs/agclaw-excluded-references.md`
- `docs/agclaw-next-slice-status.md`
- `docs/agclaw-vision-runbook.md`
- `backend/`: clean-room HTTP services for chat, orchestration, provider health, and MES research flows
- `web/`: Next.js UI for local testing and operator workflows
- `mcp-server/`: MCP explorer for the reference `src/` tree
- `promptfoo/`: prompt and evaluation harness
These integrations are live today when you run the local stack:
- `@chenglou/pretext` is installed in `web/` and drives measured chat input sizing, preview truncation helpers, and virtualized chat message height estimation.
- `promptfoo` is installed in `promptfoo/` and wired for routed local multimodal gate runs.
- `impeccable` is installed in `web/` as a local UI audit tool with repo presets; it is not part of the production runtime path.
- The clean-room backend now exposes hybrid MES retrieval scoring with lexical, metadata, and TF-IDF vector signals.
- `agency-agents` ideas are surfaced as selectable agent packs in the settings UI, buddy flows, research orchestration, and persisted artifacts.
- `OpenViking` ideas are surfaced as memory namespaces and commit modes in the settings UI, orchestration metadata, and investigation bundles.
- `MiroFish` ideas are surfaced as staged workflow modes in the research workbench and orchestration outputs.
- `nanochat` ideas are surfaced as the `nano-chat` pack, nano briefs, and carry-forward bundle summaries.
- The chat shell now switches to mobile-specific navigation, composer, and file-viewer overlays on small screens.
- `heretic` remains intentionally excluded.
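The hybrid MES retrieval scoring mentioned above combines lexical overlap, metadata, and TF-IDF vector similarity. The sketch below is a minimal illustration of that kind of blend; the weights, signal definitions, and function names are illustrative assumptions and do not reproduce the backend's actual scorer:

```python
import math
from collections import Counter

def tfidf_vectors(token_lists):
    """Plain TF-IDF vectors over a small corpus of token lists."""
    n = len(token_lists)
    df = Counter()
    for toks in token_lists:
        df.update(set(toks))
    vecs = []
    for toks in token_lists:
        tf = Counter(toks)
        vecs.append({t: (tf[t] / len(toks)) * math.log((1 + n) / (1 + df[t]))
                     for t in tf})
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse dict vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_scores(query_tokens, doc_token_lists, metadata_boosts,
                  weights=(0.5, 0.3, 0.2)):
    """Blend lexical overlap, TF-IDF cosine, and a per-doc metadata boost.

    The 0.5/0.3/0.2 weights are illustrative, not the backend's values.
    """
    vecs = tfidf_vectors(list(doc_token_lists) + [query_tokens])
    qvec, dvecs = vecs[-1], vecs[:-1]
    qset = set(query_tokens)
    w_lex, w_vec, w_meta = weights
    return [w_lex * (len(qset & set(toks)) / max(len(qset), 1))
            + w_vec * cosine(qvec, dvec)
            + w_meta * boost
            for toks, dvec, boost in zip(doc_token_lists, dvecs, metadata_boosts)]
```

With this blend, a document sharing query terms outranks one sharing none even when both carry a zero metadata boost, which is the general shape of a hybrid lexical-plus-vector ranker.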
One command for the full local stack, including the routed vision runtime on 127.0.0.1:11500:
```powershell
Set-Location (git rev-parse --show-toplevel)
.\scripts\start-agclaw-local.ps1 -EnableRoutedVision -StartLiteLLM
```

Add `-PullVisionModels` the first time on a new machine to fetch qwen2.5vl:7b and gemma3:4b. Add `-RunVisionGate` if you want the promptfoo routed multimodal gate to execute after the stack comes up.
Use two terminals if you want the real clean-room backend behind the temporary web shell.
Terminal A:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location $repoRoot
$env:PYTHONPATH = (Resolve-Path ./backend)
python -m agclaw_backend.server --host 127.0.0.1 --port 8008
```

Terminal B:

```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
npm install
$env:AGCLAW_BACKEND_URL = "http://127.0.0.1:8008"
$env:AGCLAW_WEB_ROOT = ".."
npm run dev
```

Then open http://127.0.0.1:3000.
For Android over Wi-Fi, bind the web server to the LAN interface instead:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
npm install
$env:AGCLAW_BACKEND_URL = "http://127.0.0.1:8008"
$env:AGCLAW_WEB_ROOT = ".."
npm run dev:lan
```

Then open http://<your-pc-lan-ip>:3000 on the Android device.
For Android over USB with lower latency, keep the backend and web UI running locally and set up adb reverse:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
npm run android:usb:setup
```

Then open http://127.0.0.1:3000 in Chrome on the Android device. The helper also reverses ports 8008, 3100, and 11500 for backend, mock-stack, and vision-runtime access.
If you only want a fast mock-backed browser demo, skip the backend terminal and run:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
npm install
node .\scripts\start-playwright-stack.mjs 8108
```

That serves the UI at http://127.0.0.1:3100.
- Open http://127.0.0.1:3000 after the stack is up.
- Go to `Settings -> Integrations` to toggle Pretext measurement and choose the active agent pack, memory namespace, commit mode, and workflow mode.
- Open the buddy panel to see pack-aware prompt suggestions and active memory/workflow context.
- Open `Research tools -> Orchestrate` to see the resolved orchestration route, active pack metadata, current investigation bundle, persisted bundles, and bundle export/share actions.
- Use the chat surface to exercise the Pretext-backed composer sizing and virtualized message rendering.
Full local stack with routed vision and LiteLLM:
```powershell
Set-Location (git rev-parse --show-toplevel)
.\scripts\start-agclaw-local.ps1 -EnableRoutedVision -StartLiteLLM
```

First-time full stack with model pulls and the promptfoo multimodal gate:

```powershell
Set-Location (git rev-parse --show-toplevel)
.\scripts\start-agclaw-local.ps1 -EnableRoutedVision -PullVisionModels -StartLiteLLM -RunVisionGate
```

Mock-backed browser demo:

```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
node .\scripts\start-playwright-stack.mjs 8108
```

Web verification:

```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
npm run type-check
npm run e2e
```

Backend verification:

```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location $repoRoot
$env:PYTHONPATH = (Resolve-Path ./backend)
python -m unittest discover -s backend/tests
```

Promptfoo and UI audit helpers:

```powershell
Set-Location (git rev-parse --show-toplevel)
bun run promptfoo:latest
bun run audit:web:common
bun run audit:web:chat-shell
bun run audit:web:buddy
bun run audit:web:settings
bun run audit:web:url:home
```

PowerShell:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location $repoRoot
$env:PYTHONPATH = (Resolve-Path ./backend)
python -m agclaw_backend.server --host 127.0.0.1 --port 8008
```

Health check:

```powershell
Invoke-WebRequest http://127.0.0.1:8008/health | Select-Object -Expand Content
```

Key endpoints:

- `GET /health`
- `GET /api/provider-health`
- `POST /api/chat`
- `POST /api/orchestrate`
- `GET /api/orchestration/history`
- `POST /api/mes/retrieve`
- `POST /api/mes/log-slim`
- `POST /api/mes/interpret-screen`
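As a hedged illustration of calling these endpoints programmatically, the sketch below builds a POST request for `/api/mes/retrieve` with Python's standard library. The body field names (`query`, `top_k`) are assumptions for illustration, not the backend's documented schema:

```python
import json
import urllib.request

BASE = "http://127.0.0.1:8008"

def build_mes_retrieve(query: str, top_k: int = 5) -> urllib.request.Request:
    """Build a POST request for /api/mes/retrieve.

    The payload field names here are illustrative assumptions; check the
    backend README for the actual schema.
    """
    return urllib.request.Request(
        BASE + "/api/mes/retrieve",
        data=json.dumps({"query": query, "top_k": top_k}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it requires the backend above to be running:
#   with urllib.request.urlopen(build_mes_retrieve("provider health")) as resp:
#       print(json.load(resp))
```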
PowerShell:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
npm install
$env:AGCLAW_BACKEND_URL = "http://127.0.0.1:8008"
$env:AGCLAW_WEB_ROOT = ".."
npm run dev
```

Open http://127.0.0.1:3000.
For Android devices on the same Wi-Fi network, run:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
$env:AGCLAW_BACKEND_URL = "http://127.0.0.1:8008"
$env:AGCLAW_WEB_ROOT = ".."
npm run dev:lan
```

For Android over USB, connect the phone, confirm it appears in `adb devices`, and run:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
npm run android:usb:setup
```

After that, open http://127.0.0.1:3000 on the device. If you want desktop-side inspection, use chrome://inspect/#devices in desktop Chrome.
If you want a quick mock-backed stack for browser testing, the Playwright launcher will start the backend in mock mode automatically:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
npm install
node .\scripts\start-playwright-stack.mjs 8108
```

That serves the UI at http://127.0.0.1:3100.
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
npm install
npm run e2e
```

The Playwright configuration builds the web app and launches the local mock backend automatically.
Optional Selenium smoke tests against a running UI:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
$env:AGCLAW_UI_URL = "http://127.0.0.1:3000"
npm run selenium:smoke
npm run selenium:smoke:mobile
```

Optional Appium smoke test against a USB-connected Android device running Chrome:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
$env:AGCLAW_UI_URL = "http://127.0.0.1:3000"
$env:APPIUM_SERVER_URL = "http://127.0.0.1:4723"
npm run appium:android:smoke
```

That Appium flow expects a running Appium server and an authorized Android device. Set `ANDROID_UDID` if more than one device is connected.
The same Appium script also works with an Android emulator if it appears in adb devices and has Chrome available.
Optional BrowserStack cloud run using the same mobile smoke flow:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
$env:BROWSERSTACK_USERNAME = "<username>"
$env:BROWSERSTACK_ACCESS_KEY = "<access-key>"
$env:BROWSERSTACK_LOCAL = "true"
$env:AGCLAW_UI_URL = "http://bs-local.com:3000"
npm run browserstack:mobile:smoke
```

If the app is already publicly reachable, set `BROWSERSTACK_LOCAL=false` and point `AGCLAW_UI_URL` at that public URL instead.
For a local Android emulator workflow, start an emulator from Android Studio or the emulator CLI, confirm it appears in adb devices, run an Appium server, and then use the same script:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
$env:AGCLAW_UI_URL = "http://10.0.2.2:3000"
$env:APPIUM_SERVER_URL = "http://127.0.0.1:4723"
npm run appium:android:smoke
```

Use 10.0.2.2 from the emulator when you are not using adb reverse; keep 127.0.0.1 for USB devices after `npm run android:usb:setup`.
The MCP explorer is for research against the reference src/ tree. It is not part of the clean-room runtime.
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "mcp-server")
npm install
npm run build
$env:AGCLAW_REFERENCE_SRC_ROOT = (Resolve-Path ../src)
node .\dist\src\index.js
```

The explorer also accepts the legacy `CLAUDE_CODE_SRC_ROOT` variable for compatibility, but new setups should use `AGCLAW_REFERENCE_SRC_ROOT`.
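Most MCP clients register a stdio server with a command/args/env entry. A sketch of such an entry for this explorer follows; the server name is arbitrary, `<repo-root>` is a placeholder you must substitute, and the exact config schema depends on your MCP client:

```json
{
  "mcpServers": {
    "agclaw-reference-explorer": {
      "command": "node",
      "args": ["<repo-root>/mcp-server/dist/src/index.js"],
      "env": {
        "AGCLAW_REFERENCE_SRC_ROOT": "<repo-root>/src"
      }
    }
  }
}
```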
Backend:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location $repoRoot
$env:PYTHONPATH = (Resolve-Path ./backend)
python -m unittest discover -s backend/tests
```

Web:

```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "web")
npm run build
```

MCP explorer:

```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "mcp-server")
npm run build
```

If you want one control plane in front of multiple models, point the UI at LiteLLM through the existing openai-compatible provider.
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location $repoRoot
.\scripts\start-litellm.ps1
```

Then in the AG-Claw settings UI:
- Provider: `openai-compatible`
- API URL: `http://127.0.0.1:4000`
- API key: `agclaw-dev-key` by default, or your LiteLLM bearer token if you changed it
That same gateway URL also works with the local benchmark script below.
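Requests to that gateway use the standard OpenAI-compatible chat-completions shape. A minimal body, shown here in Python for clarity (the prompt text is illustrative; the model alias comes from the starter config):

```python
import json

# Minimal OpenAI-compatible body for POST http://127.0.0.1:4000/v1/chat/completions.
# Send it with an "Authorization: Bearer agclaw-dev-key" header (the default key above).
body = {
    "model": "qwen2.5:3b",  # alias exposed by the LiteLLM starter config
    "messages": [{"role": "user", "content": "Reply with a one-word greeting."}],
    "max_tokens": 16,
}
payload = json.dumps(body)
```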
The starter config lives at litellm/agclaw-config.local.yaml and exposes qwen2.5:3b, gemma3:1b, and qwen2.5vl:3b through one OpenAI-compatible endpoint.
For the routed multimodal local workflow, the same config also exposes:
- `vision-caption-local` -> `qwen2.5vl:7b` on `127.0.0.1:11500`
- `vision-hmi-local` -> `qwen2.5vl:7b` on `127.0.0.1:11500`
- `vision-ocr-local` -> `gemma3:4b` on `127.0.0.1:11500`
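For orientation, LiteLLM aliases of this kind are declared in a `model_list`. This sketch follows LiteLLM's documented config shape but is illustrative, not a copy of the repo's actual `litellm/agclaw-config.local.yaml`:

```yaml
model_list:
  - model_name: vision-caption-local      # alias exposed on the gateway
    litellm_params:
      model: ollama/qwen2.5vl:7b          # provider-prefixed model id
      api_base: http://127.0.0.1:11500    # routed vision runtime
  - model_name: vision-ocr-local
    litellm_params:
      model: ollama/gemma3:4b
      api_base: http://127.0.0.1:11500
```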
The PowerShell launchers resolve litellm.exe from the repo .venv first, then fall back to PATH.
Useful local reference-tool commands from the repo root:
```powershell
bun run promptfoo:latest
bun run audit:web-ui -- --help
bun run audit:web:common
bun run audit:web:chat-shell
bun run audit:web:buddy
bun run audit:web:settings
```

The first checks the currently published promptfoo version with `npm view`, which is the practical non-interactive equivalent of confirming what `npx promptfoo@latest` will pull on this machine. The Impeccable commands run targeted audits for the AG-Claw chat shell, buddy surfaces, and settings surfaces without adding anything to the runtime path.
If the web server is already running locally, you can also audit the live home page with:
```powershell
bun run audit:web:url:home
```

That live URL audit uses a repo-owned Puppeteer wrapper around Impeccable's browser detector so it works on Windows paths; it expects a local Chrome or Edge install.
Reference-integration design spikes are checked in for future subsystem work:
- `docs/agclaw-agency-agents-spike.md`
- `docs/agclaw-openviking-spike.md`
- `docs/agclaw-mirofish-spike.md`
The promptfoo pack now includes an allowlisted Hugging Face ingestion path for evaluation assets.
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location (Join-Path $repoRoot "promptfoo")
npm install
npm run import:hf-assets -- --dataset rico-screen2words --limit 20
```

Imported samples are written under `promptfoo/cases/hf/` and include governance metadata from the allowlist manifest.
If Hugging Face traffic is intercepted by a corporate proxy, set `AGCLAW_HF_CA_FILE` to the proxy PEM bundle before running the importer. Use `AGCLAW_HF_ALLOW_INSECURE_TLS=1` only as a temporary fallback.
Build and run the multimodal promptfoo packs after importing both rico-screen2words and ocr-vqa samples:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location $repoRoot
.\scripts\start-agclaw-local.ps1 -EnableRoutedVision -PullVisionModels -RunVisionGate
```

If you want the manual path instead, point caption and HMI to qwen2.5vl:7b on 127.0.0.1:11500, and OCR to gemma3:4b on the same port.
To compare the small local assistant defaults from this slice:
```powershell
$repoRoot = git rev-parse --show-toplevel
Set-Location $repoRoot
node .\scripts\benchmark-local-assistants.mjs --models qwen2.5:3b,gemma3:1b
```

Use `--api-base http://127.0.0.1:4000/v1/chat/completions` to run the same pass through LiteLLM instead of Ollama.
- `backend/README.md` for backend endpoint and benchmark details
- `mcp-server/README.md` for MCP explorer usage
- `promptfoo/README.md` for the eval harness and routed multimodal gate
- `docs/agclaw-local-runbook.md` for the consolidated local startup path
- `docs/agclaw-pretext-spike.md` for the scoped Pretext integration note
- `docs/agclaw-vision-runbook.md` for screen interpretation validation
- `docs/agclaw-excluded-references.md` for what is mapped versus intentionally excluded
- `docs/agclaw-next-slice-status.md` for the current merged-state snapshot
- `docs/repo-status.md` for current migration status
No new AG-Claw operator-facing surface should introduce Claude-branded product naming. Upstream provider identifiers such as `anthropic` or model ids such as `claude-sonnet-*` remain acceptable only where they describe compatibility with external APIs.