Real-time AI observability for coding agents.
AIr monitors your AI coding sessions — context windows, tool calls, token costs, latency, output quality, prompt effectiveness, and drift detection — streaming everything to a live dashboard with built-in data redaction.
npm install @hydrotik/air
This gives you:
- air CLI — starts the server + dashboard (single process)
- SDK — instrument MCP servers, RAG pipelines, and custom tools
- Event types — TypeScript definitions for all telemetry events
Want to wire your RAG pipeline, MCP server, or custom tools into AIr?
📖 Integration Guide — everything you need:
- Copy-paste prompt for AI coding agents (Claude Code, Codex, ChatGPT) to auto-instrument your services
- HTTP API reference — just POST JSON, no SDK required
- Language examples — JavaScript, Python, Go, and cURL
- Step-by-step walkthrough — config file → verify → instrument → dashboard
Most integrations take under 5 minutes. The only required field is source.
Standalone (installed via npm):
npx @hydrotik/air # default port 5200
npx @hydrotik/air --port 8080 # custom port
Monorepo (development):
pnpm turbo run dev --filter=@hydrotik/air
Background / detached (cross-platform — macOS, Linux, Windows):
# Start as a detached background process (survives terminal closure)
pnpm --filter @hydrotik/air start:detached
# Custom port
pnpm --filter @hydrotik/air start:detached -- --port 8080
# Stop the background server
pnpm --filter @hydrotik/air stop
⚠ Troubleshooting: Server won't start from AI agents / CI tools
AI coding agents (pi, Cursor, VS Code tasks) run commands in a process group that gets cleaned up when the shell exits. A simple node cli.js & will die when the parent tool finishes. Use start:detached instead: it spawns the server as a fully detached process with child.unref(), writes a PID file to $TMPDIR/air-server.pid, and exits cleanly. Works on all platforms.
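The detached start described above can be sketched with Node's child_process module. This is a minimal illustration, not the package's actual start:detached script: the helper names pidFilePath and startDetached are hypothetical, and os.tmpdir() stands in for $TMPDIR.

```typescript
import { spawn } from "node:child_process";
import { writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: location of the PID file (the README names
// $TMPDIR/air-server.pid; os.tmpdir() is used as a portable stand-in).
export function pidFilePath(dir: string = tmpdir()): string {
  return join(dir, "air-server.pid");
}

// Hypothetical helper: run `node <script> [...args]` so that it outlives
// the process that launched it.
export function startDetached(
  script: string,
  args: string[] = [],
  dir: string = tmpdir(),
): number | undefined {
  const child = spawn(process.execPath, [script, ...args], {
    detached: true,  // child gets its own process group / session
    stdio: "ignore", // do not hold the parent's stdio pipes open
  });
  child.unref();     // allow the parent to exit without waiting for the child
  if (child.pid !== undefined) {
    writeFileSync(pidFilePath(dir), String(child.pid));
  }
  return child.pid;
}
```

A matching stop command would read the PID file and call process.kill with that PID. Without both detached: true and unref(), the child stays tied to the parent's lifetime, which is the failure mode described above.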
Opens:
- Dashboard → http://localhost:5200 (production) or http://localhost:5201 (dev)
- API → http://localhost:5200/api/health
AIr works with multiple AI coding agents. Pick your setup:
cd .pi/extensions/ai-rum-collector && npm install
Then /reload in pi. The extension auto-discovers and streams tool calls, turns, token usage, context breakdown, and compaction events via WebSocket. See Pi collector docs.
One command installs the hooks:
npx air-install-claude-code
This copies two hook scripts into .claude/hooks/ and wires them into .claude/settings.json:
.claude/hooks/
├── air-session-start.js ← SessionStart hook
└── air-post-tool-use.js ← PostToolUse hook
What happens automatically:
- When Claude Code starts a session, air-session-start.js checks if the AIr server is running and starts it if needed
- Every tool call triggers air-post-tool-use.js, which POSTs the event to AIr via HTTP
- Session IDs are persisted to a temp file so events from the same session are correlated
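Because each hook runs as a separate short-lived process, correlation depends on the persisted session ID. The sketch below shows one way such persistence could work. The function name loadOrCreateSessionId and the JSON shape { sessionId } are assumptions for illustration, not the shipped hook code.

```typescript
import { randomUUID } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical default location, modeled on $TMPDIR/air-claude-code/.
export const defaultSessionDir = join(tmpdir(), "air-claude-code");

// Return the stored session ID if one exists, otherwise create and persist one.
export function loadOrCreateSessionId(dir: string = defaultSessionDir): string {
  const file = join(dir, "session.json");
  if (existsSync(file)) {
    try {
      const parsed = JSON.parse(readFileSync(file, "utf8")) as { sessionId?: unknown };
      if (typeof parsed.sessionId === "string" && parsed.sessionId.length > 0) {
        return parsed.sessionId; // reuse: later hooks join the same session
      }
    } catch {
      // Corrupt or partially written file: fall through and create a new ID.
    }
  }
  const sessionId = randomUUID();
  mkdirSync(dir, { recursive: true });
  writeFileSync(file, JSON.stringify({ sessionId }));
  return sessionId;
}
```

In this sketch a SessionStart hook would overwrite the file with a fresh ID, and every PostToolUse hook would only read it.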
Manual install (if you prefer not to use the installer):
- Copy session-start.js and post-tool-use.js from src/collectors/claude-code/ into .claude/hooks/
- Add to .claude/settings.json:
{
"hooks": {
"SessionStart": [{
"hooks": [{
"type": "command",
"command": "node .claude/hooks/air-session-start.js"
}]
}],
"PostToolUse": [{
"hooks": [{
"type": "command",
"command": "node .claude/hooks/air-post-tool-use.js"
}]
}]
}
}
Configuration:
| Variable | Default | Description |
|---|---|---|
| AIR_URL | http://localhost:5200 | AIr server HTTP endpoint |
| AIR_PORT | 5200 | Server port (used by auto-start) |
| AIR_ENABLED | true | Set to "false" to disable collection |
| AIR_AUTOSTART | true | Set to "false" to skip auto-starting the server |
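As an illustration of how a collector might resolve these variables, the following sketch applies the defaults from the table. The function readCollectorConfig and the CollectorConfig shape are hypothetical and are not part of the published package.

```typescript
// Hypothetical shape for the resolved settings.
export interface CollectorConfig {
  url: string;
  port: number;
  enabled: boolean;
  autostart: boolean;
}

// Resolve settings from an environment-like object, using the documented defaults.
export function readCollectorConfig(
  env: Record<string, string | undefined>,
): CollectorConfig {
  const port = Number.parseInt(env.AIR_PORT ?? "", 10);
  return {
    url: env.AIR_URL ?? "http://localhost:5200",
    port: Number.isInteger(port) && port > 0 ? port : 5200,
    // Only the literal string "false" disables a feature, as in the table.
    enabled: env.AIR_ENABLED !== "false",
    autostart: env.AIR_AUTOSTART !== "false",
  };
}
```

Calling readCollectorConfig(process.env) in a hook script would yield the effective settings for that run.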
What it collects vs Pi:
| Feature | Pi | Claude Code |
|---|---|---|
| Tool call events | ✅ With precise timing | ✅ No duration (PostToolUse only) |
| Token usage per turn | ✅ From API response | ❌ Not exposed in hooks |
| Context breakdown | ✅ Full treemap | |
| Compaction events | ✅ Direct hook | ❌ Not exposed |
| Connection | WebSocket (persistent) | HTTP POST (per-event) |
See Claude Code collector docs.
Run the watcher alongside Codex — it tails session files in real time:
# Terminal 1: Start the AIr server
npx air
# Terminal 2: Start the Codex watcher
npx air-codex-watcher
The watcher monitors ~/.codex/sessions/ for new and updated .jsonl files, maps Codex events to AIr telemetry, and POSTs them to the server.
Options:
# Watch all sessions (live + new)
npx air-codex-watcher
# Watch a specific session ID
npx air-codex-watcher --session 019c7e7f
# Replay a past session into AIr (backfill)
npx air-codex-watcher --replay ~/.codex/sessions/2026/02/20/rollout-2026-02-20T23-39-53-019c7e7f.jsonl
What it collects:
| Codex Event | AIr Event | Data |
|---|---|---|
| session_meta | session_start | session_id, model, cwd |
| event_msg:task_started | turn_start | turn index |
| event_msg:task_complete | turn_end | tool call count |
| event_msg:token_count | token_usage | input/output tokens |
| response_item:function_call | tool_call_start | tool name, call_id, input preview |
| response_item:function_call_output | tool_call_end | call_id, output, duration, errors |
| response_item:custom_tool_call | tool_call_start/end | apply_patch, etc. |
| compacted | compaction | summary length |
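The mapping in this table can be expressed as a simple lookup. The sketch below assumes each JSONL entry has a top-level type and, for event_msg and response_item entries, a nested payload.type. That entry shape and the names CodexEntry and mapCodexEntry are assumptions made for illustration. The real watcher may parse entries differently.

```typescript
type AirEventType =
  | "session_start" | "turn_start" | "turn_end" | "token_usage"
  | "tool_call_start" | "tool_call_end" | "compaction";

// Assumed entry shape: a top-level type plus an optional nested payload type.
interface CodexEntry {
  type: string;
  payload?: { type?: string };
}

// Lookup table transcribed from the mapping above.
const CODEX_TO_AIR: Record<string, AirEventType[]> = {
  "session_meta": ["session_start"],
  "event_msg:task_started": ["turn_start"],
  "event_msg:task_complete": ["turn_end"],
  "event_msg:token_count": ["token_usage"],
  "response_item:function_call": ["tool_call_start"],
  "response_item:function_call_output": ["tool_call_end"],
  "response_item:custom_tool_call": ["tool_call_start", "tool_call_end"],
  "compacted": ["compaction"],
};

// Map one Codex entry to zero or more AIr event types.
export function mapCodexEntry(entry: CodexEntry): AirEventType[] {
  const nested = entry.payload?.type;
  const key = nested ? `${entry.type}:${nested}` : entry.type;
  return CODEX_TO_AIR[key] ?? []; // unknown entries are ignored
}
```

Returning an array keeps the custom_tool_call case, which produces both a start and an end event, in the same code path as the single-event cases.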
Configuration:
| Variable | Default | Description |
|---|---|---|
| AIR_URL | http://localhost:5200 | AIr server HTTP endpoint |
| AIR_ENABLED | true | Set to "false" to disable |
| CODEX_HOME | ~/.codex | Codex home directory |
See Codex collector docs.
import { AirClient } from '@hydrotik/air/sdk';
const air = new AirClient({ url: 'ws://localhost:5200/ws/collector' });
air.trace('my_tool', { input: 'data' }, async () => doWork());
That's it. Every tool call, turn, and context change streams to the dashboard in real-time.
Sessions from different agents are labeled — the dashboard shows whether each session came from Pi, Claude Code, or Codex.
Total tokens in context window, session cost, tool call count, turn count, compactions, and context utilization percentage — all updating live. Context % changes color dynamically: pink (<80%), yellow/amber (≥80%), red (≥90%).
D3 treemap showing what fills your context — system prompt, user messages, assistant responses, tool results, thinking blocks. Like a webpack bundle analyzer for your LLM context. Hover any segment for a tooltip with token count and percentage.
Area chart tracking context window fill percentage with 80%/95% warning thresholds. Know when you're approaching compaction. Charts fill their full panel height via flex layout.
Cache read / output / input tokens per turn as stacked area chart with gradient fills. See cache efficiency, cost drivers, and compaction sawtooth patterns. Auto-scales Y-axis, hidden X-axis labels for maximum data density.
DevTools-style timeline of tool executions with durations. Spot slow reads, long builds, and error patterns.
Scrolling log of all telemetry events — color-coded by type, newest-first, with inline summaries.
AIr is designed to store metadata only — sizes, durations, counts, rates — not your prompts, code, or conversations.
Set via --redaction flag or AIR_REDACTION_LEVEL env var:
| Level | What's stored | Use case |
|---|---|---|
| preview (default) | Content truncated to 50 chars, sensitive patterns scrubbed | Production — safe observability |
| full | ALL content fields stripped, only numeric metadata remains | Strict compliance environments |
| none | Everything stored as-is | Local development only |
npx air --redaction full # maximum privacy
npx air --redaction preview # balanced (default)
npx air --redaction none # development only ⚠️
At preview and full levels, the server automatically detects and redacts:
- API keys and tokens (sk-..., Bearer ..., AKIA...)
- JWT tokens
- Email addresses
- Private key blocks
- Database connection strings with credentials
- .env-style KEY=VALUE patterns
- No raw prompts stored — prompt tracking uses one-way SHA-256 hashes
- No code content stored — tool I/O stores byte sizes, not actual content
- Redaction at ingestion — data is sanitized before it hits SQLite
- Local-only by default — server binds to localhost, no external network calls
- No telemetry about telemetry — AIr itself sends nothing to external services
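To make the three redaction levels concrete, here is a rough sketch of scrub-then-truncate behavior. The regular expressions cover only a few of the listed pattern families and are illustrative guesses, not the patterns AIr ships. The function names and the [REDACTED] placeholder are likewise hypothetical.

```typescript
// Illustrative patterns only; a real implementation would cover more cases.
const SENSITIVE_PATTERNS: RegExp[] = [
  /sk-[A-Za-z0-9_-]{8,}/g,                           // API keys such as sk-...
  /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g,                 // bearer tokens
  /AKIA[0-9A-Z]{16}/g,                               // AWS access key IDs
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, // email addresses
];

// Replace every match of every pattern with a fixed placeholder.
export function scrub(text: string): string {
  return SENSITIVE_PATTERNS.reduce(
    (acc, pattern) => acc.replace(pattern, "[REDACTED]"),
    text,
  );
}

// Apply a redaction level to one content field.
export function redact(
  text: string,
  level: "none" | "preview" | "full",
  maxChars = 50,
): string | undefined {
  if (level === "none") return text;      // stored as-is
  if (level === "full") return undefined; // content field stripped entirely
  // preview: scrub first, then truncate, so a cut never leaves half a secret
  return scrub(text).slice(0, maxChars);
}
```

Scrubbing before truncation matters: truncating first could cut a key in the middle, leaving a fragment that no longer matches any pattern.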
AIr tracks timing at multiple granularities:
- Turn latency — time from user message to final response (Pi collector)
- Tool call duration — per-tool execution time with waterfall visualization
- API call latency — MCP and RAG operation timing (SDK)
import { AirClient } from '@hydrotik/air/sdk';
const air = new AirClient({ sessionId: 'my-session' });
// Auto-measure an operation
const result = await air.measureLatency('api_call', async () => {
return await fetch('https://api.example.com/data');
}, { model: 'gpt-4o' });
// Record with phase breakdown
air.recordLatency('turn', 1500, {
ttftMs: 200,
phases: [
{ name: 'thinking', durationMs: 800 },
{ name: 'tool_execution', durationMs: 500 },
{ name: 'response_generation', durationMs: 200 },
],
});
| Endpoint | Description |
|---|---|
| GET /api/sessions/:id/latency | Latency stats by operation (avg/min/max) |
| GET /api/sessions/:id/latency/timeseries?operation=turn | Time series |
AIr tracks costs automatically using a built-in pricing table for common models (Claude, GPT-4o, Gemini, Codex, etc.) and supports manual cost recording.
When a token_usage event arrives with zero cost, AIr auto-computes it from the model pricing table. The Pi collector emits cost events automatically.
const air = new AirClient({
sessionId: 'my-session',
budgetLimit: 5.00, // Alert when session cost exceeds $5
});
When cumulative cost crosses the budget limit, a cost event with budgetExceeded: true is emitted and stored.
Prices per 1M tokens (USD). Override via custom events if your pricing differs.
| Model | Input | Output | Cache Read |
|---|---|---|---|
| claude-4-sonnet | $3.00 | $15.00 | $0.30 |
| claude-4-opus | $15.00 | $75.00 | $1.50 |
| gpt-4o | $2.50 | $10.00 | $1.25 |
| gpt-4.1 | $2.00 | $8.00 | $0.50 |
| gpt-4.1-mini | $0.40 | $1.60 | $0.10 |
| gemini-2.5-pro | $1.25 | $10.00 | — |
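The arithmetic behind these prices is straightforward. The sketch below copies three rows from the table and computes a cost in USD. The names PRICES, estimateCostUsd, and budgetExceeded are hypothetical stand-ins rather than the package's own MODEL_PRICING and computeCost exports, whose exact signatures are not shown here. Treating a missing cache-read price as zero is also an assumption.

```typescript
// USD per 1M tokens, copied from the table above (subset).
interface ModelPrice {
  input: number;
  output: number;
  cacheRead?: number;
}

const PRICES: Record<string, ModelPrice> = {
  "claude-4-sonnet": { input: 3.0, output: 15.0, cacheRead: 0.3 },
  "gpt-4o": { input: 2.5, output: 10.0, cacheRead: 1.25 },
  "gemini-2.5-pro": { input: 1.25, output: 10.0 }, // no cache-read price listed
};

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
}

// Returns undefined for models missing from the table.
export function estimateCostUsd(model: string, usage: TokenUsage): number | undefined {
  const price = PRICES[model];
  if (!price) return undefined;
  const weighted =
    usage.inputTokens * price.input +
    usage.outputTokens * price.output +
    (usage.cacheReadTokens ?? 0) * (price.cacheRead ?? 0); // assumption: missing price counts as 0
  return weighted / 1_000_000;
}

// True once the running total is above the limit.
export function budgetExceeded(cumulativeUsd: number, budgetLimitUsd: number): boolean {
  return cumulativeUsd > budgetLimitUsd;
}
```

For example, 200,000 input tokens and 50,000 output tokens on claude-4-sonnet come to (200,000 × 3.00 + 50,000 × 15.00) / 1,000,000 = $1.35.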
| Endpoint | Description |
|---|---|
| GET /api/sessions/:id/cost | Cost breakdown by model |
| GET /api/sessions/:id/cost/timeseries | Cumulative cost over time |
AIr tracks quality signals for every LLM turn — no content stored, only metrics:
- Tool success rate — what % of tool calls succeeded
- Response token count — verbosity tracking
- Cache hit rate — context efficiency
- Retry detection — did the turn need a correction?
- Response latency — time from prompt to response
- User rating — optional 1-5 star rating via SDK
The Pi collector emits output_eval events automatically on every turn with tool success rate, cache hit rate, response latency, and token counts.
air.recordOutputEval(3, 'claude-4-sonnet', 'anthropic', {
responseTokens: 450,
toolCallCount: 5,
toolErrorCount: 1,
responseLatencyMs: 3200,
cacheHitRate: 0.65,
}, { userRating: 4, tags: ['accurate', 'concise'] });
| Endpoint | Description |
|---|---|
| GET /api/sessions/:id/evals | Aggregate quality metrics by model |
| GET /api/sessions/:id/evals/timeseries | Quality signals over time |
Track which prompt variants work best — without storing prompt content.
- Prompts are identified by a SHA-256 hash (first 16 chars) — the raw text never leaves your machine
- Each prompt gets a variant label (baseline, v2-concise, v3-cot, etc.)
- After a task completes, record effectiveness metrics
- Query the API to compare variants by goal achievement, cost, latency, and error rates
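A truncated one-way hash of this kind can be sketched with Node's built-in crypto module. The function name promptFingerprint is a hypothetical stand-in for the SDK's hashPrompt, and the choice of hex encoding for the "first 16 chars" is an assumption.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for the SDK's hashPrompt: SHA-256 of the prompt text,
// hex-encoded (assumed), truncated to the first 16 characters.
export function promptFingerprint(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex").slice(0, 16);
}
```

Identical prompt text always yields the same fingerprint, so ratings recorded in different sessions can be grouped without the text itself being stored.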
import { AirClient, hashPrompt } from '@hydrotik/air/sdk';
const air = new AirClient({ sessionId: 'my-session' });
// Rate after a successful interaction
air.ratePrompt('v2-concise', 'system', systemPromptText, {
goalAchieved: true,
turnsToComplete: 3,
totalTokens: 5000,
totalCost: 0.02,
totalLatencyMs: 15000,
toolErrorRate: 0,
requiredCompaction: false,
}, 4); // 4/5 stars
// Compare variants via API
// GET /api/prompts → all variants ranked by goal rate
// GET /api/prompts?hash=abc123 → variants for specific prompt
| Endpoint | Description |
|---|---|
| GET /api/prompts | All prompt variants ranked by effectiveness |
| GET /api/prompts?hash=<hash> | Compare variants of a specific prompt |
| GET /api/prompts/:variant | All ratings for a specific variant |
AIr automatically detects when model behavior changes — latency spikes, cost increases, error rate jumps, or token usage shifts.
- The server maintains rolling baselines for key metrics (window of 50 samples)
- When a new value deviates beyond the threshold, a drift event is emitted
- Drift events are stored and queryable for post-mortem analysis
| Metric | What it tracks |
|---|---|
| latency | Tool call and turn duration |
| cost | Per-turn cost |
| token_usage | Total tokens per turn |
| output_tokens | Response verbosity |
| error_rate | Tool failure rate |
| cache_hit_rate | Context cache efficiency |
| Severity | Default Deviation |
|---|---|
| info | ≥25% from baseline |
| warning | ≥50% from baseline |
| critical | ≥100% from baseline |
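The rolling-baseline idea can be sketched as follows. This assumes the baseline is the arithmetic mean of the last 50 samples and that deviation is measured relative to that mean. The README does not state how the baseline is computed, so both are assumptions. RollingBaseline and classifyDeviation are hypothetical names, not the package's DriftDetector.

```typescript
type Severity = "info" | "warning" | "critical";

// Classify relative deviation using the default thresholds from the table.
export function classifyDeviation(baseline: number, value: number): Severity | undefined {
  if (baseline === 0) return undefined; // relative deviation is undefined
  const deviation = Math.abs(value - baseline) / Math.abs(baseline);
  if (deviation >= 1.0) return "critical";
  if (deviation >= 0.5) return "warning";
  if (deviation >= 0.25) return "info";
  return undefined;
}

// Keeps the most recent samples and compares each new value to their mean.
export class RollingBaseline {
  private samples: number[] = [];
  private readonly windowSize: number;

  constructor(windowSize = 50) {
    this.windowSize = windowSize;
  }

  // Compare the value to the baseline of earlier samples, then record it.
  observe(value: number): Severity | undefined {
    const severity =
      this.samples.length > 0 ? classifyDeviation(this.mean(), value) : undefined;
    this.samples.push(value);
    if (this.samples.length > this.windowSize) this.samples.shift();
    return severity;
  }

  private mean(): number {
    return this.samples.reduce((sum, x) => sum + x, 0) / this.samples.length;
  }
}
```

In this sketch an outlier is added to the window after it is classified, so a sustained shift gradually becomes the new baseline and stops raising alerts.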
| Endpoint | Description |
|---|---|
| GET /api/drift | Recent drift events (all sessions) |
| GET /api/drift?session=<id> | Drift events for a session |
| GET /api/drift/summary | Drift counts by metric and severity |
🚀 Quick integration? See the Integration Guide for step-by-step instructions and a prompt you can paste into any AI coding agent to auto-instrument your services.
AIr supports any RAG system through three integration paths — from zero-code config to full SDK instrumentation.
Drop a .air.json in your project root to register your RAG providers. The dashboard shows them immediately — even before data flows.
{
"providers": {
"rag": [
{
"name": "product-search",
"type": "qdrant",
"description": "Product catalog vector search",
"embeddingModel": "text-embedding-3-small",
"dimensions": 1536
},
{
"name": "docs-kb",
"type": "pinecone",
"description": "Documentation knowledge base"
}
],
"mcp": [
{ "name": "design-mcp", "description": "Design system MCP server" }
]
},
"redaction": "preview",
"budgetLimit": 10.00
}
The server reads this config on startup and registers providers in the dashboard's Integrations panel with status indicators (active/inactive/never seen).
Your RAG system — Python, Go, Rust, whatever — POSTs simple JSON to dedicated endpoints. No SDK needed, no full event schema required. Only source is mandatory.
Log a retrieval:
curl -X POST http://localhost:5200/api/rag/retrieval \
-H 'Content-Type: application/json' \
-d '{
"source": "product-search",
"query": "red running shoes",
"resultCount": 10,
"topScore": 0.92,
"durationMs": 45
}'
Log an embedding:
curl -X POST http://localhost:5200/api/rag/embedding \
-H 'Content-Type: application/json' \
-d '{
"source": "product-search",
"model": "text-embedding-3-small",
"inputTokens": 150,
"durationMs": 12,
"dimensions": 1536
}'
Log an indexing operation:
curl -X POST http://localhost:5200/api/rag/index \
-H 'Content-Type: application/json' \
-d '{
"source": "docs-kb",
"documentCount": 500,
"totalTokens": 250000,
"durationMs": 3200
}'
Register a provider at runtime:
curl -X POST http://localhost:5200/api/providers/rag \
-H 'Content-Type: application/json' \
-d '{ "name": "my-rag", "type": "custom", "description": "My RAG pipeline" }'
The server auto-fills defaults from .air.json config (e.g., embedding model, dimensions) and applies redaction before storage.
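For a Node.js or TypeScript service that prefers plain HTTP over the SDK, the retrieval call shown in the cURL example might look like the sketch below. The endpoint and field names come from that example. The logRetrieval function, the FetchLike type, and the swallow-all-errors policy are illustrative assumptions.

```typescript
// Fields taken from the cURL example; only `source` is required.
interface RetrievalLog {
  source: string;
  query?: string;
  resultCount?: number;
  topScore?: number;
  durationMs?: number;
}

// Minimal fetch-compatible signature so a stub can be injected in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number }>;

// POST one retrieval record; resolves to false instead of throwing on failure.
export async function logRetrieval(
  entry: RetrievalLog,
  baseUrl = "http://localhost:5200",
  fetchImpl: FetchLike = fetch,
): Promise<boolean> {
  try {
    const response = await fetchImpl(`${baseUrl}/api/rag/retrieval`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(entry),
    });
    return response.ok;
  } catch {
    return false; // telemetry failures should not break the RAG pipeline
  }
}
```

A caller would typically time the vector search itself, then pass durationMs and resultCount along with the provider name as source.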
For TypeScript/Node.js RAG systems, use the SDK for automatic tracing with async/await wrappers.
Config-driven (reads .air.json):
import { createRagTracersFromConfig } from '@hydrotik/air/sdk';
// Creates a tracer for each provider in .air.json
const rag = createRagTracersFromConfig({ sessionId: 'my-session' });
// Use by provider name
const results = await rag['product-search'].traceRetrieval('red shoes', async () => {
return await qdrant.search({ vector, limit: 10 });
}, {
extractResults: (r) => ({
count: r.length,
topScore: r[0]?.score,
chunkSizes: r.map(doc => doc.tokenCount),
}),
});
Manual setup:
import { createRagTracer } from '@hydrotik/air/sdk';
const rag = createRagTracer('product-search', {
sessionId: 'my-session',
defaultEmbeddingModel: 'text-embedding-3-small',
defaultDimensions: 1536,
});
await rag.traceRetrieval('query', fetchFn);
await rag.traceEmbedding('text-embedding-3-small', 150, embedFn);
await rag.traceIndex(500, 250000, indexFn);
| Panel | What it shows |
|---|---|
| Integrations | All registered providers with status (active/idle/never seen), type, event count, last seen |
| RAG Pipeline | Stats table: source, type, call count, avg latency, avg results, relevance scores, token volumes |
| Live Event Feed | Real-time stream of rag_retrieval, rag_embedding, rag_index events |
| Drift Detection | Alerts when RAG latency, error rate, or result quality shifts from baseline |
The type field in .air.json is for display only — AIr works with any backend:
| Type | Icon |
|---|---|
| pinecone | 🌲 |
| qdrant | 🔷 |
| weaviate | 🕸 |
| chroma | 🎨 |
| pgvector | 🐘 |
| milvus | 🔬 |
| custom | ⚙️ |
| Endpoint | Description |
|---|---|
| GET /api/providers | All registered RAG + MCP providers with status |
| GET /api/providers/rag | RAG providers only |
| POST /api/providers/rag | Register a new RAG provider at runtime |
| POST /api/rag/retrieval | Log a retrieval (simplified — only source required) |
| POST /api/rag/embedding | Log an embedding generation |
| POST /api/rag/index | Log a document indexing operation |
Create .air.json or air.config.json in your project root. AIr searches up to 5 parent directories.
{
"providers": {
"rag": [{ "name": "my-rag", "type": "qdrant" }],
"mcp": [{ "name": "my-mcp" }]
},
"redaction": "preview",
"budgetLimit": 10.00,
"port": 5200
}
| Variable | Default | Description |
|---|---|---|
| AIR_URL | ws://localhost:5200/ws/collector | AIr server WebSocket endpoint |
| AIR_ENABLED | true | Set to "false" to disable collection |
| AIR_REDACTION_LEVEL | preview | Data redaction: none, preview, full |
Telemetry persists to SQLite at ~/.hydrotik/air/telemetry.db (WAL mode). Delete the file to reset.
| Port | Service |
|---|---|
| 5200 | AIr server (Fastify + WebSocket + REST API) |
| 5201 | AIr dashboard (Vite dev server, proxies to 5200) |
Ports are configured in @hydrotik/config (packages/hy-config/src/ports.ts).
AIr ships a lightweight SDK for instrumenting anything that talks to an LLM — MCP servers, RAG pipelines, custom tools, or your own agent framework.
Wrap an MCP client to auto-instrument all callTool, readResource, and getPrompt calls:
import { Client } from '@modelcontextprotocol/sdk/client';
import { instrumentMcp } from '@hydrotik/air/sdk';
const client = new Client({ name: 'my-app', version: '1.0' });
// Proxy wraps all MCP methods with telemetry
const instrumented = instrumentMcp(client, 'my-mcp-server', {
sessionId: 'my-session',
});
// All calls are now auto-traced in the AIr dashboard
const result = await instrumented.callTool('search', { query: 'hello' });
Or use the manual tracer for more control:
import { createMcpTracer } from '@hydrotik/air/sdk';
const mcp = createMcpTracer('my-server');
const result = await mcp.traceToolCall('search', { query: 'hello' }, async () => {
return await myMcpClient.callTool('search', { query: 'hello' });
});
Instrument vector DB queries, embedding generation, and document indexing:
import { createRagTracer } from '@hydrotik/air/sdk';
const rag = createRagTracer('pinecone', { sessionId: 'my-session' });
// Trace a retrieval with result extraction
const results = await rag.traceRetrieval('search query', async () => {
return await pinecone.query({ vector, topK: 5 });
}, {
extractResults: (r) => ({
count: r.matches.length,
topScore: r.matches[0]?.score,
chunkSizes: r.matches.map(m => m.metadata.tokenCount),
}),
});
// Trace embedding generation
const embedding = await rag.traceEmbedding('text-embedding-3-small', 150, async () => {
return await openai.embeddings.create({ model: 'text-embedding-3-small', input: text });
}, { dimensions: 1536 });
// Trace document indexing
await rag.traceIndex(100, 50000, async () => {
return await pinecone.upsert(vectors);
});
For anything not covered by MCP or RAG helpers, use the base AirClient:
import { AirClient } from '@hydrotik/air/sdk';
const air = new AirClient({
sessionId: 'my-session',
provider: 'my-custom-tool',
});
// Trace any async operation
const result = await air.trace('database_query', { table: 'users', filter: 'active' }, async () => {
return await db.query('SELECT * FROM users WHERE active = true');
});
// Or emit raw events
air.emit({
type: 'custom',
provider: 'my-tool',
eventName: 'cache_hit',
data: { key: 'user:123', ttl: 300 },
});
MCP and RAG panels appear automatically in the dashboard when data from those providers flows in. No configuration needed — the dashboard detects event types and renders the appropriate panels:
- MCP Servers — table of server/method/tool stats with call counts, avg/min/max latency, error rates
- RAG Pipeline — table of source/type stats with retrieval counts, avg relevance scores, token volumes
All endpoints return JSON. All ingested data is subject to server-side redaction.
| Endpoint | Description |
|---|---|
| Server | |
| GET /api/health | Server uptime, connected clients, redaction level |
| GET /api/config | Server configuration and enabled features |
| Sessions | |
| GET /api/sessions | List all sessions with summary stats |
| GET /api/sessions/:id | Single session summary |
| GET /api/sessions/:id/events | All events for a session |
| GET /api/events/recent | Recent events across all sessions |
| Tools | |
| GET /api/sessions/:id/tool-calls | Tool call records with timing |
| GET /api/sessions/:id/tool-stats | Per-tool aggregate stats (count, avg/min/max ms, errors) |
| Context | |
| GET /api/sessions/:id/context | Context utilization snapshots over time |
| GET /api/sessions/:id/context/latest | Latest context breakdown with segments |
| Latency | |
| GET /api/sessions/:id/latency | Latency stats by operation |
| GET /api/sessions/:id/latency/timeseries | Latency time series (optional ?operation=) |
| Cost | |
| GET /api/sessions/:id/cost | Cost breakdown by model |
| GET /api/sessions/:id/cost/timeseries | Cumulative cost over time |
| Quality | |
| GET /api/sessions/:id/evals | Output evaluation stats by model |
| GET /api/sessions/:id/evals/timeseries | Quality signals over time |
| Prompts | |
| GET /api/prompts | All prompt variants ranked by effectiveness |
| GET /api/prompts?hash=<hash> | Compare variants of a specific prompt |
| GET /api/prompts/:variant | All ratings for a variant |
| Drift | |
| GET /api/drift | Recent drift events (optional ?session=) |
| GET /api/drift/summary | Drift counts by metric and severity |
| Integrations | |
| GET /api/sessions/:id/mcp-stats | MCP call stats grouped by server/method/tool |
| GET /api/sessions/:id/rag-stats | RAG stats grouped by source/type |
| GET /api/sessions/:id/providers | Event type summary for all providers |
| GET /api/providers | Registered RAG + MCP providers with status |
| GET /api/providers/rag | RAG providers only |
| POST /api/providers/rag | Register a new RAG provider at runtime |
| RAG Ingest (simplified) | |
| POST /api/rag/retrieval | Log a retrieval — only source required |
| POST /api/rag/embedding | Log an embedding — only source required |
| POST /api/rag/index | Log an indexing op — only source required |
| Ingestion (full events) | |
| POST /api/ingest | Ingest a single event (redacted before storage) |
| POST /api/ingest/batch | Ingest multiple events at once |
┌──────────────┐ WebSocket ┌──────────────┐ WebSocket ┌──────────────┐
│ Pi Agent │ ─────────────→ │ │ ─────────────→ │ │
│ + Extension │ /ws/collector │ │ /ws/dashboard │ │
└──────────────┘ │ │ │ │
┌──────────────┐ HTTP POST │ AIr Server │ │ Dashboard │
│ Claude Code │ ─────────────→ │ (Fastify) │ │ (React+D3) │
│ + Hooks │ /api/ingest │ │ │ │
└──────────────┘ │ │ │ │
┌──────────────┐ HTTP POST │ │ │ │
│ Codex CLI │ ─────────────→ │ │ │ │
│ + Watcher │ /api/ingest └──────┬───────┘ └──────────────┘
└──────────────┘ │
SQLite DB
~/.hydrotik/air/
telemetry.db
AIr accepts telemetry via two protocols:
- WebSocket (/ws/collector) — persistent connection for long-lived processes (Pi extension, SDK clients)
- HTTP POST (/api/ingest, /api/ingest/batch) — fire-and-forget for short-lived processes (Claude Code hooks, Codex watcher)
Both paths go through the same TelemetryStore.ingestEvent() — same DB, same broadcast to dashboard clients.
Hooks into pi's event system via ExtensionAPI:
- tool_execution_start / tool_execution_end — tool call timing and I/O sizes
- turn_start / turn_end — LLM roundtrip tracking + token usage from response
- agent_start / agent_end — session lifecycle
- session_compact — compaction events
- model_select — model changes
- ctx.getContextUsage() — real token count from pi
- ctx.sessionManager.getBranch() — context breakdown by message category
- ctx.getSystemPrompt() — system prompt size
The collector is silent and non-blocking — if the AIr server isn't running, events are dropped without disrupting pi. Reconnects automatically every 5 seconds.
Two Node.js hook scripts that run as short-lived processes on each Claude Code event:
- SessionStart — auto-starts AIr server, emits session_start, persists session ID to temp file
- PostToolUse — emits tool_call_start + tool_call_end with correlated IDs, reads context metrics from statusline bridge
Hooks use HTTP POST because they're ephemeral processes — no time to establish a WebSocket.
A long-running watcher that tails ~/.codex/sessions/*.jsonl:
- Watches for file changes via fs.watch (falls back to 2s polling)
- Tracks byte offsets per file to avoid re-processing
- Maps Codex JSONL entries (session_meta, function_call, task_started, etc.) to AIr events
- Extracts wall-time duration from Codex output metadata
- Supports --replay mode for backfilling historical sessions
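The byte-offset technique in the list above can be sketched with synchronous fs calls. This is an illustration of the general approach, not the watcher's source. The function readNewLines and the scratch-file usage are hypothetical.

```typescript
import {
  appendFileSync, closeSync, fstatSync, openSync, readSync, unlinkSync, writeFileSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Read only the complete lines appended since the last call for this path.
// `offsets` remembers, per file, how many bytes have already been consumed.
export function readNewLines(path: string, offsets: Map<string, number>): string[] {
  const start = offsets.get(path) ?? 0;
  const fd = openSync(path, "r");
  try {
    const size = fstatSync(fd).size;
    if (size <= start) return []; // nothing new (or the file was truncated)
    const buffer = Buffer.alloc(size - start);
    const bytesRead = readSync(fd, buffer, 0, buffer.length, start);
    const lastNewline = buffer.subarray(0, bytesRead).lastIndexOf(0x0a);
    if (lastNewline === -1) return []; // only a partial line so far
    offsets.set(path, start + lastNewline + 1); // skip past consumed bytes
    return buffer
      .subarray(0, lastNewline)
      .toString("utf8")
      .split("\n")
      .filter((line) => line.length > 0);
  } finally {
    closeSync(fd);
  }
}

// Usage: a partially written line is held back until its newline arrives.
const scratch = join(tmpdir(), `air-tail-sketch-${process.pid}.jsonl`);
writeFileSync(scratch, '{"type":"session_meta"}\n{"type":"par');
const offsets = new Map<string, number>();
const firstBatch = readNewLines(scratch, offsets);  // one complete line
appendFileSync(scratch, 'tial"}\n');
const secondBatch = readNewLines(scratch, offsets); // the now-complete second line
unlinkSync(scratch);
```

Advancing the offset only to the last newline means a line that is still being written is never parsed half-finished, so a JSON parser never sees a truncated entry.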
- Ingests events via WebSocket and HTTP POST, persists to SQLite with WAL mode
- Broadcasts events to connected dashboard clients in real-time
- Serves REST API for historical queries
- Four tables: sessions, events, tool_calls, context_snapshots
- Connects to server via WebSocket for live updates
- Falls back to REST API for historical data on session switch
- D3.js for treemap and waterfall visualizations
- Recharts for time-series charts
- Styled with @hydrotik/tokens via vanilla-extract (dark theme)
@hydrotik/air → Event types, MODEL_PRICING, computeCost, DriftDetector, redaction utils
@hydrotik/air/sdk → AirClient, hashPrompt, instrumentMcp, createRagTracer
@hydrotik/air/server → createServer() for programmatic use
bin:
air → starts server + serves built dashboard on a single port
air-install-claude-code → installs Claude Code hooks into .claude/
air-codex-watcher → tails Codex session files and streams to AIr
dist/
├── server/ ← Fastify server + CLI (ESM)
│ ├── cli.js ← npx entry point
│ └── index.js ← createServer() export
├── sdk/ ← SDK for instrumentation (ESM + CJS + DTS)
│ ├── index.js
│ ├── index.cjs
│ └── index.d.ts
├── shared/ ← Event type definitions (ESM + CJS + DTS)
│ ├── index.js
│ ├── index.cjs
│ └── index.d.ts
└── dashboard/ ← Pre-built React SPA
├── index.html
└── assets/ ← JS + CSS bundles (~650KB gzip: ~190KB)
Runtime dependencies: fastify, better-sqlite3, ws, @fastify/cors, @fastify/static, @fastify/websocket
Dashboard (React, D3, Recharts, vanilla-extract) is pre-built at publish time — zero React dependency at runtime.
| Layer | Technology |
|---|---|
| Server | Fastify 5, better-sqlite3, @fastify/websocket, data redaction, drift detection |
| Dashboard | React 19, Vite 6, D3.js 7, Recharts 2 (pre-built) |
| Styling | vanilla-extract, @hydrotik/tokens (compiled to CSS) |
| Collectors | Pi ExtensionAPI (WebSocket), Claude Code hooks (HTTP), Codex watcher (HTTP) |
| SDK | WebSocket (ws), crypto (prompt hashing), zero external deps |
| Storage | SQLite 3 (WAL mode), ~/.hydrotik/air/telemetry.db |
| Security | 3-level content redaction, SHA-256 prompt hashing, sensitive pattern scrubbing |
Dashboard shows "Reconnecting…"
The AIr server isn't running. Start it with npx @hydrotik/air or pnpm turbo run dev --filter=@hydrotik/air.
"0 sessions" after reload
The collector connects to the server asynchronously. Send a message in pi — the first tool call or turn will create a session.
Context % doesn't match pi footer
Delete ~/.hydrotik/air/telemetry.db to clear stale data, restart the server, then /reload in pi.
Pi extension not loading
Check that npm install was run inside .pi/extensions/ai-rum-collector/ (the ws package must be in node_modules). Run /reload in pi after fixing.
Claude Code hooks not firing
Verify .claude/settings.json has the hook entries under hooks.SessionStart and hooks.PostToolUse. Run npx air-install-claude-code again to re-install. Check that the hook scripts exist at .claude/hooks/air-session-start.js and .claude/hooks/air-post-tool-use.js.
Claude Code sessions not correlating
The SessionStart hook persists a session ID to $TMPDIR/air-claude-code/session.json. If PostToolUse events show up as separate sessions, check that the temp directory is writable and both hooks run in the same OS user context.
Codex watcher not seeing sessions
Verify ~/.codex/sessions/ exists and contains .jsonl files. Run npx air-codex-watcher --replay <file> on a specific file to test the pipeline. Check AIR_URL if the server is on a non-default port.
Codex watcher missing events
The watcher processes the 3 most recent files on startup. Older sessions need --replay to backfill. If fs.watch isn't working (some network filesystems), the watcher falls back to 2s polling automatically.
MIT
