Distributed AI evaluation infrastructure and deployment gating for generative AI systems.
Agent Vigilo turns LLM and agent evaluation into a production runtime: versioned WASM evaluators, durable evaluation runs, worker/coordinator execution, normalized results, and pass/fail gates that can sit in CI or release workflows.
It is built for the parts of AI evaluation that become hard at scale: idempotent distributed work, durable event delivery, evaluator isolation, retry-safe persistence, and auditable results.
- Run evaluations like infrastructure: PostgreSQL-backed state, RabbitMQ work distribution, Rust workers, and deterministic state guards.
- Ship versioned evaluators: publish WASI Preview 2 WebAssembly evaluators with strict WIT contracts.
- Protect the runtime: Wasmtime fuel, memory, timeout, log, and concurrency limits isolate evaluator execution.
- Avoid lost events: a durable outbox ledger plus a hot delivery queue, RabbitMQ publisher confirms, and idempotency keys.
- Gate deployments: aggregate evaluator results into reproducible pass/fail decisions for agent releases.
- Getting started: run your first evaluation.
- Architecture overview: containers, components, flows, and state diagrams.
- Worker runtime: chunk claiming, evaluator execution, and result persistence.
- Runtime limits: Wasm evaluator sandbox and worker concurrency controls.
- Outbox lifecycle: durable event publication and retry behavior.
- Publishing evaluators: build and publish versioned WASM evaluators.
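
As a rough illustration of the deployment gate idea described above, the sketch below averages normalized scores per evaluator and compares each mean against a minimum threshold. The `EvalResult`, `GateDecision`, and `decide` names, the 0.0 to 1.0 score range, and the mean-based rule are assumptions made for this example, not Agent Vigilo's actual schema or gating policy.

```rust
use std::collections::BTreeMap;

/// One normalized evaluator result (hypothetical shape for illustration).
#[derive(Debug, Clone, PartialEq)]
pub struct EvalResult {
    pub evaluator: String, // e.g. "toxicity@1.2.0" (assumed naming scheme)
    pub score: f64,        // assumed to be normalized to 0.0..=1.0
}

/// Outcome of a gate check, with human-readable reasons on failure.
#[derive(Debug, PartialEq)]
pub enum GateDecision {
    Pass,
    Fail { reasons: Vec<String> },
}

/// Average each evaluator's scores and compare the mean with its threshold.
/// An evaluator that has a threshold but no results fails the gate.
/// BTreeMap keeps iteration order sorted, so the same inputs always
/// produce the same reasons in the same order.
pub fn decide(results: &[EvalResult], thresholds: &BTreeMap<String, f64>) -> GateDecision {
    // Accumulate (sum, count) per evaluator name.
    let mut sums: BTreeMap<&str, (f64, usize)> = BTreeMap::new();
    for r in results {
        let entry = sums.entry(r.evaluator.as_str()).or_insert((0.0, 0));
        entry.0 += r.score;
        entry.1 += 1;
    }

    let mut reasons = Vec::new();
    for (name, min) in thresholds {
        match sums.get(name.as_str()) {
            None => reasons.push(format!("{name}: no results")),
            Some((sum, n)) => {
                let mean = sum / *n as f64;
                if mean < *min {
                    reasons.push(format!("{name}: mean {mean:.3} < {min:.3}"));
                }
            }
        }
    }

    if reasons.is_empty() {
        GateDecision::Pass
    } else {
        GateDecision::Fail { reasons }
    }
}
```

A CI step could call `decide` with the results of a finished run and exit non-zero on `GateDecision::Fail`, printing the reasons so the decision stays auditable.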
Built with Rust, Tokio, PostgreSQL, SQLx, RabbitMQ, Wasmtime, WASI Preview 2, WIT, and Docusaurus.
Agent Vigilo is an active systems project focused on reliable AI evaluation, LLM evaluation workflows, agent testing, and deployment gates. The implementation favors explicit contracts, durable state transitions, and operational diagrams over black-box orchestration.
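
To make "durable state transitions" and "deterministic state guards" a little more concrete, here is a minimal sketch of a transition guard for an evaluation run. The `RunState` variants, the allowed transitions, and the `guard` function are hypothetical and chosen for illustration; the project's real state machine may differ. The point is that a redelivered message asking for the state the run is already in becomes a no-op, and anything outside the allowed set is rejected rather than silently applied.

```rust
/// Lifecycle states of an evaluation run (assumed names, not the real schema).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunState {
    Pending,
    Running,
    Succeeded,
    Failed,
}

/// Result of asking the guard to move a run from one state to another.
#[derive(Debug, PartialEq, Eq)]
pub enum Transition {
    /// The move is legal and should be persisted.
    Applied(RunState),
    /// The run is already in the requested state (e.g. a duplicate delivery).
    NoOp,
    /// The move is not allowed from the current state.
    Rejected { from: RunState, to: RunState },
}

/// Pure function: the same (from, to) pair always yields the same answer,
/// which makes retries and redeliveries safe to process.
pub fn guard(from: RunState, to: RunState) -> Transition {
    use RunState::*;
    match (from, to) {
        (a, b) if a == b => Transition::NoOp,
        (Pending, Running) | (Running, Succeeded) | (Running, Failed) => {
            Transition::Applied(to)
        }
        _ => Transition::Rejected { from, to },
    }
}
```

In a PostgreSQL-backed setup, the same rule is commonly enforced at write time with a conditional update that only matches rows still in the expected prior state; that pairing is an assumption here, not a description of the project's queries.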