test_runner: add --prompt mode driving the Firefox DevTools MCP #2197

Open
msujaws wants to merge 1 commit into mozilla:main from msujaws:prompt-devtools-mcp-verdict
msujaws commented Jun 29, 2026

Add an agent-driven verdict mode: the natural-language equivalent of `--command`, which works like `git bisect run`. `--prompt "<instruction>"` shells out to the `claude` CLI in headless mode, pointed at the `@mozilla/firefox-devtools-mcp` server via a generated `--mcp-config`, to inspect each build and decide whether it is good or bad.
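As a rough illustration of the flow described above, the sketch below writes a one-server MCP config and assembles a `claude` invocation that points at it. The helper names (`write_mcp_config`, `build_claude_command`), the server label `firefox-devtools`, and the config file name are hypothetical and chosen for this example; the actual code in this PR may be structured differently.

```python
import json
from pathlib import Path


def write_mcp_config(firefox_path: str, directory: str) -> Path:
    """Write a one-server MCP config file and return its path (hypothetical helper)."""
    config = {
        "mcpServers": {
            # "firefox-devtools" is an arbitrary label picked for this sketch.
            "firefox-devtools": {
                "command": "npx",
                "args": [
                    "-y",
                    "@mozilla/firefox-devtools-mcp",
                    "--firefox-path",
                    firefox_path,
                ],
            }
        }
    }
    path = Path(directory) / "mcp-config.json"
    path.write_text(json.dumps(config, indent=2))
    return path


def build_claude_command(instruction: str, config_path: Path) -> list[str]:
    """Assemble the argument list for a non-interactive claude run."""
    # -p runs the claude CLI in print (non-interactive) mode.
    return ["claude", "-p", instruction, "--mcp-config", str(config_path)]
```

The returned list can be handed to `subprocess.run(..., capture_output=True, text=True)` so the agent output is available for verdict parsing.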

The MCP launches the build itself, so `AgentTestRunner` installs the build only to obtain the binary path (passed via `--firefox-path`) and does not start it. Verdicts are parsed as GOOD/BAD from the agent output.
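One way the GOOD/BAD parsing could work is sketched below. It assumes the verdict is an uppercase standalone token and that the last such token in the output wins; both rules are assumptions made for illustration, and `parse_verdict` is a hypothetical name rather than the function in this PR.

```python
import re
from typing import Optional

# Match GOOD or BAD only as whole, uppercase words.
_VERDICT_RE = re.compile(r"\b(GOOD|BAD)\b")


def parse_verdict(output: str) -> Optional[str]:
    """Return 'good' or 'bad' from the last verdict token, or None if absent."""
    matches = _VERDICT_RE.findall(output)
    return matches[-1].lower() if matches else None
```

Returning `None` when no token is found lets the caller decide whether to skip the build or retry, instead of guessing a verdict.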

Gating: the Firefox DevTools MCP supports Firefox 100+, so ranges that include builds older than the minimum are rejected. The check runs both up front (on the resolved good/bad range in `cli.validate`) and per build (on the mozversion `application_version`). The minimum is configurable via `--prompt-min-version` (default 100).
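The per-build gate can be pictured roughly as follows. This is a minimal sketch that assumes `application_version` is a string whose first dot-separated component is the major version (for example `128.0a1`); the exception class here is a stand-in with the same name as the one this PR adds, and `check_supported` is a hypothetical helper.

```python
class UnsupportedVersionError(Exception):
    """Stand-in for the exception this PR introduces."""


def major_version(application_version: str) -> int:
    """Extract the major version, e.g. '128.0a1' -> 128."""
    return int(application_version.split(".", 1)[0])


def check_supported(application_version: str, min_version: int = 100) -> int:
    """Return the major version, or raise if it is below min_version."""
    major = major_version(application_version)
    if major < min_version:
        raise UnsupportedVersionError(
            f"Firefox {application_version} is older than the minimum "
            f"supported version {min_version}"
        )
    return major
```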

A pre-run check (`Application.check_prerequisites`) fails fast before bisecting if `claude` or `npx` is missing, or if the instruction cannot be used to make a good/bad determination.
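The executable half of that check could look something like the sketch below, which uses `shutil.which` to look for the tools on `PATH`. The function name `missing_tools` and its injectable `which` parameter are assumptions for illustration; the instruction-usability check is not shown because its logic is not described here.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    required: Iterable[str] = ("claude", "npx"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names of required executables that are not on PATH."""
    return [name for name in required if which(name) is None]
```

A caller can abort before any build is downloaded when the returned list is non-empty, naming the missing tools in the error message.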

`--prompt` is mutually exclusive with `--command` and `--launch`. This change also adds `--prompt-headless` and `--prompt-model`, plus a new `UnsupportedVersionError`.
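For readers unfamiliar with how such exclusivity is commonly expressed, here is a minimal `argparse` sketch. It is not the parser from this PR: the option types, the `store_true` action for `--prompt-headless`, and the help strings are assumptions, and the real implementation may enforce the rule elsewhere (for example in `cli.validate`).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser where --prompt, --command and --launch conflict."""
    parser = argparse.ArgumentParser(prog="mozregression-sketch")

    # Only one of these three options may be given on a command line.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--command", help="shell command that tests each build")
    mode.add_argument("--launch", help="launch a single build")
    mode.add_argument("--prompt", help="natural-language instruction for the agent")

    # Options that only matter in --prompt mode (types are assumed).
    parser.add_argument("--prompt-headless", action="store_true")
    parser.add_argument("--prompt-model")
    parser.add_argument("--prompt-min-version", type=int, default=100)
    return parser
```

With this setup, passing both `--prompt` and `--command` makes `argparse` print a usage error and exit before any bisection work starts.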

msujaws commented Jun 29, 2026

Author

A sample command that demonstrates this:

mozregression --prompt "Go to about:preferences and look for an AI controls menu item. Good if it is there, bad if it isn't." --find-fix
