hyperlight-dev · simongdavies · Apr 29, 2026
diff --git a/docs/MCP.md b/docs/MCP.md
@@ -29,7 +29,7 @@ Add servers to `~/.hyperagent/config.json` (same format as VS Code's `mcp.json`)
 Or use the setup script:
 
 ```bash
-just mcp-setup-everything   # sets up the MCP everything test server
+hyperagent --mcp-setup-everything   # sets up the MCP everything test server
 ```
 
 ### 2. Start HyperAgent
@@ -127,6 +127,27 @@ are shown during approval.
 
 ## Commands
 
+### Standalone setup commands
+
+These command-line options run one setup/show action and then exit. They do
+not start an agent session, and they do not require the repository Justfile.
+
+| Command | Action |
+|---------|--------|
+| `hyperagent --mcp-setup-everything` | Configure the MCP everything test server. Requires npm/npx; first use downloads `@modelcontextprotocol/server-everything`. |
+| `hyperagent --mcp-setup-github` | Configure the GitHub MCP server. Requires npm/npx and `GITHUB_TOKEN`; the command will remind you to use `gh auth token` if needed. |
+| `hyperagent --mcp-setup-filesystem [dir]` | Configure the filesystem MCP server rooted at `dir` (default `/tmp/mcp-fs`). Requires npm/npx; first use downloads `@modelcontextprotocol/server-filesystem`. |
+| `hyperagent --mcp-show-config` | Print configured MCP servers from `~/.hyperagent/config.json`. |
+| `hyperagent --mcp-setup-workiq` | Configure Microsoft Work IQ stdio MCP. Pre-fetches `@microsoft/workiq@latest`, runs its interactive EULA command, then writes config. |
+| `hyperagent --mcp-add-http <name> <url> [clientId] [tenantId] [scopes] [flow]` | Add a generic HTTP MCP server, optionally with OAuth. |
+| `hyperagent --mcp-m365-create-app [args...]` | Create/reuse an Entra app registration for Agent 365 HTTP MCP servers. Requires Azure CLI and `az login`. |
+| `hyperagent --mcp-setup-m365 [args...]` | Configure Agent 365 per-service HTTP MCP servers and pre-approve them. |
+| `hyperagent --mcp-m365-refresh-servers [args...]` | Refresh the user M365 server catalog using a cached or supplied bearer token. |
+| `hyperagent --mcp-m365-show` | Show saved M365 app registration details. |
+
+The Justfile recipes with matching names are development conveniences for this
+repository; the `hyperagent` options above are the user-facing path.
+
 ### Slash commands
 
 | Command                | Action                                                |
@@ -216,7 +237,7 @@ gate uses the MCP spec's `ToolAnnotations` (hints from the server):
 
 The gate runs on the **host side** while the guest VM is paused — the
 LLM's handler code sees either a normal result or
-`{ error: "Operation denied..." }`. The LLM doesn't need to know about
+`{ ok: false, error: "Operation denied..." }`. The LLM doesn't need to know about
 the gate; it writes code normally.
 
 Example prompt shown to the user:
@@ -247,7 +268,7 @@ MCP tools with native PPTX generation in a single workflow.
 export GITHUB_TOKEN=$(gh auth token)
 
 # Configure the GitHub MCP server
-just mcp-setup-github
+hyperagent --mcp-setup-github
 ```
 
 This creates `~/.hyperagent/config.json` with the GitHub server configured,
@@ -341,7 +362,7 @@ registration.
 ### One-shot setup
 
 ```bash
-just mcp-setup-workiq
+hyperagent --mcp-setup-workiq
 ```
 
 This writes the following entry to `~/.hyperagent/config.json`:
@@ -417,11 +438,12 @@ Instead of the single stdio `workiq` server you can connect to the
 per-service Agent 365 HTTP endpoints directly. This gives you finer
 `/mcp enable` control per M365 service and uses MSAL for OAuth.
 
-The setup script uses the VS Code MCP extension's pre-registered client ID
-(`aebc6443-...`) which has `McpServers.*` scopes admin-consented in all
-M365 Copilot tenants — no per-tenant app registration needed.
+Use an Entra public-client app registration for OAuth. You can create or reuse
+one with `hyperagent --mcp-m365-create-app`, then configure the per-service MCP
+entries from the saved app details.
 
-21 servers are available (see the full list with `just mcp-setup-m365 list`).
+The bundled catalog includes the available Agent 365 servers (see the full list
+with `hyperagent --mcp-setup-m365 list`).
 Common ones:
 
 | Config entry         | Service                          |
@@ -437,25 +459,27 @@ Common ones:
 #### Setup
 
 ```bash
+# One-time: create or reuse an Entra public-client app registration
+hyperagent --mcp-m365-create-app
+
 # Configure all M365 servers with browser auth (one-time)
-just mcp-setup-m365 all \
-  aebc6443-996d-45c2-90f0-388ff96faa56 \
-  <your-tenant-id> \
-  "" browser
+hyperagent --mcp-setup-m365 all \
+  <your-client-id> <your-tenant-id> "" browser
 
 # Or a subset
-just mcp-setup-m365 "mail,teams,planner" \
-  aebc6443-996d-45c2-90f0-388ff96faa56 \
-  <your-tenant-id> \
-  "" browser
+hyperagent --mcp-setup-m365 "mail,teams,planner" \
+  <your-client-id> <your-tenant-id> "" browser
 
 # List available services
-just mcp-setup-m365 list
+hyperagent --mcp-setup-m365 list
 ```
 
 This writes config entries AND pre-approves all configured servers so the
 LLM can connect them without interactive prompts.
 
+If you just ran `hyperagent --mcp-m365-create-app`, you can pass empty strings
+for the client ID and tenant ID to use the saved app details.
+
 #### Auth flows
 
 The `FLOW` argument (last positional) is **required**:
@@ -478,25 +502,23 @@ works with cached tokens.
 
 #### Custom Entra app registration
 
-If your tenant blocks the VS Code client ID, create your own app:
+If you already have a tenant-owned public-client app registration, pass it
+explicitly instead of using the saved app state:
 
 ```bash
-just mcp-m365-create-app
-# Then use your app's client ID:
-just mcp-setup-m365 all <your-client-id> <your-tenant-id> "" browser
+hyperagent --mcp-setup-m365 all <your-client-id> <your-tenant-id> "" browser
 ```
 
 #### Scope
 
-All servers use `ea9ffc3e-8a23-4a7d-836d-234d7c7565c1/.default` (the Agent 365
-resource app ID with `.default`), which requests all pre-consented scopes in
-one shot. This matches what [a365cli](https://github.com/sozercan/a365cli) uses.
+All servers use the Agent 365 resource `.default` scope, which requests the
+pre-consented Agent 365 MCP scopes in one shot.
 
 #### Refreshing the server catalog
 
 ```bash
-just mcp-m365-refresh-servers     # uses cached OAuth token
-just mcp-m365-refresh-servers --token <bearer>  # explicit token
+hyperagent --mcp-m365-refresh-servers     # uses cached OAuth token
+hyperagent --mcp-m365-refresh-servers --token <bearer>  # explicit token
 ```
 
 ## HTTP Transport & OAuth
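
Note on the gate result changed in the `@@ -216,7 +237,7 @@` hunk: a denied call now reaches handler code as `{ ok: false, error: "Operation denied..." }`, so a handler can separate a user denial from other failures by checking `ok` and the error text. The following is a minimal sketch, not code from this PR. `classifyMcpResult` is a hypothetical helper, and it assumes the denial message starts with `Operation denied` (the diff elides the rest of the message).

```javascript
// Hypothetical helper; not part of HyperAgent.
// Classifies an MCP call result using the envelope fields shown in this diff.
function classifyMcpResult(result) {
  if (result && result.ok === true) {
    return { status: "ok", data: result.data };
  }
  const error =
    result && typeof result.error === "string"
      ? result.error
      : "Unknown MCP error";
  // Assumption: denial messages begin with "Operation denied".
  const status = error.startsWith("Operation denied") ? "denied" : "failed";
  return { status, error };
}
```

A handler could return early on `denied` and explain the refusal to the user rather than retrying, which matches the guidance in `skills/mcp-services/SKILL.md`.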

diff --git a/scripts/build-binary.js b/scripts/build-binary.js
@@ -380,6 +380,14 @@ if (existsSync(skillsSrc)) {
   copyDirRecursive(skillsSrc, skillsDst);
 }
 
+// Copy MCP setup data needed by standalone CLI setup commands.
+const scriptsDst = join(LIB_DIR, "scripts");
+mkdirSync(scriptsDst, { recursive: true });
+const m365CatalogSrc = join(ROOT, "scripts", "m365-mcp-servers.json");
+if (existsSync(m365CatalogSrc)) {
+  copyFileSync(m365CatalogSrc, join(scriptsDst, "m365-mcp-servers.json"));
+}
+
 // Copy @github/copilot CLI (needed by copilot-sdk at runtime)
 // The SDK uses import.meta.resolve("@github/copilot/sdk") to find the CLI
 console.log("📦 Copying Copilot CLI runtime...");

diff --git a/skills/mcp-services/SKILL.md b/skills/mcp-services/SKILL.md
@@ -19,12 +19,14 @@ triggers:
 antiPatterns:
   - Don't try to manage_plugin("mcp:<name>") — MCP servers are NOT regular plugins
   - Don't import from "host:mcp-gateway" — that's the gateway sentinel, not a server
-  - Don't guess tool names — always call mcp_server_info() first
+  - Don't guess tool names or parameters — always call mcp_tool_info() first
   - Don't hardcode MCP tool schemas — they change when servers update
+  - Don't call MCP server tools directly from LLM tools — execute them only inside generated handler code
 allowed-tools:
   - register_handler
   - list_mcp_servers
   - mcp_server_info
+  - mcp_tool_info
   - manage_mcp
   - execute_javascript
   - delete_handler
@@ -52,6 +54,28 @@ allowed-tools:
 MCP (Model Context Protocol) servers provide external tool capabilities — M365
 services, GitHub, databases, custom APIs. Follow this exact workflow:
 
+## Default Behaviour: Handler-Only MCP Execution
+
+For normal user questions against external services — read, list, search, lookup,
+summarise recent items — use focused discovery, then execute MCP calls inside a
+registered handler:
+
+```
+list_mcp_servers()
+manage_mcp({ action: "connect", name: "<server>" })
+mcp_tool_info({ name: "<server>", query: "<what you need>" })
+apply_profile({ profiles: "mcp-network" }) // external MCP calls need wall-clock time
+register_handler(...) // import from host:mcp-<server>, await the selected tool
+execute_javascript(...)
+```
+
+Do **not** call MCP server tools directly from LLM tools. The handler is the
+auditable execution boundary for MCP calls. Avoid `file-builder` and
+`fs-write`/`fs-read` unless the user asked for an artifact or the task truly
+needs large intermediate output. If a result is too large, first retry with
+narrower handler arguments: `limit`, `top`, `$top`, `$select`, `$filter`, date
+ranges, search query, or a more specific tool.
+
 ### Step 1: Discover configured servers
 
 ```
@@ -71,36 +95,86 @@ manage_mcp({ action: "connect", name: "work-iq-mail" })
 - If not approved → prompts the user for approval (shows tools + security info)
 - Returns `{ success: true, tools: [...], module: "host:mcp-<name>" }`
 
-### Step 3: Get tool schemas
+### Step 3: Get focused tool schemas
+
+```
+mcp_tool_info({ name: "work-iq-mail", query: "search recent messages" })
+```
+
+Returns JSON Schema for the relevant tools plus TypeScript declarations. Read
+this BEFORE writing handler code — tool names and parameter shapes vary per
+server.
+
+If you already know the tool names, request only those tools:
+
+```
+mcp_tool_info({ name: "work-iq-mail", tools: ["SearchEmails", "GetEmail"] })
+```
+
+Use `mcp_server_info({ name: "work-iq-mail", query: "..." })` only when you
+need server-level details as well. Avoid dumping every schema unless the user
+explicitly asks to inspect the whole server.
+
+### Step 4: Apply the MCP network profile
 
 ```
-mcp_server_info("work-iq-mail")
+apply_profile({ profiles: "mcp-network" })
 ```
 
-Returns full JSON Schema for every tool plus TypeScript declarations. Read this
-BEFORE writing handler code — tool names and parameter shapes vary per server.
+MCP handlers wait on external service calls, so the default 5s wall-clock limit
+is often too small even when CPU usage is low. Use `mcp-network` before
+executing MCP handlers. It raises wall time without enabling file plugins.
 
-### Step 4: Use the tools in handler code
+### Step 5: Register handler code that calls MCP tools
+
+For reads, searches, and lookups, generate handler code that imports from the
+server module and awaits the selected MCP tool:
 
 ```javascript
 import { SearchEmails } from "host:mcp-work-iq-mail";
 
 export default async function handler(event) {
-  const result = await SearchEmails({ query: "from:boss subject:urgent" });
-  return { content: [{ type: "text", text: JSON.stringify(result) }] };
+  const result = await SearchEmails({
+    query: "from:boss subject:urgent",
+    top: 5,
+  });
+  if (!result.ok) return result;
+  return { content: [{ type: "text", text: JSON.stringify(result.data) }] };
 }
 ```
 
+MCP calls return a stable envelope inside handler code:
+
+```javascript
+{
+  ok: true,
+  data: { /* parsed primary result */ },
+  text: "...",      // original text content when available
+  raw: [/* MCP content */],
+  meta: [/* secondary content such as correlation IDs */]
+}
+```
+
+On failure they return `{ ok: false, error: "..." }`. Always check `ok` and
+`error` before using `data`.
+
+### Step 6: Execute the handler and iterate narrowly
+
+Run the handler with `execute_javascript`. If output is too large, edit the
+handler to narrow the MCP request before enabling file plugins.
+
 Key rules:
 
 - Import from `host:mcp-<server-name>` (the name from list_mcp_servers)
-- Tool function names are EXACTLY as returned by mcp_server_info
+- Apply `mcp-network` before running MCP handlers; network I/O hits wall-clock limits
+- Tool function names are EXACTLY as returned by mcp_tool_info
 - All MCP tool calls are async — use `await`
-- Tools return `{ content: [{type, text}] }` — parse the text field as needed
-- Some servers return embedded JSON (status text + JSON) — extract the JSON part
+- Tools return `{ ok, data, text, raw, meta, error }`; check `ok`/`error` first
+- `data` is the parsed primary result; use `raw` only when debugging envelopes
+- If output is large, narrow the MCP request in handler code before trying file plugins
 - **Write operations** (tools not marked `readOnlyHint: true`) may prompt the
   user for approval before executing. If denied, the tool returns
-  `{ error: "Operation denied..." }` — handle this gracefully and explain
+  `{ ok: false, error: "Operation denied..." }` — handle this gracefully and explain
   to the user what happened. Do NOT retry denied operations.
 
 ### Server name patterns
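
Note on the "narrow the MCP request" rule added to `SKILL.md` above: the guidance to retry with a smaller `top` or `limit` before enabling file plugins can be expressed as a small loop inside handler code. This is a sketch under stated assumptions, not part of the PR. `callWithNarrowing` is a hypothetical helper, and it assumes the selected tool accepts a `top` parameter (parameter names vary per server, so check `mcp_tool_info` first). The `maxChars` budget is an arbitrary illustration value.

```javascript
// Hypothetical helper; not part of HyperAgent.
// Calls `tool` with progressively smaller `top` values until the serialized
// `data` fits within `maxChars`. Failure envelopes (including denials) are
// returned immediately and never retried.
async function callWithNarrowing(tool, args, options = {}) {
  const { maxChars = 4000, minTop = 1 } = options;
  let top = args.top ?? 20;
  for (;;) {
    const result = await tool({ ...args, top });
    if (!result || result.ok !== true) return result;
    const size = JSON.stringify(result.data ?? null).length;
    // Stop when the result fits or the request cannot be narrowed further.
    if (size <= maxChars || top <= minTop) return result;
    top = Math.max(minTop, Math.floor(top / 2));
  }
}
```

Inside a handler this would wrap the imported tool, for example `await callWithNarrowing(SearchEmails, { query: "from:boss", top: 20 })`. Halving `top` is one simple choice; a tighter `$filter` or date range may narrow results more usefully for a given server.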