diff --git a/docs/MCP.md b/docs/MCP.md index a951359..2f2dd11 100644 --- a/docs/MCP.md +++ b/docs/MCP.md @@ -29,7 +29,7 @@ Add servers to `~/.hyperagent/config.json` (same format as VS Code's `mcp.json`) Or use the setup script: ```bash -just mcp-setup-everything # sets up the MCP everything test server +hyperagent --mcp-setup-everything # sets up the MCP everything test server ``` ### 2. Start HyperAgent @@ -127,6 +127,27 @@ are shown during approval. ## Commands +### Standalone setup commands + +These command-line options run one setup/show action and then exit. They do +not start an agent session, and they do not require the repository Justfile. + +| Command | Action | +|---------|--------| +| `hyperagent --mcp-setup-everything` | Configure the MCP everything test server. Requires npm/npx; first use downloads `@modelcontextprotocol/server-everything`. | +| `hyperagent --mcp-setup-github` | Configure the GitHub MCP server. Requires npm/npx and `GITHUB_TOKEN`; the command will remind you to use `gh auth token` if needed. | +| `hyperagent --mcp-setup-filesystem [dir]` | Configure the filesystem MCP server rooted at `dir` (default `/tmp/mcp-fs`). Requires npm/npx; first use downloads `@modelcontextprotocol/server-filesystem`. | +| `hyperagent --mcp-show-config` | Print configured MCP servers from `~/.hyperagent/config.json`. | +| `hyperagent --mcp-setup-workiq` | Configure Microsoft Work IQ stdio MCP. Pre-fetches `@microsoft/workiq@latest`, runs its interactive EULA command, then writes config. | +| `hyperagent --mcp-add-http [clientId] [tenantId] [scopes] [flow]` | Add a generic HTTP MCP server, optionally with OAuth. | +| `hyperagent --mcp-m365-create-app [args...]` | Create/reuse an Entra app registration for Agent 365 HTTP MCP servers. Requires Azure CLI and `az login`. | +| `hyperagent --mcp-setup-m365 [args...]` | Configure Agent 365 per-service HTTP MCP servers and pre-approve them. 
| +| `hyperagent --mcp-m365-refresh-servers [args...]` | Refresh the user M365 server catalog using a cached or supplied bearer token. | +| `hyperagent --mcp-m365-show` | Show saved M365 app registration details. | + +The Justfile recipes with matching names are development conveniences for this +repository; the `hyperagent` options above are the user-facing path. + ### Slash commands | Command | Action | @@ -216,7 +237,7 @@ gate uses the MCP spec's `ToolAnnotations` (hints from the server): The gate runs on the **host side** while the guest VM is paused — the LLM's handler code sees either a normal result or -`{ error: "Operation denied..." }`. The LLM doesn't need to know about +`{ ok: false, error: "Operation denied..." }`. The LLM doesn't need to know about the gate; it writes code normally. Example prompt shown to the user: @@ -247,7 +268,7 @@ MCP tools with native PPTX generation in a single workflow. export GITHUB_TOKEN=$(gh auth token) # Configure the GitHub MCP server -just mcp-setup-github +hyperagent --mcp-setup-github ``` This creates `~/.hyperagent/config.json` with the GitHub server configured, @@ -341,7 +362,7 @@ registration. ### One-shot setup ```bash -just mcp-setup-workiq +hyperagent --mcp-setup-workiq ``` This writes the following entry to `~/.hyperagent/config.json`: @@ -417,11 +438,12 @@ Instead of the single stdio `workiq` server you can connect to the per-service Agent 365 HTTP endpoints directly. This gives you finer `/mcp enable` control per M365 service and uses MSAL for OAuth. -The setup script uses the VS Code MCP extension's pre-registered client ID -(`aebc6443-...`) which has `McpServers.*` scopes admin-consented in all -M365 Copilot tenants — no per-tenant app registration needed. +Use an Entra public-client app registration for OAuth. You can create or reuse +one with `hyperagent --mcp-m365-create-app`, then configure the per-service MCP +entries from the saved app details. 
-21 servers are available (see the full list with `just mcp-setup-m365 list`). +The bundled catalog includes the available Agent 365 servers (see the full list +with `hyperagent --mcp-setup-m365 list`). Common ones: | Config entry | Service | @@ -437,25 +459,27 @@ Common ones: #### Setup ```bash +# One-time: create or reuse an Entra public-client app registration +hyperagent --mcp-m365-create-app + # Configure all M365 servers with browser auth (one-time) -just mcp-setup-m365 all \ - aebc6443-996d-45c2-90f0-388ff96faa56 \ - \ - "" browser +hyperagent --mcp-setup-m365 all \ + "" browser # Or a subset -just mcp-setup-m365 "mail,teams,planner" \ - aebc6443-996d-45c2-90f0-388ff96faa56 \ - \ - "" browser +hyperagent --mcp-setup-m365 "mail,teams,planner" \ + "" browser # List available services -just mcp-setup-m365 list +hyperagent --mcp-setup-m365 list ``` This writes config entries AND pre-approves all configured servers so the LLM can connect them without interactive prompts. +If you just ran `hyperagent --mcp-m365-create-app`, you can pass empty strings +for the client ID and tenant ID to use the saved app details. + #### Auth flows The `FLOW` argument (last positional) is **required**: @@ -478,25 +502,23 @@ works with cached tokens. #### Custom Entra app registration -If your tenant blocks the VS Code client ID, create your own app: +If you already have a tenant-owned public-client app registration, pass it +explicitly instead of using the saved app state: ```bash -just mcp-m365-create-app -# Then use your app's client ID: -just mcp-setup-m365 all "" browser +hyperagent --mcp-setup-m365 all "" browser ``` #### Scope -All servers use `ea9ffc3e-8a23-4a7d-836d-234d7c7565c1/.default` (the Agent 365 -resource app ID with `.default`), which requests all pre-consented scopes in -one shot. This matches what [a365cli](https://github.com/sozercan/a365cli) uses. 
+All servers use the Agent 365 resource `.default` scope, which requests the +pre-consented Agent 365 MCP scopes in one shot. #### Refreshing the server catalog ```bash -just mcp-m365-refresh-servers # uses cached OAuth token -just mcp-m365-refresh-servers --token # explicit token +hyperagent --mcp-m365-refresh-servers # uses cached OAuth token +hyperagent --mcp-m365-refresh-servers --token # explicit token ``` ## HTTP Transport & OAuth diff --git a/scripts/build-binary.js b/scripts/build-binary.js index a5a4e1a..ec5974b 100644 --- a/scripts/build-binary.js +++ b/scripts/build-binary.js @@ -380,6 +380,14 @@ if (existsSync(skillsSrc)) { copyDirRecursive(skillsSrc, skillsDst); } +// Copy MCP setup data needed by standalone CLI setup commands. +const scriptsDst = join(LIB_DIR, "scripts"); +mkdirSync(scriptsDst, { recursive: true }); +const m365CatalogSrc = join(ROOT, "scripts", "m365-mcp-servers.json"); +if (existsSync(m365CatalogSrc)) { + copyFileSync(m365CatalogSrc, join(scriptsDst, "m365-mcp-servers.json")); +} + // Copy @github/copilot CLI (needed by copilot-sdk at runtime) // The SDK uses import.meta.resolve("@github/copilot/sdk") to find the CLI console.log("📦 Copying Copilot CLI runtime..."); diff --git a/skills/mcp-services/SKILL.md b/skills/mcp-services/SKILL.md index a989e7b..5a8d311 100644 --- a/skills/mcp-services/SKILL.md +++ b/skills/mcp-services/SKILL.md @@ -19,12 +19,14 @@ triggers: antiPatterns: - Don't try to manage_plugin("mcp:") — MCP servers are NOT regular plugins - Don't import from "host:mcp-gateway" — that's the gateway sentinel, not a server - - Don't guess tool names — always call mcp_server_info() first + - Don't guess tool names or parameters — always call mcp_tool_info() first - Don't hardcode MCP tool schemas — they change when servers update + - Don't call MCP server tools directly from LLM tools — execute them only inside generated handler code allowed-tools: - register_handler - list_mcp_servers - mcp_server_info + - mcp_tool_info - 
manage_mcp - execute_javascript - delete_handler @@ -52,6 +54,28 @@ allowed-tools: MCP (Model Context Protocol) servers provide external tool capabilities — M365 services, GitHub, databases, custom APIs. Follow this exact workflow: +## Default Behaviour: Handler-Only MCP Execution + +For normal user questions against external services — read, list, search, lookup, +summarise recent items — use focused discovery, then execute MCP calls inside a +registered handler: + +``` +list_mcp_servers() +manage_mcp({ action: "connect", name: "" }) +mcp_tool_info({ name: "", query: "" }) +apply_profile({ profiles: "mcp-network" }) // external MCP calls need wall-clock time +register_handler(...) // import from host:mcp-, await the selected tool +execute_javascript(...) +``` + +Do **not** call MCP server tools directly from LLM tools. The handler is the +auditable execution boundary for MCP calls. Avoid `file-builder` and +`fs-write`/`fs-read` unless the user asked for an artifact or the task truly +needs large intermediate output. If a result is too large, first retry with +narrower handler arguments: `limit`, `top`, `$top`, `$select`, `$filter`, date +ranges, search query, or a more specific tool. + ### Step 1: Discover configured servers ``` @@ -71,36 +95,86 @@ manage_mcp({ action: "connect", name: "work-iq-mail" }) - If not approved → prompts the user for approval (shows tools + security info) - Returns `{ success: true, tools: [...], module: "host:mcp-" }` -### Step 3: Get tool schemas +### Step 3: Get focused tool schemas + +``` +mcp_tool_info({ name: "work-iq-mail", query: "search recent messages" }) +``` + +Returns JSON Schema for the relevant tools plus TypeScript declarations. Read +this BEFORE writing handler code — tool names and parameter shapes vary per +server. 
+ +If you already know the tool names, request only those tools: + +``` +mcp_tool_info({ name: "work-iq-mail", tools: ["SearchEmails", "GetEmail"] }) +``` + +Use `mcp_server_info({ name: "work-iq-mail", query: "..." })` only when you +need server-level details as well. Avoid dumping every schema unless the user +explicitly asks to inspect the whole server. + +### Step 4: Apply the MCP network profile ``` -mcp_server_info("work-iq-mail") +apply_profile({ profiles: "mcp-network" }) ``` -Returns full JSON Schema for every tool plus TypeScript declarations. Read this -BEFORE writing handler code — tool names and parameter shapes vary per server. +MCP handlers wait on external service calls, so the default 5s wall-clock limit +is often too small even when CPU usage is low. Use `mcp-network` before +executing MCP handlers. It raises wall time without enabling file plugins. -### Step 4: Use the tools in handler code +### Step 5: Register handler code that calls MCP tools + +For reads, searches, and lookups, generate handler code that imports from the +server module and awaits the selected MCP tool: ```javascript import { SearchEmails } from "host:mcp-work-iq-mail"; export default async function handler(event) { - const result = await SearchEmails({ query: "from:boss subject:urgent" }); - return { content: [{ type: "text", text: JSON.stringify(result) }] }; + const result = await SearchEmails({ + query: "from:boss subject:urgent", + top: 5, + }); + if (!result.ok) return result; + return { content: [{ type: "text", text: JSON.stringify(result.data) }] }; } ``` +MCP calls return a stable envelope inside handler code: + +```javascript +{ + ok: true, + data: { /* parsed primary result */ }, + text: "...", // original text content when available + raw: [/* MCP content */], + meta: [/* secondary content such as correlation IDs */] +} +``` + +On failure they return `{ ok: false, error: "..." }`. Always check `ok` and +`error` before using `data`. 
+ +### Step 6: Execute the handler and iterate narrowly + +Run the handler with `execute_javascript`. If output is too large, edit the +handler to narrow the MCP request before enabling file plugins. + Key rules: - Import from `host:mcp-` (the name from list_mcp_servers) -- Tool function names are EXACTLY as returned by mcp_server_info +- Apply `mcp-network` before running MCP handlers; network I/O hits wall-clock limits +- Tool function names are EXACTLY as returned by mcp_tool_info - All MCP tool calls are async — use `await` -- Tools return `{ content: [{type, text}] }` — parse the text field as needed -- Some servers return embedded JSON (status text + JSON) — extract the JSON part +- Tools return `{ ok, data, text, raw, error }` — check `ok`/`error` first +- `data` is the parsed primary result; use `raw` only when debugging envelopes +- If output is large, narrow the MCP request in handler code before trying file plugins - **Write operations** (tools not marked `readOnlyHint: true`) may prompt the user for approval before executing. If denied, the tool returns - `{ error: "Operation denied..." }` — handle this gracefully and explain + `{ ok: false, error: "Operation denied..." }` — handle this gracefully and explain to the user what happened. Do NOT retry denied operations. ### Server name patterns diff --git a/src/agent/cli-parser.ts b/src/agent/cli-parser.ts index bb9a2ff..8878829 100644 --- a/src/agent/cli-parser.ts +++ b/src/agent/cli-parser.ts @@ -8,6 +8,7 @@ // ───────────────────────────────────────────────────────────────────── import { readFileSync } from "node:fs"; +import type { MCPSetupCommand } from "./mcp/setup-commands.js"; export interface CliConfig { model: string; @@ -68,6 +69,18 @@ export interface CliConfig { * Show version and exit. */ showVersion: boolean; + /** + * Standalone MCP setup/config command. Runs and exits before agent startup. 
+ */ + mcpSetupCommand?: MCPSetupCommand; +} + +function setMCPSetupCommand(config: CliConfig, command: MCPSetupCommand): void { + if (config.mcpSetupCommand) { + console.error("Only one MCP setup option can be used per invocation"); + process.exit(1); + } + config.mcpSetupCommand = command; } function printUsage(): void { @@ -97,7 +110,7 @@ Options: --tune Capture LLM decision/reasoning logs to ~/.hyperagent/logs/ --profile Apply resource profile at startup (limits only) Stack: --profile "web-research heavy-compute" - Profiles: default, file-builder, web-research, heavy-compute + Profiles: default, file-builder, web-research, heavy-compute, mcp-network --auto-approve Auto-approve all interactive prompts (YOLO mode) --prompt "" Send a prompt non-interactively and exit after completion --prompt-file Read prompt from a file (avoids shell quoting issues) @@ -106,6 +119,19 @@ Options: --version, -v Show version and exit --help, -h Show this help message +Standalone MCP setup commands (run and exit): + --mcp-setup-everything Configure the MCP everything test server + --mcp-setup-github Configure the GitHub MCP server (uses GITHUB_TOKEN) + --mcp-setup-filesystem [dir] Configure the filesystem MCP server (default: /tmp/mcp-fs) + --mcp-show-config Show configured MCP servers + --mcp-setup-workiq Configure Microsoft Work IQ stdio MCP server + --mcp-add-http [clientId] [tenantId] [scopes] [flow] + Add a generic HTTP MCP server + --mcp-m365-create-app [args...] Create/reuse Entra app for M365 HTTP MCP + --mcp-setup-m365 [args...] Configure Agent 365 HTTP MCP services + --mcp-m365-refresh-servers [args...] 
Refresh the M365 MCP server catalog + --mcp-m365-show Show saved M365 app registration details + Plugin commands (at the REPL prompt): /plugins List discovered plugins /enable Audit, configure, and enable a plugin @@ -311,6 +337,56 @@ export function parseCliArgs( case "-v": config.showVersion = true; break; + case "--mcp-setup-everything": + setMCPSetupCommand(config, { kind: "setup-everything" }); + break; + case "--mcp-setup-github": + setMCPSetupCommand(config, { kind: "setup-github" }); + break; + case "--mcp-setup-filesystem": { + const next = argv[i + 1]; + const dir = next && !next.startsWith("--") ? next : "/tmp/mcp-fs"; + if (dir === next) i++; + setMCPSetupCommand(config, { kind: "setup-filesystem", dir }); + break; + } + case "--mcp-show-config": + setMCPSetupCommand(config, { kind: "show-config" }); + break; + case "--mcp-setup-workiq": + setMCPSetupCommand(config, { kind: "setup-workiq" }); + break; + case "--mcp-add-http": + setMCPSetupCommand(config, { + kind: "add-http", + args: argv.slice(i + 1), + }); + i = argv.length; + break; + case "--mcp-m365-create-app": + setMCPSetupCommand(config, { + kind: "m365-create-app", + args: argv.slice(i + 1), + }); + i = argv.length; + break; + case "--mcp-setup-m365": + setMCPSetupCommand(config, { + kind: "m365-setup", + args: argv.slice(i + 1), + }); + i = argv.length; + break; + case "--mcp-m365-refresh-servers": + setMCPSetupCommand(config, { + kind: "m365-refresh-servers", + args: argv.slice(i + 1), + }); + i = argv.length; + break; + case "--mcp-m365-show": + setMCPSetupCommand(config, { kind: "m365-show" }); + break; case "--help": case "-h": printUsage(); diff --git a/src/agent/command-suggestions.ts b/src/agent/command-suggestions.ts index fc20ce5..d94f6e6 100644 --- a/src/agent/command-suggestions.ts +++ b/src/agent/command-suggestions.ts @@ -24,6 +24,14 @@ export const ACTIONABLE_COMMAND_PREFIXES = [ */ const PLACEHOLDER_RE = /example\.(?:com|net|org)|<[^>]+>/i; +function 
cleanCommandCandidate(candidate: string): string { + return candidate + .trim() + .replace(/^[`*_]+/g, "") + .replace(/[`*_]+$/g, "") + .trim(); +} + /** * Scan the assistant's response text for slash commands that match * actionable prefixes. Returns deduplicated commands in order. @@ -35,34 +43,40 @@ export function extractSuggestedCommands(text: string): string[] { const commands: string[] = []; const seen = new Set(); + const addCommand = (candidate: string): void => { + const cmd = cleanCommandCandidate(candidate); + if (!cmd || seen.has(cmd) || PLACEHOLDER_RE.test(cmd)) return; + seen.add(cmd); + commands.push(cmd); + }; + // Pattern 1: commands inside backticks — `/plugin enable fetch ...` // This catches inline code references the LLM wraps in backticks. const backtickRe = /`(\/(?:plugin\s+enable|plugin\s+disable|mcp\s+enable|buffer|timeout|set)\s[^`]+)`/gi; for (const m of text.matchAll(backtickRe)) { - const cmd = m[1].trim(); - if (!seen.has(cmd) && !PLACEHOLDER_RE.test(cmd)) { - seen.add(cmd); - commands.push(cmd); - } + addCommand(m[1]); + } + + // Pattern 2: commands inside markdown bold — **/mcp enable ...** + // The model often emphasises auth/setup commands this way. + const boldRe = + /\*\*(\/(?:plugin\s+enable|plugin\s+disable|mcp\s+enable|buffer|timeout|set)\s(?:(?!\*\*)[^\n])+)\*\*/gi; + for (const m of text.matchAll(boldRe)) { + addCommand(m[1]); } - // Pattern 2: bare commands as the start of a line (possibly indented). + // Pattern 3: bare commands as the start of a line (possibly indented). // Only matched if not already found via backtick pattern. 
for (const line of text.split("\n")) { - const trimmed = line.trim(); + const trimmed = cleanCommandCandidate(line); if ( trimmed.startsWith("/") && ACTIONABLE_COMMAND_PREFIXES.some((p) => trimmed.toLowerCase().startsWith(p.toLowerCase()), ) ) { - // Strip any trailing markdown/punctuation the LLM might append - const cleaned = trimmed.replace(/[`*_]+$/g, "").trim(); - if (cleaned && !seen.has(cleaned) && !PLACEHOLDER_RE.test(cleaned)) { - seen.add(cleaned); - commands.push(cleaned); - } + addCommand(trimmed); } } diff --git a/src/agent/index.ts b/src/agent/index.ts index 2fc139a..63d0568 100644 --- a/src/agent/index.ts +++ b/src/agent/index.ts @@ -184,6 +184,12 @@ if (cli.showVersion) { process.exit(0); } +// ── Standalone MCP setup/config commands: run and exit ────────────── +if (cli.mcpSetupCommand) { + runMCPSetupCommand(cli.mcpSetupCommand, { contentRoot: CONTENT_ROOT }); + process.exit(0); +} + // Propagate CLI → env vars (so sandbox-tool.js and other modules pick them up) process.env.COPILOT_MODEL = cli.model; process.env.HYPERLIGHT_CPU_TIMEOUT_MS = cli.cpuTimeout; @@ -741,6 +747,7 @@ import { isMCPHttpConfig, isMCPStdioConfig, mcpConfigDisplayString, + type MCPToolSchema, } from "./mcp/types.js"; import { createMCPClientManager, @@ -758,6 +765,8 @@ import { auditMCPTools, } from "./mcp/approval.js"; import { canAcquireSilently } from "./mcp/auth/msal-oauth.js"; +import { runMCPSetupCommand } from "./mcp/setup-commands.js"; +import { findMCPTool, selectMCPTools } from "./mcp/tool-utils.js"; // Load MCP config from ~/.hyperagent/config.json let mcpManager: MCPClientManager | null = null; @@ -3607,7 +3616,7 @@ const mcpServerInfoTool = defineTool("mcp_server_info", { "Get detailed information about a specific MCP server including its", "tool schemas, connection state, and TypeScript declarations.", "Use list_mcp_servers first to see available server names.", - "CALL THIS before writing handler code that uses host:mcp-* modules.", + "Prefer mcp_tool_info for 
focused schema lookup before choosing tools.", ].join("\n"), parameters: { type: "object", @@ -3616,10 +3625,35 @@ const mcpServerInfoTool = defineTool("mcp_server_info", { type: "string", description: 'MCP server name (e.g. "github", "everything").', }, + tools: { + type: "array", + description: + "Optional tool names to include. Use this to avoid huge all-server schema dumps.", + items: { type: "string" }, + }, + query: { + type: "string", + description: + 'Optional search query for relevant tools (e.g. "messages chats recent").', + }, + limit: { + type: "number", + description: "Maximum tools to return when filtering or searching.", + }, }, required: ["name"], }, - handler: async ({ name }: { name: string }) => { + handler: async ({ + name, + tools, + query, + limit, + }: { + name: string; + tools?: string[]; + query?: string; + limit?: number; + }) => { const err = requireMCPEnabled(); if (err) return { error: err }; @@ -3637,10 +3671,15 @@ const mcpServerInfoTool = defineTool("mcp_server_info", { }; } - // Generate TypeScript declarations for the server's tools + const selection = selectMCPTools(conn.tools, { tools, query, limit }); + + // Generate TypeScript declarations for the selected tools let declarations: string | null = null; - if (conn.tools.length > 0) { - declarations = generateMCPDeclarations(conn.name, conn.tools); + if (selection.tools.length > 0) { + const selectedSchemas = selection.tools + .map((tool) => findMCPTool(conn.tools, tool.name)) + .filter((tool): tool is MCPToolSchema => tool !== undefined); + declarations = generateMCPDeclarations(conn.name, selectedSchemas); } return { @@ -3649,15 +3688,111 @@ const mcpServerInfoTool = defineTool("mcp_server_info", { transport: isMCPHttpConfig(conn.config) ? 
"http" : "stdio", endpoint: mcpConfigDisplayString(conn.config), toolCount: conn.tools.length, - tools: conn.tools.map((t) => ({ - name: t.name, - description: t.description, - parameters: t.inputSchema, - })), + returnedToolCount: selection.tools.length, + totalMatches: selection.totalMatches, + missingTools: selection.missing, + tools: selection.tools, module: `host:mcp-${conn.name}`, - importPattern: `import { ${conn.tools.map((t) => t.name).join(", ")} } from "host:mcp-${conn.name}";`, + importPattern: + selection.tools.length > 0 + ? `import { ${selection.tools.map((t) => t.name).join(", ")} } from "host:mcp-${conn.name}";` + : null, declarations, lastError: conn.lastError ?? null, + hint: "Use register_handler to import selected tools from the host:mcp-* module and execute MCP calls inside sandbox handler code.", + }; + }, +}); + +const mcpToolInfoTool = defineTool("mcp_tool_info", { + description: [ + "Get focused schema information for relevant tools on one MCP server.", + "Use this before writing handler code so you do not guess parameter names.", + "Pass either explicit tools or a natural-language query.", + "After this, register a handler that imports the selected tools from host:mcp-.", + ].join("\n"), + parameters: { + type: "object", + properties: { + name: { + type: "string", + description: 'MCP server name (e.g. "github", "work-iq-teams").', + }, + tools: { + type: "array", + description: "Specific MCP tool names to inspect.", + items: { type: "string" }, + }, + query: { + type: "string", + description: + 'Search terms for relevant tools (e.g. 
"last messages chats").', + }, + limit: { + type: "number", + description: "Maximum matching tools to return; defaults to 8.", + }, + }, + required: ["name"], + }, + handler: async ({ + name, + tools, + query, + limit, + }: { + name: string; + tools?: string[]; + query?: string; + limit?: number; + }) => { + const err = requireMCPEnabled(); + if (err) return { error: err }; + + state.hasCalledListModules = true; + + const conn = mcpManager!.getConnection(name); + if (!conn) { + const available = mcpManager! + .listServers() + .map((c) => c.name) + .join(", "); + return { + error: `MCP server "${name}" not found. Available: ${available || "(none)"}`, + }; + } + + if (conn.state !== "connected") { + return { + name: conn.name, + state: conn.state, + tools: [], + hint: `Connect first with manage_mcp({ action: "connect", name: "${conn.name}" }).`, + }; + } + + const selection = selectMCPTools(conn.tools, { tools, query, limit }); + const selectedSchemas = selection.tools + .map((tool) => findMCPTool(conn.tools, tool.name)) + .filter((tool): tool is MCPToolSchema => tool !== undefined); + + return { + name: conn.name, + state: conn.state, + module: `host:mcp-${conn.name}`, + returnedToolCount: selection.tools.length, + totalMatches: selection.totalMatches, + missingTools: selection.missing, + tools: selection.tools, + declarations: + selectedSchemas.length > 0 + ? generateMCPDeclarations(conn.name, selectedSchemas) + : null, + handlerImportPattern: + selection.tools.length > 0 + ? `import { ${selection.tools.map((t) => t.name).join(", ")} } from "host:mcp-${conn.name}";` + : null, + handlerCallPattern: `const result = await ${selection.tools[0]?.name ?? 
"ToolName"}({ /* args from schema */ });`, }; }, }); @@ -3670,6 +3805,7 @@ const manageMCPTool = defineTool("manage_mcp", { "", "After connecting, the server's tools are available as host:mcp-", "modules in handler code.", + "After connecting, use mcp_tool_info, then register a handler that imports and awaits the selected MCP tools.", ].join("\n"), parameters: { type: "object", @@ -3711,6 +3847,8 @@ const manageMCPTool = defineTool("manage_mcp", { success: true, message: `"${params.name}" is already connected with ${conn.tools.length} tool(s).`, tools: conn.tools.map((t) => t.name), + nextStep: + "Call mcp_tool_info({ name, query }) to choose a tool, then register_handler with imports from host:mcp-.", }; } @@ -3821,6 +3959,8 @@ const manageMCPTool = defineTool("manage_mcp", { message: `"${params.name}" connected with ${connected.tools.length} tool(s).`, module: `host:mcp-${params.name}`, tools: connected.tools.map((t) => t.name), + nextStep: + "Call mcp_tool_info({ name, query }) to choose a tool, then register_handler with imports from host:mcp-.", }; } catch (err) { return { @@ -4640,6 +4780,7 @@ function buildSessionConfig() { // MCP SDK tools — gated inside handlers, always registered listMCPServersTool, mcpServerInfoTool, + mcpToolInfoTool, manageMCPTool, // Conditionally include tuning tool — only when --tune is active ...(state.tuneEnabled ? [llmThoughtTool] : []), @@ -4675,6 +4816,7 @@ function buildSessionConfig() { // MCP tools — always listed, gated inside handler "list_mcp_servers", "mcp_server_info", + "mcp_tool_info", "manage_mcp", // Conditionally expose tuning tool ...(state.tuneEnabled ? 
["llm_thought"] : []), diff --git a/src/agent/mcp/client-manager.ts b/src/agent/mcp/client-manager.ts index 6967242..d20f330 100644 --- a/src/agent/mcp/client-manager.ts +++ b/src/agent/mcp/client-manager.ts @@ -37,6 +37,16 @@ import { deleteCachedSession, } from "./session-cache.js"; +export interface MCPToolCallResult { + ok: boolean; + data?: unknown; + text?: string; + raw?: unknown; + meta?: unknown[]; + error?: string; + truncated?: boolean; +} + /** * Create an MCP client manager that handles connection lifecycle, * tool discovery, and tool execution for configured MCP servers. @@ -286,7 +296,7 @@ export function createMCPClientManager() { serverName: string, toolName: string, args: Record, - ): Promise { + ): Promise<MCPToolCallResult> { let conn = connections.get(serverName); if (!conn) { throw new Error(`[mcp] Unknown server: ${serverName}`); @@ -327,20 +337,37 @@ export function createMCPClientManager() { // Check for errors if (result.isError) { const errorText = extractTextContent(result.content); - return { error: errorText }; + return { + ok: false, + error: errorText, + text: errorText, + raw: result.content, + }; } - // Extract content, enforce size limit - const content = extractContent(result.content); - const contentStr = JSON.stringify(content); - if (contentStr.length > MCP_MAX_RESPONSE_BYTES) { + // Extract content, enforce size limit against the useful payload first. + // raw duplicates the original MCP content and may be much larger than + // the parsed data/text the handler actually needs. + const normalised = normaliseToolResult(result.content); + const payloadSize = JSON.stringify({ + data: normalised.data, + text: normalised.text, + meta: normalised.meta, + }).length; + if (payloadSize > MCP_MAX_RESPONSE_BYTES) { return { - error: `Response too large (${contentStr.length} bytes). Maximum is ${MCP_MAX_RESPONSE_BYTES} bytes.`, + ok: false, + error: `Response too large (${payloadSize} bytes). 
Maximum is ${MCP_MAX_RESPONSE_BYTES} bytes.`, truncated: true, }; } - return content; + const fullSize = JSON.stringify(normalised).length; + if (fullSize > MCP_MAX_RESPONSE_BYTES) { + return { ...normalised, raw: undefined, truncated: true }; + } + + return normalised; } catch (err) { // Server may have died or network is down — mark as error for reconnect const msg = (err as Error).message; @@ -355,7 +382,7 @@ export function createMCPClientManager() { conn.state = "error"; conn.lastError = msg; } - return { error: `[mcp] Tool call failed: ${msg}` }; + return { ok: false, error: `[mcp] Tool call failed: ${msg}` }; } } @@ -537,6 +564,70 @@ function extractContent(content: any[]): unknown { }); } +/** + * Convert arbitrary MCP content into the stable shape exposed to the LLM. + */ +// eslint-disable-next-line @typescript-eslint/no-explicit-any +export function normaliseToolResult(content: any[]): MCPToolCallResult { + const extracted = extractContent(content); + const text = extractTextContent(content); + const { data, meta } = selectPrimaryData(extracted); + + return { + ok: true, + data, + ...(text ? { text } : {}), + raw: content, + ...(meta.length > 0 ? 
{ meta } : {}), + }; +} + +function selectPrimaryData(extracted: unknown): { + data: unknown; + meta: unknown[]; +} { + if (!Array.isArray(extracted)) { + return { data: extracted, meta: [] }; + } + + const dataItems = extracted.filter(hasDataProperty); + if (dataItems.length === 0) { + return { data: extracted, meta: [] }; + } + + const structuredItems = dataItems.filter((item) => isStructured(item.data)); + if (structuredItems.length === 1) { + return { + data: structuredItems[0].data, + meta: extracted.filter((item) => item !== structuredItems[0]), + }; + } + + if (structuredItems.length > 1) { + return { + data: structuredItems.map((item) => item.data), + meta: extracted.filter((item) => !structuredItems.includes(item)), + }; + } + + return { + data: dataItems.map((item) => item.data), + meta: extracted.filter((item) => !dataItems.includes(item)), + }; +} + +function hasDataProperty(value: unknown): value is { data: unknown } { + return ( + typeof value === "object" && + value !== null && + Object.prototype.hasOwnProperty.call(value, "data") + ); +} + +function isStructured(value: unknown): boolean { + return typeof value === "object" && value !== null; +} + /** * Try to recover structured JSON from a text content payload. * Handles three patterns observed in the wild: diff --git a/src/agent/mcp/plugin-adapter.ts b/src/agent/mcp/plugin-adapter.ts index 6dcef5f..2abfa0a 100644 --- a/src/agent/mcp/plugin-adapter.ts +++ b/src/agent/mcp/plugin-adapter.ts @@ -15,6 +15,7 @@ import type { MCPToolSchema, MCPToolAnnotations, } from "./types.js"; +import { isReadOnlyMCPTool } from "./tool-utils.js"; /** * Callback that decides whether a write operation should proceed. @@ -74,10 +75,10 @@ export function createMCPPluginAdapter( functions[tool.name] = async (...args: unknown[]): Promise => { const toolArgs = (args[0] as Record) ?? {}; - // Write-safety gate: if the tool is not read-only, check - // with the gate before executing. 
The guest VM is paused - // during this check — it's safe to prompt the user. - if (gate && tool.annotations?.readOnlyHint !== true) { + // Write-safety gate: check tools that are not known or inferred + // read-only. The guest VM is paused during this check, so it is + // safe to prompt the user. + if (gate && !isReadOnlyMCPTool(tool)) { const allowed = await gate( conn.name, tool.name, @@ -86,6 +87,7 @@ export function createMCPPluginAdapter( ); if (!allowed) { return { + ok: false, error: `Operation denied: ${tool.name} on ${conn.name} was blocked by the write-safety gate. The user declined the operation.`, }; } @@ -126,7 +128,7 @@ export function generateMCPDeclarations( : "Record"; lines.push(`/** ${tool.description} */`); lines.push( - `export declare function ${tool.name}(input: ${paramType}): unknown;`, + `export declare function ${tool.name}(input: ${paramType}): Promise;`, ); lines.push(""); } @@ -145,8 +147,8 @@ export function generateMCPModuleHints( overview: `MCP server "${serverName}" — ${tools.length} tool(s) available via host:mcp-${serverName}`, criticalRules: [ `Import with: import { toolName } from "host:mcp-${serverName}"`, - "All calls are synchronous from the sandbox perspective (async auto-awaited)", - "Returns { error: string } on failure — always check for error field", + "All calls are async — use await", + "Returns { ok: boolean, data?: unknown, text?: string, error?: string } — always check ok/error", ], exports: tools.map((t) => ({ name: t.name, diff --git a/src/agent/mcp/setup-commands.ts b/src/agent/mcp/setup-commands.ts new file mode 100644 index 0000000..49182a8 --- /dev/null +++ b/src/agent/mcp/setup-commands.ts @@ -0,0 +1,1339 @@ +// ── MCP setup CLI commands ─────────────────────────────────────────── +// +// Standalone command-line helpers for configuring MCP servers without +// requiring users to download the repository Justfile or helper scripts. 
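The module hints changed above tell the LLM that calls are async and that results carry `ok`, `data`, `text`, and `error`. A minimal sketch of how guest code might consume that shape follows. `ToolResultSketch` and `unwrapToolResult` are hypothetical names introduced for illustration only; the real `MCPToolCallResult` type lives in `./types.js` and may differ.

```typescript
// Hypothetical mirror of the result shape named in the module hints.
// Not the project's MCPToolCallResult; fields are taken from the hint text only.
interface ToolResultSketch {
  ok: boolean;
  data?: unknown;
  text?: string;
  error?: string;
}

// Return the payload on success, or throw with the reported error.
// Covers both failure sources in this diff: a failed tool call and a
// write-safety gate denial, which both report ok: false plus an error string.
function unwrapToolResult(result: ToolResultSketch): unknown {
  if (!result.ok || result.error) {
    throw new Error(result.error ?? "MCP tool call failed");
  }
  // Prefer structured data; fall back to plain text when no data is present.
  return result.data !== undefined ? result.data : result.text;
}
```

A caller would typically write `unwrapToolResult(await someTool({ ... }))`, where `someTool` stands for any function imported from a `host:mcp-<server>` module.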
+ +import { createHash } from "node:crypto"; +import { spawnSync } from "node:child_process"; +import { + existsSync, + mkdirSync, + readFileSync, + readdirSync, + writeFileSync, +} from "node:fs"; +import { homedir, platform } from "node:os"; +import { dirname, join } from "node:path"; + +export type MCPSetupCommand = + | { kind: "setup-everything" } + | { kind: "setup-github" } + | { kind: "setup-filesystem"; dir: string } + | { kind: "show-config" } + | { kind: "setup-workiq" } + | { kind: "add-http"; args: string[] } + | { kind: "m365-create-app"; args: string[] } + | { kind: "m365-setup"; args: string[] } + | { kind: "m365-refresh-servers"; args: string[] } + | { kind: "m365-show" }; + +interface RunOptions { + contentRoot: string; +} + +interface OAuthAuth { + method: "oauth"; + flow: "browser" | "device-code"; + clientId: string; + scopes: string[]; + tenantId?: string; +} + +interface StdioServerEntry { + command: string; + args?: string[]; + env?: Record<string, string>; + allowTools?: string[]; + denyTools?: string[]; +} + +interface HttpServerEntry { + type: "http"; + url: string; + auth?: OAuthAuth; +} + +interface HyperAgentConfig { + mcpServers?: Record<string, StdioServerEntry | HttpServerEntry>; + [key: string]: unknown; +} + +interface SavedM365State { + clientId?: string; + tenantId?: string; + appName?: string; + callbackPort?: number; +} + +interface CatalogServer { + id?: string; + url: string; + scope: string; + audience?: string; + publisher?: string; +} + +interface Catalog { + _comment?: string; + resourceId?: string; + discoverEndpoint?: string; + callbackPort?: number; + servers: Record<string, CatalogServer>; +} + +interface DiscoveredServer { + readonly mcpServerName?: string; + readonly id?: string; + readonly url?: string; + readonly scope?: string; + readonly audience?: string; + readonly publisher?: string; +} + +interface DiscoveryPayload { + readonly mcpServers?: readonly DiscoveredServer[]; +} + +interface M365AppArgs { + appName: string; + callbackPort: number; + serviceRef: string; + clientId: string; +} + 
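The `MCPSetupCommand` union above is discriminated on `kind`, so each command-line option can map to exactly one variant. The flag parser itself is not part of this diff, so the following is only a sketch under that assumption: `SetupCommandSketch` is a trimmed copy of three variants, and `parseSetupFlag` is a hypothetical helper, not a function from the repository. The flag names and the `/tmp/mcp-fs` default are taken from the documentation table in `docs/MCP.md`.

```typescript
// Trimmed copy of the MCPSetupCommand union (three variants only).
type SetupCommandSketch =
  | { kind: "setup-everything" }
  | { kind: "setup-filesystem"; dir: string }
  | { kind: "add-http"; args: string[] };

// Hypothetical parser: maps the first argv entry to a command variant.
// Returns undefined when the flag is not one of the setup options handled here.
function parseSetupFlag(argv: string[]): SetupCommandSketch | undefined {
  const [flag, ...rest] = argv;
  switch (flag) {
    case "--mcp-setup-everything":
      return { kind: "setup-everything" };
    case "--mcp-setup-filesystem":
      // The docs give /tmp/mcp-fs as the default root when dir is omitted.
      return { kind: "setup-filesystem", dir: rest[0] ?? "/tmp/mcp-fs" };
    case "--mcp-add-http":
      // Positional arguments are passed through untouched for later validation.
      return { kind: "add-http", args: rest };
    default:
      return undefined;
  }
}
```

Keeping the parse step separate from the side-effecting setup functions means the mapping can be checked without touching `~/.hyperagent/config.json`.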
+interface AzResult { + ok: boolean; + stdout: string; + stderr: string; + status: number | null; +} + +const NAME_PATTERN = /^[a-z0-9][a-z0-9-]*$/; +const AGENT365_RESOURCE_ID = "ea9ffc3e-8a23-4a7d-836d-234d7c7565c1"; +const DEFAULT_APP_NAME = "HyperAgent M365"; +const DEFAULT_CALLBACK_PORT = 8080; +const ALIAS_PREFIX = "work-iq-"; +const AZ_BIN = platform() === "win32" ? "az.cmd" : "az"; + +const CONFIG_DIR = join(homedir(), ".hyperagent"); +const CONFIG_FILE = join(CONFIG_DIR, "config.json"); +const M365_STATE_FILE = join(CONFIG_DIR, "m365.json"); +const M365_USER_CATALOG = join(CONFIG_DIR, "m365-mcp-servers.json"); +const M365_TOKENS_DIR = join(CONFIG_DIR, "mcp-tokens"); +const APPROVAL_FILE = join(CONFIG_DIR, "approved-mcp.json"); + +const supportsColour = process.stdout.isTTY === true; +const C = supportsColour + ? { + red: "\u001b[0;31m", + green: "\u001b[0;32m", + yellow: "\u001b[0;33m", + cyan: "\u001b[0;36m", + reset: "\u001b[0m", + } + : { red: "", green: "", yellow: "", cyan: "", reset: "" }; + +function logStep(msg: string): void { + console.log(`${C.cyan}▸${C.reset} ${msg}`); +} + +function logSuccess(msg: string): void { + console.log(`${C.green}✅${C.reset} ${msg}`); +} + +function logWarning(msg: string): void { + console.log(`${C.yellow}⚠️${C.reset} ${msg}`); +} + +function logError(msg: string): void { + console.error(`${C.red}❌${C.reset} ${msg}`); +} + +function fail(msg: string): never { + logError(msg); + process.exit(1); +} + +function readJson<T>(path: string): T | undefined { + if (!existsSync(path)) return undefined; + try { + return JSON.parse(readFileSync(path, "utf8")) as T; + } catch (err) { + fail(`Failed to read ${path}: ${(err as Error).message}`); + } +} + +function readConfig(): HyperAgentConfig { + return readJson<HyperAgentConfig>(CONFIG_FILE) ?? 
{}; +} + +function writeConfig(cfg: HyperAgentConfig): void { + mkdirSync(CONFIG_DIR, { recursive: true, mode: 0o700 }); + writeFileSync(CONFIG_FILE, JSON.stringify(cfg, null, 2) + "\n", { + mode: 0o600, + }); +} + +function isHttpServerEntry( + server: StdioServerEntry | HttpServerEntry, +): server is HttpServerEntry { + return "type" in server && server.type === "http"; +} + +function getBundledCatalogPath(contentRoot: string): string { + return join(contentRoot, "scripts", "m365-mcp-servers.json"); +} + +function getCatalogPath(contentRoot: string): string { + return existsSync(M365_USER_CATALOG) + ? M365_USER_CATALOG + : getBundledCatalogPath(contentRoot); +} + +function readCatalog(contentRoot: string): Catalog { + const catalogPath = getCatalogPath(contentRoot); + const catalog = readJson<Catalog>(catalogPath); + if (!catalog) fail(`M365 MCP server catalog missing: ${catalogPath}`); + return catalog; +} + +function writeUserCatalog(catalog: Catalog): void { + mkdirSync(CONFIG_DIR, { recursive: true, mode: 0o700 }); + writeFileSync(M365_USER_CATALOG, JSON.stringify(catalog, null, 2) + "\n", { + mode: 0o600, + }); +} + +function spawnInherited(command: string, args: string[]): void { + const result = spawnSync(command, args, { stdio: "inherit" }); + if (result.error) fail(`${command} failed: ${result.error.message}`); + if (result.status !== 0) { + process.exit(result.status ?? 
1); + } +} + +function spawnCapture(command: string, args: string[]): string | undefined { + const result = spawnSync(command, args, { + encoding: "utf8", + stdio: ["ignore", "pipe", "pipe"], + }); + if (result.status !== 0) return undefined; + return result.stdout.trim() || undefined; +} + +export function runMCPSetupCommand( + command: MCPSetupCommand, + options: RunOptions, +): void { + switch (command.kind) { + case "setup-everything": + setupEverything(); + return; + case "setup-github": + setupGithub(); + return; + case "setup-filesystem": + setupFilesystem(command.dir); + return; + case "show-config": + showConfig(); + return; + case "setup-workiq": + setupWorkIQ(); + return; + case "add-http": + addHttp(command.args); + return; + case "m365-create-app": + setupM365App(command.args, options.contentRoot); + return; + case "m365-setup": + setupM365(command.args, options.contentRoot); + return; + case "m365-refresh-servers": + refreshM365Servers(command.args, options.contentRoot); + return; + case "m365-show": + showM365(); + return; + } +} + +function setupEverything(): void { + console.log("Configuring MCP 'everything' test server..."); + console.log( + "Requires npm/npx. First use downloads @modelcontextprotocol/server-everything.", + ); + + const cfg = readConfig(); + cfg.mcpServers = cfg.mcpServers ?? {}; + cfg.mcpServers.everything = { + command: "npx", + args: ["-y", "@modelcontextprotocol/server-everything"], + }; + writeConfig(cfg); + + logSuccess(`MCP 'everything' server configured in ${CONFIG_FILE}`); + console.log(" Start HyperAgent and ask for the everything test tools."); +} + +function setupGithub(): void { + console.log("Configuring MCP 'github' server..."); + console.log("Requires npm/npx and a GitHub token in GITHUB_TOKEN."); + + if (!process.env.GITHUB_TOKEN) { + logWarning("GITHUB_TOKEN not set. 
Trying 'gh auth token'..."); + const token = spawnCapture("gh", ["auth", "token"]); + if (token) { + logSuccess( + "GitHub CLI is authenticated; export GITHUB_TOKEN=$(gh auth token) before connecting.", + ); + } else { + logWarning("Could not get a token from gh CLI."); + console.log(" Run: export GITHUB_TOKEN=$(gh auth token)"); + } + } + + const cfg = readConfig(); + cfg.mcpServers = cfg.mcpServers ?? {}; + cfg.mcpServers.github = { + command: "npx", + args: ["-y", "@modelcontextprotocol/server-github"], + env: { GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}" }, + allowTools: [ + "list_issues", + "get_issue", + "search_issues", + "list_pull_requests", + "get_pull_request", + "search_repositories", + "get_file_contents", + ], + denyTools: ["merge_pull_request", "delete_branch", "push_files"], + }; + writeConfig(cfg); + + logSuccess(`MCP 'github' server configured in ${CONFIG_FILE}`); + console.log(" Tip: export GITHUB_TOKEN=$(gh auth token)"); +} + +function setupFilesystem(dir: string): void { + console.log("Configuring MCP 'filesystem' server..."); + console.log( + "Requires npm/npx. First use downloads @modelcontextprotocol/server-filesystem.", + ); + + mkdirSync(dir, { recursive: true }); + const cfg = readConfig(); + cfg.mcpServers = cfg.mcpServers ?? 
{}; + cfg.mcpServers.filesystem = { + command: "npx", + args: ["-y", "@modelcontextprotocol/server-filesystem", dir], + }; + writeConfig(cfg); + + logSuccess(`MCP 'filesystem' server configured in ${CONFIG_FILE}`); + console.log(` Root directory: ${dir}`); +} + +function showConfig(): void { + if (!existsSync(CONFIG_FILE)) { + console.log(`No config file found at ${CONFIG_FILE}`); + console.log("Run: hyperagent --mcp-setup-everything"); + return; + } + + const cfg = readConfig(); + if (!cfg.mcpServers || Object.keys(cfg.mcpServers).length === 0) { + console.log("No MCP servers configured."); + return; + } + + console.log("Configured MCP servers:"); + for (const [name, server] of Object.entries(cfg.mcpServers)) { + if (isHttpServerEntry(server)) { + const auth = server.auth + ? ` [${server.auth.method}/${server.auth.flow}]` + : ""; + console.log(` ${name}: ${server.url}${auth}`); + } else { + console.log( + ` ${name}: ${server.command ?? "?"} ${(server.args ?? []).join(" ")}`, + ); + } + } +} + +function setupWorkIQ(): void { + console.log("Configuring Microsoft Work IQ stdio MCP server..."); + console.log("Requires Node/npm and a Microsoft 365 Copilot licence."); + console.log( + "Tenant admins may need to consent to the Work IQ CLI enterprise app.", + ); + console.log( + "This command pre-fetches @microsoft/workiq and runs its EULA step.", + ); + console.log(""); + + logStep("Pre-fetching @microsoft/workiq (~188 MB on first run)..."); + spawnInherited("npx", ["-y", "@microsoft/workiq@latest", "version"]); + + logStep("Accepting EULA (interactive, safe to re-run)..."); + spawnInherited("npx", ["-y", "@microsoft/workiq@latest", "accept-eula"]); + + logStep("Writing MCP config entry..."); + const cfg = readConfig(); + cfg.mcpServers = cfg.mcpServers ?? 
{}; + for (const key of Object.keys(cfg.mcpServers)) { + if (key.startsWith("work-iq-")) delete cfg.mcpServers[key]; + } + cfg.mcpServers.workiq = { + command: "npx", + args: ["-y", "@microsoft/workiq@latest", "mcp"], + }; + writeConfig(cfg); + + logSuccess(`Work IQ stdio MCP server ready in ${CONFIG_FILE}`); + console.log(" First tool call opens a browser for Microsoft sign-in."); +} + +function addHttp(args: string[]): void { + const [name, url, clientId, tenantId, scopes, flowArg] = args; + if (!name || !url) { + fail( + "Usage: hyperagent --mcp-add-http [clientId] [tenantId] [scopes] [flow]", + ); + } + writeHttpServerEntry( + name, + url, + clientId ?? "", + tenantId ?? "", + scopes ?? "", + flowArg ?? "", + ); +} + +function writeHttpServerEntry( + name: string, + url: string, + clientId: string, + tenantId: string, + scopes: string, + flowArg: string, +): void { + if (!NAME_PATTERN.test(name)) { + fail(`Invalid name '${name}' (use lowercase letters, digits, hyphens)`); + } + + let parsedUrl: URL; + try { + parsedUrl = new URL(url); + } catch { + fail(`Invalid URL: ${url}`); + } + const isLocal = + parsedUrl.hostname === "localhost" || parsedUrl.hostname === "127.0.0.1"; + if (parsedUrl.protocol !== "https:" && !isLocal) { + fail(`URL must be https:// (or localhost for testing): ${url}`); + } + + const entry: HttpServerEntry = { type: "http", url }; + if (clientId) { + if (flowArg !== "browser" && flowArg !== "device-code") { + fail( + `flow is required when clientId is provided and must be "browser" or "device-code" (got: "${flowArg}")`, + ); + } + const scopeList = scopes + ? scopes + .split(",") + .map((scope) => scope.trim()) + .filter(Boolean) + : [`${parsedUrl.origin}/.default`]; + entry.auth = { + method: "oauth", + flow: flowArg, + clientId, + scopes: scopeList, + ...(tenantId ? { tenantId } : {}), + }; + } + + const cfg = readConfig(); + cfg.mcpServers = cfg.mcpServers ?? 
{}; + cfg.mcpServers[name] = entry; + writeConfig(cfg); + + const suffix = clientId ? ` (oauth/${flowArg})` : ""; + logSuccess(`Wrote mcpServers.${name} -> ${url}${suffix}`); +} + +function parseM365AppArgs(argv: string[]): M365AppArgs { + const args: M365AppArgs = { + appName: DEFAULT_APP_NAME, + callbackPort: DEFAULT_CALLBACK_PORT, + serviceRef: "", + clientId: "", + }; + + for (let index = 0; index < argv.length; index++) { + const arg = argv[index]; + switch (arg) { + case "--app-name": + args.appName = argv[++index] ?? args.appName; + break; + case "--callback-port": + args.callbackPort = + Number.parseInt(argv[++index] ?? "", 10) || DEFAULT_CALLBACK_PORT; + break; + case "--service-ref": + args.serviceRef = argv[++index] ?? ""; + break; + case "--client-id": + args.clientId = argv[++index] ?? ""; + break; + case "--help": + case "-h": + printM365CreateAppHelp(); + process.exit(0); + break; + default: + fail(`Unknown --mcp-m365-create-app option: ${arg}`); + } + } + return args; +} + +function printM365CreateAppHelp(): void { + console.log( + "Usage: hyperagent --mcp-m365-create-app " + + "[--app-name NAME] [--callback-port PORT] [--service-ref GUID] [--client-id ID]", + ); + console.log(""); + console.log( + "Creates or reuses a single-tenant public-client Entra ID app registration", + ); + console.log("for Microsoft 365 / Agent 365 HTTP MCP servers."); + console.log(""); + console.log("Prerequisites:"); + console.log(" - Azure CLI installed and logged in: az login"); + console.log(" - Microsoft 365 Copilot licence"); + console.log(" - Frontier preview enrolment where required"); +} + +function az(args: string[]): AzResult { + const result = spawnSync(AZ_BIN, args, { + encoding: "utf8", + maxBuffer: 32 * 1024 * 1024, + }); + if (result.error) { + return { + ok: false, + stdout: "", + stderr: result.error.message, + status: null, + }; + } + return { + ok: result.status === 0, + stdout: result.stdout ?? "", + stderr: result.stderr ?? 
"", + status: result.status, + }; +} + +function azOrFail(args: string[], failMsg: string): string { + const result = az(args); + if (!result.ok) { + logError(failMsg); + if (result.stderr.trim()) console.error(` ${result.stderr.trim()}`); + process.exit(1); + } + return result.stdout.trim(); +} + +function checkAzurePrerequisites(): void { + if (!az(["--version"]).ok) { + logError("Azure CLI (az) not found."); + console.error( + " Install: https://learn.microsoft.com/cli/azure/install-azure-cli", + ); + process.exit(1); + } + if (!az(["account", "show"]).ok) { + logError("Not logged in to Azure CLI."); + console.error(" Run: az login"); + console.error(" From SSH/WSL: az login --use-device-code"); + process.exit(1); + } + if (!az(["ad", "signed-in-user", "show"]).ok) { + logError("Azure CLI session lacks Microsoft Graph permissions."); + console.error( + " Run: az login --scope https://graph.microsoft.com//.default", + ); + process.exit(1); + } +} + +function setupM365App(argv: string[], contentRoot: string): void { + const args = parseM365AppArgs(argv); + const redirectUri = `http://localhost:${args.callbackPort}/callback`; + + console.log("Setting up Microsoft 365 / Agent 365 MCP app registration..."); + console.log( + "Requires Azure CLI, az login, Microsoft 365 Copilot licensing, and tenant consent.", + ); + console.log(""); + + checkAzurePrerequisites(); + const tenantId = resolveTenantId(); + verifyAgent365Resource(); + + const appId = resolveAppId(args, redirectUri); + declareCatalogScopes(appId, contentRoot); + requestAdminConsent(appId, tenantId); + saveM365State({ + clientId: appId, + tenantId, + appName: args.appName, + callbackPort: args.callbackPort, + }); + printM365AppSummary(args.appName, appId, tenantId, redirectUri); +} + +function resolveTenantId(): string { + logStep("Resolving tenant..."); + const tenantId = azOrFail( + ["account", "show", "--query", "tenantId", "-o", "tsv"], + "Failed to read tenantId from az account show", + ); + const 
userPrincipal = azOrFail( + ["account", "show", "--query", "user.name", "-o", "tsv"], + "Failed to read signed-in user from az account show", + ); + const tenantDomain = userPrincipal.includes("@") + ? userPrincipal.split("@")[1] + : "(unknown)"; + logSuccess(`Tenant: ${tenantId} (${tenantDomain})`); + return tenantId; +} + +function verifyAgent365Resource(): void { + logStep("Verifying Agent 365 resource is available..."); + if (!az(["ad", "sp", "show", "--id", AGENT365_RESOURCE_ID]).ok) { + logError( + `Agent 365 service principal (${AGENT365_RESOURCE_ID}) not found in your tenant.`, + ); + console.error(" This usually means one of:"); + console.error(" 1. No Microsoft 365 Copilot licence on this tenant."); + console.error(" 2. Not enrolled in the Frontier preview programme:"); + console.error( + " https://adoption.microsoft.com/copilot/frontier-program/", + ); + process.exit(1); + } + logSuccess("Agent 365 resource present"); +} + +function readSavedClientId(): string { + const state = readJson<SavedM365State>(M365_STATE_FILE); + return state?.clientId ?? 
""; +} + +function appExists(appId: string): boolean { + return az(["ad", "app", "show", "--id", appId]).ok; +} + +function updateAppPublicClient(appId: string, redirectUri: string): void { + azOrFail( + [ + "ad", + "app", + "update", + "--id", + appId, + "--public-client-redirect-uris", + redirectUri, + "--is-fallback-public-client", + "true", + "-o", + "none", + ], + `Failed to update app ${appId}`, + ); + logSuccess("Updated redirect URI + public-client flag"); +} + +function resolveAppId(args: M365AppArgs, redirectUri: string): string { + const savedClientId = readSavedClientId(); + + if (args.clientId) { + if (!appExists(args.clientId)) { + fail(`App not found in this tenant: ${args.clientId}`); + } + logWarning(`Adopting existing app via --client-id: ${args.clientId}`); + updateAppPublicClient(args.clientId, redirectUri); + return args.clientId; + } + + if (savedClientId && appExists(savedClientId)) { + logWarning(`Reusing saved app from ${M365_STATE_FILE}: ${savedClientId}`); + updateAppPublicClient(savedClientId, redirectUri); + return savedClientId; + } + + logStep(`Checking for existing app: ${args.appName}`); + const lookup = az([ + "ad", + "app", + "list", + "--display-name", + args.appName, + "--query", + "[0].appId", + "-o", + "tsv", + ]); + const existing = lookup.ok ? 
lookup.stdout.trim() : ""; + if (existing && existing !== "None") { + logWarning(`App already exists (${existing}) — updating redirect URI`); + updateAppPublicClient(existing, redirectUri); + return existing; + } + + logStep("Creating app registration..."); + if (args.serviceRef) { + return createAppViaGraph(args.appName, redirectUri, args.serviceRef); + } + return createAppViaCli(args.appName, redirectUri); +} + +function createAppViaCli(appName: string, redirectUri: string): string { + const result = az([ + "ad", + "app", + "create", + "--display-name", + appName, + "--sign-in-audience", + "AzureADMyOrg", + "--public-client-redirect-uris", + redirectUri, + "--is-fallback-public-client", + "true", + "--query", + "appId", + "-o", + "tsv", + ]); + + if (!result.ok) { + const combined = `${result.stdout}\n${result.stderr}`.toLowerCase(); + if (combined.includes("servicemanagementreference")) { + logError("Your tenant requires a Service Tree GUID."); + console.error( + " Find one: az ad app list --all --query '[0].serviceManagementReference' -o tsv", + ); + console.error( + ' Re-run: hyperagent --mcp-m365-create-app --service-ref ""', + ); + process.exit(1); + } + fail(`Failed to create app: ${(result.stderr || result.stdout).trim()}`); + } + + const appId = result.stdout.trim(); + logSuccess(`App created: ${appId}`); + return appId; +} + +function createAppViaGraph( + appName: string, + redirectUri: string, + serviceRef: string, +): string { + const body = JSON.stringify({ + displayName: appName, + signInAudience: "AzureADMyOrg", + isFallbackPublicClient: true, + publicClient: { redirectUris: [redirectUri] }, + serviceManagementReference: serviceRef, + }); + const appId = azOrFail( + [ + "rest", + "--method", + "POST", + "--url", + "https://graph.microsoft.com/v1.0/applications", + "--headers", + "Content-Type=application/json", + "--body", + body, + "--query", + "appId", + "-o", + "tsv", + ], + "Failed to create app via Graph API", + ); + logSuccess(`App created: 
${appId}`); + return appId; +} + +function declareCatalogScopes(appId: string, contentRoot: string): void { + const catalog = readCatalog(contentRoot); + const scopeValues = [ + ...new Set( + Object.values(catalog.servers) + .map((server) => server.scope) + .filter(Boolean), + ), + ]; + + if (scopeValues.length === 0) { + logWarning("Catalog has no scopes to declare — skipping"); + return; + } + + logStep("Discovering Agent 365 published scopes..."); + const result = az([ + "ad", + "sp", + "show", + "--id", + AGENT365_RESOURCE_ID, + "--query", + "oauth2PermissionScopes[].{value:value,id:id}", + "-o", + "json", + ]); + if (!result.ok || !result.stdout.trim() || result.stdout.trim() === "null") { + logWarning( + "Could not enumerate Agent 365 scopes — skipping scope declaration", + ); + logWarning( + "Users will need to consent each scope individually on first sign-in", + ); + return; + } + + let spScopes: Array<{ value: string; id: string }>; + try { + spScopes = JSON.parse(result.stdout) as Array<{ + value: string; + id: string; + }>; + } catch { + logWarning("Agent 365 scope list returned invalid JSON — skipping"); + return; + } + + logStep("Declaring catalog scopes on the app registration..."); + const valueToId = new Map(spScopes.map((scope) => [scope.value, scope.id])); + let added = 0; + let missing = 0; + + for (const scopeValue of scopeValues) { + const scopeId = valueToId.get(scopeValue); + if (!scopeId) { + console.log( + ` ⚠️ ${scopeValue} (not published by Agent 365 in this tenant)`, + ); + missing += 1; + continue; + } + const addResult = az([ + "ad", + "app", + "permission", + "add", + "--id", + appId, + "--api", + AGENT365_RESOURCE_ID, + "--api-permissions", + `${scopeId}=Scope`, + "-o", + "none", + ]); + if (addResult.ok) { + console.log(` ✅ ${scopeValue}`); + added += 1; + } else { + console.log(` ➖ ${scopeValue} (already declared)`); + } + } + + logSuccess(`Declared scopes (${added} new, ${missing} missing in tenant)`); +} + +function 
requestAdminConsent(appId: string, tenantId: string): void { + logStep("Requesting admin consent for the app..."); + const result = az([ + "ad", + "app", + "permission", + "admin-consent", + "--id", + appId, + ]); + if (result.ok) { + logSuccess("Admin consent granted"); + return; + } + logWarning("Admin consent not granted (you are probably not a tenant admin)"); + console.log(" Ask a tenant admin to open this URL once:"); + console.log( + ` https://login.microsoftonline.com/${tenantId}/adminconsent?client_id=${appId}`, + ); +} + +function saveM365State(next: SavedM365State): void { + mkdirSync(CONFIG_DIR, { recursive: true, mode: 0o700 }); + const cur = readJson(M365_STATE_FILE) ?? {}; + writeFileSync( + M365_STATE_FILE, + JSON.stringify({ ...cur, ...next }, null, 2) + "\n", + { + mode: 0o600, + }, + ); + logSuccess(`Saved app details to ${M365_STATE_FILE}`); +} + +function printM365AppSummary( + appName: string, + appId: string, + tenantId: string, + redirectUri: string, +): void { + console.log(""); + logSuccess("App registration complete!"); + console.log(` App Name: ${appName}`); + console.log(` Client ID: ${appId}`); + console.log(` Tenant ID: ${tenantId}`); + console.log(` Redirect: ${redirectUri}`); + console.log(""); + console.log("Next:"); + console.log(' hyperagent --mcp-setup-m365 all "" "" "" browser'); + console.log(" hyperagent"); +} + +function setupM365(argv: string[], contentRoot: string): void { + const [ + servicesArg = "all", + clientIdArg = "", + tenantIdArg = "", + scopeOverride = "", + flowArg = "", + ] = argv; + const catalog = readCatalog(contentRoot); + const known = Object.keys(catalog.servers); + const raw = (servicesArg || "all").trim().toLowerCase(); + + if (raw === "list" || raw === "--list" || raw === "ls") { + console.log("Available M365 / Agent 365 MCP servers:\n"); + const sorted = [...known].sort(); + const aliasWidth = Math.max(...sorted.map((alias) => alias.length)); + for (const alias of sorted) { + const server = 
catalog.servers[alias]; + console.log(` ${alias.padEnd(aliasWidth)} ${server.scope}`); + } + console.log(""); + console.log("Usage:"); + console.log(' hyperagent --mcp-setup-m365 all "" "" "" browser'); + console.log( + ' hyperagent --mcp-setup-m365 "mail,planner" "" "" "" device-code', + ); + return; + } + + if (flowArg !== "browser" && flowArg !== "device-code") { + fail( + `flow is required and must be "browser" or "device-code" (got: "${flowArg}")`, + ); + } + const flow = flowArg; + + const selected = + raw === "" || raw === "all" + ? known + : raw + .split(",") + .map((service) => service.trim()) + .filter(Boolean); + const unknown = selected.filter((service) => !known.includes(service)); + if (unknown.length > 0) { + console.error(`❌ Unknown service(s): ${unknown.join(", ")}`); + console.error(` Known: ${known.join(", ")}, all`); + process.exit(1); + } + + let clientId = clientIdArg; + let tenantId = tenantIdArg; + if (!clientId || !tenantId) { + const state = readJson(M365_STATE_FILE); + if (!state) { + console.error("❌ No saved app state and no clientId/tenantId provided."); + console.error(" Run: hyperagent --mcp-m365-create-app"); + console.error( + ' Or: hyperagent --mcp-setup-m365 "" ', + ); + process.exit(1); + } + clientId = clientId || state.clientId || ""; + tenantId = tenantId || state.tenantId || ""; + logStep(`Using saved app from ${M365_STATE_FILE}`); + } + + if (!clientId || !tenantId) fail("clientId/tenantId required"); + + console.log(`▸ clientId: ${clientId}`); + console.log(`▸ tenantId: ${tenantId}`); + console.log(`▸ services: ${servicesArg}`); + console.log(`▸ flow: ${flow}`); + if (scopeOverride) console.log(`▸ scope override: ${scopeOverride}`); + console.log(""); + + const defaultScope = catalog.resourceId + ? 
`${catalog.resourceId}/.default` + : undefined; + let count = 0; + const configured: Array<{ + name: string; + url: string; + clientId: string; + flow: string; + tenantId: string; + scopes: string[]; + }> = []; + + for (const service of selected) { + const server = catalog.servers[service]; + const scope = scopeOverride || defaultScope || server.scope; + if (!server.url || !scope) + fail(`Catalog entry for ${service} missing url or scope`); + const tenantedUrl = injectTenantIntoUrl(server.url, tenantId); + const name = ALIAS_PREFIX + service; + writeM365ServerEntry(name, tenantedUrl, clientId, tenantId, scope, flow); + configured.push({ + name, + url: tenantedUrl, + clientId, + flow, + tenantId, + scopes: [scope], + }); + count += 1; + } + + preApproveServers(configured); + logSuccess(`Configured ${count} M365 MCP server(s) and pre-approved them`); + console.log( + " First connect opens a browser or device-code flow, depending on config.", + ); +} + +function injectTenantIntoUrl(url: string, tenantId: string): string { + if (!tenantId) fail("tenantId is required to build M365 MCP server URLs"); + if (url.includes("/agents/tenants/")) return url; + const marker = "/agents/servers/"; + const index = url.indexOf(marker); + if (index === -1) { + fail( + `Catalog URL does not contain '${marker}' — cannot inject tenant: ${url}`, + ); + } + return `${url.slice(0, index)}/agents/tenants/${tenantId}/servers/${url.slice(index + marker.length)}`; +} + +function writeM365ServerEntry( + name: string, + url: string, + clientId: string, + tenantId: string, + scope: string, + flow: "browser" | "device-code", +): void { + const cfg = readConfig(); + cfg.mcpServers = cfg.mcpServers ?? {}; + cfg.mcpServers[name] = { + type: "http", + url, + auth: { + method: "oauth", + flow, + clientId, + scopes: [scope], + ...(tenantId ? 
{ tenantId } : {}), + }, + }; + writeConfig(cfg); + logSuccess(`Wrote mcpServers.${name} -> ${url} (oauth/${flow})`); +} + +function computeConfigHash( + name: string, + url: string, + clientId: string, + flow: string, + tenantId: string, + scopes: string[], +): string { + return createHash("sha256") + .update(name, "utf8") + .update("http", "utf8") + .update(url, "utf8") + .update("oauth", "utf8") + .update(flow, "utf8") + .update(clientId, "utf8") + .update(tenantId, "utf8") + .update(JSON.stringify(scopes), "utf8") + .update("", "utf8") + .update("[]", "utf8") + .update("[]", "utf8") + .digest("hex"); +} + +interface ApprovalRecord { + configHash: string; + approvedAt: string; + approvedTools: string[]; + auditWarnings: string[]; +} + +function preApproveServers( + servers: Array<{ + name: string; + url: string; + clientId: string; + flow: string; + tenantId: string; + scopes: string[]; + }>, +): void { + const store = readJson<Record<string, ApprovalRecord>>(APPROVAL_FILE) ?? {}; + for (const server of servers) { + store[server.name] = { + configHash: computeConfigHash( + server.name, + server.url, + server.clientId, + server.flow, + server.tenantId, + server.scopes, + ), + approvedAt: new Date().toISOString(), + approvedTools: [], + auditWarnings: [], + }; + } + mkdirSync(dirname(APPROVAL_FILE), { recursive: true, mode: 0o700 }); + writeFileSync(APPROVAL_FILE, JSON.stringify(store, null, 2) + "\n", { + mode: 0o600, + }); +} + +function refreshM365Servers(argv: string[], contentRoot: string): void { + const args = parseRefreshArgs(argv); + const catalog = readCatalog(contentRoot); + const endpoint = catalog.discoverEndpoint; + if (!endpoint) fail("M365 catalog is missing discoverEndpoint"); + + const token = args.token ?? loadTokenFromCache(); + if (!token) { + fail( + "No bearer token found. 
Provide --token , or connect any work-iq-* server once to seed ~/.hyperagent/mcp-tokens/.", + ); + } + + logStep(`Fetching ${endpoint}`); + const payload = fetchDiscoveryPayload(endpoint, token); + const list = payload.mcpServers ?? []; + if (list.length === 0) fail("Discovery returned no servers"); + + const idToExistingAlias = new Map(); + for (const [alias, server] of Object.entries(catalog.servers)) { + if (server.id) idToExistingAlias.set(server.id, alias); + } + + const next: Record<string, CatalogServer> = {}; + let added = 0; + let skipped = 0; + for (const entry of list) { + const id = entry.mcpServerName ?? entry.id; + const url = entry.url; + const scope = entry.scope; + const audience = entry.audience; + if (typeof id !== "string" || !/^[A-Za-z0-9_]+$/.test(id)) { + skipped += 1; + continue; + } + if (typeof url !== "string" || !url.startsWith("https://")) { + logWarning(`skipping ${id}: invalid url`); + skipped += 1; + continue; + } + if (typeof scope !== "string" || scope.length === 0) { + logWarning(`skipping ${id}: missing scope`); + skipped += 1; + continue; + } + if ( + !args.includeCustom && + catalog.resourceId && + audience !== catalog.resourceId + ) { + logWarning( + `skipping ${id}: audience '${audience ?? "(none)"}' != resource`, + ); + skipped += 1; + continue; + } + + const existingAlias = idToExistingAlias.get(id); + let alias = existingAlias ?? 
deriveAlias(id); + if (next[alias] && next[alias].id !== id) { + alias = `${alias}-${id.toLowerCase()}`; + } + next[alias] = { id, url, scope }; + if (!existingAlias) added += 1; + } + + catalog.servers = Object.fromEntries( + Object.entries(next).sort(([left], [right]) => left.localeCompare(right)), + ); + writeUserCatalog(catalog); + logSuccess( + `Rewrote ${M365_USER_CATALOG} (${Object.keys(catalog.servers).length} servers, ${added} new, ${skipped} skipped)`, + ); +} + +function fetchDiscoveryPayload( + endpoint: string, + token: string, +): DiscoveryPayload { + const result = spawnSync( + process.execPath, + [ + "--input-type=module", + "-e", + ` +const endpoint = process.env.HYPERAGENT_DISCOVERY_ENDPOINT; +const token = process.env.HYPERAGENT_DISCOVERY_TOKEN; +if (!endpoint || !token) process.exit(2); +const response = await fetch(endpoint, { + headers: { Accept: "application/json", Authorization: \`Bearer \${token}\` }, +}); +if (!response.ok) { + const body = await response.text().catch(() => ""); + console.error(\`Discovery failed: \${response.status} \${response.statusText}\\n\${body.slice(0, 500)}\`); + process.exit(1); +} +process.stdout.write(await response.text()); +`, + ], + { + encoding: "utf8", + env: { + ...process.env, + HYPERAGENT_DISCOVERY_ENDPOINT: endpoint, + HYPERAGENT_DISCOVERY_TOKEN: token, + }, + maxBuffer: 10 * 1024 * 1024, + stdio: ["ignore", "pipe", "pipe"], + }, + ); + + if (result.error) fail(`Discovery failed: ${result.error.message}`); + if (result.status !== 0) fail(result.stderr.trim() || "Discovery failed"); + try { + return JSON.parse(result.stdout) as DiscoveryPayload; + } catch (error) { + fail( + `Discovery returned invalid JSON: ${error instanceof Error ? 
error.message : String(error)}`, + ); + } +} + +function parseRefreshArgs(argv: string[]): { + token?: string; + includeCustom: boolean; +} { + const parsed: { token?: string; includeCustom: boolean } = { + includeCustom: false, + }; + for (let index = 0; index < argv.length; index++) { + const arg = argv[index]; + if (arg === "--token" && index + 1 < argv.length) { + parsed.token = argv[++index]; + } else if (arg === "--include-custom") { + parsed.includeCustom = true; + } else if (arg === "--help" || arg === "-h") { + console.log( + "Usage: hyperagent --mcp-m365-refresh-servers [--token ] [--include-custom]", + ); + process.exit(0); + } else { + fail(`Unknown --mcp-m365-refresh-servers option: ${arg}`); + } + } + return parsed; +} + +function loadTokenFromCache(): string | undefined { + if (!existsSync(M365_TOKENS_DIR)) return undefined; + const files = readdirSync(M365_TOKENS_DIR).filter((file) => + file.endsWith(".msal.json"), + ); + let best: { token: string; expiresOn: number } | undefined; + for (const file of files) { + try { + const parsed = JSON.parse( + readFileSync(join(M365_TOKENS_DIR, file), "utf8"), + ) as Record; + const tokenMap = parsed.AccessToken as + | Record + | undefined; + if (!tokenMap) continue; + for (const entry of Object.values(tokenMap)) { + if (typeof entry.secret !== "string") continue; + const expiresOn = Number(entry.expires_on ?? "0"); + if (expiresOn * 1000 < Date.now()) continue; + if (!best || expiresOn > best.expiresOn) { + best = { token: entry.secret, expiresOn }; + } + } + } catch { + // Skip corrupt token cache files. 
+ } + } + return best?.token; +} + +function deriveAlias(serverId: string): string { + let alias = serverId.replace(/^mcp_/i, ""); + alias = alias.replace(/(RemoteServer|Server|Tools)$/i, ""); + alias = alias.replace(/^M365/i, ""); + alias = alias.replace(/([a-z0-9])([A-Z])/g, "$1-$2"); + alias = alias.replace(/_/g, "-"); + return alias.toLowerCase() || serverId.toLowerCase(); +} + +function showM365(): void { + const state = readJson(M365_STATE_FILE); + if (!state) { + console.log("No saved M365 app. Run: hyperagent --mcp-m365-create-app"); + return; + } + console.log("M365 app registration:"); + console.log(` App name: ${state.appName ?? "(unset)"}`); + console.log(` Client ID: ${state.clientId ?? "(unset)"}`); + console.log(` Tenant ID: ${state.tenantId ?? "(unset)"}`); + console.log( + ` Callback port: ${state.callbackPort ?? DEFAULT_CALLBACK_PORT}`, + ); + console.log(` State file: ${M365_STATE_FILE}`); +} diff --git a/src/agent/mcp/tool-utils.ts b/src/agent/mcp/tool-utils.ts new file mode 100644 index 0000000..05cefd2 --- /dev/null +++ b/src/agent/mcp/tool-utils.ts @@ -0,0 +1,186 @@ +import type { MCPToolSchema } from "./types.js"; + +export interface MCPToolInfo { + name: string; + originalName: string; + description: string; + parameters: Record; + annotations?: MCPToolSchema["annotations"]; + inferredReadOnly: boolean; + safety: "read" | "write" | "destructive" | "unknown"; +} + +export interface MCPToolSelection { + tools: MCPToolInfo[]; + missing: string[]; + totalMatches: number; +} + +const READ_ONLY_TOOL_PREFIXES = [ + "get", + "list", + "read", + "search", + "find", + "query", + "lookup", + "fetch", + "describe", + "inspect", + "count", + "check", +]; + +const WRITE_TOOL_PREFIXES = [ + "create", + "update", + "delete", + "remove", + "send", + "post", + "put", + "patch", + "set", + "write", + "add", + "invite", + "assign", + "cancel", + "approve", + "reject", + "archive", + "move", + "copy", + "upload", +]; + +export function formatMCPToolInfo(tool: 
MCPToolSchema): MCPToolInfo { + const inferredReadOnly = isReadOnlyMCPTool(tool); + return { + name: tool.name, + originalName: tool.originalName, + description: tool.description, + parameters: tool.inputSchema, + ...(tool.annotations ? { annotations: tool.annotations } : {}), + inferredReadOnly, + safety: getMCPToolSafety(tool), + }; +} + +export function findMCPTool( + tools: MCPToolSchema[], + name: string, +): MCPToolSchema | undefined { + const normalised = normaliseToolName(name); + return tools.find( + (tool) => + normaliseToolName(tool.name) === normalised || + normaliseToolName(tool.originalName) === normalised, + ); +} + +export function selectMCPTools( + allTools: MCPToolSchema[], + params: { tools?: string[]; query?: string; limit?: number }, +): MCPToolSelection { + const limit = clampLimit(params.limit); + const missing: string[] = []; + + if (params.tools && params.tools.length > 0) { + const selected: MCPToolSchema[] = []; + for (const requested of params.tools) { + const found = findMCPTool(allTools, requested); + if (found) { + selected.push(found); + } else { + missing.push(requested); + } + } + return { + tools: selected.slice(0, limit).map(formatMCPToolInfo), + missing, + totalMatches: selected.length, + }; + } + + const query = params.query?.trim(); + if (!query) { + return { + tools: allTools.slice(0, limit).map(formatMCPToolInfo), + missing, + totalMatches: allTools.length, + }; + } + + const terms = query.toLowerCase().split(/\s+/).filter(Boolean); + const scored = allTools + .map((tool) => ({ tool, score: scoreTool(tool, terms) })) + .filter((entry) => entry.score > 0) + .sort( + (a, b) => b.score - a.score || a.tool.name.localeCompare(b.tool.name), + ); + + return { + tools: scored.slice(0, limit).map((entry) => formatMCPToolInfo(entry.tool)), + missing, + totalMatches: scored.length, + }; +} + +export function isReadOnlyMCPTool(tool: MCPToolSchema): boolean { + if (tool.annotations?.readOnlyHint === true) return true; + if 
(tool.annotations?.destructiveHint === true) return false; + + const name = normaliseToolName(tool.name); + if (WRITE_TOOL_PREFIXES.some((prefix) => name.startsWith(prefix))) { + return false; + } + return READ_ONLY_TOOL_PREFIXES.some((prefix) => name.startsWith(prefix)); +} + +export function getMCPToolSafety(tool: MCPToolSchema): MCPToolInfo["safety"] { + if (tool.annotations?.destructiveHint === true) return "destructive"; + if (tool.annotations?.readOnlyHint === true) return "read"; + + const name = normaliseToolName(tool.name); + if (WRITE_TOOL_PREFIXES.some((prefix) => name.startsWith(prefix))) { + return "write"; + } + if (READ_ONLY_TOOL_PREFIXES.some((prefix) => name.startsWith(prefix))) { + return "read"; + } + return "unknown"; +} + +function scoreTool(tool: MCPToolSchema, terms: string[]): number { + const searchable = [ + tool.name, + tool.originalName, + tool.description, + ...Object.keys( + (tool.inputSchema.properties as Record | undefined) ?? + {}, + ), + ] + .join(" ") + .toLowerCase(); + + let score = 0; + for (const term of terms) { + if (normaliseToolName(tool.name).includes(normaliseToolName(term))) { + score += 5; + } else if (searchable.includes(term)) { + score += 1; + } + } + return score; +} + +function normaliseToolName(name: string): string { + return name.toLowerCase().replace(/[^a-z0-9]/g, ""); +} + +function clampLimit(limit: number | undefined): number { + if (typeof limit !== "number" || !Number.isFinite(limit)) return 8; + return Math.min(Math.max(Math.trunc(limit), 1), 50); +} diff --git a/src/agent/profiles.ts b/src/agent/profiles.ts index 720197c..181a0ca 100644 --- a/src/agent/profiles.ts +++ b/src/agent/profiles.ts @@ -197,6 +197,31 @@ const HEAVY_COMPUTE_PROFILE: Profile = { plugins: [], }; +/** + * MCP network profile — for handlers that call external MCP services. + * Network I/O consumes wall-clock time while CPU remains low, so this + * bumps wall timeout without enabling file plugins or inflating CPU. 
+ */ +const MCP_NETWORK_PROFILE: Profile = { + name: "mcp-network", + description: "MCP service calls — longer wall timeout, no file plugins", + patterns: [], + useCases: [ + "Calling MCP tools from handler code", + "External service reads, searches, and lookups via MCP", + "Multi-call MCP handlers that wait on network I/O", + ], + limits: { + cpuTimeoutMs: 2000, + wallTimeoutMs: 30000, + heapMb: 32, + scratchMb: 32, + inputBufferKb: 4096, + outputBufferKb: 4096, + }, + plugins: [], +}; + // ── Profile Registry ───────────────────────────────────────────────── /** All built-in profiles, keyed by name. */ @@ -205,6 +230,7 @@ export const PROFILES: ReadonlyMap = new Map([ [FILE_BUILDER_PROFILE.name, FILE_BUILDER_PROFILE], [WEB_RESEARCH_PROFILE.name, WEB_RESEARCH_PROFILE], [HEAVY_COMPUTE_PROFILE.name, HEAVY_COMPUTE_PROFILE], + [MCP_NETWORK_PROFILE.name, MCP_NETWORK_PROFILE], ]); /** Get a profile by name, or undefined if not found. */ diff --git a/src/agent/system-message.ts b/src/agent/system-message.ts index b1e30db..f912b74 100644 --- a/src/agent/system-message.ts +++ b/src/agent/system-message.ts @@ -132,6 +132,14 @@ MCP (Model Context Protocol) SERVERS: \${MCP_SECTION} async/await IS needed for libraries that use Promises internally. +MCP HANDLER-ONLY EXECUTION: + The LLM must never call MCP server tools directly. Use this exact order: + list_mcp_servers → manage_mcp(connect) → mcp_tool_info → apply_profile({ profiles: "mcp-network" }) → register_handler. + MCP execution happens only inside generated handler code that imports from + host:mcp- and awaits the selected MCP tool. If an MCP result is large, + first narrow tool args (limit/top/select/filter/query) in handler code before + using files. + URLS: Do NOT guess URLs — they will 404. Discover via APIs or verify first. UNAVAILABLE: setTimeout, fetch(), Buffer, fs, process. @@ -161,8 +169,14 @@ export function buildSystemMessage(params: SystemMessageParams): string { ? [ " MCP servers are configured. 
Call list_mcp_servers() to discover available", ' services and manage_mcp({action:"connect", name:"..."}) to connect them.', - " Once connected, get tool schemas via mcp_server_info(name), then import", - ' tools with: import { tool_name } from "host:mcp-"', + " Once connected, call mcp_tool_info({name, query}) or mcp_tool_info({name, tools})", + " to inspect only relevant schemas. Do NOT guess MCP parameter names.", + ' Before executing MCP handler code, call apply_profile("mcp-network") to allow network wall time.', + " MCP tool execution MUST happen inside registered handler code, not as a direct LLM tool call.", + " Register a handler that imports tools with:", + ' import { tool_name } from "host:mcp-" and await every MCP call.', + " If results are large, narrow args first (limit/top/select/filter/query).", + " MCP results use {ok, data, text, error, raw}; check ok/error first.", " Connection may prompt the user for approval (security review).", ' Do NOT try to manage_plugin("mcp:") — MCP servers are NOT plugins.', ].join("\n") diff --git a/src/agent/tool-gating.ts b/src/agent/tool-gating.ts index 2b2ba58..526b784 100644 --- a/src/agent/tool-gating.ts +++ b/src/agent/tool-gating.ts @@ -35,5 +35,6 @@ export const ALLOWED_TOOLS = new Set([ "ask_user", // SDK protocol — structured questions to the user "list_mcp_servers", // List configured MCP servers + status "mcp_server_info", // Detailed MCP server info + tool schemas + "mcp_tool_info", // Focused MCP tool schema lookup "manage_mcp", // Connect/disconnect MCP servers ]); diff --git a/tests/command-suggestions.test.ts b/tests/command-suggestions.test.ts index bc5e3eb..060fbf1 100644 --- a/tests/command-suggestions.test.ts +++ b/tests/command-suggestions.test.ts @@ -38,6 +38,39 @@ describe("extractSuggestedCommands", () => { expect(extractSuggestedCommands(text)).toEqual(["/set heap 16"]); }); + // ── Markdown-emphasised commands ──────────────────────────── + + it("should extract a bold /mcp enable command", () 
=> { + const text = [ + "The Microsoft Teams MCP server requires authentication.", + "", + "**/mcp enable work-iq-teams**", + "", + "This will prompt you to authenticate in your browser.", + ].join("\n"); + + expect(extractSuggestedCommands(text)).toEqual([ + "/mcp enable work-iq-teams", + ]); + }); + + it("should extract an inline bold /mcp enable command", () => { + const text = "Please run **/mcp enable work-iq-teams** to authenticate."; + + expect(extractSuggestedCommands(text)).toEqual([ + "/mcp enable work-iq-teams", + ]); + }); + + it("should preserve wildcard arguments in bold commands", () => { + const text = + "Run **/plugin enable fetch allowedDomains=[*.bbc.co.uk,feeds.bbci.co.uk]**"; + + expect(extractSuggestedCommands(text)).toEqual([ + "/plugin enable fetch allowedDomains=[*.bbc.co.uk,feeds.bbci.co.uk]", + ]); + }); + // ── Bare commands on their own line ────────────────────────── it("should extract a bare /plugin enable on its own line", () => { diff --git a/tests/mcp.test.ts b/tests/mcp.test.ts index 4d601fb..2dfbe0b 100644 --- a/tests/mcp.test.ts +++ b/tests/mcp.test.ts @@ -15,6 +15,12 @@ import { generateMCPDeclarations, generateMCPModuleHints, } from "../src/agent/mcp/plugin-adapter.js"; +import { + findMCPTool, + isReadOnlyMCPTool, + selectMCPTools, +} from "../src/agent/mcp/tool-utils.js"; +import { normaliseToolResult } from "../src/agent/mcp/client-manager.js"; import type { MCPToolSchema } from "../src/agent/mcp/types.js"; import { isMCPHttpConfig, @@ -968,6 +974,7 @@ describe("generateMCPDeclarations", () => { // No declare module wrapper — validator can't parse ambient modules expect(decl).not.toContain("declare module"); expect(decl).toContain("export declare function get_forecast"); + expect(decl).toContain("Promise"); expect(decl).toContain("GetForecastInput"); expect(decl).toContain("location"); }); @@ -983,6 +990,91 @@ describe("generateMCPDeclarations", () => { }); }); +describe("MCP tool utility helpers", () => { + const tools: 
MCPToolSchema[] = [ + { + name: "ListChats", + originalName: "ListChats", + description: "List recent Teams chats", + inputSchema: { + type: "object", + properties: { top: { type: "number" } }, + }, + }, + { + name: "SearchTeamsMessages", + originalName: "SearchTeamsMessages", + description: "Search Teams messages", + inputSchema: { + type: "object", + properties: { + message: { type: "string", description: "Search text" }, + top: { type: "number" }, + }, + required: ["message"], + }, + }, + { + name: "SendMessage", + originalName: "SendMessage", + description: "Send a Teams message", + inputSchema: { + type: "object", + properties: { chatId: { type: "string" }, message: { type: "string" } }, + required: ["chatId", "message"], + }, + }, + ]; + + it("selects focused tools by query and preserves parameter schemas", () => { + const selection = selectMCPTools(tools, { + query: "teams messages", + limit: 2, + }); + + expect(selection.tools.map((tool) => tool.name)).toContain( + "SearchTeamsMessages", + ); + expect(selection.tools.length).toBeLessThanOrEqual(2); + expect(selection.tools[0].parameters).toBeDefined(); + }); + + it("matches explicit tool names case-insensitively", () => { + expect(findMCPTool(tools, "search-teams-messages")?.name).toBe( + "SearchTeamsMessages", + ); + }); + + it("infers common list/search tools as read-only when annotations are absent", () => { + expect(isReadOnlyMCPTool(tools[0])).toBe(true); + expect(isReadOnlyMCPTool(tools[1])).toBe(true); + expect(isReadOnlyMCPTool(tools[2])).toBe(false); + }); +}); + +describe("normaliseToolResult", () => { + it("unwraps a single JSON text response into data", () => { + const result = normaliseToolResult([ + { type: "text", text: '{"messages":[{"body":"hello"}]}' }, + ]); + + expect(result.ok).toBe(true); + expect(result.data).toEqual({ messages: [{ body: "hello" }] }); + expect(result.text).toContain("messages"); + }); + + it("promotes the primary structured item and keeps secondary metadata", () => { 
+ const result = normaliseToolResult([ + { type: "text", text: '{"messages":[{"body":"hello"}]}' }, + { type: "text", text: "CorrelationId: abc" }, + ]); + + expect(result.ok).toBe(true); + expect(result.data).toEqual({ messages: [{ body: "hello" }] }); + expect(result.meta).toHaveLength(1); + }); +}); + describe("generateMCPModuleHints", () => { const tools: MCPToolSchema[] = [ { diff --git a/tests/profiles.test.ts b/tests/profiles.test.ts index dd70044..6374eba 100644 --- a/tests/profiles.test.ts +++ b/tests/profiles.test.ts @@ -16,8 +16,8 @@ import { } from "../src/agent/profiles.js"; describe("profile registry", () => { - it("should have exactly 4 built-in profiles", () => { - expect(PROFILES.size).toBe(4); + it("should have exactly 5 built-in profiles", () => { + expect(PROFILES.size).toBe(5); }); it("should include all expected profile names", () => { @@ -26,6 +26,7 @@ describe("profile registry", () => { expect(names).toContain("file-builder"); expect(names).toContain("web-research"); expect(names).toContain("heavy-compute"); + expect(names).toContain("mcp-network"); }); it("should return undefined for unknown profile", () => { @@ -85,6 +86,15 @@ describe("profile definitions", () => { expect(p.limits.heapMb).toBe(64); expect(p.limits.scratchMb).toBe(64); }); + + it("mcp-network should bump wall time without enabling plugins", () => { + const p = getProfile("mcp-network")!; + expect(p.plugins).toHaveLength(0); + expect(p.limits.cpuTimeoutMs).toBe(2000); + expect(p.limits.wallTimeoutMs).toBe(30000); + expect(p.limits.heapMb).toBe(32); + expect(p.limits.scratchMb).toBe(32); + }); }); describe("mergeProfiles", () => { @@ -147,7 +157,7 @@ describe("mergeProfiles", () => { it("should handle stacking all profiles", () => { const result = mergeProfiles(getProfileNames()); expect(result.error).toBeUndefined(); - expect(result.appliedProfiles).toHaveLength(4); + expect(result.appliedProfiles).toHaveLength(5); // Max across all profiles 
expect(result.limits.cpuTimeoutMs).toBe(15000); // file-builder @@ -161,6 +171,16 @@ describe("mergeProfiles", () => { expect(pluginNames).toContain("fs-write"); }); + it("should stack mcp-network without adding plugins", () => { + const result = mergeProfiles(["default", "mcp-network"]); + expect(result.error).toBeUndefined(); + expect(result.appliedProfiles).toEqual(["default", "mcp-network"]); + expect(result.limits.cpuTimeoutMs).toBe(2000); + expect(result.limits.wallTimeoutMs).toBe(30000); + expect(result.limits.heapMb).toBe(32); + expect(result.plugins).toHaveLength(0); + }); + it("should handle duplicate profile names gracefully", () => { const result = mergeProfiles(["file-builder", "file-builder"]); expect(result.error).toBeUndefined(); diff --git a/tests/tune.test.ts b/tests/tune.test.ts index e0abc69..34b0ac4 100644 --- a/tests/tune.test.ts +++ b/tests/tune.test.ts @@ -106,3 +106,59 @@ describe("profile CLI flag", () => { expect(config.tune).toBe(true); }); }); + +describe("MCP setup CLI flags", () => { + it("parses standalone no-arg MCP setup commands", () => { + expect(parseCliArgs(["--mcp-setup-everything"]).mcpSetupCommand).toEqual({ + kind: "setup-everything", + }); + expect(parseCliArgs(["--mcp-setup-github"]).mcpSetupCommand).toEqual({ + kind: "setup-github", + }); + expect(parseCliArgs(["--mcp-show-config"]).mcpSetupCommand).toEqual({ + kind: "show-config", + }); + }); + + it("parses filesystem setup with default and explicit directories", () => { + expect(parseCliArgs(["--mcp-setup-filesystem"]).mcpSetupCommand).toEqual({ + kind: "setup-filesystem", + dir: "/tmp/mcp-fs", + }); + expect( + parseCliArgs(["--mcp-setup-filesystem", "/var/tmp/mcp"]).mcpSetupCommand, + ).toEqual({ + kind: "setup-filesystem", + dir: "/var/tmp/mcp", + }); + }); + + it("captures remaining args for setup helpers with pass-through options", () => { + expect( + parseCliArgs([ + "--mcp-add-http", + "example", + "https://mcp.example.com/sse", + "client", + "tenant", + 
"scope.one,scope.two", + "browser", + ]).mcpSetupCommand, + ).toEqual({ + kind: "add-http", + args: [ + "example", + "https://mcp.example.com/sse", + "client", + "tenant", + "scope.one,scope.two", + "browser", + ], + }); + + expect( + parseCliArgs(["--mcp-m365-create-app", "--client-id", "abc"]) + .mcpSetupCommand, + ).toEqual({ kind: "m365-create-app", args: ["--client-id", "abc"] }); + }); +});