Introduction
When using multiple MCP servers in Claude Code (e.g., Notion, Playwright, Chrome DevTools),
the tool definitions alone can consume 36.6k tokens of context.
To reduce this overhead, I built a meta MCP server called Code Mode, which exposes only two tools:
- search_docs: search API definitions of child MCP servers
- execute_code: run TypeScript that calls MCP bindings
This reduced token usage from 36.6k to 4.4k tokens (-88%).
This post explains the approach, the architecture, and the practical pitfalls I encountered when implementing it.
Background: Why a Meta-MCP?
The design is inspired by Anthropic's engineering post:
Code execution with MCP
https://www.anthropic.com/engineering/code-execution-with-mcp
Instead of giving the LLM a long list of explicit tool definitions,
you give it just enough machinery to:
- look up available API bindings, and
- execute code that uses them.
For this experiment, I connected three child MCP servers:
- Notion MCP
- Playwright MCP
- Chrome DevTools MCP
Playwright and Chrome DevTools overlap, but I wanted to compare their behavior.
(Yes, this inflates the original token count, but the reduction effect remains the same.)
The MCP Context Problem
Here's what Claude Code loads when these MCP servers are enabled:
MCP tools: 36.6k tokens (18.3%)
└ chrome-devtools (26 tools)
└ playwright (22 tools)
└ notion (15 tools)
More than 60 tools are injected into the conversation context every time.
On a 200k context window, 18% was being consumed by tool definitions alone,
leaving less room for code, logs, or iterative reasoning.
The Code Mode Approach
Code Mode exposes only two tools:
Code Mode MCP
├─ search_docs: query API schemas of child MCP servers
└─ execute_code: run TypeScript that calls MCP bindings
The LLM workflow becomes:
- search_docs: find the API
- execute_code: perform the operation
search_docs
// List all bindings
await search_docs({})
// => notion, playwright, chrome, ...
// Search by keyword
await search_docs({ query: "navigate" })
// => returns schema for playwright.browser_navigate
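To make the two call shapes above concrete, here is a minimal sketch of how such a lookup could work. It is not the actual Code Mode implementation: the `ToolDoc` type, the `searchDocs` function, and the registry entries (including the Notion tool name) are assumptions made up for illustration.

```typescript
// Hypothetical shape of one entry in the binding registry.
interface ToolDoc {
  server: string; // e.g. "playwright"
  name: string; // e.g. "browser_navigate"
  description: string;
}

// Illustrative entries only; a real registry would be filled from each child's tool list.
const registry: ToolDoc[] = [
  { server: "playwright", name: "browser_navigate", description: "Navigate to a URL" },
  { server: "playwright", name: "browser_snapshot", description: "Capture a page snapshot" },
  { server: "notion", name: "example_search", description: "Hypothetical search tool" },
];

// Remove duplicates while keeping first-seen order.
function unique(values: string[]): string[] {
  return values.filter((value, index, all) => all.indexOf(value) === index);
}

// Without a query: list the available servers.
// With a query: return the tools whose name or description contains the keyword.
function searchDocs(docs: ToolDoc[], query?: string): { servers: string[]; tools: ToolDoc[] } {
  if (!query) {
    return { servers: unique(docs.map((d) => d.server)), tools: [] };
  }
  const q = query.toLowerCase();
  const tools = docs.filter(
    (d) => d.name.toLowerCase().includes(q) || d.description.toLowerCase().includes(q),
  );
  return { servers: unique(tools.map((d) => d.server)), tools };
}

console.log(searchDocs(registry).servers);
console.log(searchDocs(registry, "navigate").tools.map((t) => `${t.server}.${t.name}`));
```

Because only the matching schemas are returned, the model pays for a tool's definition only when it actually asks about it.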
execute_code
The core of Code Mode.
await execute_code({
  code: `
    await playwright.browser_navigate({ url: "https://example.com" });
    const snapshot = await playwright.browser_snapshot({});
    console.log(snapshot);
  `,
})
All available bindings (playwright, notion, chrome, etc.)
are injected into the execution environment automatically.
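One way such bindings could be generated is with a `Proxy` that turns any property access into a relayed tool call. The sketch below is an assumption about how this might be done, not the code from the gist; `CallTool`, `makeBinding`, and `fakeRelay` are hypothetical names, and the relay here is a stub that only records calls.

```typescript
// Hypothetical relay signature: forwards one tool call to the parent process.
type CallTool = (server: string, tool: string, args: unknown) => Promise<unknown>;

type Binding = Record<string, (args?: unknown) => Promise<unknown>>;

// Build an object on which every property is an async function that relays
// the call as (server, toolName, args). No tool list is needed up front.
function makeBinding(server: string, callTool: CallTool): Binding {
  return new Proxy({} as Binding, {
    get: (_target, prop) => (args: unknown = {}) => callTool(server, String(prop), args),
  });
}

// Stub relay for demonstration: records the call instead of talking to a real server.
const calls: string[] = [];
const fakeRelay: CallTool = async (server, tool, args) => {
  calls.push(`${server}.${tool}`);
  return { ok: true, args };
};

const playwrightBinding = makeBinding("playwright", fakeRelay);
playwrightBinding
  .browser_navigate({ url: "https://example.com" })
  .then(() => console.log(calls));
```

In a real setup the relay would post a message from the sandboxed worker to the parent, which owns the actual MCP client connections.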
Implementation Notes
Choosing Deno for the Execution Environment
execute_code needs to run arbitrary TypeScript generated by the LLM,
but inside a sandbox.
Deno was the best fit because:
- TypeScript runs without transpilation
- Worker API is built-in and behaves like browser Web Workers
- Permission model is strong and easy to restrict
- The MCP TypeScript SDK works in Deno
Here's how the sandbox worker is created:
const worker = new Worker(workerUrl, {
  type: "module",
  deno: {
    permissions: {
      net: false,
      read: false,
      write: false,
      env: false,
      run: false,
      ffi: false,
    },
  },
})
Execution Constraints
| Item | Constraint |
|---|---|
| Timeout | 30 seconds |
| Allowed | console.log, basic JS/TS, MCP bindings |
| Forbidden | fetch, file access, external network |
Everything must go through MCP tool calls, which are relayed to the parent process.
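The 30-second timeout in the table can be enforced by racing the execution against a timer. This is a generic sketch under that assumption, not necessarily how Code Mode does it; `withTimeout` is a hypothetical helper name.

```typescript
// Reject if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`Execution timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Usage: a task that finishes in time resolves normally.
withTimeout(Promise.resolve("done"), 30_000).then((value) => console.log(value));
```

Note that rejecting the promise does not stop the code that is still running. With a worker-based sandbox, the timeout handler would also need to call `worker.terminate()` to actually end the execution.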
Connecting Child MCP Servers
Architecture:
Claude Code
│ stdio
Code Mode (search_docs / execute_code)
├─ Notion MCP (SSE via mcp-remote)
├─ Chrome DevTools MCP (stdio)
└─ Playwright MCP (stdio)
If a child server fails to connect:
- It is omitted from search_docs
- Any attempt to call its bindings results in Unknown tool
Currently, reconnection is manual (/mcp restart).
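The "omit on failure" behavior can be sketched with `Promise.allSettled`, so that one broken child does not prevent the others from starting. The `Connect` type and `connectAll` function are hypothetical; real connectors would create an MCP client and transport for each child.

```typescript
// Hypothetical connector: resolves once the child is connected, rejects on failure.
type Connect = () => Promise<unknown>;

// Try every child in parallel and report which ones came up.
async function connectAll(
  children: Record<string, Connect>,
): Promise<{ connected: string[]; failed: string[] }> {
  const names = Object.keys(children);
  const results = await Promise.allSettled(names.map((name) => children[name]()));

  const connected: string[] = [];
  const failed: string[] = [];
  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      connected.push(names[i]);
    } else {
      failed.push(names[i]);
    }
  });
  return { connected, failed };
}

// Usage with stub connectors: the failing child is simply left out.
connectAll({
  playwright: async () => "ok",
  notion: async () => {
    throw new Error("OAuth expired");
  },
}).then((summary) => console.log(summary));
```

Only the names in `connected` would then be registered for search_docs and injected as bindings.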
Pitfall: "os error 35" When Nesting MCP Servers
When Code Mode launched another MCP server via StdioClientTransport,
I repeatedly hit:
Resource temporarily unavailable (os error 35)
Root Cause
StdioClientTransport defaults to:
stderr: "inherit"
Since the meta server itself communicates with Claude Code via stdio,
stderr inheritance causes I/O interference.
Fix
Always use:
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "@playwright/mcp@latest"],
  stderr: "pipe", // REQUIRED for nested MCP
})
Do this for all child MCP servers.
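One way to make sure no child is missed is to route every child definition through a small helper. The sketch below is hypothetical: `ChildSpec`, `ChildParams`, and `toTransportParams` are illustrative names, not part of the SDK, and the helper only builds the options object that would be passed to `StdioClientTransport`.

```typescript
// What each child needs in order to be launched.
interface ChildSpec {
  command: string;
  args: string[];
}

// The same spec with stderr forced to "pipe".
interface ChildParams extends ChildSpec {
  stderr: "pipe";
}

// Always pipe stderr, so a nested server never inherits the parent's stream.
function toTransportParams(spec: ChildSpec): ChildParams {
  return { ...spec, stderr: "pipe" };
}

// Only the Playwright entry comes from this post; other children would be added the same way.
const childSpecs: ChildSpec[] = [
  { command: "npx", args: ["-y", "@playwright/mcp@latest"] },
];

console.log(childSpecs.map(toTransportParams));
```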
Reducing the Tool Definition Size
Originally, the execute_code tool description included:
- all available bindings
- code examples
- server lists
It reached 1.3k tokens by itself.
After simplification:
description: "Execute TypeScript code that uses child MCP servers. Use search_docs to inspect available bindings."
→ 687 tokens.
Tool definitions now stay the same size regardless of how many child MCP servers exist.
Results
| Metric | Before | After |
|---|---|---|
| MCP tool tokens | 36.6k | 4.4k |
| Number of tools | 63 | 2 |
| Context usage | 18.3% | 2.2% |
The difference in conversational "breathing room" is substantial.
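For anyone who wants to check the numbers, the percentages follow directly from the token counts and the 200k context window mentioned earlier. The `pct` helper is just for this illustration.

```typescript
// Percentage of `whole` represented by `part`, rounded to one decimal place.
function pct(part: number, whole: number): number {
  return Math.round((part / whole) * 1000) / 10;
}

const contextWindow = 200_000;
const before = 36_600;
const after = 4_400;

console.log(pct(before, contextWindow)); // 18.3
console.log(pct(after, contextWindow)); // 2.2
console.log(pct(before - after, before)); // 88
```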
Limitations
1. Less intuitive for an LLM
Normal MCP tools are self-descriptive.
Code Mode requires:
- searching docs
- writing code
LLMs need a short introduction before they can use it effectively.
2. Hardcoded configuration
Code Mode is not a plug-and-play npm package.
- Child server paths are hardcoded
- Users must adjust the code to their environment
This is an approach, not a product.
3. Weak reconnection handling
Once a child MCP disconnects (OAuth expiry, browser closure, etc.):
- There is no auto-retry
- Manual /mcp restart is required
This can be improved.
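One possible improvement is to retry the connection with exponential backoff. This is a sketch of the idea only, not something the current implementation does; `retryWithBackoff` and its default values are assumptions.

```typescript
// Call `connect` up to `maxAttempts` times, doubling the wait after each failure.
async function retryWithBackoff<T>(
  connect: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        // Delay grows as baseDelayMs, 2x, 4x, ... between attempts.
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}

// Usage with a stub that fails once before succeeding.
let tries = 0;
retryWithBackoff(async () => {
  tries += 1;
  if (tries < 2) throw new Error("child not ready");
  return "connected";
}, 3, 10).then((status) => console.log(status, "after", tries, "tries"));
```

Retries will not help with every failure: an expired OAuth session, for example, still needs the user to sign in again.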
Conclusion
- MCP tool definitions consume significant context
- A meta MCP server can compress dozens of tools into two
- Deno provides a robust sandbox for executing TypeScript safely
- stderr: "pipe" is essential to avoid I/O conflicts in nested MCP setups
- The approach works well but isn't universal or beginner-friendly
The full implementation (Deno + MCP SDK) is available here:
https://gist.github.com/tgfjt/a3716f7b1651d7fd7df2d769efdf644e
Hope this helps anyone experimenting with multi-MCP setups or context-efficient tooling!
References
- Code Execution with MCP (Anthropic) https://www.anthropic.com/engineering/code-execution-with-mcp
- modelcontextprotocol/typescript-sdk https://github.com/modelcontextprotocol/typescript-sdk