MCP Integration

How the Model Context Protocol turns external services into agent tools: transport selection, tool bridging, and connection lifecycle management.

An agent's built-in tools are capable but finite. In production, you need to connect to external services: code repositories, databases, monitoring dashboards, deployment systems. Without a standard integration layer, each connection is a custom adapter with its own auth, error handling, schema translation, and reconnection logic. The maintenance cost compounds with every new service.

The Model Context Protocol standardizes this adapter layer. An MCP server advertises its capabilities through a well-defined API. The agent-side client queries those capabilities at connection time, constructs Tool objects that are structurally identical to built-in tools, and registers them with the agent's tool dispatcher. From that point on, the agent loop dispatches MCP tools and built-in tools through exactly the same code path. The loop never sees "this is an external tool." It sees a Tool object with a name, a schema, and a call() function.

The key insight is that MCP handles everything the agent loop doesn't want to know about: transport selection, auth, schema translation, reconnection, and output normalization. By connection time, an MCP tool is structurally indistinguishable from a built-in tool. The complexity is absorbed at the boundary, not distributed into the loop.

The Tool Bridge Pattern

The tool bridge is the core pattern. At connection time, the client queries tools/list, receives a schema for each tool the server exposes, and constructs a standard Tool object for each one:

async def connect_server(server_name: str, config: ServerConfig) -> Connection:
  transport = create_transport(config)   # stdio, sse, http, ws, sdk: selected from config
  client = Client()
  await client.connect(transport)

  # Discover what this server can do
  tools_response = await client.request("tools/list")

  # Bridge: construct agent-compatible Tool objects from the MCP schema
  agent_tools = []
  for mcp_tool in tools_response.tools:
    annotations = mcp_tool.annotations or {}
    agent_tool = Tool(
      name=f"mcp__{server_name}__{mcp_tool.name}",
      description=truncate(mcp_tool.description, max_len=2048),
      input_schema=mcp_tool.input_schema,   # MCP server owns the schema
      is_read_only=annotations.get("read_only_hint", False),
      is_destructive=annotations.get("destructive_hint", False),
      # bind mcp_tool per iteration so each closure calls the right tool
      call=lambda args, ctx, tool=mcp_tool: client.call_tool(tool.name, args),
    )
    agent_tools.append(agent_tool)

  return Connection(client=client, tools=agent_tools)

Four things happen in this bridge:

  1. Discovery: tools/list returns the server's capabilities: name, description, input schema, and annotations.

  2. Construction: each MCP tool becomes a standard Tool object with the same interface as built-in tools. The dispatcher sees no difference.

  3. Namespacing: mcp__{server}__{tool} prevents collisions across servers and makes tool ownership traceable in logs. When you see mcp__payments__refund_transaction in a trace, you know immediately which server and which operation.

  4. Annotation passthrough: the server's hints (read_only_hint, destructive_hint) map directly to the concurrency and permission system. A tool marked read_only can run concurrently. A tool marked destructive triggers confirmation. See Tool System for how the dispatcher uses these flags.

The description truncation (capped at 2048 characters) is not optional. See Production Considerations for why.
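The namespacing and truncation steps can be sketched as two small helpers. This is an illustrative sketch, not a reference implementation: the names `namespaced_tool_name` and `truncate` are assumptions, and the 2048-character cap is the practical default discussed in Production Considerations.

```python
MAX_DESCRIPTION_LEN = 2048  # practical default; tune per deployment

def namespaced_tool_name(server_name: str, tool_name: str) -> str:
    """Build the mcp__{server}__{tool} name so tool ownership is traceable in logs."""
    return f"mcp__{server_name}__{tool_name}"

def truncate(text: str, max_len: int = MAX_DESCRIPTION_LEN) -> str:
    """Hard-cap a tool description; OpenAPI-generated servers can emit huge blobs."""
    if text is None:
        return ""
    if len(text) <= max_len:
        return text
    return text[:max_len - 3] + "..."
```

The truncation keeps the cap exact (the ellipsis counts toward the limit), so the context budget math stays predictable no matter what a server sends.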

Transport Selection

MCP servers connect via one of five transports, selected by the type field in the server configuration:

| Transport | Connection | Use case |
| --- | --- | --- |
| stdio | Local subprocess via stdin/stdout | Most common. Server runs as a child process. |
| sse | Server-sent events (HTTP long-poll) | Remote servers, legacy. One-way push with POST for client messages. |
| http | Bidirectional HTTP (POST + SSE) | The newer standard for remote servers (Streamable HTTP). |
| ws | WebSocket | Bidirectional streaming. Low-latency remote communication. |
| sdk | In-process, no network | Programmatic integration, testing. |

The transport is selected at configuration time, not at runtime. The client creates the appropriate transport object based on the config's type field, then all subsequent communication goes through the same request() / notify() interface regardless of transport. The tool bridge pattern above works identically for all five transports. client.request("tools/list") has the same API whether the underlying channel is a subprocess pipe, an HTTP stream, or a WebSocket.
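A config-driven transport factory can be sketched as a simple dispatch on the `type` field. The transport class names here are hypothetical stand-ins; a real client ships its own implementations behind a shared `request()` / `notify()` interface.

```python
# Hypothetical stand-ins for real transport implementations
class StdioTransport:
    def __init__(self, command): self.command = command   # child process to spawn

class SseTransport:
    def __init__(self, url): self.url = url               # legacy remote: SSE + POST

class HttpTransport:
    def __init__(self, url): self.url = url               # Streamable HTTP (preferred remote)

class WsTransport:
    def __init__(self, url): self.url = url               # bidirectional WebSocket

class SdkTransport:
    def __init__(self, server): self.server = server      # in-process, no network

def create_transport(config: dict):
    """Select the transport from config['type']; everything downstream is uniform."""
    factories = {
        "stdio": lambda: StdioTransport(config["command"]),
        "sse":   lambda: SseTransport(config["url"]),
        "http":  lambda: HttpTransport(config["url"]),
        "ws":    lambda: WsTransport(config["url"]),
        "sdk":   lambda: SdkTransport(config["server"]),
    }
    if config["type"] not in factories:
        raise ValueError(f"unknown transport type: {config['type']}")
    return factories[config["type"]]()
```

Because the selection happens once at configuration time, the tool bridge never branches on transport: it calls the same client methods whether the channel is a pipe, a stream, or a socket.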

The HTTP transport (sometimes called "Streamable HTTP") is the newest addition to the MCP specification. It replaces SSE for new remote integrations because it supports bidirectional communication without the limitations of server-sent events. SSE connections are one-way push channels. The client has to POST separately for messages to the server, which creates coordination overhead. HTTP transport handles both directions in a single channel. For new remote server integrations, prefer http over sse.

A critical SSE-specific constraint: SSE connections are long-lived GET requests that stay open to receive events. Standard HTTP timeout wrappers (commonly set at 60 seconds) will kill these streams. Any timeout middleware must explicitly skip GET requests. Applying a uniform timeout to all HTTP requests is a common implementation mistake that silently breaks SSE. See Production Considerations item 1.
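A minimal sketch of a timeout wrapper that gets this right, assuming a generic async handler interface (the function and parameter names are illustrative, not from any specific framework):

```python
import asyncio

REQUEST_TIMEOUT_SECONDS = 60  # standard budget for request/response calls

async def with_request_timeout(handler, method, request, timeout=REQUEST_TIMEOUT_SECONDS):
    """Apply the standard timeout to request/response calls, but never to GETs:
    SSE streams are long-lived GET requests, and a uniform timeout kills them."""
    if method == "GET":
        # long-lived SSE stream: let it run indefinitely
        return await handler(request)
    # POST and other request/response calls get the standard budget
    return await asyncio.wait_for(handler(request), timeout=timeout)
```

The asymmetry is the whole point: the same handler that must never be timed out as a GET stream must still be timed out as a POST, or a hung server stalls the agent loop.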

Connection Lifecycle

Every server connection is in one of five states. All tool and resource fetching is gated on the connected state:

# Server connection states: all tool/resource fetching gates on 'connected'
ConnectionState = Literal[
  "connected",   # tools available: everything works normally
  "failed",      # connection error: return empty tool list
  "needs-auth",  # auth required: offer auth tool only, no data tools
  "pending",     # reconnecting: return empty tools until reconnection succeeds
  "disabled",    # manually disabled: completely silent
]

def get_tools_for_server(connection: Connection) -> list[Tool]:
  if connection.state != "connected":
    return []   # all non-connected states return empty: agent loop never sees the error
  return connection.tools

The design principle: all non-connected states return empty tool lists. The agent loop gets a consistent interface regardless of server health. A server can fail, require auth, or be disabled. The loop never sees an error; it just has fewer tools available. This is the same fail-silent pattern used in circuit breakers: downstream failures are absorbed at the boundary.

Each non-connected state has a different implication:

  • failed: connection error. Returns empty tool list. The system may attempt reconnection depending on failure type.
  • needs-auth: server requires authentication. The client can surface an auth command to the user, but no data tools are exposed until auth completes.
  • pending: reconnection in progress. Returns empty tools until the reconnection attempt resolves.
  • disabled: manually disabled by user or admin. Completely silent: no errors, no auth prompts.

The needs-auth state has a caching concern: see Production Considerations item 3.

Batched Startup

When an agent connects to many MCP servers at startup, concurrency limits matter. The critical insight is that local servers and remote servers have fundamentally different resource profiles:

# Local servers (stdio/sdk): spawn child processes
# Too many concurrent spawns causes CPU/memory contention
# Safe default: 3 concurrent connections
process_batched(local_servers, connect_fn, batch_size=3)

# Remote servers (sse/http/ws): establish network connections
# These are just TCP handshakes, so much higher concurrency is safe
# Safe default: 20 concurrent connections
process_batched(remote_servers, connect_fn, batch_size=20)

Local servers spawn child processes. Creating 30 simultaneous processes stresses the operating system's scheduler and process table. Memory and CPU spike at startup. The batch size of 3 is conservative to avoid this contention.

Remote servers are just network connections; twenty concurrent TCP handshakes are routine and well within normal operating parameters. Using a conservative local batch size for remote connections wastes startup time unnecessarily.

Both batch sizes should be configurable via environment variables for environments with unusual constraints (for example, containerized deployments with strict process limits might need batch_size=1 for local servers).
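The batching above can be sketched with a semaphore bounding in-flight connections. This is a sketch under stated assumptions: `process_batched` is the document's own name, while the environment variable names (`MCP_LOCAL_BATCH`, `MCP_REMOTE_BATCH`) are hypothetical.

```python
import asyncio
import os

# Defaults from the text, overridable for constrained environments
LOCAL_BATCH = int(os.environ.get("MCP_LOCAL_BATCH", "3"))
REMOTE_BATCH = int(os.environ.get("MCP_REMOTE_BATCH", "20"))

async def process_batched(servers, connect_fn, batch_size):
    """Connect to all servers with at most batch_size connections in flight."""
    semaphore = asyncio.Semaphore(batch_size)

    async def bounded_connect(server):
        async with semaphore:
            return await connect_fn(server)

    # gather preserves input order, so results line up with the server list
    return await asyncio.gather(*(bounded_connect(s) for s in servers))
```

A semaphore is preferable to fixed sequential batches here: as soon as one connection finishes, the next one starts, rather than waiting for the slowest server in the batch.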

Config Scope Hierarchy

MCP servers are configured at six scopes, each with different reach and precedence. From highest to lowest:

| Scope | Source | Notes |
| --- | --- | --- |
| Enterprise | Managed config (IT-controlled) | Exclusive control: when present, cloud servers are not loaded |
| Local | Machine-specific settings | Per-machine overrides |
| User | User-global settings | Applies across all projects |
| Project | Project root config | Checked into the repository |
| Dynamic | Added at runtime | Via commands like /mcp add |
| Cloud | Fetched from provider | Lowest precedence |

Enterprise exclusivity: when an enterprise configuration exists, cloud-provided servers are never loaded. Enterprise has exclusive control over which external services the agent can access. This is a security boundary. It prevents users from sidestepping IT-approved tool sets by adding unapproved cloud servers.

Deduplication by URL signature: two servers configured at different scopes that point to the same endpoint are deduplicated. The deduplication key is the URL signature, not the server name, because names don't collide across scopes (each scope is independent) but two servers pointing to the same Slack integration absolutely would.
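A sketch of signature-based deduplication, assuming configs arrive already ordered from highest to lowest precedence. The function names and the normalization rules (lowercased host, trailing slash stripped) are illustrative assumptions, not the spec's canonicalization.

```python
from urllib.parse import urlsplit

def url_signature(config: dict) -> str:
    """Dedup key: the normalized endpoint, not the server name.
    Names don't collide across scopes; endpoints do."""
    if config["type"] in ("stdio", "sdk"):
        return f"{config['type']}:{config.get('command', '')}"
    parts = urlsplit(config["url"])
    # normalize so cosmetic differences don't defeat deduplication
    return f"{config['type']}:{parts.scheme}://{parts.netloc.lower()}{parts.path.rstrip('/')}"

def dedupe_servers(configs_by_precedence: list) -> list:
    """Keep the first (highest-precedence) config for each unique endpoint."""
    seen = set()
    result = []
    for config in configs_by_precedence:
        sig = url_signature(config)
        if sig not in seen:
            seen.add(sig)
            result.append(config)
    return result
```

Iterating in precedence order means the winner of a collision is always the higher-scope config, which matches the precedence table above.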

Production Considerations

1. SSE streams must skip the request timeout.

SSE connections are long-lived. The GET request stays open for the duration of the event stream. Applying a standard 60-second timeout kills the stream silently. The client receives an error that looks like a network timeout, not like a protocol issue. The fix: any timeout middleware must explicitly skip GET requests while still applying timeouts to POST requests. This is a common implementation mistake when adding timeouts globally, because "add a 60s timeout to all HTTP requests" sounds like a safe default.

2. Batch concurrency must be split by transport type.

Local (stdio) servers spawn processes. Too many concurrent spawns causes operating system resource contention. Remote servers are just network connections. A single batch size for both is either too aggressive for local servers or too conservative for remote ones. Production defaults that work across a wide range of environments: 3 concurrent for local, 20 concurrent for remote. Make both configurable.

3. Cache the needs-auth state to avoid startup hammering.

Without a cache, every startup re-probes all servers that failed auth in the previous session, generating a wave of network requests to servers that will fail again immediately. A short-TTL cache (15 minutes is a reasonable default) avoids repeated failures while picking up legitimate re-authentication. Serialize writes to the cache file to prevent race conditions when multiple connections complete auth simultaneously.
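A minimal sketch of such a cache: the class name, file format (a flat JSON map of server name to timestamp), and lock-based write serialization are all illustrative assumptions; the 15-minute TTL is the default from the text.

```python
import json
import threading
import time

AUTH_CACHE_TTL = 15 * 60  # seconds; reasonable default per the text

class NeedsAuthCache:
    """Remembers which servers recently failed auth so startup doesn't re-probe them.
    Writes are serialized with a lock to survive concurrent auth completions."""

    def __init__(self, path):
        self.path = path
        self.lock = threading.Lock()

    def _load(self):
        try:
            with open(self.path) as f:
                return json.load(f)
        except (FileNotFoundError, json.JSONDecodeError):
            return {}

    def mark_needs_auth(self, server_name):
        with self.lock:
            data = self._load()
            data[server_name] = time.time()
            with open(self.path, "w") as f:
                json.dump(data, f)

    def should_skip_probe(self, server_name):
        entry = self._load().get(server_name)
        return entry is not None and time.time() - entry < AUTH_CACHE_TTL

    def clear(self, server_name):
        # call after successful auth so the next startup probes again
        with self.lock:
            data = self._load()
            data.pop(server_name, None)
            with open(self.path, "w") as f:
                json.dump(data, f)
```

The TTL is what makes this safe: a user who re-authenticates out of band is picked up within one TTL window without any explicit cache bust.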

4. Truncate tool descriptions to prevent context poisoning.

MCP servers generated from OpenAPI specifications commonly produce tool descriptions of 15-60KB. Without a cap (2048 characters is a practical default), a single server can exhaust the agent's context budget on every turn. The system prompt grows by 60KB, leaving less space for conversation history and tool results. This failure mode is invisible: the agent doesn't error, it just produces increasingly poor results as context pressure mounts. Truncation must apply to both individual tool descriptions and the server-level instructions string.

5. Session expiry requires full cache invalidation.

For HTTP-transport servers, sessions can expire server-side. The expiry signature is a specific error: HTTP 404 with a JSON-RPC error code indicating the session was not found. When this happens, clearing the connection cache alone is not sufficient. All fetch caches (tool lists, resource lists, server commands) from the expired session must be cleared. Without full invalidation, a reconnected client serves stale tool lists from the old session. The tool names in those lists may no longer exist on the server, causing every tool dispatch to fail.
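A sketch of the detection and invalidation path. The specific JSON-RPC error code is an assumption here (check your server's behavior); the connection shape and cache names are illustrative.

```python
SESSION_NOT_FOUND = -32001  # assumed error code; verify against your server

def is_session_expired(status_code: int, body: dict) -> bool:
    """The expiry signature: HTTP 404 plus a JSON-RPC error
    indicating the session was not found."""
    error = body.get("error") or {}
    return status_code == 404 and error.get("code") == SESSION_NOT_FOUND

def handle_expiry(connection: dict) -> None:
    """Full invalidation: drop the client AND every fetch cache from the
    old session, then mark the connection for reconnection."""
    connection["client"] = None
    caches = connection.setdefault("caches", {})
    for name in ("tools", "resources", "commands"):
        caches.pop(name, None)       # stale lists from the expired session
    connection["state"] = "pending"  # triggers reconnect + fresh discovery
```

Setting the state to pending reuses the lifecycle above: the server contributes no tools until reconnection and rediscovery succeed, so the agent never dispatches against a dead session.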

6. Memoization must be cleared on reconnect.

On any connection close and reopen, all memoized fetch results become stale. A reconnected server may have different tools, different resources, or different commands than the previous session, especially if the server was updated between sessions. Clearing only the connection object while retaining cached tool lists means the agent dispatches requests to tools that the new session doesn't expose.

Best Practices

  • Do use the tool bridge pattern. Construct Tool objects that are structurally identical to built-in tools so the agent loop dispatch path stays uniform.
  • Do namespace MCP tool names (mcp__{server}__{tool}) to prevent collisions and make tool ownership traceable in logs.
  • Do return empty tool lists for non-connected servers. The agent loop should never see connection errors, just fewer available tools.
  • Do split batch concurrency by transport type. Local process spawning and remote network connections have fundamentally different resource profiles.
  • Do truncate tool descriptions before registering. Set a hard cap (2048 chars) to prevent context budget exhaustion.
  • Don't apply uniform timeouts to all HTTP requests. SSE GET streams are long-lived and must be excluded from standard timeout wrappers.
  • Don't skip tool description truncation. A single OpenAPI-generated server can add 60KB to every turn's context.
  • Don't retain memoized tool lists across reconnections. Every reconnection must start with fresh tool and resource discovery.
  • Don't implement a single availability check for auth state. Use the needs-auth connection state to distinguish "not connected yet" from "requires auth" from "failed with an error."

Related Pages

  • Tool System: The tool dispatch system that MCP tools integrate into. MCP tools use the same concurrency classes, permission checks, and dispatch paths as built-in tools.
  • Command and Plugin Systems: MCP servers can contribute commands to the agent's command registry through the prompt command type.
  • Safety and Permissions: MCP tool annotations (destructive_hint, read_only_hint) feed directly into the agent's permission system.
  • Hooks and Extensions: MCP tools are hookable via the same PreToolUse and PostToolUse events as built-in tools. Hooks can intercept, modify, or block MCP tool calls using the same condition syntax.
  • Pattern Index: All patterns from this page in one searchable list, with context tags and links back to the originating section.
  • Glossary: Definitions for domain terms used on this page.