
Model Context Protocol

Every tool you give an agent costs tokens before it asks a single question. MCP defines how to give agents the right tools at the right cost.

Model Context Protocol standardizes how AI models access external capabilities — databases, APIs, file systems, and other tools. Before MCP, every AI application invented its own integration layer. MCP is the USB-C standard for agent-tool connections: build the server once, use it with any MCP-compatible client.

Architecture

Three roles, one protocol:

Host (Claude Code, Cursor, Claude Desktop)
└── Client (one per server connection)
    └── Server (exposes tools/resources/prompts)
        └── External system (DB, API, filesystem)

| Role | Job | Examples |
|---|---|---|
| Host | Contains the LLM, manages all client connections | Claude Code, Cursor, Claude Desktop |
| Client | Protocol adapter — one per server, handles transport | Built into the host application |
| Server | Exposes capabilities to the client | Supabase MCP, GitHub MCP, Perplexity MCP |

Connection flow:

  1. Host creates a client for each configured server
  2. Client connects via transport (subprocess stdio for local; HTTP+SSE for remote)
  3. Client and server negotiate capabilities (handshake lists what's available)
  4. Model receives all tool schemas; calls tools as it reasons
  5. Results flow back through client into model context
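The wiring in steps 1–2 can be sketched host-side. This is an illustration, not the real MCP SDK: the config shape mirrors the common `mcpServers` JSON used by hosts such as Claude Desktop, and the server entries (including the remote URL) are hypothetical.

```python
# Illustrative host-side wiring: one client per configured server.
# Config shape mirrors the common mcpServers JSON; entries are hypothetical.
config = {
    "mcpServers": {
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
        },
        "github": {"url": "https://example.com/mcp"},  # hypothetical remote endpoint
    }
}

def transport_for(entry: dict) -> str:
    # Local subprocess servers speak stdio; remote servers use HTTP + SSE.
    return "stdio" if "command" in entry else "http+sse"

# The host creates one client per server, each bound to a transport.
clients = {name: transport_for(entry) for name, entry in config["mcpServers"].items()}
print(clients)  # {'filesystem': 'stdio', 'github': 'http+sse'}
```

The transport choice falls directly out of the config: a `command` key means a local subprocess, a `url` key means a remote service.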

Three Primitives

Every MCP server exposes some combination of three capability types:

| Primitive | What it is | Token behavior |
|---|---|---|
| Tools | Functions the model calls — search, query, write, execute | Schema in context always; result only when called |
| Resources | File-like data the model can read — documents, DB records, configs | Loaded on request; can be large |
| Prompts | Reusable prompt templates with typed arguments | Minimal schema cost; injected on demand |

Tools are the main capability. Resources and prompts extend the pattern for read-heavy and templated workflows.
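The token asymmetry between the three primitives can be made concrete with a toy model (plain dataclasses, not the MCP SDK; names and token counts are illustrative): only tool schemas are paid for up front, while resources and prompts cost nearly nothing until used.

```python
# Toy model of the three primitives, illustrating which ones cost tokens
# up front. Not the MCP SDK; names and token counts are made up.
from dataclasses import dataclass, field

@dataclass
class Tool:
    name: str
    schema_tokens: int  # schema sits in context every session

@dataclass
class Resource:
    uri: str  # loaded only when the model actually reads it

@dataclass
class Prompt:
    name: str
    arguments: list  # typed template arguments, injected on demand

@dataclass
class Server:
    tools: list = field(default_factory=list)
    resources: list = field(default_factory=list)
    prompts: list = field(default_factory=list)

    def upfront_cost(self) -> int:
        # Only tool schemas land in context before the first message.
        return sum(t.schema_tokens for t in self.tools)

srv = Server(
    tools=[Tool("query_db", 450), Tool("search_docs", 300)],
    resources=[Resource("file:///README.md")],
    prompts=[Prompt("code_review", ["language"])],
)
print(srv.upfront_cost())  # 750 — resources and prompts add nothing up front
```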

Token Economics

Loading tools is not free. Research shows MCP tool definitions inflate input tokens by 3x to 236x depending on the toolset. Every schema loaded before the first message is tokens unavailable for reasoning.

Session budget = context_window × 0.6   (reserve 40% for reasoning)
MCP overhead = Σ(tool_schema_tokens) + Σ(tool_result_tokens)
Target = schema overhead < 15% of session budget
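The budget arithmetic above translates directly into code. A minimal sketch, assuming a 200k-token context window and illustrative schema sizes; the 40% reasoning reserve and 15% cap come straight from the formulas:

```python
# Token-budget check from the formulas above. Window size and schema
# token counts are illustrative.
def session_budget(context_window: int, reasoning_reserve: float = 0.4) -> int:
    # Session budget = context_window * 0.6 (reserve 40% for reasoning).
    return int(context_window * (1 - reasoning_reserve))

def schema_overhead_ok(tool_schema_tokens: list, budget: int, cap: float = 0.15) -> bool:
    # Target: total schema overhead stays under 15% of the session budget.
    return sum(tool_schema_tokens) < cap * budget

budget = session_budget(200_000)               # 120_000 usable tokens
ok = schema_overhead_ok([1200, 800, 2500], budget)  # 4_500 < 18_000
print(budget, ok)  # 120000 True
```

Note that this only bounds the static schema cost; tool *results* also count against the budget, but arrive only when tools are actually called.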

Static vs dynamic loading:

| Strategy | Token cost | Best for |
|---|---|---|
| Static (always load) | All schemas in context from turn 1 | Toolsets of ≤5 tools used every session |
| Dynamic (load on demand) | Only schemas for tools currently needed | Large toolsets — reduces overhead by 96% |
| Team profiles (curated sets) | Right tools per role, nothing extra | The recommended approach |

The practical rule: if a tool isn't in your team's Always Load column, don't load it. See the adoption radar and team matrices for per-team tool profiles.
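The static-versus-dynamic decision can be sketched as a small selection function. A toy example, assuming a 5-tool threshold for static loading and made-up tool names and schema sizes:

```python
# Sketch of static vs dynamic schema loading. Threshold, tool names,
# and token counts are illustrative assumptions.
def schemas_to_load(all_tools: dict, always_load: set, needed_now: set) -> dict:
    """all_tools maps tool name -> schema token cost."""
    if len(all_tools) <= 5:
        selected = set(all_tools)             # static: load everything
    else:
        selected = always_load | needed_now   # dynamic: curated set + on demand
    return {name: all_tools[name] for name in selected}

tools = {f"tool_{i}": 400 for i in range(40)}  # 16_000 tokens if loaded statically
loaded = schemas_to_load(tools, always_load={"tool_0"}, needed_now={"tool_7"})
print(sum(loaded.values()))  # 800 instead of 16_000 — ~95% less overhead
```

The curated `always_load` set plays the role of a team profile: it is the only thing paid for on turn 1, and everything else waits until a task actually needs it.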

Transport Types

| Transport | How it works | Best for |
|---|---|---|
| stdio | Server runs as a subprocess; client communicates via stdin/stdout | Local servers — filesystem, CLI tools |
| HTTP + SSE | Server runs as a remote HTTP service; client connects via SSE | Cloud-hosted servers — APIs, SaaS integrations |

stdio adds negligible latency and needs no networking. HTTP + SSE enables remote, shared server instances — one Supabase MCP server for the whole team.
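The stdio transport can be demonstrated in a few lines: the client spawns the server as a subprocess and exchanges newline-delimited JSON over its pipes. This is a toy echo round-trip, not the full MCP JSON-RPC handshake:

```python
# Toy stdio transport: spawn a "server" subprocess, send one JSON request
# on stdin, read one JSON response from stdout. Not the real MCP handshake.
import json
import subprocess
import sys

server_code = (
    "import json,sys;"
    "req=json.loads(sys.stdin.readline());"
    "print(json.dumps({'id': req['id'], 'result': 'pong'}))"
)

proc = subprocess.run(
    [sys.executable, "-c", server_code],
    input=json.dumps({"id": 1, "method": "ping"}) + "\n",
    capture_output=True,
    text=True,
)
response = json.loads(proc.stdout)
print(response)  # {'id': 1, 'result': 'pong'}
```

An HTTP + SSE server follows the same request/response shape, but over a network socket, which is what lets multiple clients share one server instance.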

Why MCP vs Function Calling

Traditional function calling required each application to define its own tools. No reuse across applications. No standard security model. No dynamic discovery.

| Dimension | Function calling | MCP |
|---|---|---|
| Integration | Per-application custom code | Standard protocol — build once |
| Discovery | Static tool list at startup | Dynamic capability negotiation |
| Security | Defined per integration | Server-enforced access controls |
| Composability | Each agent rebuilds the stack | Shared server ecosystem (2,000+ servers) |
| Resources | Tools only | Tools + resources + prompts |

The compounding effect: a well-designed MCP server built once works in Claude Code, Cursor, Claude Desktop, and any future client without modification.

Context

Questions

Which MCP server design decision — tool schema clarity, stateless versus stateful context, or authentication model — has the most impact on agent reliability at scale?

  • At what tool count does an MCP server become too complex for an agent to reliably select the right tool without additional routing logic?
  • If dynamic loading reduces token overhead by 96% but increases tool calls by 2-3x, when is static loading still worth it?
  • Which of the three primitives (tools, resources, prompts) is most underused — and what workflow does that gap represent?