Technical April 20, 2026 · 16 min read

MCP Servers: The Complete Guide for 2026

What MCP servers are, how they work, and how to build one for production AI agent development. Covers resources, tools, prompts, and real use cases.

StudioBuildIt

What this covers: What MCP is and why it matters, the full architecture, building your first server, authentication patterns, when to use hosted vs. custom, the current ecosystem, and the mistakes builders make.

What MCP actually is (without the buzzwords)

Model Context Protocol is a standard for connecting AI models to external tools, data sources, and services. Anthropic introduced it in late 2024, and Google (Gemini), OpenAI (Codex CLI), and most major AI tooling providers have since adopted it.

Before MCP, every AI application that needed to call an external tool required a bespoke integration: write the tool definition in the model’s schema format, handle the model’s output format, manage retries and errors, and repeat for every model. When you switched models, you rewrote the integrations.

MCP standardizes this. A tool is described once. Any MCP-compatible model can use it.

The analogy: HTTP is how browsers and servers communicate. MCP is how AI models and tools communicate. You do not build a custom protocol every time you build a website; you build over HTTP. Likewise, you should not build custom tool integrations every time you build an AI product; you should build over MCP.

The three primitives

MCP servers expose three kinds of things:

Tools: Functions the AI can call. A tool has a name, a description (which the AI reads to decide when to use it), and an input schema. The server handles the actual execution and returns a result. Examples: create_invoice, search_database, send_email.

Resources: Data the AI can read. Resources are like files or database rows: they have a URI, a description, and content. Examples: a user’s transaction history, a company’s policy documents, a live dashboard reading.

Prompts: Reusable prompt templates the AI can invoke. Less commonly used, but useful for standardizing complex multi-step instructions.

Most production MCP servers focus primarily on tools. Resources are useful when you want to give the AI read access to structured data without embedding it in the context window on every call.
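To make the three primitives concrete, here is a minimal sketch of what each one looks like as data. The interfaces are simplified stand-ins written for this example, not the SDK’s exported types, and the specific names (the policy:// URI, summarize_account, describeCapabilities) are hypothetical.

```typescript
// Simplified shapes for the three MCP primitives.
// Illustrative stand-ins only, not the SDK's exported types.
interface ToolDefinition {
  name: string;
  description: string; // the model reads this to decide when to call the tool
  inputSchema: { type: "object"; properties: Record<string, unknown>; required?: string[] };
}

interface ResourceDefinition {
  uri: string; // identifier the client uses to read the content
  name: string;
  description: string;
  mimeType?: string;
}

interface PromptDefinition {
  name: string;
  description: string;
  arguments?: { name: string; description: string; required?: boolean }[];
}

const createInvoice: ToolDefinition = {
  name: "create_invoice",
  description: "Create a draft invoice for a customer and return its ID",
  inputSchema: {
    type: "object",
    properties: { customerId: { type: "string" }, amountCents: { type: "integer" } },
    required: ["customerId", "amountCents"],
  },
};

const refundPolicy: ResourceDefinition = {
  uri: "policy://refunds/current", // hypothetical URI scheme
  name: "Refund policy",
  description: "The company's current refund policy document",
  mimeType: "text/markdown",
};

const summarizeAccount: PromptDefinition = {
  name: "summarize_account", // hypothetical prompt name
  description: "Multi-step instructions for summarizing a customer account",
  arguments: [{ name: "customerId", description: "Account to summarize", required: true }],
};

// A server should advertise only the primitives it actually implements.
function describeCapabilities(
  tools: ToolDefinition[],
  resources: ResourceDefinition[],
  prompts: PromptDefinition[],
): string[] {
  const capabilities: string[] = [];
  if (tools.length > 0) capabilities.push("tools");
  if (resources.length > 0) capabilities.push("resources");
  if (prompts.length > 0) capabilities.push("prompts");
  return capabilities;
}
```

A tools-plus-resources server would call describeCapabilities([createInvoice], [refundPolicy], []) and advertise only those two.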

The architecture of an MCP server

A minimal MCP server in TypeScript:

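import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";

// Placeholder data layer so the example runs on its own.
// Swap in your real database client.
const db = {
  customers: {
    findByEmail: async (email: string) => ({ email, name: "Example Customer" }),
  },
};

// Declare the server's identity and the capabilities it supports (tools only).
const server = new Server(
  { name: "my-server", version: "1.0.0" },
  { capabilities: { tools: {} } }
);

// Answer "what tools do you have?"
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: "get_customer",
      description: "Look up a customer by email address",
      inputSchema: {
        type: "object",
        properties: {
          email: { type: "string", description: "Customer email address" }
        },
        required: ["email"]
      }
    }
  ]
}));

// Answer "run this tool with these arguments."
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === "get_customer") {
    const { email } = request.params.arguments as { email: string };
    const customer = await db.customers.findByEmail(email);
    return {
      content: [{ type: "text", text: JSON.stringify(customer) }]
    };
  }
  throw new Error(`Unknown tool: ${request.params.name}`);
});

// Communicate with the client over stdin/stdout.
const transport = new StdioServerTransport();
await server.connect(transport);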

That’s it. The server answers two kinds of requests: “what tools do you have?” and “run this tool with these arguments.”
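Under the SDK, those two requests are plain JSON-RPC 2.0 messages with the methods tools/list and tools/call. The sketch below hand-rolls the routing only to show the shape of the exchange; the SDK does this for you, and the handle function and the echoed customer record are assumptions made for illustration.

```typescript
// The two requests a tools-only server answers, as JSON-RPC 2.0 messages.
// Hand-rolled for illustration; the SDK performs this routing itself.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: { name?: string; arguments?: Record<string, unknown> };
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

const tools = [
  {
    name: "get_customer",
    description: "Look up a customer by email address",
    inputSchema: {
      type: "object",
      properties: { email: { type: "string", description: "Customer email address" } },
      required: ["email"],
    },
  },
];

function handle(request: JsonRpcRequest): JsonRpcResponse {
  switch (request.method) {
    case "tools/list":
      // "What tools do you have?"
      return { jsonrpc: "2.0", id: request.id, result: { tools } };
    case "tools/call": {
      // "Run this tool with these arguments."
      const name = request.params?.name;
      if (name !== "get_customer") {
        return {
          jsonrpc: "2.0",
          id: request.id,
          error: { code: -32602, message: `Unknown tool: ${name}` },
        };
      }
      const email = String(request.params?.arguments?.email ?? "");
      // A real server would query its data store here; this echoes the input.
      return {
        jsonrpc: "2.0",
        id: request.id,
        result: { content: [{ type: "text", text: JSON.stringify({ email }) }] },
      };
    }
    default:
      return { jsonrpc: "2.0", id: request.id, error: { code: -32601, message: "Method not found" } };
  }
}
```

Every response carries the id of the request it answers, which is how the client matches replies when several calls are in flight.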

The transport layer (StdioServerTransport here) handles the communication. For local development and Claude Code integration, stdio transport is standard. For remote servers, where you run the MCP server as a web service, you use an HTTP-based transport: Streamable HTTP in current versions of the spec, which replaced the original HTTP + SSE (Server-Sent Events) transport.

Transport modes: local vs. remote

Local (stdio): The MCP server runs as a subprocess on the same machine as the AI client. The client starts the server, communicates over stdin/stdout, and kills it when done. Simple, no auth needed for the transport itself, great for Claude Code and Cursor integrations.
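Over stdio, each JSON-RPC message travels as one line of JSON terminated by a newline. The sketch below shows that framing on its own, assuming chunks can arrive split at arbitrary points; encodeMessage and LineBuffer are hypothetical names, and the SDK’s StdioServerTransport handles all of this for you.

```typescript
// Newline-delimited JSON framing, as used over stdin/stdout.
// Illustrative only: the SDK's stdio transport does this internally.
function encodeMessage(message: object): string {
  // JSON.stringify without an indent argument never emits raw newlines,
  // so one message always occupies exactly one line.
  return JSON.stringify(message) + "\n";
}

class LineBuffer {
  private pending = "";

  // Feed a raw chunk; get back every complete message that chunk finished.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on a newline).
    this.pending = lines.pop() ?? "";
    return lines.filter((line) => line.trim() !== "").map((line) => JSON.parse(line));
  }
}
```

One practical consequence: a stdio server must never write logs to stdout, because any stray line there corrupts the message stream. Send diagnostics to stderr instead.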

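Remote (HTTP): The MCP server runs as a web service. The AI client connects over HTTP, sends requests as POST, and receives streaming responses via SSE. Current versions of the spec call this Streamable HTTP; it supersedes the standalone HTTP + SSE transport from the original release. This is the mode you use when: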

  • Multiple users need to share the same server
  • The server needs persistent state (connections, caches)
  • You’re deploying to production for a multi-tenant AI product
  • You want to host on Cloudflare Workers, AWS Lambda, etc.

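In practice, a remote server needs authentication, because otherwise anyone who can reach the URL can call its tools. The MCP spec builds its authorization on OAuth 2.0 (current revisions reference the OAuth 2.1 draft): the Authorization Code flow for user-specific credentials and the Client Credentials flow for service-to-service access. This is where most builders get tripped up.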

Authentication patterns

Pattern 1: Static API key (simplest, suitable for internal tools)

The MCP client passes an API key in the Authorization header. The server validates it against an environment variable. Fine for internal tooling where you control both ends. Not suitable for multi-tenant products.
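A minimal sketch of that check, assuming the key arrives as a Bearer token and the expected value has already been read from an environment variable. The helper names are hypothetical, and the comparison is hand-rolled only to keep the example dependency-free; in Node you would normally reach for crypto.timingSafeEqual.

```typescript
// Compare in constant time so response timing does not reveal how many
// leading characters of a guessed key were correct.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// authorizationHeader: the raw Authorization header from the request.
// expectedKey: the key loaded from configuration (for example an env var).
function isAuthorized(authorizationHeader: string | undefined, expectedKey: string): boolean {
  // Fail closed: a missing header or an unset key never authorizes anyone.
  if (!authorizationHeader || expectedKey.length === 0) return false;
  const [scheme, token] = authorizationHeader.split(" ");
  return scheme === "Bearer" && token !== undefined && constantTimeEqual(token, expectedKey);
}
```

The empty-key check matters more than it looks: without it, a deployment that forgot to set the environment variable could accept a request carrying an empty token.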

Pattern 2: OAuth 2.0 Authorization Code (for user-specific credentials)

Use this when the model needs to act on behalf of a user: reading their Gmail, posting to their Slack, or accessing their Salesforce account. The OAuth Authorization Code flow handles this:

  1. The MCP server tells the client “this tool requires authorization”
  2. The client opens an auth URL in the browser
  3. The user consents
  4. The server receives the callback and stores the access token
  5. Subsequent tool calls include the token automatically
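Steps 2, 4, and 5 can be sketched in a few lines. This is an outline under stated assumptions, not a complete OAuth client: the endpoint, client ID, and TokenStore class are hypothetical, the code exchange itself is left out, and PKCE (which OAuth 2.1 requires) is omitted for brevity.

```typescript
// Step 2: build the URL the client opens in the browser.
interface AuthRequest {
  authorizationEndpoint: string; // hypothetical provider endpoint
  clientId: string;
  redirectUri: string;
  scopes: string[];
  state: string; // random value checked again on the callback to block CSRF
}

function buildAuthorizationUrl(req: AuthRequest): string {
  const params: [string, string][] = [
    ["response_type", "code"],
    ["client_id", req.clientId],
    ["redirect_uri", req.redirectUri],
    ["scope", req.scopes.join(" ")],
    ["state", req.state],
  ];
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `${req.authorizationEndpoint}?${query}`;
}

// Steps 4 and 5: keep the token per user and hand it to later tool calls.
class TokenStore {
  private tokens = new Map<string, { accessToken: string; expiresAt: number }>();

  save(userId: string, accessToken: string, expiresInSeconds: number, now: number = Date.now()): void {
    this.tokens.set(userId, { accessToken, expiresAt: now + expiresInSeconds * 1000 });
  }

  // Returns undefined when no token exists or it has expired, which
  // signals that the user must go through consent again.
  get(userId: string, now: number = Date.now()): string | undefined {
    const entry = this.tokens.get(userId);
    if (entry === undefined || now >= entry.expiresAt) return undefined;
    return entry.accessToken;
  }
}
```

An in-memory map is fine for a demo. A production server would persist tokens encrypted and handle refresh tokens as well.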

Implementing this from scratch is real work. The @modelcontextprotocol/sdk v1.5+ includes OAuth helpers. For production deployments, look at how Cloudflare’s MCP server templates handle this; they are good reference implementations.

Pattern 3: Short-lived tokens via your existing auth system (for products with their own auth)

If you have an existing auth system, issue short-lived tokens from it. The MCP client (Claude Code, your AI product) presents the token per request. The server validates against your auth service. This is the cleanest pattern for teams that already have JWT or session-based auth.
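The server-side check for this pattern comes down to a few claim comparisons. The sketch below assumes the token’s signature has already been verified by your JWT library or auth service and only shows what to check afterwards; the claim names follow common JWT conventions, and validateClaims is a hypothetical helper.

```typescript
// Claims from a token whose signature has ALREADY been verified elsewhere.
// This sketch does not verify signatures; use a real JWT library for that.
interface TokenClaims {
  sub: string; // user the token was issued to
  aud: string; // service the token was issued for
  exp: number; // expiry, in seconds since the Unix epoch
  scope: string; // space-separated list of granted scopes
}

type Validation = { ok: true; userId: string } | { ok: false; reason: string };

function validateClaims(
  claims: TokenClaims,
  expectedAudience: string,
  requiredScope: string,
  nowSeconds: number,
): Validation {
  if (claims.exp <= nowSeconds) return { ok: false, reason: "token expired" };
  // Reject tokens minted for a different service, even if otherwise valid.
  if (claims.aud !== expectedAudience) return { ok: false, reason: "wrong audience" };
  const granted = claims.scope.split(" ");
  if (!granted.some((scope) => scope === requiredScope)) {
    return { ok: false, reason: `missing scope: ${requiredScope}` };
  }
  return { ok: true, userId: claims.sub };
}
```

Checking the audience is the step most often skipped. Without it, a token issued for some other internal service would also unlock your MCP server.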

When to build a custom server vs. use an existing one

The MCP ecosystem now has hundreds of community and official servers. Before building, check:

  • Anthropic’s official servers: GitHub, Slack, Google Drive, Notion, Postgres, Puppeteer. These are production-quality.
  • Cloudflare’s server directory: Large curated list, many hosted servers you can connect without deploying anything.
  • mcp.so: Community directory with ratings.

Build a custom server when:

  • The existing servers don’t expose the specific operations you need
  • You need to wrap internal systems with no public API
  • You need custom business logic in the tool layer (not just “call this API”)
  • You need fine-grained access control that generic servers don’t support
  • Performance matters: a bespoke server can be much faster than a generic one

Don’t build custom servers as a reflex. A 30-minute audit of existing servers often finds something 80% suitable that you can extend rather than replace.

The mistakes builders make

Putting too much in a single server: One server with 40 tools is harder to debug, slower to respond to ListTools, and exposes more attack surface than necessary. Split by domain. One server for CRM operations, one for email, one for billing.

Not writing tool descriptions well: The AI decides which tool to use based on the description. Vague descriptions cause incorrect tool selection. Write descriptions for the AI, not for humans. “Retrieve a customer’s full profile including billing history, active subscriptions, and support ticket count” is better than “Get customer info.”

No error handling: What happens when the API the tool calls is down, returns a 500, or times out? A production MCP server handles all of these gracefully and returns useful error messages the AI can act on: “the CRM is unavailable, please try again in a few minutes” rather than an uncaught exception.
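One way to structure this is to catch failures at the tool boundary and turn them into a result flagged with isError, so the model sees a message it can act on. The sketch below assumes upstream failures carry a numeric status property and timeouts are named TimeoutError; the ToolResult interface is a simplified stand-in, and toToolError and safeCall are hypothetical helpers.

```typescript
// Simplified shape of a tool result; isError marks a failed execution.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Read a property off an unknown thrown value without assuming its type.
function readField(err: unknown, field: string): unknown {
  if (typeof err === "object" && err !== null && field in err) {
    return (err as Record<string, unknown>)[field];
  }
  return undefined;
}

// Translate a thrown value into a message the model can act on.
function toToolError(err: unknown, service: string): ToolResult {
  const status = readField(err, "status");
  let text = `Unexpected failure calling ${service}. Report this to the user instead of retrying.`;
  if (typeof status === "number" && status >= 500) {
    text = `${service} is unavailable (HTTP ${status}). Please try again in a few minutes.`;
  } else if (readField(err, "name") === "TimeoutError") {
    text = `${service} did not respond in time. Please try again in a few minutes.`;
  }
  return { content: [{ type: "text", text }], isError: true };
}

// Wrap any tool body so no exception escapes uncaught.
async function safeCall(service: string, work: () => Promise<string>): Promise<ToolResult> {
  try {
    return { content: [{ type: "text", text: await work() }] };
  } catch (err) {
    return toToolError(err, service);
  }
}
```

The fallback message deliberately tells the model not to retry, since an unrecognized failure may not be safe to repeat.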

Synchronous operations on async tasks: Some operations take 30 or more seconds (running a report, generating a PDF). Do not make the AI wait synchronously. Return a job ID immediately, expose a get_job_status tool, and let the AI poll. Or use the notification and streaming mechanisms in the MCP spec.
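The job-ID pattern can be sketched with an in-memory map. This is a minimal illustration under stated assumptions: startJob would back a hypothetical start_report tool, getJobStatus would back get_job_status, and nothing here survives a restart or expires old jobs.

```typescript
type JobStatus =
  | { state: "running" }
  | { state: "done"; result: string }
  | { state: "failed"; error: string };

const jobs = new Map<string, JobStatus>();
let nextJobNumber = 1;

// Backs a hypothetical start_report tool: start the work, return at once.
function startJob(work: () => Promise<string>): string {
  const id = `job-${nextJobNumber++}`;
  jobs.set(id, { state: "running" });
  // Deliberately not awaited: the status is updated whenever the work settles.
  work().then(
    (result) => {
      jobs.set(id, { state: "done", result });
    },
    (err) => {
      jobs.set(id, { state: "failed", error: err instanceof Error ? err.message : String(err) });
    },
  );
  return id;
}

// Backs the get_job_status tool that the model polls.
function getJobStatus(id: string): JobStatus | { state: "unknown" } {
  return jobs.get(id) ?? { state: "unknown" };
}
```

Returning an explicit unknown state, rather than throwing, gives the model something to reason about if it polls with a mistyped or expired ID.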

Ignoring the input schema: The input schema is how the AI knows what to send. Loose schemas (“arguments: any object”) produce unpredictable tool calls. Tight schemas with clear descriptions for each property improve reliability substantially.
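The difference between a loose and a tight schema is easiest to see side by side. In the sketch below, the property names and limits are made up for illustration, and validate is a hand-rolled check covering only the rules this one schema uses; a real server would use a full JSON Schema validator.

```typescript
// Loose: tells the model nothing about what to send.
const looseSchema = { type: "object" };

// Tight: every property is named, described, and constrained.
interface PropertySchema {
  type: "string" | "integer";
  description: string;
  enum?: string[];
  minimum?: number;
  maximum?: number;
}

interface ObjectSchema {
  type: "object";
  properties: Record<string, PropertySchema>;
  required: string[];
  additionalProperties: false;
}

const tightSchema: ObjectSchema = {
  type: "object",
  properties: {
    email: { type: "string", description: "Customer email address, e.g. ada@example.com" },
    include: {
      type: "string",
      enum: ["profile", "billing", "tickets"],
      description: "Which section of the customer record to return",
    },
    limit: { type: "integer", minimum: 1, maximum: 50, description: "Maximum rows to return" },
  },
  required: ["email", "include"],
  additionalProperties: false,
};

// Hand-rolled check for the subset of JSON Schema used above.
function validate(schema: ObjectSchema, args: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const key of schema.required) {
    if (!(key in args)) errors.push(`missing required property: ${key}`);
  }
  for (const key of Object.keys(args)) {
    if (!Object.prototype.hasOwnProperty.call(schema.properties, key)) {
      errors.push(`unexpected property: ${key}`);
      continue;
    }
    const rule = schema.properties[key];
    const value = args[key];
    if (rule.type === "string" && typeof value !== "string") errors.push(`${key} must be a string`);
    if (rule.type === "integer" && !(typeof value === "number" && Number.isInteger(value))) {
      errors.push(`${key} must be an integer`);
    }
    if (rule.enum && !rule.enum.some((allowed) => allowed === value)) {
      errors.push(`${key} must be one of: ${rule.enum.join(", ")}`);
    }
    if (typeof value === "number") {
      if (rule.minimum !== undefined && value < rule.minimum) errors.push(`${key} must be >= ${rule.minimum}`);
      if (rule.maximum !== undefined && value > rule.maximum) errors.push(`${key} must be <= ${rule.maximum}`);
    }
  }
  return errors;
}
```

Validating on the server as well as publishing the schema means a malformed call comes back with a specific message the model can correct, rather than failing somewhere deep in the tool.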

What the ecosystem looks like in mid-2026

MCP has become the standard. That sentence would have been premature 18 months ago, when there were real questions about whether the protocol would gain adoption beyond Anthropic’s own tools. It has. Google’s Gemini CLI supports it natively. OpenAI’s Codex CLI adopted it. Most major AI tooling products (Cursor, Windsurf, Continue) treat MCP servers as a first-class integration.

The interesting frontier now is composable MCP servers: servers that call other servers. This enables orchestration patterns where a high-level AI request triggers a chain of tool calls across multiple domains, without the orchestration logic living in the model context. This is where multi-agent systems and MCP are converging.

The other frontier is hosted MCP marketplaces. Cloudflare Workers MCP is the leading example: deploy an MCP server globally in minutes, with auth, rate limiting, and analytics included. The barrier to shipping a custom MCP server has dropped from “stand up a server” to “write the business logic.”

What to build next

If you are building AI products and your external integrations are still bespoke JSON schemas tied to a specific model’s format, you will hit a wall. Every one of those integrations will need to be refactored when you add a second model, move to a different deployment environment, or want that tool to be usable by other applications.

The migration path is well-defined. Pick the integration causing the most friction right now. Build one MCP server. Observe how much cleaner the architecture becomes.

Then continue.
