How to Build an MCP Server: A Practitioner's Development Guide

The Model Context Protocol has moved quickly from local demos into a common agent-integration pattern. Community servers, SDK usage, and client integrations keep broadening, but the exact scale depends on the source and snapshot date. Treat the direction as clear and the numbers as source-specific.

Here's the problem: production readiness remains uneven. Many writeups still describe a large share of the ecosystem as local, experimental, or lightly hardened rather than production-hardened. The gap between "it works on my machine with stdio" and "it handles concurrent requests behind a load balancer without leaking credentials" is where most MCP projects stall.

This guide covers the implementation decisions that matter once you're past hello world.

What You're Actually Building

Before touching code, clarify which MCP primitives your server needs. The current build-server documentation describes three, and they serve different purposes.

Tools are functions the model can call to take actions or compute results. They accept inputs, do something, and return outputs. A tool might query a database, call an API, run a calculation, or send a message. Tools are the most common primitive and the one most tutorials default to.

Resources expose data the model can read without triggering side effects. Think files, database rows, API responses formatted as context. Resources use URI addressing (file://, postgres://, github://) and are meant to be pulled, not called. If your tool is "run a query," your resource is "here's the query result as readable context."

Prompts are reusable templates that shape how the model interacts with your server's capabilities. A code-review prompt that bundles the right instructions with a resource reference is a prompt. These are underused and underappreciated: they're how you encode institutional knowledge into the server rather than relying on users to prompt correctly every time.

Most MCP servers in the wild implement only tools, ignore resources, and don't know prompts exist. That's fine for simple integrations. It's a missed opportunity for anything more complex.

SDK Choice: Python vs TypeScript

Both official SDKs are mature. Pick based on where your team lives.

The Python SDK (modelcontextprotocol.io/docs) ships a FastMCP interface as part of the core package. Define tools as normal functions with type hints and docstrings, and the framework infers JSON schemas from your annotations. This reduces boilerplate significantly compared to the lower-level API:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my-server")

@mcp.tool()
def get_customer(customer_id: str) -> dict:
    """Retrieve a customer record by ID."""
    # `db` stands in for your own data-access layer; it is not part of the SDK.
    return db.query("SELECT * FROM customers WHERE id = %s", customer_id)

The docstring becomes the tool description the model sees. The type hints become the input schema. Pydantic handles validation. If your business logic already lives in Python, wiring it to MCP is a few decorators.
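
To make the "type hints become the input schema" idea concrete, here is a minimal standard-library sketch of how a framework could derive a tool descriptor from a function signature. It is not the SDK's implementation (which relies on Pydantic); `describe_tool`, the type mapping, and the `include_orders` parameter are illustrative assumptions.

```python
import inspect
from typing import get_type_hints

# Illustrative subset of Python annotations mapped to JSON Schema type names.
_JSON_TYPES = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    dict: "object",
    list: "array",
}


def describe_tool(func):
    """Build a name/description/inputSchema descriptor from a plain function."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # the return annotation is not an input
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        annotation = hints.get(name)
        # Unmapped or missing annotations get an unconstrained schema.
        if annotation in _JSON_TYPES:
            properties[name] = {"type": _JSON_TYPES[annotation]}
        else:
            properties[name] = {}
        # Parameters without a default value are required inputs.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_customer(customer_id: str, include_orders: bool = False) -> dict:
    """Retrieve a customer record by ID."""
    return {}


descriptor = describe_tool(get_customer)
```

The descriptor is what the model ultimately sees, which is why a vague docstring or a missing annotation degrades tool use even when the function itself is correct.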

The TypeScript SDK is often used for MCP servers that live alongside Node.js services or front-end tooling. It provides full types for messages, tools, resources, and transports. Zod handles schema validation. For web-adjacent infrastructure (GitHub Actions, CI tooling, browser-accessible services), TypeScript fits naturally.

Nearform's production implementation guide identifies one consistent mistake across both SDKs: printing to stdout during stdio transport. When your server uses stdio, the protocol messages travel over stdin/stdout. Any debug logging, print statements, or console.log calls that reach stdout corrupt the message stream and break the client connection silently. Send all logs to stderr, a file, or a structured logging sink.
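
A minimal sketch of keeping stdout clean in a Python server, using only the standard `logging` module. The logger name `mcp_server` and the format string are assumptions; adapt them to your own conventions.

```python
import logging
import sys


def configure_logging(level=logging.INFO):
    """Route log records to stderr so stdout stays reserved for protocol messages."""
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger = logging.getLogger("mcp_server")  # hypothetical logger name
    logger.handlers.clear()   # drop handlers left over from earlier configuration
    logger.addHandler(handler)
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate output through the root logger
    return logger


log = configure_logging()
log.info("server starting")  # written to stderr, not stdout

# If a quick print is unavoidable, target stderr explicitly.
print("debug value", file=sys.stderr)
```

Third-party libraries that print to stdout on import or on error are a common source of the same corruption, so it is worth checking dependencies as well as your own code.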

Implementing Tools That Don't Break

Tool implementation looks simple until you consider error handling. The MCP spec draws a sharp line between two error categories:

Protocol errors indicate the MCP communication itself failed: malformed requests, schema violations, transport issues. These surface as error responses in the JSON-RPC layer.

Tool execution errors happen inside your tool's logic: a database connection fails, an API returns a 429, a file doesn't exist. These should not be protocol errors. Return them as valid CallToolResult objects with isError: true and a human-readable explanation of what went wrong.
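
As a rough illustration of the result shape described above, the sketch below builds tool results as plain dictionaries with a `content` list and an `isError` flag. The helper names, `CustomerStoreUnavailable`, and the `fetch` callable are hypothetical; how a given SDK maps return values or raised exceptions onto this shape varies, so check the SDK documentation for the version you use.

```python
import json


def text_result(payload):
    """Wrap a successful payload in a CallToolResult-shaped dict."""
    return {
        "content": [{"type": "text", "text": json.dumps(payload)}],
        "isError": False,
    }


def error_result(message):
    """Report a tool execution error the model can read and react to."""
    return {"content": [{"type": "text", "text": message}], "isError": True}


class CustomerStoreUnavailable(Exception):
    """Hypothetical error raised by the data layer when the backend is down."""


def lookup_customer(customer_id, fetch):
    """Look up a customer; `fetch` is a hypothetical callable returning a dict or None."""
    if not customer_id.startswith("cust_"):
        return error_result("Invalid customer_id format. Expected prefix: cust_")
    try:
        record = fetch(customer_id)
    except CustomerStoreUnavailable as exc:
        # A backend failure is a tool execution error, not a protocol error.
        return error_result(f"Customer store unavailable: {exc}")
    if record is None:
        return error_result(f"No customer found with id: {customer_id}")
    return text_result(record)
```

Every branch returns a well-formed result, so the session survives the failure and the model gets a message it can act on.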

Why does this distinction matter? Because the model sees tool execution errors and can reason about them. If you throw an unhandled exception and let it bubble up as a protocol error, the client breaks and the model has no opportunity to retry with different parameters, escalate, or fall back gracefully. If you return a structured error in the result, the model can read it, decide what to do, and keep the session alive.

@mcp.tool()
def get_customer(customer_id: str) -> dict:
    """Retrieve a customer record by ID."""
    # Validate the input before touching the database.
    if not customer_id or not customer_id.startswith("cust_"):
        return {"error": "Invalid customer_id format. Expected prefix: cust_"}
    try:
        # `db` and DatabaseConnectionError stand in for your own data layer.
        result = db.query("SELECT * FROM customers WHERE id = %s", customer_id)
        if not result:
            return {"error": f"No customer found with id: {customer_id}"}
        return result
    except DatabaseConnectionError as e:
        return {"error": f"Database unavailable: {e}"}

Input validation belongs in the tool, not left to the model. The MCP tools specification is explicit: validate lengths, formats, and ranges before executing any logic. For string inputs that touch file paths, database queries, or shell commands, sanitize before use. An MCP server with weak input validation is a prompt injection vector that extends into your infrastructure.
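
For file-path inputs, one common approach is to resolve the requested path and confirm it still sits under an allowed base directory. The sketch below assumes a hypothetical `resolve_inside` helper and an arbitrary 255-character limit; it illustrates the check and is not a complete sandbox.

```python
from pathlib import Path

MAX_PATH_LENGTH = 255  # arbitrary illustrative limit


def resolve_inside(base_dir, user_path):
    """Return the resolved path, or raise ValueError if it escapes base_dir."""
    if not isinstance(user_path, str) or not user_path:
        raise ValueError("path must be a non-empty string")
    if len(user_path) > MAX_PATH_LENGTH:
        raise ValueError("path is too long")
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so resolve first and
    # compare afterwards instead of trusting the join.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError("path escapes the allowed directory")
    return candidate
```

A tool would call `resolve_inside` before opening anything and turn the `ValueError` into a tool execution error, so a traversal attempt such as `../secret` is rejected with a readable message instead of reaching the filesystem.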

The official MCP Inspector (npx @modelcontextprotocol/inspector) gives you a UI to list all registered tools, fire test invocations, and inspect the raw JSON-RPC traffic. Run it before connecting a real model. It's faster than debugging through a chat interface, and it shows you exactly what schema descriptions the model will see.

Transport: The Decision That Actually Matters for Production

This is where most MCP tutorials silently mislead you.

stdio is the default and the simplest transport. Your server runs as a subprocess of the host application, reading from stdin and writing to stdout. Setup is zero: no ports, no auth, no networking. For local development and desktop tooling (Claude Desktop, Cursor, local IDE extensions), stdio works perfectly.

It also collapses under concurrent load. It's single-client by design. If you need more than one connection, stdio cannot help you.

SSE (Server-Sent Events) was the original HTTP-based transport. It's now deprecated by the MCP specification. Don't build new servers with SSE. Existing SSE servers should migrate.

Streamable HTTP is the transport for production remote MCP servers in 2026. The specification describes a single HTTP endpoint that handles both request-response and streaming patterns. Key properties:

Stateless-friendly: works behind standard load balancers without requiring sticky sessions or persistent connections
Session management via the Mcp-Session-Id header, assigned by the server in its initialization response and echoed by the client on subsequent requests
OAuth 2.1 authentication support built into the transport design
Horizontal scaling: deploy multiple instances; any instance can handle any request

The practical rule: use stdio for local development and desktop integrations, use Streamable HTTP for anything that needs to serve more than one client or run in a shared environment.
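
To show what travels over Streamable HTTP, here is a sketch that builds the headers and JSON-RPC body for a `tools/call` request without sending it. `build_tool_call` is a hypothetical helper, and the session ID is assumed to be the value the server returned during initialization. Other headers a real client may need, such as authorization or protocol-version headers, are left out.

```python
import json


def build_tool_call(session_id, tool_name, arguments, request_id):
    """Return (headers, body) for a tools/call POST to the MCP endpoint."""
    headers = {
        "Content-Type": "application/json",
        # The client accepts either a single JSON response or an SSE stream.
        "Accept": "application/json, text/event-stream",
    }
    if session_id is not None:
        # Echo the session ID issued by the server at initialization.
        headers["Mcp-Session-Id"] = session_id
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return headers, json.dumps(body)


headers, body = build_tool_call(
    "sess-123", "get_customer", {"customer_id": "cust_42"}, request_id=7
)
```

Because every request carries its own session header, any instance behind the load balancer can pick it up, provided session state lives somewhere all instances can reach.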

Authentication: The Security Debt in Plain Sight

Astrix's 2025 MCP security research found that many MCP servers require credentials, and a large share still rely on long-lived static secrets: API keys and personal access tokens stored insecurely and never rotated. OAuth usage remains the smaller slice of the ecosystem.

This isn't just a configuration problem. The original MCP protocol shipped without mandatory authentication. Early servers stored credentials in environment variables and config files. That pattern spread across thousands of community servers and is now deeply embedded in tutorials and documentation.

For any MCP server that moves beyond a developer's local machine, the current standards are clear:

Streamable HTTP transport: OAuth 2.1, as specified in the MCP authorization spec. Not API keys. Not basic auth.
Multi-tenant or enterprise deployments: token rotation, short-lived credentials, and audit logging for every tool invocation
Sensitive tools: human-in-the-loop approval flows before execution, not just model approval

The MCP specification now includes authorization requirements, and the Linux Foundation governance adds review processes for registered servers. But community servers don't retroactively inherit security updates because the spec changed. Treat any community MCP server that handles credentials as untrusted until you've audited it yourself.

For servers you build and control, the security checklist:

Validate and sanitize all inputs before executing any tool logic
Use Streamable HTTP + OAuth 2.1 for remote deployments
Log all tool invocations with parameters (redacted where sensitive)
Rate-limit tool calls per session to limit blast radius from compromised clients
Scope tool permissions to the minimum required; don't expose a "delete record" tool if the use case only requires reading
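
The per-session rate limit from the checklist above can be sketched as a sliding-window counter. `SessionRateLimiter` is a hypothetical in-memory helper that assumes a single process; a deployment with several instances would need shared storage.

```python
import time
from collections import defaultdict, deque


class SessionRateLimiter:
    """Allow at most max_calls per session within a sliding time window."""

    def __init__(self, max_calls, window_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._calls = defaultdict(deque)  # session_id -> call timestamps

    def allow(self, session_id):
        """Record the call and return True, or return False if over the limit."""
        now = self._clock()
        calls = self._calls[session_id]
        # Discard timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

A tool wrapper would call `allow(session_id)` before running and return a tool execution error when it comes back `False`, which gives the model a readable reason to back off.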

Testing Before You Connect a Model

Unit test the business logic first, separate from MCP. Your tool functions should be testable as plain Python or TypeScript functions. If a function is only testable through the protocol, the design has a problem.

For protocol-level testing, the MCP Inspector covers most needs: list capabilities, invoke tools with test inputs, inspect the raw messages. For integration testing, the official SDK includes test utilities that let you spin up a server and client in-process without a real transport.

Test the failure paths explicitly. What happens when your tool receives malformed input? When your database is down? When an API returns an unexpected response code? These are the paths where MCP servers surface bugs that happy-path testing never reaches.
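
A sketch of failure-path tests written against plain tool logic, with no MCP transport involved. `fetch_invoice`, its `inv_` prefix rule, and the `http_get` callable are hypothetical stand-ins for your own function and API client.

```python
import unittest


def fetch_invoice(invoice_id, http_get):
    """Hypothetical tool logic; http_get(path) returns (status_code, body_dict)."""
    if not isinstance(invoice_id, str) or not invoice_id.startswith("inv_"):
        return {"error": "Invalid invoice_id format. Expected prefix: inv_"}
    try:
        status, body = http_get(f"/invoices/{invoice_id}")
    except ConnectionError as exc:
        return {"error": f"Billing API unreachable: {exc}"}
    if status == 429:
        return {"error": "Billing API rate limit hit. Retry later."}
    if status != 200:
        return {"error": f"Billing API returned unexpected status {status}"}
    return body


class FetchInvoiceFailurePaths(unittest.TestCase):
    def test_malformed_input_never_reaches_the_api(self):
        def http_get(path):
            raise AssertionError("API must not be called for invalid input")

        self.assertIn("error", fetch_invoice(None, http_get))

    def test_backend_down_returns_structured_error(self):
        def http_get(path):
            raise ConnectionError("connection refused")

        self.assertIn("unreachable", fetch_invoice("inv_1", http_get)["error"])

    def test_unexpected_status_is_reported(self):
        result = fetch_invoice("inv_1", lambda path: (503, {}))
        self.assertIn("503", result["error"])

    def test_happy_path_returns_body(self):
        result = fetch_invoice("inv_1", lambda path: (200, {"total": 10}))
        self.assertEqual(result, {"total": 10})
```

Saved in a test module, this runs with `python -m unittest`. Injecting `http_get` keeps the tests fast and lets each failure mode be triggered deterministically.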

The existing articles on agent evals and agent reliability apply here: the model's behavior with your tools is only as predictable as your tool responses are consistent. An MCP server that sometimes returns structured errors and sometimes throws exceptions produces inconsistent model behavior that's genuinely hard to debug.

If your MCP server needs durable cross-session context, pair this testing layer with the memory design patterns in How Agent Memory Got an Architecture.

What to Build (and What to Configure Instead)

Before writing your own server, check the existing registry. The official servers repository includes production-ready implementations for filesystems, GitHub, Slack, Google Drive, PostgreSQL, and web search. Most teams that need these integrations are better served configuring a maintained server than forking and owning their own.

Write a custom MCP server when:

You're exposing internal APIs or proprietary data that don't have public implementations
You need business logic that can't be composed from existing server capabilities
You're building a specialized agent that needs tools shaped precisely for its workflow

When building custom, start with the smallest set of tools that covers the use case. More tools mean more surface area for the model to misuse, more validation logic to maintain, and more attack surface in production. The agent-protocol-comparison-2026 guide covers where MCP fits in the broader protocol stack if you're building systems that need agent-to-agent communication alongside tool access.

If those tools touch regulated or professional workflows, AI Agents in Legal is the reminder that tool access creates verification duties, not just integration convenience. If the workflow is fixed enough that the model only needs to classify, extract, or route, The Agent Project That Should Have Been One LLM Call may be the better architecture than a tool-heavy agent.

Practical Implications for Engineering Teams

The teams moving MCP from laptop to production successfully tend to share a few patterns.

They separate the MCP layer from the business layer. The MCP server handles protocol concerns: input parsing, schema validation, error formatting, transport authentication. The business logic lives underneath as ordinary functions that are independently testable.

They treat tool descriptions as user-facing documentation. The model reads your docstrings. If the tool description is vague ("processes the input"), the model will use it inconsistently. Precise descriptions that specify expected input formats, what the tool does and doesn't do, and what error conditions to expect produce substantially more predictable behavior.

They add observability from day one. Every tool invocation should produce a log entry: which tool, what parameters (sanitized), what result, how long it took. When the model makes an unexpected tool call, you need that data to diagnose whether the problem is in the model's reasoning, the tool's description, or the tool's response format. The observability principles in the production agent guide apply directly.
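
One way to get a log entry per invocation is a decorator around the tool function. The sketch below assumes tool arguments arrive as keyword arguments and that tools signal failures with an `error` key, as in the earlier examples. The logger name, the `SENSITIVE_KEYS` set, and the log fields are illustrative choices.

```python
import functools
import json
import logging
import time

logger = logging.getLogger("mcp_server.tools")  # hypothetical logger name
SENSITIVE_KEYS = {"password", "token", "api_key", "authorization"}


def redact(params):
    """Replace values of sensitive-looking keys before they reach the logs."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in params.items()
    }


def logged_tool(func):
    """Log tool name, redacted parameters, outcome, and duration for each call."""

    @functools.wraps(func)
    def wrapper(**kwargs):
        started = time.perf_counter()
        outcome = "ok"
        try:
            result = func(**kwargs)
            if isinstance(result, dict) and "error" in result:
                outcome = "tool_error"
            return result
        except Exception:
            outcome = "exception"
            raise
        finally:
            # Runs on success and failure, so every invocation leaves a record.
            entry = {
                "tool": func.__name__,
                "params": redact(kwargs),
                "outcome": outcome,
                "duration_ms": round((time.perf_counter() - started) * 1000, 2),
            }
            logger.info(json.dumps(entry, default=str))

    return wrapper
```

Emitting one JSON object per call keeps the entries easy to filter by tool or outcome when you are trying to work out why the model made an unexpected call.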

What's Coming in 2026

The 2026 MCP roadmap introduces two changes that affect server design:

Stateless HTTP transport will allow MCP servers to operate without session state entirely, which simplifies horizontal scaling substantially. Currently Streamable HTTP still benefits from session awareness for efficiency. The fully stateless variant removes that requirement.

The Tasks primitive introduces asynchronous, long-running operations. Today, a tool call is a synchronous request-response exchange: the client sends a request and waits for completion. Tasks let an agent dispatch a long job, receive a task ID, and poll for completion later. For MCP servers that wrap slow operations (batch data processing, multi-step workflows, ML inference jobs), this changes the design substantially. If you're building a server today that wraps a slow operation, architect the underlying logic to be pollable now, even if you expose it synchronously until Tasks ships broadly.
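
A sketch of what "pollable now, synchronous for the moment" could look like: an in-memory job store with `submit`, `poll`, and a blocking `wait` facade. This is not the Tasks primitive's API. `JobStore` and its status names are assumptions, and a real deployment would need durable storage instead of a process-local dictionary.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import wait as wait_for


class JobStore:
    """Run slow operations in the background and expose their status by ID."""

    def __init__(self, max_workers=4):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}  # job_id -> Future

    def submit(self, func, *args, **kwargs):
        """Start the job and return an ID the caller can poll with."""
        job_id = uuid.uuid4().hex
        self._futures[job_id] = self._executor.submit(func, *args, **kwargs)
        return job_id

    def poll(self, job_id):
        """Return the current status without blocking."""
        future = self._futures.get(job_id)
        if future is None:
            return {"status": "unknown"}
        if not future.done():
            return {"status": "running"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "completed", "result": future.result()}

    def wait(self, job_id, timeout=None):
        """Synchronous facade: block until the job finishes or times out, then poll."""
        future = self._futures.get(job_id)
        if future is not None:
            wait_for([future], timeout=timeout)
        return self.poll(job_id)
```

A synchronous tool can call `submit` followed by `wait` today. When an asynchronous interface becomes available, the same store can back a "start job" and a "check job" operation without changing the underlying logic.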

The ecosystem is moving toward Streamable HTTP + OAuth 2.1 + async Tasks, which is the design direction to plan for instead of the original stdio + environment-variable credentials pattern. The remaining problem is production readiness.

Where to Start

Read the official MCP build-server guide, choose your SDK, and implement one tool against one real internal system. Don't start with a demo. Start with something your team will actually use, because that's the path that surfaces the real decisions: what data goes in resources vs. tools, what error handling the model can actually use, what your authentication story looks like when someone other than you runs the server.

The protocol is mature. The tooling works. The gap between tutorial and production is mostly about the decisions this guide covers — not the code itself.
