title: "How to Build an MCP Server: A Practitioner's Development Guide"
slug: mcp-server-development-guide
date: 2026-05-02
type: guide
category: agent-design
subtopic: protocols
tags: [guides, agent-design, protocols, mcp, model-context-protocol, tools, production]
status: draft
excerpt: "78% of enterprise AI teams run at least one MCP-backed agent in production, but 86% of MCP servers still live on developer laptops. This guide covers everything between the hello-world tutorial and a server you'd actually trust in production."
The Model Context Protocol had 1,200 community servers in Q1 2025. By April 2026 that number hit 9,400. Ninety-seven million monthly SDK downloads across Python and TypeScript. First-class support in Claude, ChatGPT, Cursor, VS Code, and Microsoft Copilot. 78% of enterprise AI teams report at least one MCP-backed agent in production.
Here's the problem: 86% of those MCP servers run on developer laptops, and only 5% run in actual production environments. The gap between "it works on my machine with stdio" and "it handles concurrent requests behind a load balancer without leaking credentials" is where most MCP projects stall.
This guide covers the implementation decisions that matter once you're past hello world.
What You're Actually Building
Before touching code, clarify which MCP primitives your server needs. The protocol defines three, and they serve different purposes.
Tools are functions the model can call to take actions or compute results. They accept inputs, do something, and return outputs. A tool might query a database, call an API, run a calculation, or send a message. Tools are the most common primitive and the one most tutorials default to.
Resources expose data the model can read without triggering side effects. Think files, database rows, API responses formatted as context. Resources use URI addressing (file://, postgres://, github://) and are meant to be pulled, not called. If your tool is "run a query," your resource is "here's the query result as readable context."
Prompts are reusable templates that shape how the model interacts with your server's capabilities. A code-review prompt that bundles the right instructions with a resource reference is a prompt. These are underused and underappreciated: they're how you encode institutional knowledge into the server rather than relying on users to prompt correctly every time.
Most MCP servers in the wild implement only tools, ignore resources, and don't know prompts exist. That's fine for simple integrations. It's a missed opportunity for anything more complex.
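On the wire, each primitive maps to its own JSON-RPC method. The following is a minimal sketch of the three request shapes using only the standard library. The tool name, resource URI, and prompt name are made-up examples, and `make_request` is a hypothetical helper, not an SDK function:

```python
import itertools
import json

_ids = itertools.count(1)

def make_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope (hypothetical helper)."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}

# Tool: the model asks the server to run a function with arguments.
call_tool = make_request(
    "tools/call",
    {"name": "get_customer", "arguments": {"customer_id": "cust_123"}},
)

# Resource: the client reads data by URI, with no side effects expected.
read_resource = make_request(
    "resources/read",
    {"uri": "postgres://crm/customers/cust_123"},
)

# Prompt: the client fetches a reusable template, filled in with arguments.
get_prompt = make_request(
    "prompts/get",
    {"name": "code-review", "arguments": {"language": "python"}},
)

if __name__ == "__main__":
    for message in (call_tool, read_resource, get_prompt):
        print(json.dumps(message))
```

Seeing the three side by side makes the design question concrete: if a capability only returns data, it probably belongs behind `resources/read` rather than `tools/call`.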
SDK Choice: Python vs TypeScript
Both official SDKs are mature. Pick based on where your team lives.
Python SDK (modelcontextprotocol.io/docs) uses a FastMCP-style interface that is built into the core package. You define tools as ordinary functions with type hints and docstrings, and the framework infers JSON schemas from the annotations. This cuts boilerplate significantly compared with the lower-level API:
```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my-server")

@mcp.tool()
def get_customer(customer_id: str) -> dict:
    """Retrieve a customer record by ID."""
    # `db` stands in for your application's database client.
    return db.query("SELECT * FROM customers WHERE id = %s", customer_id)
```
The docstring becomes the tool description the model sees. The type hints become the input schema. Pydantic handles validation. If your business logic already lives in Python, wiring it to MCP is a few decorators.
TypeScript SDK is often used for MCP servers that live alongside Node.js services or front-end tooling. It provides full types for messages, tools, resources, and transports. Zod handles schema validation. For web-adjacent infrastructure (GitHub Actions, CI tooling, browser-accessible services), TypeScript fits naturally.
Nearform's production implementation guide identifies one consistent mistake across both SDKs: printing to stdout during stdio transport. When your server uses stdio, the protocol messages travel over stdin/stdout. Any debug logging, print statements, or console.log calls that reach stdout corrupt the message stream and break the client connection silently. Send all logs to stderr, a file, or a structured logging sink.
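One way to make the stderr rule hard to break is to hand out a pre-configured logger instead of letting each module call `print`. This is a minimal sketch using the standard `logging` module; the helper name `make_stderr_logger` and the logger name are assumptions:

```python
import logging
import sys

def make_stderr_logger(name: str = "mcp-server") -> logging.Logger:
    """Return a logger that writes only to stderr, never stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Don't pass records up to root handlers, which may point at stdout.
    logger.propagate = False
    if not logger.handlers:  # avoid stacking duplicate handlers on repeat calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger

log = make_stderr_logger()
# Goes to stderr; stdout stays reserved for protocol messages.
log.info("server starting")
```

Third-party libraries that print to stdout on their own are not covered by this; those need to be silenced or redirected separately.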
Implementing Tools That Don't Break
Tool implementation looks simple until you consider error handling. The MCP spec draws a sharp line between two error categories:
Protocol errors indicate the MCP communication itself failed: malformed requests, schema violations, transport issues. These surface as error responses in the JSON-RPC layer.
Tool execution errors happen inside your tool's logic: a database connection fails, an API returns a 429, a file doesn't exist. These should not be protocol errors. Return them as valid CallToolResult objects with isError: true and a human-readable explanation of what went wrong.
Why does this distinction matter? Because the model sees tool execution errors and can reason about them. If you throw an unhandled exception and let it bubble up as a protocol error, the client breaks and the model has no opportunity to retry with different parameters, escalate, or fall back gracefully. If you return a structured error in the result, the model can read it, decide what to do, and keep the session alive.
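```python
@mcp.tool()
def get_customer(customer_id: str) -> dict:
    """Retrieve a customer record by ID."""
    # Note: a plain dict is delivered as an ordinary result. If the client must
    # see isError: true, set it through whatever mechanism your SDK provides.
    # Validate before touching the database.
    if not customer_id or not customer_id.startswith("cust_"):
        return {"error": "Invalid customer_id format. Expected prefix: cust_"}
    try:
        result = db.query("SELECT * FROM customers WHERE id = %s", customer_id)
        if not result:
            return {"error": f"No customer found with id: {customer_id}"}
        return result
    except DatabaseConnectionError as e:
        # Report the outage as data the model can read instead of crashing the call.
        return {"error": f"Database unavailable: {e}"}
```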
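For reference, the two error categories described above look different on the wire. This sketch shows both response shapes as plain dictionaries. `tool_error_result` and `protocol_error` are hypothetical helpers for illustration; in practice the SDK builds these objects for you:

```python
def tool_error_result(request_id: int, message: str) -> dict:
    """Tool execution error: a normal JSON-RPC response whose result is flagged isError."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "result": {
            "content": [{"type": "text", "text": message}],
            "isError": True,
        },
    }

def protocol_error(request_id: int, code: int, message: str) -> dict:
    """Protocol error: a JSON-RPC error object, with no result for the model to read."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": code, "message": message},
    }

# The model can read this one and decide what to do next.
recoverable = tool_error_result(7, "No customer found with id: cust_999")

# -32602 is the standard JSON-RPC "Invalid params" code.
fatal = protocol_error(8, -32602, "Unknown tool: get_custmer")
```

The first shape carries a `result`, so it reaches the model as content. The second carries only `error`, which the client handles before the model ever sees it.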
Input validation belongs in the tool, not left to the model. The MCP tools specification is explicit: validate lengths, formats, and ranges before executing any logic. For string inputs that touch file paths, database queries, or shell commands, sanitize before use. An MCP server with weak input validation is a prompt injection vector that extends into your infrastructure.
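A minimal sketch of what "validate before executing" can look like for two common input types, an identifier and a file path. The ID pattern, the base directory, and both function names are assumptions chosen for illustration:

```python
import re
from pathlib import Path

CUSTOMER_ID = re.compile(r"cust_[A-Za-z0-9]{1,32}")  # assumed ID format
BASE_DIR = Path("/srv/mcp-data")  # hypothetical directory the server may read

def validate_customer_id(value: str) -> str:
    """Accept only IDs that fully match the expected pattern."""
    if not isinstance(value, str) or not CUSTOMER_ID.fullmatch(value):
        raise ValueError("customer_id must match cust_<1-32 alphanumerics>")
    return value

def resolve_safe_path(relative: str, base: Path = BASE_DIR) -> Path:
    """Resolve a user-supplied path and refuse anything outside base."""
    base_resolved = base.resolve()
    candidate = (base_resolved / relative).resolve()
    # Catches "../" traversal and absolute paths alike (Python 3.9+).
    if not candidate.is_relative_to(base_resolved):
        raise ValueError("path escapes the allowed directory")
    return candidate
```

Allow-listing with `fullmatch` is deliberately stricter than checking a prefix: a value such as `cust_1; DROP TABLE customers` starts with the right prefix but fails the full pattern. Parameterized queries are still needed on top of this.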
The official MCP Inspector (npx @modelcontextprotocol/inspector) gives you a UI to list all registered tools, fire test invocations, and inspect the raw JSON-RPC traffic. Run it before connecting a real model. It's faster than debugging through a chat interface, and it shows you exactly what schema descriptions the model will see.
Transport: The Decision That Actually Matters for Production
This is where most MCP tutorials silently mislead you.
stdio is the default and the simplest transport. Your server runs as a subprocess of the host application, reading from stdin and writing to stdout. Setup is zero: no ports, no auth, no networking. For local development and desktop tooling (Claude Desktop, Cursor, local IDE extensions), stdio works perfectly.
It also collapses under any concurrent load. Production testing found 20 of 22 requests failed with just 20 simultaneous connections. It's single-client by design. If you need more than one connection, stdio cannot help you.
SSE (Server-Sent Events) was the original HTTP-based transport. It's now deprecated by the MCP specification. Don't build new servers with SSE. Existing SSE servers should migrate.
Streamable HTTP is the transport for production remote MCP servers in 2026. The specification describes a single HTTP endpoint that handles both request-response and streaming patterns. Key properties:
- Stateless-friendly: works behind standard load balancers without requiring sticky sessions or persistent connections
- Session management via the Mcp-Session-Id header, assigned by the server at initialization and sent back by the client on every subsequent request
- OAuth 2.1 authentication support built into the transport design
- Horizontal scaling: deploy multiple instances; any instance can handle any request
The practical rule: use stdio for local development and desktop integrations, use Streamable HTTP for anything that needs to serve more than one client or run in a shared environment.
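To make the session mechanics concrete, here is a sketch of the bookkeeping a Streamable HTTP server might do around the Mcp-Session-Id header, independent of any web framework. `SessionRegistry` is a hypothetical class, the in-memory set is a simplification, and the status codes reflect the spec's recommendations as commonly read, so verify them against the spec revision you target:

```python
import secrets

class SessionRegistry:
    """In-memory session table. A multi-instance deployment needs a shared store."""

    def __init__(self) -> None:
        self._sessions: set[str] = set()

    def initialize(self) -> dict[str, str]:
        """Create a session and return the header to attach to the initialize response."""
        session_id = secrets.token_urlsafe(32)  # unguessable, visible ASCII only
        self._sessions.add(session_id)
        return {"Mcp-Session-Id": session_id}

    def check(self, headers: dict[str, str]) -> int:
        """Return the HTTP status a non-initialize request should receive.

        Assumes header names were already normalized to this exact casing.
        """
        session_id = headers.get("Mcp-Session-Id")
        if session_id is None:
            return 400  # header missing
        if session_id not in self._sessions:
            return 404  # unknown or terminated: client should re-initialize
        return 200

    def terminate(self, session_id: str) -> None:
        """Forget a session; later requests with this ID get 404."""
        self._sessions.discard(session_id)
```

Swapping the set for a shared store such as a database or cache is what lets any instance behind the load balancer validate a session that another instance created.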
Authentication: The Security Debt in Plain Sight
Astrix's 2025 MCP security research found that 88% of MCP servers require credentials, but 53% rely on long-lived static secrets: API keys and personal access tokens stored insecurely and never rotated. Only 8.5% use OAuth.
This isn't just a configuration problem. The original MCP protocol shipped without mandatory authentication. Early servers stored credentials in environment variables and config files. That pattern spread across thousands of community servers and is now deeply embedded in tutorials and documentation.
For any MCP server that moves beyond a developer's local machine, the current standards are clear:
- Streamable HTTP transport: OAuth 2.1, as specified in the MCP authorization spec. Not API keys. Not basic auth.
- Multi-tenant or enterprise deployments: token rotation, short-lived credentials, and audit logging for every tool invocation
- Sensitive tools: human-in-the-loop approval flows before execution, not just model approval
The MCP specification now includes authorization requirements, and the Linux Foundation governance adds review processes for registered servers. But 9,400 community servers don't retroactively inherit security updates because the spec changed. Treat any community MCP server that handles credentials as untrusted until you've audited it yourself.
For servers you build and control, the security checklist:
- Validate and sanitize all inputs before executing any tool logic
- Use Streamable HTTP + OAuth 2.1 for remote deployments
- Log all tool invocations with parameters (redacted where sensitive)
- Rate-limit tool calls per session to limit blast radius from compromised clients
- Scope tool permissions to the minimum required; don't expose a "delete record" tool if the use case only requires reading
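The rate-limiting item in the checklist can be sketched as a small sliding-window limiter keyed by session ID. The class name, default limits, and injectable clock are assumptions; a production deployment with multiple instances would keep the counters in a shared store:

```python
import time
from collections import defaultdict, deque

class SessionRateLimiter:
    """Sliding-window limiter: at most max_calls per session within window_seconds."""

    def __init__(self, max_calls: int = 30, window_seconds: float = 60.0,
                 clock=time.monotonic) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._calls: dict[str, deque] = defaultdict(deque)

    def allow(self, session_id: str) -> bool:
        """Record the call and return True if it is within the limit."""
        now = self._clock()
        calls = self._calls[session_id]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

When `allow` returns False, returning a tool execution error with a "rate limit exceeded" message keeps the session alive and lets the model back off, rather than dropping the connection.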
Testing Before You Connect a Model
Unit test the business logic first, separate from MCP. Your tool functions should be testable as plain Python or TypeScript functions. If a function is only testable through the protocol, the design has a problem.
For protocol-level testing, the MCP Inspector covers most needs: list capabilities, invoke tools with test inputs, inspect the raw messages. For integration testing, the official SDK includes test utilities that let you spin up a server and client in-process without a real transport.
One test worth running explicitly: what happens when your tool receives malformed input? What happens when your database is down? What happens when an API returns an unexpected response code? These failure paths are where MCP servers tend to surface bugs that wouldn't appear in happy-path testing.
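Those three questions translate directly into unit tests against the plain function. The sketch below assumes the tool logic takes its database client as a parameter so a test double can be injected; `FakeDb` and the local `DatabaseConnectionError` are stand-ins, not real driver classes:

```python
import unittest

class DatabaseConnectionError(Exception):
    """Stand-in for the database driver's connection error."""

def get_customer(customer_id: str, db) -> dict:
    """Plain function under test, following the error-handling pattern shown earlier."""
    if not customer_id or not customer_id.startswith("cust_"):
        return {"error": "Invalid customer_id format. Expected prefix: cust_"}
    try:
        result = db.query("SELECT * FROM customers WHERE id = %s", customer_id)
    except DatabaseConnectionError as e:
        return {"error": f"Database unavailable: {e}"}
    if not result:
        return {"error": f"No customer found with id: {customer_id}"}
    return result

class FakeDb:
    """Test double that returns canned rows or simulates an outage."""

    def __init__(self, rows=None, down=False):
        self.rows = rows or {}
        self.down = down

    def query(self, sql, customer_id):
        if self.down:
            raise DatabaseConnectionError("connection refused")
        return self.rows.get(customer_id)

class GetCustomerFailurePaths(unittest.TestCase):
    def test_malformed_id(self):
        self.assertIn("Invalid customer_id", get_customer("42", FakeDb())["error"])

    def test_missing_customer(self):
        self.assertIn("No customer found", get_customer("cust_9", FakeDb())["error"])

    def test_database_down(self):
        result = get_customer("cust_1", FakeDb(down=True))
        self.assertIn("Database unavailable", result["error"])

    def test_happy_path(self):
        db = FakeDb(rows={"cust_1": {"id": "cust_1", "name": "Ada"}})
        self.assertEqual(get_customer("cust_1", db)["name"], "Ada")

if __name__ == "__main__":
    unittest.main()
```

None of these tests touch the protocol, which is the point: the failure paths are verified before any transport or model is involved.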
The existing articles on agent evals and agent reliability apply here: the model's behavior with your tools is only as predictable as your tool responses are consistent. An MCP server that sometimes returns structured errors and sometimes throws exceptions produces inconsistent model behavior that's genuinely hard to debug.
What to Build (and What to Configure Instead)
Before writing your own server, check the existing registry. The official servers repository includes production-ready implementations for filesystems, GitHub, Slack, Google Drive, PostgreSQL, and web search. Most teams that need these integrations are better served configuring a maintained server than forking and owning their own.
Write a custom MCP server when:
- You're exposing internal APIs or proprietary data that don't have public implementations
- You need business logic that can't be composed from existing server capabilities
- You're building a specialized agent that needs tools shaped precisely for its workflow
When building custom, start with the smallest set of tools that covers the use case. More tools mean more surface area for the model to misuse, more validation logic to maintain, and more attack surface in production. The agent-protocol-comparison-2026 guide covers where MCP fits in the broader protocol stack if you're building systems that need agent-to-agent communication alongside tool access.
Practical Implications for Engineering Teams
The teams moving MCP from laptop to production successfully tend to share a few patterns.
They separate the MCP layer from the business layer. The MCP server handles protocol concerns: input parsing, schema validation, error formatting, transport authentication. The business logic lives underneath as ordinary functions that are independently testable.
They treat tool descriptions as user-facing documentation. The model reads your docstrings. If the tool description is vague ("processes the input"), the model will use it inconsistently. Precise descriptions that specify expected input formats, what the tool does and doesn't do, and what error conditions to expect produce substantially more predictable behavior.
They add observability from day one. Every tool invocation should produce a log entry: which tool, what parameters (sanitized), what result, how long it took. When the model makes an unexpected tool call, you need that data to diagnose whether the problem is in the model's reasoning, the tool's description, or the tool's response format. The observability principles in the production agent guide apply directly.
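A decorator is one lightweight way to get that log entry for every invocation. This sketch logs the tool name, redacted parameters, outcome, and duration as one JSON line on stderr. The sensitive key names, the `audited` decorator, and the `lookup_order` tool are all hypothetical, and the wrapper only handles keyword arguments for simplicity:

```python
import functools
import json
import logging
import sys
import time

logger = logging.getLogger("mcp.audit")
logger.addHandler(logging.StreamHandler(sys.stderr))  # keep stdout clean for stdio
logger.setLevel(logging.INFO)

SENSITIVE_KEYS = {"password", "token", "api_key"}  # assumed names; adjust to your schema

def redact(params: dict) -> dict:
    """Replace values of sensitive keys before they reach the log."""
    return {k: "***" if k in SENSITIVE_KEYS else v for k, v in params.items()}

def audited(func):
    """Log tool name, redacted keyword arguments, outcome, and duration."""
    @functools.wraps(func)
    def wrapper(**kwargs):
        start = time.perf_counter()
        outcome = "ok"
        try:
            return func(**kwargs)
        except Exception:
            outcome = "exception"
            raise
        finally:
            logger.info(json.dumps({
                "tool": func.__name__,
                "params": redact(kwargs),
                "outcome": outcome,
                "duration_ms": round((time.perf_counter() - start) * 1000, 2),
            }, default=str))
    return wrapper

@audited
def lookup_order(order_id: str, api_key: str) -> dict:
    """Hypothetical tool used only to demonstrate the decorator."""
    return {"order_id": order_id, "status": "shipped"}
```

Because `functools.wraps` preserves the function's name, docstring, and signature, the decorated function can still be registered as a tool with its description intact.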
What's Coming in 2026
The 2026 MCP roadmap introduces two changes that affect server design:
Stateless HTTP transport will allow MCP servers to operate without session state entirely, which simplifies horizontal scaling substantially. Currently Streamable HTTP still benefits from session awareness for efficiency. The fully stateless variant removes that requirement.
The Tasks primitive introduces asynchronous, long-running operations. Currently, all MCP interactions are synchronous request-response: the client sends a request and waits for completion. Tasks let an agent dispatch a long job, receive a task ID, and poll for completion later. For MCP servers that wrap slow operations (batch data processing, multi-step workflows, ML inference jobs), this changes the design substantially. If you're building a server today that wraps a slow operation, architect the underlying logic to be pollable now, even if you expose it synchronously until Tasks ships broadly.
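"Pollable now, synchronous for the moment" might look like the sketch below: slow work runs in a background pool behind a start/poll interface, with a blocking facade for today's tools. `JobRegistry` and its status strings are assumptions made for illustration and are not the Tasks API itself:

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor, wait

class JobRegistry:
    """Start slow work in the background and let callers poll by ID."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: dict[str, Future] = {}

    def start(self, func, *args) -> str:
        """Submit the work and return an ID the caller can poll with."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(func, *args)
        return job_id

    def poll(self, job_id: str) -> dict:
        """Report the job's current state without blocking."""
        future = self._jobs.get(job_id)
        if future is None:
            return {"status": "unknown"}
        if not future.done():
            return {"status": "working"}
        if future.exception() is not None:
            return {"status": "failed", "error": str(future.exception())}
        return {"status": "completed", "result": future.result()}

    def run_sync(self, func, *args) -> dict:
        """Synchronous facade for today's tools: start, block, then report."""
        job_id = self.start(func, *args)
        wait([self._jobs[job_id]])  # blocks until done, without raising
        return self.poll(job_id)
```

A tool exposed today would call `run_sync`; once asynchronous operations are available to your clients, the same logic can be surfaced through `start` and `poll` without rewriting the job itself.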
The ecosystem's move to Streamable HTTP + OAuth 2.1 + async Tasks represents a significant maturation from the original stdio + environment-variable credentials pattern. The adoption numbers suggest the protocol won the standards war. The production readiness gap (only 5% of servers in actual production) is the next problem to solve.
Where to Start
Read the official MCP build-server guide, choose your SDK, and implement one tool against one real internal system. Don't start with a demo. Start with something your team will actually use, because that's the path that surfaces the real decisions: what data goes in resources vs. tools, what error handling the model can actually use, what your authentication story looks like when someone other than you runs the server.
The protocol is mature. The tooling works. The gap between tutorial and production is mostly about the decisions this guide covers, not the code itself.
Related: Model Context Protocol Explained · MCP vs A2A vs ACP: Protocol Comparison · The Protocol Wars Are Ending · Deploying AI Agents to Production · Types of AI Agents