The MCP Problem
The Model Context Protocol (MCP) answers one question: how do you give an LLM safe, structured access to external systems?
Traditional approaches:
- RAG pipelines: Complex, brittle chunking, variable retrieval quality
- Function calling: Hand-maintained JSON schemas, per-provider differences
- Hardcoded integrations: Not reusable, tight coupling
MCP introduces a standard protocol where:
- An LLM can discover available tools
- Tools declaratively define their interface (JSON Schema)
- Clients are universal - one MCP server works in Claude Desktop, OpenClaw, and custom tooling
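Concretely, a declared tool is little more than a name, a description, and a JSON Schema for its inputs. A minimal sketch of such a declaration (the `get_weather` tool is a hypothetical example, not one of mine):

```python
# What an MCP server advertises during tool discovery: a name, a
# human-readable description, and a JSON Schema for the inputs.
# "get_weather" is a hypothetical example tool.
weather_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
        },
        "required": ["city"],
    },
}

# Any MCP client can read this declaration and know how to call the
# tool - no server-specific glue code needed.
print(weather_tool["inputSchema"]["required"])  # → ['city']
```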
This article shows how I built two production MCP servers and several OpenClaw skills, what works, and what doesn’t.
Architecture Principles
1. Tools Are Dumb, LLMs Are Smart
Wrong thinking:
```python
# ❌ BAD: Smart tool doing too much
@tool("smart_search")
def smart_search(query: str, auto_filter: bool = True):
    # Parse natural language query
    # Detect intent (person? location? date range?)
    # Auto-apply filters
    # Sort results
    # Format response
    return formatted_results
```
This is tempting - one tool that “does everything”. But:
- Debugging nightmare: When it fails, where?
- Black box: LLM can’t show reasoning about tool selection
- Inflexible: Auto-filtering can’t be overridden
- Not composable: LLM can’t combine different strategies
Correct thinking:
```python
# ✅ GOOD: Granular, single-purpose tools
@tool("search_by_person")
def search_by_person(person_name: str, limit: int = 20):
    """Search entries mentioning a specific person."""
    return database.query(persons=person_name).limit(limit)

@tool("search_by_date_range")
def search_by_date_range(date_start: str, date_end: str):
    """Filter entries within date range (ISO format)."""
    return database.query(date_range=(date_start, date_end))

@tool("search_semantic")
def search_semantic(query: str, threshold: float = 0.2):
    """Semantic search using embeddings."""
    embedding = generate_embedding(query)
    return database.vector_search(embedding, threshold)
```
Now the LLM can orchestrate:
```
User: "Show entries from last week with Alice"

LLM reasoning:
1. "Last week" → calculate date_start="2025-02-10", date_end="2025-02-16"
2. Call: search_by_date_range(date_start="2025-02-10", date_end="2025-02-16")
3. Filter results → Call: search_by_person(person_name="Alice")
4. Present combined results
```
The intelligence is in the LLM, not the tool. This makes debugging simple - you see exactly which tool calls were made with which parameters.
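The combination in step 3 happens inside the model's context, but the mechanics are simple enough to sketch: intersect the two result sets by entry id (the `id` field and result shape here are illustrative assumptions):

```python
def combine_results(date_hits: list, person_hits: list) -> list:
    """Keep only entries that appear in both result sets (matched by id)."""
    person_ids = {entry["id"] for entry in person_hits}
    return [entry for entry in date_hits if entry["id"] in person_ids]

# Illustrative result shapes:
date_hits = [{"id": 1, "date": "2025-02-10"}, {"id": 2, "date": "2025-02-12"}]
person_hits = [{"id": 2, "persons": ["Alice"]}]

print(combine_results(date_hits, person_hits))  # → [{'id': 2, 'date': '2025-02-12'}]
```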
2. Permission Model: Read vs Write
Not all tools are equally safe. My permission model:
```markdown
## 🟢 Read-Only Operations (Heartbeat Safe)
- Query databases
- Check system status
- Monitor metrics
→ LLM may use autonomously during background checks

## 🟡 Workspace Operations (No Approval Needed)
- Create files in workspace
- Read code repositories
- Create Git branches (feature branches only)
→ LLM may use, audit trail is maintained

## 🔴 High-Impact Operations (ALWAYS Ask First)
- Send communications (email, notifications)
- Modify calendar events
- Control smart home devices
- Create pull requests
- Trigger pipelines
→ LLM MUST request explicit approval
```
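In my setup these tiers live in documentation and the agent's identity prompt, but nothing stops you from enforcing them server-side as well. A minimal sketch (the tier mapping and gate function are hypothetical, not part of my servers):

```python
# Hypothetical server-side enforcement of the three permission tiers.
TOOL_TIERS = {
    "search_by_date_range": "read",      # 🟢 heartbeat safe
    "create_branch": "workspace",        # 🟡 audited, no approval needed
    "send_email": "high_impact",         # 🔴 explicit approval required
}

def is_allowed(tool_name: str, approved: bool = False) -> bool:
    """Gate a tool call on its tier; unknown tools default to strictest."""
    tier = TOOL_TIERS.get(tool_name, "high_impact")
    if tier == "high_impact":
        return approved
    return True

assert is_allowed("search_by_date_range")      # read: always fine
assert not is_allowed("send_email")            # high-impact: blocked by default
assert is_allowed("send_email", approved=True) # ...unless explicitly approved
```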
3. Auto-Generation: Single Source of Truth
Problem: APIs change. Hand-maintained MCP tools drift out of sync.
Solution: Generate tools automatically from existing API definition.
Case Study 1: Home API MCP Server (Node.js)
My home automation API has 60+ endpoints defined as route files. I built auto-generation to keep MCP tools in sync.
Generator Script
generate-tools.js scans all route files and generates MCP tools:
```javascript
const fs = require('fs');
const path = require('path');

// scanDirectory() and extractJoiSchema() helpers omitted for brevity.
function generateTools() {
  const routesDir = path.join(__dirname, '../routes');
  const tools = [];
  const routes = scanDirectory(routesDir);

  routes.forEach(route => {
    const tool = {
      name: `api:${route.category}_${route.name}`,
      description: route.description || 'No description',
      inputSchema: {
        type: 'object',
        properties: {},
        required: []
      },
      api: {
        method: route.method,
        path: route.path,
        queryParams: [],
        bodyParams: []
      }
    };

    // Extract parameters from Joi validators
    if (route.validators?.query) {
      const querySchema = extractJoiSchema(route.validators.query);
      Object.assign(tool.inputSchema.properties, querySchema.properties);
      tool.api.queryParams = Object.keys(querySchema.properties);
      tool.inputSchema.required.push(...querySchema.required);
    }

    if (route.validators?.body) {
      const bodySchema = extractJoiSchema(route.validators.body);
      Object.assign(tool.inputSchema.properties, bodySchema.properties);
      tool.api.bodyParams = Object.keys(bodySchema.properties);
      tool.inputSchema.required.push(...bodySchema.required);
    }

    tools.push(tool);
  });

  fs.writeFileSync(
    path.join(__dirname, 'generated-tools.json'),
    JSON.stringify(tools, null, 2)
  );
  console.log(`✅ Generated ${tools.length} tools`);
}

generateTools();
```
Workflow
```bash
# 1. Add or modify route
vim ../routes/camper/getLocation.js

# 2. Regenerate tools
npm run generate

# 3. Restart MCP server (restart OpenClaw)
# Tools are now available!
```
Key benefit: Route definition is single source of truth. MCP tools cannot drift from API spec.
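At runtime the server maps each generated tool entry back to an HTTP request, splitting the LLM's arguments into query and body parameters according to the generated spec. Sketched here in Python for brevity (the real server is Node.js, and `build_request` is an illustrative name):

```python
def build_request(tool: dict, arguments: dict, base_url: str) -> dict:
    """Split tool-call arguments into query/body per the generated spec."""
    api = tool["api"]
    query = {k: v for k, v in arguments.items() if k in api["queryParams"]}
    body = {k: v for k, v in arguments.items() if k in api["bodyParams"]}
    return {
        "method": api["method"],
        "url": base_url + api["path"],
        "params": query,
        "json": body or None,
    }

# One entry from generated-tools.json (shape as produced by the generator):
tool = {
    "name": "api:camper_getLocation",
    "api": {"method": "GET", "path": "/camper/location",
            "queryParams": ["format"], "bodyParams": []},
}
req = build_request(tool, {"format": "geojson"}, "http://localhost:3000")
# → GET http://localhost:3000/camper/location?format=geojson
```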
Case Study 2: Ask-My-Day MCP Server (Python)
My journal system with Strava integration. Data pipeline:
```
journal-processor ──→ Supabase (pgvector) ←── strava-sync
                           ↕
                      MCP Server
                 (13 granular tools)
```
Database Schema
```sql
CREATE TABLE personal (
  id UUID PRIMARY KEY,
  source TEXT NOT NULL,      -- 'journal' or 'strava'
  date DATE NOT NULL,
  title TEXT,
  content TEXT,
  embedding VECTOR(4096),    -- Qwen3 embeddings
  metadata JSONB
);

-- PostgreSQL has no inline INDEX clause; indexes are created separately:
CREATE INDEX idx_date ON personal(date);
CREATE INDEX idx_source ON personal(source);

-- Note: pgvector's ivfflat index supports at most 2,000 dimensions
-- (4,000 with halfvec), so 4096-dim embeddings need dimensionality
-- reduction before they can be indexed this way.
CREATE INDEX idx_embedding ON personal
  USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```
Tool Definitions
13 tools across categories:
```python
from mcp.server import Server
from mcp.types import Tool

tools = [
    Tool(
        name="search_semantic",
        description="""Semantic search using embeddings.

**When to use:**
- "What are my best memories?"
- "Show inspiring moments"
- "Search entries about travel"

**Returns:** Most semantically similar entries.""",
        inputSchema={
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "limit": {"type": "number", "default": 20},
                "threshold": {"type": "number", "default": 0.2}
            },
            "required": ["query"]
        }
    ),
    Tool(
        name="search_by_date_range",
        description="Search entries within date range (ISO format: YYYY-MM-DD)",
        inputSchema={
            "type": "object",
            "properties": {
                "date_start": {"type": "string"},
                "date_end": {"type": "string"},
                "source": {"type": "string", "enum": ["journal", "strava"]},
                "limit": {"type": "number", "default": 50}
            },
            "required": ["date_start", "date_end"]
        }
    ),
    # ... 11 more tools
]
```
Implementation
```python
import os
from supabase import create_client, Client
from mcp.server import Server
from mcp.server.stdio import stdio_server

supabase: Client = create_client(
    os.getenv("SUPABASE_URL"),
    os.getenv("SUPABASE_KEY")
)

server = Server("ask-my-day")

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "search_semantic":
        embedding = await generate_embedding(arguments["query"])
        response = supabase.rpc(
            'search_personal_semantic',
            {
                'query_embedding': embedding,
                'match_threshold': arguments.get("threshold", 0.2),
                'match_count': arguments.get("limit", 20)
            }
        ).execute()
        return format_results(response.data)

    elif name == "search_by_date_range":
        query = supabase.table('personal') \
            .select('*') \
            .gte('date', arguments["date_start"]) \
            .lte('date', arguments["date_end"]) \
            .order('date', desc=True) \
            .limit(arguments.get("limit", 50))
        if arguments.get("source"):
            query = query.eq('source', arguments["source"])
        response = query.execute()
        return format_results(response.data)
```
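`format_results` is referenced but not shown. In keeping with the structured-data principle, a minimal version just prunes columns the LLM does not need - above all the raw embedding vector (field names assumed from the schema above):

```python
def format_results(rows: list) -> list:
    """Return structured entries; drop the raw embedding vector, which is
    large and useless inside the LLM's context window."""
    keep = ("id", "source", "date", "title", "content", "metadata")
    return [{k: row[k] for k in keep if k in row} for row in rows]

# Illustrative row as returned by Supabase:
rows = [{"id": "a1", "source": "journal", "date": "2025-02-12",
         "title": "Hike", "content": "Long day out.",
         "embedding": [0.1] * 4096}]

print(sorted(format_results(rows)[0]))  # no 'embedding' key in the output
```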
Case Study 3: OpenClaw Skills (Not MCP)
Google Calendar Skill (gog)
```bash
# Read-only operations
gog calendar events

# High-impact operations (requires approval)
gog calendar create --title "Meeting" --start "2025-02-18 10:00"
```
Implementation: Shell wrapper around Google Calendar API
Zoho Mail Skill
```bash
# Python script: zoho-email.py
python3 zoho-email.py --api-mode rest unread              # 🟢 Read
python3 zoho-email.py --api-mode rest search "keyword"    # 🟢 Read
python3 zoho-email.py --api-mode rest send-html           # 🔴 Requires approval
```
Philips Hue Skill (openhue)
```bash
openhue get room "Living Room"     # 🟢 Read status
openhue set room "Living Room" on  # 🔴 Requires approval
```
Brave Search & SearXNG
Brave Search:
- Configured in the `openclaw.json` skills section
- API key for web search integration
SearXNG:
- Self-hosted metasearch engine
- Privacy-focused, runs locally
- Custom skill wrapper for search queries
Performance Optimizations
1. Embedding Caching
Embedding generation is the bottleneck (~200ms). Cache frequent queries:
```python
# functools.lru_cache does not work on async functions - it would cache
# the coroutine object, which can only be awaited once. Use a plain
# dict cache instead:
_embedding_cache: dict = {}

async def generate_embedding_cached(text: str):
    if text not in _embedding_cache:
        _embedding_cache[text] = await generate_embedding(text)
    return _embedding_cache[text]
```
2. Index Strategy
```sql
-- IVFFlat index for vector search
CREATE INDEX idx_personal_embedding ON personal
  USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- B-tree indexes for metadata filters
CREATE INDEX idx_personal_date ON personal(date);
CREATE INDEX idx_personal_source ON personal(source);

-- GIN index for JSONB metadata
CREATE INDEX idx_personal_metadata ON personal USING GIN (metadata jsonb_path_ops);
```
Benchmark results:
| Query Type | Before | After | Speedup |
|---|---|---|---|
| Vector search (k=20) | 1200ms | 180ms | 6.7x |
| Date range | 450ms | 25ms | 18x |
| Person search | 800ms | 40ms | 20x |
| Combined | 2000ms | 220ms | 9x |
Error Handling
Structured Error Responses
```python
@server.call_tool()
async def call_tool(name: str, arguments: dict):
    try:
        result = await execute_tool(name, arguments)
        return result
    except ValueError as e:
        return {
            "error": {
                "type": "invalid_input",
                "message": str(e),
                "suggestion": "Check input parameter format"
            }
        }
    except ConnectionError:
        return {
            "error": {
                "type": "database_connection",
                "message": "Supabase connection failed"
            }
        }
```
Logging Strategy
MCP servers communicate via stdio. Logs must go to stderr:
```python
import sys
from datetime import datetime

def log(message: str):
    """Log to stderr (stdout is reserved for MCP protocol)."""
    print(f"[{datetime.now().isoformat()}] {message}", file=sys.stderr)
```
Deployment
Environment Variables
```bash
# Home API MCP Server
API_BASE_URL=http://localhost:3000
AUTH_TYPE=bearer
API_TOKEN=your-secure-token

# Ask-My-Day MCP Server
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your-anon-key
EMBEDDING_PROVIDER=ollama
OLLAMA_HOST=http://localhost:11434
```
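Missing variables otherwise surface as confusing failures on the first tool call, so a fail-fast check at startup is worth a few lines (a sketch; `missing_env` is an illustrative helper, not part of my servers):

```python
import os

REQUIRED = ("SUPABASE_URL", "SUPABASE_KEY", "OLLAMA_HOST")

def missing_env(env) -> list:
    """Names of required variables that are absent or empty."""
    return [name for name in REQUIRED if not env.get(name)]

def check_env(env=None) -> None:
    """Fail fast at startup instead of on the first tool call."""
    missing = missing_env(env if env is not None else os.environ)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

# An empty environment reports everything missing:
print(missing_env({}))  # → ['SUPABASE_URL', 'SUPABASE_KEY', 'OLLAMA_HOST']
```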
OpenClaw Configuration
```json
{
  "skills": {
    "entries": {
      "mcporter": {
        "enabled": true,
        "servers": {
          "home-api": {
            "command": "node",
            "args": ["/path/to/mcp-server/index.js"],
            "env": {
              "API_BASE_URL": "http://10.106.0.50:3000",
              "AUTH_TYPE": "bearer",
              "API_TOKEN": "***"
            }
          },
          "ask-my-day": {
            "command": "python",
            "args": ["/path/to/mcp_server/server.py"],
            "env": {
              "SUPABASE_URL": "***",
              "SUPABASE_KEY": "***"
            }
          }
        }
      }
    }
  }
}
```
Anti-Patterns
❌ One Mega-Tool
```python
# DON'T
@tool("do_everything")
def do_everything(action: str, **kwargs):
    if action == "search":
        return search(**kwargs)
    elif action == "analyze":
        return analyze(**kwargs)
```
This is function dispatch hell: the LLM must first pick the right action string, then guess how to format the parameters for that branch.
❌ Over-Formatting
```python
# DON'T
@tool("get_weather")
def get_weather(city: str):
    data = api.get_weather(city)
    return f"🌡️ Weather: {data.temp}°C, {data.conditions}"
```
Return structured data, not formatted strings. Let the LLM present it:
```python
# DO
@tool("get_weather")
def get_weather(city: str):
    data = api.get_weather(city)
    return {
        "city": city,
        "temperature_celsius": data.temp,
        "conditions": data.conditions
    }
```
Conclusion
Building production MCP servers is more about API design than AI:
- Granular tools beat monolithic tools
- Auto-generation keeps tools in sync with source APIs
- Permission models via documentation + LLM identity
- Structured data over formatted strings
- Index strategy matters (pgvector IVFFlat + metadata B-tree)
My setup:
- Home API MCP: 60 tools, auto-generated from route definitions
- Ask-My-Day MCP: 13 specialized tools, <300ms average latency
- OpenClaw Skills: Google Calendar, Zoho Mail, Hue, Brave, SearXNG
- Combined: Zero maintenance - API changes propagate automatically
Jairix can now “Set heating to 21°C”, “What did I do with Alice last week?”, and “Create an Azure DevOps branch for Traefik migration” - all through the same protocol.
Resources
- Model Context Protocol: https://modelcontextprotocol.io
- Supabase pgvector: https://supabase.com/docs/guides/ai/vector-columns
Code examples simplified for clarity.