The Problem: AI Agents Need Real Access

I wanted an AI assistant that could actually do things:

  • Check my camper’s GPS location
  • Monitor home energy usage
  • Create Azure DevOps branches and pull requests
  • Search my personal journal spanning years
  • Control my smart home devices

The AI hype is overwhelming. Thousands of GitHub issues, just as many pull requests, mountains of contradictory information. After deep research, I chose OpenClaw - an open-source agentic framework that supports Model Context Protocol (MCP) servers and runs fully locally.

First Attempt: Kubernetes on Talos

My homelab runs Talos Linux with advanced Cilium networking. The natural choice was to run OpenClaw as a containerized workload on my cluster. I built Helm charts, configured persistent volumes, got it operational.

# values-platform-prod.yaml
agents:
  defaults:
    model:
      primary: "ollama/minimax-m2:cloud"
      fallbacks:
        - "ollama/deepseek-v3.2:cloud"
    workspace: "/home/node/.openclaw/workspace"
    userTimezone: "Europe/Amsterdam"
    heartbeat:
      every: "30m"
    subagents:
      maxConcurrent: 8
      archiveAfterMinutes: 10

The basics worked. Jairix (my AI persona - JARVIS-like, British, calm and precise) could communicate via Telegram, access my MCP servers, execute simple tasks. But problems emerged quickly:

Problem 1: Subagent Execution Failures

OpenClaw supports research agents - specialized subagents for deep investigations. When I made complex requests (like “analyze Azure cost trends from Q4 2024”), Jairix correctly spawned a researcher subagent, but:

[ERROR] Subagent output not returned to main process
[WARN] Research task completed but results missing in parent context
[ERROR] Timeout waiting for subagent communication

The main agent process received no feedback from the research agent. Subagents ran in their own containers with shared volume storage, but inter-process communication failed. I tried different configurations:

  • Shared memory volumes (/dev/shm)
  • Redis as message queue
  • Different context pruning modes
  • SQLite session storage optimizations

Nothing fixed it. OpenClaw’s architecture expects subagents to write output to shared state that the main process can read - within Kubernetes pods, that synchronization didn’t work reliably.

Problem 2: Sandbox Containers

OpenClaw has a sandbox feature where it can execute code in isolated Docker containers. This is crucial for security - if an agent needs to run Python code, it happens in an ephemeral container that’s destroyed afterwards.

Docker-in-Docker (DinD) within Kubernetes is notoriously difficult:

# Attempts with privileged containers
securityContext:
  privileged: true
volumeMounts:
  - name: docker-sock
    mountPath: /var/run/docker.sock

Even with privileged containers and socket mounting, sandbox containers kept crashing or giving connection errors. The issue: Kubernetes assumes workloads are plain containers, not containers that themselves run a Docker daemon.

Problem 3: Model Performance and Latency

My cluster runs Ollama perfectly - multiple instances for different model types:

# Ollama instances in Kubernetes (these work fine)
ollama-cloud:    # Large models (DeepSeek, MiniMax, Mistral)
  baseUrl: "http://ollama.ollama.svc:11434/v1"
  
ollama-vision:   # Vision models (Qwen3-VL)
  baseUrl: "http://ollama-vision.ollama.svc:11434/v1"
  
ollama-local:    # Embeddings (CPU-only)
  baseUrl: "http://ollama-local.ollama.svc:11434/v1"

Ollama itself had no issues in Kubernetes. The problem was OpenClaw’s latency when accessing these services:

  • Service mesh overhead (Cilium)
  • Traefik ingress layer
  • Internal DNS resolution
  • Pod-to-pod networking

For conversational AI where response time is critical, the extra 200-400ms per model call was noticeable. OpenClaw made frequent model switches during reasoning, amplifying the latency.
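To put numbers on that overhead, I time N sequential calls and average. A minimal harness, where `callModel` would in practice be a fetch() against the Ollama /v1 endpoint - the stub here just simulates a round trip so the sketch is self-contained:

```javascript
// Average wall-clock time per call over N sequential "model calls".
async function averageLatency(callModel, n = 20) {
  const t0 = performance.now();
  for (let i = 0; i < n; i++) await callModel();
  return (performance.now() - t0) / n;
}

// Stub "model call" with ~5ms of simulated round-trip latency.
const stub = () => new Promise(res => setTimeout(res, 5));

averageLatency(stub).then(avg =>
  console.log(`avg per call: ${avg.toFixed(1)}ms`));
```

Running this once with `callModel` pointed at the in-cluster service and once from the bare metal box is how I arrived at the latency figures later in this post.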

The Pivot: Bare Metal Ubuntu Server

After a week of debugging, I realized: OpenClaw wasn’t designed for Kubernetes. It’s built for traditional servers where processes can communicate directly, and where Docker is natively available.

I grabbed an old laptop:

  • Hardware: Dell laptop with 16GB RAM, 4-core CPU
  • OS: Ubuntu Server 24.04 LTS
  • Network: Isolated VLAN, no direct internet access without explicit permission
  • Storage: 500GB SSD for model weights and workspace data

Security First Approach

The laptop runs in its own isolated VLAN on my network, protected by UniFi Dream Machine firewall rules. I can precisely control what the AI agent can access:

Firewall Configuration (UDM):

  • Default policy: Deny all egress traffic from AI VLAN
  • Allowed destinations:
    • Ollama service on cluster (port 11434) - for model inference
    • Home API service (port 3000, GET requests only) - for read-only monitoring
    • Supabase (*.supabase.co:443) - for personal analytics
    • Fallback LLM APIs (api.anthropic.com, api.openai.com:443) - rate-limited
  • Blocked: Everything else, including local network devices

VLAN Segmentation:

Default VLAN (10.106.0.0/24): Home devices, workstations
AI VLAN (10.107.0.0/24): OpenClaw laptop, isolated

Without explicit firewall allow rules, OpenClaw can’t reach anything. This is essential - an AI agent with access to my home API, Azure DevOps, and personal data must be maximally isolated.

Installation Stack

# 1. Base system
apt update && apt upgrade -y
apt install -y docker.io docker-compose git python3-pip nodejs npm

# 2. OpenClaw
git clone https://github.com/cyanheads/openclaw.git
cd openclaw
npm install

Note: Ollama continues to run in the Kubernetes cluster - OpenClaw connects to it remotely. This gives the best of both worlds:

  • Ollama in k8s: Scales well, GPU scheduling, multiple instances
  • OpenClaw on bare metal: Direct process communication, native Docker for sandboxing

OpenClaw Configuration

The configuration in openclaw.json defines my multi-agent setup and connects to Ollama instances running in Kubernetes:

{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://ollama.cluster.local:11434/v1",
        "api": "openai-responses",
        "models": [
          {"id": "deepseek-v3.2:cloud", "name": "DeepSeek V3.2"},
          {"id": "minimax-m2:cloud", "name": "MiniMax M2"},
          {"id": "kimi-k2-thinking:cloud", "name": "Kimi Thinking"}
        ]
      },
      "ollamavision": {
        "baseUrl": "http://ollama-vision.cluster.local:11434/v1",
        "models": [
          {"id": "qwen3-vl:235b-cloud", "name": "Qwen3 VL"}
        ]
      },
      "ollamalocal": {
        "baseUrl": "http://ollama-local.cluster.local:11434/v1",
        "models": [
          {"id": "qwen3-embedding:8b", "name": "Qwen3 Embedding"}
        ]
      }
    }
  },
  "agents": {
    "list": [
      {
        "id": "main",
        "default": true,
        "identity": {
          "name": "Jairix",
          "emoji": "🦞"
        },
        "model": {
          "primary": "ollama/minimax-m2:cloud",
          "fallbacks": ["ollama/deepseek-v3.2:cloud"]
        }
      },
      {
        "id": "researcher",
        "model": {
          "primary": "ollama/kimi-k2-thinking:cloud"
        }
      },
      {
        "id": "vision",
        "model": {
          "primary": "ollamavision/qwen3-vl:235b-cloud"
        }
      },
      {
        "id": "coder",
        "model": {
          "primary": "ollama/mistral-large-3:675b-cloud"
        }
      }
    ]
  }
}

Architecture:

OpenClaw (bare metal laptop)
    ↓ HTTP API calls
Ollama instances (Kubernetes cluster)
    ↓ GPU inference
Model weights (persistent volumes in k8s)

Why these models?

  • MiniMax M2 (230B): Balance between speed and capability for conversational AI
  • Kimi K2 Thinking: Reasoning model for complex research tasks
  • DeepSeek V3.2 (671B): Fallback for edge cases, enormous capability
  • Qwen3 VL (235B): Vision model for screenshots and diagrams
  • Mistral Large 3 (675B): Code generation and debugging

Memory and Continuity

OpenClaw’s memory system is elegantly simple - files are memory:

workspace/
├── AGENTS.md      # Operational guidelines
├── IDENTITY.md    # Who is Jairix
├── SOUL.md        # Core values
├── USER.md        # My preferences and context
├── TOOLS.md       # Available tools and permissions
├── MEMORY.md      # Long-term curated memories
└── memory/
    ├── 2025-02-16.md  # Daily logs
    └── 2025-02-17.md

Each session starts with:

# IDENTITY.md startup protocol
1. Read AGENTS.md - operational guidelines
2. Read SOUL.md - core purpose
3. Read USER.md - user context
4. Read TOOLS.md - available tools
5. Read memory/YYYY-MM-DD.md (today + yesterday)
6. If main session: Read MEMORY.md

This gives Jairix continuity without complex database schemas. Simple, debuggable, effective.

MCP Servers and Skills: The Real Power

Model Context Protocol and OpenClaw skills are where the system transforms from chatbot to operational assistant. I built two custom MCP servers and several skills:

1. Home API MCP Server (Auto-Generated)

My home automation API exposes 60+ endpoints for:

  • Smart home: Toon thermostat, Philips Hue lights, network devices
  • Energy monitoring: P1 smart meter data, realtime usage
  • GPS tracking: Camper location via OnnTrack
  • Network management: UniFi Dream Machine controls
  • Notifications: FCM push notifications to my devices

Auto-generation from route definitions:

// routes/toon/getToonStatus.js
{
  method: 'get',
  path: '/get/toon/status',
  description: 'Get current thermostat status',
  validators: {
    query: joi.object().keys({
      action: joi.string().valid('getThermostatInfo').required()
    })
  }
}

My generator script (generate-tools.js) reads all route files and automatically generates MCP tools:

{
  "name": "api:toon_GetToonStatus",
  "description": "Get current thermostat status (temperature, setpoint, program state)",
  "inputSchema": {
    "type": "object",
    "properties": {
      "action": {
        "type": "string",
        "enum": ["getThermostatInfo"]
      }
    }
  }
}
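A simplified sketch of what generate-tools.js does, mapping one route definition to one MCP tool entry. The real script introspects the joi validators; here the schema is already flattened to plain data and only string enums are handled, as in the example above:

```javascript
// A route descriptor with the joi schema pre-flattened (the real
// generator derives this via joi's describe()).
const route = {
  method: "get",
  path: "/get/toon/status",
  description: "Get current thermostat status",
  query: { action: { type: "string", enum: ["getThermostatInfo"] } },
};

// Derive the MCP tool name from the route path:
// /get/toon/status -> api:toon_GetToonStatus
function toMcpTool({ path: routePath, description, query }) {
  const segments = routePath.split("/").filter(Boolean);
  const pascal = segments
    .map(s => s[0].toUpperCase() + s.slice(1))
    .join("");
  return {
    name: `api:${segments[1]}_${pascal}`,
    description,
    inputSchema: { type: "object", properties: query },
  };
}

console.log(JSON.stringify(toMcpTool(route), null, 2));
```

Run over all 60+ route files, this produces the full tool catalog with zero hand-written schemas.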

2. Ask-My-Day MCP Server (Personal Analytics)

My journal system with Strava integration. Architecture:

Data Pipeline:
  journal-processor → Supabase (pgvector)
  strava-sync → Supabase

Interface:
  Jairix ↔ MCP Server ↔ Supabase
            ↑
       13 granular tools

13 specialized tools for semantic search, date filtering, person/location queries, mood analysis, Strava performance tracking, and trend analysis.

3. OpenClaw Skills (Not MCP)

I also built several OpenClaw skills (different from MCP servers):

Google Calendar Skill (gog):

gog calendar events        # Read-only
gog calendar create        # Requires approval

Zoho Mail Skill:

python3 zoho-email.py --api-mode rest
# Commands: unread, search, send-html

Philips Hue Skill (openhue):

openhue get room          # Read light status
openhue set room          # Control lights (requires approval)

Brave Search Skill:

  • Integrated via OpenClaw’s web search tools
  • API key configured in openclaw.json

SearXNG (Local Search):

  • Self-hosted metasearch engine running locally
  • Privacy-focused alternative to Brave
  • Accessed via custom skill wrapper

4. Azure DevOps Integration

Via mcporter (an MCP skill system), Jairix gets access to Azure DevOps:

{
  "mcpServers": {
    "azure-devops": {
      "command": "npx",
      "args": ["-y", "@sneekes/azure-devops-mcp-server"],
      "env": {
        "AZURE_DEVOPS_ORG_ID": "Sneekes",
        "AZURE_DEVOPS_PROJECT_ID": "Sneekes Solutions",
        "AZURE_DEVOPS_PAT": "***"
      }
    }
  }
}

43 functions available:

  • Repository management (list, create branch, commit)
  • Pull requests (create, review, merge)
  • Work items (create, update, link)
  • Pipelines (trigger, monitor, get logs)
  • Wiki operations
  • Search (code, work items)

Real-world scenario:

Me: "Create a feature branch for Traefik migration, 
     update the README with migration steps,
     and create a PR targeting main"

Jairix:
1. azure-devops:create_branch(newBranch="feature/traefik-migration", sourceBranch="main")
2. azure-devops:create_commit(
     branchName="feature/traefik-migration",
     changes=[{
       path: "README.md",
       search: "# NGINX Configuration",
       replace: "# Traefik Configuration\n\n## Migration Steps..."
     }],
     commitMessage="docs: Add Traefik migration guide"
   )
3. azure-devops:create_pull_request(
     sourceRefName="refs/heads/feature/traefik-migration",
     targetRefName="refs/heads/main",
     title="Migration: NGINX to Traefik",
     reviewers=["sander.sneekes@domain.com"]
   )

This happens in seconds. No context switching to Azure DevOps UI, no YAML editing, no manual PR creation.

Performance Results: Kubernetes vs Bare Metal

After a week running on the Ubuntu server:

Metric                        Kubernetes   Bare Metal   Delta
Research agent success rate   45%          98%          +118%
Sandbox container launch      Failure      <2s
Average response latency      800ms        350ms        -56%
Model switching overhead      400ms        80ms         -80%
Subagent communication        Broken       Reliable     Fixed

Why such a difference?

The bare metal improvement came from OpenClaw-specific optimizations, not Ollama changes:

  1. Direct process communication: Subagents share filesystem directly, no container orchestration overhead
  2. Native Docker: Sandbox containers start via local Docker socket without DinD complexity
  3. Reduced network hops: Still calling Ollama over network, but fewer service mesh layers
  4. Shared memory: Subagents write to /home/node/.openclaw/sessions, main process reads directly

Ollama remained in Kubernetes throughout - the performance gain was from moving OpenClaw to bare metal, not from moving Ollama.

Lessons Learned

1. Not Everything Belongs in Kubernetes

I’m a Kubernetes evangelist. My entire platform runs on k8s. But OpenClaw taught me: containerization ≠ orchestration requirement.

What works great in Kubernetes:

  • Ollama: Multiple instances, GPU scheduling, automatic scaling, perfect for model serving
  • Stateless services: APIs, web apps, microservices
  • Batch workloads: Data processing, ML training

What doesn’t work in Kubernetes:

  • OpenClaw: Expects shared filesystem, native Docker access, process-level IPC
  • Agentic frameworks: Often designed for traditional server environments
  • Stateful single-node apps: Where orchestration adds complexity without benefit

OpenClaw’s architecture expects:

  • Shared filesystem access with immediate consistency
  • Native Docker daemon access
  • Process-level IPC for subagents
  • Low-latency local operations

These are anti-patterns for cloud-native design, but perfect for single-node AI workloads.

The hybrid approach works best: Ollama in k8s (scales, GPU management), OpenClaw on bare metal (native Docker, process communication).

2. Security Through Isolation

By running OpenClaw in an isolated VLAN with strict firewall rules on the UniFi Dream Machine, I get the best of both worlds:

  • Complete control over which external services are reachable
  • Network-level audit trail via UDM traffic logs
  • Zero-trust default - everything denied unless explicitly allowed

3. MCP is the Future

Model Context Protocol fundamentally solves “how do I give an LLM access to my data”. Instead of:

  • RAG pipelines with brittle chunking strategies
  • Function calling with JSON schema hell
  • Custom API wrappers per LLM provider

You now have:

  • Standardized protocol (stdio-based RPC)
  • Tool discovery (server advertises capabilities)
  • Composable tools (granular, single-purpose functions)
  • Universal client support (Claude Desktop, OpenClaw, etc.)

My two MCP servers (Home API + Ask-My-Day) work identically in Claude Desktop and OpenClaw. No code changes. Pure interoperability.
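Concretely, "standardized protocol" means plain JSON-RPC 2.0 messages over stdio. A tool-discovery exchange looks roughly like this (response trimmed to the fields that matter; the tool shown is the Home API example from earlier):

```
→ {"jsonrpc":"2.0","id":1,"method":"tools/list"}
← {"jsonrpc":"2.0","id":1,"result":{"tools":[
     {"name":"api:toon_GetToonStatus",
      "description":"Get current thermostat status",
      "inputSchema":{"type":"object","properties":{"action":{"type":"string"}}}}
   ]}}
```

Any client that speaks this framing can discover and call the tools - which is exactly why the same servers work unchanged across clients.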

4. Skills vs MCP Servers

MCP Servers: Standardized protocol, work across any MCP client

  • Home API (auto-generated from routes)
  • Ask-My-Day (personal analytics)
  • Azure DevOps (via @sneekes/azure-devops-mcp-server)

OpenClaw Skills: Framework-specific integrations

  • Google Calendar (gog)
  • Zoho Mail (Python wrapper)
  • Philips Hue (openhue)
  • Brave Search (API integration)
  • SearXNG (local search)

Both work, but MCP servers are more portable.

5. Identity Matters

Defining Jairix’s identity in IDENTITY.md and SOUL.md isn’t fluff - it has direct impact on usability:

# SOUL.md
Be genuinely helpful, not performatively helpful.
Skip filler. Provide value.

Have opinions.
Silence is better than a bad answer.

# IDENTITY.md  
Your voice is impeccably British, controlled, and certain.
You are never enthusiastic. You do not overexplain.

With your human, your tone softens by the smallest possible margin.

This results in responses like:

Me: "Should I migrate to Traefik now or wait?"

Jairix: "Now. Your NGINX setup has 3 known CVEs and your Cert Manager 
         integration is brittle. I've already prepared a migration branch. 
         Shall I proceed?"

No “Let me help you think through this!” or “Great question!”. Direct, confident, actionable.

What’s Next

My OpenClaw setup is now stable. Next steps:

  1. Extend skill coverage: More integrations, deeper automation
  2. Proactive monitoring: Heartbeat-based checks for email, calendar, system health
  3. Model optimization: Fine-tuning embeddings for specific use cases
  4. Team integration: Consider if others could use Jairix (with isolation per user)

But mostly: use it. An AI assistant is only valuable if it actually takes work off your plate. My test case: can Jairix and I together plan, document, and execute the Traefik migration without me opening Azure DevOps?

I suspect we can.

Resources

All API keys and tokens in this article are placeholder values and no longer valid.