Self-Hosted AI Agents on Kubernetes: What Works and What Doesn't

There’s a pattern I keep running into with self-hosted AI agent tooling: it runs on Kubernetes, but it wasn’t designed for Kubernetes. You get it working, and most things function correctly, but certain features quietly break in ways that take a while to diagnose. Today I hit that wall again with Hermes Workspace — and it’s the same wall I hit earlier with OpenClaw.

This post is an honest assessment of what works, what doesn’t, and why.

The Common Architecture Problem

OpenClaw, Hermes Agent, and Hermes Workspace all share an implicit assumption: everything runs on one machine.

These tools are designed for the “run it on your server or desktop” use case. That means:

A shared filesystem under ~/.hermes/ or ~/.openclaw/
Local process spawning via child_process.execFile or Python subprocess
tmux sessions on the same machine for persistent agent workers
homedir() resolving to the same path for all components
Direct tool access without network hops

This works perfectly when you install everything on a single Ubuntu server or your laptop. It breaks in interesting ways when you split components across Kubernetes pods.
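
To make the failure mode concrete, here is a minimal Python sketch that simulates two pods as two separate home directories. It is an illustration only, not code from any of these tools: the function names, the "gateway-pod" and "ui-pod" directories, and the session.json file are all made up for the example.

```python
import tempfile
from pathlib import Path


def state_dir(home: Path) -> Path:
    """Return the per-user state directory a single-machine tool would use."""
    return home / ".hermes"


def simulate_split_pods() -> tuple[bool, bool]:
    """Write state as one simulated pod, then look for it from both pods."""
    with tempfile.TemporaryDirectory() as root:
        # Each simulated pod gets its own home directory, like a separate
        # container filesystem. The directory names are hypothetical.
        gateway_home = Path(root) / "gateway-pod" / "home"
        ui_home = Path(root) / "ui-pod" / "home"
        for home in (gateway_home, ui_home):
            state_dir(home).mkdir(parents=True)

        # The "gateway" writes a file under its own ~/.hermes/.
        (state_dir(gateway_home) / "session.json").write_text('{"id": "demo"}')

        seen_by_gateway = (state_dir(gateway_home) / "session.json").exists()
        seen_by_ui = (state_dir(ui_home) / "session.json").exists()
        return seen_by_gateway, seen_by_ui


if __name__ == "__main__":
    print(simulate_split_pods())  # (True, False)
```

Both sides resolve "the same" path relative to their home directory, but only the writer can see the file. That is the whole problem in miniature.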

OpenClaw on Kubernetes

OpenClaw was my previous self-hosted agent stack. It’s gateway-centric — a WebSocket hub that routes between messaging platforms, LLM providers, and tools. The gateway itself runs fine in a container.

The problems start when you try to use features that assume local execution. Plugin execution that spawns processes, file operations that assume a shared working directory, tool calls that write to ~/.openclaw/ and expect the UI to read from the same path — all of these break when the UI and the gateway are in separate pods with separate filesystems.

The workaround was mounting a shared PVC between pods. It works, but it’s fighting the architecture. The tool wasn’t designed with this in mind, so you end up with subtle race conditions, stale file reads, and error messages that make no sense until you realize two pods are fighting over the same files.
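
For the stale or partial read side of this, one general mitigation is to write to a temporary file and rename it into place. The sketch below shows that pattern in Python. It is not how OpenClaw writes its state, the helper name is hypothetical, and rename guarantees can be weaker on network-backed volumes, so treat it as an illustration of the idea rather than a fix.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, data: dict) -> None:
    """Write JSON so a reader sees the old file or the new one, not a partial write."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the new file in, overwriting any existing target.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

This only addresses readers seeing half-written files. Two pods that both write the same file still need some form of coordination, which is exactly the kind of thing a single-machine tool never had to think about.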

I eventually stopped running OpenClaw. Not because it’s a bad tool — it isn’t — but because the friction of maintaining a Kubernetes deployment for something designed for single-machine use wasn’t worth it.

Hermes Agent on Kubernetes

Hermes Agent is a better fit for Kubernetes than OpenClaw, but it still has the same fundamental assumption baked into parts of it.

The gateway (hermes gateway run) is genuinely container-friendly. It’s a long-running server process that listens on ports, reads config from a mounted path, and talks to external services via HTTP. Telegram integration, MCP server connections, delegation to subagents — all of this works correctly in a pod.

The deployment does have a real security context issue: the image uses s6-overlay for init, which requires root at startup to call setgroups() before dropping to the hermes user. The Helm chart’s default security context (runAsUser: 1000, runAsNonRoot: true) is incompatible with this. The fix is straightforward once you understand the cause, but those defaults are wrong for the official image, and that should be fixed upstream.

With the correct security context (runAsUser: 0, HERMES_UID=1000, HERMES_GID=1000), the agent starts cleanly and everything that doesn’t require local process spawning works well:

✅ Gateway with Telegram, webhook, and API server
✅ MCP server connections (Azure DevOps via Toolhive, n8n via stdio)
✅ Delegation with parallel subagents using a separate model
✅ Memory persistence via PVC
✅ Config and SOUL.md via ConfigMap bootstrap
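
The security context fix described above can be sketched as the container-level fields it ends up setting. The Python below just builds that fragment as a dict and prints it as JSON. The function name is hypothetical, and where these fields belong in the chart's values file is an assumption you would need to check against the chart itself.

```python
import json


def hermes_container_overrides(uid: int = 1000, gid: int = 1000) -> dict:
    """Build the container fields from this post as a plain dict."""
    return {
        "securityContext": {
            # s6-overlay needs root at startup before it drops privileges.
            "runAsUser": 0,
            # Must not be true, or the container is refused when running as UID 0.
            "runAsNonRoot": False,
        },
        "env": [
            # The user and group the image drops to after init, per the values above.
            # Kubernetes env values are strings, hence str().
            {"name": "HERMES_UID", "value": str(uid)},
            {"name": "HERMES_GID", "value": str(gid)},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(hermes_container_overrides(), indent=2))
```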

One important operational note: config changes require a pod restart. Hermes tries to restart the gateway internally when config.yaml changes, but this fails silently in a container. The fix is just kubectl rollout restart deployment/hermes-agent — not a big deal, but worth knowing before you spend time wondering why your MCP server changes aren’t taking effect.
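
If you change config often, a small wrapper saves some typing. This is a minimal sketch that builds the kubectl commands and only runs them when asked. The function names are hypothetical, and the hermes-agent namespace is an assumption inferred from the service DNS name used in this post.

```python
import subprocess


def rollout_restart_argv(deployment: str, namespace: str) -> list[str]:
    """Command that triggers a rolling restart of the deployment."""
    return ["kubectl", "rollout", "restart", f"deployment/{deployment}", "-n", namespace]


def rollout_status_argv(deployment: str, namespace: str) -> list[str]:
    """Command that waits until the restarted pods are ready."""
    return ["kubectl", "rollout", "status", f"deployment/{deployment}", "-n", namespace]


def restart(deployment: str = "hermes-agent",
            namespace: str = "hermes-agent",  # assumed namespace
            dry_run: bool = True) -> list[list[str]]:
    """Return the commands, and run them in order unless dry_run is set."""
    commands = [
        rollout_restart_argv(deployment, namespace),
        rollout_status_argv(deployment, namespace),
    ]
    if not dry_run:
        for argv in commands:
            subprocess.run(argv, check=True)
    return commands
```

Calling restart() with the defaults only returns the two command lists. Pass dry_run=False to actually execute them against the current kubectl context.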

Hermes Workspace on Kubernetes

Hermes Workspace is where the single-machine assumption bites hardest.

The workspace is a Next.js application that acts as a web UI for Hermes Agent. For the features that talk to the Hermes API (chat, sessions, memory browser, skills, dashboard), it works fine. Point it at http://hermes-agent.hermes-agent.svc.cluster.local on ports 8642 and 9119, and those panes function correctly.
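
When a pane stays empty, the first thing worth ruling out is plain connectivity from the workspace pod to the agent service. Here is a minimal Python sketch of a TCP reachability check for the two ports mentioned above. It only tests that something accepts connections; it makes no assumption about which port serves which API, and the function names are hypothetical.

```python
import socket

SERVICE_HOST = "hermes-agent.hermes-agent.svc.cluster.local"
PORTS = (8642, 9119)


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_service(host: str = SERVICE_HOST, ports: tuple = PORTS) -> dict[int, bool]:
    """Check every port and report which ones accept connections."""
    return {port: port_open(host, port) for port in ports}


if __name__ == "__main__":
    for port, ok in check_service().items():
        print(f"{SERVICE_HOST}:{port} -> {'reachable' if ok else 'unreachable'}")
```

Run it from inside the workspace pod, since the cluster-internal DNS name will not resolve from outside the cluster.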

The Swarm feature is where it falls apart.

The swarm dispatcher (/api/swarm-dispatch) works by:

Spawning a tmux session on the local machine
Starting hermes chat --tui inside that session
Pasting prompts into the tmux session and reading checkpoints back out
Falling back to hermes chat -q <prompt> as a one-shot process

All of this assumes the workspace server, the hermes binary, tmux, and the agent’s profile directory (~/.hermes/profiles/) live on the same machine. In a split-pod Kubernetes deployment, none of that is true.
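
The assumptions listed above are easy to check from inside the workspace container. The sketch below is a hypothetical preflight helper, not something the workspace ships: it looks for tmux and hermes on the PATH and for the profiles directory under the current user's home.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def swarm_preflight(
    which: Callable[[str], Optional[str]] = shutil.which,
    home: Optional[Path] = None,
) -> dict[str, bool]:
    """Report whether the local prerequisites of the swarm dispatcher are present."""
    home = home or Path.home()
    return {
        "tmux on PATH": which("tmux") is not None,
        "hermes on PATH": which("hermes") is not None,
        "profiles directory present": (home / ".hermes" / "profiles").is_dir(),
    }


if __name__ == "__main__":
    for check, ok in swarm_preflight().items():
        print(f"{'ok     ' if ok else 'MISSING'}  {check}")
```

A clean report only means the local pieces exist. It says nothing about whether the hermes binary found here is the same agent instance as the gateway you want to reach.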

I tried installing hermes-agent and tmux into the workspace image. The workspace then spawns its own hermes process — but that process knows nothing about the sessions running in the actual hermes-agent pod. It’s a second, disconnected agent instance with its own config, its own memory, and no relationship to the gateway you actually want to talk to.

The swarm dispatcher would need to be rewritten to dispatch via the Hermes gateway API instead of local process spawning. That’s a non-trivial change to the workspace codebase and not something worth doing for a personal homelab setup.
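
To show the shape of that rewrite, here is a rough Python sketch contrasting the two dispatch styles. The local variant builds the one-shot command mentioned earlier. The remote variant builds an HTTP request against a placeholder path, because I am not assuming anything about the real gateway API: the class names, the /hypothetical/dispatch path, and the JSON payload are all invented for illustration, and nothing here is sent or executed.

```python
import json
import urllib.request


class LocalOneShot:
    """Local style: run `hermes chat -q <prompt>` on the same machine."""

    def build(self, prompt: str) -> list[str]:
        return ["hermes", "chat", "-q", prompt]


class GatewayDispatch:
    """Remote style: hand the prompt to the gateway over HTTP."""

    def __init__(self, base_url: str, path: str = "/hypothetical/dispatch") -> None:
        self.base_url = base_url.rstrip("/")
        # HYPOTHETICAL path and payload; the real gateway API may look nothing like this.
        self.path = path

    def build(self, prompt: str) -> urllib.request.Request:
        body = json.dumps({"prompt": prompt}).encode("utf-8")
        return urllib.request.Request(
            self.base_url + self.path,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
```

The point of the contrast is that the local variant needs the binary, the profile directory, and the process table of the machine it runs on, while the remote variant needs only a URL. Everything downstream of the dispatcher (checkpoints, session tracking) would need the same treatment.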

What works in Kubernetes:

Chat with full session history
Sessions / Operations monitoring
Memory browser
Skills browser
Dashboard views (models, logs, cron, channels)

What doesn’t work:

Swarm dispatch (tmux + local filesystem dependency)
Conductor spawning agents via the workspace (same issue)

The irony is that Conductor — which is the most visually compelling feature — works fine for monitoring delegation trees that were started elsewhere. If you kick off a delegation task via Telegram or the chat pane, Conductor will show you the running subagents correctly. It’s only the dispatch path that breaks.

The Honest Assessment

These tools are all in the “run it on your server” category, not the “deploy it on Kubernetes” category. The distinction matters:

“Run it on your server” assumes:

One machine, one filesystem, one network namespace
Local process execution
Direct filesystem access for state
tmux or similar for persistent sessions

“Deploy it on Kubernetes” requires:

Stateless or explicitly stateful pods
All state in mounted volumes or external stores
Inter-component communication via APIs or message queues
No assumptions about shared process space

When you force a “run it on your server” tool into Kubernetes, you get partial functionality. The networked parts work; the local-execution parts break.

For a homelab where you want full agent functionality including swarm workers, the pragmatic choice is a single VM or bare-metal machine with everything installed locally. For a homelab where you primarily want a persistent Telegram-accessible agent with MCP integrations and occasional delegation tasks, Kubernetes works fine for the core functionality.

I’m in the second camp. Hermes Agent on Kubernetes handles everything I actually use day-to-day: Telegram as the primary interface, Azure DevOps and n8n via MCP, delegation for research tasks, and persistent memory. The workspace provides a usable web UI for chat and monitoring.

The swarm feature would require either a single-machine deployment (a VM or bare metal) or a significant rewrite of the workspace dispatch layer. Neither is worth the effort for my use case.

What I’d Tell Someone Starting Fresh

If you want to self-host an AI agent and you’re comfortable with Kubernetes: deploy Hermes Agent, use Telegram as your primary interface, and treat Hermes Workspace as a monitoring/chat UI. Don’t expect the advanced local-execution features to work.

If you want full swarm functionality: install everything on a single machine. A small VPS with 8 GB of RAM is enough. Use Docker Compose if you want isolation, but keep all containers on the same host so they can share a filesystem.

If you’re coming from OpenClaw: Hermes Agent is a solid replacement for the gateway functionality. The SOUL.md system is more expressive than OpenClaw’s identity configuration, delegation works better, and the MCP integration is cleaner. You’ll hit the same Kubernetes limitations for advanced features, but the core agent is more capable.
