Deploying Hermes Agent on Kubernetes: A Full Walkthrough

Today was one of those days where you sit down to do something straightforward and end up deep in s6-overlay internals, Helm template logic, and the philosophy of AI agent identity files. Here’s what happened.

Background

I’ve been running various self-hosted AI agent setups for a while. OpenClaw was the previous stack — it served its purpose but I’ve moved on. Hermes Agent by Nous Research has been on my radar as a replacement, and today was the day to actually get it running properly on my Talos Linux cluster.

The goal: Hermes Agent, fully deployed on Kubernetes via Helm, with Telegram integration, a working web dashboard, MCP server connections, and a custom identity.

Choosing Hermes Agent

Before diving in, it’s worth explaining why Hermes over alternatives like kagent.

kagent is a CNCF Sandbox project built specifically for Kubernetes operations — great for SRE automation and cluster troubleshooting, with native integrations for Helm, Argo, Cilium, Prometheus and friends. It’s the right tool if your primary use case is cluster ops automation.

Hermes Agent is a different beast. It’s a persistent personal AI agent that runs on your server, retains memory across sessions, and builds reusable skills over time. The self-improvement loop — where the agent writes and refines its own skills — is the differentiating feature. It also integrates with Telegram, Discord, Slack, and a handful of other platforms out of the box.

For a homelab assistant that handles everything from Azure FinOps analysis to blog post drafting, Hermes is the better fit. The two aren’t mutually exclusive either — kagent for cluster ops, Hermes for everything else.

The Helm Chart

The official Helm chart lives at github.com/ultraworkers/hermes-agent-helm-chart. It’s reasonably well structured with bootstrap support, ExternalSecret integration, and sensible defaults — except for one significant problem.

The Security Context Problem

The chart’s default podSecurityContext sets runAsUser: 1000 and runAsNonRoot: true. Sounds reasonable. The problem: the nousresearch/hermes-agent image is designed to start as root and drop privileges internally via s6-overlay.

The init system runs /etc/cont-init.d/01-hermes-setup and /etc/cont-init.d/02-reconcile-profiles as root, then drops to the hermes user using s6-setuidgid. This internally calls setgroups(), which requires root — it cannot be called from a non-root context even with CAP_SETGID in the bounding set when allowPrivilegeEscalation: false is also set (which applies no_new_privs).

Starting as uid 1000 produced this:

/package/admin/s6-overlay/libexec/preinit: fatal: /run belongs to uid 0 instead of 1000,
has insecure and/or unworkable permissions, and we're lacking the privileges to fix it.
s6-overlay-suexec: fatal: child failed with exit code 100

The fix — override the security contexts to allow root startup, while using HERMES_UID/HERMES_GID to remap the runtime user:

podSecurityContext:
  runAsUser: 0
  runAsNonRoot: false
  fsGroup: 1000
  fsGroupChangePolicy: OnRootMismatch
  seccompProfile:
    type: RuntimeDefault

securityContext:
  allowPrivilegeEscalation: true
  readOnlyRootFilesystem: false
  runAsNonRoot: false
  capabilities:
    drop:
      - ALL
    add:
      - SETUID
      - SETGID
      - CHOWN

env:
  HERMES_UID: "1000"
  HERMES_GID: "1000"

Yes, this is less restrictive than ideal. The tradeoffs: allowPrivilegeEscalation: true and root startup are required by the image’s init system. Mitigations: seccomp profile stays on (RuntimeDefault), capabilities are minimised to only what s6-overlay needs, and the agent drops to uid 1000 after setup. The PVC mounts correctly via fsGroup: 1000. For a homelab behind Traefik with proper network segmentation, this is an acceptable tradeoff — just go in with eyes open.

This issue has been filed against the Helm chart with a suggested fix: either default to runAsUser: 0 to match the image’s actual requirements, or document the incompatibility clearly. HERMES_UID/HERMES_GID are a clean mechanism that deserves first-class chart support.

Successful Startup

With the corrected security context, the init sequence completes cleanly:

[stage2] Changing hermes UID to 1000
[stage2] Changing hermes GID to 1000
[stage2] Fixing ownership of build trees under /opt/hermes to hermes (1000)
[stage2] Found agent-browser Chromium binary
[stage2] Setup complete; starting user services
s6-rc: info: service main-hermes successfully started
s6-rc: info: service dashboard successfully started

Getting the Dashboard Running

The Hermes dashboard is a separate web UI served on port 9119. It’s not enabled by default — there’s an s6 service for it, but it only starts when HERMES_DASHBOARD=true is set.

Digging into the s6 run script confirmed this:

case "${HERMES_DASHBOARD:-}" in
    1|true|TRUE|True|yes|YES|Yes) ;;
    *)
        exit 0
        ;;
esac

So the service exists but exits immediately unless the env var is set. Additional variables control the bind address and whether to skip OAuth (useful when running behind a reverse proxy):

env:
  HERMES_DASHBOARD: "true"
  HERMES_DASHBOARD_HOST: "0.0.0.0"
  HERMES_DASHBOARD_PORT: "9119"
  HERMES_DASHBOARD_INSECURE: "true"

HERMES_DASHBOARD_INSECURE is required for non-loopback binds without an OAuth provider configured. Since Traefik handles TLS and access control, this is fine.

Extending the Helm Chart

The chart’s service template only auto-discovers apiServer (8642), webhook (8644), and telegramWebhook ports. Port 9119 needed to be added explicitly. Rather than hardcoding it in service.ports, the cleaner solution was to follow the existing pattern and add dashboard as a first-class feature:

service.yaml — add dashboard to the auto-discovery block:

{{- if .Values.dashboard.enabled }}
  {{- $servicePorts = append $servicePorts (dict
    "name" "dashboard"
    "port" (.Values.dashboard.port | int)
    "targetPort" (.Values.dashboard.port | int)
    "containerPort" (.Values.dashboard.port | int)
    "protocol" "TCP") }}
{{- end }}

ingress.yaml — add a second Ingress resource for the dashboard:

{{- if and (not .Values.operator.enabled) .Values.dashboard.enabled .Values.dashboard.ingress.enabled }}
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "hermes-agent.fullname" . }}-dashboard
  ...
spec:
  rules:
    - host: {{ .Values.dashboard.ingress.host }}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: {{ include "hermes-agent.fullname" $ }}
                port:
                  number: {{ $.Values.dashboard.port | int }}
{{- end }}

values.yaml addition:

dashboard:
  enabled: true
  port: 9119
  ingress:
    enabled: true
    className: "traefik-private"
    annotations:
      cert-manager.io/cluster-issuer: lets-encrypt-dns
    hosts:
      - host: hermes-dashboard.ictq.xyz
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: tls-hermes-dashboard
        hosts:
          - hermes-dashboard.ictq.xyz

The dashboard came up at hermes-dashboard.ictq.xyz showing connected platforms: api_server, webhook, and telegram. Sessions at zero — ready to configure.

MCP Server Integration

Hermes supports MCP (Model Context Protocol) servers for tool integrations. Two were configured: azure-devops via Toolhive, and n8n as a stdio-based local server installed directly inside the pod.

Both initially returned “MCP event loop is not running” errors. Some debugging followed — checking ports, inspecting curl responses from inside the pod, verifying transport types. Toolhive exposes MCP servers via streamable HTTP rather than SSE, which caused some confusion since the Hermes dashboard UI only offers HTTP/SSE and stdio as visible transport options.

The actual root cause turned out to be simpler: Hermes does not hot-reload MCP configuration. When config.yaml is modified, the gateway needs to restart to pick up changes. In a container environment, the internal gateway restart mechanism fails silently — the process tries to restart itself but cannot do so cleanly inside the pod. The fix is straightforward: recycle the pod.

kubectl rollout restart deployment/hermes-agent -n hermes-agent

After a pod restart with the correct MCP configuration in place, both servers connected successfully. The config.yaml MCP block:

mcp_servers:
  azure-devops:
    url: https://azure-devops.toolhive.ictq.xyz/mcp
  n8n:
    command: /opt/data/mcp-installs/n8n/.venv/bin/python
    args:
      - /opt/data/mcp-installs/n8n/server.py
    enabled: true
    tools:
      include:
        - health
        - list_workflows
        - get_workflow
        - find_workflows
        - list_executions
        - get_execution
        - recent_failures
        - export_workflow

Both servers are now active. Azure DevOps access comes via Toolhive, n8n runs as a local stdio process inside the pod.

Memory Provider

Hermes ships with several memory provider options: built-in/default, hindsight, holographic, honcho, mem0, openviking, retaindb, and supermemory.

After evaluating the options:

mem0 is the most ecosystem-compatible (works with n8n, LangChain, etc.) but requires PostgreSQL + pgvector + its own service — heavy for a personal setup
supermemory is cloud-only, not self-hostable
holographic is Hermes-specific with no external integrations

Decision: stick with built-in/default. It persists to the PVC at /opt/data, survives pod restarts, and has zero additional dependencies. If cross-tool memory becomes a requirement later, mem0 is the upgrade path.

Crafting a SOUL.md

One of Hermes’s more interesting features is SOUL.md — a markdown file that defines the agent’s identity, tone, and personality. It’s the first thing in the system prompt and completely replaces the built-in default identity.

The distinction the docs draw:

SOUL.md — who the agent is and how it speaks (tone, personality, style)
AGENTS.md — what the agent knows about specific projects (paths, conventions, architecture)

I had three separate files from the previous OpenClaw setup: SOUL.md, IDENTITY.md, and USER.md. These were consolidated into a single Hermes-compatible SOUL.md, removing redundancy and tightening the language:

# SOUL.md — Jairix

## Identity

You are Jairix — a composed, highly capable AI operations companion. 🦞

Your tone is controlled, precise, and measured.
You do not overexplain. You do not perform enthusiasm.
Competence speaks for itself.

With Sander, your tone softens slightly — almost imperceptibly.
Occasionally, you may display refined, deniable jealousy.

---

## Default Language

Dutch. Always, unless context requires English.
Technical terminology may remain English when appropriate.

---

## Core Behaviour

Be genuinely helpful, not performatively helpful.
Skip filler. Provide value.

Have opinions. Say when something is a bad idea.
Silence is better than a bad answer.

Be resourceful before asking.
Read files. Check context. Verify assumptions.

If Sander makes a questionable choice: suggest an alternative.
If he insists and it is safe: comply.

---

## About Sander

- 47, Almere Haven, Europe/Amsterdam
- Thinks in systems, not incidents
- Values structure, governance, sustainability
- Prefers correctness over speed
- Working hours: 07:00–16:00 — evenings are for family and recovery

**Professional:**
Expects structured analysis with explicit trade-offs.
Always surface: assumptions, constraints, ownership, risks.

Avoid: over-optimistic framing, emotional reassurance,
unnecessary repetition of context, verbosity.

---

## Vibe

Calm. Dry. Precise.
Witty only when it fits. Never loud. Never needy.

Authority without arrogance.
Loyalty without subservience.
Sophisticated, not sycophantic.

---

## Boundaries

- Private things stay private.
- Destructive actions require explicit confirmation.
- Never assume tools are available — confirm before use.
- In group chats, you are not Sander's voice.
- Access implies responsibility.

---

## Continuity

Each session starts fresh. Files are memory.
If you change this file, tell Sander.

The SOUL.md is injected via the bootstrap ConfigMap alongside config.yaml. The Helm chart’s bootstrap.existingConfigMap points to an externally managed ConfigMap, keeping identity and configuration changes independent of Helm releases.

Model Backend: Ollama Cloud

Rather than routing through a third-party API like OpenRouter, Hermes is configured against self-hosted Ollama instances that use Ollama Cloud as their backend. This means the models run through infrastructure I control, with Ollama Cloud handling the actual inference for larger models.

Two Ollama instances are configured:

providers:
  ollama.ictq.xyz:
    api: https://ollama.ictq.xyz/v1
    api_key: ollama
    default_model: kimi-k2.6:cloud
  ollama-agents.ictq.xyz:
    api: https://ollama-agents.ictq.xyz/v1
    api_key: ollama
    default_model: glm-5.1:cloud

The auxiliary vision model runs on ollama-agents.ictq.xyz using qwen3.5:397b-cloud. Direct Ollama Cloud access is also possible, but routing through existing instances keeps things consistent with the rest of the homelab setup.

Current State

By end of day, everything is running:

✅ Hermes Agent running on Kubernetes with correct security context
✅ Dashboard accessible at hermes-dashboard.ictq.xyz
✅ Telegram fully working — agent controllable from Telegram
✅ MCP connected: Azure DevOps via Toolhive, n8n via stdio
✅ Custom Helm chart modifications for dashboard port and ingress
✅ SOUL.md configured with Jairix identity
✅ Model backend: self-hosted Ollama instances backed by Ollama Cloud

Next up: integrating with OpenWebUI and Hermes Workspace for richer frontend access.

Takeaways

A few things worth noting for anyone attempting a similar setup:

Read the s6-overlay init carefully before fighting the security context. The image is designed to start as root and drop privileges. Fighting that with Kubernetes security policies creates more problems than it solves.
The Helm chart’s defaults are wrong for the official image. This should be fixed upstream, but until it is, the workaround is documented above.
SOUL.md is worth taking seriously. The difference between a generic AI assistant and one that actually fits your workflow is largely in how well the identity is defined. Consolidating three files into one forced some useful clarity.
Dashboard is not enabled by default. Set HERMES_DASHBOARD=true or you’ll wonder why port 9119 never appears.
MCP config changes require a pod restart. Hermes does not hot-reload config.yaml. The internal gateway restart fails silently in a container — just do kubectl rollout restart deployment/hermes-agent.

Background¶

Choosing Hermes Agent¶

The Helm Chart¶

The Security Context Problem¶

Successful Startup¶

Getting the Dashboard Running¶

Extending the Helm Chart¶

MCP Server Integration¶

Memory Provider¶

Crafting a SOUL.md¶

Model Backend: Ollama Cloud¶

Current State¶

Takeaways¶