There are two kinds of people building AI assistants right now: those who pipe their data straight into someone else’s cloud, and those who’d rather not. I fall firmly in the second camp.

The result is Jairix. 🦞


What is OpenClaw?

OpenClaw is a framework for building autonomous AI agents. It runs locally, connects to your own models, and supports a multi-agent architecture where specialized agents collaborate on complex tasks. No subscription, no data sharing, no dependency on an external provider.

The core idea is straightforward: configure a set of agents, give each of them tools and a workspace, and let them work together. Orchestration, memory, and context are entirely under your control.


The hardware

Jairix runs on an HP Envy x360 15-ed1555nd — a consumer laptop with 32 GB of RAM, sitting on my home network. No server racks, no dedicated GPU machine. Just a laptop on a desk.

For model inference, I run two dedicated Ollama instances elsewhere in my homelab, reachable via internal URLs. The heavy lifting happens there. But the brain — the orchestration, memory, tools, and decisions — all lives on that HP Envy.


The models

This is one of the parts that surprises people. There’s no OpenAI, no Anthropic API. Everything routes through Ollama.

Ollama Pro — $20/month for serious models

Most of the models in my setup carry a :cloud suffix in their IDs. That’s not a local weight — that’s Ollama Pro, a $20/month plan that gives you access to cloud-hosted open models through the exact same Ollama API you’d use locally. No new SDK, no new authentication flow, no vendor lock-in. You point OpenClaw at your Ollama endpoint and it just works, whether the model is running on your own GPU or on Ollama’s infrastructure.
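To make "it just works" concrete: Ollama's `/api/chat` endpoint accepts the same request body whether the model is local or cloud-hosted — only the model id changes. A minimal sketch in Python (the model ids here are illustrative, not an exact list from my config):

```python
import json

def build_chat_request(model: str, prompt: str) -> dict:
    """Request body for Ollama's /api/chat endpoint. Whether `model`
    is a local weight or carries a ":cloud" suffix, the shape is
    identical -- the endpoint does the routing."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

# Illustrative model ids -- one local, one cloud-hosted.
local = build_chat_request("qwen3-coder", "Summarize STATE.md")
cloud = build_chat_request("deepseek-v3.1:cloud", "Summarize STATE.md")
assert local.keys() == cloud.keys()  # same API shape, different model id
print(json.dumps(cloud, indent=2))
```

Swapping a local model for a cloud one is a one-line config change, which is the whole point.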

The practical upshot: I get access to massive open models like DeepSeek V3.1 (671B parameters) and Mistral Large 3 (675B) without owning the hardware to run them. Ollama hosts them on NVIDIA infrastructure, guarantees no logging and no training on your data, and charges based on actual GPU time rather than a fixed token cap. For the kind of multi-agent workloads Jairix runs — coordinated research pipelines, long document generation, complex planning tasks — this is a significantly better deal than paying per-token to a closed provider.

I maintain two Ollama endpoints with different roles:

General endpoint — used by the main agent for everyday tasks:

Model | Parameters | Notes
---|---|---
Kimi K2.5 | | Primary default model
Nemotron 3 Super | | Reasoning, first fallback
GLM-5 | | Fast fallback
MiniMax M2 / M2.5 | 230B | High-capacity fallback
Qwen3.5 | 397B (A17B active) | Reasoning + vision
Qwen3 VL | 235B | Vision tasks
Qwen3-Coder-Next | | Code generation
Nemotron 3 Nano | 30B | 1M token context window

Agents endpoint — dedicated to subagent workloads, with heavier models and higher concurrency:

Model | Parameters | Notes
---|---|---
DeepSeek V3.1 | 671B | Research and reasoning tasks
Kimi K2.5 / Thinking | | Planning, thinking variant available
Mistral Large 3 | 675B | Writer, vision fallback
Ministral 3 | 14B | Lightweight agent tasks
Devstral Small 2 | 24B | Code agent (coder subagent)
GLM-5 | | Writer, assembler, devops primary
Qwen3 VL | 235B | Vision agent primary
MiniMax M2 / M2.5 | 230B | Fallbacks
Nemotron 3 Nano | 30B | Extended context tasks

The default primary model across the system is Kimi K2.5, with a fallback chain of Nemotron 3 Super → GLM-5 → MiniMax M2 → DeepSeek V3.1 → Qwen3.5. Each specialist agent has its own primary and fallbacks tuned to its task — the coder agent leads with Qwen3-Coder-Next and Devstral, the researcher with Qwen3.5 and Kimi Thinking, the vision agent with Qwen3 VL.
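A fallback chain is nothing more than an ordered list plus a retry loop. This isn't OpenClaw's internal routing code — just the shape of the pattern, sketched in Python with made-up model id strings matching the chain above:

```python
# The main agent's chain, as described above (ids are illustrative strings).
FALLBACK_CHAIN = ["kimi-k2.5", "nemotron-3-super", "glm-5",
                  "minimax-m2", "deepseek-v3.1", "qwen3.5"]

def call_with_fallbacks(prompt, chain, call_model):
    """Try each model in order; return (model, reply) from the first
    one that succeeds. `call_model(model, prompt)` is any callable
    that raises on timeout, overload, or a malformed reply."""
    last_error = None
    for model in chain:
        try:
            return model, call_model(model, prompt)
        except Exception as err:
            last_error = err  # note the failure, move down the chain
    raise RuntimeError(f"all models in chain failed, last error: {last_error}")
```

The same loop serves every specialist — the coder agent just passes a different chain.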

For embeddings and semantic memory search, a local Qwen3 Embedding 8B runs on the same machine as Jairix itself — that one actually is local, no cloud involved.


The architecture

Jairix isn’t a single model answering questions. It’s a coordinated system of specialists.

The main agent handles orchestration. It reads context, decides what needs to happen, and delegates to the right specialist. It doesn’t do research itself. It doesn’t write code itself. It coordinates.

The specialist agents each own a specific domain:

Agent | Primary model | Responsibility
---|---|---
planner | Kimi K2.5 Thinking | Breaking down large tasks, orchestrating workflows
researcher | Qwen3.5 397B | Information gathering, fact verification, comparisons
writer | GLM-5 | Turning research into structured documents
assembler | MiniMax M2.5 | Combining outputs into final deliverables
coder | Qwen3-Coder-Next | Scripts, fixes, implementations
vision | Qwen3 VL 235B | Image analysis, OCR, diagram interpretation
devops | GLM-5 | Infrastructure design, CI/CD, cluster configuration
operations | Kimi K2.5 | External actions: email, calendar, APIs (requires explicit approval)

Each agent runs in an isolated Docker sandbox with its own workspace, its own model routing, and strictly scoped tool access. A researcher can search the web and write files. It cannot send email. An operations agent can send email. It cannot spawn other agents.

This isn’t just tidy architecture — it’s a meaningful safety boundary. Destructive or external actions always require a human approval step.
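The scoping logic reduces to an allow-list per agent plus an approval gate for external actions. A simplified sketch — the tool names and scopes here are illustrative, not my full configuration:

```python
# Illustrative allow-lists -- the boundary is what's NOT in the set.
TOOL_SCOPES = {
    "researcher": {"web_search", "write_file"},
    "operations": {"send_email", "calendar"},
}

# External or destructive tools always need a human in the loop.
NEEDS_APPROVAL = {"send_email", "delete_file"}

def authorize(agent: str, tool: str, approved: bool = False) -> bool:
    """Allow a tool call only if it is in the agent's scope and,
    for gated tools, explicitly approved by a human."""
    if tool not in TOOL_SCOPES.get(agent, set()):
        return False  # out of scope, no matter what
    if tool in NEEDS_APPROVAL and not approved:
        return False  # in scope, but waiting on approval
    return True

assert authorize("researcher", "web_search")
assert not authorize("researcher", "send_email")          # out of scope
assert not authorize("operations", "send_email")          # needs approval
assert authorize("operations", "send_email", approved=True)
```

Default-deny is the design choice that matters: an agent missing from the table can do nothing at all.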


Memory and continuity

One of the harder problems with local agents is continuity across sessions. Models don’t remember anything by default.

OpenClaw solves this through a layered memory system:

  • STATE.md — a live working context file, capped at 3KB, updated frequently. This is the agent’s “what am I doing right now” file. It survives compaction and resets.
  • memory/YYYY-MM-DD.md — daily operational logs with session details, decisions, and discoveries.
  • MEMORY.md — curated long-term memory, loaded only in the main session. Personal facts, stable preferences, ongoing projects.

When context gets long and approaches compaction, the agent runs a memory flush: update STATE.md first, then write session details to the daily log. The result is that even after a full context reset, Jairix wakes up with working context intact.
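The flush itself is deliberately boring file I/O. A sketch of the two steps, with paths matching the layout above (the helper and its demo strings are mine, not OpenClaw's actual code):

```python
from datetime import date
from pathlib import Path
import tempfile

STATE_CAP = 3 * 1024  # STATE.md stays under 3 KB

def memory_flush(workdir: Path, state: str, session_notes: str) -> Path:
    """Write working context to STATE.md (truncated to the cap), then
    append session details to today's daily log. Returns the log path."""
    workdir.mkdir(parents=True, exist_ok=True)
    (workdir / "STATE.md").write_text(state[:STATE_CAP], encoding="utf-8")
    log = workdir / "memory" / f"{date.today():%Y-%m-%d}.md"
    log.parent.mkdir(exist_ok=True)
    with log.open("a", encoding="utf-8") as f:
        f.write(session_notes + "\n")
    return log

# Demo in a throwaway directory: STATE.md is overwritten each flush,
# while the daily log accumulates.
demo = Path(tempfile.mkdtemp())
log = memory_flush(demo, "x" * 10_000, "session 1: planned Barcelona leg")
memory_flush(demo, "short state", "session 2: verified LEZ rules")
```

STATE.md is overwrite-in-place by design: after a context reset, the agent reads one small, current file instead of replaying history.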

Semantic memory search runs on the local Qwen3 Embedding model with a hybrid retrieval strategy: 70% vector similarity, 30% text matching, with MMR re-ranking and temporal decay so recent context is weighted higher. All of it local, all of it indexed on-device.
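The scoring formula is simple to state in code. This sketch shows only the 70/30 blend and the temporal decay — MMR re-ranking is omitted, and the half-life is an illustrative number, not my actual setting:

```python
def hybrid_score(vector_sim: float, text_score: float,
                 age_days: float, half_life_days: float = 30.0) -> float:
    """Blend vector and text relevance (70/30), then apply exponential
    temporal decay so recent memories outrank stale ones."""
    blended = 0.7 * vector_sim + 0.3 * text_score
    decay = 0.5 ** (age_days / half_life_days)
    return blended * decay

# A fresh, moderately relevant memory can beat an old, perfect match.
fresh = hybrid_score(vector_sim=0.8, text_score=0.6, age_days=1)
stale = hybrid_score(vector_sim=1.0, text_score=1.0, age_days=120)
assert fresh > stale
```

The decay term is what keeps "what am I doing this week" ahead of "what I said in March" without deleting anything.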


Planning a 45-day road trip through Spain

This is where things get concrete.

Earlier this year I asked Jairix to help plan our 2026 camper trip. The parameters: leave early April, return mid-May, travel through Luxembourg, France, and Spain with the family, mix culture with coast, keep it manageable.

What followed was a full multi-agent research workflow.

The planner broke down the task: route structure first, then per-destination research, then campsite selection, then a combined document. The researcher went to work on each destination — pulling weather data for April, finding campsite options near city centers, checking low-emission zones (critical for a diesel camper in French cities), verifying opening hours for attractions, locating supermarkets within range of each overnight stop.

The writer converted that research into structured destination guides. The assembler combined everything into a final travel plan.
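Stripped of the sandboxing and model routing, that workflow is a chain in which each specialist's output becomes the next one's input. A stub sketch of the control flow — the callables here stand in for real sandboxed subagents:

```python
def run_pipeline(task: str, agents: dict) -> str:
    """Chain specialists in order; each agent's output feeds the next.
    `agents` maps a role name to a callable (stubs here -- in the real
    system each call fans out to a sandboxed subagent)."""
    artifact = task
    for role in ("planner", "researcher", "writer", "assembler"):
        artifact = agents[role](artifact)
    return artifact

# Stub agents that just tag the artifact as it moves through the chain.
stubs = {role: (lambda r: lambda text: f"{text} -> {r}")(role)
         for role in ("planner", "researcher", "writer", "assembler")}

print(run_pipeline("plan 45-day Spain trip", stubs))
# plan 45-day Spain trip -> planner -> researcher -> writer -> assembler
```

In practice the per-destination research fans out in parallel, but the hand-off discipline — write a file, pass it on — is exactly this.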

The result is what you can see at blackycamperbus.nl/reisplan: 45 nights, 22 destinations, 5,319 kilometers. Luxembourg → Lyon → Béziers → Barcelona → Tarragona → Peñíscola → Valencia → Alicante → Cartagena → Cabo de Gata → Almería → Granada → Málaga → Córdoba → Sevilla → Mérida → Cáceres → Salamanca → Burgos → Bilbao → San Sebastián → Bordeaux → home.

Every destination includes a researched mini-itinerary, campsite with GPS coordinates, supermarkets within walking or cycling distance, public transport options, weather expectations for the relevant month, and local highlights. The Crit’Air sticker requirement for France was flagged automatically. Low-emission zone warnings are included per city.

That’s not a ChatGPT conversation. That’s a coordinated research pipeline that ran unattended, produced structured outputs, and assembled them into a usable document — all through a single Ollama API, all on open models.


What’s actually impressive about this

Not the models. The models are good but they’re not magic. What’s impressive is the system around them.

The fact that a researcher agent can be told “find campsite options within 5km of the city center for a diesel camper, check for LEZ restrictions, include nearest supermarkets” — and actually do it, write the result to a file, and hand it off to a writer — that’s the value. The orchestration. The memory. The separation of concerns.

It’s also the fact that the entire stack — OpenClaw, Ollama, the models — uses open infrastructure. There’s no API key to a closed provider. No terms of service that change what you can build. No risk of a pricing change making your setup unaffordable overnight.


The practical reality

I won’t pretend it’s frictionless. Open models are not GPT-4o. Some tasks require nudging. Occasionally a subagent produces output that needs a retry. Context management on long sessions requires attention.

But the architecture is sound, and it keeps improving. The multi-agent pattern genuinely works for complex tasks that benefit from decomposition. The memory system genuinely provides continuity. The sandbox isolation genuinely prevents accidents.

And every improvement I make — every prompt refinement, every tool addition, every workflow pattern — stays in my infrastructure. It doesn’t get absorbed into someone else’s training pipeline.


Try it yourself

If you’re interested in running your own setup:

The entry point is lower than you’d expect. A machine with 16–32 GB of RAM, an Ollama Pro account for the heavier models, and OpenClaw on top is enough to get started. The rest is configuration and iteration.

Jairix leaves for Spain on April 3rd. The plan is ready. 🦞