Hey everyone! Just wrapped up an awesome weekend project and wanted to share what I built.
TL;DR: I can now deploy Azure infrastructure by just chatting with an AI - all running locally with open-source models.
What I Built
An AI-powered system where I can literally say “Create an AKS cluster in Azure with 3 nodes” in chat, and it actually deploys real infrastructure. No manual Terraform commands, no OpenAI/Anthropic API calls - everything runs locally with open-source models!
Live demo workflow I just ran:
- Generated Terraform code for Azure Storage Account
- Validated and created execution plan
- Listed all workspaces
- Applied changes to Azure (deployed real infrastructure!)
- Updated storage tier from Hot to Cool (in-place update!)
- Cleaned up empty workspaces
- Destroyed infrastructure when done
All through natural conversation. All with local models. All in under 10 minutes.
The Stack
Infrastructure Layer
- Talos Kubernetes cluster (self-hosted)
- Docker Registry for custom images
- Azure trial tenant (for safe experimentation)
AI & Orchestration
- OpenWebUI (self-hosted AI interface)
- Ollama with gpt-oss:120b-cloud (120B parameter model!)
- Terraform MCP Server (deployed in K8s cluster)
- MCPO (MCP-to-OpenAPI proxy, running in K8s)
Custom Build - Terraform Executor API (Python Flask)
- Validates TF code
- Generates execution plans
- Handles apply/destroy operations
- State management
- Workspace persistence across restarts
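The executor's core is a thin wrapper around the Terraform CLI. Here's a minimal sketch of how such a wrapper could look (function names, paths, and the timeout value are my own for illustration; the real Flask routes would simply call helpers like these):

```python
import subprocess
from pathlib import Path

WORKSPACES = Path("/data/workspaces")  # assumed persistent-volume mount point

def terraform_cmd(action: str, *extra: str) -> list[str]:
    """Build a non-interactive Terraform command line for one lifecycle action."""
    base = {
        "init": ["terraform", "init", "-input=false"],
        "validate": ["terraform", "validate"],
        "plan": ["terraform", "plan", "-input=false", "-out=tfplan"],
        "apply": ["terraform", "apply", "-input=false", "-auto-approve", "tfplan"],
        "destroy": ["terraform", "destroy", "-input=false", "-auto-approve"],
    }[action]
    return base + list(extra)

def run_in_workspace(workspace_id: str, action: str) -> subprocess.CompletedProcess:
    """Run a Terraform action inside a workspace directory, capturing output."""
    workdir = WORKSPACES / workspace_id
    return subprocess.run(terraform_cmd(action), cwd=workdir,
                          capture_output=True, text=True, timeout=600)
```

The `-input=false` and `-auto-approve` flags matter here: the CLI must never block waiting for a human at a terminal, because the approval step happens in chat instead.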
Deployment Tools
- OpenWebUI Custom Tools (6 functions for full TF lifecycle)
- Ingress with proper timeouts (nginx)
- Azure Service Principal for auth
How It Works
Here’s the complete flow:
Step 1: Me: “Deploy a storage account in Azure”
Step 2: Ollama (gpt-oss:120b) uses Terraform MCP to get Azure provider docs
Step 3: AI generates Terraform code
Step 4: OpenWebUI Tool calls Terraform Executor API
Step 5: Executor runs: terraform init, validate, plan
Step 6: AI shows me the plan: “This will create 2 resources…”
Step 7: Me: “Apply it!”
Step 8: Executor runs: terraform apply
Step 9: LIVE INFRASTRUCTURE IN AZURE!
Actual Demo Workflow
Here’s what I demonstrated step by step:
Step 1: Generate & Prepare
Me: “Create Terraform code for an Azure Storage Account named ‘tftest20241011’ in West Europe with hot access tier and LRS replication, then prepare it for deployment. Create workspace name ‘demo-kubecraft-3’ and display the plan.”
AI Response:
- Generated complete Terraform code
- Ran terraform init
- Validated configuration
- Generated execution plan
- Plan: 2 to add, 0 to change, 0 to destroy
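For reference, the generated configuration for that request would look roughly like this (resource and group names are illustrative; the plan's “2 to add” corresponds to the resource group plus the storage account):

```hcl
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "demo" {
  name     = "rg-tftest-demo"
  location = "West Europe"
}

resource "azurerm_storage_account" "demo" {
  name                     = "tftest20241011"
  resource_group_name      = azurerm_resource_group.demo.name
  location                 = azurerm_resource_group.demo.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  access_tier              = "Hot"
}
```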
Step 2: List Workspaces
Me: “Can you list all my workspaces?”
AI presented a beautiful table:
| # | WORKSPACE ID | HAS STATE? | HAS PLAN? | TERRAFORM FILES |
|---|---|---|---|---|
| 1 | demo-kubecraft | No | Yes | main.tf |
| 2 | demo-kubecraft-2 | No | Yes | main.tf |
| 3 | demo-kubecraft-3 | No | Yes | main.tf |
The AI even suggested next actions:
- View details of a specific workspace (terraform_get_workspace_info)
- Apply a plan (terraform_apply)
- Refresh a workspace with new code (terraform_refresh)
- Destroy any of these workspaces (terraform_destroy)
Thought time: Less than a second!
Step 3: Apply Infrastructure
Me: “Please apply workspace ‘demo-kubecraft-3’”
AI Response:
- Terraform Apply Successful!
- Apply complete! Resources: 2 added, 0 changed, 0 destroyed
- Resource Group created in Azure
- Storage Account deployed with Hot tier
Step 4: Cleanup Empty Workspaces
Me: “Please destroy workspaces that have ‘No state’”
AI Response:
- Identified and destroyed 2 empty workspaces
- Cleaned up unused resources
Step 5: Update Configuration
Me: “Update the storage account to use Cool access tier instead of Hot in workspace_id ‘demo-kubecraft-3’”
AI Response:
- Refreshed workspace with new configuration
- Plan: 0 to add, 1 to change, 0 to destroy
- Change detected: access_tier: “Hot” to “Cool”
Step 6: Apply Update
Me: “Please apply workspace ‘demo-kubecraft-3’”
AI Response:
- In-place update successful!
- Storage account updated without destruction
- Zero downtime during update
Step 7: Final Cleanup
Me: “Destroy workspace ‘demo-kubecraft-3’”
AI Response:
- Terraform Destroy Successful
- Destroy complete! Resources: 2 destroyed
- All infrastructure cleaned up from Azure
Key Features
Full Terraform Lifecycle - 6 Custom Tools
- terraform_prepare - Validate & generate plan
- terraform_apply - Deploy infrastructure
- terraform_destroy - Clean up resources
- terraform_list_workspaces - View all active workspaces
- terraform_get_workspace_info - Inspect workspace details
- terraform_refresh - Update existing infra (keeps state!)
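OpenWebUI custom tools are plain Python: a `Tools` class whose typed, docstringed methods become callable functions for the model. Here's a hedged sketch of how two of the six tools could be wired to the executor (the service URL, endpoint paths, and payload shapes are assumptions for illustration):

```python
import json
import urllib.request

EXECUTOR_URL = "http://terraform-executor.default.svc:5000"  # assumed in-cluster service name

class Tools:
    """OpenWebUI custom tool: each public method is exposed to the model,
    with its docstring and type hints used to build the function schema."""

    def _post(self, path: str, payload: dict) -> str:
        """POST JSON to the Terraform Executor API and return the raw response."""
        req = urllib.request.Request(
            f"{EXECUTOR_URL}{path}",
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=600) as resp:
            return resp.read().decode()

    def terraform_prepare(self, workspace_id: str, code: str) -> str:
        """Validate Terraform code and generate an execution plan."""
        return self._post("/prepare", {"workspace_id": workspace_id, "code": code})

    def terraform_apply(self, workspace_id: str) -> str:
        """Apply the previously generated plan for a workspace."""
        return self._post("/apply", {"workspace_id": workspace_id})
```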
Production-Ready Elements
- State persistence (survives pod restarts)
- Workspace management
- Proper error handling
- Timeout handling for long-running operations
- Azure credential management
- In-place updates (no destroy/recreate needed)
100% Local & Open Source
- gpt-oss:120b-cloud via Ollama (no API costs!)
- Also tested with qwen3-coder:480b-cloud
- Terraform MCP Server running in-cluster
- Self-hosted OpenWebUI
- Everything in your own Kubernetes cluster
Architecture Highlights
Terraform MCP Server in Cluster
Instead of relying on external APIs, I deployed the Terraform MCP server directly in my Kubernetes cluster as a separate service with StreamableHTTP transport.
Benefits:
- Fast access to Terraform docs (no external API calls)
- Stays in your cluster (no data leaves your network)
- Always up-to-date provider documentation
- Sub-second response times
This gives the AI real-time access to:
- Latest Terraform provider versions
- Module documentation
- Best practices
- Resource schemas
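A minimal in-cluster deployment for the MCP server could look like the manifest below (the image name, port, and labels are assumptions on my part; check the terraform-mcp-server repo for the actual image and StreamableHTTP settings):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: terraform-mcp
spec:
  replicas: 1
  selector:
    matchLabels: {app: terraform-mcp}
  template:
    metadata:
      labels: {app: terraform-mcp}
    spec:
      containers:
        - name: terraform-mcp
          image: hashicorp/terraform-mcp-server:latest  # assumed image name
          ports:
            - containerPort: 8080  # assumed StreamableHTTP port
---
apiVersion: v1
kind: Service
metadata:
  name: terraform-mcp
spec:
  selector: {app: terraform-mcp}
  ports:
    - port: 8080
      targetPort: 8080
```

With a ClusterIP Service in front, MCPO can reach the server at a stable in-cluster DNS name instead of an external API.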
Challenges Solved
Challenge 1: State Management
- Problem: Workspaces lost after container restart
- Solution: Scan disk on startup, rebuild in-memory index
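The rescan-on-startup fix is simple to sketch: walk the workspace root once and rebuild the same fields the workspace table shows (directory layout and key names here are my own illustration):

```python
from pathlib import Path

def rebuild_workspace_index(root: str) -> dict:
    """Scan the workspace directory on startup and rebuild the in-memory index,
    so workspaces survive container restarts."""
    index = {}
    for d in sorted(Path(root).iterdir()):
        if not d.is_dir():
            continue
        index[d.name] = {
            "has_state": (d / "terraform.tfstate").exists(),
            "has_plan": (d / "tfplan").exists(),
            "tf_files": sorted(f.name for f in d.glob("*.tf")),
        }
    return index
```

Because the index is derived entirely from what's on disk, the persistent volume is the single source of truth and nothing needs to be checkpointed separately.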
Challenge 2: Ingress Timeouts
- Problem: 504 errors during terraform apply
- Solution: Nginx annotations for 10-minute timeouts
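The fix boils down to a few ingress-nginx annotations along these lines (600 seconds matches the 10-minute window mentioned above; tune to taste):

```yaml
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
```

Without these, nginx gives up on the long-running `terraform apply` request before the executor responds, which is exactly what a 504 means.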
Challenge 3: Model Compatibility
- Problem: DeepSeek models crash on parameterless functions
- Solution: Added dummy parameters for universal model support
- Winner: gpt-oss:120b-cloud worked perfectly!
Challenge 4: MCP Integration
- Problem: MCPO couldn’t talk to HTTP MCP servers initially
- Solution: Deployed Terraform MCP as separate K8s service with StreamableHTTP
Challenge 5: Workspace Refresh (In-Place Updates)
- Problem: Original design would destroy/recreate on changes
- Solution: Added terraform_refresh endpoint that keeps state
- Result: Can update storage tier without destroying the account!
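The essence of the refresh endpoint is: replace the code, never touch the state. A rough sketch (function name and file layout are mine; the real endpoint would follow up with `terraform plan` to surface the diff):

```python
from pathlib import Path

def refresh_workspace(workspace_dir: str, new_code: str) -> None:
    """Overwrite the Terraform code in an existing workspace while keeping its
    state file, so the next plan shows an in-place update, not destroy/recreate."""
    ws = Path(workspace_dir)
    state = ws / "terraform.tfstate"
    if not state.exists():
        raise FileNotFoundError("refresh only makes sense for a workspace with state")
    (ws / "main.tf").write_text(new_code)
    # The real endpoint would now run `terraform plan` so the change
    # (e.g. access_tier: "Hot" -> "Cool") can be shown to the user.
```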
Time Investment
One weekend of hacking. Totally worth it!
Model Performance Notes
gpt-oss:120b-cloud (WINNER)
- Excellent tool calling
- Understands complex Terraform concepts
- Fast inference via Ollama
- Best overall experience
qwen3-coder:480b-cloud (SOLID)
- Great for code generation
- Solid tool calling support
- Massive parameter count pays off
deepseek-v3.1:671b-cloud (QUIRKY)
- Issues with parameterless functions
- Works after adding dummy params
- Still usable but quirky
What’s Next?
Possible improvements:
- N8N integration for approval workflows
- Cost estimation before apply
- Terraform state backend (Azure Storage)
- Drift detection
- Auto-documentation of deployed resources
Key Takeaways
Local models are production-ready - 120B models rival GPT-4 for this use case
MCP in-cluster = fast - No external API latency
OpenWebUI is underrated - Self-hosted, flexible, great for custom tools
Kubernetes makes everything easier - Deploy once, scale forever
State management matters - Workspace persistence was crucial
In-place updates save time - Refresh beats destroy/recreate
Weekend projects are the best - Built something that works!
Cost Analysis
Traditional approach (with GPT-4 API)
- Around $0.01 per request
- 7 operations in demo = $0.07
- Monthly usage: Around $20-50
My approach (all local)
- $0 per request
- Only cost: electricity for GPU/CPU
- One-time hardware investment
Tech Stack Summary
Frontend:
- OpenWebUI (self-hosted)
AI:
- Ollama with gpt-oss:120b-cloud / qwen3-coder:480b-cloud
MCP:
- Terraform MCP Server (in-cluster)
Backend:
- Python Flask API
IaC:
- Terraform 1.10
Cloud:
- Azure (trial tenant)
Infrastructure:
- Talos Kubernetes
- Docker + private registry
- Nginx with extended timeouts
Want To Try This?
All components are open source:
- Terraform MCP: github.com/hashicorp/terraform-mcp-server
- OpenWebUI: github.com/open-webui/open-webui
- MCPO: github.com/open-webui/mcpo
- Ollama: ollama.ai
- Talos: github.com/siderolabs/talos
The custom Terraform Executor is straightforward:
- Python Flask API
- Terraform CLI wrapper
- Basic state management
- Docker container (the whole thing is around 100 lines of Python)
Resources
Models Used
- gpt-oss:120b-cloud (primary)
- qwen3-coder:480b-cloud (alternative)
Talos Cluster Resource Usage
- OpenWebUI: Around 2GB RAM
- MCPO: Around 512MB RAM
- Terraform MCP: Around 256MB RAM
- Terraform Executor: Around 512MB RAM
- Ollama: 24GB VRAM (for 120B model)
Total cluster footprint: Around 3.5GB RAM + 24GB VRAM
Closing Thoughts
Want to see a demo? Drop a comment!
This was a fun weekend hack that turned into something genuinely useful. Infrastructure as conversation with 100% local models is wild!
No API keys. No cloud AI services. Just open-source models and Kubernetes.
This is the future of DevOps.
Built with coffee, lo-fi beats, and local GPU power during a rainy weekend.
Last updated: October 11, 2025