Hey everyone! Just wrapped up an awesome weekend project and wanted to share what I built.

TL;DR: I can now deploy Azure infrastructure by just chatting with an AI - all running locally with open-source models.

What I Built

An AI-powered system where I can literally say “Create an AKS cluster in Azure with 3 nodes” in chat, and it actually deploys real infrastructure. No manual Terraform commands, no OpenAI/Anthropic API calls - everything runs locally!

Live demo workflow I just ran:

  • Generated Terraform code for Azure Storage Account
  • Validated and created execution plan
  • Listed all workspaces
  • Applied changes to Azure (deployed real infrastructure!)
  • Updated storage tier from Hot to Cool (in-place update!)
  • Cleaned up empty workspaces
  • Destroyed infrastructure when done

All through natural conversation. All with local models. All in under 10 minutes.

The Stack

Infrastructure Layer

  • Talos Kubernetes cluster (self-hosted)
  • Docker Registry for custom images
  • Azure trial tenant (for safe experimentation)

AI & Orchestration

  • OpenWebUI (self-hosted AI chat interface)
  • Ollama with gpt-oss:120b-cloud (120B parameter model!)
  • Terraform MCP Server (deployed in K8s cluster)
  • MCPO (MCP-to-OpenAPI proxy, running in K8s)

Custom Build - Terraform Executor API (Python Flask)

  • Validates TF code
  • Generates execution plans
  • Handles apply/destroy operations
  • State management
  • Workspace persistence across restarts
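I haven’t cleaned the code up for publishing yet, but here’s a minimal sketch of the idea - the endpoint paths, the /data/workspaces layout, and the service details are illustrative assumptions, not the exact implementation:

  # Minimal sketch of the Terraform Executor API (illustrative, not the exact
  # code). Azure auth comes from the Service Principal via the standard azurerm
  # env vars (ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_TENANT_ID, ARM_SUBSCRIPTION_ID).
  import subprocess
  from pathlib import Path

  from flask import Flask, jsonify, request

  app = Flask(__name__)
  WORKSPACES = Path("/data/workspaces")  # persistent volume: survives pod restarts


  def run_tf(ws: Path, *args: str) -> subprocess.CompletedProcess:
      """Run a terraform subcommand inside a workspace directory."""
      return subprocess.run(["terraform", *args], cwd=ws,
                            capture_output=True, text=True, timeout=600)


  @app.post("/workspaces/<ws_id>/prepare")
  def prepare(ws_id: str):
      """Write the submitted HCL, then init + validate + plan."""
      ws = WORKSPACES / ws_id
      ws.mkdir(parents=True, exist_ok=True)
      (ws / "main.tf").write_text(request.json["code"])
      for step in (["init", "-input=false"], ["validate"], ["plan", "-out=tfplan"]):
          result = run_tf(ws, *step)
          if result.returncode != 0:
              return jsonify({"ok": False, "step": step[0], "error": result.stderr}), 400
      return jsonify({"ok": True, "plan": result.stdout})


  @app.post("/workspaces/<ws_id>/apply")
  def apply_ws(ws_id: str):
      """Apply the previously saved plan (non-interactive by definition)."""
      result = run_tf(WORKSPACES / ws_id, "apply", "tfplan")
      return jsonify({"ok": result.returncode == 0,
                      "output": result.stdout or result.stderr})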

Deployment Tools

  • OpenWebUI Custom Tools (6 functions for full TF lifecycle)
  • Ingress with proper timeouts (nginx)
  • Azure Service Principal for auth

How It Works

Here’s the complete flow:

Step 1: Me: “Deploy a storage account in Azure”

Step 2: Ollama (gpt-oss:120b) uses Terraform MCP to get Azure provider docs

Step 3: AI generates Terraform code

Step 4: OpenWebUI Tool calls Terraform Executor API

Step 5: Executor runs: terraform init, validate, plan

Step 6: AI shows me the plan: “This will create 2 resources…”

Step 7: Me: “Apply it!”

Step 8: Executor runs: terraform apply

Step 9: LIVE INFRASTRUCTURE IN AZURE!
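To make steps 4-8 concrete, here’s roughly what the tool layer ends up sending to the executor. The endpoint names follow the sketch in the stack section, and the HCL is a representative example rather than the exact code the model generated:

  # Representative example of the two calls behind steps 4-8 (the in-cluster
  # service URL is an assumption).
  import requests

  EXECUTOR = "http://terraform-executor.tools.svc:5000"

  tf_code = """
  provider "azurerm" {
    features {}
  }

  resource "azurerm_resource_group" "demo" {
    name     = "rg-demo"
    location = "West Europe"
  }

  resource "azurerm_storage_account" "demo" {
    name                     = "tftest20241011"
    resource_group_name      = azurerm_resource_group.demo.name
    location                 = azurerm_resource_group.demo.location
    account_tier             = "Standard"
    account_replication_type = "LRS"
    access_tier              = "Hot"
  }
  """

  # Steps 4-6: validate and plan ("Plan: 2 to add, 0 to change, 0 to destroy")
  plan = requests.post(f"{EXECUTOR}/workspaces/demo-kubecraft-3/prepare",
                       json={"code": tf_code}, timeout=600).json()
  print(plan["plan"])

  # Steps 7-8: only after an explicit "Apply it!" does the tool call /apply
  result = requests.post(f"{EXECUTOR}/workspaces/demo-kubecraft-3/apply",
                         timeout=600).json()
  print(result["output"])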

Actual Demo Workflow

Here’s what I demonstrated step by step:

Step 1: Generate & Prepare

Me: “Create Terraform code for an Azure Storage Account named ‘tftest20241011’ in West Europe with hot access tier and LRS replication, then prepare it for deployment. Create a workspace named ‘demo-kubecraft-3’ and display the plan.”

AI Response:

  • Generated complete Terraform code
  • Ran terraform init
  • Validated configuration
  • Generated execution plan
  • Plan: 2 to add, 0 to change, 0 to destroy

Step 2: List Workspaces

Me: “Can you list all my workspaces?”

The AI presented a beautiful table:

#   WORKSPACE ID      HAS STATE?   HAS PLAN?   TERRAFORM FILES
1   demo-kubecraft    No           Yes         main.tf
2   demo-kubecraft-2  No           Yes         main.tf
3   demo-kubecraft-3  No           Yes         main.tf

The AI even suggested next actions:

  • View details of a specific workspace (terraform_get_workspace_info)
  • Apply a plan (terraform_apply)
  • Refresh a workspace with new code (terraform_refresh)
  • Destroy any of these workspaces (terraform_destroy)

Thinking time: less than a second!

Step 3: Apply Infrastructure

Me: “Please apply workspace ‘demo-kubecraft-3’”

AI Response:

  • Terraform Apply Successful!
  • Apply complete! Resources: 2 added, 0 changed, 0 destroyed
  • Resource Group created in Azure
  • Storage Account deployed with Hot tier

Step 4: Cleanup Empty Workspaces

Me: “Please destroy workspaces that have ‘No state’”

AI Response:

  • Identified and destroyed 2 empty workspaces
  • Cleaned up unused resources

Step 5: Update Configuration

Me: “Update the storage account to use Cool access tier instead of Hot in workspace_id ‘demo-kubecraft-3’”

AI Response:

  • Refreshed workspace with new configuration
  • Plan: 0 to add, 1 to change, 0 to destroy
  • Change detected: access_tier: “Hot” to “Cool”

Step 6: Apply Update

Me: “Please apply workspace ‘demo-kubecraft-3’”

AI Response:

  • In-place update successful!
  • Storage account updated without destruction
  • Zero downtime during update

Step 7: Final Cleanup

Me: “Destroy workspace ‘demo-kubecraft-3’”

AI Response:

  • Terraform Destroy Successful
  • Destroy complete! Resources: 2 destroyed
  • All infrastructure cleaned up from Azure

Key Features

Full Terraform Lifecycle - 6 Custom Tools

  • terraform_prepare - Validate & generate plan
  • terraform_apply - Deploy infrastructure
  • terraform_destroy - Clean up resources
  • terraform_list_workspaces - View all active workspaces
  • terraform_get_workspace_info - Inspect workspace details
  • terraform_refresh - Update existing infra (keeps state!)
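For anyone who hasn’t written OpenWebUI tools before: each one is just a typed method on a Tools class, and OpenWebUI builds the function-calling schema from the signature and docstring. Here’s a trimmed sketch of two of the six (the executor URL is an assumed in-cluster name):

  # Sketch of two of the six tools. OpenWebUI exposes methods of a `Tools`
  # class to the model; type hints and docstrings become the call schema.
  import requests

  EXECUTOR = "http://terraform-executor.tools.svc:5000"  # assumed service URL


  class Tools:
      def terraform_prepare(self, workspace_id: str, terraform_code: str) -> str:
          """Validate Terraform code and generate an execution plan.
          :param workspace_id: Workspace to create or reuse.
          :param terraform_code: Complete Terraform (HCL) configuration.
          """
          r = requests.post(f"{EXECUTOR}/workspaces/{workspace_id}/prepare",
                            json={"code": terraform_code}, timeout=600)
          return r.json().get("plan") or r.text

      def terraform_apply(self, workspace_id: str) -> str:
          """Apply the saved plan for a workspace (deploys real infrastructure).
          :param workspace_id: Workspace whose plan should be applied.
          """
          r = requests.post(f"{EXECUTOR}/workspaces/{workspace_id}/apply",
                            timeout=600)
          return r.json().get("output") or r.text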

Production-Ready Elements

  • State persistence (survives pod restarts)
  • Workspace management
  • Proper error handling
  • Timeout handling for long-running operations (see the sketch after this list)
  • Azure credential management
  • In-place updates (no destroy/recreate needed)
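The timeout and error handling is mostly about never letting the model see a bare 504. A sketch of the pattern, under the same assumptions as the executor snippet:

  # Surface timeouts and terraform failures as structured results the model
  # can read, instead of letting the request die with a gateway error (sketch).
  import subprocess


  def safe_run_tf(workspace: str, *args: str, timeout: int = 600) -> dict:
      try:
          result = subprocess.run(["terraform", *args], cwd=workspace,
                                  capture_output=True, text=True, timeout=timeout)
      except subprocess.TimeoutExpired:
          return {"ok": False, "error": f"terraform {args[0]} exceeded {timeout}s"}
      if result.returncode != 0:
          # stderr carries the useful provider and auth errors - pass it along
          return {"ok": False, "error": result.stderr.strip()}
      return {"ok": True, "output": result.stdout}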

100% Local & Open Source

  • gpt-oss:120b-cloud via Ollama (no API costs!)
  • Also tested with qwen3-coder:480b-cloud
  • Terraform MCP Server running in-cluster
  • Self-hosted OpenWebUI
  • Everything in your own Kubernetes cluster

Architecture Highlights

Terraform MCP Server in Cluster

Instead of relying on external APIs, I deployed the Terraform MCP server directly in my Kubernetes cluster as a separate service with StreamableHTTP transport.

Benefits:

  • Fast access to Terraform docs (no external API calls)
  • Stays in your cluster (no data leaves your network)
  • Always up-to-date provider documentation
  • Sub-second response times

This gives the AI real-time access to:

  • Latest Terraform provider versions
  • Module documentation
  • Best practices
  • Resource schemas
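Under the hood the model reaches those docs through MCPO, which fronts each MCP tool with a plain REST route. Something like the following - the route shape follows MCPO’s convention, but the tool and parameter names are my recollection of the terraform-mcp-server, so treat them as approximate:

  # Fetching provider docs through the MCPO proxy (service name assumed; tool
  # and parameter names are approximate, not verified against the exact server
  # version I deployed).
  import requests

  MCPO = "http://mcpo.tools.svc:8000"

  # Resolve which doc covers azurerm_storage_account...
  doc_ids = requests.post(f"{MCPO}/terraform/resolveProviderDocID", json={
      "provider_name": "azurerm",
      "provider_namespace": "hashicorp",
      "service_slug": "storage_account",
      "provider_data_type": "resources",
  }, timeout=30).json()

  # ...then pull the full resource docs the model writes HCL from
  # (response shape simplified; plug in an ID from the previous call).
  docs = requests.post(f"{MCPO}/terraform/getProviderDocs",
                       json={"provider_doc_id": "<id from previous call>"},
                       timeout=30).json()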

Challenges Solved

Challenge 1: State Management

  • Problem: Workspaces lost after container restart
  • Solution: Scan the disk on startup and rebuild the in-memory index (sketch below)
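Since every workspace already lives on a persistent volume, recovery is just a directory walk. A minimal sketch, assuming the layout from the executor snippet:

  # Rebuild the in-memory workspace index from disk at startup (sketch; this
  # is also where the HAS STATE? / HAS PLAN? columns in the table come from).
  from pathlib import Path

  WORKSPACES = Path("/data/workspaces")


  def rebuild_index() -> dict:
      index = {}
      for ws in sorted(p for p in WORKSPACES.iterdir() if p.is_dir()):
          index[ws.name] = {
              "has_state": (ws / "terraform.tfstate").exists(),
              "has_plan": (ws / "tfplan").exists(),
              "tf_files": [f.name for f in ws.glob("*.tf")],
          }
      return index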

Challenge 2: Ingress Timeouts

  • Problem: 504 errors during terraform apply
  • Solution: nginx ingress annotations (nginx.ingress.kubernetes.io/proxy-read-timeout and proxy-send-timeout) raised to 600s (10 minutes)

Challenge 3: Model Compatibility

  • Problem: DeepSeek models crash on parameterless functions
  • Solution: Added a dummy parameter for universal model support (example below)
  • Winner: gpt-oss:120b-cloud worked perfectly!
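The workaround is as dumb as it sounds - give the function an argument it never reads:

  # Zero-argument tools made DeepSeek emit malformed calls, so every such tool
  # gets a throwaway parameter (sketch; the executor URL and GET route are
  # assumptions matching the earlier snippets).
  import requests

  EXECUTOR = "http://terraform-executor.tools.svc:5000"


  class Tools:
      def terraform_list_workspaces(self, dummy: str = "list") -> str:
          """List all workspaces with their state/plan status.
          :param dummy: Ignored - exists only so models that can't call
                        parameterless functions have something to fill in.
          """
          return requests.get(f"{EXECUTOR}/workspaces", timeout=30).text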

Challenge 4: MCP Integration

  • Problem: MCPO couldn’t talk to HTTP MCP servers initially
  • Solution: Deployed Terraform MCP as separate K8s service with StreamableHTTP

Challenge 5: Workspace Refresh (In-Place Updates)

  • Problem: Original design would destroy/recreate on changes
  • Solution: Added a terraform_refresh endpoint that keeps state (sketched below)
  • Result: Can update storage tier without destroying the account!
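The trick is to reuse the workspace directory instead of creating a fresh one: the new HCL overwrites main.tf, but terraform.tfstate stays put, so the next plan is a diff rather than a rebuild. Extending the executor sketch from earlier (same caveat: illustrative, not the exact code):

  # terraform_refresh endpoint (reuses app, run_tf and WORKSPACES from the
  # Flask executor sketch above).
  from flask import jsonify, request


  @app.post("/workspaces/<ws_id>/refresh")
  def refresh(ws_id: str):
      ws = WORKSPACES / ws_id                  # existing dir; tfstate stays put
      (ws / "main.tf").write_text(request.json["code"])  # swap in the new HCL
      run_tf(ws, "init", "-input=false")
      result = run_tf(ws, "plan", "-out=tfplan")         # diff against state
      # e.g. access_tier "Hot" -> "Cool" shows as "1 to change", not a rebuild
      return jsonify({"ok": result.returncode == 0, "plan": result.stdout})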

Time Investment

One weekend from idea to working demo - totally worth it!

Model Performance Notes

gpt-oss:120b-cloud (WINNER)

  • Excellent tool calling
  • Understands complex Terraform concepts
  • Fast inference via Ollama
  • Best overall experience

qwen3-coder:480b-cloud (SOLID)

  • Great for code generation
  • Solid tool calling support
  • Massive parameter count pays off

deepseek-v3.1:671b-cloud (QUIRKY)

  • Issues with parameterless functions
  • Works after adding dummy params
  • Still usable but quirky

What’s Next?

Possible improvements:

  • N8N integration for approval workflows
  • Cost estimation before apply
  • Terraform state backend (Azure Storage)
  • Drift detection
  • Auto-documentation of deployed resources

Key Takeaways

  1. Local models are production-ready - 120B models rival GPT-4 for this use case

  2. MCP in-cluster = fast - No external API latency

  3. OpenWebUI is underrated - Self-hosted, flexible, great for custom tools

  4. Kubernetes makes everything easier - Deploy once, scale forever

  5. State management matters - Workspace persistence was crucial

  6. In-place updates save time - Refresh beats destroy/recreate

  7. Weekend projects are the best - Built something that works!

Cost Analysis

Traditional approach (with GPT-4 API)

  • Around $0.01 per request
  • 7 operations in demo = $0.07
  • Monthly usage: Around $20-50

My approach (all local)

  • $0 per request
  • Only cost: electricity for GPU/CPU
  • One-time hardware investment

Tech Stack Summary

Frontend:

  • OpenWebUI (self-hosted)

AI:

  • Ollama with gpt-oss:120b-cloud / qwen3-coder:480b-cloud

MCP:

  • Terraform MCP Server (in-cluster)

Backend:

  • Python Flask API

IaC:

  • Terraform 1.10

Cloud:

  • Azure (trial tenant)

Infrastructure:

  • Talos Kubernetes
  • Docker + private registry
  • Nginx with extended timeouts

Want To Try This?

All components are open source.

The custom Terraform Executor is straightforward:

  • Python Flask API (around 100 lines of Python)
  • Terraform CLI wrapper
  • Basic state management
  • Docker container

Resources

Models Used

  • gpt-oss:120b-cloud (primary)
  • qwen3-coder:480b-cloud (alternative)

Talos Cluster Resource Usage

  • OpenWebUI: Around 2GB RAM
  • MCPO: Around 512MB RAM
  • Terraform MCP: Around 256MB RAM
  • Terraform Executor: Around 512MB RAM
  • Ollama: 24GB VRAM (for 120B model)

Total cluster footprint: Around 3.5GB RAM + 24GB VRAM

Closing Thoughts

This was a fun weekend hack that turned into something genuinely useful. Infrastructure as conversation with 100% local models is wild!

No API keys. No cloud AI services. Just open-source models and Kubernetes.

This is the future of DevOps.

Want to see a demo? Drop a comment!


Built with coffee, lo-fi beats, and local GPU power during a rainy weekend.


Last updated: October 11, 2025