Preparation Steps
Before starting the upgrade, create an etcd backup:
talosctl -n 10.121.0.10 etcd snapshot etcd.backup
Cluster Configuration
My starting point is a Talos cluster with one control plane and four worker nodes:
- Control Plane: control-01 10.121.0.10
- Workers:
  - node01 10.121.0.11
  - node02 10.121.0.12
  - node03 10.121.0.13
  - node04 10.121.0.14
Initial Environment
- Starting Talos version: 1.6.1
- Target Talos version: 1.9.2
- Initial Kubernetes version: 1.29.0
- Target Kubernetes version: 1.31.5
- Hardware: Bare-metal Machine
- Architecture: amd64
Factory Image Creation
Visit https://factory.talos.dev/ and select:
- Hardware Type: Bare-metal Machine
- Choose Talos Linux Version: "1.6.8", "1.7.7", "1.8.4", "1.9.2" (one image per version)
- Machine Architecture: amd64
- System Extensions:
  - siderolabs/i915 (20241210-v1.9.2)
  - siderolabs/intel-ucode (20241112)
  - siderolabs/iscsi-tools (v0.1.6)
- Copy the image to upgrade Talos Linux on the machine:
factory.talos.dev/installer/376567988ad370138ad8b2698212367b8edcb69b5fd68c80be1f2ec7d603b4ba:v1.9.2
Environment Setup
# Define node IPs
export CONTROL_NODE="10.121.0.10"
export WORKER1="10.121.0.11"
export WORKER2="10.121.0.12"
export WORKER3="10.121.0.13"
export WORKER4="10.121.0.14"
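Before touching anything, it can help to record which Talos version each node currently reports. The following is a minimal sketch that assumes the variables exported above; the function names all_nodes and show_versions are hypothetical helpers, not part of talosctl.

```shell
# Assumes CONTROL_NODE and WORKER1..WORKER4 are exported as shown above.
# all_nodes prints every node IP, control plane first, one per line.
all_nodes() {
  printf '%s\n' "${CONTROL_NODE}" "${WORKER1}" "${WORKER2}" "${WORKER3}" "${WORKER4}"
}

# show_versions asks each node in turn for its Talos version.
show_versions() {
  local node
  for node in $(all_nodes); do
    echo "== ${node} =="
    talosctl -n "${node}" version
  done
}
```

Running show_versions before and after each upgrade step gives a quick record of where every node stands.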
Talos Upgrade Process
Since a direct upgrade from 1.6.1 to 1.9.2 is not supported, we upgraded incrementally, moving to the latest patch release of each minor version with installer images created at factory.talos.dev. The upgrade path was:
1.6.1 → 1.6.8 → 1.7.7 → 1.8.4 → 1.9.2
Control plane node upgrades:
talosctl -n ${CONTROL_NODE} upgrade --image factory.talos.dev/installer/2dcd442954d67662d41c61bdb92165aaf7189aff9997bd011b6968c12ce8d9c0:v1.6.8
talosctl -n ${CONTROL_NODE} upgrade --image factory.talos.dev/installer/2dcd442954d67662d41c61bdb92165aaf7189aff9997bd011b6968c12ce8d9c0:v1.7.7
talosctl -n ${CONTROL_NODE} upgrade --image factory.talos.dev/installer/2dcd442954d67662d41c61bdb92165aaf7189aff9997bd011b6968c12ce8d9c0:v1.8.4
talosctl -n ${CONTROL_NODE} upgrade --image factory.talos.dev/installer/9ffda6da42b0e45bb9f486a3579c3c672c6971c1acaba0cc8ed8e9a0a5bb9e09:v1.9.2
Worker node upgrades (shown for WORKER1; repeat the same sequence for WORKER2 through WORKER4):
talosctl -n ${WORKER1} upgrade --image factory.talos.dev/installer/2dcd442954d67662d41c61bdb92165aaf7189aff9997bd011b6968c12ce8d9c0:v1.6.8
talosctl -n ${WORKER1} upgrade --image factory.talos.dev/installer/2dcd442954d67662d41c61bdb92165aaf7189aff9997bd011b6968c12ce8d9c0:v1.7.7
talosctl -n ${WORKER1} upgrade --image factory.talos.dev/installer/2dcd442954d67662d41c61bdb92165aaf7189aff9997bd011b6968c12ce8d9c0:v1.8.4
talosctl -n ${WORKER1} upgrade --image factory.talos.dev/installer/9ffda6da42b0e45bb9f486a3579c3c672c6971c1acaba0cc8ed8e9a0a5bb9e09:v1.9.2
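Since every node goes through the same four steps, the sequence can be wrapped in a small function. This is only a sketch: the schematic IDs and versions are copied from the commands above, while the names OLD_SCHEMATIC, NEW_SCHEMATIC, INSTALLER, UPGRADE_IMAGES and upgrade_node are hypothetical. It also assumes that talosctl upgrade returns only once the step has finished; if your client behaves differently, check the node between steps.

```shell
# Schematic IDs as used in the upgrade commands above.
OLD_SCHEMATIC="2dcd442954d67662d41c61bdb92165aaf7189aff9997bd011b6968c12ce8d9c0"
NEW_SCHEMATIC="9ffda6da42b0e45bb9f486a3579c3c672c6971c1acaba0cc8ed8e9a0a5bb9e09"
INSTALLER="factory.talos.dev/installer"

# Ordered upgrade path, one installer image per line.
UPGRADE_IMAGES="
${INSTALLER}/${OLD_SCHEMATIC}:v1.6.8
${INSTALLER}/${OLD_SCHEMATIC}:v1.7.7
${INSTALLER}/${OLD_SCHEMATIC}:v1.8.4
${INSTALLER}/${NEW_SCHEMATIC}:v1.9.2
"

# upgrade_node walks a single node through every image in order and
# stops at the first step that fails.
upgrade_node() {
  local node image
  node="$1"
  for image in ${UPGRADE_IMAGES}; do
    echo "Upgrading ${node} to ${image##*:}"
    talosctl -n "${node}" upgrade --image "${image}" || return 1
  done
}
```

With this in place, upgrade_node "${WORKER2}" repeats the full path for the second worker. Upgrading one node at a time keeps the rest of the cluster available while each node reboots.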
Kubernetes Upgrade Process
After completing the Talos upgrades, we proceeded with upgrading Kubernetes through the following versions:
1.29.0 → 1.29.13 → 1.30.9 → 1.31.5
talosctl -n ${CONTROL_NODE} upgrade-k8s --to 1.29.13
talosctl -n ${CONTROL_NODE} upgrade-k8s --to 1.30.9
talosctl -n ${CONTROL_NODE} upgrade-k8s --to 1.31.5
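The path above moves at most one Kubernetes minor release per step, because Kubernetes upgrades cannot skip a minor version. A small sanity check for a planned path could look like the sketch below; minor_of and check_path are hypothetical helper names, and the check only compares minor numbers.

```shell
# minor_of prints the minor number of a version string, e.g. 1.30.9 gives 30.
minor_of() {
  local v
  v="${1#v}"      # tolerate a leading "v"
  v="${v#*.}"     # drop the major part
  echo "${v%%.*}" # keep only the minor part
}

# check_path fails when two consecutive versions are more than one
# minor release apart.
check_path() {
  local prev cur v
  prev=""
  for v in "$@"; do
    cur="$(minor_of "$v")"
    if [ -n "$prev" ] && [ $((cur - prev)) -gt 1 ]; then
      echo "path skips a minor release before ${v}" >&2
      return 1
    fi
    prev="$cur"
  done
}
```

For example, check_path 1.29.0 1.29.13 1.30.9 1.31.5 succeeds, while check_path 1.29.13 1.31.5 reports the skipped 1.30 step.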
Troubleshooting
If you encounter the following error during the Kubernetes upgrade:
automatically detected the lowest Kubernetes version 1.29.9
unsupported upgrade path 1.29->1.30 (from "1.29.9" to "1.30.9")
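This typically indicates that your talosctl client is outdated and needs to be updated. If the issue persists, you can upgrade manually by patching the component images in the machine configuration. The commands below expect K8S_INTERIM2 to hold the Kubernetes version you are moving to (presumably 1.30.9 for the step that failed above):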
Control plane node:
talosctl -n ${CONTROL_NODE} patch mc --mode=no-reboot -p '[{"op": "replace", "path": "/cluster/apiServer/image", "value": "registry.k8s.io/kube-apiserver:v'${K8S_INTERIM2}'"}]'
talosctl -n ${CONTROL_NODE} patch mc --mode=no-reboot -p '[{"op": "replace", "path": "/cluster/controllerManager/image", "value": "registry.k8s.io/kube-controller-manager:v'${K8S_INTERIM2}'"}]'
talosctl -n ${CONTROL_NODE} patch mc --mode=no-reboot -p '[{"op": "replace", "path": "/cluster/scheduler/image", "value": "registry.k8s.io/kube-scheduler:v'${K8S_INTERIM2}'"}]'
talosctl -n ${CONTROL_NODE} patch mc --mode=no-reboot -p '[{"op": "replace", "path": "/machine/kubelet/image", "value": "ghcr.io/siderolabs/kubelet:v'${K8S_INTERIM2}'"}]'
Worker nodes:
talosctl -n ${WORKER1} patch mc --mode=no-reboot -p '[{"op": "replace", "path": "/machine/kubelet/image", "value": "ghcr.io/siderolabs/kubelet:v'${K8S_INTERIM2}'"}]'
talosctl -n ${WORKER2} patch mc --mode=no-reboot -p '[{"op": "replace", "path": "/machine/kubelet/image", "value": "ghcr.io/siderolabs/kubelet:v'${K8S_INTERIM2}'"}]'
talosctl -n ${WORKER3} patch mc --mode=no-reboot -p '[{"op": "replace", "path": "/machine/kubelet/image", "value": "ghcr.io/siderolabs/kubelet:v'${K8S_INTERIM2}'"}]'
talosctl -n ${WORKER4} patch mc --mode=no-reboot -p '[{"op": "replace", "path": "/machine/kubelet/image", "value": "ghcr.io/siderolabs/kubelet:v'${K8S_INTERIM2}'"}]'
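The four worker commands differ only in the node address, so they can be driven from a loop. The sketch below makes one assumption explicit: the commands above never define K8S_INTERIM2, so it defaults here to 1.30.9, the version named in the error message. The function names kubelet_patch and patch_workers are hypothetical.

```shell
# Assumption: K8S_INTERIM2 is the Kubernetes version to move to. It is not
# defined in the commands above; 1.30.9 is taken from the error message.
K8S_INTERIM2="${K8S_INTERIM2:-1.30.9}"

# kubelet_patch prints the JSON patch that sets the kubelet image version.
kubelet_patch() {
  printf '[{"op": "replace", "path": "/machine/kubelet/image", "value": "ghcr.io/siderolabs/kubelet:v%s"}]' "$1"
}

# patch_workers applies the kubelet patch to every node passed as an
# argument and stops at the first failure.
patch_workers() {
  local node
  for node in "$@"; do
    talosctl -n "${node}" patch mc --mode=no-reboot -p "$(kubelet_patch "${K8S_INTERIM2}")" || return 1
  done
}
```

Calling patch_workers "${WORKER1}" "${WORKER2}" "${WORKER3}" "${WORKER4}" is then equivalent to the four commands above.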
Conclusion
The upgrade process was successful, resulting in a fully updated cluster running Talos 1.9.2 and Kubernetes 1.31.5. Remember to always update your talosctl client to match the cluster version to avoid compatibility issues during upgrades.