Hetzner Kubernetes Cluster Setup Guide
This guide walks you through setting up a complete Kubernetes cluster on Hetzner Cloud using Talos Linux and Flux GitOps.
Prerequisites
Quick Validation
Before starting, run our prerequisites validation script to ensure all required tools are installed:
For Linux/macOS/WSL:
# Make the script executable and run it
chmod +x scripts/validate-prerequisites.sh
./scripts/validate-prerequisites.sh
For Windows PowerShell:
Recommended approach for Windows users: we strongly recommend Windows Subsystem for Linux (WSL), which gives the best experience with Kubernetes tools:
# Install WSL (requires restart)
wsl --install
# After restart, open WSL and run the bash version
chmod +x scripts/validate-prerequisites.sh
./scripts/validate-prerequisites.sh
The script checks for all required tools, validates their versions, and prints installation guidance for any missing components.
Required Tools
- Terraform CLI - Follow the Installation Guide
- Hetzner Cloud Account - Sign up here
- GitHub Account - For GitOps repository access
- kubectl CLI - Installation guide
- talosctl CLI - Installation guide
- Talos OS Snapshot - Custom Hetzner Cloud snapshot (see Step 1)
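As a rough illustration of the kind of check involved, the following shell function reports which of the CLIs listed above are absent from PATH. This is a sketch only, not the repository's validate-prerequisites.sh, and the function name missing_tools is hypothetical.

```shell
# Print the name of every given command that is not found on PATH.
# Hypothetical helper for illustration; not part of the repository.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Check the CLIs listed above and report the result.
missing=$(missing_tools terraform kubectl talosctl)
if [ -n "$missing" ]; then
  echo "Missing tools:"
  echo "$missing"
else
  echo "All required CLIs found."
fi
```

Unlike the validation script, this does not check versions; it only confirms that each command resolves.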
Step 1: Create Talos OS Snapshot
The Terraform configuration requires a custom Talos OS snapshot in your Hetzner Cloud project.
1.1 Create Talos VM
- In Hetzner Cloud Console, create a new server
- Choose Ubuntu 22.04 as the base image
- Select cx11 (cheapest option for snapshot creation)
- Choose location Nuremberg (nbg1)
- Create the server and wait for it to boot
1.2 Install Talos OS
SSH into the server and install Talos:
# SSH to your temporary server
ssh root@YOUR_SERVER_IP
# Download and install Talos
curl -Lo /tmp/talos.raw.xz https://github.com/siderolabs/talos/releases/download/v1.8.0/hcloud-amd64.raw.xz
xz -d /tmp/talos.raw.xz
dd if=/tmp/talos.raw of=/dev/sda && sync
reboot
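To install a different Talos release, only the version segment of the download URL should need to change. A minimal sketch, assuming the release asset keeps the name hcloud-amd64.raw.xz used in the command above (the helper name talos_image_url is hypothetical):

```shell
# Build the download URL for the Hetzner Cloud Talos image of a given release.
# Assumes the asset name hcloud-amd64.raw.xz from the command above.
talos_image_url() {
  version="$1"   # release tag, e.g. v1.8.0
  printf 'https://github.com/siderolabs/talos/releases/download/%s/hcloud-amd64.raw.xz\n' "$version"
}

# Print the URL for the release used in this guide.
talos_image_url v1.8.0
```

If you change the version, use the same version in the snapshot description in Step 1.3 so the snapshot stays identifiable.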
1.3 Create Snapshot
- Wait for the server to reboot (it will become unreachable - this is normal)
- In Hetzner Console, go to your server → Images tab
- Click "Create Image"
- Choose "Snapshot" type
- Enter description: "Talos OS v1.8.0"
- Click "Create Image"
- Wait for snapshot creation to complete
- Note the Snapshot ID (you'll need this for terraform)
- Delete the temporary server to save costs
1.4 Update Terraform Configuration
Edit main.tf and replace the hardcoded snapshot ID 336987024 with your new snapshot ID:
# Find these lines and update with your snapshot ID:
image = "YOUR_SNAPSHOT_ID_HERE" // Talos snapshot ID
⚠️ Important: The current configuration hardcodes snapshot ID 336987024 in main.tf. You have two options:
- Create your own Talos snapshot (recommended) and update the terraform files
- Use the existing snapshot ID (if accessible in your Hetzner project)
Alternatively, make the snapshot ID configurable by adding it as a Terraform variable (see "Making Snapshot ID Configurable" in the Troubleshooting section).
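Before editing, it can help to list every place the ID is referenced. A small sketch, assuming it is pointed at the directory containing main.tf (the helper name find_snapshot_refs is hypothetical):

```shell
# List file:line:text matches for a snapshot ID in *.tf files under a directory.
# /dev/null is passed as an extra file so grep always prints the file name,
# even when only one .tf file is found.
find_snapshot_refs() {
  snapshot_id="$1"
  dir="${2:-.}"
  find "$dir" -name '*.tf' -exec grep -n -- "$snapshot_id" /dev/null {} +
}

# Usage, from the directory that contains main.tf:
# find_snapshot_refs 336987024 .
```

Any line it prints is a candidate for replacement with your own snapshot ID.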
Step 2: Hetzner Cloud Project Setup
2.1 Create Hetzner Cloud Project
- Log into the Hetzner Cloud Console
- Click "New Project"
- Enter a project name (e.g., k8s-cluster)
- Select your preferred location
- Click "Create Project"
2.2 Generate API Token
- In your Hetzner project, navigate to Security → API Tokens
- Click "Generate API Token"
- Enter a description (e.g., terraform-access)
- Select Read & Write permissions
- Click "Generate API Token"
- ⚠️ IMPORTANT: Copy the token immediately - it won't be shown again!
Step 3: Configure Terraform Variables
3.1 Bootstrap Configuration (1-bootstrap)
Create or update the terraform.tfvars file in the 1-bootstrap directory:
Create terraform.tfvars:
Required Variables for 1-bootstrap:
- hcloud_token - Your Hetzner Cloud API token (sensitive)
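Since hcloud_token is the only required variable here, the file can be a single line. A minimal sketch that writes it from the shell and restricts its permissions (the helper name write_bootstrap_tfvars is hypothetical, and the token shown in the usage line is a placeholder):

```shell
# Write a terraform.tfvars containing only hcloud_token into the given
# directory, then make it readable by the owner only, since the token is
# sensitive. Overwrites any existing terraform.tfvars in that directory.
write_bootstrap_tfvars() {
  dir="$1"
  token="$2"
  printf 'hcloud_token = "%s"\n' "$token" > "$dir/terraform.tfvars" &&
    chmod 600 "$dir/terraform.tfvars"
}

# Usage, from the repository root (placeholder token shown):
# write_bootstrap_tfvars terraform/environments/hetzner-mgmt-cluster/1-bootstrap "YOUR_HETZNER_API_TOKEN"
```

Whichever way the file is created, keep it out of version control.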
3.2 Components Configuration (2-components)
The 2-components directory already has a terraform.tfvars file. Update it with your GitHub information:
Update terraform.tfvars:
# Path to kubeconfig from 1-bootstrap
kubeconfig_path = "../1-bootstrap/kubeconfig"
# GitHub repository information
github_owner = "YOUR_GITHUB_USERNAME"
github_repository = "bootstrap-cluster"
github_branch = "main"
# Path within repository where cluster configs are stored
github_path = "./clusters/hetzner-mgmt"
# Flux installation namespace
flux_namespace = "flux-system"
# GitHub Personal Access Token (for private repositories)
# Leave empty for public repositories
github_token = "YOUR_GITHUB_TOKEN_HERE"
Required Variables for 2-components:
- kubeconfig_path - Path to kubeconfig file (default: ../1-bootstrap/kubeconfig)
- flux_namespace - Kubernetes namespace for Flux (default: flux-system)
- github_owner - Your GitHub username/organization (required)
- github_repository - Repository name (default: bootstrap-cluster)
- github_branch - Git branch to sync (default: main)
- github_path - Path in repo for cluster configs (default: ./clusters/hetzner-mgmt)
- github_token - GitHub Personal Access Token (required for private repos)
Step 4: GitHub Personal Access Token Setup
4.1 Generate GitHub Token
- Go to GitHub Settings → Developer settings → Personal access tokens
- Click "Generate new token (classic)"
- Enter a note (e.g., flux-gitops-access)
- Select scopes:
  - repo (Full control of private repositories)
  - read:org (Read org and team membership)
- Click "Generate token"
- ⚠️ IMPORTANT: Copy the token immediately!
4.2 Update terraform.tfvars
Add your GitHub token to the 2-components/terraform.tfvars file:
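One way to do this from the shell is sketched below. It assumes the file still contains a github_token line such as the placeholder shown in Step 3.2, and that the token contains only letters, digits, and underscores. The helper name set_github_token is hypothetical.

```shell
# Replace the github_token line in a tfvars file with a real value.
# sed -i.bak edits in place and keeps the original as <file>.bak.
set_github_token() {
  tfvars="$1"
  token="$2"
  sed -i.bak "s|^github_token[[:space:]]*=.*|github_token = \"$token\"|" "$tfvars"
}

# Usage, from the 2-components directory (placeholder token shown):
# set_github_token terraform.tfvars "YOUR_GITHUB_TOKEN_HERE"
```

Editing the file by hand is equivalent; the only requirement is that github_token ends up set to the token generated in Step 4.1.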
Step 5: Deploy the Cluster
5.1 Bootstrap Infrastructure
cd terraform/environments/hetzner-mgmt-cluster/1-bootstrap
# Initialize Terraform
terraform init
# Plan the deployment
terraform plan
# Apply the configuration
terraform apply
This will create:
- Hetzner Cloud servers for the Talos cluster
- Network configuration
- Floating IP for MetalLB
- A generated kubeconfig file
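An optional variation on the init, plan, and apply sequence above is to save the plan to a file, so that the apply step executes exactly what was reviewed. A sketch using Terraform's -out option (the function name plan_and_apply and the file name tfplan are arbitrary choices, not part of the repository):

```shell
# Initialize, write the plan to a file, then apply that exact plan.
# The && chain stops at the first failing step.
plan_and_apply() {
  terraform init &&
    terraform plan -out=tfplan &&
    terraform apply tfplan
}

# Usage, from the 1-bootstrap directory:
# plan_and_apply
```

Note that applying a saved plan file does not ask for confirmation, so review the plan output before the apply step runs; in practice you may prefer to run the three commands one at a time.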
5.2 Install Components
cd ../2-components
# Initialize Terraform
terraform init
# Plan the deployment
terraform plan
# Apply the configuration
terraform apply
This will:
- Install the Flux GitOps controllers
- Configure GitOps sync with your repository
- Create the cluster configuration ConfigMap in the flux-system namespace
- Install MetalLB via Helm chart, with the floating IP configured automatically through Flux ConfigMap substitution
Step 6: Verify Installation
6.1 Check Cluster Access
# Use the generated kubeconfig (assumes the current directory is 2-components)
export KUBECONFIG="$(pwd)/../1-bootstrap/kubeconfig"
# Check cluster nodes
kubectl get nodes
# Check Flux installation
kubectl get pods -n flux-system
# Check cluster configuration ConfigMap
kubectl get configmap cluster-config -n flux-system -o yaml
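Nodes and controllers can take a few minutes to settle, so it may be more convenient to block until they report ready than to rerun the commands above by hand. A sketch using kubectl wait (the function name wait_for_cluster and the 300-second default are arbitrary; the flux-system namespace comes from the configuration in Step 3.2):

```shell
# Block until every node reports Ready, then until every deployment in
# flux-system reports Available. Fails if either wait times out.
wait_for_cluster() {
  wait_timeout="${1:-300s}"
  kubectl wait --for=condition=Ready nodes --all --timeout="$wait_timeout" &&
    kubectl wait --for=condition=Available deployments --all -n flux-system --timeout="$wait_timeout"
}

# Usage, with KUBECONFIG exported as shown above:
# wait_for_cluster 600s
```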
6.2 Monitor Flux Sync
# Check Flux status
kubectl get gitrepository -n flux-system
# Check kustomizations
kubectl get kustomization -n flux-system
# View Flux logs
kubectl logs -n flux-system -l app=source-controller
# Check Helm releases
kubectl get helmrelease -n metallb-system
# Check MetalLB IP pool configuration
kubectl get ipaddresspool -n metallb-system ip-address-pool -o yaml
# Verify ConfigMap substitution worked
kubectl describe kustomization metallb-install -n flux-system
# Check MetalLB pods
kubectl get pods -n metallb-system
Troubleshooting
Common Issues
- Hetzner API Token Issues
  - Verify token has Read & Write permissions
  - Check token isn't expired
  - Ensure correct project is selected
- GitHub Token Issues
  - Verify token has repo scope
  - For organization repos, ensure token has org access
  - Check repository exists and is accessible
- Kubeconfig Issues
  - Verify kubeconfig was generated in 1-bootstrap directory
  - Check file permissions on kubeconfig
  - Ensure cluster is fully bootstrapped before running 2-components
- Talos Snapshot Issues
  - Verify snapshot exists in your Hetzner project
  - Check snapshot ID is correct in terraform files
  - Ensure snapshot is in the same location as your servers
- Talos Configuration Issues
  - The talos-cluster-patch.yaml is a template that gets processed by terraform
  - If you see ${primary_ip} in the file, that's normal - terraform replaces it during deployment
  - The actual configuration is written to talos-cluster-patch-dynamic.yaml (which is ignored by git)
- Flux Resource Creation
  - Flux resources (GitRepository, Kustomization) are created using kubectl apply
  - This avoids CRD validation issues during terraform planning
  - Resources are properly cleaned up when running terraform destroy
Making Snapshot ID Configurable (Optional Improvement)
To avoid hardcoding the snapshot ID, you can add it as a Terraform variable:
- Add this variable to main.tf:
variable "talos_image_id" {
  description = "Hetzner Cloud snapshot ID for Talos OS"
  type        = string
  default     = "336987024"
}
- Replace hardcoded image IDs with a reference to the variable (var.talos_image_id)
- Add your snapshot ID to terraform.tfvars as talos_image_id
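As a sketch of the last two steps, the helper below rewrites main.tf and records the ID in terraform.tfvars. It assumes the hardcoded lines have the form image = "336987024" and that the variable is named talos_image_id as declared above. The helper name use_snapshot_variable is hypothetical; review the resulting diff before running Terraform.

```shell
# Replace hardcoded snapshot IDs in main.tf with a variable reference and
# append the real ID to terraform.tfvars. sed keeps the original file as
# main.tf.bak. Only lines of the form `image = "<old_id>"` are changed, so
# the variable's own default line is left untouched.
use_snapshot_variable() {
  dir="$1"
  old_id="$2"
  new_id="$3"
  sed -i.bak "s/image\([[:space:]]*\)=\([[:space:]]*\)\"$old_id\"/image\1=\2var.talos_image_id/" "$dir/main.tf" &&
    printf 'talos_image_id = "%s"\n' "$new_id" >> "$dir/terraform.tfvars"
}

# Usage, from the directory containing main.tf (placeholder ID shown):
# use_snapshot_variable . 336987024 "YOUR_SNAPSHOT_ID_HERE"
```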
Debug Commands
# Check Terraform state
terraform show
# View detailed logs (enabled through the TF_LOG environment variable)
TF_LOG=DEBUG terraform apply
# Validate configuration
terraform validate
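Debug output is verbose, so it can be easier to read from a file than from the terminal. A sketch that sets Terraform's TF_LOG and TF_LOG_PATH environment variables for a single command (the wrapper name terraform_debug and the log file name are arbitrary choices):

```shell
# Run any terraform subcommand with debug logging written to a file
# in the current directory instead of the terminal.
terraform_debug() {
  TF_LOG=DEBUG TF_LOG_PATH="./terraform-debug.log" terraform "$@"
}

# Usage:
# terraform_debug plan
```

The log may contain sensitive values, so avoid committing or sharing it unredacted.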
Cleanup
To destroy the infrastructure:
# Destroy components first
cd terraform/environments/hetzner-mgmt-cluster/2-components
terraform destroy
# Then destroy bootstrap infrastructure
cd ../1-bootstrap
terraform destroy
Next Steps
After successful deployment:
- Configure your applications in clusters/hetzner-mgmt/applications/
- Set up monitoring and logging
- Configure backup solutions
- Review security settings
For more information, see the Hetzner MetalLB Networking Guide.