Hetzner Cloud
The edge site lives in Hetzner Cloud, region nbg1. It's a single-node Talos cluster — control-plane-1, a cx33 instance — fronted by a Hetzner-managed VPC, label-selected cloud firewalls, and a floating IP that survives instance replacement.
Driven by tofu/environment/edge via the hetznercloud/hcloud provider.
Why Hetzner
- Cheap, predictable pricing for a 24/7 edge node. A cx33 is a couple of euros a month and covers what the edge cluster needs.
- A real API with a first-class OpenTofu provider: VMs, networks, floating IPs, firewalls, and labels are all declarative.
- EU-resident for any data that hits the edge before it tunnels back home through NetBird.
Alternatives considered
| Option | Why not |
|---|---|
| AWS / GCP / Azure | Overkill and overpriced for a single edge instance |
| DigitalOcean | Comparable, slightly more expensive; no obvious upside |
| Scaleway | Used for the backup S3 bucket, not for compute |
| OVH / Hetzner Robot dedicated | Too much hardware for the edge role |
Layout
       floating IP (edge-1)
                │
                ▼
    ┌────────────────────────┐
    │ control-plane-1        │  cx33 · Talos
    │ 172.30.0.11 (private)  │  nbg1
    │ public IP              │
    └─────┬────────────┬─────┘
          │            │
      VPC `edge`      cloud firewalls
      172.30.0.0/16   (label-selected, see below)
      subnet `k8s`
      172.30.0.0/24
| Resource | Value | Notes |
|---|---|---|
| Network | edge | 172.30.0.0/16, delete-protected |
| Subnet | k8s | 172.30.0.0/24, in eu-central |
| Instance | control-plane-1 | cx33, Talos image |
| Floating IP | edge-1 | Inbound traffic target — survives node replace |
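
The layout above could be declared with the hetznercloud/hcloud provider roughly as follows. This is a minimal sketch, not the contents of tofu/environment/edge: the resource local names, the `talos_image_id` variable, and the choice of an IPv4 floating IP are assumptions made for illustration.

```hcl
# Hypothetical input: ID (or name) of the Talos image to boot from.
variable "talos_image_id" {
  description = "Talos image or snapshot to boot the node from (assumed input)"
  type        = string
}

# VPC `edge`, protected against accidental deletion.
resource "hcloud_network" "edge" {
  name              = "edge"
  ip_range          = "172.30.0.0/16"
  delete_protection = true
}

# Subnet `k8s` inside the VPC, in the eu-central network zone.
resource "hcloud_network_subnet" "k8s" {
  network_id   = hcloud_network.edge.id
  type         = "cloud"
  network_zone = "eu-central"
  ip_range     = "172.30.0.0/24"
}

# The single Talos node with its fixed private address.
resource "hcloud_server" "control_plane_1" {
  name        = "control-plane-1"
  server_type = "cx33"
  location    = "nbg1"
  image       = var.talos_image_id

  network {
    network_id = hcloud_network.edge.id
    ip         = "172.30.0.11"
  }

  # The subnet has to exist before the server can join the network.
  depends_on = [hcloud_network_subnet.k8s]
}

# Floating IP as a separate resource, so it outlives the server.
resource "hcloud_floating_ip" "edge_1" {
  name          = "edge-1"
  type          = "ipv4" # assumed; the source does not state the address family
  home_location = "nbg1"
}

resource "hcloud_floating_ip_assignment" "edge_1" {
  floating_ip_id = hcloud_floating_ip.edge_1.id
  server_id      = hcloud_server.control_plane_1.id
}
```

Keeping the floating IP and its assignment as separate resources is what lets the address survive a node replacement: only the assignment changes when the server is re-created.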
Cloud firewalls
Hetzner cloud firewalls are label-selected, so adding a node to the right firewall is a label edit, not a topology change.
| Firewall | Targets (label) | Ingress | Sources |
|---|---|---|---|
| talos-control | talos_control=true | 50000/tcp (apid), 6443/tcp (kube-api), 2379-2380/tcp (etcd) | 172.30.0.0/24 |
| talos-internal | talos=true | 50001 (trustd), 51871/udp (Cilium WG), 4240/4244/4245/4250 (Cilium/Hubble), 9962-9964 (metrics), 10250 (kubelet) | 172.30.0.0/24 |
| allow-http | http=true | 80, 443 | 0.0.0.0/0 |
| allow-ssh | ssh=true | 22 | 0.0.0.0/0 |
Talos itself doesn't run SSH, but allow-ssh exists for the rare bootstrap maintenance instance that does.
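
As an illustration of label selection, the talos-control row could be expressed like this. It is a sketch under assumptions: the resource local name and the rule descriptions are invented here, and the real definitions in the repository may be structured differently (for example, generated from a map).

```hcl
resource "hcloud_firewall" "talos_control" {
  name = "talos-control"

  rule {
    description = "apid"
    direction   = "in"
    protocol    = "tcp"
    port        = "50000"
    source_ips  = ["172.30.0.0/24"]
  }

  rule {
    description = "kube-api"
    direction   = "in"
    protocol    = "tcp"
    port        = "6443"
    source_ips  = ["172.30.0.0/24"]
  }

  rule {
    description = "etcd"
    direction   = "in"
    protocol    = "tcp"
    port        = "2379-2380"
    source_ips  = ["172.30.0.0/24"]
  }

  # No server IDs are listed: any server labelled talos_control=true is covered.
  apply_to {
    label_selector = "talos_control=true"
  }
}
```

With this shape, a server picks up the firewall simply by carrying `talos_control = "true"` in its `labels` map; the firewall resource itself never changes.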
NetBird
- Network: edge → resource edge Management Subnet = 172.30.0.0/24
- Routing peer: control-plane-1 is in edge_peers (metric 9999, masquerade)
- Sidecars: the edge_sidecar_envoy group has a cross-network policy to the production public subnet (192.168.105.0/24). This is the path that lets the edge Envoy gateway reach apps on the production cluster, the basis of the edge → production traffic chain.
OpenTofu workflow
cd tofu/environment/edge
tofu init
tofu plan -out=plan
tofu apply plan
The provider authenticates with an API token stored in SOPS. State and tokens never live in plaintext on disk; CI runs apply only after a manual approval step.
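
A minimal sketch of the provider wiring, assuming the token arrives as a sensitive input variable. The variable name `hcloud_token` is hypothetical, and how the repository actually hands the SOPS-decrypted value to OpenTofu is not described here.

```hcl
terraform {
  required_providers {
    hcloud = {
      source = "hetznercloud/hcloud"
    }
  }
}

# Hypothetical variable name. Marked sensitive so the value is redacted
# in plan and apply output.
variable "hcloud_token" {
  description = "Hetzner Cloud API token, decrypted from SOPS at run time"
  type        = string
  sensitive   = true
}

provider "hcloud" {
  token = var.hcloud_token
}
```

One way to feed the variable without writing it to disk is to export it as `TF_VAR_hcloud_token` in the shell or CI job that decrypts the SOPS file.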
Backups on Scaleway
The matching tofu/modules/scaleway/backup_bucket module creates an S3-compatible Object Storage bucket on Scaleway used as the Restic target. It's deliberately a different provider so a Hetzner outage can't take both the edge node and its backup target out at the same time.
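
At its core, such a module would wrap a Scaleway Object Storage bucket. The sketch below is not the module's actual interface: the resource local name, bucket name, and region are placeholders chosen for illustration.

```hcl
resource "scaleway_object_bucket" "restic" {
  name   = "edge-restic-backups" # hypothetical bucket name
  region = "fr-par"              # assumed region; the source does not name one
}
```

Restic can then use the bucket through Scaleway's S3-compatible endpoint, independently of anything running at Hetzner.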
Operational notes
- Replacing the instance. Re-create with the same name and labels, re-attach the floating IP, re-apply Talos config. The cloud firewalls need no changes because they're label-selected, and the VPC is a separate resource that the replacement doesn't touch.
- Region. nbg1 is the only region used; switching regions is an apply, but the Talos image must be in that region's snapshot library.
- Quotas. Hetzner enforces per-project caps on instances / floating IPs / networks. Keep the edge env on its own project for clean blast-radius control.
- Monitoring. Gatus checks the floating IP from outside the homelab; Prometheus scrapes node metrics from inside the mesh.
Where to look next
- Proxmox — same role for the production cluster
- NetBird — how the edge joins the rest of the homelab
- Topics → Real client IPs across the chain