Proxmox VE
Proxmox VE is the hypervisor for the on-prem production site. Three physical nodes — proxmox1, proxmox2, proxmox3 — host the Talos control-plane and worker VMs plus a small fleet of NetBird connector LXCs.
Provisioning is driven from tofu/environment/production via the bpg/proxmox provider.
Why Proxmox
- Open-source, locally hosted hypervisor with a real REST API the IaC layer can talk to.
- Mixed workloads in one box. KVM VMs for Talos nodes, LXC containers for the NetBird connectors. The connectors don't deserve a full VM and Proxmox handles both natively.
- Live migration between nodes during planned maintenance — the cluster keeps reconciling while one host gets patched.
- GPU passthrough that actually works for the talos-worker-* nodes feeding Jellyfin / Immich / Tube Archivist.
Alternatives considered
| Option | Why not |
|---|---|
| Bare-metal Talos directly on the boxes | No room for the LXC connectors and ad-hoc utility VMs |
| ESXi / VMware | Closed source; recent licensing changes ruled it out |
| XCP-ng | Solid alternative; Proxmox wins on community + tooling familiarity |
| Harvester | HCI, but pulls in its own Kubernetes — would conflict with Talos |
Cluster layout
| Node | Role |
|---|---|
| proxmox1 | Hypervisor + control-plane VM talos-cp-01 |
| proxmox2 | Hypervisor + control-plane VM talos-cp-02 |
| proxmox3 | Hypervisor + control-plane VM talos-cp-03 |
Each node also runs:
- One Talos worker VM (talos-worker-0{1,2,3}) — extra NICs into VLAN 104 (storage) and VLAN 105 (public), GPU passthrough.
- One NetBird connector LXC (lxc-proxmox{1,2,3}-netbird) — three IPs, one per VLAN it routes (mgmt / storage / public). See NetBird.
The trio is a real Proxmox cluster (corosync + cluster filesystem), so VMs can live-migrate. Quorum is 2/3.
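The 2/3 figure follows from the usual majority rule. A minimal sketch of the arithmetic, assuming each node carries one vote (the default); the function names are illustrative, not part of any Proxmox tooling:

```shell
# Votes needed for quorum in an N-node cluster with one vote per node:
# floor(N / 2) + 1
quorum_votes() {
  echo $(( $1 / 2 + 1 ))
}

# Nodes that can be offline while the remaining nodes still hold quorum
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

quorum_votes 3        # prints 2
tolerated_failures 3  # prints 1
```

On a live cluster, pvecm status on any node reports the expected votes, total votes, and current quorum, which should match these numbers.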
Bridge layout per node
     ┌───────── vmbr0 (trunk) ──────────┐
     │                                  │
┌────┴────────┐                  ┌──────┴────────┐
│ untagged    │                  │ tagged 104    │
│ VLAN 100    │                  │ VLAN 105      │
│ (mgmt)      │                  │ (storage/pub) │
└─────────────┘                  └───────────────┘
     │                                  │
pve host + cp-VMs +              worker VMs (extra
netbird LXC eth0 +               NICs) + netbird
worker VM eth0                   LXC eth1 / eth2
The full VLAN reference and IP plan live on the Fabric overview and the UniFi page.
Storage
| Pool | Backing | Used by |
|---|---|---|
| local-lvm | Each node's local NVMe | Talos VM disks, LXC root volumes |
| truenas-iscsi | iSCSI to the TrueNAS NAS | Optional shared volumes (not used in steady state) |
Talos VM disks are intentionally on local NVMe, not on shared storage. The etcd quorum tolerates one node down; we don't want a NAS outage to take all three control-planes with it. Persistent application data lives on Longhorn inside the cluster, with bulk media on TrueNAS via NFS.
OpenTofu workflow
Everything in tofu/environment/production is declarative. Plan / apply against the Proxmox API:
cd tofu/environment/production
tofu init # idempotent
tofu plan -out=plan
tofu apply plan
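If the plan step runs from a script or CI job, the -detailed-exitcode flag of tofu plan makes the outcome machine-readable: exit code 0 means no changes, 1 means an error, and 2 means changes are pending. A small sketch of interpreting that code; the helper name describe_plan_exit is hypothetical and not part of this repository:

```shell
# Translate the exit code of `tofu plan -detailed-exitcode` into a message.
# 0 = succeeded, no changes; 1 = error; 2 = succeeded, changes pending.
describe_plan_exit() {
  case "$1" in
    0) echo "no changes" ;;
    1) echo "error" ;;
    2) echo "changes pending" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage (run inside tofu/environment/production):
#   tofu plan -detailed-exitcode -out=plan; describe_plan_exit "$?"
```

Only a result of "changes pending" needs a follow-up tofu apply plan; "no changes" means the declared state already matches Proxmox.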
Each apply will:
- create / reconcile VMs from cloud-init templates,
- attach VLAN-tagged NICs as required,
- update DNS records via the NetBird overlay's DNS module,
- register peers (the connector LXCs) into the right NetBird groups via setup keys.
The provider talks to Proxmox over HTTPS using a token (tofu_provider). Token + state are SOPS-encrypted at rest.
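To check the token outside of OpenTofu, the same credentials can be sent to the Proxmox API by hand. Proxmox expects API tokens in an Authorization header of the form PVEAPIToken=USER@REALM!TOKENID=SECRET. The sketch below only assembles that header; the user, the environment variable names, and the assumption that tofu_provider is the token ID are illustrative, not taken from the repository:

```shell
# Build the Authorization header Proxmox expects for API tokens:
#   Authorization: PVEAPIToken=USER@REALM!TOKENID=SECRET
pve_auth_header() {
  user="$1"       # e.g. someuser@pve (hypothetical)
  token_id="$2"   # assumed here to be tofu_provider
  secret="$3"     # the token's secret value, decrypted from SOPS
  printf 'Authorization: PVEAPIToken=%s!%s=%s\n' "$user" "$token_id" "$secret"
}

# Usage (PVE_USER and PVE_TOKEN_SECRET are hypothetical variables, read from
# the environment so the secret never appears on the command line):
#   curl -fsS -H "$(pve_auth_header "$PVE_USER" tofu_provider "$PVE_TOKEN_SECRET")" \
#        "https://proxmox1:8006/api2/json/version"
```

A JSON response carrying the Proxmox version confirms the token is valid; a 401 points at the token itself rather than at the OpenTofu configuration.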
Operational notes
- Rolling reboots during a Proxmox upgrade: drain Talos workloads off one node first (kubectl drain), live-migrate the worker VM to a peer, then reboot. The control-plane VM auto-fails-over to the migrated host.
- GPU passthrough is per-VM. After replacing a worker, re-confirm the IOMMU groups before re-passing-through (Proxmox numbering can shift on kernel upgrades).
- Cluster-network split on UniFi means corosync rides on VLAN 100. If you change VLAN 100's gateway, plan an extra-careful apply window.
- Backups of the Proxmox VE configuration itself (PVE storage cfg, replication, cluster join) are in scope for the operations layer, separate from in-cluster backups. See Operations → Backups.
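For the IOMMU check mentioned above, the kernel exposes the groups under /sys/kernel/iommu_groups on each Proxmox host. A minimal sketch that lists every device with its group; the function name is illustrative, and the optional argument exists only so the function can be pointed at a different directory:

```shell
# Print one line per PCI device with the IOMMU group it belongs to.
# Reads the sysfs layout <root>/<group>/devices/<pci-address>.
list_iommu_groups() {
  root="${1:-/sys/kernel/iommu_groups}"
  for dev in "$root"/*/devices/*; do
    # Skip the unexpanded glob when IOMMU is disabled or the path is missing
    [ -e "$dev" ] || continue
    group="${dev#"$root"/}"   # strip the root prefix -> <group>/devices/<addr>
    group="${group%%/*}"      # keep only the group number
    printf 'group %s: %s\n' "$group" "${dev##*/}"
  done
}

# Usage on a Proxmox node:
#   list_iommu_groups | sort -V
```

Comparing the output before and after a kernel upgrade shows whether the GPU's group or PCI address moved; lspci -nn on the host maps the addresses back to device names.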
Where to look next
- Hetzner — same role for the edge cluster
- Talos — what runs inside these VMs
- Fabric / UniFi — VLAN trunk feeding vmbr0
- Hardware → NAS — physical box behind the storage VLAN