# K3s NVIDIA GPU Passthrough Guide — Proxmox VE + K3s

Stream GPU power into your K3s homelab: pass an NVIDIA GeForce RTX 3070 from Proxmox into a K3s VM and accelerate AI, video, and compute workloads.

---

## Prerequisites

* **VT-d** (Intel) or **AMD-Vi** enabled **in BIOS › Advanced › System Agent**
* Proxmox node with an NVIDIA GPU installed (e.g. RTX 3070 on host `still-fawn`)
* VM OS: Ubuntu 22.04 / 24.04 or Debian 12
* Existing K3s ≥ v1.32 cluster (API or GUI install)
* Host-side tools: `lspci`, `dmesg`, `update-grub`, `modprobe`, `nvidia-ctk`, `kubectl`, `crictl`

> ⚠️ **Beware:** Most BIOSes ship with IOMMU/VT-d disabled. Double-check and
> turn it **on** before continuing.

---

## 1. Enable IOMMU & VFIO on Proxmox Host

```bash
# A) Add IOMMU flags to GRUB (Intel shown; on AMD hosts the IOMMU is generally enabled by default)
sed -i 's/quiet/quiet intel_iommu=on iommu=pt/' /etc/default/grub
update-grub && reboot

# B) Load VFIO modules at boot (append rather than overwrite /etc/modules;
#    vfio_virqfd is built into vfio on newer kernels, so a load failure for it is harmless)
echo -e "vfio\nvfio_iommu_type1\nvfio_pci\nvfio_virqfd" | sudo tee -a /etc/modules
update-initramfs -u

# C) Blacklist host GPU drivers
echo -e "blacklist nouveau\nblacklist nvidia" | sudo tee /etc/modprobe.d/blacklist-gpu.conf
update-initramfs -u

# D) Bind GPU to VFIO — replace IDs with your `lspci -nn` output
echo 'options vfio-pci ids=10de:2484,10de:228b disable_vga=1' | sudo tee /etc/modprobe.d/vfio.conf
update-initramfs -u && reboot
```

### Verification

```bash
# Kernel enabled IOMMU? (Intel logs DMAR; on AMD grep for 'AMD-Vi')
dmesg | grep -E 'DMAR:.*IOMMU enabled'

# GPU bound to vfio-pci?
lspci -k -s 01:00.0 | grep 'vfio-pci'

# Every PCIe device isolated in its own IOMMU group?
for g in /sys/kernel/iommu_groups/*; do
  echo "IOMMU Group ${g##*/}:"
  for d in "$g"/devices/*; do
    echo -e "\t$(lspci -nns ${d##*/})"
  done
done
```

### IOMMU Setup Common Missteps

* Missing `iommu=pt` → inconsistent passthrough.
* Host driver not blacklisted → the GPU is never released for the VM.

---

## 2. Create & Configure the VM (Proxmox GUI)

1. **VM Options:** BIOS = **OVMF (UEFI)**, Machine = **q35**, CPU Type = **host**
2. **Hardware → Add → PCI Device:** choose `01:00.0 (GPU)` → enable **All Functions** and **PCI-Express**
3. **Start VM**

### Verification (inside VM)

```bash
lspci -nn | grep -i nvidia
```

### VM Config Common Missteps

* Forgetting **All Functions** → only the GPU *or* its audio function is passed through, not both.
* CPU model left at *Default* → AVX and other flags unavailable inside the VM; set the CPU type to **host** (as above).

---

## 3. Install NVIDIA Drivers & Configure K3s Containerd

### Option A — Community Quick-start

#### a. GPU Operator via Helm (fully automated)

```bash
helm repo add nvidia https://nvidia.github.io/gpu-operator
helm repo update
kubectl create namespace gpu-operator
helm install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator \
  --values values.yaml
```

A sketch of a K3s-oriented `values.yaml` is shown at the end of this section.

#### b. Single-shot `nvidia-ctk` injection (manual shim only)

```bash
sudo nvidia-ctk runtime configure \
  --runtime=containerd \
  --config /var/lib/rancher/k3s/agent/etc/containerd/config.toml
sudo systemctl restart k3s
```

> If you used **a. GPU Operator**, skip **b.** The Operator already performs the
> injection. I can't fully verify that split, though: in my case I ended up running
> both, and it wasn't working at first but then started working.

### NVIDIA Install Verification

```bash
sudo crictl info | grep -A3 '"nvidia"'
nvidia-smi
```

### NVIDIA Common Missteps From Blogs

* Editing `/etc/containerd/config.toml` (K3s ignores this file and reads its own config under `/var/lib/rancher/k3s/agent/etc/containerd/`).
* Forgetting to run `nvidia-ctk` *before* K3s starts.
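The `--values values.yaml` passed to Helm above is never defined in this guide. Below is a minimal sketch of what a K3s-oriented `values.yaml` could look like, assuming the NVIDIA driver is already installed inside the VM (hence `driver.enabled: false`) and that the Operator's container toolkit should target K3s's containerd socket and config rather than the stock `/etc/containerd` paths; verify the keys against the GPU Operator chart version you deploy.

```yaml
# values.yaml (sketch) -- check keys against your gpu-operator chart version
driver:
  enabled: false   # assumes the NVIDIA driver is already installed inside the VM
toolkit:
  env:
    # point the container toolkit at K3s's containerd, not the stock paths
    - name: CONTAINERD_CONFIG
      value: /var/lib/rancher/k3s/agent/etc/containerd/config.toml
    - name: CONTAINERD_SOCKET
      value: /run/k3s/containerd/containerd.sock
    - name: CONTAINERD_RUNTIME_CLASS
      value: nvidia
    - name: CONTAINERD_SET_AS_DEFAULT
      value: "true"
```

With `CONTAINERD_SET_AS_DEFAULT` set, plain pods (like the smoke test in step 5) pick up the `nvidia` runtime without needing a `runtimeClassName`.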
---

## 4. Deploy the NVIDIA Device Plugin

```bash
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.1/deployments/static/nvidia-device-plugin.yml

# (Optional) Restrict to the GPU node only. This relies on the node carrying the
# nvidia.com/gpu.present=true label (set by GPU Feature Discovery, or manually
# with: kubectl label node still-fawn nvidia.com/gpu.present=true)
kubectl patch ds nvidia-device-plugin-daemonset -n kube-system \
  --type=json -p '[{"op":"add","path":"/spec/template/spec/nodeSelector","value":{"nvidia.com/gpu.present":"true"}}]'
```

### K3s NVIDIA Integration Verification

```bash
kubectl get ds nvidia-device-plugin-daemonset -n kube-system
kubectl logs -l name=nvidia-device-plugin-ds -n kube-system | head -n 20
kubectl describe node still-fawn | grep -A10 Capacity
# Expect: nvidia.com/gpu: 1
```

---

## 5. Smoke-test with a CUDA Pod

### gpu-test.yaml

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda-smi
      # if this legacy tag has been removed from Docker Hub, substitute a current
      # nvidia/cuda base tag (e.g. a 12.x base image)
      image: nvidia/cuda:11.0-base
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
```

```bash
kubectl apply -f gpu-test.yaml
# a Pod's terminal phase is not a condition, so wait on .status.phase instead
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/gpu-test --timeout=1m
kubectl logs gpu-test   # Expect full nvidia-smi output
```
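If `gpu-test` runs but `nvidia-smi` fails inside the container, a common cause is that the `nvidia` containerd runtime from step 3b exists but is not the default, so the pod ran under the standard `runc` runtime. A minimal sketch, assuming the runtime handler registered by `nvidia-ctk` is named `nvidia` (its default):

```yaml
# runtimeclass-nvidia.yaml (sketch): only needed when nvidia is not containerd's default runtime
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia   # must match the runtime name injected into K3s's containerd config
```

Apply it with `kubectl apply -f runtimeclass-nvidia.yaml`, add `runtimeClassName: nvidia` under `spec:` in `gpu-test.yaml`, then delete and re-apply the pod.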
---

## References

* [NVIDIA GPU Operator](https://github.com/NVIDIA/gpu-operator)
* [Proxmox GPU Passthrough Docs](https://pve.proxmox.com/wiki/Pci_passthrough)
* [UntouchedWagons/K3S-NVidia: Installing the GPU Operator](https://github.com/UntouchedWagons/K3S-NVidia?tab=readme-ov-file#installing-the-gpu-operator)
* [Installing the NVIDIA Container Toolkit: Configuring CRI-O](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#configuring-cri-o)
* [NVIDIA/k8s-device-plugin: Quick Start](https://github.com/NVIDIA/k8s-device-plugin#quick-start)
* [NVIDIA/k8s-device-plugin: Prerequisites](https://github.com/NVIDIA/k8s-device-plugin#prerequisites)
* [How to use GPUs with DevicePlugin in OpenShift 3.10](https://www.redhat.com/en/blog/how-to-use-gpus-with-deviceplugin-in-openshift-3-10)
* [NVIDIA GPU passthrough with k3s? : r/kubernetes](https://www.reddit.com/r/kubernetes/comments/lopyu9/nvidia_gpu_passthrough_with_k3s/)
* [QEMU / KVM CPU model configuration — QEMU documentation](https://qemu-project.gitlab.io/qemu/system/qemu-cpu-models.html)
* [Fatal glibc error: CPU does not support x86-64-v2 · Issue #287 · JATOS/JATOS](https://github.com/JATOS/JATOS/issues/287)
* [Installing the NVIDIA Container Toolkit — NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
* [NVIDIA/k8s-device-plugin: Enabling GPU Support in Kubernetes](https://github.com/NVIDIA/k8s-device-plugin#enabling-gpu-support-in-kubernetes)
* [Adding A GPU node to a K3S Cluster – Radical Geek Technology Solution](https://radicalgeek.co.uk/pi-cluster/adding-a-gpu-node-to-a-k3s-cluster/)
* [UntouchedWagons/K3S-NVidia: A guide on using NVidia GPUs for transcoding or AI in Kubernetes](https://github.com/UntouchedWagons/K3S-NVidia)
* [Enable IOMMU or VT-d in your motherboard BIOS - InformatiWeb](https://us.informatiweb.net/tutorials/it/bios/enable-iommu-or-vt-d-in-your-bios.html)
* [still-fawn.maas details | MAAS](http://192.168.4.53:5240/MAAS/r/machine/sfem4w/summary)
* [Intel® Core™ i5-4460 Processor specifications](https://www.intel.com/content/www/us/en/products/sku/80817/intel-core-i54460-processor-6m-cache-up-to-3-40-ghz/specifications.html)
* [edenreich/ollama-kubernetes: A POC about how to deploy Ollama onto Kubernetes](https://github.com/edenreich/ollama-kubernetes)
* [Enabled GPU passthrough of Intel HD 610 with GVT-g in Proxmox 8 | Proxmox Support Forum](https://forum.proxmox.com/threads/enabled-gpu-passthrough-of-intel-hd-610-with-gvt-g-in-proxmox-8.134461/)
* [Homelab K3s HA Setup (ChatGPT conversation)](https://chatgpt.com/c/6824e84b-78b8-8007-a843-7d03241b2c32)