Working around the AMD GPU Reset bug on Proxmox using vendor-reset

Most modern AMD GPUs suffer from the AMD Reset Bug: the card cannot be reset properly, so it can only be used once per host power-on. On the second attempt to use the card, Linux tries to reset it and fails, causing the VM launch to fail, or the guest, the host, or both to hang.

This is especially a problem if you only have one GPU in your system: as the primary GPU, it is initialised by the host UEFI during boot, which leaves it unusable for passthrough even a single time.

gnif’s new vendor-reset project is an attempt to work around this AMD reset issue by replacing AMD’s missing FLR support with vendor-specific reset quirks.
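
To see which reset mechanisms the host kernel currently reports for each AMD device, something like the following sketch may help. It assumes a kernel recent enough to expose the per-device `reset_method` attribute in sysfs (older kernels do not have it, in which case the list comes back empty) and filters on the AMD/ATI PCI vendor ID `0x1002`. The function name `list_reset_methods` and its parameters are made up for this illustration and are not part of vendor-reset.

```python
from pathlib import Path


def list_reset_methods(sysfs_root="/sys/bus/pci/devices", vendor_id="0x1002"):
    """Return {pci_address: [reset methods]} for PCI devices of one vendor.

    Assumption: the kernel exposes a `reset_method` file per device.
    If the file is missing or unreadable, the device maps to an empty list.
    """
    results = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return results

    for dev in sorted(root.iterdir()):
        try:
            vendor = (dev / "vendor").read_text().strip()
        except OSError:
            continue  # not a device directory, or vendor file unreadable
        if vendor.lower() != vendor_id.lower():
            continue

        try:
            # The file holds a whitespace-separated list of method names
            methods = (dev / "reset_method").read_text().split()
        except OSError:
            methods = []  # attribute absent (older kernel) or unreadable
        results[dev.name] = methods

    return results


if __name__ == "__main__":
    for address, methods in list_reset_methods().items():
        flr = "yes" if "flr" in methods else "no"
        listed = ", ".join(methods) or "none reported"
        print(f"{address}: FLR listed: {flr}; methods: {listed}")
```

A card that does not list `flr` here is a likely candidate for the reset problem described above, though this check alone does not prove that a given card is affected.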
