Installing NVIDIA Drivers for Diskless Environment

Travis DePrato asked:

I’m trying to set up a cluster of 8 computers plus a main file server. Ideally, I’d like to set this up in a PXE-boot, quasi-diskless/quasi-stateless environment (i.e. the only local storage is /var, where things like TORQUE configuration will go). Each of the 8 compute nodes has 4 NVIDIA Tesla K40m GPUs, but the root file server has no GPU.

Ideally, I’d like to be able to create the complete installation on the file server (at /node) and then PXE-boot that to the compute nodes, but I haven’t found a way to install the NVIDIA drivers without an NVIDIA GPU on board. I found one question on NVIDIA’s forums about how someone unsuccessfully attempted this…

Alternatively, I could install the NVIDIA drivers on one of the compute nodes (one is currently running CentOS on its local disks) to, for example, /usr/local/nvidia, keep track of which files the installer creates, and make a tarball of those files to copy to the file server installation.

Lastly, I could just maintain eight separate installations, but I don’t like this from a long-term maintenance perspective (each compute node will be running TORQUE jobs, so I’d like the nodes to look more or less identical).

In summary, what I’m asking for is this:

  1. Can I install the NVIDIA drivers without an NVIDIA GPU on board?
  2. Is there some other way I should be going about this?

For reference, we’re running CentOS 7.

[root@compute-3 /]# uname -a
Linux compute-3 3.10.0-514.2.2.el7.x86_64 #1 SMP Tue Dec 6 23:06:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

My answer:

Use RPM packages, like everything else.

At the moment, the best-built NVIDIA driver packages are from Negativo17.
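
As a rough illustration of the RPM approach, the sketch below installs driver packages into the node image at /node from the GPU-less file server. Installing an RPM only unpacks files and runs package scripts, so no GPU needs to be present at install time. Several things here are assumptions rather than part of the original answer: the repository file URL, the package names (check them with yum search nvidia once the repository is enabled, including which kernel module package suits your kernel), and that /node already contains a working CentOS 7 tree with yum and network access. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: install NVIDIA driver RPMs into the PXE node image from the file server.
# Assumed values (not from the original post): REPO_URL and PACKAGES.
NODE_ROOT="${NODE_ROOT:-/node}"    # image root mentioned in the question
REPO_URL="${REPO_URL:-https://negativo17.org/repos/epel-nvidia.repo}"   # assumed URL
PACKAGES="${PACKAGES:-nvidia-driver nvidia-driver-cuda}"                # assumed names
DRY_RUN="${DRY_RUN:-1}"            # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_nvidia_into_image() {
    # Drop the repository definition into the image's yum configuration.
    run curl -fsSL -o "$NODE_ROOT/etc/yum.repos.d/epel-nvidia.repo" "$REPO_URL"
    # Install inside the image so files and the RPM database land under NODE_ROOT.
    # PACKAGES is intentionally unquoted so each name becomes its own argument.
    run chroot "$NODE_ROOT" yum -y install $PACKAGES
}

install_nvidia_into_image
```

Saved under a hypothetical name such as install-nvidia-image.sh, running it as-is prints the two commands for review; running it as root with DRY_RUN=0 performs the installation. Because every compute node boots the same image, the driver only has to be installed and updated in one place.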

View the full question and answer on Server Fault.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.