We installed and set up JupyterHub in the previous post. To make use of the GPU card in the server, we are also going to install and configure CUDA and cuDNN from NVIDIA.
Set up CUDA and cuDNN
According to NVIDIA, CUDA is not just an API or a programming language:
CUDA is a parallel computing platform and programming model that makes using a GPU for general purpose computing simple and elegant. The developer still programs in the familiar C, C++, Fortran, or an ever expanding list of supported languages, and incorporates extensions of these languages in the form of a few basic keywords.
To let PyTorch use the GPU, we first need to install the NVIDIA graphics driver and the CUDA driver and toolkit, through which the cuDNN library communicates with the GPU and provides neural-network primitives for PyTorch. The relationship is shown in the figure below:
Figure: Relationship between GPUs, the CUDA driver and toolkit, and cuDNN, which is one of the applications. Source: NVIDIA
Step zero, clean up environment
Here we are going to remove any previously installed NVIDIA drivers and disable nouveau (the open-source NVIDIA driver), if present. That should increase the chance of a successful installation.
Check whether nouveau is loaded: $ lsmod | grep nouveau. If the command prints anything, turn nouveau off by creating a file named blacklist-nouveau.conf under /etc/modprobe.d and pasting the content below into it.
blacklist nouveau
options nouveau modeset=0
Regenerate the initramfs and reboot
$ sudo update-initramfs -u
$ sudo reboot
Check whether nouveau has been turned off: lsmod | grep nouveau (no output means nouveau is no longer loaded)
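The same check can be scripted, which is handy if you want to run it from a notebook later. Below is a minimal sketch in Python, assuming a Linux host where /proc/modules is readable (lsmod simply formats that file). The helper name is_module_loaded is my own and not part of any tool.

```python
from pathlib import Path


def is_module_loaded(modules_text: str, name: str) -> bool:
    """Return True if `name` is listed as a module in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field on each line is the module name.
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")  # Linux only
    if proc_modules.exists():
        loaded = is_module_loaded(proc_modules.read_text(), "nouveau")
        print("nouveau is still loaded" if loaded else "nouveau is not loaded")
    else:
        print("/proc/modules not found; is this a Linux host?")
```

Unlike a plain grep, this only matches the module-name column, so a line that merely mentions nouveau as a dependent of another module does not count as a hit.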
Install drivers
Install the NVIDIA graphics driver. There are two ways to install the driver: download it from NVIDIA and run the setup file, or install it from the Ubuntu repository.
Option 1: Download driver from NVIDIA website
Check whether your NVIDIA card has been detected: lspci | grep -i nvidia
Go to the NVIDIA Download Drivers page and look for the driver that suits your environment. My case is shown below as an example:
$ wget http://us.download.nvidia.com/tesla/450.51.06/NVIDIA-Linux-x86_64-450.51.06.run
$ sudo sh NVIDIA-Linux-x86_64-450.51.06.run
Since we have some programs written in C that will need to be developed in this environment in the near future, we need to upgrade GCC as well. According to the documentation, GCC should be updated to 9.x on Ubuntu 18.04 (ref: install gcc-9 on Ubuntu 18.04?).
Download and install the NVIDIA CUDA Toolkit from the NVIDIA download page, e.g.
$ wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_450.51.06_linux.run
$ sudo sh cuda_11.0.3_450.51.06_linux.run
The installer may warn you that the driver has already been installed through the Linux package manager. That is fine if it is what you just did in the previous step, so choose Continue on that screen.
Figure: Choose Continue here
Deselect Driver, as we have installed it already.
Figure: No need to install the driver again
After finishing the installation, as instructed, please ensure that PATH and LD_LIBRARY_PATH have been set properly:
PATH includes /usr/local/cuda-11.0/bin
LD_LIBRARY_PATH includes /usr/local/cuda-11.0/lib64. Alternatively, add /usr/local/cuda-11.0/lib64 to /etc/ld.so.conf and run ldconfig as root.
Figure: Remember to check environment variables
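To double-check both variables from the shell that will actually run your code, something like the sketch below may help. It assumes the CUDA 11.0 paths from my setup and a Linux-style colon-separated path list. The helper env_path_contains is a name I made up for illustration.

```python
import os


def env_path_contains(value: str, directory: str) -> bool:
    """Return True if `directory` is an entry of a colon-separated path list."""
    wanted = os.path.normpath(directory)
    # Skip empty entries, which appear when the variable is unset or has "::".
    entries = [entry for entry in value.split(":") if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    # Paths below are from my installation; adjust them to your CUDA version.
    checks = {
        "PATH": "/usr/local/cuda-11.0/bin",
        "LD_LIBRARY_PATH": "/usr/local/cuda-11.0/lib64",
    }
    for var, directory in checks.items():
        ok = env_path_contains(os.environ.get(var, ""), directory)
        print(f"{var}: {'ok' if ok else 'missing ' + directory}")
```

Note that if you went the /etc/ld.so.conf route instead, LD_LIBRARY_PATH will be reported as missing here even though the libraries can still be found.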
Check whether the toolkit has been successfully installed: nvcc -V. Sample output:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
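If you would rather check the toolkit version programmatically, the release number can be pulled out of that output. This is only a sketch: it assumes the output contains a "release X.Y" fragment as in the sample above, and parse_nvcc_release is a hypothetical helper, not an NVIDIA API.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output: str):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc -V` output."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")  # None when nvcc is not on PATH
    if nvcc is None:
        print("nvcc not found on PATH")
    else:
        result = subprocess.run([nvcc, "-V"], capture_output=True, text=True)
        print(parse_nvcc_release(result.stdout))
```

If nvcc is not found, the PATH setting from the previous step is the first thing to re-check.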
Download and install cuDNN for Linux
To download cuDNN, you first need to register as an NVIDIA developer. You can then download the tar file (cuDNN Library for Linux (x86_64)) or the DEB files from the cuDNN download page.
Install from a tar file
Extract the cuDNN package into the cuda directory: $ tar -xzvf cudnn-x.x-linux-x64-v8.x.x.x.tgz. (Replace x.x and x.x.x with suitable numbers.)
Copy the extracted cuDNN files to the installation directory of the CUDA toolkit. In my case, the CUDA installation directory is /usr/local/cuda-11.0/.
Check if cuDNN has been installed: cat /usr/local/cuda-11.0/include/cudnn_version.h | grep "CUDNN_MAJOR" -A 2
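The grep above prints the three #define lines that make up the version. A small Python sketch can turn them into a tuple, assuming the header uses the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL defines and lives under my CUDA directory. parse_cudnn_version is a hypothetical helper name.

```python
import re
from pathlib import Path


def parse_cudnn_version(header_text: str):
    """Return (major, minor, patch) from cudnn_version.h-style text, or None."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 8".
        match = re.search(rf"^\s*#define\s+{key}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # Path from my installation; adjust it to your CUDA directory.
    header = Path("/usr/local/cuda-11.0/include/cudnn_version.h")
    if header.exists():
        print(parse_cudnn_version(header.read_text()))
    else:
        print(f"{header} not found; cuDNN headers may not be installed there")
```

A result of None means the header exists but does not contain all three defines, which would suggest a different cuDNN layout than the one assumed here.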
Verify CUDA is available in JupyterLab
Finally, if everything went well, we can check whether PyTorch is able to access the GPU.
import torch

# Pick the GPU when PyTorch can see it; otherwise fall back to the CPU.
if torch.cuda.is_available():
    print("cuda available!")
    device = torch.device("cuda:0")
else:
    print("cuda not available.")
    device = torch.device("cpu")
Output:
# cuda available!
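Seeing "cuda available!" only tells us that PyTorch can find the GPU. As a further sanity check, we can place a small tensor on the device and compute with it. The sketch below assumes PyTorch is installed in the notebook kernel. pick_device_name is a helper I made up for illustration.

```python
def pick_device_name(cuda_available: bool, index: int = 0) -> str:
    """Return the device string to use: 'cuda:<index>' if available, else 'cpu'."""
    return f"cuda:{index}" if cuda_available else "cpu"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        device = torch.device(pick_device_name(torch.cuda.is_available()))
        # Create the tensor directly on the chosen device and do a tiny computation.
        x = torch.ones(2, 3, device=device)
        print(x.sum().item(), "computed on", x.device)
```

If the printed device is cuda:0, the whole chain of driver, toolkit and PyTorch is working end to end.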
Hurray! The installation is not that difficult, but it is still error-prone, especially at the CUDA stage, when the installer checks for the graphics driver. That may be due to the order of the steps or an unclean environment. I also had difficulty installing CUDA the first time because of a previously installed driver. So take a look and experiment with different settings. See you next time. Ciao.