DEV Community

Dang Hoang Nhu Nguyen

[BTY] Day 1: Install NVIDIA Driver, CUDA, and cuDNN on Ubuntu 20.04

Updated: 23.01.2022

References

How to install Cuda 11.4 on ubuntu 18.04 (or 20.04)

How to Install CUDA on Ubuntu 20.04

Installing the NVIDIA driver, CUDA and cuDNN on Linux

Section 0 — Gather information

Section 1 — Cleaning remaining files

  • Find and delete files

    ⚠️ BE CAREFUL WITH THIS SECTION!

    Delete any NVIDIA/CUDA packages you may already have installed. The package patterns are quoted so that the shell passes them to apt unchanged:

    sudo rm /etc/apt/sources.list.d/cuda*
    sudo apt remove --autoremove nvidia-cuda-toolkit
    sudo apt remove --autoremove 'nvidia-*'
    

    Delete any remaining CUDA files in /usr/local/

    sudo rm -rf /usr/local/cuda*
    

    Purge any remaining NVIDIA configuration files

    sudo apt-get purge 'nvidia*'
    

    Update the package index and remove unnecessary dependencies.

    sudo apt-get update
    sudo apt-get autoremove
    sudo apt-get autoclean
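
Before running the removal commands above, it can help to preview which installed packages they would touch. The following is a minimal sketch rather than part of the original steps: `filter_gpu_packages` is a hypothetical helper name, and matching on the `nvidia` and `cuda` name prefixes is an assumption about how the packages on your machine are named.

```shell
# Hypothetical helper: keep only package names that start with "nvidia" or "cuda".
# Reads one package name per line on stdin; prints nothing if there is no match.
filter_gpu_packages() {
  grep -E '^(nvidia|cuda)' || true
}

# Preview the installed packages (only runs where dpkg-query is available).
if command -v dpkg-query >/dev/null 2>&1; then
  dpkg-query -W -f='${Package}\n' | filter_gpu_packages
fi
```

If the list looks wrong, `apt-get` also accepts `-s` (simulate) to print what a remove or purge would do without changing anything.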
    

Section 2 — Installing Cuda

  • Installing the NVIDIA driver

    Run these commands (to get the link, right-click the Download button on the NVIDIA driver page and choose Copy link). The installer file name must match the file you downloaded:

    wget https://us.download.nvidia.com/XFree86/Linux-x86_64/470.94/NVIDIA-Linux-x86_64-470.94.run
    sudo sh NVIDIA-Linux-x86_64-470.94.run
    nvidia-smi
    
  • Installing CUDA repository pin

    First, download the CUDA repository pin:

    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
    

    Next, move the pin to the /etc/apt/preferences.d directory and rename it to cuda-repository-pin-600.

    sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
    
  • Installing CUDA

    The installer may ask you to confirm that a display driver is already installed. Continue, then deselect the display driver in the list of components to install.

    wget https://developer.download.nvidia.com/compute/cuda/11.4.2/local_installers/cuda_11.4.2_470.57.02_linux.run
    sudo sh cuda_11.4.2_470.57.02_linux.run
    

    Deselect the NVIDIA driver because we already installed it in the previous step.
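
The CUDA runfile name above embeds a driver version (470.57.02), so one way to sanity-check the driver installed earlier is to compare version strings. This is a sketch under stated assumptions: `version_ge` is a hypothetical helper, it relies on GNU `sort -V` (present on Ubuntu), and using 470.57.02 as the baseline is an assumption taken from the installer file name, not an official minimum.

```shell
# Hypothetical helper: succeed if version $1 >= version $2 (GNU sort -V ordering).
version_ge() {
  printf '%s\n%s\n' "$2" "$1" | sort -V -C
}

# Baseline taken from the runfile name used in this post (an assumption).
baseline="470.57.02"

# Only query the driver where nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
  if version_ge "$installed" "$baseline"; then
    echo "Driver $installed is at least $baseline"
  else
    echo "Driver $installed is older than $baseline"
  fi
fi
```

A plain string comparison would not work here, because version components have to be compared numerically.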

Section 3 — Adding Cuda to Path

  • For a specific user

    Open your .profile file (sudo is not needed for a file in your own home directory)

    nano ~/.profile
    

    and add these lines

    # set PATH for cuda 11.4 installation
    if [ -d "/usr/local/cuda-11.4/bin/" ]; then
        export PATH=/usr/local/cuda-11.4/bin${PATH:+:${PATH}}
        export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
    fi
    

    Alternatively, add the directories to your PATH and LD_LIBRARY_PATH by putting the following lines in your .bashrc, .zshrc, or the startup file of whatever shell you are using:

    export PATH=/usr/local/cuda-11.4/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64:$LD_LIBRARY_PATH
    
    
  • For all users (not tested yet)

    To add the directory permanently for all users, follow the steps below. The /etc/profile file applies to all users; to change only the active user, edit that user's .bashrc file instead:

    # Modify the "/etc/profile" file (root access is required)
    sudo vi /etc/profile
    
    # Press the I key to enter insert mode, move the cursor to the end of the file, and add this entry:
    
    export PATH=$PATH:/path/to/dir
    
    # Press the Esc key to leave insert mode, then type :wq to save the file.
    
    # Make the configuration effective
    source /etc/profile
    
    

    Ref: https://stackoverflow.com/a/53443815/6563277
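
The `${PATH:+:${PATH}}` form used in the .profile snippet expands to `:$PATH` only when PATH is non-empty, which avoids a stray trailing colon. A related concern is that re-sourcing a startup file keeps prepending the same directory. The sketch below shows one way to avoid that; `prepend_path` is a hypothetical helper, not something the CUDA installer provides, and `/usr/local/cuda-11.4` is the install prefix used in this post.

```shell
# Hypothetical helper: print the colon-separated list $2 with directory $1
# prepended, unless $1 is already one of its entries.
prepend_path() {
  case ":$2:" in
    *":$1:"*) printf '%s\n' "$2" ;;            # already present: unchanged
    *)        printf '%s\n' "$1${2:+:$2}" ;;   # add a colon only if $2 is non-empty
  esac
}

cuda_home="/usr/local/cuda-11.4"
if [ -d "$cuda_home/bin" ]; then
  PATH=$(prepend_path "$cuda_home/bin" "$PATH")
  LD_LIBRARY_PATH=$(prepend_path "$cuda_home/lib64" "${LD_LIBRARY_PATH:-}")
  export PATH LD_LIBRARY_PATH
fi
```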

Section 4 — Checking

Now reboot your computer and check the NVIDIA driver and CUDA.

  • Check the NVIDIA driver

    nvidia-smi
    
  • Check the CUDA version

    nvcc --version
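
If you want to use the toolkit version in a script, the release number can be pulled out of the `nvcc --version` output. This is a small sketch: `nvcc_release` is a hypothetical helper, and it assumes the output contains a line of the form `Cuda compilation tools, release 11.4, V11.4.120`.

```shell
# Hypothetical helper: read `nvcc --version` output on stdin and print the
# release number (for example 11.4). Prints nothing if no such line is found.
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
  echo "CUDA toolkit release: $(nvcc --version | nvcc_release)"
else
  echo "nvcc not found on PATH; check Section 3" >&2
fi
```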
    

Section 5 — Testing

  • Create a CUDA file and run it

    nano kernel.cu
    

    Add these lines to the file.

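    #include <stdio.h>
    #include <stdlib.h>  // malloc, free
    
    // SAXPY kernel: each thread computes one element of y = a*x + y
    __global__
    void saxpy(int n, float a, float *x, float *y)
    {
      int i = blockIdx.x*blockDim.x + threadIdx.x;  // global thread index
      if (i < n) y[i] = a*x[i] + y[i];              // skip threads past the end
    }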
    
    int main(void)
    {
      int N = 1<<20;
      float *x, *y, *d_x, *d_y;
      x = (float*)malloc(N*sizeof(float));
      y = (float*)malloc(N*sizeof(float));
    
      cudaMalloc(&d_x, N*sizeof(float));
      cudaMalloc(&d_y, N*sizeof(float));
    
      for (int i = 0; i < N; i++) {
        x[i] = 1.0f;
        y[i] = 2.0f;
      }
    
      cudaMemcpy(d_x, x, N*sizeof(float), cudaMemcpyHostToDevice);
      cudaMemcpy(d_y, y, N*sizeof(float), cudaMemcpyHostToDevice);
    
      // Perform SAXPY on 1M elements
      saxpy<<<(N+255)/256, 256>>>(N, 2.0f, d_x, d_y);
    
      cudaMemcpy(y, d_y, N*sizeof(float), cudaMemcpyDeviceToHost);
    
      float maxError = 0.0f;
      for (int i = 0; i < N; i++)
        maxError = max(maxError, abs(y[i]-4.0f));
      printf("Max error: %f\n", maxError);
    
      cudaFree(d_x);
      cudaFree(d_y);
      free(x);
      free(y);
    }
    

    Next, use nvcc, the NVIDIA CUDA compiler, to compile the code, then run the newly compiled binary:

    nvcc -o kernel kernel.cu
    ./kernel
    

    Result:

    Max error: 0.000000
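
The expected value comes from the inputs: with a = 2, x = 1, and y = 2, every element should be 2*1 + 2 = 4, so the error is zero. Because the test program does not check CUDA return codes, a failed run would likely leave y at its initial value and print a nonzero error such as 2.000000. The sketch below turns that check into a script step; `saxpy_ok` is a hypothetical helper, and `kernel.cu` is the file created above.

```shell
# Hypothetical helper: succeed only if stdin contains a "Max error:" line
# and every such line reports a value of zero.
saxpy_ok() {
  awk '/^Max error:/ { found = 1; if ($3 + 0 != 0) bad = 1 }
       END { exit !(found && !bad) }'
}

# Compile and run only where nvcc and the source file are available.
if command -v nvcc >/dev/null 2>&1 && [ -f kernel.cu ]; then
  if nvcc -o kernel kernel.cu && ./kernel | saxpy_ok; then
    echo "CUDA test passed"
  else
    echo "CUDA test failed" >&2
  fi
fi
```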
    

Section 6 — Install cuDNN

  • Find the right version

    cuDNN doesn't come with CUDA. To download cuDNN, you need to register as a member of the NVIDIA Developer Program, which is free.

    I chose 8.2.4 because of the current CUDA version and Triton 21.10, which I’m experimenting with.

  • Download the cuDNN Library for Linux

    ⚠️ Because the download link requires a login, download the [file](https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.2.4/11.4_20210831/cudnn-11.4-linux-x64-v8.2.4.15.tgz) to your local machine first, then upload it to the server (for example with SCP).

    Extract the cuDNN package:

    tar -xzvf cudnn-11.4-linux-x64-v8.2.4.15.tgz
    

    Then copy the following files to the CUDA directory (-P keeps the library symlinks as symlinks instead of copying their targets):

    sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
    sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
    sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
    

Useful commands

Export the CUDA device with the most free GPU memory

How do I select which GPU to run a job on?

export CUDA_VISIBLE_DEVICES=$(nvidia-smi --query-gpu=memory.free,index --format=csv,nounits,noheader | sort -nr | head -1 | awk '{ print $NF }')
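
The one-liner above asks nvidia-smi for `free memory, index` pairs, sorts them numerically in descending order, and keeps the index from the first row. The sketch below splits the selection step into a function so it can be tried on sample data without a GPU; `pick_gpu` is a hypothetical name, and the input format `free_MiB, index` is assumed to match what the query above prints.

```shell
# Hypothetical helper: read "free_MiB, index" lines on stdin and print the
# index of the GPU with the most free memory.
pick_gpu() {
  sort -t, -k1,1nr | head -n 1 | awk -F', *' '{ print $2 }'
}

# Only query the GPUs where nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  CUDA_VISIBLE_DEVICES=$(nvidia-smi --query-gpu=memory.free,index \
    --format=csv,nounits,noheader | pick_gpu)
  export CUDA_VISIBLE_DEVICES
fi
```

For example, `printf '1024, 0\n8192, 1\n512, 2\n' | pick_gpu` prints `1`.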

Find the CUDA version (if the nvcc command is not found)

/usr/local/cuda/bin/nvcc --version

Find the cuDNN version

  • For cuDNN 8.0 and above
grep CUDNN_MAJOR -A 2 /usr/local/cuda/include/cudnn_version.h
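
The grep output is three `#define` lines (major, minor, and patch level). To get a single version string such as 8.2.4, those lines can be combined with awk. This is a sketch: `cudnn_version` is a hypothetical helper, and it assumes the header defines `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` as plain integers.

```shell
# Hypothetical helper: read a cudnn_version.h file on stdin and print
# MAJOR.MINOR.PATCHLEVEL. Prints nothing if CUDNN_MAJOR is not defined.
cudnn_version() {
  awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
       $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
       $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
       END { if (major != "") print major "." minor "." patch }'
}

header="/usr/local/cuda/include/cudnn_version.h"
if [ -f "$header" ]; then
  echo "cuDNN version: $(cudnn_version < "$header")"
fi
```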