DEV Community 👩‍💻👨‍💻

Shawon Ashraf

Posted on • Originally published at shawonashraf.github.io

Setting up your AMD GPU for TensorFlow in Ubuntu 20.04

If you've been working with TensorFlow for some time and rely heavily on GPUs/TPUs to speed up compute-intensive tasks, you already know that Nvidia GPUs are your only option for getting the job done cost-effectively. All you need is a GeForce GPU and you can start crunching numbers in no time. But what about AMD GPUs? Team Red has been hitting back at Team Green for some time now, so they should be a viable option for compute-intensive tasks like deep learning, right? The answer is complicated. You can use them, but not without going the extra mile.

ROCm

I'll keep it brief here, since discussing ROCm isn't the intent of this article and I don't want to open a large can of worms. In short, ROCm is AMD's answer to Nvidia's CUDA. Thanks to it, you can use various GPU-dependent computation libraries and software with AMD GPUs that previously worked with Nvidia GPUs only. You can read more about it here on their official page.

GPU support

Although ROCm opens up new possibilities for AMD GPUs, not all of them can support it. As of now, only Vega, Polaris, Fiji and Hawaii GPUs are supported. Despite being a recent and popular release, Navi wasn't included and nobody knows why! Check the full list here.

For this setup process I'm using a Radeon VII GPU.

OS Support

It's Linux only as of now. Even then, AMD provides builds only for Ubuntu, RHEL and CentOS. As the title says, I'll be setting up ROCm on Ubuntu.
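If you want to check the distro from a script before starting, here is a minimal sketch. It assumes the standard /etc/os-release file is present and treats the three distros named above as the supported set; the helper names `parse_os_release` and `is_supported_distro` are made up for this example and are not part of ROCm.

```python
import os

# Distro IDs named in this article (assumption: matched against the ID field)
SUPPORTED_IDS = {"ubuntu", "rhel", "centos"}


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes
        fields[key] = value.strip().strip("\"'")
    return fields


def is_supported_distro(text):
    """True if the ID field names one of the distros listed above."""
    return parse_os_release(text).get("ID", "").lower() in SUPPORTED_IDS


if __name__ == "__main__":
    path = "/etc/os-release"
    if os.path.exists(path):
        with open(path) as fh:
            print("supported distro:", is_supported_distro(fh.read()))
    else:
        print(path, "not found")
```

This only looks at the distro name, not at the release version or the kernel.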

Setup

ROCm

  • Before you begin, make sure to have your system up to date. Run the following commands in Terminal.
sudo apt update
sudo apt dist-upgrade
  • Install the dependency libnuma-dev for ROCm.
sudo apt install libnuma-dev
  • Once libnuma-dev gets installed, add the official ROCm repos to apt
wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
echo 'deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main' | sudo tee /etc/apt/sources.list.d/rocm.list
  • Install the ROCm kernel modules (the rocm-dkms package)
sudo apt update
sudo apt install rocm-dkms
  • Add your user to the video and render groups
sudo usermod -a -G video $LOGNAME
sudo usermod -a -G render $LOGNAME
  • Open /etc/adduser.conf and add these lines, so that users created later are added to both groups as well. EXTRA_GROUPS is a space-separated list.
sudo nano /etc/adduser.conf

ADD_EXTRA_GROUPS=1
EXTRA_GROUPS="render video"
  • Open /etc/udev/rules.d/70-kfd.rules and add the following
sudo nano /etc/udev/rules.d/70-kfd.rules

SUBSYSTEM=="kfd", KERNEL=="kfd", TAG+="uaccess", GROUP="video"
  • Install libtinfo5
sudo apt install libtinfo5
  • Add ROCm binaries to your path (bash or zsh whichever you use)
echo 'export PATH=$PATH:/opt/rocm/bin:/opt/rocm/profiler/bin:/opt/rocm/opencl/bin/' | sudo tee -a /etc/profile.d/rocm.sh
  • Reboot so that the new kernel module and group memberships take effect, then test whether the installation was successful. If it was, the output should list the supported GPUs installed on your system.
sudo /opt/rocm/bin/rocminfo
sudo /opt/rocm/opencl/bin/clinfo
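Some of the steps above can also be checked from a script. This is a minimal sketch, assuming the group names (video, render) and the rocminfo binary used in this article. The helpers `missing_groups` and `current_group_names` are hypothetical names for this example, not ROCm tooling. New group memberships only show up after you log in again.

```python
import grp
import os
import shutil

# Groups the user was added to in the steps above
REQUIRED_GROUPS = ("video", "render")


def missing_groups(required, member_of):
    """Return the required group names that are absent from member_of."""
    return [name for name in required if name not in member_of]


def current_group_names():
    """Names of the groups the current process belongs to."""
    names = set()
    for gid in os.getgroups():
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            pass  # gid without a name entry; skip it
    return names


if __name__ == "__main__":
    missing = missing_groups(REQUIRED_GROUPS, current_group_names())
    print("missing groups:", missing or "none")
    # shutil.which returns None when the binary is not on PATH
    print("rocminfo on PATH:", shutil.which("rocminfo"))
```

If a group is reported missing, or rocminfo is not found right after the PATH step, log out and back in and run the script again.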

Tensorflow

  • Install the dependency packages
sudo apt install rocm-libs hipcub miopen-hip
  • Install rccl from source, since the apt package no longer works. The HTTPS clone URL is used here so that no GitHub SSH key is needed.
sudo apt install cmake
git clone https://github.com/ROCmSoftwarePlatform/rccl.git
cd rccl
sudo ./install.sh -i
  • Create a virtual environment with Python 3.
# cd into some dir
python3 -m venv ./env

# activate env
source env/bin/activate
  • Install the tensorflow-rocm package
pip install tensorflow-rocm
  • You're all done now! Time to test this TensorFlow setup with some Python code.
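For a quick first check, you can ask TensorFlow which GPUs it can see. This is a rough sketch, assuming a TensorFlow 2.x release that provides `tf.config.list_physical_devices`. The helper names `tensorflow_installed` and `visible_gpus` are made up for this example. Run it inside the venv created above.

```python
import importlib.util


def tensorflow_installed():
    """True if a tensorflow package can be found in this environment."""
    return importlib.util.find_spec("tensorflow") is not None


def visible_gpus():
    """Names of the GPUs TensorFlow reports, or [] if TensorFlow is absent."""
    if not tensorflow_installed():
        return []
    # Imported lazily so the script still runs without TensorFlow
    import tensorflow as tf
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print("tensorflow installed:", tensorflow_installed())
    print("visible GPUs:", visible_gpus())
```

An empty list with TensorFlow installed suggests the ROCm side is worth rechecking before going further.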

Testing the setup

Open up your favourite text editor, save the following Python script, and run it inside the venv we created for TensorFlow.

import tensorflow as tf

# Two scalar integer variables
x = tf.Variable(3, name="x")
y = tf.Variable(4, name="y")

# 3*3 + 4*4 + 2 = 27
f = x*x + y*y + 2

tf.print(f)

The output should look something like this:

2020-03-12 22:32:31.858480: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libhip_hcc.so
2020-03-12 22:32:31.909918: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1573] Found device 0 with properties:
pciBusID: 0000:05:00.0 name: Vega 20     ROCm AMD GPU ISA: gfx906
coreClock: 1.801GHz coreCount: 60 deviceMemorySize: 15.98GiB deviceMemoryBandwidth: -1B/s
2020-03-12 22:32:31.948506: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
2020-03-12 22:32:31.949600: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
2020-03-12 22:32:31.950580: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocfft.so
2020-03-12 22:32:31.950766: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocrand.so
2020-03-12 22:32:31.950855: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-03-12 22:32:31.951100: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
2020-03-12 22:32:31.955707: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3299240000 Hz
2020-03-12 22:32:31.956437: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7b95380 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-03-12 22:32:31.956476: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-03-12 22:32:31.959003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1573] Found device 0 with properties:
pciBusID: 0000:05:00.0 name: Vega 20     ROCm AMD GPU ISA: gfx906
coreClock: 1.801GHz coreCount: 60 deviceMemorySize: 15.98GiB deviceMemoryBandwidth: -1B/s
2020-03-12 22:32:31.959067: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
2020-03-12 22:32:31.959094: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
2020-03-12 22:32:31.959118: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocfft.so
2020-03-12 22:32:31.959141: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocrand.so
2020-03-12 22:32:31.959285: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-03-12 22:32:31.959398: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-12 22:32:31.959421: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-03-12 22:32:31.959434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-03-12 22:32:31.959730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15306 MB memory) -> physical GPU (device: 0, name: Vega 20, pci bus id: 0000:05:00.0)
27
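The line to look for in that output is the one starting with "Created TensorFlow device", which maps the TensorFlow device to the physical GPU. If you capture the log in a file, a small parser can pull it out. This is a minimal sketch, assuming the log line format shown above (it may differ between TensorFlow versions). The `created_devices` helper is a hypothetical name, and the sample line is copied from the output above.

```python
import re

# Matches the "Created TensorFlow device" line format shown in the output above
DEVICE_LINE = re.compile(
    r"Created TensorFlow device \((?P<device>\S+) with (?P<mem>\d+) MB memory\)"
    r" -> physical GPU \(device: \d+, name: (?P<name>[^,]+),"
)


def created_devices(log_text):
    """Return (tf_device, gpu_name, memory_mb) for each device creation line."""
    return [
        (m.group("device"), m.group("name").strip(), int(m.group("mem")))
        for m in DEVICE_LINE.finditer(log_text)
    ]


# Sample line copied from the output shown above
SAMPLE = (
    "2020-03-12 22:32:31.959730: I tensorflow/core/common_runtime/gpu/"
    "gpu_device.cc:1241] Created TensorFlow device "
    "(/job:localhost/replica:0/task:0/device:GPU:0 with 15306 MB memory) "
    "-> physical GPU (device: 0, name: Vega 20, pci bus id: 0000:05:00.0)"
)

if __name__ == "__main__":
    for device, name, mem_mb in created_devices(SAMPLE):
        print(device, name, mem_mb)
```

If the parser finds no such line in your own log, TensorFlow most likely did not create a GPU device.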

Done!

That's it! You can now use your AMD GPU with Tensorflow on your Ubuntu installation.
