David

Posted on Jul 6, 2021

SpaCy 3 on a Google Cloud Compute Instance to train a NER Transformer Model

#nlp #machinelearning #spacy #gcloud

Here you will find a step-by-step guide (last tested and working in July 2021) on how to install and use spaCy 3.0 (and CuPy) on a GPU-powered Google Cloud instance. I wrote this article to spare others the whole days I spent testing and installing packages. I've already wasted them, why should you? ;)
I used this setup to train a NER transformer model.

Software versions:

cuda v11.2
spacy v3.0

GCloud instance creation

Create a Google Cloud virtual machine instance with this setup:

GPU machine
Series: A2
Machine: a2-highgpu-1g
GPU: 1 x NVIDIA Tesla A100
Image: Debian GNU/Linux 10 (buster)

WARNING: you must increase the default disk size: 10 GB is not enough (at least for my needs). I used 30 GB.
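For reference, the same setup can also be requested from the command line. The sketch below only assembles a gcloud compute instances create command and prints it, it does not create anything. The instance name "spacy-gpu" and the zone "us-central1-a" are placeholders I made up, and the Debian 10 image is assumed to be the public debian-10 family in the debian-cloud project. Check the result against your own project before running it.

```python
import shlex


def build_create_command(name, zone, disk_gb=30):
    """Return the gcloud argv for an A2 instance matching the setup above."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",                    # assumption: a zone that offers A2 machines
        "--machine-type=a2-highgpu-1g",      # A2 machine with 1 x A100 attached
        "--image-family=debian-10",          # assumption: public Debian 10 image family
        "--image-project=debian-cloud",
        f"--boot-disk-size={disk_gb}GB",     # the default 10 GB is too small
        "--maintenance-policy=TERMINATE",    # GPU instances cannot live-migrate
    ]


if __name__ == "__main__":
    # Hypothetical name and zone, replace them with your own.
    print(shlex.join(build_create_command("spacy-gpu", "us-central1-a")))
```

Printing the command instead of running it keeps the sketch harmless: you can paste the line into a shell once the name, zone and disk size look right.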

NVIDIA driver installation

Connect via SSH to the newly created virtual machine, then update the system and install some useful packages with these commands:

sudo apt-get update && sudo apt-get upgrade
sudo apt-get -y install pciutils software-properties-common wget g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev

Check that the GPU is visible to the system. The command below should print at least one line. If it prints nothing, there is probably a problem with your instance setup that you need to investigate further.

lspci | grep -i nvidia
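If you prefer to run this check from a script, here is a minimal Python sketch of the same filter. The function names are mine, and it assumes lspci (from the pciutils package installed above) is on the PATH.

```python
import subprocess


def nvidia_devices(lspci_output):
    """Return the lspci lines that mention NVIDIA (case-insensitive)."""
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if "nvidia" in line.lower()
    ]


def detect_nvidia_devices():
    """Run lspci and keep only the NVIDIA lines, like `lspci | grep -i nvidia`."""
    result = subprocess.run(
        ["lspci"], capture_output=True, text=True, check=True
    )
    return nvidia_devices(result.stdout)
```

On the instance, an empty list from detect_nvidia_devices() means the same thing as no output from the grep command: the GPU is not visible.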

Let's clean up any previous installation and packages:

sudo apt-get purge 'nvidia*'
sudo apt remove 'nvidia-*'
sudo rm -f /etc/apt/sources.list.d/cuda*
sudo apt-get autoremove && sudo apt-get autoclean
sudo rm -rf /usr/local/cuda*

The gcc compiler is required for development with the CUDA toolkit. To check which version of gcc is installed, enter:

gcc --version

If it is not present, install it:

sudo apt-get -y install gcc

Install the kernel headers needed by the NVIDIA drivers. They must match the running kernel, so let the shell fill in the version:

sudo apt-get -y install linux-headers-$(uname -r)
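The headers package has to match the kernel that is actually running, which may differ from the version I had. As a small sketch (the function name is mine), this is how the package name can be derived in Python from the running kernel release:

```python
import platform


def headers_package(kernel_release=None):
    """Build the Debian headers package name for a kernel release.

    With no argument, the release of the running kernel is used
    (the same string that `uname -r` prints).
    """
    release = kernel_release or platform.release()
    return f"linux-headers-{release}"


if __name__ == "__main__":
    print(headers_package())
```

Pass the printed name to sudo apt-get -y install.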

Now download and install the latest NVIDIA driver for Debian 10. These are the most up-to-date drivers at the time of writing: https://www.nvidia.com/Download/driverResults.aspx/173142/en-us. If you decide to install newer drivers (which I recommend), you will probably also need to adjust other steps in this guide accordingly.
To look for other versions or architectures: https://www.nvidia.com/Download/index.aspx?lang=en-us

# download drivers
wget https://us.download.nvidia.com/tesla/460.73.01/NVIDIA-Linux-x86_64-460.73.01.run
# make it executable
chmod u+x NVIDIA-Linux-x86_64-460.73.01.run
# install the drivers
sudo ./NVIDIA-Linux-x86_64-460.73.01.run

When asked, do not install the 32-bit compatibility packages.
Check that the drivers have been correctly installed with:

nvidia-smi

The output should be a table showing the driver version and your GPU. If the command cannot find any GPU, something is wrong (check for newer drivers and the like) and continuing with this guide will be pointless.
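To run the same check from a script, nvidia-smi can also print selected fields as CSV. The sketch below is only an illustration: the function names are mine, and it assumes nvidia-smi is on the PATH once the driver is installed.

```python
import csv
import io
import subprocess

# Ask nvidia-smi for two fields only, as CSV without a header row.
QUERY = ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"]


def parse_gpu_rows(csv_text):
    """Turn 'name, driver_version' CSV lines into a list of dicts."""
    rows = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    return [
        {"name": row[0].strip(), "driver_version": row[1].strip()}
        for row in rows
        if len(row) >= 2  # skip blank or malformed lines
    ]


def query_gpus():
    """Run nvidia-smi and return one dict per detected GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_rows(result.stdout)
```

An empty list, or an exception from query_gpus(), corresponds to the failure case described above.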

CUDA 11.2 Toolkit installation

Install the NVIDIA CUDA 11.2 toolkit packages for Debian 10. For other setups (not covered in this article), refer to NVIDIA's download page: https://developer.nvidia.com/cuda-downloads

sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/ /"
sudo add-apt-repository contrib
sudo apt-get update
sudo apt-get -y install cuda-11-2

If asked to remove an NVIDIA package, answer yes.
Check that the drivers are still correctly installed with:

nvidia-smi

The output should be like the previous one.
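Note that nvidia-smi reports the CUDA version supported by the driver, not the toolkit you just installed. The toolkit version is reported by nvcc --version. Here is a minimal sketch for extracting it; the function name is mine, and nvcc may not be on your PATH (by default it lives in the bin folder of the toolkit, e.g. /usr/local/cuda/bin).

```python
import re


def cuda_release(nvcc_output):
    """Extract (major, minor) from the text printed by `nvcc --version`.

    Returns None when no 'release X.Y' marker is found.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

For example, feed it the output of subprocess.run(["/usr/local/cuda/bin/nvcc", "--version"], capture_output=True, text=True).stdout and compare the result with the version you meant to install.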

Spacy installation

We will now create a Python virtualenv, install spaCy and check whether spaCy can access the GPU.

# install the venv module
sudo apt-get -y install python3-venv
# create the venv
python3 -m venv myvenv
# activate it
source myvenv/bin/activate
# upgrade pip
pip install --upgrade pip

# install spacy
pip install -U spacy
# download the trf model
python -m spacy download en_core_web_trf

# install PyTorch (build for CUDA 11.0) and related packages
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
# point to the correct cuda folder
export CUDA_PATH="/usr/local/cuda-11"
# install spaCy with its CUDA and transformers extras
# (the extra is cuda113 although the toolkit installed above is 11.2; this is the combination tested for this guide)
pip install -U 'spacy[cuda113,transformers]'

# and install the correct version of cupy
# here more info: https://docs.cupy.dev/en/stable/install.html#installing-cupy
pip install cupy-cuda113
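The suffix in cupy-cuda113 and in the cuda113 extra encodes a CUDA version (11.3). If you target a different version, the names change with it. The sketch below shows the mapping; it assumes the per-version naming scheme in use at the time of writing, so check on PyPI that the resulting package really exists before installing it. The function names are mine.

```python
def _cuda_suffix(cuda_version):
    """'11.3' -> 'cuda113' (major and minor digits, no dot)."""
    major, minor = cuda_version.split(".")[:2]
    return f"cuda{int(major)}{int(minor)}"


def cupy_wheel_name(cuda_version):
    """Name of the prebuilt CuPy wheel for a CUDA version (assumed scheme)."""
    return f"cupy-{_cuda_suffix(cuda_version)}"


def spacy_requirement(cuda_version):
    """pip requirement string for spaCy with CUDA and transformers extras."""
    return f"spacy[{_cuda_suffix(cuda_version)},transformers]"
```

For instance, cupy_wheel_name("11.3") gives the package name used above.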

Test spaCy and CuPy: start Python and run the following commands:

python
>>> import spacy
>>> spacy.require_gpu()

The output must simply be:

True

Another test you can run, still inside the Python console, to be absolutely sure everything is correctly installed:

>>> import cupy
>>> a = cupy.zeros((1,1))

These commands should give no output at all. If they do, it will probably be a self-explanatory error or exception.
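Both checks can be bundled into one small script, which is handy to rerun after changing drivers or packages. This is a sketch, not part of the tested setup: it uses spacy.prefer_gpu() instead of spacy.require_gpu() so that it returns False rather than raising when no GPU is usable, and the report layout is my own.

```python
def gpu_report():
    """Try the spaCy and CuPy checks and collect the results in a dict."""
    report = {"spacy_gpu": None, "cupy_ok": None, "error": None}
    try:
        import spacy

        # True if spaCy could activate the GPU, False otherwise.
        report["spacy_gpu"] = bool(spacy.prefer_gpu())

        import cupy

        # Allocate a tiny array on the GPU and read one value back.
        total = float(cupy.zeros((1, 1)).sum())
        report["cupy_ok"] = total == 0.0
    except Exception as exc:  # missing package, driver or CUDA library
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

On a working installation you would expect spacy_gpu and cupy_ok to be True and error to stay None.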

The end

You are now ready to use your GPU inside spaCy or any other system that uses CuPy.
Feel free to reach out (please do) about any error you find or any question you have.
This article is also available as a gist: https://gist.github.com/DavidGerva/86bba9a23e4376e4303d3ca02a422612

References:

This guide adapts material I found online to my needs and to current software versions. I tested it over and over until I arrived at this working solution, so it should work as is.

Top comments (2)

Avihay Bar • Jul 17 '21

thanks for the guide, it was very helpful!
i did run into some issues with the installation on the kernel, needed to run sudo apt install linux-headers-$(uname -r) which fixed the issue.

also, it seems that you are actually installing CUDA V11.2

David • Aug 24 '21

Hi, thank you @avihaybar for your comment. You're right! Somehow I did install it. Did you install 11.3 by following my guide? Then I'll know how to fix it! Thank you!!
