Stan Kukučka

How to install Scrapy on Mac into virtualenv 🙌

🚁 Tutorial Update

This article was updated on 18th November 2020 because of the macOS Catalina system update and a virtualenv installation issue that was identified.

Isolated Scrapy Installation

This article grew out of my surprise at finding that I could install Scrapy into a separate environment and keep the whole thing apart from the system. Put simply, it stays isolated from your macOS system. For this purpose, we will use virtualenv. Don't worry, you'll pick up the concept pretty quickly, just keep going.

Briefly What Is Scrapy About

Scrapy lets you write custom functions for your crawling spider. The spider can then process (scrape) data, for example from websites you choose: collecting data, removing data you don't need, and saving the result to a database or to a file format such as CSV, XML, or JSON. Let's jump into it.

Install Homebrew

Let's start with the installation of Homebrew. If you're not sure whether Homebrew is on your system, check with the brew -v command, and check that it is working properly with the brew doctor command.

brew -v
brew doctor

You should receive brew version info after brew -v, or a message like "Your system is ready to brew." after brew doctor. If you get no such reply from either command, that's a sign Homebrew is not on your system.
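The same check can be scripted. Below is a minimal sketch in Python, assuming only the standard library; the function name homebrew_version is hypothetical and chosen just for illustration. It looks for brew on the PATH and, if found, returns the first line that brew -v prints.

```python
import shutil
import subprocess


def homebrew_version():
    """Return the first line printed by `brew -v`, or None if brew is unavailable."""
    brew = shutil.which("brew")  # full path to the brew executable, or None
    if brew is None:
        return None
    result = subprocess.run([brew, "-v"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = homebrew_version()
    print(version or "Homebrew not found on PATH")
```

A None result corresponds to the "not on your system" case described above.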

If you see neither of these, head to https://brew.sh/, copy the main install command, and paste it into a macOS terminal. The whole command you are going to paste looks like the one below (the website always shows the current version).

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"

Install Python3

After this step, the system may ask for a restart and for the installation of additional updates. I highly recommend doing both, as I ran into problems installing virtualenvs without them. Now go ahead with the python3 installation.

brew install python3

Check that python3 is installed, and which version you have, with the command below. Note that the flag is a capital -V, so the whole command is:

python3 -V
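If you prefer to check from inside Python, the sketch below reports the same information using the standard sys module. The helper names describe_python and is_python3 are hypothetical and exist only for this illustration.

```python
import sys


def describe_python(version_info=sys.version_info):
    """Format a version tuple the way `python3 -V` prints it."""
    major, minor, micro = version_info[:3]
    return f"Python {major}.{minor}.{micro}"


def is_python3(version_info=sys.version_info):
    """Return True when the major version is 3 or newer."""
    return version_info[0] >= 3


if __name__ == "__main__":
    print(describe_python())
    # sys.executable shows which interpreter binary is actually running
    print(sys.executable)
```

Printing sys.executable is a quick way to see whether you are running the Homebrew Python or a different one.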

Install Virtualenv

Now install virtualenv on your system. We'll install virtualenvwrapper alongside it, because with a plain pip3 install virtualenv command you may face issues after the macOS Catalina update.

pip3 install virtualenv virtualenvwrapper

Then type this command to edit the .zshrc file.

nano ~/.zshrc

Nothing special here, we'll simply add these lines to the file to specify where virtualenvs are stored, then save and exit (in nano: Ctrl+O, Enter, Ctrl+X).

# Configuration for virtualenv
WORKON_HOME="${HOME}/.virtualenvs"
export WORKON_HOME
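To see what that setting resolves to, here is a small sketch that reads WORKON_HOME the way a script might. The function name workon_home is hypothetical, and the fallback to ~/.virtualenvs is an assumption that simply mirrors the value configured above.

```python
import os
from pathlib import Path


def workon_home(environ=None):
    """Return the directory named by WORKON_HOME, or ~/.virtualenvs if it is unset."""
    env = os.environ if environ is None else environ
    value = env.get("WORKON_HOME")
    if value:
        return Path(value).expanduser()
    # Assumed fallback: the same location the .zshrc lines above point to
    return Path.home() / ".virtualenvs"


if __name__ == "__main__":
    print(workon_home())
```

Run it in a new terminal window so that the edited .zshrc has been loaded.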

Check that virtualenv is now correctly installed on the system by asking for its version info.

virtualenv --version

You should see a message like this in your terminal.

virtualenv 20.1.0 from /usr/local/lib/python3.9/site-packages/virtualenv/__init__.p

Create Working Directory and Activate Virtualenv

Now we are going to create a working directory, enter it, activate a virtualenv there, and fill it with the libraries we need: Scrapy and the IPython shell (which makes the Scrapy shell more readable, in other words "beautifully colorful"). Use these commands one by one.

Note that throughout this tutorial, Virtualenvs with a capital V stands for the directory (you can name this main directory whatever you want), while virtualenv stands for the command.

mkdir Virtualenvs
cd Virtualenvs
virtualenv scrapyenv
source scrapyenv/bin/activate

The last command activates the virtual environment named scrapyenv (you can name it whatever you want as well). You will recognize it in the macOS terminal because your command line now starts with (scrapyenv), followed by your login username and then the name of the directory you are currently in, in this case Virtualenvs. It looks like the example below.

(scrapyenv) yourusername@123 Virtualenvs % 
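For comparison, Python 3 also ships a venv module in its standard library that builds a similar environment directory. The sketch below is not the virtualenv command used in this tutorial, just an illustration of the same idea. create_env is a hypothetical helper name, and with_pip=False is an assumption made here to keep the example fast (it leaves pip out of the new environment).

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a virtual environment at env_dir and return the path of its activate script."""
    env_dir = Path(env_dir)
    # with_pip=False skips bootstrapping pip (an assumption to keep this sketch quick)
    venv.create(env_dir, with_pip=False)
    # Scripts live in "bin" on macOS/Linux and in "Scripts" on Windows
    scripts_dir = "Scripts" if os.name == "nt" else "bin"
    return env_dir / scripts_dir / "activate"


if __name__ == "__main__":
    # Build a throwaway environment in a temporary directory
    with tempfile.TemporaryDirectory() as tmp:
        activate = create_env(Path(tmp) / "scrapyenv")
        print(activate.name, activate.exists())
```

The returned path plays the same role as scrapyenv/bin/activate in the source command above.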

One important thing about reactivating the scrapyenv environment later: source scrapyenv/bin/activate uses a relative path, so as written it only works from inside the Virtualenvs directory. scrapyenv is simply the environment's directory inside Virtualenvs, so run cd Virtualenvs first, or pass the full path to the activate script when you are somewhere else.
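If you are ever unsure whether the environment is active, Python itself can tell you. This is a minimal sketch, assuming a recent virtualenv (such as the 20.x release shown earlier) or the standard venv module, both of which make sys.prefix differ from sys.base_prefix inside an environment. The function name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter prefix differs from the base installation prefix."""
    prefix = sys.prefix if prefix is None else prefix
    if base_prefix is None:
        # Fall back to sys.prefix if base_prefix is missing for some reason
        base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return prefix != base_prefix


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

Run it with python while the (scrapyenv) prefix is visible in your prompt, then again after deactivating, to compare the two results.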

Now you can let go of the fear of the unfamiliar. Our environment gives us a disciplined place to install the main libraries.

Install Scrapy Into Environment

Let's install Scrapy with the pip command into the already activated environment.

pip3 install scrapy

This starts a rather lengthy download of multiple libraries. Once it finishes, you can check the version of Scrapy with this command.

scrapy -V

The output begins with the version info and a notice that there is no active project at the moment. (The scrapy version subcommand prints only the version.)

Scrapy 2.4.1 - no active project

Install IPython Into Environment

Let's also add IPython to make the Scrapy shell friendlier to use.

pip3 install ipython

Check Both Scrapy And IPython

The installation lives in the scrapyenv environment and stays there even after deactivation. You can confirm everything is in place simply by browsing the directories, or by checking with commands that both Scrapy and IPython installed correctly. Let's check Scrapy first. Type python, then import scrapy. To inspect the Scrapy module, type scrapy.

python
>>> import scrapy
>>> scrapy

The response shows where the Scrapy module is located, like this.

<module 'scrapy' from '/Users/yourusername/Desktop/Virtualenvs/scrapyenv/lib/python3.9/site-packages/scrapy/__init__.py'>

Exit the Python command line with exit().

To check that the IPython installation is fine, enter these two commands in a terminal. Start with ipython and then type import this. You will receive a beautiful poem, "The Zen of Python" by Tim Peters, in your terminal.

ipython
In [1]: import this

Exit it with the same command, exit().
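The two checks above can also be done in one short script. This is a minimal sketch using the standard importlib module; the function name module_location is hypothetical. It reports where each package would be loaded from without actually importing it.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would load from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # "scrapy" and "IPython" are the import names of the two packages installed above
    for name in ("scrapy", "IPython"):
        location = module_location(name)
        print(name, "->", location or "not installed in this environment")
```

With the environment active, both paths should point inside the scrapyenv directory, as in the module output shown earlier.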

Deactivation And Activation Of Virtualenv

Now that's finished: the dream, the passion, the someday playground you have wished for, your Scrapy inside its own environment, is here. You can now deactivate the environment with the simple command deactivate, and activate it again by entering the Virtualenvs directory and running source scrapyenv/bin/activate.

deactivate
source scrapyenv/bin/activate

Ready To Start Scrapy Shell

Now you are ready to play around with your isolated Scrapy installation. You can start your Scrapy shell simply with the command scrapy shell.

scrapy shell

First Scrapy Command To Fetch URL

Now fetch your very first URL to test that crawling with Scrapy works.

fetch("https://dev.to")

This simple command replies in the terminal that everything is fine and that the website returned server response 200.

[scrapy.core.engine] DEBUG: Crawled (200) <GET https://dev.to> (referer: None)

I hope you found this installation introduction helpful. If you have any questions, feel free to leave a comment or send me a message here so we can discuss. Happy scraping.

Thanks to Eric Krull for the cover image from Unsplash.

Top comments (1)

nathan walter

I'd just like to personally thank you for this clear and concise tutorial. It was exactly what I was searching for. I hope that I can do that for some coder someday.