DEV Community

Marianna

Create Stocks Web Scraper with Python

As investing and trading become more popular every day, we developers look for ways to make our lives easier and have more fun with our code. Today I will explain how to create a web scraper that fetches real-time stock prices with Python.
So let's get started!

For this tutorial we will use the Yahoo Finance website to get stock data.
You can enter the URL https://finance.yahoo.com/quote/MSFT into your browser's address bar to get an idea of what our scraper should look for on the web page.

If you use the Google Chrome browser, you can open the developer tools with the Ctrl+Shift+I keyboard shortcut. Then go to the 'Elements' tab and click the mouse cursor icon in the top left-hand corner of the panel. You will then be able to click on an element on the page and see what its HTML code looks like.

So after inspecting elements and finding what we should look for we can start building our web scraper!

Step 1

You need to install three Python libraries: requests, BeautifulSoup and the HTML parser html5lib. You can do that by running the following command in your terminal/command line:
pip install requests beautifulsoup4 html5lib
After a successful installation we can start creating our web scraper!

Step 2

Create a working directory for our scraper (you can name it whatever you want 😊):
mkdir stocks-scraper
Then create a file called 'scraper.py' inside it.
First, import the necessary libraries:

import requests
from bs4 import BeautifulSoup

Step 3

Then we will create a function that will accept a ticker (e.g. MSFT) and return real-time data:

def get_stock_data(ticker):
    url = f'https://finance.yahoo.com/quote/{ticker}'
    # A User-Agent header makes the request less likely to be rejected
    headers = {'content-type': 'application/json', 'User-Agent': 'Custom'}
    r = requests.get(url, headers=headers, timeout=10)
    soup = BeautifulSoup(r.content, "html5lib")
    # The heading looks like "Microsoft Corporation (MSFT)"; keep the part before "("
    shortName = soup.find('h1', {'class': 'D(ib) Fz(18px)'}).text
    shortName = shortName.partition('(')[0]
    # The 'value' attribute is a string, so convert it to a float
    percent = float(soup.find('fin-streamer', {'data-field': 'regularMarketChangePercent', 'data-symbol': ticker}).get('value'))
    currentPrice = soup.find('fin-streamer', {'class': 'Fw(b) Fz(36px) Mb(-4px) D(ib)'}).text
    return {
        "symbol": ticker,
        "shortName": shortName,
        "currentPrice": currentPrice,
        "percent": percent
    }

Let's take a closer look at our code.

  • First we create an f-string that represents the URL we want to scrape data from.
  • Then we send a GET request with the specified headers (without headers your request can be blocked!).
  • After that we parse the content of the page we requested using the html5lib parser.
  • Then we search for the tags associated with the data we want. In this example we extract the name of the company, the percentage price change and the current stock price.
  • Finally we return all the data in a Python dictionary.
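
To see the parsing step in isolation, here is a minimal sketch that runs the same kind of `find` lookups on a small hand-written HTML snippet instead of a live page. The snippet, the `find_text` helper and the use of Python's built-in `html.parser` are assumptions made for illustration; the real Yahoo Finance markup is different and changes over time. The point is that `find` returns `None` when nothing matches, so it is worth checking before reading `.text`.

```python
from bs4 import BeautifulSoup

# Hand-written sample markup (hypothetical, not the real Yahoo Finance page).
SAMPLE_HTML = """
<html><body>
  <h1 class="title">Microsoft Corporation (MSFT)</h1>
  <fin-streamer data-field="regularMarketPrice" data-symbol="MSFT" value="299.84">299.84</fin-streamer>
</body></html>
"""


def find_text(soup, name, attrs):
    """Return the stripped text of the first matching tag, or None if absent."""
    tag = soup.find(name, attrs)
    return tag.text.strip() if tag is not None else None


# html.parser ships with Python, so this sketch does not need html5lib.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

name = find_text(soup, "h1", {"class": "title"})
price = find_text(soup, "fin-streamer", {"data-field": "regularMarketPrice"})
missing = find_text(soup, "fin-streamer", {"data-field": "noSuchField"})

print(name)
print(price)
print(missing)
```

If a lookup comes back as `None` on the real page, the class names or attributes in the scraper probably no longer match the site's markup.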

To test our scraper, start the Python interpreter in the working directory and call the function:

python
from scraper import get_stock_data
get_stock_data('MSFT')

You should get a result similar to this:

{'symbol': 'MSFT', 'shortName': 'Microsoft Corporation ', 'currentPrice': '299.84', 'percent': 1.0549038}
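
Scraped text values such as the price usually need converting before you can do arithmetic with them. The sketch below shows one way to do that. The `to_number` helper is hypothetical, and it assumes that values use a dot as the decimal separator and may contain thousands separators (commas) or a trailing percent sign.

```python
def to_number(text):
    """Convert a scraped string such as '1,234.50' or '1.05%' to a float."""
    cleaned = text.strip().replace(",", "").rstrip("%")
    return float(cleaned)


# The price string from the result above, plus two illustrative inputs.
price = to_number("299.84")
large_price = to_number("1,234.50")
percent = to_number("1.05%")

print(price, large_price, percent)
```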

That's all! You can use this scraper to get data for any stock that is listed on the Yahoo Finance site! But be careful: don't send too many requests too frequently or your IP may be blocked!
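
One simple way to stay polite is to pause between requests. The sketch below is an assumption about how you might do it, not part of the scraper above: `fetch_all` is a hypothetical helper that takes any fetch function (for example `get_stock_data`) and sleeps for a fixed delay between calls. The one-second default is arbitrary.

```python
import time


def fetch_all(tickers, fetch, delay_seconds=1.0):
    """Call fetch(ticker) for each ticker, sleeping between consecutive calls."""
    results = {}
    for index, ticker in enumerate(tickers):
        if index > 0:
            time.sleep(delay_seconds)  # pause before every call except the first
        results[ticker] = fetch(ticker)
    return results


# Stand-in fetch function so the example runs without network access.
def fake_fetch(ticker):
    return {"symbol": ticker}


print(fetch_all(["MSFT", "AAPL"], fake_fetch, delay_seconds=0.1))
```

With the real scraper you would call `fetch_all(['MSFT', 'AAPL'], get_stock_data)` instead of passing the stand-in function.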

Thank you for reading and happy coding 😊!
