Create Stocks Web Scraper with Python

#python #stocks #tutorial

As investing and trading become more popular every day, we developers are trying to make our lives easier and have more fun with our code. Today I will explain how to create a web scraper that scrapes real-time ~~stonks~~ stock prices with Python.
So let's get started!

For this tutorial we will use the Yahoo Finance website to get stock data.
You can enter this URL into your browser's address bar https://finance.yahoo.com/quote/MSFT to get an idea of what our scraper should look for on the web page.

If you use the Google Chrome browser, you can open the developer tools with the Ctrl+Shift+I keyboard shortcut. After that, go to the 'Elements' tab and click on the mouse icon in the top left-hand corner of the panel. You will then be able to click on any element on the page and see what its HTML code looks like.

So, after inspecting the elements and finding what we should look for, we can start building our web scraper!

Step 1

You need to install three Python libraries: requests (to download the page), BeautifulSoup, and the HTML parser html5lib. You can do that by running the following command in your terminal/command line:
pip install requests beautifulsoup4 html5lib
After a successful installation we can start creating our web scraper!
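If you want to double-check that everything from Step 1 is importable before going further, here is a small sketch that uses only the standard library. The helper name missing_packages is made up for this example; note that beautifulsoup4 is imported under the name bs4.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # These are the import names used by the scraper in this tutorial
    missing = missing_packages(["requests", "bs4", "html5lib"])
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("All libraries are installed.")
```

An empty list means all three imports should work in scraper.py.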

Step 2

Create a working directory for our scraper (you can name it whatever you want 😊):
mkdir stocks-scraper
Then create a file called 'scraper.py' inside that directory.
First, import the necessary libraries:

import requests
from bs4 import BeautifulSoup

Step 3

Then we will create a function that accepts a ticker (e.g. MSFT) and returns real-time data:

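def get_stock_data(ticker):
    # Build the URL of the quote page for the requested ticker
    url = f'https://finance.yahoo.com/quote/{ticker}'
    # Send headers with the request; without them it can be blocked
    headers = {'content-type': 'application/json', 'User-Agent': 'Custom'}
    r = requests.get(url, headers=headers)
    # Parse the downloaded page with the html5lib parser
    soup = BeautifulSoup(r.content, "html5lib")
    # Keep only the part of the heading that comes before the opening bracket
    shortName = soup.find('h1', {'class': 'D(ib) Fz(18px)'}).text
    bracket = shortName.find('(')
    shortName = shortName[:bracket]
    # The percentage change is stored in the tag's "value" attribute
    percent = soup.find('fin-streamer', {'data-field': 'regularMarketChangePercent', 'data-symbol': ticker}).get('value')
    # The current price is the text of the price tag
    currentPrice = soup.find('fin-streamer', {'class': 'Fw(b) Fz(36px) Mb(-4px) D(ib)'}).text
    return {
        "symbol": ticker,
        "shortName": shortName,
        "currentPrice": currentPrice,
        "percent": percent
    }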

Let's take a closer look at our code.

First, we create an f-string that represents the URL we want to scrape data from.
Then we send a GET request with the specified headers (without headers your request can be blocked!).
After that we parse the content of the requested page using the html5lib library.
Then we search for the tags associated with the data we want. In this example we extract the name of the company, the percentage of the price change, and the current stock price.
Finally, we return all the data in a Python dictionary.
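To play with the parsing step without sending any requests, you can run the same kind of lookups against a hard-coded HTML string. This is only a sketch under a few assumptions: the markup in SAMPLE_HTML is invented for the example (including the regularMarketPrice data-field) and is much simpler than the real page, the function name parse_quote is hypothetical, and it uses Python's built-in "html.parser" so it works even without html5lib. It also returns None for anything it cannot find instead of crashing.

```python
from bs4 import BeautifulSoup

# Invented markup for illustration only; the real page is far more complex
SAMPLE_HTML = """
<html><body>
  <h1>Microsoft Corporation (MSFT)</h1>
  <fin-streamer data-field="regularMarketPrice" data-symbol="MSFT" value="299.84">299.84</fin-streamer>
  <fin-streamer data-field="regularMarketChangePercent" data-symbol="MSFT" value="1.0549038">(+1.05%)</fin-streamer>
</body></html>
"""


def parse_quote(html, ticker):
    """Extract name, price and percent change from an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find("h1")
    price_tag = soup.find(
        "fin-streamer",
        {"data-field": "regularMarketPrice", "data-symbol": ticker},
    )
    percent_tag = soup.find(
        "fin-streamer",
        {"data-field": "regularMarketChangePercent", "data-symbol": ticker},
    )
    # find() returns None when nothing matches, so check before reading
    return {
        "symbol": ticker,
        "shortName": heading.text if heading else None,
        "currentPrice": price_tag.text if price_tag else None,
        "percent": percent_tag.get("value") if percent_tag else None,
    }


if __name__ == "__main__":
    print(parse_quote(SAMPLE_HTML, "MSFT"))
```

Checking for None before reading .text or .get() is a design choice that keeps the function from raising an AttributeError when the page layout changes or a tag is missing.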

To test our scraper, start the Python interpreter in the working directory and call the function:

python
from scraper import get_stock_data
get_stock_data('MSFT')

You should get a result similar to this (all values except the symbol are strings taken from the page):

{'symbol': 'MSFT', 'shortName': 'Microsoft Corporation ', 'currentPrice': '299.84', 'percent': '1.0549038'}
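Notice the trailing space in the company name: it is left over from slicing the heading just before the bracket. Also, if a heading ever had no bracket, str.find would return -1 and the slice would silently cut off the last character. Here is a small sketch of a clean-up step you could apply to the returned dictionary. The function name normalize_quote is made up for this example, and it assumes prices may contain thousands separators such as "1,234.50".

```python
def normalize_quote(raw):
    """Return a copy of the scraped dictionary with a tidy name and numeric values."""
    # partition keeps the whole string when there is no "(" at all
    name, _, _ = raw["shortName"].partition("(")
    return {
        "symbol": raw["symbol"],
        "shortName": name.strip(),
        # Remove thousands separators before converting to a number
        "currentPrice": float(str(raw["currentPrice"]).replace(",", "")),
        "percent": float(raw["percent"]),
    }


if __name__ == "__main__":
    scraped = {
        "symbol": "MSFT",
        "shortName": "Microsoft Corporation ",
        "currentPrice": "299.84",
        "percent": "1.0549038",
    }
    print(normalize_quote(scraped))
```

Having real numbers instead of strings makes it easier to compare or sort stocks later.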

That's all! You can use this scraper to get data for any stock that is listed on the Yahoo Finance site! But be careful - don't send too many requests too frequently or your IP may be blocked!
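One simple way to avoid hammering the site is to pause between requests. The sketch below is just one possible approach: the helper name fetch_many and the default 2 second delay are assumptions for illustration, not limits published by Yahoo. The fetch function is passed in as a parameter, so you can try the helper without touching the network.

```python
import time


def fetch_many(tickers, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch(ticker) for each ticker, pausing between consecutive calls."""
    results = {}
    for index, ticker in enumerate(tickers):
        if index > 0:
            # Wait before every request except the first one
            sleep(delay_seconds)
        results[ticker] = fetch(ticker)
    return results


if __name__ == "__main__":
    # A stand-in fetch function so this demo does not send any requests
    def fake_fetch(ticker):
        return {"symbol": ticker}

    print(fetch_many(["MSFT", "AAPL"], fake_fetch, delay_seconds=0.1))
```

With the scraper from this tutorial you would call it as fetch_many(['MSFT', 'AAPL'], get_stock_data).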

Thank you for reading and happy coding 😊!
