instascrape is a lightweight library designed for scraping data from Instagram using Python! It makes no assumptions about your project and is instead designed for flexibility and productivity so you can get on your way and start exploring Instagram data easily and efficiently.
Here is a quick glimpse into a scrape that used selenium and instascrape to gather how many likes each of a user's posts received in 2020.
You can install from PyPI with ye old
$ pip3 install insta-scrape
or clone from the official repo with
$ git clone https://github.com/chris-greening/instascrape.git
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
instascrape: powerful Instagram data scraping toolkit
What is it?
instascrape is a lightweight Python package that provides an expressive and flexible API for scraping Instagram data. It is geared towards being a high-level building block in the data scientist's toolchain and can be seamlessly integrated and extended with industry standard tools for web scraping, data science, and analysis.
Here are a few of the things that instascrape does well:
- Powerful, object-oriented scraping tools for profiles, posts, hashtags, reels, and IGTV
- Scrapes HTML, BeautifulSoup, and JSON
- Download content to your computer as png, jpg, mp4, and mp3
- Dynamically retrieve HTML embed code for posts
- Expressive and consistent API for concise and elegant code
- Designed for seamless integration with Selenium, Pandas, and other industry standard tools for data collection and analysis
- Lightweight; no boilerplate or configurations necessary
- The only hard dependencies are Requests and…
Let's start by scraping some data from a totally random Instagram page that is definitely not mine 😉
from instascrape import Profile
profile = Profile('chris_greening')
profile.scrape()
And that's it! In those 3 lines, we scraped 52 data points related to @chris_greening's page. We got how many followers, how many posts, whether they have a business profile, whether they're verified, etc.
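To give a feel for what you can do with data points like these once you have them, here is a minimal sketch. The dictionary is hand-written placeholder data, not real scraped output, and its keys (`followers`, `posts`, `is_business_account`, `is_verified`) are assumptions for illustration; the real names are whatever the scraped object exposes.

```python
# Placeholder data standing in for a scraped profile. Keys and values are
# illustrative assumptions, not real instascrape output.
scraped = {
    "followers": 1200,
    "posts": 85,
    "is_business_account": False,
    "is_verified": False,
}


def summarize(data):
    """Return a one-line, human-readable summary of a profile dict."""
    followers = data.get("followers", 0)
    posts = data.get("posts", 0)
    # Guard against profiles with no posts to avoid dividing by zero.
    per_post = followers / posts if posts else 0.0
    badge = "verified" if data.get("is_verified") else "not verified"
    return (
        f"{followers} followers, {posts} posts "
        f"({per_post:.1f} followers per post), {badge}"
    )


summary = summarize(scraped)
print(summary)
```

Keeping the summary logic in a plain function that takes a dict means it does not care where the data came from, so it can be tried out without touching Instagram at all.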
Besides Profile, we also have objects such as Hashtag, which work with almost the exact same syntax! With methods such…
instascrape integrates nicely with tools like pandas and matplotlib so you can scrape, explore, and analyze your data with just a few lines of code. Integration with selenium is encouraged so you can get a powerful Instagram scraper going in no time!
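As a rough idea of the "explore and analyze" step, here is a sketch of averaging likes per post by month for a single year. It uses only the standard library to stay dependency-free; in practice pandas would be the more natural fit. The records are made-up sample data, and the field names `upload_date` and `likes` are assumptions rather than names confirmed above.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Made-up sample records standing in for scraped posts; the field names
# are assumptions for illustration.
posts = [
    {"upload_date": datetime(2020, 1, 5), "likes": 120},
    {"upload_date": datetime(2020, 1, 20), "likes": 80},
    {"upload_date": datetime(2020, 2, 11), "likes": 150},
    {"upload_date": datetime(2019, 12, 30), "likes": 999},  # outside 2020
]


def mean_likes_by_month(records, year):
    """Map month number to mean likes per post for the given year."""
    buckets = defaultdict(list)
    for record in records:
        date = record["upload_date"]
        if date.year == year:
            buckets[date.month].append(record["likes"])
    # Sort by month so the result reads chronologically.
    return {month: mean(likes) for month, likes in sorted(buckets.items())}


monthly = mean_likes_by_month(posts, 2020)
for month, average in monthly.items():
    print(f"2020-{month:02d}: {average:.1f} likes per post")
```

The same grouping could be handed to a plotting library afterwards; the point is only that scraped values are ordinary Python data once collected.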
We've only just scraped the surface so dig into the docs 📘 or even better, check out the source and contribute! Being such a young library (started Hacktoberfest 2020), the sky is the limit and it's only going to get more powerful from here 🙌