In a recent post, I introduced my open source Instagram web scraper instascrape as a lightweight means of collecting data from Instagram using Python!
For this post, I'm going to walk through an example using one of instascrape's recent additions: the ability to scrape an Instagram user's recent posts! With this data, we'll be able to visualize the trend in engagement for that user and see if their page is growing or declining 🙌.
We'll be visualizing data from my Instagram page @chris_greening (shameless self promo 😉) but feel free to remove my username and replace it with your own 😬
Now let's jump right in! To start, we'll import the Profile scraper and load the data from Instagram:
    from instascrape import Profile

    chris = Profile('chris_greening')        # Swap in your own username here
    chris.scrape()                           # Load the profile data
    recent_posts = chris.get_recent_posts()  # List of recent posts
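Out of the box, instascrape only scrapes the posts that load with the page (12 in the output below; to get more, you'll have to load them dynamically with selenium or similar)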
Now that we have the data, let's create a list of dicts that can easily be built into a pandas DataFrame:
    import pandas as pd

    posts_data = [post.to_dict() for post in recent_posts]  # One dict per post
    posts_df = pd.DataFrame(posts_data)
    print(posts_df[['upload_date', 'comments', 'likes']])
which gives us
               upload_date  comments  likes
    0  2020-10-16 14:39:41         8    119
    1  2020-10-15 13:11:42        21    165
    2  2020-10-14 12:36:21        16    150
    3  2020-09-28 12:17:21         6    164
    4  2020-09-27 09:27:00        14    210
    5  2020-09-26 11:38:27        16    217
    6  2020-09-25 10:18:28        17    227
    7  2020-09-24 11:01:04        20    239
    8  2020-09-17 17:49:18        15    279
    9  2020-09-14 10:05:24        14    316
    10 2020-09-09 10:24:17        13    244
    11 2020-09-08 09:06:05        33    393
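Since the rows come back newest first, it can help to flip them into chronological order before looking at trends. Here's a minimal sketch in plain Python, assuming each post dict has the `upload_date`, `comments`, and `likes` keys shown in the table. The `summarize` helper and the `engagement` key (likes plus comments) are names I made up for illustration; they aren't part of instascrape.

```python
from datetime import datetime

# A few rows copied from the table above. In practice these would come from
# post.to_dict(), which may carry more keys than shown here (assumption).
posts_data = [
    {"upload_date": datetime(2020, 10, 16, 14, 39, 41), "comments": 8, "likes": 119},
    {"upload_date": datetime(2020, 10, 15, 13, 11, 42), "comments": 21, "likes": 165},
    {"upload_date": datetime(2020, 9, 9, 10, 24, 17), "comments": 13, "likes": 244},
    {"upload_date": datetime(2020, 9, 8, 9, 6, 5), "comments": 33, "likes": 393},
]


def summarize(posts):
    """Return posts oldest-first, each with a hypothetical 'engagement' key."""
    ordered = sorted(posts, key=lambda p: p["upload_date"])
    return [{**p, "engagement": p["likes"] + p["comments"]} for p in ordered]


summary = summarize(posts_data)
for row in summary:
    print(row["upload_date"].date(), row["engagement"])
```

Likes plus comments is only one rough way to define engagement, so feel free to swap in whatever metric fits your page.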
Awesome! Now we can get to visualizing our data and see how the page is doing:
    import matplotlib.pyplot as plt

    # Stylistic change (newer matplotlib versions name this 'seaborn-v0_8-darkgrid')
    plt.style.use('seaborn-darkgrid')
    plt.scatter(posts_df.upload_date, posts_df.likes)  # Plot the data
    plt.xlabel('Upload Date')                          # Write labels
    plt.ylabel('Likes')
    plt.title('@chris_greening Likes per Post')
    plt.show()                                         # Show graph
And that's it! As we can see, my Instagram is in fact trending downwards, yayyyy!... 😅
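To put a rough number on that downward trend, you could fit a straight line to likes over time. This is just a sketch using ordinary least squares in plain Python on the 12 rows from the table above. The `fit_line` helper is my own illustration, not something instascrape provides, and 12 points is far too few to read much into the result.

```python
from datetime import datetime

# (upload_date, likes) pairs copied from the table above
posts = [
    (datetime(2020, 10, 16, 14, 39, 41), 119),
    (datetime(2020, 10, 15, 13, 11, 42), 165),
    (datetime(2020, 10, 14, 12, 36, 21), 150),
    (datetime(2020, 9, 28, 12, 17, 21), 164),
    (datetime(2020, 9, 27, 9, 27, 0), 210),
    (datetime(2020, 9, 26, 11, 38, 27), 217),
    (datetime(2020, 9, 25, 10, 18, 28), 227),
    (datetime(2020, 9, 24, 11, 1, 4), 239),
    (datetime(2020, 9, 17, 17, 49, 18), 279),
    (datetime(2020, 9, 14, 10, 5, 24), 316),
    (datetime(2020, 9, 9, 10, 24, 17), 244),
    (datetime(2020, 9, 8, 9, 6, 5), 393),
]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Express each upload date as days elapsed since the oldest post
first = min(date for date, _ in posts)
days = [(date - first).total_seconds() / 86400 for date, _ in posts]
likes = [count for _, count in posts]

slope, intercept = fit_line(days, likes)
print(f"Trend: {slope:.2f} likes per day")
```

A negative slope means likes are falling over the period covered, which matches what the scatter plot shows.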
If you wanted to go further, you could use libraries such as selenium to extend instascrape and fit regressors to dynamically loaded data for a more comprehensive visualization as shown below:
Let me know your thoughts in the comments below or, even better, check out the repo on GitHub and contribute!
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
instascrape: powerful Instagram data scraping toolkit
Instagram has gotten increasingly strict with scraping and using this library can result in getting flagged for botting AND POSSIBLE DISABLING OF YOUR INSTAGRAM ACCOUNT. This is a research project and I am not responsible for how you use it. Independently, the library is designed to be responsible and respectful and it is up to you to decide what you do with it. I don't claim any responsibility if your Instagram account is affected by how you use this library.
What is it?
instascrape is a lightweight Python package that provides an expressive and flexible API for scraping Instagram data. It is geared towards being a high-level building block on the data scientist's toolchain and can be seamlessly integrated and extended with industry standard tools for web scraping, data science, and analysis.
Here are a few of the things that…