DEV Community

So it goes: instascrape v2.0.0 is in the works

Chris Greening ・2 min read

Hello everyone!

Instagram's been cracking down on web scraping, and as a result instascrape's v1.x.x releases are starting to feel dated (despite being less than 4 months old). I've officially started working on what will become instascrape 2.0.0, which will be released at some point in the near future.

If you've been experiencing issues with the library, you are not alone and I'm working as fast as I can to get us back to business! With these updates, I'm also going to push forward with a wave of new docs, blog posts, reference material, and features.

Where I'm at

Prior to this week, I have been able to roll with Instagram's little changes to their backend through minor and patch releases.

Unfortunately, their most recent change was significantly harder to figure out. Luckily, 12+ hours and a dozen or so coffees later, I have determined what needs to be done and am implementing it in the code as we speak! 😄

Sneak peek?

Here are some of the changes and features I am anticipating:

  • MASSIVE overhaul/refactor of instascrape's backend implementation (you won't really notice, don't worry)
  • dedicated session and cookie handling
  • official support for selenium (webdriver batteries will not be included, just supported)
  • possible login capabilities (no guarantees)
  • significantly more tools and features outside of the scrapers
  • likely shift away from in-place data modification for stronger method chaining and encapsulation (the only planned breaking change as of now)

Will there be breaking changes?

I'm going to keep the API as consistent with v1.x.x as I can and the changes are not going to be off the wall. You're going to see 99% more new features than you'll see changed features.

The only thing I recommend keeping an eye on is the shift away from in-place data modification, as this could result in code such as profile.scrape() needing to be replaced with profile = profile.scrape().
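To make the difference concrete, here is a minimal sketch of the two styles. The classes InPlaceProfile and ChainableProfile, the followers field, the to_dict method, and the hard-coded value are all hypothetical stand-ins for illustration; they are not instascrape's actual classes or its planned 2.0.0 API, and no request is made to Instagram.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class InPlaceProfile:
    """Hypothetical v1-style object: scrape() mutates the instance."""
    username: str
    followers: Optional[int] = None

    def scrape(self) -> None:
        # Stand-in for a real request; the value is hard-coded for the demo.
        self.followers = 100


@dataclass(frozen=True)
class ChainableProfile:
    """Hypothetical v2-style object: scrape() returns a new instance."""
    username: str
    followers: Optional[int] = None

    def scrape(self) -> "ChainableProfile":
        # The original object is left untouched; a populated copy is returned.
        return replace(self, followers=100)

    def to_dict(self) -> dict:
        return {"username": self.username, "followers": self.followers}


# In-place style: the call changes the object you already hold.
old = InPlaceProfile("chris")
old.scrape()

# Returning style: the result has to be captured...
new = ChainableProfile("chris")
scraped = new.scrape()

# ...which is also what makes method chaining possible.
chained = ChainableProfile("chris").scrape().to_dict()
```

In the second style, calling new.scrape() without assigning the result leaves new unpopulated, which is the kind of silent breakage worth checking for when upgrading.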

When to expect the update? ⌚

I plan on releasing 2.0.0 sometime before February, so you shouldn't have to wait too long. I am in the midst of job hunting and doing some freelance work, so I'm working on the lib whenever I get a chance, but the release likely won't be for at least another week or two.

Thanks for reading!

If you stuck with me this far, thanks so much for reading! Follow me on Twitter @ChrisGreening as I will likely be tweeting there with small updates.

Additionally, you can check in on the progress (or even contribute) on the major-version-2 branch of the repo on GitHub.

Cheers,

Chris

chris-greening / instascrape

Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically

instascrape: powerful Instagram data scraping toolkit

What is it?

instascrape is a lightweight Python package that provides an expressive and flexible API for scraping Instagram data. It is geared towards being a high-level building block on the data scientist's toolchain and can be seamlessly integrated and extended with industry standard tools for web scraping, data science, and analysis.

Key features

Here are a few of the things that instascrape does well:

  • Powerful, object-oriented scraping tools for profiles, posts, hashtags, reels, and IGTV
  • Scrapes HTML, BeautifulSoup, and JSON
  • Download content to your computer as png, jpg, mp4, and mp3
  • Dynamically retrieve HTML embed code for posts
  • Expressive and consistent API for concise and elegant code
  • Designed for seamless integration with Selenium, Pandas, and other industry standard tools for data collection and analysis
  • Lightweight; no boilerplate or configurations necessary
  • The only hard dependencies are Requests and…
