DEV Community

Discussion on: The Easy Way to Scrape Instagram Using Python Scrapy & GraphQL

karisjochen

Do you mind sharing how you adjusted the code to use webscraping.ai instead? Thanks!

Vlad • Edited

Sure, here it is: gist.github.com/Drakula2k/035cc5bd...
I also fixed a couple of bugs there.

karisjochen

Thanks so much for sharing! After making the changes I am unfortunately still getting blocked by the robots.txt file. Is this code still working for you?

Vlad

Yes, it's working. You can disable the robots.txt check by setting ROBOTSTXT_OBEY = False in your settings.py. The requests go through an API, so there is no need for the robots.txt check.
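
A minimal sketch of that change, assuming a standard Scrapy project layout where settings.py sits in the project package. Only the ROBOTSTXT_OBEY setting comes from this thread; the rest of the file is left as it is:

```python
# settings.py (fragment)
# Skip Scrapy's robots.txt check before requests are sent.
# New Scrapy project templates set this to True.
ROBOTSTXT_OBEY = False
```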

karisjochen

Incredible, thank you! It worked! So is it always a good idea to set ROBOTSTXT_OBEY = False, considering we don't want to be stopped?

Vlad

Keeping ROBOTSTXT_OBEY enabled is good when you're building something like a search engine that may request all sorts of random URLs posted on the Internet. In that case, obeying robots.txt is a good way to skip non-public pages.

But if you're requesting specific, predefined URLs or using an API, robots.txt is not so useful and may block access to the API.
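
To illustrate how a robots.txt rule can end up blocking an API path, here is a rough sketch using Python's standard urllib.robotparser (not Scrapy's own robots.txt middleware). The robots.txt rules and the example.com URLs are made up for the example:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: everything under /api/ is disallowed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /api/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs, only to show the effect of the rule above.
api_allowed = parser.can_fetch("*", "https://example.com/api/v1/items")
page_allowed = parser.can_fetch("*", "https://example.com/public/page")

print("API URL allowed:", api_allowed)
print("Public page allowed:", page_allowed)
```

A crawler that obeys these rules would drop the API request before sending it, which is the kind of blocking described above.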

kaiwangyu

Thanks a lot, I learned a ton from your code... but I'm still confused by the query_hash. May I ask how you get this constant for this type of query, please?

Vlad

Open the Inspector in Chrome (Network tab), visit Instagram and scroll through the posts. You'll see the same GraphQL queries with a query_hash parameter.
I'm not sure what the query_hash value means exactly, but it seems to be static for each type of query.
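
Once a request URL is copied from the Inspector, the hash can be pulled out of its query string. This is only a sketch: the URL shape and the variables value below are assumptions for illustration, extract_query_hash is a hypothetical helper, and the hash is the one mentioned in this thread:

```python
from urllib.parse import parse_qs, urlparse


def extract_query_hash(url):
    """Return the query_hash parameter of a URL, or None if it is absent."""
    params = parse_qs(urlparse(url).query)
    values = params.get("query_hash")
    return values[0] if values else None


# Assumed example of a copied request URL; the variables part is made up.
copied_url = (
    "https://www.instagram.com/graphql/query/"
    "?query_hash=e769aa130647d2354c40ea6a439bfc08"
    "&variables=%7B%22first%22%3A12%7D"
)

print(extract_query_hash(copied_url))
```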

kaiwangyu • Edited

Ohhh, I see, it's a constant (the same every time I scroll down the profile), but for me it's a different value, not 'e769aa130647d2354c40ea6a439bfc08'. By the way, thank you so much. I'm a beginner with Scrapy; do you suggest any book or tutorial for learning advanced Scrapy-based projects? I already bought this book.

Kai
Merry Christmas
Regards

Vlad

They may have changed something, but the old value still works too, it seems.
I'm not a specialist in Scrapy, but generally, I'd read official docs (docs.scrapy.org/en/latest/) and then start doing some projects using it and learn from them.