DEV Community

Discussion on: The Easy Way to Scrape Instagram Using Python Scrapy & GraphQL

karisjochen

Thanks so much for sharing! After making the changes I am unfortunately still getting blocked by the robots.txt file. Is this code still working for you?

Vlad

Yes, it's working. You can disable the robots.txt check by setting ROBOTSTXT_OBEY = False in your settings.py. The spider works through an API, so there is no need for the robots.txt check.
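For anyone following along, the change is a one-line edit. This is a minimal sketch, assuming a standard Scrapy project where settings.py lives in the project package; projects generated with `scrapy startproject` typically ship with this value set to True.

```python
# settings.py (Scrapy project settings module)
# Assumption: a standard Scrapy project layout.
# With this set to False, Scrapy does not fetch or enforce robots.txt
# before sending requests.
ROBOTSTXT_OBEY = False
```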

karisjochen

Incredible, thank you! It worked! So is it always a good idea to set ROBOTSTXT_OBEY = False, considering we don't want to be stopped?

Vlad

Not always. Keeping ROBOTSTXT_OBEY enabled is good when you're building something like a search engine that may request all sorts of random URLs posted on the Internet. In that case, obeying robots.txt helps you skip non-public pages.

But if you're requesting specific, predefined URLs or using an API, robots.txt is not so useful and may block access to the API.
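To make the distinction concrete, here is a rough sketch of how a robots.txt check decides what a crawler may request, using Python's standard `urllib.robotparser`. The robots.txt content, the example.com URLs, and the `is_allowed` helper are all made up for illustration; they are not taken from Instagram or from the article's spider.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /api/
Disallow: /private/
"""


def is_allowed(url, user_agent="*"):
    """Return True if the hypothetical robots.txt permits fetching url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A public page is allowed; an API endpoint under /api/ is refused.
    print(is_allowed("https://example.com/about"))
    print(is_allowed("https://example.com/api/v1/users"))
```

A broad crawler benefits from the second result because it keeps it away from pages the site owner marked off-limits, while a spider that only targets a known API endpoint would be stopped by the same rule.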