DEV Community

Bye Bye 403: Building a Filter Resistant Web Crawler - Part 1: What is Web Scraping?

kaelscion on September 29, 2018

Originally published on the Coding Duck blog: www.ccstechme.com/coding-duck-blog

If you program with Python, or are interested in the topic, a...
 
Thomas H Jones II (ferricoxide)

Yeah... You're on really shaky ground with your assertion that they have to be explicit in saying "no, you can't do this." The presence of a robots.txt and other software-based discouragements has been interpreted by some courts as an implicit "don't do this." Also, some courts view such actions through the lens of criminal liability, not just civil liability. So proceed at your own risk, and understand the jurisdictions where your actions might be tried. And don't gloss over this when encouraging others, as doing so can carry its own liabilities.

kaelscion • Edited

I apologize. Let me edit this post to update this information, and thank you SO much for pointing it out. I have done a lot of this work over the years and have routinely spoken with an intellectual property lawyer about its legality, and this is how the "playing field" has been described to me on repeated occasions. I really appreciate you stepping up and informing me about the error in my assertion. I suppose my understanding comes through the lens of this attorney's personal experience, and perhaps even the "landscape" that we have here in Maine, USA. I will update this post IMMEDIATELY to say "proceed at your own risk" rather than "it is my understanding that...". Thank you so much!

EDIT COMPLETE. POST UPDATED