DEV Community

Discussion on: Web Scraping with Javascript and Node.js

Ander Rodriguez

Hi, thanks for the ideas!
Promise.all is definitely an option if you know all the URLs beforehand, but I don't think there's a way to add new ones once the process has launched. Setting a concurrency limit would probably be tricky too.
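To illustrate what I mean, here's a rough sketch assuming a hypothetical `crawl(url)` function that fetches and parses a page. `Promise.all` fires everything at once from a fixed list; a naive way to cap concurrency is to process the list in fixed-size batches:

```javascript
// Promise.all: every URL must be known up front, all requests start immediately.
async function crawlAll(urls, crawl) {
  return Promise.all(urls.map(crawl));
}

// A crude concurrency limit: process the list in batches of `limit`.
// Each batch waits for the previous one to finish completely.
async function crawlInBatches(urls, crawl, limit = 2) {
  const results = [];
  for (let i = 0; i < urls.length; i += limit) {
    const batch = urls.slice(i, i + limit);
    results.push(...(await Promise.all(batch.map(crawl))));
  }
  return results;
}
```

The batching approach wastes time when one request in a batch is slow, which is part of why a proper queue ends up being nicer.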

As for the random proxy, yes, you could implement something similar to the sample function for headers: keep a list of all the available proxies and pick one at random for each request.
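Something along these lines (the proxy URLs are placeholders, and the helper name is just illustrative):

```javascript
// Hypothetical proxy pool; in practice these would be your real proxy endpoints.
const proxies = [
  'http://proxy1.example.com:8080',
  'http://proxy2.example.com:8080',
  'http://proxy3.example.com:8080',
];

// Pick one at random for each request, mirroring the random-headers sample function.
function randomProxy() {
  return proxies[Math.floor(Math.random() * proxies.length)];
}

// Then pass randomProxy() to whatever proxy option your HTTP client exposes.
```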

Gonzalo Muñoz

Right! Didn't think of that. We need something like a sliding window, if you will. Maybe the queue is the simplest option then.
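Something like this minimal queue sketch, maybe. Names and the worker signature are just illustrative: each worker gets a `push` callback so it can enqueue newly discovered URLs mid-crawl, while `concurrency` caps how many run at once:

```javascript
// Minimal task queue with a concurrency limit where tasks can be added
// while the crawl is running. `done` resolves once the queue drains.
function createQueue(worker, concurrency = 2) {
  const pending = [];
  let active = 0;
  let resolveDone;
  const done = new Promise((resolve) => (resolveDone = resolve));

  function next() {
    if (active === 0 && pending.length === 0) return resolveDone();
    while (active < concurrency && pending.length > 0) {
      const task = pending.shift();
      active++;
      Promise.resolve(worker(task, push)).finally(() => {
        active--;
        next();
      });
    }
  }

  function push(task) {
    pending.push(task);
    next();
  }

  return { push, done };
}
```

A worker would then do something like `push(link)` for every new link it finds on a page, and the window of in-flight requests never exceeds `concurrency`.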