DEV Community

Discussion on: What is the best way for web scraping?

Peter Jachim

I really like using requests to make HTTP requests, with bs4 (Beautiful Soup) to parse the response. I can usually get what I need in about 4 lines of code, or about 10 if I'm iterating through links that meet specific criteria. The result generally looks pretty Pythonic and does everything I need it to.
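To make that concrete, here is a rough sketch of the requests + bs4 pattern, not a definitive recipe. The function names `extract_links` and `fetch_links`, the `must_contain` filter, and the sample HTML are all made up for illustration; the parsing step runs on an inline string so it can be tried without hitting a real site.

```python
import requests
from bs4 import BeautifulSoup


def extract_links(html, must_contain):
    """Return the href of every <a> tag whose href contains `must_contain`."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        a["href"]
        for a in soup.find_all("a", href=True)  # skip anchors without an href
        if must_contain in a["href"]
    ]


def fetch_links(url, must_contain):
    """Download `url` and return the matching links (hypothetical helper)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return extract_links(response.text, must_contain)


# Made-up sample page, used here instead of a live request.
SAMPLE_HTML = """
<html><body>
  <a href="/posts/1">First</a>
  <a href="/about">About</a>
  <a href="/posts/2">Second</a>
  <a name="no-href">Anchor</a>
</body></html>
"""

matching = extract_links(SAMPLE_HTML, "/posts/")
print(matching)
```

For a real page you would call `fetch_links` with the page's URL and whatever substring identifies the links you care about.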

If there are a lot of tables, you can use pandas to read them in as DataFrames, and if you need to click through pop-ups or fill out forms you can use Selenium (which is less presentable, but still easy to follow).
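For the table case, a minimal sketch using `pandas.read_html`, which returns one DataFrame per `<table>` it finds. The HTML below is invented sample data, and this assumes an HTML parser that pandas can use (lxml, or bs4 plus html5lib) is installed.

```python
from io import StringIO

import pandas as pd

# Made-up table standing in for a scraped page.
SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>a</td><td>10</td></tr>
  <tr><td>b</td><td>20</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one for each table in the document.
tables = pd.read_html(StringIO(SAMPLE_HTML))
df = tables[0]
print(df)
```

With a live page, the same call can take the HTML you already downloaded with requests, wrapped in `StringIO` as above.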

I actually used all of those techniques in this paper: arxiv.org/abs/2006.13990 (though the code is not available, so not super helpful for you).

Nishant Mittal

Thanks! Looks like Beautiful Soup is a common choice!