DEV Community


Discussion on: How to crawl website using #bash script?

Sm0ke

Crawling means grabbing a page and extracting its data into a structured format.

Wget handles the first part: downloading the page. For the second phase, you can use Scrapy or BeautifulSoup.
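A minimal sketch of the two phases. The wget call is commented out because it needs network access, and the URL is only a placeholder; a local sample file stands in for the downloaded page. The grep/sed extraction is a quick-and-dirty illustration, not a substitute for a real parser like Scrapy or BeautifulSoup.

```shell
# Phase 1: download the page with wget (placeholder URL, needs network):
# wget -q -O page.html https://example.com/

# For this sketch, simulate the downloaded page with a local file.
cat > page.html <<'EOF'
<html><head><title>Example</title></head>
<body><h1>Hello</h1><a href="/about">About</a></body></html>
EOF

# Phase 2: extract structured data. Here, pull out link targets
# with grep/sed; real projects should use a proper HTML parser.
grep -o '<a href="[^"]*"' page.html | sed 's/<a href="//;s/"$//'
```

Running this prints `/about`, the single link target in the sample page.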

Arissk

You can also stay in Bash using hxselect and other HTML/XML command-line tools.
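For example, `hxnormalize` and `hxselect` (from the `html-xml-utils` package) let you query HTML with CSS selectors entirely from the shell. A hedged sketch on a local sample file, guarded in case the package is not installed:

```shell
# Sample page to query (stands in for a downloaded file).
cat > snippet.html <<'EOF'
<html><body><ul class="links"><li>one</li><li>two</li></ul></body></html>
EOF

if command -v hxselect >/dev/null 2>&1; then
  # hxnormalize -x makes the HTML well-formed XML;
  # hxselect -c prints only element content, -s sets the separator.
  hxnormalize -x snippet.html | hxselect -c -s '\n' 'ul.links li'
else
  echo "hxselect not installed (e.g. apt install html-xml-utils)"
fi
```

With the tools installed, this prints `one` and `two`, one per line.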

Ankit Dobhal Author

Thanks for your response.
You're right: to crawl, I can use some Python as you explain, or use some of these tools.

Mohammed Samgan Khan

That's what I was wondering: wget will only download the page. Crawling means going through the content of the page.

Ankit Dobhal Author

Thanks for your response. I know it's not pure crawling, but if I only want to grab one page, I'll use wget.
To download or crawl a whole site, I'll surely use some Python tools.