Bum Kom

Crawling dev.to with Colly while learning Golang

I would like to share a website I built while learning Golang.
The site, http://techdaily.info, was my project for practicing the language.
Besides dev.to, it also crawls other websites such as freecodecamp.com, medium.com, hashnode.com, logrocket.com, and infoq.com.
In short, it is a website that specializes in crawling other sites.
Technologies I used:

  • Golang
  • Colly
  • Nginx
  • systemd service
  • Docker
  • MySQL
  • Actions runner for deploying to the server
  • Cron job for scheduled crawls
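
Because the site pulls from several sources, a crawler like this usually needs to decide which links it is allowed to follow. Colly offers an `AllowedDomains` collector option for that purpose; the standard-library sketch below only illustrates the same idea and is not taken from the project. The `allowedHosts` list reuses the sites named above, and `isAllowed` (including its choice to accept subdomains) is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// allowedHosts reuses the sites named in the post.
// Assumption: the real crawler's allow-list may differ.
var allowedHosts = []string{
	"dev.to", "freecodecamp.com", "medium.com",
	"hashnode.com", "logrocket.com", "infoq.com",
}

// isAllowed reports whether rawURL is an http(s) link whose host is one of
// hosts or a subdomain of one of them. Hypothetical helper for illustration.
func isAllowed(rawURL string, hosts []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return false
	}
	host := strings.ToLower(u.Hostname())
	for _, h := range hosts {
		if host == h || strings.HasSuffix(host, "."+h) {
			return true
		}
	}
	return false
}

func main() {
	links := []string{
		"https://dev.to/t/go",
		"https://blog.logrocket.com/some-post",
		"https://example.com/",
		"mailto:someone@dev.to",
	}
	for _, link := range links {
		fmt.Printf("%-40s allowed=%v\n", link, isAllowed(link, allowedHosts))
	}
}
```

Checking the host before queuing a link keeps the crawl from wandering off to unrelated sites through outbound links.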

Build and Run Locally

Copy app_example.yaml to app.yaml

cp app_example.yaml app.yaml

Build and start the Docker containers

docker-compose up --build

Install the Go packages

docker-compose exec crawl go mod tidy

Create the vendor folder

docker-compose exec crawl go mod vendor

Run the crawler

docker-compose exec crawl go run cmd/main.go

Use air for live reload

docker-compose exec crawl air -c .air.conf

Deploy

Run the Makefile targets to build the project into the bin folder

make copy_template build_app_web build_app_crawl

Create services to run in the background

Create the service and run the web app

sudo nano /lib/systemd/system/app_web.service

Paste the following content

[Unit]
Description=App Web

[Service]
Type=simple
Restart=always
RestartSec=5s
WorkingDirectory=/root/actions-runner/crawl/crawl/crawl/bin
ExecStart=/root/actions-runner/crawl/crawl/crawl/bin/app_web

[Install]
WantedBy=multi-user.target

Enable and start the service, then check its status

sudo systemctl enable app_web
sudo systemctl start app_web
sudo systemctl status app_web

Run the crawl app

./app_crawl
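
The crontab entries further down call this binary with the subcommands `crawl-article` and `crawl-article-detail`. As a rough sketch of how a Go binary might route such subcommands (the `run` function and its placeholder descriptions are assumptions, not the project's actual code, and what each subcommand really does is a guess from its name):

```go
package main

import (
	"errors"
	"fmt"
)

// run picks an action from the first CLI argument.
// The subcommand names come from the crontab entries in this post;
// the returned descriptions are placeholders, not the project's real logic.
func run(args []string) (string, error) {
	if len(args) == 0 {
		return "", errors.New("usage: app_crawl <crawl-article|crawl-article-detail>")
	}
	switch args[0] {
	case "crawl-article":
		return "collect article links from listing pages", nil
	case "crawl-article-detail":
		return "fetch details for collected articles", nil
	default:
		return "", fmt.Errorf("unknown command %q", args[0])
	}
}

func main() {
	// Sample invocations for demonstration; a real binary would call run(os.Args[1:]).
	samples := [][]string{{"crawl-article"}, {"crawl-article-detail"}, {"bogus"}}
	for _, args := range samples {
		msg, err := run(args)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(args[0], "->", msg)
	}
}
```

Splitting the work into two subcommands lets each one run on its own schedule from cron.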

Add crontab entries

crontab -e

Add the cron schedule

0 * * * * /root/actions-runner/crawl/crawl/crawl/bin/app_crawl crawl-article
*/20 * * * * /root/actions-runner/crawl/crawl/crawl/bin/app_crawl crawl-article-detail

Reload cron

sudo service cron reload

Website

http://techdaily.info/


Source code: https://github.com/chieund/crawl