Jon Lopez de Guereña

I built a website image scraper + AI classifier

I have been a silent member of this community for many years, and I felt it was time to write a post 😊.

Python is one of the main languages in artificial intelligence, and AI and ML are improving at an incredible pace, so I wanted to start learning about the subject.

At first I thought about starting with one of the many Python learning websites, but I don't feel that's the best way to learn a new language when you already know how to code (just a personal opinion that works for me).
I prefer to set some goals and then look for the information. ⭐

So I decided to build a tiny pet project that gets contextual information about a website based on its images, so I could struggle with Python and AI in a more realistic way.

The application can scrape images from any website and classify them with TensorFlow.
It's built on FastAPI, which exposes a REST API for easier management, and it's fully dockerized so it can be deployed anywhere (without having to fight with CUDA drivers).
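
To give an idea of what the scraping step involves, here is a minimal sketch that pulls image URLs out of a page's HTML using only the standard library. It is not the project's actual code: the `ImageSrcParser` class and the `extract_image_urls` function are hypothetical names made up for this illustration, and a real scraper would also need to download the page and handle things like lazy-loaded images.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageSrcParser(HTMLParser):
    """Collects the absolute URL of every <img> tag that has a src attribute."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the URL of the page itself.
            self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Hypothetical helper: returns the image URLs found in an HTML string."""
    parser = ImageSrcParser(base_url)
    parser.feed(html)
    return parser.image_urls


sample = (
    '<html><body><img src="/a.png">'
    '<img src="https://cdn.example.com/b.jpg">'
    '<img alt="no src"></body></html>'
)
print(extract_image_urls(sample, "https://example.com/blog/"))
# → ['https://example.com/a.png', 'https://cdn.example.com/b.jpg']
```

Each URL collected this way could then be downloaded and passed to the classifier.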

I think it could be useful for creating image datasets or just for analyzing websites to get contextual information about them.

Of course, any suggestions are welcome! Feel free to check it out and give it a try 💗

py-web-image-scrapper-classifier

Description

This application exposes an API to scrape and classify images from a given URL.

It's fully dockerized, so you can deploy it anywhere.

You can easily select the tensorflow model to use for the classification by changing the model_name variable in the imageclassificator.py file.

The available endpoints are:

/scrap To scrape the images from the given URL

/classify To scrape and classify the images from the given URL

/image To get the image from the given URL

It uses FastAPI, TensorFlow and uvicorn.
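
As a rough illustration of how switching models through a `model_name` variable could work, here is a minimal sketch. It assumes the model names map to `tf.keras.applications` classes, which may not be how imageclassificator.py actually does it; the `SUPPORTED_MODELS` table and the `input_size_for` and `load_model` functions are hypothetical names for this example.

```python
# Hypothetical lookup table: Keras application name -> default input size.
SUPPORTED_MODELS = {
    "MobileNetV2": (224, 224),
    "ResNet50": (224, 224),
    "InceptionV3": (299, 299),
}

model_name = "MobileNetV2"  # the variable to change when switching models


def input_size_for(name):
    """Returns the (height, width) that images must be resized to for a model."""
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unknown model {name!r}; choose one of {sorted(SUPPORTED_MODELS)}"
        )
    return SUPPORTED_MODELS[name]


def load_model(name):
    """Builds the selected model with ImageNet weights (assumed setup)."""
    input_size_for(name)  # fail early on an unsupported name
    import tensorflow as tf  # imported lazily because it is slow to load

    return getattr(tf.keras.applications, name)(weights="imagenet")


print(model_name, input_size_for(model_name))
```

Keeping the input size next to the model name matters because different architectures expect different image dimensions, so the resizing step has to change along with the model.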

Usage

If you have CUDA installed, you can just run uvicorn:

uvicorn main:app --reload

Then go to http://localhost:8000/docs to try the API from the interactive documentation.
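
Once the server is running, the endpoints can also be called from a script. The sketch below is only a guess at the request shape: it assumes the endpoints accept GET requests with the target passed as a `url` query parameter and that they answer with JSON, none of which is confirmed here, so check the /docs page for the real signature. The `build_request_url` and `call_api` helpers are hypothetical names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "http://localhost:8000"


def build_request_url(endpoint, target_url, base=API_BASE):
    """Builds the request URL for an endpoint such as 'scrap' or 'classify'."""
    # Assumption: the API reads the target from a `url` query parameter.
    query = urlencode({"url": target_url})
    return f"{base}/{endpoint.strip('/')}?{query}"


def call_api(endpoint, target_url):
    """Sends the request and decodes the response, assuming it is JSON."""
    with urlopen(build_request_url(endpoint, target_url), timeout=60) as response:
        return json.load(response)


print(build_request_url("classify", "https://example.com"))
# → http://localhost:8000/classify?url=https%3A%2F%2Fexample.com
```

With the server up, `call_api("classify", "https://example.com")` would send the request and return the decoded response.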

Docker

If you don't have CUDA installed, you can use the Docker image:

docker build -t py-web-image-scrapper-classifier .
docker run -p 8000:8000 py-web-image-scrapper-classifier

Top comments (2)

Vincent A. Cicirello

I think your approach to learning a new language is the way to go --- give yourself a project you're interested in as motivation.

Jon Lopez de Guereña

So true! It feels much easier to keep going when you set goals.