DEV Community

Hamza

How I Scraped Anime Filler List Website to get list of episodes of anime

At the start of this year I was learning web development, mainly HTML, CSS, and JavaScript, and I wanted to build projects to understand the concepts better. Since I loved watching anime at the time, I thought it would be a great idea to make a clone of the Anime Filler List website, so I started working on it. Instead of using an API (I couldn't find one), I decided to scrape the Anime Filler List website for the details I needed.

Here's how I did it:

Libraries

These are the libraries I used. 'requests' sends a request to the website and returns the response that contains the page data. 'bs4' (BeautifulSoup) parses the data returned by the request. 'json' converts a Python list of dictionaries into a JSON array of objects. Lastly, 'shutil' moves the JSON file from one location to another.
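As a minimal sketch (not the exact code from the project), the imports for these four libraries look roughly like this. It assumes the two third-party packages were installed with `pip install requests beautifulsoup4`; 'json' and 'shutil' ship with Python.

```python
import json    # standard library: convert Python objects to JSON
import shutil  # standard library: move files between directories

import requests                # third-party: send HTTP requests
from bs4 import BeautifulSoup  # third-party: parse HTML into a searchable tree
```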

Anime List

Here I created a list of the anime whose episode lists I wanted to get. It is a big list, so I did not include every name in the image.
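A list like this might look as follows. The entries are hypothetical examples, and the sketch assumes each name is written the way it appears in the website's URL (lowercase, with hyphens instead of spaces).

```python
# Hypothetical entries; the real list in the project is much longer.
# Each name is assumed to match the show's URL slug on the website.
anime_list = [
    "naruto",
    "naruto-shippuden",
    "bleach",
    "one-piece",
]
```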

Source Code

The function getJsons() in the above image is the main program. It takes the name of the anime, as it appears on the website, as a parameter. It makes a request to the website and stores the response in the variable 'result'. Using bs4, it then parses the text inside result and stores the parsed document in the variable 'soup'. From there we are free to select whichever HTML tags we want with soup.select() and filter the resulting lists to keep only the text inside the tags.
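The fetch-and-select step can be sketched like this. The helper names, the URL pattern, and the CSS class names ('Number', 'Title', 'Type') are assumptions for illustration, so check them against the real page before relying on them. The demo at the bottom runs on a tiny hand-written table instead of hitting the website.

```python
import requests
from bs4 import BeautifulSoup

# Assumed URL pattern for a show page; verify it against the real site.
BASE_URL = "https://www.animefillerlist.com/shows/"


def fetch_soup(anime_name):
    """Download the show page and parse it into a BeautifulSoup document."""
    result = requests.get(BASE_URL + anime_name, timeout=10)
    result.raise_for_status()  # stop early on 404s and other HTTP errors
    return BeautifulSoup(result.text, "html.parser")


def select_texts(soup, selector):
    """Return the stripped text of every element matching a CSS selector."""
    return [tag.get_text(strip=True) for tag in soup.select(selector)]


# Offline demo on made-up HTML (the class names are assumptions).
sample_html = """
<table class="EpisodeList"><tbody>
  <tr><td class="Number">1</td><td class="Title">First Episode</td><td class="Type">Canon</td></tr>
  <tr><td class="Number">2</td><td class="Title">Recap Special</td><td class="Type">Filler</td></tr>
</tbody></table>
"""
soup = BeautifulSoup(sample_html, "html.parser")
numbers = select_texts(soup, "td.Number")
titles = select_texts(soup, "td.Title")
types = select_texts(soup, "td.Type")
```

For a real page, `soup = fetch_soup("naruto")` would replace the hand-written sample.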
Next, it initializes a list to hold our Python objects. It loops through the list of text contents and checks whether each episode is a filler: if it is, the object gets the HTML class 'red'; otherwise it gets the class 'green'. It stores all the data under the required titles in the object and finally appends the object to the list.
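Here is a rough sketch of that loop. The function name and the dictionary keys are hypothetical, and it assumes the episode type is the plain text "Filler" for filler episodes; anything else is treated as non-filler.

```python
def build_episodes(numbers, titles, types):
    """Combine parallel lists of cell text into one dict per episode."""
    episodes = []
    for number, title, ep_type in zip(numbers, titles, types):
        # Assumption: only an exact "filler" label (any casing) marks a filler.
        is_filler = ep_type.strip().lower() == "filler"
        episodes.append({
            "number": number,
            "title": title,
            "type": ep_type,
            "class": "red" if is_filler else "green",
        })
    return episodes
```

Storing the class name in the object means the front end can apply it directly when it renders each row.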
It then converts the resulting list into a JSON array of objects using the 'json' library, creates a file named anime-name.json, and stores the array inside it.
Finally, using the 'shutil' library, it moves the JSON file from the current directory to where we want it.
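The save-and-move step could look something like this. The function name and the `target_dir` parameter are hypothetical; the sketch writes the file in the current directory first and then moves it, as described above.

```python
import json
import shutil
from pathlib import Path


def save_and_move(episodes, anime_name, target_dir):
    """Write episodes to <anime_name>.json, then move the file into target_dir."""
    filename = f"{anime_name}.json"
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(episodes, f, indent=2, ensure_ascii=False)
    # Create the destination folder if it does not exist yet.
    Path(target_dir).mkdir(parents=True, exist_ok=True)
    # shutil.move returns the path of the file at its new location.
    return shutil.move(filename, str(Path(target_dir) / filename))
```

Writing straight into the target folder would also work; moving the file afterwards just mirrors the two steps in the description.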

Run

We run this function for all the anime in the anime list.
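A minimal sketch of that loop is below. The `run_all` helper is hypothetical, and the try/except is an extra that is not part of the description above: it lets the loop keep going when one show fails, for example because its name does not match a page on the website.

```python
def run_all(anime_list, scrape):
    """Call scrape(name) for every name and return the names that failed."""
    failed = []
    for name in anime_list:
        try:
            scrape(name)
        except Exception as error:  # keep going if one show fails
            print(f"Skipping {name}: {error}")
            failed.append(name)
    return failed
```

In the project this would be called as `run_all(anime_list, getJsons)`.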
Thank you! I hope you learned something from this post.
Here is the link to the GitHub repo of my project, AnimVille.
