Idea
When I was in college, I developed a project in a Jupyter notebook that consumes data from the Netflix Movies and TV Shows dataset. The idea was to clean and analyze this data and build a stage where I could recommend movies and TV shows.
I was very happy with the result, but I wanted more: I wanted to take this notebook and turn it into an application I could interact with. So I created a personal project where I can apply what I have studied and learned over time.
But something was missing: how was I going to show the results of my project? Around that time I discovered Streamlit. Incredible!!! The flexibility I gained using it was very good, and on top of that I can deploy using their platform, so I can show what I did.
I want to thank @shivamb on Kaggle for making the datasets below available. Besides the Netflix set, there are 3 more:
- Netflix Movies and TV Shows
- Hulu Movies and TV Shows
- Disney+ Movies and TV Shows
- Amazon Prime Movies and TV Shows
From these 4 datasets came the idea of creating a single one, expanding the data so I could generate more recommendations. Follow the link below.
4 Services Streaming Movies and Tv Shows
If you want to understand the process in more depth, I have a post and 4 more notebooks where I explain what I built:
- Post - K-Means Recommend Movies and Tv Shows
- Hulu Notebook
- Amazon Notebook
- Disney Notebook
- Notebook Netflix
Build
After this development, I took my notebooks and decided how I wanted to structure the project.
Containers
Using containers, I intended to split the project in two:
Pipeline container: runs the preprocessing and model-training stages.
Streamlit container: runs the application built with Streamlit.
I decided to do it this way so I could isolate these steps from my machine. This keeps the project consistent and prevents other applications or environment settings from interfering with development.
I added a volume to both containers so I could capture the results from the pipeline container and consume them in the application container. The way I carried out the deployment didn't use containers; they were only used for local work. I will add more details in the deploy stage.
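The two-container setup with a shared volume could be wired up with Docker Compose roughly like this. This is only a sketch; the service names, build paths, and mount points are my assumptions, not the project's actual files:

```yaml
# Hypothetical docker-compose.yml; names and paths are illustrative.
services:
  pipeline:
    build: ./pipeline
    volumes:
      - ./data:/app/data   # pipeline writes the processed datasets here
  app:
    build: ./src
    ports:
      - "8501:8501"        # Streamlit's default port
    volumes:
      - ./data:/app/data   # app reads the pipeline's outputs from the same volume
```

Because both services mount the same `./data` directory, files written by the pipeline container appear immediately inside the application container, with no rebuild needed.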
Preprocess
This step was essential to obtain two new datasets. First I needed to unify the 4 sets into a single one, and I also added a new column to identify the streaming channel of each title.
The first dataset was built to address the problem of grouping movie and TV show genres. I needed to clean the data and split a single column into five; this column held all of a title's genres grouped together. You can check this step in the notebook I developed on Kaggle to create the unified set: Clear data and Generate Dataset.
The second dataset was created from the initial one, now after data cleaning and the other necessary transformations. In the training stage I use this set again, but with one extra column: the number of the cluster that k-means assigned to each title. This way I can look up recommendations for a specific film by its cluster number.
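The unify-and-split idea above can be sketched with pandas. I use tiny inline frames as stand-ins for the four Kaggle CSVs, and `str.get_dummies` as one possible way to break the grouped genre string into per-genre columns (the real notebook may do this differently):

```python
import pandas as pd

# Toy stand-ins for two of the four Kaggle sets (column names follow the real data).
netflix = pd.DataFrame({"title": ["A"], "listed_in": ["Dramas, Comedies"]})
hulu = pd.DataFrame({"title": ["B"], "listed_in": ["Horror"]})

frames = []
for channel, df in {"netflix": netflix, "hulu": hulu}.items():
    df = df.copy()
    df["streaming"] = channel  # new column identifying the streaming channel
    frames.append(df)

# One unified dataset instead of four separate ones.
unified = pd.concat(frames, ignore_index=True)

# 'listed_in' holds every genre of a title grouped in one string; split it so
# each genre becomes its own 0/1 column, ready for clustering.
genres = unified["listed_in"].str.get_dummies(sep=", ")
print(genres.columns.tolist())  # alphabetical genre columns
```

The resulting 0/1 genre matrix is exactly the kind of input the training stage below needs.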
Training
The model I used was K-means. I kept the same algorithm I used in the notebooks, despite having plans to evolve this stage later; I wanted to do something simple at first. I chose this algorithm to solve my problem of grouping different genres of movies and TV shows. After preprocessing, I obtained a dataset containing only genre columns, and that is what I train the model on.
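A minimal sketch of that training step with scikit-learn, assuming a 0/1 genre matrix like the one produced by the preprocessing (the toy rows and the number of clusters are illustrative, not the project's real values):

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy genre matrix: rows are titles, columns are 0/1 genre indicators.
X = np.array([
    [1, 0, 0],  # drama
    [1, 0, 0],  # drama
    [0, 1, 1],  # comedy + romance
    [0, 1, 1],  # comedy + romance
])

# Fit k-means; each title receives a cluster number, which becomes the
# extra column appended to the dataset in the training stage.
model = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = model.fit_predict(X)
print(labels)
```

Titles that land in the same cluster are the ones the application later recommends together.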
Application Streamlit
To bring my idea to life, I used Streamlit. I added a directory called src and put everything related to Streamlit inside it. Next to src, after the pipeline container runs, the generated files go into a directory called data, which holds everything from the initial to the final datasets. This way I don't need to run the pipeline when deploying my application.
The application was developed in the main.py file, which contains the visual part. I split the application's functionality into separate files so I don't end up with one huge script; if I need to do maintenance, it is much more practical.
With a single directory for the application, I copy everything the app uses into the container. After starting the container I check the result, and if I need to change something, the volume means there is no need to stop and start again.
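Stripped of the UI layer, the core lookup the application performs can be sketched as: given the clustered dataset the pipeline writes to data, recommending means filtering titles that share the selected title's cluster number. The column names and the `recommend` helper here are my assumptions, not the project's actual code:

```python
import pandas as pd

# Toy stand-in for the clustered dataset the pipeline writes to data/.
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "streaming": ["netflix", "hulu", "netflix", "disney"],
    "cluster": [0, 0, 1, 1],
})

def recommend(selected: str, df: pd.DataFrame) -> list:
    """Return the titles that k-means grouped with the selected one."""
    cluster = df.loc[df["title"] == selected, "cluster"].iloc[0]
    same = df[(df["cluster"] == cluster) & (df["title"] != selected)]
    return same["title"].tolist()

print(recommend("A", df))  # ['B']
```

In the real application, a Streamlit widget supplies `selected` and the result is rendered as the recommendation list.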
Deploy
Streamlit is an incredible tool: besides its many features, you can deploy to their cloud and share your application. If you browse the list of shared projects, you will fall in love with each one. Very good projects.
With this facility, I just needed to create an account on the platform, connect my repository, and configure a few things, such as where my application file is and which branch to use. Then click deploy and wait for the project to build. That's it: your application is available.
You can check out the project and its repository via the link below.
Blueflix Streamlit
Conclusion
This personal project is a dream that I am developing, and I want to keep evolving it with the skills I acquire along the way. I won't always be adding updates, because I have ideas for other projects I want to develop, but I won't stop paying attention to it. I hope other developers understand my code and that I managed to pass on what I learned in this time. I hope you enjoyed it. If you could leave a like on my post or on my notebooks, I would really appreciate it; that way I know whether you liked it. Thank you for reading this far.
About the author:
A little more about me...
I graduated with a Bachelor's degree in Information Systems, and in college I had contact with different technologies. Along the way, I took an Artificial Intelligence course, where I had my first contact with machine learning and Python. From then on, learning about this area became my passion. Today I work with machine learning and deep learning, developing communication software. I also created a blog where I write posts about subjects I am studying and share them to help other developers.
I'm currently learning TensorFlow and Computer Vision
Curiosity: I love coffee