Machine learning is a technology I do not know much about. But what I do know is that it's an amazing tool for solving great problems. Take self-driving cars, for example: the thought of not having to steer the wheel sounds amazing, and they could help us reduce traffic and fight climate change while also saving time and making streets safer.
But as with every great technology, there's a lot of buzz around it. As a developer, I wanted to better understand what kinds of problems it can solve and how to actually utilise machine learning, so I took on a challenge - to become a machine learning engineer in 9 days.
In this article, I'll walk you through my journey and process and share some resources.
I started with almost zero knowledge and ended up building a tool that's able to detect OK and STOP signs in a video/webcam feed.
In order to timebox the task and have a clear path for learning, I created a plan.
- Day 1: Go through no-code ML tools to understand the basics of what's possible
- Day 2-3: Learn the basic theory (courses, the MIT course, 3Blue1Brown or something similar)
- Day 4: Go through some practical video tutorials that introduce me to the tech
- Day 5-9: Build a tool that's able to recognize stop and ok signs in a video
The reality? I can already say that not everything went as planned.
On my first two days, I learned a lot about no-code ML tools and got a basic understanding of what ML can do. I went through a list of different no-code ML tools and added brief comments about each of them.
It's quite amazing that there is software for enterprise-level use as well as tools for complete beginners.
For example, if you don't know anything about machine learning, I advise you to play around with Teachable Machine. It's the tool that really wowed me in the beginning as I was easily able to build a pose detector.
Another fun tool to use was Lobe, where I could build an image detector and learn about image labelling. Using Lobe felt like playing a game as its UI was really simple to use.
Peltarion seems to be a really thorough platform. As a beginner who didn't know much about ML, I found many of the options confusing, but after going through the tutorials, things started to make sense. They also have helpful materials for learning ML/AI jargon.
As for Obviously.ai, I recommend it to business people who want to understand what's possible with AI, as it uses more business jargon, which I liked. It also gives the user example datasets to play around with. I made use of the Airbnb dataset.
My goal was to learn more about machine learning and deep learning.
I went through numerous videos and articles to get a basic understanding of things. I'd say the hardest part is knowing what to learn and how to learn it, as the number of resources out there is huge. My approach was to start with basic videos and then move on to deep learning.
From there on, I took an MIT course on deep learning. I didn't understand everything, but it definitely gave a great overview of how neural networks work. I picked up additional information from 3Blue1Brown's channel.
At the very least, you should have an understanding of simple terms like
- precision, recall, true positives, false positives, confusion matrix.
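To make these terms concrete, here is a minimal sketch in plain Python. The labels are made up for illustration and are not results from my project; the function names are hypothetical helpers. It counts the four cells of a two-class confusion matrix and derives precision and recall from them.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN when one class is treated as 'positive'."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            tp += 1  # true positive: predicted positive, and it was
        elif guess == positive:
            fp += 1  # false positive: predicted positive, but it was not
        elif truth == positive:
            fn += 1  # false negative: missed a real positive
        else:
            tn += 1  # true negative
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up ground truth and predictions, for illustration only.
y_true = ["stop", "stop", "stop", "ok", "ok", "ok"]
y_pred = ["stop", "stop", "ok", "stop", "ok", "ok"]

tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive="stop")
precision, recall = precision_recall(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision answers "of everything the model called STOP, how much really was STOP?", while recall answers "of all the real STOP signs, how many did the model find?".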
When introduced to deep learning, there are also phrases like
- convolutional layers
- learning rate
- gradient descent
And don't worry if you don't understand everything. I certainly do not, but it was enough for me to grasp the main concepts.
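For two of the terms above, gradient descent and learning rate, a toy example helped me more than any definition. The sketch below is not how a deep learning framework is implemented; it minimises a made-up one-variable loss, and the function names are my own. It shows how the learning rate scales each step and what happens when it is too large.

```python
def gradient_descent(grad, start, learning_rate, steps):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Toy loss f(x) = (x - 3)^2. Its gradient is 2 * (x - 3) and its minimum is at x = 3.
def toy_gradient(x):
    return 2 * (x - 3)


# A moderate learning rate moves steadily towards the minimum.
good = gradient_descent(toy_gradient, start=0.0, learning_rate=0.1, steps=100)

# A learning rate that is too large overshoots further on every step.
too_large = gradient_descent(toy_gradient, start=0.0, learning_rate=1.1, steps=20)

print(good, too_large)
```

Training a neural network follows the same idea, only with millions of parameters instead of a single `x`.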
After 2 days of learning the basic theory, I wanted to do something more practical.
I found a video by Deeplizard on the freeCodeCamp YouTube channel, which was super practical and well-explained. It went through all the steps on how to
- set up a local environment
- use Jupyter notebook
- create our own datasets for training
- train the model
- create neural networks
- check the confusion matrix.
If you want to learn the basics of how to build an image classifier, I strongly recommend that video. They also provide source code that I used as a basis for my future endeavours.
By the end of the day, I used the source code provided by the tutorial, played around with it and got my first training results to differentiate cats from dogs! AWESOME!
On my 6th day, I started building my own project.
The goal was to build an application that's able to detect STOP and OK hand signs in a video so I could use those detections to cut out parts in my recordings that I don't need.
I took the source code from Deeplizard's tutorial and started modifying it based on my needs, only to realise that I needed some training data for my model.
I created tons of images of myself showing both STOP and OK signs, grouped them into test/train/validate folders and fed them to the deep learning model. I was hopeful that it would work. It didn't.
I consulted one of my colleagues and got the advice to first detect the hand and then classify it. That knowledge made me rethink the whole approach.
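The grouping into folders can be scripted. Below is a minimal sketch, not my exact script: it assumes the raw images sit in one subfolder per label, and the `train`/`valid`/`test` folder names and the 10% fractions are assumptions chosen for illustration.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source_dir, target_dir, valid_frac=0.1, test_frac=0.1, seed=0):
    """Copy images from source_dir/<label>/ into target_dir/<split>/<label>/."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {"train": 0, "valid": 0, "test": 0}
    for label_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        files = sorted(p for p in label_dir.iterdir() if p.is_file())
        rng.shuffle(files)
        n_valid = int(len(files) * valid_frac)
        n_test = int(len(files) * test_frac)
        splits = {
            "valid": files[:n_valid],
            "test": files[n_valid:n_valid + n_test],
            "train": files[n_valid + n_test:],
        }
        for split, members in splits.items():
            out_dir = target_dir / split / label_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for path in members:
                shutil.copy2(path, out_dir / path.name)
                counts[split] += 1
    return counts
```

Calling `split_dataset("raw", "data")` on a hypothetical `raw/ok` and `raw/stop` layout would produce `data/train/ok`, `data/valid/ok`, `data/test/ok` and the same for `stop`.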
To make the system work, I had to do the following for each frame:
- detect the hand (if there is one)
- take a cropped screenshot of the detected hand
- classify whether the hand is showing an OK or a STOP sign
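The three per-frame steps above can be sketched as a small two-stage pipeline. This is an outline under assumptions rather than my actual code: `detect_hand` and `classify_hand` are hypothetical callables standing in for the trained detector and classifier, and a frame is represented as plain rows of pixel values so the example runs without any libraries.

```python
def crop(frame, box):
    """Cut the (x, y, width, height) box out of a frame stored as rows of pixels."""
    x, y, w, h = box
    return [row[x:x + w] for row in frame[y:y + h]]


def label_frames(frames, detect_hand, classify_hand):
    """Return one label per frame, or None when no hand is found.

    detect_hand(frame) -> box or None and classify_hand(image) -> label
    are hypothetical stand-ins for the real models.
    """
    labels = []
    for frame in frames:
        box = detect_hand(frame)  # step 1: find the hand, if any
        if box is None:
            labels.append(None)
            continue
        hand = crop(frame, box)  # step 2: keep only the hand region
        labels.append(classify_hand(hand))  # step 3: OK or STOP
    return labels
```

Splitting the work this way means the classifier only ever sees the hand, not the whole room behind it.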
On my 7th day, I started searching for tutorials about object detection and quickly found some great ones about YOLO, the "You Only Look Once" object detection system.
The tutorials I used were created by Jay Bhatt and Pysource.
In these tutorials, I was introduced to Colab, which is basically an online version of Jupyter from Google for running your code and training models. It also connects to your Google Drive, making file management quite convenient.
Because it lets you use a remote GPU, it was much faster than my local machine.
After going through the tutorials, it was time to set up my own hand detection. The enthusiasm I had was lost after 5 minutes when I had to start LABELLING MANUALLY.
It meant drawing boxes around roughly 500 hands in different images to create the training data for my future model. As a developer, I really dislike manual work and well... that was just painful.
Nevertheless, after using the source code from the tutorial and tweaking it to my needs, I got my object detection to work! All the hands were actually detected.
I even tested it locally with a live webcam feed and boy it felt good to see an actual working piece of software using machine learning.
The 8th day started with high hopes - I had a working hand detector, now it was just necessary to classify them.
Using Deeplizard's source code as a basis, I gathered training data from the internet and trained the model. The results? It seemed to work okay-ish, until I tested it on real footage from my videos. It didn't work at all - STOP signs were recognized as OK.
I spent the day hacking around with the code and training data to make sense of what's going on.
On my ninth day, I came up with the idea of using the YOLO detector and a live webcam feed to create training data. I set up the webcam and wrote a Python script that saved a cropped image to the training data whenever a hand was detected.
I got about 2000 images and was feeling hopeful. When I sent an image from the video feed to the classifier, everything seemed to work. The validation accuracy was 98%, which was clearly too good to be true.
The results on real video footage were still not perfect, but good enough for me, as I could add extra checks at the application level.
I guess that in order to improve the model, I need more training data from various situations and should use augmentation (showing the same image in different ways).
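One possible application-level check is to trust a sign only when it is seen in several frames in a row, so a single misclassified frame cannot trigger a cut. The sketch below illustrates that idea under my own assumptions; the `min_run` threshold and the function name are hypothetical, and it is not necessarily the check used in the final tool.

```python
def stable_labels(labels, min_run=5):
    """Keep a label only where it repeats for at least min_run consecutive frames."""
    result = [None] * len(labels)
    start = 0
    while start < len(labels):
        # Find the end of the current run of identical labels.
        end = start
        while end < len(labels) and labels[end] == labels[start]:
            end += 1
        # Keep the run only if it is a real label and long enough.
        if labels[start] is not None and end - start >= min_run:
            result[start:end] = [labels[start]] * (end - start)
        start = end
    return result
```

With a video at 30 frames per second, a `min_run` of 5 would mean the sign has to be held for roughly a sixth of a second before it counts.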
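To illustrate what augmentation means, here is a minimal sketch in plain Python that treats a grayscale image as rows of 0-255 pixel values. The three transforms and their parameters are examples I picked for illustration. In practice you would more likely reach for the augmentation utilities of your deep learning library.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def shift_right(image, pixels, fill=0):
    """Move the content right by `pixels`, padding the left edge with `fill`."""
    shifted = []
    for row in image:
        n = min(max(pixels, 0), len(row))  # clamp the shift to the row width
        shifted.append([fill] * n + list(row[:len(row) - n]))
    return shifted


def adjust_brightness(image, delta):
    """Add delta to every pixel, clamped to the 0-255 range."""
    return [[max(0, min(255, value + delta)) for value in row] for row in image]


def augment(image):
    """Return the original plus a few altered copies of it."""
    return [
        image,
        flip_horizontal(image),
        shift_right(image, 1),
        adjust_brightness(image, 40),
    ]
```

Each altered copy keeps the same label as the original, so one labelled photo turns into several training examples.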
Did I become a machine learning engineer? Definitely not. But I did come to understand how machine learning works and what kinds of problems one could solve using it.
Having done this experiment, I understand how important data is in training the models.
The main takeaways for me are that
- Machine learning isn't as hard as it seems, at least on the basic level.
- You can learn a lot about the field by using no-code tools.
- For developers wanting to try out the tech, the amount of free resources is huge. Just go to YouTube, open freeCodeCamp and you'll already see some practical tutorials.
- The most important part in creating ML models is data and its preparation.
- And as with every technology, before using ML, one must ask what problem they are trying to solve.
To finish, I want to say that there are many things that I do not know I do not know, but having gone through the experience of building something myself certainly made me even more excited about the future of machine learning.
I genuinely hope I motivated you to try out machine learning.
If you have thoughts and ideas about this article/video, then please leave a comment in the comments section below.
- Teachable Machine, https://teachablemachine.withgoogle.com/
- Lobe, https://lobe.ai/
- Obviously.ai, http://obviously.ai/
- Peltarion, https://peltarion.com/
- Machine Learning Basics by Simplilearn, https://youtu.be/ukzFI9rgwfU
- The 7 steps of machine learning by Google Cloud Tech, https://youtu.be/nKW8Ndu7Mjw
- Machine Learning Basics by Edureka, https://youtu.be/hjh1ikznScg
- All Machine Learning Models Explained in 5 Minutes by Learn With Whiteboard, https://youtu.be/yN7ypxC7838
- But what is a Neural Network? by 3Blue1Brown, https://youtu.be/aircAruvnKk
- MIT Course by Alexander Amini, https://youtu.be/5tvmMX8r_OM
- Keras with TensorFlow Course by Deeplizard, https://youtu.be/qFJeN9V1ZsI
- YOLO object detection by Pysource, https://youtu.be/h56M5iUVgGs
- YOLO object detection by Jay Bhatt, https://youtu.be/hTCmL3S4Obw
- About classification, Google's ML crash course, https://developers.google.com/machine...
- Precision, Recall, F1 score, True Positive by Codebasics, https://youtu.be/2osIZ-dSPGE
There were definitely more articles and resources I read but unfortunately didn't write down. So if you have some, please share them in the comments.