A foreword from me from the future: Are you looking to get started in machine learning? Wondering whether the Stanford Machine Learning Course on Coursera by Andrew Ng is the best place to start? Look no further! In this blog, I will outline my journey through the course, along with my thought process and a guide to additional resources. Finally, tips on where you might go upon completion. Without further ado, let's rewind time, shall we?
The date is the 21st of November, 2020. I look around the internet for free resources to learn machine learning. Intense research on Google turns up Stanford's Machine Learning Course on Coursera as the top pick. Although the course is completely free, I apply for financial aid just because. I make plans to dedicate an hour each day to the course material in the midst of a busy freshman-year CS schedule. My background going into the course is as follows:
- Linear Algebra: high school
- Statistics: high school
- Programming: C, beginner-level Python
- Machine Learning: NULL
The date is the 23rd of November. I start the course.
Okay, this first week seems pretty simple. I get myself familiar with the different types of machine learning and am introduced to the concepts of the cost function and gradient descent. High school calculus makes this week feel like a breeze. Partial differentiation, although new to me, ends up being something I am able to grasp easily. The linear algebra content seems simple enough to skim.
A note from future me: Dear me, I am proud of you for completing the first week of the course despite how outdated the slides and content may seem in terms of visual quality. You finally have a proper idea of the different types of machine learning, and will later discover a third type called “Reinforcement Learning” - but alas, that is for the future. For now you are good 😌. Also, you will be surprised at how crucial a role the concepts of cost and gradient descent will play in the coming weeks and in your overall understanding of machine learning.
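For the curious reader, the week's two central ideas fit in a few lines. Here is a minimal sketch of gradient descent for one-variable linear regression, written in Python rather than the course's Octave, with toy data I made up for illustration:

```python
# Gradient descent for one-variable linear regression.
# Hypothesis: h(x) = theta0 + theta1 * x; cost J is the mean squared error.

def gradient_descent(xs, ys, alpha=0.1, iters=1000):
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iters):
        # Partial derivatives of the squared-error cost J(theta0, theta1)
        grad0 = sum(theta0 + theta1 * x - y for x, y in zip(xs, ys)) / m
        grad1 = sum((theta0 + theta1 * x - y) * x for x, y in zip(xs, ys)) / m
        # Step downhill on the cost surface, scaled by the learning rate
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

theta0, theta1 = gradient_descent([1, 2, 3], [2, 4, 6])
print(theta0, theta1)  # approaches 0 and 2, since the data is y = 2x
```

The learning rate `alpha` and iteration count are arbitrary here; the course spends a good amount of time on how to choose them sensibly.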
Okay, so this seems largely similar to the content of Week 1, but with broader scope. I like how it all flows naturally from Week 1 to Week 2. I see that we have a programming assignment this week. It seems intimidating, but turns out to be pretty simple. Installing Octave (the language used for the course assignments, with Matlab as an alternative) is straightforward. I complete the programming exercises in a text editor. Oh, the submission facility seems very intuitive! Color me impressed!
A note from future me: Dear me, again, proud of you for maintaining consistency. I do wish you had not been so intimidated by the first assignment, though - it is VERY simple. It turns out VS Code has an Octave extension that simplifies the entire process, something you will discover in the following days.
Oh, we’re finally doing something different! Classification! Surprisingly, a lot of the concepts from the previous weeks carry over to this week as well. What’s this, though - a new word? Sigmoid … such a peculiar name. The strangeness of the term makes the concept stick. Although I have trouble understanding classification in its entirety, I manage to complete the week.
A note from future me: Oh, past me, I wish I could tell you about two things that might have really helped you connect everything together. The first, well, is really this picture:
… which is pretty self-explanatory. The second is telling you to imagine a circular decision boundary, where the circle expands based on the size of the feature x. This would have helped you get to Week 4 faster, without all the meandering and wondering whether you were really ready for it - though you got there nonetheless.
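That circular-boundary intuition fits in a few lines of Python. This is a hypothetical logistic classifier with squared features, where hand-picked parameters (not learned ones) make the decision boundary a circle of radius r:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x1, x2, r=1.0):
    # Polynomial features x1^2 and x2^2 with hand-picked parameters
    # theta = (-r^2, 1, 1) give the boundary x1^2 + x2^2 = r^2:
    # a circle of radius r. Growing r grows the circle.
    h = sigmoid(-r**2 + x1**2 + x2**2)
    return 1 if h >= 0.5 else 0

print(predict(0.2, 0.3))       # inside the unit circle -> 0
print(predict(2.0, 2.0))       # outside -> 1
print(predict(2.0, 2.0, r=5))  # the bigger circle swallows the point -> 0
```

In the course, of course, the parameters come out of gradient descent on the logistic cost rather than being chosen by hand.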
Neural networks! I finally get to know what those are! I understand them as multiple rounds of logistic regression, with a network that selects its own features. But I do not understand the hype behind their ubiquity in modern ML applications. For the first time, the programming assignment seems a lot harder, but by looking through the different tutorials and discussions related to the course, I manage to complete it.
A note from future me: No, dear me, a neural network is not multi-step logistic regression. The sigmoid, as you will discover much later, is only one of many available activation functions. Alas, I am afraid you might complete the course with a very shallow understanding of neural networks. This video
might have helped immensely, had I been able to show it to you.
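For what it's worth, here is a tiny forward-pass sketch in Python making the same point: a network is layers of weighted sums pushed through an activation, and that activation need not be the sigmoid the course uses everywhere - ReLU is the modern default. All weights below are hand-picked and purely illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def forward(x, W1, b1, W2, b2, act=relu):
    # One hidden layer: weighted sum, then the chosen activation.
    hidden = [act(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    # Sigmoid output so the result reads as a probability
    return sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)

# Toy 2-2-1 network with made-up weights
W1 = [[1.0, -1.0], [-1.0, 1.0]]
b1 = [0.0, 0.0]
W2 = [1.0, 1.0]
b2 = -0.5
print(forward([2.0, 0.5], W1, b1, W2, b2))
```

Swapping `act=sigmoid` recovers the course's flavour of network; the structure is identical.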
Okay, so it looks like we’re going into the deep end of how neural networks work! Backpropagation - a neural network calculating its errors and working backwards, it seems? So I’m guessing that’s the difference between neural networks and regular regression - they use backpropagation instead of gradient descent. Getting to the assignments this week - oh my, are they a mess! How do they deviate so much from how I’d understood it? I barely pass the assignment, and that too thanks to a very detailed guide in the resources section, although I have to admit my understanding is clearly lacking.
A note from future me: Yes, past me, that week was indeed a mess no matter how you look at it! Unfortunately, that is one of the pitfalls of a course that is no longer actively maintained. And no, past me, it’s not that neural networks do not use gradient descent - it’s more that they use backpropagation in combination with gradient descent. I really wish I could have directed you to this video
which I’ve found to explain backpropagation in the clearest way possible. It would have saved you all those hours of confusion. Still, I am glad you stuck with the course, as that is one of only two weeks that I’d describe as categorically bad.
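To make that division of labour concrete: backpropagation computes the gradients (the chain rule applied backwards through the network), and gradient descent then uses those gradients to update the weights. A toy single-neuron sketch in Python, with made-up numbers rather than anything from the course:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One sigmoid neuron trained on a single example (x, y).
w, b = 0.5, 0.0
x, y = 1.0, 1.0
alpha = 1.0  # learning rate

for _ in range(100):
    # Forward pass
    a = sigmoid(w * x + b)
    # Backward pass ("backpropagation"): for the cross-entropy loss
    # J = -y*log(a) - (1-y)*log(1-a), the chain rule gives dJ/dz = a - y
    dz = a - y
    dw, db = dz * x, dz
    # Gradient descent uses those gradients to take the step
    w -= alpha * dw
    b -= alpha * db

print(sigmoid(w * x + b))  # prediction moves toward y = 1
```

With many layers, the backward pass repeats the same chain-rule step layer by layer - that is the part the Week 5 assignment asks you to vectorize, which is where the mess begins.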
I am very hesitant to start the new week after all the shock from the previous one. But what’s this - it looks like we’re done with neural networks! That’s a relief! This week proves to be one of the most interesting, informative and intuitive weeks in the entire course, dealing with the concept of fine-tuning your models. I feel relieved to have stuck through the hell that was Week 5, and go on to easily complete the week’s assignments - with some help from the resources section, of course.
We move to an entirely different algorithm this week - Support Vector Machines. I have a hard time grasping the concept of kernels. Those ugly feelings from Week 5 are starting to pop up again - oh no! I completely fail the week’s quiz. Fortunately, however, I discover this video series
which does a really good job of explaining how kernel functions work. Although I cannot grasp all the math behind them, the intuition proves adequate to get through the quiz. Unfortunately, the programming assignment does not get any easier. A single error on one line costs me an entire day of debugging, as this week’s assignment takes a lot longer than usual to run in the terminal.
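The intuition those videos gave can be sketched in a few lines. This is the Gaussian (RBF) kernel from the lectures, measuring similarity between a point and a landmark - the numbers are illustrative:

```python
import math

def gaussian_kernel(x, landmark, sigma=1.0):
    # Similarity score: 1.0 when the point sits on the landmark,
    # decaying toward 0 as they move apart. The SVM builds its
    # features out of these similarities to the landmarks.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2 * sigma ** 2))

print(gaussian_kernel([1, 2], [1, 2]))  # identical points -> 1.0
print(gaussian_kernel([1, 2], [5, 9]))  # far apart -> essentially 0
```

The bandwidth `sigma` controls how quickly the similarity falls off, which is exactly the knob the quiz questions keep poking at.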
A note from future me : You can pat yourself on the back - you have survived the last hell week of the course. It’s smooth sailing from here on out!
This week I learn my first unsupervised learning algorithm, K-Means, and thankfully it proves to be pretty straightforward. I also learn about dimensionality reduction - another concept that is simple at face value, but whose importance is unmistakably obvious, even if I do not completely understand the math behind it. I begin to understand the importance of a better grasp of linear algebra for machine learning.
A note from future me: You are absolutely correct, past me - a solid foundation in linear algebra is crucial to machine learning, and as you will discover in the future, so is statistics, especially when you eventually get to reinforcement learning. Not to worry though, as your high school linear algebra will prove adequate to finish off the course.
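For anyone following along, K-Means really is as straightforward as it sounds - the whole algorithm is the two alternating steps from the lectures. A minimal 1-D sketch in Python, on toy data rather than anything from the course:

```python
def kmeans(points, centroids, iters=10):
    # Two alternating steps: assign each point to its nearest centroid,
    # then move each centroid to the mean of its assigned points.
    # 1-D points keep the sketch short; the idea is identical in n-D.
    for _ in range(iters):
        clusters = {i: [] for i in range(len(centroids))}
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: (p - centroids[i]) ** 2)
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in clusters.items()]
    return centroids

print(kmeans([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], centroids=[0.0, 10.0]))
# the two centroids settle near 1.0 and 9.0
```

The course adds random initialization and running the whole thing several times, since a bad starting guess can strand a centroid.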
I am nearing the end of the course. Fittingly, this week presents more general applications of machine learning - namely anomaly detection and recommender systems. Although the concepts seem simple, I have a tough time wrapping my head around recommender systems. I end up failing the quiz. After failing it a few more times, I discover this helpful video
with which I finally manage to pass. The programming assignment for the week ends up being one of the shortest - I complete it in an hour, again with help from the resources section, of course.
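For the record, the core prediction of the recommender system that gave me so much trouble is small: estimate a rating as the dot product of a user's parameter vector and a movie's feature vector. The vectors below are made up purely for illustration:

```python
def predict_rating(user_theta, movie_features):
    # Collaborative filtering's prediction: a plain dot product
    return sum(t * f for t, f in zip(user_theta, movie_features))

romance_lover = [5.0, 0.0]    # cares about romance, not action
movie_romance = [0.9, 0.1]    # mostly a romance film
movie_action = [0.1, 0.95]

print(predict_rating(romance_lover, movie_romance))  # high rating
print(predict_rating(romance_lover, movie_action))   # low rating
```

The clever part the course covers is that neither the user vectors nor the movie vectors are given - both are learned together by gradient descent on the observed ratings.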
A note from future me : Look at that - that was your final programming assignment! Now then, let’s finish off the course!
These prove to be the shortest weeks, each taking about an hour to complete. I finish off the final two weeks in the span of two days. And would you look at that - there’s my certificate! I’m ecstatic to have something to show on LinkedIn for all that effort! The date is the 2nd of January. I seem to have finished well ahead of time!
A note from future me: Congrats, me! Your future self is proud of you for sticking with it and maintaining consistency despite a busy schedule. (Although, I have to say, the LinkedIn thing sounds kind of pathetic, doesn’t it? Ah well, as long as it gets you going.)
In the end, was it worth it? A course not maintained after 2011, with little visual appeal. Assignments in a language barely used in modern machine learning applications. Deep learning and neural networks, which are at the forefront of machine learning today, merely glanced over, and with little clarity at that. Looking back at it now, how many stars out of five would I give it as a starting point for individuals looking to enter the field of machine learning? A sparkling five stars, of course! Barring the relatively few inconveniences, I believe the course was some of the best use of my time, as I can now confidently explore different avenues of machine learning on the solid foundation it laid. Few courses offer this level of theoretical understanding across all aspects of the field. Believe the hype around this course - it’s real!
So where to from here? It's been about two weeks since I finished the course. From exploring the options out there, what I’ve found is that Kaggle’s plethora of mini-courses is the best way to find the answer to that very question for yourself!
Their Intro to Machine Learning, Intro to Deep Learning and Pandas courses are what I’ve found to be the best starting points. You will be amazed at how quickly you can pick up Python’s machine learning libraries - scikit-learn and TensorFlow + Keras - with your newfound foundational knowledge of machine learning!
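To give a taste of how little code those libraries ask of you, here is a minimal scikit-learn fit-and-predict example on toy data (assumes scikit-learn is installed):

```python
from sklearn.linear_model import LinearRegression

# Toy data following y = 2x, purely for illustration
X = [[1], [2], [3], [4]]
y = [2, 4, 6, 8]

# The same linear regression from Week 1 of the course,
# now in three lines: construct, fit, predict
model = LinearRegression()
model.fit(X, y)
print(model.predict([[5]]))  # close to 10
```

Every scikit-learn estimator follows this same fit/predict pattern, which is why the course's foundations transfer so quickly.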
At the time of writing, I am trying out scikit-learn and Pandas on different datasets from across the internet, the link to which you can find here :
This is where I store all of the datasets and notebooks I use to practice with scikit-learn, a free software machine learning library for Python. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy (to quote Wikipedia)
All the datasets are ones I've found after scouring the internet for datasets that would help me practice with a specific algorithm or set of algorithms, as well as gain experience with Pandas, a data manipulation and analysis library for Python.
You are completely free to use the datasets as you please, as well as to go through my notebooks to see how I've implemented various algorithms on particular datasets. Click through the links below to go through the associated notebooks.
I am also in the process of polishing my linear algebra knowledge with this course on YouTube :
My next goal is to start fastai’s Practical Deep Learning for Coders at the end of February. You can expect the next peek at my diary in a few months!
Any questions? Fire away below!