I know that there are A LOT of tutorials/blog posts on neural networks already (some of my favourites include the 3B1B series on YouTube),
but I am a big advocate of learning by doing. So this series will not just present a bunch of information to you;
it will actually ask you to implement the things we cover in each post.
The inspiration for Artificial Neural Networks (or neural networks for short) comes from Biological Neural Networks. But I haven't had a biology class
since high school, so I have no idea how a biological neural network works :) but I bet it looks something like this:
For this tutorial, we will go through the primitive building block of Artificial Neural Networks: the perceptron.
- Coordinate geometry
The perceptron and its learning rule are not popular anymore, but they are a great starting point for building an understanding of how everything works.
The goal of a perceptron is to classify sets of points.
Definition: A perceptron is a function that takes several inputs and produces one output:

y = f(w_0 + w_1·x_1 + w_2·x_2 + … + w_n·x_n)

where the w's are the weights, f is the activation function (explained below), the x's are the inputs, and y is the output.
This is basically feeding a weighted sum of the inputs into some function called the activation function.
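As a quick sketch of that computation (the function and variable names here are my own, not from the post), with the activation f left as a parameter since we haven't defined it yet:

```python
def perceptron(inputs, weights, f):
    # weighted sum: w_0 (the bias term) plus w_i * x_i for each input
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    # feed the weighted sum through the activation function
    return f(z)

# with the identity function as a stand-in activation,
# the output is just the weighted sum: -1 + 0.5*2 + 1*3
print(perceptron([2.0, 3.0], [-1.0, 0.5, 1.0], lambda z: z))  # 3.0
```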
To understand the importance of weights, it's useful to think about the case where we only have 2 inputs.
Considering only the part where we multiply the inputs by the weights and sum them up, we have:

w_0 + w_1·x_1 + w_2·x_2

Notice that this is very similar to the standard form of a linear equation:

ax + by + c = 0
Consider the following diagram, where we want to classify point A and point B (i.e. find a way to separate them).
In the diagram, the line has equation -1 - 2x + y = 0 (equivalently, y = 2x + 1).
Visually, it's clear that the line separates the two points. Below is the mathematical explanation.
From coordinate geometry, we know that any point "above" or "to the left" of the line (e.g. point A) will satisfy

-1 - 2x + y > 0

and any point "below" or "to the right" of the line (e.g. point B) will satisfy

-1 - 2x + y < 0
With that straight line, we have successfully classified point A and point B into 2 classes.
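To make the side test concrete, here is a small check, assuming the line in the diagram is the one given by the example weights later in this post, (w_0, w_1, w_2) = (-1, -2, 1):

```python
def line_expression(x, y):
    # -1 - 2x + y, using the assumed weights (w_0, w_1, w_2) = (-1, -2, 1)
    return -1 - 2 * x + 1 * y

A = (-0.7, 2.7)  # above / to the left of the line
B = (1.5, 1.1)   # below / to the right of the line

print(line_expression(*A) > 0)  # True: A is on the positive side
print(line_expression(*B) < 0)  # True: B is on the negative side
```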
But that only works visually, not mathematically yet. To make it work mathematically, we need the activation function.
So a question to ask is: how do we find the weights that will correctly classify the points we have? The answer to that is perceptron learning, and we will cover that in the next post.
More often than not, we want the output to be in the range 0 to 1 only, to indicate whether that certain perceptron is activated or not.
So we need some function, called the activation function, to do that for us.
One simple way to achieve that is to use the Heaviside step function, which converts all negative numbers to 0, and 0 and all positive numbers to 1.
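In code, the Heaviside step function is a one-liner:

```python
def heaviside(z):
    # negative numbers -> 0; zero and positive numbers -> 1
    return 0 if z < 0 else 1

print(heaviside(-2.9))  # 0
print(heaviside(0))     # 1
print(heaviside(3.1))   # 1
```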
Therefore, we have correctly classified points A and B mathematically.
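Putting the pieces together (again assuming the diagram's line comes from the weights (-1, -2, 1)), the perceptron gives the two points different outputs, which is exactly the two-class split we wanted:

```python
def heaviside(z):
    return 0 if z < 0 else 1

def perceptron(x, y, w_0, w_1, w_2):
    # weighted sum w_0 + w_1*x + w_2*y fed through the Heaviside function
    return heaviside(w_0 + w_1 * x + w_2 * y)

A, B = (-0.7, 2.7), (1.5, 1.1)
print(perceptron(*A, -1, -2, 1))  # 1: A is on the positive side of the line
print(perceptron(*B, -1, -2, 1))  # 0: B is on the negative side
```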
The weights determine whether a straight line (or a plane in higher dimensions) can separate the points into classes. Only certain sets of weights will be able to separate the points.
The activation function is just a function that maps all points fitting a certain criterion to the same output.
Write a function that takes a list of coordinate pairs and a list of classes, and determines whether the given weights are able to classify the points into their classes.
```python
def is_correct_weights(coords, classes, w_0, w_1, w_2) -> bool:
    pass

# example from above
coords = [(-0.7, 2.7), (1.5, 1.1)]
# classes[i] is the class of coords[i]
classes = [0, 1]

is_correct_weights(coords, classes, -1, -2, 1)  # True
is_correct_weights(coords, classes, -1, 0, 1)   # False
```
NOTE: do it in any language you want, but Python is recommended, since we will use Python much more later on.