Terra

Posted on • Originally published at pourterra.com

Mathematics for Machine Learning - Day 8

Dear mathematicians,

I love you, math and mathematicians... but why? Today I spent the most time trying to understand and read the fewest pages... And so, it has become the meme of the day.

What happened? It all began with:

Algorithms for Solving a System of Linear Equations

Until now, the examples and matrices chosen always had a solution. But if we can't assume that a solution exists, we can still approximate one.

One way is to use the approach of linear regression.

$Ax = B$

If A is a square matrix and is invertible (non-zero determinant), then:

$x = A^{-1}B$

If A isn't square (so it has no ordinary inverse) but its columns are linearly independent, we can use something called:

Moore-Penrose Pseudo-inverse

$Ax = B \\ A^{T}Ax = A^{T}B \\ (A^{T}A)^{-1}A^{T}Ax = (A^{T}A)^{-1}A^{T}B \\ x = (A^{T}A)^{-1}A^{T}B$
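To see the formula in action, here is a minimal sketch in plain Python that fits a line through three made-up points. The helper names (`transpose`, `matmul`, `inverse_2x2`, `least_squares`) and the data are my own assumptions for illustration, not something from the book, and the 2x2 inverse only works because this A has two columns.

```python
from fractions import Fraction


def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(m):
    """Invert a 2x2 matrix with the closed-form formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def least_squares(a, b):
    """Compute x = (A^T A)^{-1} A^T B for a matrix A with two columns."""
    at = transpose(a)
    pseudo_inverse = matmul(inverse_2x2(matmul(at, a)), at)
    return matmul(pseudo_inverse, b)


# Three equations, two unknowns: no exact solution exists for this system.
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)]]
B = [[Fraction(1)], [Fraction(2)], [Fraction(2)]]

x = least_squares(A, B)
print(x)  # [[Fraction(7, 6)], [Fraction(1, 2)]]
```

Exact fractions keep the arithmetic readable here. For real work you would more likely reach for a library routine such as `numpy.linalg.lstsq` or `numpy.linalg.pinv` instead of forming the inverse by hand.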

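Though we can solve systems with these methods, they have some disadvantages:

1. They require many computations for the matrix-matrix product and for inverting $A^{T}A$
2. They lean on Gaussian elimination, which gets impractical for very large systems because its cost grows cubically with the number of equations. It still plays an important role, though. What for?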
• Compute determinant
• Compute inverse
• Check for linear independence
• Compute the rank of a matrix
• Find basis of a vector space

Well... Technically

In practice, systems of many linear equations are solved indirectly, with iterative methods such as:

1. Stationary iterative methods
• Richardson method
• Jacobi method
• Gauss-Seidel method
• Successive over-relaxation method
2. Krylov subspace methods
• Generalized minimal residual
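To get a feel for what "solved indirectly" means, here is a rough sketch of the Jacobi method from the list above. The function name `jacobi`, the fixed iteration count, and the example system are assumptions I picked for illustration. The system is diagonally dominant, which is a standard condition under which Jacobi converges.

```python
def jacobi(a, b, iterations=50):
    """Approximate the solution of Ax = b by repeated Jacobi updates."""
    n = len(b)
    x = [0.0] * n  # start from an all-zero guess
    for _ in range(iterations):
        # Every new component is computed from the previous guess only.
        x = [
            (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)
        ]
    return x


# 4x + y = 9 and 2x + 5y = 9, whose exact solution is x = 2, y = 1.
A = [[4.0, 1.0], [2.0, 5.0]]
B = [9.0, 9.0]

solution = jacobi(A, B)
print(solution)
```

No inverse and no elimination: each pass just nudges the guess closer to the answer, which is why this family of methods scales to very large systems.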

Vector Space

A structured space in which vectors live. I know, very helpful explanation.

But before we can learn about vector spaces, there's something that needs to be established.

Group

A set of elements and an operation defined on these elements that keeps some structure of the set intact.

This is where the fun begins.

$\text{A set } \mathscr{G} \text{ and an operation } \bigotimes : \mathscr{G} \times \mathscr{G} \to \mathscr{G} \\ \text{then} \\ G := (\mathscr{G},\bigotimes) \text{ is called a group (with some conditions)}$

Conditions

Now, to make sure you understand how confused I was, I'll notate the equations first then continue with the explanation.

$\text{1. Closure of } \mathscr{G} \text{ under } \bigotimes : \forall x, y \in \mathscr{G} : \\ x \bigotimes y \in \mathscr{G}$
$\text{2. Associativity: } \forall x, y, z \in \mathscr{G} : \\ (x \bigotimes y) \bigotimes z = x \bigotimes (y \bigotimes z)$
$\text{3. Neutral element: } \exist e \in \mathscr{G} \forall x \in \mathscr{G} : \\ x \bigotimes e = x \And e \bigotimes x = x$
$\text{4. Inverse element: } \forall x \in \mathscr{G} \exist y \in \mathscr{G} : \\ x \bigotimes y = e \And y \bigotimes x = e \text{ where e is the neutral element}$

Where are the numbers?

Good question. I don't know, even the inverse portion doesn't have any -1 on it. Which I have to say... Impressive. Let's start with the very first thing.

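$G$ and $\mathscr{G}$

I won't lie, neither KaTeX nor LaTeX has this exact $\mathscr{G}$, and further searching led me nowhere. At first I assumed the two were the same thing, but going by the definition above they aren't: $\mathscr{G}$ is the set of elements, while $G$ is the group, meaning that set paired with its operation.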

So what's a set?

If you're like me, you basically forgot this even existed aside from inside programming languages. It's very similar.

A set is a collection of unique/distinct objects. For example:

{1,2,3,...} is the set of natural numbers
{2,4,6,...} is the set of positive even numbers

Now you might think that "Oh, that's easy. Let's move on". Then let me ask you this:

Which one of these is a set?

$\{0,1,4,9,16,\dots\} \\ \{0,1,7,9,16,\dots\}$

Both. How about these?

$\{2,3,5,7,11,13,\dots\} \\ \{34,1,6,7,3,4,9,10\}$

Still both. Finally, this:

$\{1,1,2,3,5,8,13,\dots\} \\ \{\spades, \hearts, \diamonds, \clubs\}$

I don't have to say it, right? The Fibonacci sequence isn't a set, because 1 shows up twice and a set only holds distinct objects.
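Since sets in programming languages behave very similarly, a quick Python sketch makes the "distinct objects" rule concrete. The helper `has_duplicates` and the sample listings are hypothetical names and data I made up for this check.

```python
def has_duplicates(listing):
    """Return True if some object appears more than once in the listing."""
    return len(set(listing)) != len(listing)


squares = [0, 1, 4, 9, 16]
fibonacci = [1, 1, 2, 3, 5, 8, 13]  # the first few Fibonacci numbers
suits = ["spades", "hearts", "diamonds", "clubs"]

# Building a set silently drops the repeated 1.
fibonacci_as_set = set(fibonacci)

print(has_duplicates(squares))    # False
print(has_duplicates(fibonacci))  # True
print(has_duplicates(suits))      # False
print(len(fibonacci), len(fibonacci_as_set))  # 7 6
```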

\bigotimes or ⨂ (This is the KaTeX notation)

\bigotimes, or Big'O as the kids say these days, is a placeholder for binary operations. Not to be confused with the other operations that are also written with ⨂; I spent so much time thinking it was one of those.

What do I mean by placeholder?

Let me give you an example

$x \bigotimes y = 4 \\ \text{if x = 2, what is the value of y?} \\ \text{Answer} = \text{Yes}$

y can be 2, -2, or 1/2, since Big'O can be addition (2 + 2), subtraction (2 - (-2)), multiplication (2 × 2), or division (2 ÷ 1/2). All of these answers are correct; which one applies depends on further information.
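Here is a small sketch of the placeholder idea, using Python's standard `operator` module to swap the operation in and out. The `candidates` table is my own illustration, and it assumes division is among the operations the placeholder may stand for.

```python
import operator
from fractions import Fraction

x = 2
target = 4

# Each entry pairs one possible meaning of the placeholder
# with the y that makes "x (op) y = 4" true for it.
candidates = {
    "addition": (operator.add, 2),
    "subtraction": (operator.sub, -2),
    "multiplication": (operator.mul, 2),
    "division": (operator.truediv, Fraction(1, 2)),
}

results = {name: op(x, y) for name, (op, y) in candidates.items()}

for name, value in results.items():
    print(name, value == target)  # True for every operation
```

Same x, same result, different y each time: the equation alone doesn't pin down y until you say which operation the symbol stands for.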

Big'O works on matrices or sets in the same way. Here are the things Big'O can stand in for:

1. Addition
2. Subtraction
3. Multiplication
4. Union
5. Intersection

With that out of the way, let's continue with the conditions!

Closure

$\text{1. Closure of } \mathscr{G} \text{ under } \bigotimes : \forall x, y \in \mathscr{G} : \\ x \bigotimes y \in \mathscr{G}$

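Closure means that when x and y are elements of the set $\mathscr{G}$, the result of the operation is also an element of $\mathscr{G}$.

That means, if $\mathscr{G}$ is the set of natural numbers and X and Y are elements of it, then:

1. X and Y > 0
2. If X ⨂ Y = Z, then Z must also be an element of the set, for every possible choice of X and Y.
3. Addition passes this test. Subtraction doesn't: X - Y is only a natural number when X > Y, so the natural numbers are not closed under subtraction.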
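A quick way to poke at closure is to try every pair from a sample of natural numbers. This is only a sketch: `is_natural` and `closed_on_sample` are hypothetical helpers, natural numbers are assumed to start at 1, and a finite sample can disprove closure with a counterexample but can't prove it for the whole infinite set.

```python
import operator
from itertools import product


def is_natural(value):
    """Natural numbers here are the positive integers 1, 2, 3, ..."""
    return isinstance(value, int) and value > 0


def closed_on_sample(sample, op):
    """Check that x (op) y stays natural for every pair drawn from the sample."""
    return all(is_natural(op(x, y)) for x, y in product(sample, repeat=2))


sample = range(1, 21)

addition_closed = closed_on_sample(sample, operator.add)
subtraction_closed = closed_on_sample(sample, operator.sub)

print(addition_closed)     # True
print(subtraction_closed)  # False, for example 1 - 2 = -1
```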

Associativity

$\text{2. Associativity: } \forall x, y, z \in \mathscr{G} : \\ (x \bigotimes y) \bigotimes z = x \bigotimes (y \bigotimes z)$

This is something we've already discussed extensively, so I won't go on too long. It just means that the grouping of the operations doesn't matter: you can evaluate either pair first and get the same result.

Neutral element

$\text{3. Neutral element: } \exist e \in \mathscr{G} \forall x \in \mathscr{G} : \\ x \bigotimes e = x \And e \bigotimes x = x$

The neutral element is an element that, when combined with any other value, returns that other value unchanged.

Here's what counts as the neutral element, which depends on the operation:

$\text{Integers under addition} = 0 \\ \text{Matrices under multiplication} = \text{Identity Matrix} \\ \text{Sets under union} = \emptyset$

Inverse element

$\text{4. Inverse element: } \forall x \in \mathscr{G} \exist y \in \mathscr{G} : \\ x \bigotimes y = e \And y \bigotimes x = e \text{ where e is the neutral element}$

This condition goes hand in hand with the previous one: for every value x there exists a y that, when combined with x, returns the neutral element.

Here's what I mean:

$x \bigotimes y = e \\ \text{if } \bigotimes \text{ is addition, then e = 0} \\ \text{if x = 3} \\ \therefore \text{y = -3}$

This means that the inverse of 3 is -3. What's interesting is that the author notes that although the inverse is often written with a -1 superscript on the element, it doesn't necessarily mean dividing 1 by that value.

$X^{-1} \not = \frac{1}{X}$
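To tie the four conditions together, here is a brute-force sketch that tests them on a small finite set. Everything in it is an assumption for illustration: the function `is_group`, and the choice of the remainders 0 to 4 with addition and multiplication modulo 5 as test cases. It only works for finite sets, since it tries every combination.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force check of the four group conditions on a finite set."""
    elements = list(elements)

    # 1. Closure: every result must land back in the set.
    if any(op(x, y) not in elements for x, y in product(elements, repeat=2)):
        return False

    # 2. Associativity: grouping must not change the result.
    if any(op(op(x, y), z) != op(x, op(y, z))
           for x, y, z in product(elements, repeat=3)):
        return False

    # 3. Neutral element: some e leaves every x unchanged on both sides.
    neutral = [e for e in elements
               if all(op(x, e) == x and op(e, x) == x for x in elements)]
    if not neutral:
        return False
    e = neutral[0]

    # 4. Inverse element: every x has a y that brings it back to e.
    return all(any(op(x, y) == e and op(y, x) == e for y in elements)
               for x in elements)


def add_mod_5(x, y):
    return (x + y) % 5


def mul_mod_5(x, y):
    return (x * y) % 5


addition_is_group = is_group(range(5), add_mod_5)
multiplication_is_group = is_group(range(5), mul_mod_5)
nonzero_multiplication_is_group = is_group(range(1, 5), mul_mod_5)

print(addition_is_group)                # True
print(multiplication_is_group)          # False, 0 has no inverse
print(nonzero_multiplication_is_group)  # True
```

Notice that under addition modulo 5 the inverse of 3 is 2, because 3 + 2 leaves remainder 0. No -1 exponent and no division in sight, which is exactly the author's point about the inverse depending on the operation.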

And that's it for today.

If you think it's pretty short and that you can understand it very quickly without any mathematical background, you're welcome. I'm kidding, sort of. It took me so long to understand those notations and what it really means to be a set, an operation, etc.

Since the author said this part is very important in computer science, I want to make sure I understand it as clearly as possible, and it'll take a while :D

P.S.

I'm starting to understand more and more about mathematical notation, but whenever I feel like I can breeze through a few pages, math just slaps me in the face and kicks me in the nuts. Feels bad man.

Acknowledgement

I can't overstate this: I'm truly grateful for this book being open-sourced for everyone. Many people will be able to learn and understand machine learning on a fundamental level. Whether changing careers, demystifying AI, or just learning in general, this book offers immense value even for a fledgling composer such as myself. So, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, thank you for this book.

Source:
Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge: Cambridge University Press.
https://mml-book.com