
Terra

Posted on • Originally published at pourterra.com

Mathematics for Machine Learning - Day 2

A brief disclaimer.

Today, and probably every day after, is where I'll start to struggle. I'll still dedicate the same amount of reading time each day, but the content might be shorter :D since even when a formula is already proven, it's better to try to disprove it yourself too. So I'll bribe you with a meme from Reddit.

Matrix Meme

A mini review for vectors

  1. An array of numbers (Computer Science)
  2. An arrow with direction and magnitude (Physics)
  3. An object that obeys addition and scaling (Mathematics)

Chapter 2 (Linear Algebra)

Examples of Vectors

  1. Geometric vectors
  2. Polynomials
  3. Audio signals
  4. Elements of \mathbb{R}^n (tuples of n real numbers)

Why?

The simple explanation is that all four examples obey the mathematical definition: two of them can be added together, or one can be multiplied by a scalar, and the result is another vector of the same kind.
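To make "can be added and scaled" concrete, here is a minimal sketch using polynomials. Storing a polynomial as a list of coefficients is an assumption made for this example, as are the helper names `add` and `scale`.

```python
# Polynomials stored as coefficient lists: [a0, a1, a2] means a0 + a1*x + a2*x^2.
# (This representation is an assumption made for the example.)

def add(p, q):
    """Add two polynomials of the same length, coefficient by coefficient."""
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    """Multiply every coefficient of p by the scalar c."""
    return [c * a for a in p]

p = [1, 2, 0]  # 1 + 2x
q = [0, 1, 3]  # x + 3x^2

total = add(p, q)      # 1 + 3x + 3x^2, still a polynomial
doubled = scale(2, p)  # 2 + 4x, still a polynomial
print(total, doubled)
```

Both results are again coefficient lists of the same length, which is the whole point: adding or scaling never takes you outside the set.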

Vector Mindmap


The tl;dr:

  1. Matrices are composed of vectors and represent systems of linear equations
  2. Systems of linear equations are solved with Gaussian elimination, which can also be used to find a matrix inverse
  3. Linear independence is a property of vectors, which is explained alongside linear/affine mapping in Chapter 12 (Classification)

System of Linear Equations

Why should you know about systems of linear equations?

The simple flow is: if a problem can be transformed into a system of linear equations, then that problem can be solved using linear algebra.

Though there is a caveat: the number of equations must be the same as the number of unknown variables (Commonly... I'll explain this later on)

Example

No solutions

\begin{align*}
(1) \quad & x_1 + x_2 + x_3 = 3 \\
(2) \quad & x_1 - x_2 + 2x_3 = 2 \\
(3) \quad & 2x_1 + 3x_3 = 1
\end{align*}

Adding (1) and (2):

\begin{align*}
(x_1 + x_2 + x_3) + (x_1 - x_2 + 2x_3) &= 3 + 2 \\
2x_1 + 3x_3 &= 5
\end{align*}

This contradicts (3):

\begin{align*} 2x_1 + 3x_3 &= 1 \end{align*}
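The contradiction can also be checked mechanically. This is a small sketch, not a general solver: each equation is stored as a pair of (coefficients, right-hand side), and the helper name `add_equations` is made up for the example.

```python
# Each equation is (coefficients of x1, x2, x3, right-hand side).
eq1 = ([1, 1, 1], 3)
eq2 = ([1, -1, 2], 2)
eq3 = ([2, 0, 3], 1)

def add_equations(e, f):
    """Add two equations term by term, left sides and right sides separately."""
    return ([a + b for a, b in zip(e[0], f[0])], e[1] + f[1])

combined = add_equations(eq1, eq2)  # left side 2*x1 + 3*x3, right side 5

# Same left-hand side as (3) but a different right-hand side: no solution.
same_left_side = combined[0] == eq3[0]
same_right_side = combined[1] == eq3[1]
inconsistent = same_left_side and not same_right_side
print(combined, inconsistent)
```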

Found Solution

\begin{align*}
(1) \quad & x_1 + x_2 + x_3 = 3 \\
(2) \quad & x_1 - x_2 + 2x_3 = 2 \\
(3) \quad & x_2 + x_3 = 2
\end{align*}

Subtracting (2) from (1) gives (4):

\begin{align*} (4) \quad & 2x_2 - x_3 = 1 \end{align*}

That leaves two equations in two unknowns, (3) and (4):

\begin{align*}
(3) \quad & x_2 + x_3 = 2 \\
(4) \quad & 2x_2 - x_3 = 1
\end{align*}

Subtracting twice (3) from (4):

\begin{align*}
-3x_3 &= -3 \\
\therefore x_3 &= 1
\end{align*}

With x_3 found, it's just a matter of substituting it back: (3) gives x_2 = 1, and (1) then gives x_1 = 1, so we'll find

\begin{align*} x_1 = 1, \quad x_2 = 1, \quad x_3 = 1 \end{align*}
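The elimination steps above can be automated. Below is a minimal sketch of Gaussian elimination with back substitution, using exact fractions from the standard library. The function name `solve` is made up for the example, and it assumes a square system with exactly one solution (it fails otherwise).

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gaussian elimination with back substitution.

    Assumes A is square and the system has exactly one solution.
    """
    n = len(A)
    # Augmented matrix [A | b] with exact rational arithmetic.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]

    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column's entries below the pivot row.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]

    # Back substitution, from the last unknown up to the first.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x

A = [[1, 1, 1],
     [1, -1, 2],
     [0, 1, 1]]
b = [3, 2, 2]

solution = solve(A, b)
print([int(v) for v in solution])
```

Fractions are used instead of floats so the result matches the hand calculation exactly, with no rounding.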

Surprise Quiz!

There's one more problem. There are no contradictions and (technically) there's an answer. But what's more fun is to try to see what makes this problem difficult to solve by creating a question yourself!

\begin{align*}
(1) \quad & x_1 + x_2 + x_3 = 3 \\
(2) \quad & x_1 - x_2 + 2x_3 = 2 \\
(3) \quad & 2x_1 + 3x_3 = 5
\end{align*}

No cheating >:( It's about having fun learning, not finding the answers.


Three base concepts of matrices

  1. Addition and subtraction. The rule is that both matrices must have the same number of rows and columns.
    \begin{align*} A_{mn} + B_{mn} &= C_{mn} \end{align*}
  2. Multiplication (matrix product). The rule is that the number of columns of the first matrix (in this case A) must equal the number of rows of the second matrix (in this case B).
    \begin{align*} A_{mn} \cdot B_{nk} &= C_{mk} \end{align*}
    The result is a new matrix of size (rows of A x columns of B). Each entry is the dot product of a row of A with a column of B.
  3. The identity matrix. An n x n matrix with ones on the main diagonal and zeros everywhere else.
    I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
    The main diagonal is where the row and column index are the same: row 1 column 1, row 2 column 2, etc.
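The three base concepts fit in a few lines of plain Python. This is only a sketch: matrices are stored as lists of rows, and the helper names `add`, `matmul`, and `identity` are made up for the example.

```python
def add(A, B):
    """Element-wise sum; both matrices must have the same shape."""
    if len(A) != len(B) or len(A[0]) != len(B[0]):
        raise ValueError("shapes must match")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def matmul(A, B):
    """Matrix product; the columns of A must equal the rows of B."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def identity(n):
    """n x n matrix with ones on the main diagonal and zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2, 3],
     [4, 5, 6]]   # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]      # 3 x 2

S = add(A, A)     # 2 x 3, same shape as A
C = matmul(A, B)  # 2 x 2: rows of A by columns of B
I3 = identity(3)
print(S, C, I3)
```

Calling `add(A, B)` here raises an error, since a 2 x 3 and a 3 x 2 matrix don't have the same shape.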

Properties of Matrices

Much like the base concepts, there are three properties described in the book before we delve into Inverse and Transpose.

Associativity

\forall A \in \mathbb{R}^{m \times n}, \; B \in \mathbb{R}^{n \times p}, \; C \in \mathbb{R}^{p \times q} \; : \; (AB)C = A(BC)

Don't be scared or annoyed by these symbols! This just means that for all (\forall) matrices A, B, and C with real-number entries (\mathbb{R}) and dimensions m x n, n x p, and p x q respectively, the equation holds: how you group the products doesn't change the result.
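A quick numeric check of associativity, as a sketch. The `matmul` helper and the example matrices are made up here; one example doesn't prove the property, it only illustrates it.

```python
def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]   # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]      # 3 x 2
C = [[2, 1],
     [0, 3]]      # 2 x 2

left = matmul(matmul(A, B), C)   # (AB)C
right = matmul(A, matmul(B, C))  # A(BC)
print(left == right)
```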

P.S. If you weren't scared or annoyed, congratulations you're not me.

Distributivity

\begin{align*}
\forall A, B \in \mathbb{R}^{m \times n}, \; C, D \in \mathbb{R}^{n \times p} \; : \; (A+B)C &= AC + BC \\
A(C+D) &= AC + AD
\end{align*}

Much like scaling a matrix, when addition/subtraction appears alongside multiplication, the multiplication is distributed over the addition/subtraction. The order of the factors must be kept, since matrix multiplication is not commutative in general.
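Both distributive laws can be checked on small matrices. This is a sketch with made-up helpers (`add`, `matmul`) and made-up 2 x 2 examples, so m = n = p = 2 here.

```python
def add(A, B):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]
D = [[1, 1], [0, 2]]

# (A + B)C = AC + BC
right_distributes = matmul(add(A, B), C) == add(matmul(A, C), matmul(B, C))
# A(C + D) = AC + AD
left_distributes = matmul(A, add(C, D)) == add(matmul(A, C), matmul(A, D))
print(right_distributes, left_distributes)
```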

Multiplication with identity matrix

\forall A \in \mathbb{R}^{m \times n} \; : \; I_{m \times m} A_{m \times n} = A_{m \times n} I_{n \times n} = A_{m \times n}

Multiplying a matrix by an identity matrix, from either side, changes neither the shape nor the values of the matrix. Note that the two identity matrices have different sizes when m and n differ: m x m on the left and n x n on the right.
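Here is a small sketch of that rule with a non-square matrix, so the two identity matrices really do have different sizes. The helper names `matmul` and `identity` are made up for the example.

```python
def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def identity(n):
    """n x n matrix with ones on the main diagonal and zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2, 3],
     [4, 5, 6]]  # m = 2 rows, n = 3 columns

left = matmul(identity(2), A)   # I_m A, needs the 2 x 2 identity
right = matmul(A, identity(3))  # A I_n, needs the 3 x 3 identity
print(left == A, right == A)
```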

Acknowledgement

I can't overstate this: I'm truly grateful for this book being open-sourced for everyone. Many people will be able to learn and understand machine learning on a fundamental level. Whether changing careers, demystifying AI, or just learning in general, this book offers immense value even for a fledgling composer such as myself. So, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, thank you for this book.

Source:
Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge: Cambridge University Press.
https://mml-book.com
