Terra

Posted on • Originally published at pourterra.com

Mathematics for Machine Learning - Day 18

High school mathematics

Honestly, this is because I forgot that the book shouldn't need a bachelor's in mathematics to read :(

A more mathematical proof

So, remember the long breakdown I did before? Turns out the authors have a fancier way (and probably the correct way) of showing why there's an inverse in our formula.

We are going to look at

$$\tilde{A}_\Phi = T^{-1} A_\Phi S \in \mathbb{R}^{m \times n}$$

With:

1. $S \in \mathbb{R}^{n \times n}$ being the transformation matrix of $\mathrm{id}_V$ that maps coordinates with respect to $\tilde{B}$ onto coordinates with respect to $B$

and

2. $T \in \mathbb{R}^{m \times m}$ being the transformation matrix of $\mathrm{id}_W$ that maps coordinates with respect to $\tilde{C}$ onto coordinates with respect to $C$

What does this mean?

We'll get to that. Before that though, remember this?

$$B = (b_1, \dots, b_n), \quad \tilde{B} = (\tilde{b}_1, \dots, \tilde{b}_n) \quad \text{are bases of } V$$

$$C = (c_1, \dots, c_m), \quad \tilde{C} = (\tilde{c}_1, \dots, \tilde{c}_m) \quad \text{are bases of } W$$

The new basis vectors can be written as linear combinations of the old ones:

$$\tilde{b}_j = s_{1j} b_1 + \dots + s_{nj} b_n = \sum_{i=1}^{n} s_{ij} b_i, \quad j = 1, \dots, n$$

The left-hand side of the equation, $\tilde{b}_j$, is a vector of the new basis $\tilde{B}$ ($B$ with a weird eyebrow) of $V$.

And,

$$\tilde{c}_k = t_{1k} c_1 + \dots + t_{mk} c_m = \sum_{l=1}^{m} t_{lk} c_l, \quad k = 1, \dots, m$$

The left-hand side of the equation, $\tilde{c}_k$, is a vector of the new basis $\tilde{C}$ ($C$ with a weird eyebrow) of $W$.

Finally

$$S = ((s_{ij})) \in \mathbb{R}^{n \times n} \text{ and } T = ((t_{lk})) \in \mathbb{R}^{m \times m}$$

Here $S$ is the transformation matrix that maps coordinates with respect to $\tilde{B}$ (eyebrow) onto coordinates with respect to $B$, and $T$ is the transformation matrix that maps coordinates with respect to $\tilde{C}$ (eyebrow) onto coordinates with respect to $C$.
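
To make the definition of $S$ concrete, here is a minimal NumPy sketch. The two bases are made-up examples (not from the book), stored as matrix columns, and $S$ is found by solving $B S = \tilde{B}$, so column $j$ of $S$ holds the coordinates of $\tilde{b}_j$ with respect to $B$.

```python
import numpy as np

# Hypothetical bases of V = R^2, one basis vector per column.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])        # old basis: b_1 = (1, 0), b_2 = (1, 1)
B_tilde = np.array([[1.0, 1.0],
                    [1.0, -1.0]]) # new basis: b~_1 = (1, 1), b~_2 = (1, -1)

# Column j of S holds the coordinates of b~_j with respect to B,
# which is the same as saying B @ S = B_tilde.
S = np.linalg.solve(B, B_tilde)   # S is [[0, 2], [1, -1]]

# Check the sum formula b~_j = sum_i s_ij * b_i for every j.
for j in range(B.shape[1]):
    rebuilt = sum(S[i, j] * B[:, i] for i in range(B.shape[0]))
    assert np.allclose(rebuilt, B_tilde[:, j])
```

The matrix $T$ can be built the same way from $C$ and $\tilde{C}$.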

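Now, let's find out where the first equation comes from. For every $j = 1, \ldots, n$, we look at

$$\Phi(\tilde{b}_j) \in W$$

Note:

The book starts from $\Phi(\tilde{b}_j)$. Since this is the image of a new basis vector of $V$, it lives in $W$, so it can be written in terms of the new basis $\tilde{C}$ (in general as a linear combination of the $\tilde{c}_k$, not as a single $\tilde{c}_j$).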

First method:

Apply Mapping

Applying the mapping $\Phi$ to each $\tilde{b}_j$, $j = 1, \ldots, n$,

results in

$$\Phi(\tilde{b}_j) = \sum_{k=1}^{m} \tilde{a}_{kj} \tilde{c}_k = \sum_{k=1}^{m} \tilde{a}_{kj} \sum_{l=1}^{m} t_{lk} c_l = \sum_{l=1}^{m} \left( \sum_{k=1}^{m} t_{lk} \tilde{a}_{kj} \right) c_l$$

where $\tilde{a}_{kj}$ are the entries of $\tilde{A}_\Phi$, the matrix of $\Phi$ with respect to the new bases. We first expressed the new basis vectors $\tilde{c}_k \in W$ as linear combinations of the basis vectors $c_l \in W$ and then swapped the order of summation.

Second method:

This time we first express $\tilde{b}_j \in V$ as a linear combination of the basis vectors $b_i \in V$, and arrive at

$$\Phi(\tilde{b}_j) = \Phi \left( \sum_{i=1}^{n} s_{ij} b_i \right) = \sum_{i=1}^{n} s_{ij} \Phi(b_i) = \sum_{i=1}^{n} s_{ij} \sum_{l=1}^{m} a_{li} c_l = \sum_{l=1}^{m} \left( \sum_{i=1}^{n} a_{li} s_{ij} \right) c_l, \quad j = 1, \ldots, n$$

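where we exploited the linearity of $\Phi$.

Finally, both methods describe the same vector $\Phi(\tilde{b}_j)$, and coordinates with respect to the basis $C$ are unique, so the coefficients of each $c_l$ must match:

$$\sum_{k=1}^{m} t_{lk} \tilde{a}_{kj} = \sum_{i=1}^{n} a_{li} s_{ij}, \quad j = 1, \ldots, n, \quad l = 1, \ldots, m$$

The left side is entry $(l, j)$ of $T \tilde{A}_\Phi$ and the right side is entry $(l, j)$ of $A_\Phi S$, and therefore,

$$T \tilde{A}_\Phi = A_\Phi S \in \mathbb{R}^{m \times n}$$

such that, since $T$ is invertible (it is a basis change matrix),

$$\tilde{A}_\Phi = T^{-1} A_\Phi S$$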
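
If you'd rather see numbers than index juggling, the result can be checked with a short NumPy sketch. Everything here is an assumption for illustration: the sizes, the random bases (random square matrices are invertible with probability 1), the matrix `M` standing in for $\Phi$ in standard coordinates, and the helper `matrix_of_map`, which is a hypothetical name and not something from the book.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2  # dim V = n, dim W = m (made-up sizes, deliberately different)

# Random bases, one basis vector per column (assumed invertible).
B, B_tilde = rng.normal(size=(n, n)), rng.normal(size=(n, n))
C, C_tilde = rng.normal(size=(m, m)), rng.normal(size=(m, m))

# Phi in standard coordinates: x -> M @ x.
M = rng.normal(size=(m, n))

def matrix_of_map(M, dom_basis, cod_basis):
    """Column j = coordinates of Phi(dom_basis[:, j]) w.r.t. cod_basis."""
    return np.linalg.solve(cod_basis, M @ dom_basis)

A = matrix_of_map(M, B, C)                    # A_Phi
A_tilde = matrix_of_map(M, B_tilde, C_tilde)  # A~_Phi

S = np.linalg.solve(B, B_tilde)  # B~ coordinates -> B coordinates
T = np.linalg.solve(C, C_tilde)  # C~ coordinates -> C coordinates

# The two forms of the result derived above.
assert np.allclose(T @ A_tilde, A @ S)
assert np.allclose(A_tilde, np.linalg.solve(T, A @ S))
```

Using `np.linalg.solve(T, ...)` instead of forming $T^{-1}$ explicitly is just a numerical habit; mathematically it is the same $T^{-1} A_\Phi S$.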

But Terra, does that mean the method in the last blog is wrong?

The simple answer is no. It still holds, since it works :D

Also, the linear mapping in that example is an endomorphism (it maps $V$ to itself), making $V$ and $W$ the same.

What does that mean?

It means the $b_j$ and the $c_k$ are the same vectors (and so are the new ones), making $T$ and $S$ the same matrix. The name $P$ is arbitrary: we can rename it to $T$ and $S$ and still get the same result.

What's clear is that if the mapping isn't an endomorphism, then the formula to use is this one. It keeps the two basis changes separate, so it still works when $A_\Phi$ isn't square ($m \neq n$).
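
Here is a small sketch of that special case, again with made-up random data: when $V = W$ and the same old and new bases are used on both sides, $T$ and $S$ coincide and the formula collapses to $S^{-1} A_\Phi S$. The matrix `M` standing in for $\Phi$ is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3  # V = W = R^3 (made-up size)

# One old basis and one new basis, used for both domain and codomain.
B, B_tilde = rng.normal(size=(n, n)), rng.normal(size=(n, n))
M = rng.normal(size=(n, n))  # Phi in standard coordinates

S = np.linalg.solve(B, B_tilde)
T = S  # same bases on both sides, so the two basis change matrices agree

A = np.linalg.solve(B, M @ B)                    # A_Phi w.r.t. B
A_tilde = np.linalg.solve(B_tilde, M @ B_tilde)  # A~_Phi w.r.t. B~

# General formula with T = S, i.e. A~ = S^{-1} A S.
assert np.allclose(A_tilde, np.linalg.solve(T, A @ S))
```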


Acknowledgement

I can't overstate this: I'm truly grateful for this book being open-sourced for everyone. Many people will be able to learn and understand machine learning on a fundamental level. Whether changing careers, demystifying AI, or just learning in general, this book offers immense value even for a fledgling composer such as myself. So, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong, thank you for this book.

Source:
Axler, S. (2015). Linear Algebra Done Right. Springer.
Deisenroth, M. P., Faisal, A. A., & Ong, C. S. (2020). Mathematics for Machine Learning. Cambridge University Press. https://mml-book.com
