DEV Community

Aadit Kamat


Linear Algebra basics: Part 1

I have been reviewing Linear Algebra concepts to get a better grasp of Machine Learning through a MOOC called "Mathematics for Machine Learning: Linear Algebra", offered by Imperial College London on Coursera. This blog post summarizes some of the basic concepts in Linear Algebra.


What are vectors?


Geometric representation of a vector (source: Math Insight)

Vectors lie at the heart of linear algebra. Geometrically, they are directed line segments represented by two quantities: direction and magnitude. You can represent the magnitude of a vector by the length of the line segment and its direction by the direction in which the arrow points.

How are vectors represented?

Vectors are often represented by a letter with an arrow on top, such as $\vec{v}$, or in bold, as $\mathbf{v}$. The latter representation is much easier to use while typesetting mathematical equations.

The magnitude of a vector, sometimes also referred to as the norm of the vector, is represented as $|\vec{v}|$ or $v$ (no bolding).

Direction of a vector (source: Story of Mathematics)

How is the direction of a vector represented? In terms of the angles it forms with a set of standard bases, or axes as you would know them. In 2 dimensions, these would be the x and y axes. In 3 dimensions, these would be the x, y and z axes. You can extend this idea to any number of dimensions.

What are standard bases?

Standard Bases (source: Wikipedia)

They are vectors of unit length (magnitude 1) that are orthogonal (perpendicular) to each other. In 2 dimensions, the unit vectors along the x and y axes span a plane; in 3 dimensions, the unit vectors along the x, y and z axes span the whole space.

These are called standard bases because they are the ones most often used to describe vectors. Denoting the unit vector along the x axis as $\hat{i}$, along the y axis as $\hat{j}$ and along the z axis as $\hat{k}$, we can describe any 3 dimensional vector $\vec{v}$ as $\vec{v} = a\hat{i} + b\hat{j} + c\hat{k}$, where a, b and c are real numbers.

However, we could use other sets of bases to describe a vector space. These could be of unit length and orthogonal to each other, just like the standard bases, but they don't have to be: any set of vectors can serve as bases as long as they are linearly independent and there are as many of them as the space has dimensions.

What is linear dependence?

If you can describe one of the vectors in the set as a linear combination of the other vectors, then these vectors are linearly dependent.

Consider the case of 2 or 3 vectors in a regular 3 dimensional space. If you can describe one vector as a scalar multiple of another ( $\vec{a} = n\vec{b}$ ), then it points in the same or the opposite direction as that vector, with its magnitude scaled by a factor of |n|. This means that it still lies along the same line as the other vector.

On the other hand, if you can describe one vector as a linear combination of two other vectors ( $\vec{a} = m\vec{b} + n\vec{c}$ ), then it lies in the plane formed by those vectors, because adding scaled copies of $\vec{b}$ and $\vec{c}$ never leaves that plane.
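The two cases above can be checked numerically. This is a minimal Python sketch, assuming 3 dimensional vectors stored as tuples; the helper names `are_collinear` and `are_coplanar` and the tolerance `tol` are illustrative choices, and the checks borrow the dot and cross products described later in this post.

```python
def dot(a, b):
    """Sum of the products of matching components."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3 dimensional vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def are_collinear(a, b, tol=1e-9):
    """True if a and b lie along the same line through the origin."""
    return all(abs(component) <= tol for component in cross(a, b))

def are_coplanar(a, b, c, tol=1e-9):
    """True if a, b and c lie in one plane through the origin."""
    return abs(dot(a, cross(b, c))) <= tol

# (2, 4, 6) is 2 * (1, 2, 3), so the pair is linearly dependent.
print(are_collinear((2, 4, 6), (1, 2, 3)))             # True
print(are_collinear((1, 0, 0), (0, 1, 0)))             # False

# (1, 2, 0) is 1 * (1, 0, 0) + 2 * (0, 1, 0), so it lies in their plane.
print(are_coplanar((1, 2, 0), (1, 0, 0), (0, 1, 0)))   # True
print(are_coplanar((0, 0, 1), (1, 0, 0), (0, 1, 0)))   # False
```

A tolerance is used instead of an exact comparison with zero so that the same checks behave sensibly with floating point components.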

How can you mathematically represent vectors?

You can operate with vectors in mathematics as lists of numbers, formed by writing down the values of the components along the different axes, wrapped in round or square brackets.

For example, $\begin{pmatrix} 3 \\ 4 \\ 5 \end{pmatrix}$ is a 3-dimensional vector that has 3 units of length in the direction of the x axis, 4 in the direction of the y axis and 5 in the direction of the z axis.

To make things more standardized for discussion, instead of axes, most linear algebra texts use the term standard bases, but know that these essentially refer to the same concept.
You can represent these as $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ in the 2 dimensional case and $\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$ in the 3 dimensional case.

Hence, you can represent any vector $\vec{v} = a\hat{i} + b\hat{j} + c\hat{k}$ as $\begin{pmatrix} a \\ b \\ c \end{pmatrix}$, which is actually $a \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + b \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + c \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$.
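To make the "components times bases" idea concrete, here is a small Python sketch that builds a vector from its components and computes its magnitude. The names `STANDARD_BASES`, `from_components` and `magnitude` are made up for this illustration, and the non-standard set of bases at the end is an arbitrary example.

```python
import math

# Standard bases in 3 dimensions: i-hat, j-hat and k-hat.
STANDARD_BASES = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def from_components(components, bases=STANDARD_BASES):
    """Build a*e1 + b*e2 + c*e3 from the components (a, b, c)."""
    dimension = len(bases[0])
    return tuple(
        sum(c * e[k] for c, e in zip(components, bases))
        for k in range(dimension)
    )

def magnitude(v):
    """Length of v: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))

v = from_components((3, 4, 5))
print(v)             # (3, 4, 5)
print(magnitude(v))  # sqrt(50), about 7.07

# The same components read against a different, linearly independent set of bases
# describe a different arrow in space.
other_bases = [(1, 1, 0), (0, 1, 0), (0, 0, 2)]
print(from_components((3, 4, 5), other_bases))  # (3, 7, 10)
```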

Is there an alternate way to represent vectors?

You can also represent them in terms of r, the magnitude of the vector, and the angles it forms with the axes. In the 2 dimensional case, for example, you can do so with variables r and $\theta$, while for the 3 dimensional case, you can use r, $\theta$ and $\alpha$.

Using the bracket notation, you can write these as $\begin{pmatrix} r \\ \theta \end{pmatrix}$ and $\begin{pmatrix} r \\ \theta \\ \alpha \end{pmatrix}$ respectively.

Polar Coordinates (source: Wikipedia)

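The coordinates r and $\theta$ of the point that lies at the end of a 2 dimensional vector are known as polar coordinates; their 3 dimensional counterpart is known as spherical coordinates. By keeping r fixed and varying the angles, you can describe objects like circles and spheres in the 2 dimensional and 3 dimensional cases respectively.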
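For the 2 dimensional case, converting between the two representations could look like the sketch below. It assumes the angle is measured in radians, counterclockwise from the x axis; the function names `to_polar` and `to_cartesian` are illustrative.

```python
import math

def to_polar(x, y):
    """Return (r, theta) for the vector with components (x, y)."""
    r = math.hypot(x, y)       # magnitude of the vector
    theta = math.atan2(y, x)   # angle with the x axis, in radians
    return r, theta

def to_cartesian(r, theta):
    """Return the (x, y) components for magnitude r and angle theta."""
    return r * math.cos(theta), r * math.sin(theta)

r, theta = to_polar(3, 4)
print(r)                    # 5.0
print(math.degrees(theta))  # about 53.13

# Keeping r fixed at 1 and varying only the angle traces out a circle.
circle = [to_cartesian(1, 2 * math.pi * k / 8) for k in range(8)]
```

Every point in `circle` sits at distance 1 from the origin, which is the "vary the angle" idea from the paragraph above.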

How can vectors be used in the real world?

When you store numerical data in tables in terms of rows and columns, you can think of each of these rows/columns as a vector. For example, in a table consisting of the heights and weights of different people, each person's height and weight is a row vector whereas the list of heights or the list of weights is a column vector.
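As a quick illustration, here is how such a table might be held in plain Python. The numbers are made-up sample data, not taken from any real dataset.

```python
# Hypothetical table: one row per person, columns are (height in cm, weight in kg).
table = [
    (170, 65),
    (182, 80),
    (158, 52),
]

# A row vector: one person's height and weight.
second_person = table[1]

# Column vectors: every height, and every weight.
heights = tuple(row[0] for row in table)
weights = tuple(row[1] for row in table)

print(second_person)  # (182, 80)
print(heights)        # (170, 182, 158)
print(weights)        # (65, 80, 52)
```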

What are some operations you can do on vectors?

You can add two vectors following the parallelogram law or the triangle law. Under the triangle law, you place the tail of the second vector on the head of the first; the sum is the vector from the tail of the first to the head of the second. You can extend this to the general case of adding n vectors head to tail, which traces out the sides of a polygon.

You can scale a vector by a factor of n. This essentially means that you multiply its magnitude by |n|. If n is positive, the vector keeps its direction. If n is zero, the vector is reduced to a point. If n is negative, the vector is flipped to point in the opposite direction.
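In component form, both operations work one component at a time. A minimal sketch, assuming vectors are stored as tuples of equal length (the helper names `add` and `scale` are illustrative):

```python
def add(*vectors):
    """Add any number of vectors component by component."""
    return tuple(sum(components) for components in zip(*vectors))

def scale(n, v):
    """Multiply every component of v by the scalar n."""
    return tuple(n * x for x in v)

a = (1, 2)
b = (3, 1)

print(add(a, b))     # (4, 3)
print(scale(2, a))   # (2, 4)   same direction, twice as long
print(scale(0, a))   # (0, 0)   collapsed to a point
print(scale(-1, a))  # (-1, -2) same length, opposite direction

# Walking around the four sides of a square head to tail brings you back
# to the start, so the sum is the zero vector.
print(add((1, 0), (0, 1), (-1, 0), (0, -1)))  # (0, 0)
```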

Dot Product (source: Math Insight)

You can take the dot product of two vectors, $\vec{a} \cdot \vec{b}$, which is a scalar that is:

  1. The projection of vector a in the direction of b multiplied by the magnitude of b or
  2. The projection of b in the direction of a multiplied by the magnitude of a.

Mathematically, it works out to be $\vec{a} \cdot \vec{b} = |\vec{a}||\vec{b}| \cos(\theta)$, where $\theta$ is the angle between the two vectors.

Vector Projections(source:

The projection can be thought of as the shadow of one vector onto the other when light rays fall on it perpendicularly.
The special cases are when:

  1. two vectors are orthogonal (the dot product is zero, as $\theta = 90^\circ$ and $\cos(\theta) = 0$),
  2. two vectors lie in the same direction (the dot product is ab, as $\theta = 0^\circ$ and $\cos(\theta) = 1$) and
  3. two vectors lie in opposite directions (the dot product is -ab, as $\theta = 180^\circ$ and $\cos(\theta) = -1$).
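The three special cases can be verified with a few lines of Python. This sketch computes the dot product from components (sum of the products of matching components), which gives the same number as the formula with the cosine; the helper names are illustrative.

```python
import math

def dot(a, b):
    """Sum of the products of matching components."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(v):
    return math.sqrt(dot(v, v))

def angle_between(a, b):
    """Angle in radians, recovered from a . b = |a| |b| cos(theta)."""
    cos_theta = dot(a, b) / (magnitude(a) * magnitude(b))
    # Clamp to [-1, 1] to guard against tiny rounding errors.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

def scalar_projection(a, b):
    """Length of the shadow of a along the direction of b."""
    return dot(a, b) / magnitude(b)

a = (3, 0)
print(dot(a, (0, 2)))   # 0   orthogonal
print(dot(a, (2, 0)))   # 6   same direction: 3 * 2
print(dot(a, (-2, 0)))  # -6  opposite direction: -(3 * 2)

print(scalar_projection((3, 4), (1, 0)))  # 3.0
```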

Right hand rule cross product (source: Wikipedia)
Corkscrew Rule (source: Connecticut State University)

Another operation on vectors results in a vector instead of a scalar (number). This is called the cross product. You can think about it in terms of the right hand rule or the corkscrew rule: the resulting third vector is orthogonal to both of the original vectors, and its direction is given by rotating the first vector towards the second. The magnitude of the cross product is the area of the parallelogram formed by the two vectors, usually computed using the determinant of a matrix formed from these vectors, which we will talk about in a later article.
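Until determinants are covered, the cross product of two 3 dimensional vectors can be written out component by component. The sketch below uses that standard component formula; the example vectors are arbitrary and chosen so the area is easy to check by hand.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3 dimensional vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

a = (2, 0, 0)
b = (0, 3, 0)
c = cross(a, b)

print(c)                        # (0, 0, 6)
print(dot(c, a), dot(c, b))     # 0 0  (orthogonal to both inputs)
print(math.sqrt(dot(c, c)))     # 6.0  (area of the 2 by 3 rectangle)
print(cross(b, a))              # (0, 0, -6)  swapping the order flips the result
```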


What is a matrix?

Matrix Dimensions (source:

In high school, matrices are generally introduced as a collection of vectors, either row or column vectors. These are essentially similar to tables in that way.

Augmented Matrix for a System of Linear Equations (source: Khan Academy)

They come up when you are trying to solve a system of linear equations, to represent the coefficients that appear within the equations.

But that's only part of the picture. Again, there is a very neat way to think about matrices geometrically just as you can think about vectors geometrically. That involves visualizing them in terms of linear transformations.

What are linear transformations?

Linear transformations represent transformations (changes) of vectors and vector spaces. Remember when we said that we don't have to use just the standard bases to describe a vector space? That's where linear transformations come in handy: you can transform one set of bases into another through such transformations. They are linear in nature because straight lines stay straight and the origin stays in place, so you never get curves. The transformed vector space can be thought of as a grid whose cells are formed by parallel, evenly spaced, intersecting lines.

What are some common types of linear transformations?

Types of Linear Transformations (source: Mathigon)

The common types of linear transformations are:

  1. Rotation: You rotate both the bases about the origin, which means that you keep the relative angles and the sizes of the vectors fixed.

  2. Scaling: You stretch the bases in the same directions, changing their sizes but keeping the relative angles fixed.

  3. Shearing: You keep one of the bases fixed and tilt the other by sliding its tip parallel to the fixed one.

  4. Reflection: You flip both the bases across a line through the origin, rather than rotating them about a point such as the origin.

There are some special types of linear transformations such as the identity transformation, which does nothing, and the zero transformation. You can understand them better in terms of matrix multiplication.

How can you visualize matrix-vector multiplication?

Transformation of bases(source: Math Insight)

You can think of multiplying a vector by a matrix as applying a linear transformation to that vector. As we said, a vector is usually described in terms of standard bases. The column vectors of the matrix tell you where those standard bases land after the transformation, so the new vector is the same combination of components, now applied to the column vectors of the matrix instead of the standard bases.

Matrix vector multiplication

The identity transformation that we spoke about earlier is represented by the identity matrix:

Identity Matrix: $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ (in the 2 dimensional case)

Now it makes sense why you get the same vector when you multiply a vector by the identity matrix: you are not transforming the set of bases and hence the vector space. When you apply the zero transformation, represented by the zero matrix, you essentially collapse everything to a point.
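Here is a small Python sketch of matrix-vector multiplication, assuming a matrix is stored as a list of rows. The matrix `M` and vector `v` are arbitrary examples, and `mat_vec` is an illustrative helper name.

```python
def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the column vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

M = [[2, 1],
     [0, 3]]
v = (4, 5)

# Row by row: (2*4 + 1*5, 0*4 + 3*5)
print(mat_vec(M, v))  # (13, 15)

# Same result as 4 * (first column) + 5 * (second column),
# that is, the old components applied to the new bases.
first_column, second_column = zip(*M)
combination = tuple(4 * p + 5 * q for p, q in zip(first_column, second_column))
print(combination)    # (13, 15)

IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
print(mat_vec(IDENTITY, v))  # (4, 5)  nothing changes
print(mat_vec(ZERO, v))      # (0, 0)  everything collapses to a point
```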

In the 2 dimensional case, you can write down the matrices corresponding to the common types of linear transformations as follows:

  1. Rotation:

  2. Scaling:

  3. Shearing:

  4. Reflection:
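One common way to write these four matrices in code is sketched below. The particular choices here (counterclockwise rotation by an angle in radians, independent scale factors for each axis, a horizontal shear, and a reflection across the x axis) are assumptions made for illustration; other conventions are equally valid.

```python
import math

def rotation(theta):
    """Counterclockwise rotation about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]

def scaling(sx, sy):
    """Stretch by sx along the x axis and by sy along the y axis."""
    return [[sx, 0],
            [0, sy]]

def shear_x(k):
    """Horizontal shear: (1, 0) stays fixed, (0, 1) is tilted to (k, 1)."""
    return [[1, k],
            [0, 1]]

# Reflection across the x axis: y components change sign.
REFLECT_ACROSS_X_AXIS = [[1, 0],
                         [0, -1]]

def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the column vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

print(mat_vec(rotation(math.pi / 2), (1, 0)))  # approximately (0, 1)
print(mat_vec(scaling(2, 3), (1, 1)))          # (2, 3)
print(mat_vec(shear_x(1), (0, 1)))             # (1, 1)
print(mat_vec(REFLECT_ACROSS_X_AXIS, (1, 1)))  # (1, -1)
```

The rotation result is only approximately (0, 1) because `math.cos(math.pi / 2)` is a very small floating point number rather than exactly zero.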

You can play around with these transformations and visualize them, using this wonderful tool called the transformation matrix playground.

How can you now think about a system of linear equations?

You can think of a system of linear equations in this way: can you find a vector in the standard coordinate system (the vector space described by the standard bases) that lands on the known right-hand-side vector once you apply the linear transformation represented by the matrix of coefficients?
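For a system of two equations in two unknowns, this question can be answered directly. The sketch below uses the closed-form solution for a 2 by 2 matrix, which relies on the determinant (a topic this post defers to a later article), so treat it as a preview rather than a general method. The system 2x + y = 5, x + 3y = 10 is a made-up example.

```python
def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the column vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

def solve_2x2(A, b):
    """Find the vector that the 2 by 2 matrix A transforms into b."""
    (p, q), (r, s) = A
    det = p * s - q * r
    if det == 0:
        # A squashes the plane onto a line or a point, so there is
        # either no solution or infinitely many.
        raise ValueError("no unique solution")
    x = (s * b[0] - q * b[1]) / det
    y = (p * b[1] - r * b[0]) / det
    return (x, y)

# 2x +  y = 5
#  x + 3y = 10
A = [[2, 1],
     [1, 3]]
b = (5, 10)

solution = solve_2x2(A, b)
print(solution)              # (1.0, 3.0)
print(mat_vec(A, solution))  # (5.0, 10.0), the right-hand side again
```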

This is how we go from numbers to vector spaces - algebra to geometry. Since you aren't dealing with variables raised to any power other than one, you are only concerned with lines and their segments. This is why the field is called linear algebra in the first place.

How can you visualize matrix-matrix multiplication?

What happens when you want to apply successive linear transformations? If you guessed that the equivalent was multiplying more than one matrix, then you are right. For example, when you want to rotate a vector (and its associated vector space) and then shear it, you first multiply the vector by the matrix representing the rotation and then multiply the result by the matrix representing the shear. With column vectors this means the rotation matrix sits closest to the vector, so the combined matrix is the shear matrix times the rotation matrix.

Matrix multiplication is not commutative, since the order of the transformations matters. For example, rotating and then shearing is not the same as shearing and then rotating.


Matrix multiplication is associative, however, since you can group the transformations in different ways, so long as they are applied in the same order.

$A(BC) = (AB)C$
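Both properties are easy to check numerically. In this sketch the matrices are stored as lists of rows; `ROTATE_90`, `SHEAR` and `SCALE` are arbitrary example transformations chosen to have integer entries so the comparisons are exact.

```python
def mat_mul(A, B):
    """Matrix product AB: apply B first, then A."""
    return [
        [sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
        for row in A
    ]

def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the column vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

ROTATE_90 = [[0, -1], [1, 0]]  # counterclockwise quarter turn
SHEAR = [[1, 1], [0, 1]]       # horizontal shear
SCALE = [[2, 0], [0, 3]]       # stretch x by 2 and y by 3

# The transformation applied first sits on the right.
rotate_then_shear = mat_mul(SHEAR, ROTATE_90)
shear_then_rotate = mat_mul(ROTATE_90, SHEAR)

print(rotate_then_shear)  # [[1, -1], [1, 0]]
print(shear_then_rotate)  # [[0, -1], [1, 1]]

# One combined matrix does the same job as two successive steps.
v = (1, 0)
print(mat_vec(rotate_then_shear, v))              # (1, 1)
print(mat_vec(SHEAR, mat_vec(ROTATE_90, v)))      # (1, 1)

# Grouping does not matter as long as the order stays the same.
left = mat_mul(SCALE, mat_mul(SHEAR, ROTATE_90))
right = mat_mul(mat_mul(SCALE, SHEAR), ROTATE_90)
print(left == right)  # True
```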
