Fastai Chapter 4 - The important parts, part 1: Tensors

Chapter four, "MNIST_BASICS", or "Lesson 4" of the online course, is the most important and apparently most difficult chapter of the fastai course.

In this series of posts I will cover the most important concepts in depth, and provide a few more examples of each to make them stick. I will include various practice questions throughout; the answers are available at the end of the post. I strongly recommend attempting the questions before looking at the answers.

The book is available online here
The course is accessible here

Full sequence contents:

  1. Tensors and tensor operations
  2. Building and training a simple regression model
  3. Substituting pytorch/fastai components
  4. Building and training a nonlinear model
  5. Answers to practice questions

1. Tensors and tensor operations

Overview

Crucial to all deep learning is the concept of a tensor. Succinctly, a tensor is a generalisation of concepts that we should already be familiar with: vectors and matrices.

  • A zero-dimensional (or "Rank 0") tensor is a single number: a scalar
  • A one-dimensional (or "Rank 1") tensor is equivalent to a vector: a list of numbers
  • A two-dimensional (or "Rank 2") tensor is equivalent to a 2D matrix of numbers
  • A three-dimensional (or "Rank 3") tensor is equivalent to a "cube" of numbers, or stacked matrices
  • and so on...

Nothing else to it, for now
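
To make this concrete, here is a tiny sketch of my own showing how rank shows up in code (ndim and shape are standard pytorch tensor attributes; we set pytorch up properly in the next section):

import torch

x0 = torch.tensor(3.14)              # rank 0: a single number (scalar)
x1 = torch.tensor([1, 2, 3])         # rank 1: a vector
x2 = torch.tensor([[1, 2], [3, 4]])  # rank 2: a matrix

x0.ndim, x1.ndim, x2.ndim            # (0, 1, 2)
x2.shape                             # torch.Size([2, 2])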

Rationale

As you will see / have seen, neural networks fundamentally comprise large matrices of numbers that have been expertly picked by the computer. Tensors are simply the way in which we perform operations on, store, and query those numbers. The data we feed neural networks must also be represented in tables of numbers. Learning about tensors and tensor operations is crucial for manipulating the data that is fed into the network, as well as investigating what the network itself is doing.

Why use pytorch tensors over numpy arrays and good olde python lists? I quote the author:

The vast majority of methods and operators supported by NumPy on these structures are also supported by PyTorch, but PyTorch tensors have additional capabilities. One major capability is that these structures can live on the GPU, in which case their computation will be optimized for the GPU and can run much faster (given lots of values to work on). In addition, PyTorch can automatically calculate derivatives of these operations, including combinations of operations. As you'll see, it would be impossible to do deep learning in practice without this capability.

Operations

Here is a useful video on basic tensor operations that covers more than you need to know

Initialising a tensor

First ensure pytorch is installed and imported, then make your first tensor:

import torch
torch.tensor(1)

Jeremy has aliased torch.tensor to just tensor, and you can do the same with this line:

tensor = torch.tensor

You can turn python lists and numpy arrays into tensors:

tensor([1,2,3])

tensor([[1,2,3],[1,2,3]])

import numpy as np

tensor(np.array([1,2,3]))

Other ways of creating a tensor, not all of which you need to commit to memory, are (a few are shown in action after the list):

  • torch.rand(2,3): Creates a rank 2 tensor of random numbers drawn uniformly from [0, 1)
  • torch.randn(2,3): Creates a rank 2 tensor of random numbers (positive and negative) drawn from a standard normal distribution
  • torch.empty(4,4): Creates an uninitialised 4x4 rank 2 tensor
  • torch.zeros(2,3,4): rank three ("cube") tensor of zeros
  • torch.ones(...): self-explanatory
  • torch.eye(3,3): Creates an identity matrix of dimensions 3x3 (get it? "eye-dentity")
  • torch.arange(4): Creates a rank one tensor of the numbers 0 to 3, i.e. tensor([0, 1, 2, 3])
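
A few of these in action, with the results they produce (a quick sketch of my own, not from the chapter):

torch.rand(2, 3)      # 2x3 tensor of random numbers, uniform in [0, 1)
torch.randn(2, 3)     # 2x3 tensor of random numbers from a standard normal distribution
torch.zeros(2, 3, 4)  # rank 3 "cube" of zeros, shape (2, 3, 4)
torch.eye(3)          # 3x3 identity matrix
torch.arange(4)       # tensor([0, 1, 2, 3])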

Parameters

  • dtype: You can specify data types implicitly or explicitly in the tensor definition.
  • device: specify the device that the tensor is saved to. Try "cuda" - if it is available, your tensor will be saved to the graphics card.
  • requires_grad: to be explained below.

tensor([1.]) # dtype inferred implicitly (float32, from the decimal point)

tensor([1,2,3], dtype = torch.float32, device="cuda")
# Personally, I get an error here, since no CUDA device is available on my machine
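
If, like me, you don't have a CUDA-capable GPU, a common pattern (my own sketch, not from the chapter) is to check for one and fall back to the CPU:

device = "cuda" if torch.cuda.is_available() else "cpu"
t = torch.tensor([1, 2, 3], dtype=torch.float32, device=device)
t.device  # device(type='cuda', index=0) if a GPU was found, otherwise device(type='cpu')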

Transforming tensor type

You can call tensor methods to change their type:

x = tensor([0,1,2,3])
x.bool()   # bool : tensor([False, True, True, True])
x.float()  # float32 : tensor([0., 1., 2., 3.])
x.short()  # int16
x.long()   # int64
x.half()   # float16
x.double() # float64


# Typically does what you would expect:
x = tensor([False, True])
x.float() # tensor([0.0, 1.0])


Transforming tensor shape

We will often need to re-shape tensors to make them fit what we're trying to jam them into. In this chapter, we are introduced to these reshaping methods:

  • torch.stack(): Takes a sequence (list, tuple, whatever) of tensors of the same size and concatenates them "along a new dimension". Imagine having a "stack" of 2D tensors, all of the same dimensions, then combining them to form a "cube" of numbers. The output tensor is one rank higher than the inputs.
x1 = tensor([1,2])
x2 = tensor([3,4])
x3 = tensor([5,6])

torch.stack([x1,x2,x3])

There is an optional parameter for dimension, dim, which effectively asks which direction you want the stacking to take place in. Consider the scenario of stacking two rank 2 tensors: you can stack them in one of three ways (dim=0, 1 or 2), as shown below:

[Image: stacking two tensors along different dimensions]
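
To see what dim does in practice, here is a small sketch of my own, stacking two rank 2 tensors and looking only at the resulting shapes:

a = torch.zeros(4, 3)
b = torch.ones(4, 3)

torch.stack([a, b], dim=0).shape  # torch.Size([2, 4, 3]) - new dimension first
torch.stack([a, b], dim=1).shape  # torch.Size([4, 2, 3]) - new dimension in the middle
torch.stack([a, b], dim=2).shape  # torch.Size([4, 3, 2]) - new dimension last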

  • torch.cat(): Unlike stack, cat concatenates the tensors along an existing (specified) dimension. The output tensor has the same rank as the inputs.
x1 = torch.zeros((2,3))
x2 = torch.ones((2,3))

torch.cat((x1, x2), dim=0) # try dim=1 too (dim=2 would error here, since the inputs are only rank 2)
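
To see the difference from stack at a glance, compare the shapes (continuing with the same x1 and x2; stack is shown for contrast):

torch.cat((x1, x2), dim=0).shape    # torch.Size([4, 3]) - still rank 2
torch.cat((x1, x2), dim=1).shape    # torch.Size([2, 6]) - still rank 2
torch.stack((x1, x2), dim=0).shape  # torch.Size([2, 2, 3]) - one rank higher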

For your information, the following transformations are all considered "views" on an unchanged tensor. When you apply one of the following methods, no new memory is allocated, and the original tensor is simply referenced in a new way. Read more here: https://pytorch.org/docs/stable/tensor_view.html

  • tensor().view(): Returns a new tensor with the same data but a different shape. Passing -1 for one of the dimensions lets pytorch infer that dimension from the others. The dimensions entered must be valid, in that the tensor's data must fit into them exactly.
  • In the chapter, we use view to flatten each 28x28 rank 2 tensor into a rank 1 tensor of length 784, to then be fed into the neural network. The network would not accept a rank 2 tensor, so we flatten the data out (see the sketch after the code below).
x = torch.randn(4, 4)
x.size()
# torch.Size([4, 4])

y = x.view(16)
y.size()
# torch.Size([16])

z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
z.size()
# torch.Size([2, 8])

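
And a rough sketch of how this is used in the chapter (the tensor here is random stand-in data; only the shapes matter): flattening a stack of 28x28 images into rows of 784.

imgs = torch.rand(1010, 28, 28)  # a stack of 28x28 "images"
flat = imgs.view(-1, 28*28)      # -1 lets pytorch infer the first dimension
flat.shape                       # torch.Size([1010, 784])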
  • tensor().squeeze(): Removes dimensions of size one. Helpful for eliminating "excess" dimensions: it changes a tensor of shape [n, 1] into shape [n]. The optional parameter dim will only squeeze the tensor in that dimension, and ignore other ways it could be squeezed (shown after the example below).
x1 = torch.zeros((2,1,3))

x1 
# tensor([[[0., 0., 0.]],   # Note that it is 2D data, trapped
#         [[0., 0., 0.]]])  # in a 3D tensor

x1.squeeze()
# tensor([[0., 0., 0.],   
#         [0., 0., 0.]])
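
The optional dim parameter in action (my own small addition, continuing with the same x1 of shape (2, 1, 3)):

x1.squeeze(dim=1).shape  # torch.Size([2, 3]) - the size-1 dimension is removed
x1.squeeze(dim=0).shape  # torch.Size([2, 1, 3]) - dim 0 has size 2, so nothing happens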
  • tensor().unsqueeze(): Does the opposite of squeeze. Adds an extra dimension of size one to the data. Useful for when you need to perform matrix multiplication and need extra dimensions to make it work. The dim argument (which is required) decides where the new dimension is inserted.
  • In the chapter, we unsqueeze the tensor that represents the image category (train_y) to make sure it has the same dimensionality as train_x (sketched after the code below). However, if you don't unsqueeze, the rest of the notebook runs without issue, so I am not entirely sure why he did this, other than good practice...
x1 = torch.zeros((2,1,3))

x1.unsqueeze(0)   # shape becomes (1, 2, 1, 3)
x1.unsqueeze(-1)  # shape becomes (2, 1, 3, 1)
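
A sketch of the train_y case described above (the label values are made up; the point is the shape):

train_y = tensor([1, 0, 1, 0])  # rank 1, shape [4]
train_y.unsqueeze(1).shape      # torch.Size([4, 1]) - one label per row, matching train_x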

Tensor operations

Finally we get to mathematical operations involving tensors.

Standard operations act the same way as they do in linear algebra, where adding, subtracting, multiplying or dividing by a scalar is applied to every element of the tensor:

x = tensor([1,2,3])

x * 2   # tensor([2,4,6])
x / 2
x + 3
# etc

Crucially, matrix multiplication of tensors works using the standard python @ symbol.

This will be instrumental to building our neural network. You need not know more than the fact that the following calculations work, and what matrix multiplication is on a conceptual level (3blue1brown has an insightful series of videos on linear algebra)

x = tensor([[1,2],[1,2]]) 
y = tensor([[3,4],[3,4]])

x@y
# tensor([[ 9, 12],
#         [ 9, 12]])

y@x
# tensor([[ 7, 14],
#         [ 7, 14]])
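
To connect this to what comes later: a single linear layer of a neural network is just a matrix multiplication plus a bias. Here is a minimal sketch of my own (the names and sizes are illustrative, not the chapter's exact code):

batch = torch.rand(4, 784)      # 4 flattened 28x28 images
weights = torch.randn(784, 1)   # one weight per pixel
bias = torch.randn(1)

preds = batch @ weights + bias  # shape [4, 1]: one prediction per image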

Other operations you will come across are self-explanatory but worth listing:

  • mean(): Returns a rank 0 tensor containing the mean of all the elements. Passing a tuple of dimensions as a parameter constrains the mean to just those dimensions.
  • In the chapter, we use this to compare a stack of 1010 validation images, all contained within a rank 3 tensor of shape 1010x28x28, with an "idealised" image in a rank 2 tensor of shape 28x28. We wish to get back a rank 1 tensor that has one mean value per image, so we define the following function.
  • (-1, -2) specifies that we want to take the mean over the last two dimensions only (the 28x28 pixel dimensions), keeping the other dimension (i.e. the 1010) intact.
def mnist_distance(a,b): 
  return (a-b).abs().mean((-1,-2)) 

  • abs(): Self-explanatory
  • sqrt(): Self-explanatory
  • item(): returns the underlying python number from a tensor. Only works on tensors containing a single element, e.g. a rank 0 tensor.
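
To see what the dimension tuple given to mean() does, here is a toy example of my own (with item() at the end, for good measure):

x = torch.ones(5, 2, 2)  # pretend: 5 tiny 2x2 "images"
x.mean()                 # tensor(1.) - rank 0, the mean of everything
x.mean((-1, -2)).shape   # torch.Size([5]) - one mean per "image"
x.mean().item()          # 1.0 - a plain python float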

Tensor broadcasting

One characteristic of pytorch tensors that is used frequently throughout the chapter is broadcasting. For many tensor operations, if two tensors of different shapes are combined in some way, and one has fewer (or smaller) dimensions than the other, pytorch will try to sensibly expand ("broadcast") the smaller tensor so that it plays nicely with the larger one.

A simple example:

tensor([1,2]) * tensor([4])
# tensor([4,8])

An instance of this in the chapter is when we compare a single rank 2 tensor of shape 28x28 with a rank 3 tensor of shape 1010x28x28. We subtract one from the other, and no error occurs, because pytorch interprets what we want and virtually 'extends' the smaller tensor to play nicely with the larger one:

# Function described above
def mnist_distance(a,b): 
  return (a-b).abs().mean((-1,-2)) 

valid_3_dist = mnist_distance(valid_3_tens, mean3)
# a rank 1 tensor of 1010 distances: one per validation image
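
You can convince yourself of what happened by checking the shapes (the sizes below are as described in the chapter, assuming its notebook has been run):

valid_3_tens.shape            # torch.Size([1010, 28, 28])
mean3.shape                   # torch.Size([28, 28])

(valid_3_tens - mean3).shape  # torch.Size([1010, 28, 28]) - mean3 is broadcast across all 1010 images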

Conclusion

Familiarity with tensors to the depth covered in this article is enough to have a good time reading the chapter.

Further information on tensors can be found in this in-depth video from pytorch:

https://www.youtube.com/watch?v=r7QDUPb2dCM
