NumPy is a Python library that is mainly used to work with arrays. An array is a collection of items that are stored next to each other in memory. For now, just think of them like Python lists.
NumPy is written in Python and C. The calculations in NumPy are done by the parts that are written in C, which makes them extremely fast compared to normal Python code.
Installation
Make sure Python and pip are installed on your computer. Then open a command prompt or terminal and run:
pip install numpy
Creating Arrays
We can create a NumPy array by using the numpy module's array() function.
import numpy as np
arr = np.array([3, 5, 7, 9])
print(type(arr))
Output:
<class 'numpy.ndarray'>
We just created a NumPy array from a list. The type of our arr variable is numpy.ndarray. Here ndarray stands for N-dimensional array.
Dimensions or Axes
In NumPy, dimensions are called axes (plural of axis). I like to think of an axis as a line along which items can be stored. A simple list or a 1-dimensional array can be visualized as a single row of items laid out along one axis.
We will now look at the following:
- Scalars (0D Arrays)
- Vectors (1D Arrays)
- Matrices (2D Arrays)
- 3D Arrays
- 4D Arrays
1) Scalars (0D Arrays)
A scalar is just a single value.
import numpy as np
s = np.array(21)
print("Number of axes:", s.ndim)
print("Shape:", s.shape)
Output:
Number of axes: 0
Shape: ()
Here we have used 2 properties of a NumPy array:
- ndim: It returns the number of dimensions (or axes) in an array. It returns 0 here because a value in itself does not have any dimensions.
- shape: It returns a tuple that contains the number of values along each axis of an array. Since a scalar has 0 axes, it returns an empty tuple.
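To see how the two properties relate, here is a minimal sketch. The arrays in it are arbitrary examples of my own, not taken from the rest of the post. It shows that ndim is always the length of the shape tuple.

```python
import numpy as np

# A few arrays with different numbers of axes (values are arbitrary).
examples = {
    "scalar": np.array(21),
    "vector": np.array([1, 2, 3]),
    "matrix": np.array([[1, 2, 3], [4, 5, 6]]),
}

for name, a in examples.items():
    # ndim is the number of axes, which equals the length of the shape tuple.
    print(name, "-> ndim:", a.ndim, "shape:", a.shape, "len(shape):", len(a.shape))
```

For the scalar, the shape tuple is empty, so its length (and ndim) is 0.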
2) Vectors (1D Arrays)
A vector is a collection of values.
import numpy as np
vec = np.array([-1, 2, 7, 9, 2])
print("Number of axes:", vec.ndim)
print("Shape:", vec.shape)
Output:
Number of axes: 1
Shape: (5,)
vec.shape[0] gives us the number of values in our vector, which is 5 here.
3) Matrices (2D Arrays)
A matrix is a collection of vectors.
import numpy as np
mat = np.array([
[1, 2, 3],
[5, 6, 7]
])
print("Number of axes:", mat.ndim)
print("Shape:", mat.shape)
Output:
Number of axes: 2
Shape: (2, 3)
Here we created a 2x3 matrix (2D array) using a list of lists. Since a matrix has 2 axes, the mat.shape tuple contains two values: the first value is the number of rows and the second value is the number of columns.
Each item (row) in a 2D array is a vector (1D array).
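A quick way to check this yourself is to loop over the matrix. The sketch below reuses the mat array from above; the names rows and cols are just illustrative.

```python
import numpy as np

mat = np.array([
    [1, 2, 3],
    [5, 6, 7]
])

# Unpack the shape tuple: first the number of rows, then the number of columns.
rows, cols = mat.shape
print("Rows:", rows, "Columns:", cols)

# Iterating over a 2D array yields its rows, and each row is a 1D array.
for row in mat:
    print(row, "-> axes:", row.ndim, "shape:", row.shape)
```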
4) 3D Arrays
A 3D array is a collection of matrices.
import numpy as np
t = np.array([
[[1, 3, 9],
[7, -6, 2]],
[[2, 3, 5],
[0, -2, -2]],
[[9, 6, 2],
[-7, -3, -12]],
[[2, 4, 5],
[-1, 9, 8]]
])
print("Number of axes:", t.ndim)
print("Shape:", t.shape)
Output:
Number of axes: 3
Shape: (4, 2, 3)
Here we created a 3D array by using a list of 4 lists, each of which contains 2 lists of 3 values.
Each item in a 3D array is a matrix (2D array). Note that the last matrix in the array is the front-most in the image.
5) 4D Arrays
After looking at the above examples, we see a pattern here. An n-dimensional array is a collection of n-1 dimensional arrays, for n > 0.
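Here is a minimal sketch of that pattern with a 4D array. The array arr4 and its shape are made up for illustration, and I am using np.arange() with reshape() only as a shortcut to avoid typing a deeply nested list.

```python
import numpy as np

# Hypothetical 4D array: 2 items, each holding 3 matrices of 2 rows x 4 columns.
# np.arange(48) produces the values 0..47; reshape arranges them along 4 axes.
arr4 = np.arange(48).reshape(2, 3, 2, 4)

print("Number of axes:", arr4.ndim)
print("Shape:", arr4.shape)

# Taking one item at a time drops one axis at each step.
print(arr4[0].shape)        # one item of the 4D array is a 3D array
print(arr4[0][0].shape)     # one item of that 3D array is a matrix
print(arr4[0][0][0].shape)  # one item of that matrix is a vector
```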
I hope that now you have a better idea of visualizing multidimensional arrays.
Accessing Array Elements
Just like Python lists, the indexes in NumPy arrays start at 0.
import numpy as np
vec = np.array([-3, 4, 6, 9, 8, 3])
print("vec - 4th value:", vec[3])
vec[3] = 19
print("vec - 4th value (changed):", vec[3])
mat = np.array([
[2, 4, 6, 8],
[10, 12, 14, 16]
])
print("mat - 1st row:", mat[0])
print("mat - 2nd row's 1st value:", mat[1, 0])
print("mat - last row's last value:", mat[-1, -1])
Output:
vec - 4th value: 9
vec - 4th value (changed): 19
mat - 1st row: [2 4 6 8]
mat - 2nd row's 1st value: 10
mat - last row's last value: 16
NumPy arrays also support slicing.
# continuing the above code
print("vec - 2nd to 4th:", vec[1:4])
print("mat - 1st row's 1st to 3rd values:", mat[0, 0:3])
print("mat - 2nd column:", mat[:, 1])
Output:
vec - 2nd to 4th: [4 6 9]
mat - 1st row's 1st to 3rd values: [2 4 6]
mat - 2nd column: [ 4 12]
In the last example, [:, 1] means "get the 2nd value from all rows". Hence, we get the 2nd column of the matrix as the output.
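If the column slice feels like magic, it may help to compare it with doing the same thing by hand. This is a small sketch using the same mat as above; second_col and by_hand are names I picked for the example.

```python
import numpy as np

mat = np.array([
    [2, 4, 6, 8],
    [10, 12, 14, 16]
])

# mat[:, 1] takes index 1 from every row...
second_col = mat[:, 1]
# ...which gives the same values as collecting row[1] from each row by hand.
by_hand = np.array([row[1] for row in mat])
print(second_col, by_hand)

# Slices can be used on both axes at once: all rows, 2nd and 3rd columns.
print(mat[:, 1:3])
```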
Example: Indexing in a 4D Array
Let's say we want to access the circled value. It is located in the 3rd 3D array's last matrix's 2nd row's 2nd column. That is a lot to take in, so take your time. Here's how to access it:
arr[2, -1, 1, 1]
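To try this out without the picture, here is a minimal sketch. The arr below is a made-up stand-in built with np.arange() and reshape(), not the array from the image. It walks through the same index one axis at a time.

```python
import numpy as np

# Hypothetical 4D array with shape (3, 2, 2, 3), filled with the values 0..35.
arr = np.arange(36).reshape(3, 2, 2, 3)

# Step by step, one axis at a time:
block = arr[2]      # index 2 on the first axis -> a 3D array
matrix = block[-1]  # the last matrix in that 3D array
row = matrix[1]     # the 2nd row of that matrix
value = row[1]      # the 2nd value in that row

# The comma-separated index does all four steps in one go.
print(value, arr[2, -1, 1, 1])  # prints: 34 34
```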
Python VS NumPy
At the beginning of the post, I said that calculations in NumPy are extremely fast compared to normal Python code. Let's see the difference. We will create two lists with 10 million numbers from 0 to 9,999,999, add them element-wise and measure the time it takes. Then we will convert both lists to NumPy arrays and do the same.
import numpy as np
import time
l1 = list(range(10000000))
l2 = list(range(10000000))
total = []  # named total to avoid shadowing the built-in sum()
then = time.time()
for i in range(len(l1)):
    total.append(l1[i] + l2[i])
print(f"With just Python: {time.time() - then:.2f}s")
arr1 = np.array(l1)
arr2 = np.array(l2)
then = time.time()
total = arr1 + arr2
print(f"With NumPy: {time.time() - then:.2f}s")
Output:
With just Python: 2.30s
With NumPy: 0.14s
In this case, NumPy was about 16x faster than raw Python. The exact numbers will vary from machine to machine.
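If you want to repeat the comparison yourself, here is a variant sketch. A few things in it are my own choices rather than part of the measurement above: it uses time.perf_counter() (a clock meant for timing short intervals), a smaller n of 1 million so it finishes quickly, and a list comprehension with zip() for the pure Python side. It also checks that both approaches produce the same values.

```python
import time
import numpy as np

n = 1_000_000  # assumption: smaller than the 10 million used above, to keep it quick

l1 = list(range(n))
l2 = list(range(n))

# Pure Python: add the two lists element-wise.
start = time.perf_counter()
py_result = [a + b for a, b in zip(l1, l2)]
py_time = time.perf_counter() - start

arr1 = np.array(l1)
arr2 = np.array(l2)

# NumPy: the + operator adds the arrays element-wise.
start = time.perf_counter()
np_result = arr1 + arr2
np_time = time.perf_counter() - start

print(f"With just Python: {py_time:.4f}s")
print(f"With NumPy: {np_time:.4f}s")
print("Same result:", py_result == np_result.tolist())
```

I have left out the timings on purpose, since they depend on your hardware and on the Python and NumPy versions you run.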