MLSYS ENGINEERING

2.1. Tensor

Let's start with the most essential data structure in MLSys: the tensor.

The tensor is a concept from mathematics: a generalization of scalars, vectors, and matrices to higher dimensions. As shown in Figure 2, a scalar is a 0-dimensional tensor, a vector is a 1-dimensional tensor, a matrix is a 2-dimensional tensor, and so on.

0D Tensor (Scalar): 5
1D Tensor (Vector): 2 4 6 8
2D Tensor (Matrix): 1 2 3 / 4 5 6 / 7 8 9
3D Tensor: 1 2 3 / 4 5 6 / 7 8 9
4D Tensor: 1 2 3 / 4 5 6 / 7 8 9 (shown three times)
Figure 2. Visual illustration of tensors.

In a software developer's terms, the scalar, vector, matrix, and higher-dimensional tensors can be written as NumPy arrays, with their shapes noted in the comments, as follows.

Code 1. Tensors with different dimensions.
import numpy as np
# 0D Tensor, shape: ()
scalar = np.array(5)
# 1D Tensor, shape: (4,)
vector = np.array([2, 4, 6, 8])
# 2D Tensor, shape: (3, 3)
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# 3D Tensor, shape: (3, 3, 3)
tensor = np.array([matrix, matrix, matrix])
# 4D Tensor, shape: (3, 3, 3, 3)
tensor_4d = np.array([tensor, tensor, tensor])

As you can see, a vector is a list of scalars, and a matrix is a list of vectors. Beyond that point the name stops changing: a list of tensors of the same shape is still a tensor, just with one more dimension.
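The "one more dimension" rule can be checked directly. The following is a minimal sketch, not part of the original listing: it assumes NumPy's np.stack as the way to build the list of tensors, and the names tensor_3d and first_slice are introduced here only for illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Stacking three 2D tensors of the same shape adds a new leading axis.
tensor_3d = np.stack([matrix, matrix, matrix])

# Stacking three 3D tensors adds one more axis again.
tensor_4d = np.stack([tensor_3d, tensor_3d, tensor_3d])

# ndim is the number of dimensions; shape lists the size along each one.
for name, t in [("matrix", matrix), ("tensor_3d", tensor_3d), ("tensor_4d", tensor_4d)]:
    print(name, "ndim:", t.ndim, "shape:", t.shape)

# Indexing along the first axis undoes one level of listing:
# it returns one of the 2D tensors that were stacked.
first_slice = tensor_3d[0]
```

Each stacking step raises ndim by exactly one, and indexing the first axis lowers it by one, which is the sense in which a list of tensors is itself a tensor.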