The layers in PyTorch (1)

#pytorch #layer #deeplearning #machinelearning

*My post explains Recurrent Layer, LSTM, GRU and Transformer.

A layer is a collection of nodes that performs a specific task.

Basically, a Neural Network(NN) consists of 3 kinds of layers as shown below:

Input Layer:
- is the 1st layer, which accepts data and passes it to a hidden layer.
Hidden Layer:
- is a layer between the input layer and the output layer.
- There can be zero or more hidden layers in a neural network.
Output Layer:
- is the last layer, which holds the result.

*A Single-Layer Neural Network, or Perceptron, has only an input layer and an output layer, with no hidden layer.
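As a rough sketch of the structure above, the input, hidden and output layers can be written with `nn.Linear()` and `nn.Sequential()`. The sizes (4, 8 and 3), the `ReLU` activation and the variable names are assumptions chosen only for illustration:

```python
import torch
from torch import nn

# Hypothetical sizes, chosen only for this example.
in_features, hidden_features, out_features = 4, 8, 3

# Input -> hidden -> output. The input layer is simply the input tensor here.
mlp = nn.Sequential(
    nn.Linear(in_features, hidden_features),   # input to hidden layer
    nn.ReLU(),
    nn.Linear(hidden_features, out_features),  # hidden to output layer
)

# No hidden layer: the input goes straight to the output layer.
single_layer = nn.Linear(in_features, out_features)

x = torch.randn(2, in_features)  # a batch of 2 samples
mlp_out = mlp(x)
single_out = single_layer(x)

print(mlp_out.shape)     # torch.Size([2, 3])
print(single_out.shape)  # torch.Size([2, 3])
```

Both models map 4 input values to 3 output values. The only difference is whether a hidden layer sits in between.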

Popular layers are shown below. *Some of these layers are also Neural Networks or models in their own right:

(1) Fully-connected Layer:

(2) Convolutional Layer(1982):

can extract features from data with filters(kernels), which makes those features stand out. *It can also downsample the data to reduce computation, for example when the stride is greater than 1 or no padding is used.
is used for Convolutional Neural Network(CNN): *Memos:
- There are 1D CNN, 2D CNN and 3D CNN.
- 1D CNN is for 1D data such as time series, audio, text, etc.
- 2D CNN is for 2D data such as a 2D image.
- 3D CNN is for 3D data such as video, Magnetic Resonance Imaging(MRI), Computerized Tomography(CT) scans, etc.
is Conv1d(), Conv2d() or Conv3d() in PyTorch: *Memos:
- Conv1d() is for 1D data.
- Conv2d() is for 2D data.
- Conv3d() is for 3D data.
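The following is a minimal sketch of how the three convolution classes treat input shapes. The channel counts, kernel sizes and input sizes are assumptions picked for illustration, not recommended settings:

```python
import torch
from torch import nn

# Expected input shapes:
#   Conv1d: (N, C, L), Conv2d: (N, C, H, W), Conv3d: (N, C, D, H, W)
conv1d = nn.Conv1d(in_channels=1, out_channels=4, kernel_size=3)
conv2d = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
conv3d = nn.Conv3d(in_channels=1, out_channels=2, kernel_size=3)

x1 = torch.randn(1, 1, 16)         # e.g. a short 1D signal
x2 = torch.randn(1, 3, 32, 32)     # e.g. an RGB image
x3 = torch.randn(1, 1, 8, 16, 16)  # e.g. a small volume

y1 = conv1d(x1)  # shape (1, 4, 14)
y2 = conv2d(x2)  # shape (1, 8, 30, 30)
y3 = conv3d(x3)  # shape (1, 2, 6, 14, 14)

# With stride=2 the spatial size is roughly halved (downsampling).
conv2d_s2 = nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1)
y2_s2 = conv2d_s2(x2)  # shape (1, 8, 16, 16)
```

Without padding, a kernel of size 3 trims 2 from each spatial dimension. Adding `stride=2` is what halves the size.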

(3) Transposed Convolutional Layer:

can upsample data.
is used for CNN.
is also called Deconvolutional Layer, although it doesn't compute a true mathematical deconvolution.
is ConvTranspose1d(), ConvTranspose2d() or ConvTranspose3d() in PyTorch: *Memos:
- ConvTranspose1d() is for 1D data.
- ConvTranspose2d() is for 2D data.
- ConvTranspose3d() is for 3D data.
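Here is a small sketch of upsampling with the transposed convolution classes. The kernel sizes, strides and channel counts are assumptions chosen so that each spatial dimension is exactly doubled:

```python
import torch
from torch import nn

# kernel_size=2 with stride=2 doubles each spatial dimension.
up1d = nn.ConvTranspose1d(in_channels=4, out_channels=1,
                          kernel_size=4, stride=2, padding=1)
up2d = nn.ConvTranspose2d(in_channels=8, out_channels=4,
                          kernel_size=2, stride=2)
up3d = nn.ConvTranspose3d(in_channels=2, out_channels=1,
                          kernel_size=2, stride=2)

x1 = torch.randn(1, 4, 10)
x2 = torch.randn(1, 8, 16, 16)
x3 = torch.randn(1, 2, 4, 8, 8)

y1 = up1d(x1)  # shape (1, 1, 20)
y2 = up2d(x2)  # shape (1, 4, 32, 32)
y3 = up3d(x3)  # shape (1, 1, 8, 16, 16)
```

With the default `output_padding=0` and `dilation=1`, the output size is `(in - 1) * stride - 2 * padding + kernel_size`, which gives the doubled sizes above.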

(4) Pooling Layer:

can downsample(reduce) data while keeping the important features to reduce computation. *Its way of downsampling is different from Convolutional Layer's: max and average pooling have no learnable weights.
is used for CNN.
Max pooling, Average pooling and Min pooling are popular. *Max, Average or Min pooling takes the maximum(brightest), average or minimum(darkest) value(pixel) respectively from each window(kernel) of an image.
is MaxPool1d(), MaxPool2d(), MaxPool3d(), AvgPool1d(), AvgPool2d() or AvgPool3d() in PyTorch: *Memos:
- Min pooling doesn't exist in PyTorch.
- MaxPool1d() and AvgPool1d() are for 1D data.
- MaxPool2d() and AvgPool2d() are for 2D data.
- MaxPool3d() and AvgPool3d() are for 3D data.
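A minimal sketch of max and average pooling on a tiny 4x4 input is shown below. Since PyTorch has no min pooling class, the last line negates the input and output of `MaxPool2d()`, which is a common workaround and not a dedicated PyTorch API:

```python
import torch
from torch import nn

# A 4x4 "image" with values 1..16, shape (N, C, H, W) = (1, 1, 4, 4).
x = torch.arange(1.0, 17.0).reshape(1, 1, 4, 4)

max_pool = nn.MaxPool2d(kernel_size=2)  # stride defaults to kernel_size
avg_pool = nn.AvgPool2d(kernel_size=2)

max_out = max_pool(x)    # [[6, 8], [14, 16]]
avg_out = avg_pool(x)    # [[3.5, 5.5], [11.5, 13.5]]
min_out = -max_pool(-x)  # workaround for min pooling: [[1, 3], [9, 11]]

print(max_out.shape)  # torch.Size([1, 1, 2, 2])
```

Each 2x2 window is reduced to a single value, so the 4x4 input becomes 2x2.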

(5) Batch Normalization Layer(2015):

can normalize input values channel by channel across a batch to a similar scale to accelerate(speed up) training.
is unstable with small batch sizes, which can lead to increased training time.
is used for CNN.
is not good with RNN.
is BatchNorm1d(), BatchNorm2d() or BatchNorm3d() in PyTorch: *Memos:
- BatchNorm1d() is for 1D data.
- BatchNorm2d() is for 2D data.
- BatchNorm3d() is for 3D data.
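To make "channel by channel in a batch" concrete, here is a rough sketch with `BatchNorm2d()` in training mode. The input shape and the scale and shift applied to the random input are assumptions for illustration:

```python
import torch
from torch import nn

torch.manual_seed(0)

# A batch of 8 samples, 3 channels, 5x5 each, deliberately off-scale.
x = torch.randn(8, 3, 5, 5) * 4 + 10

bn = nn.BatchNorm2d(num_features=3)  # one mean and variance per channel
bn.train()                           # use the statistics of the current batch

y = bn(x)

# Statistics are taken over the batch and spatial dims, per channel.
channel_mean = y.mean(dim=(0, 2, 3))
channel_var = y.var(dim=(0, 2, 3), unbiased=False)

print(channel_mean)  # close to 0 for each of the 3 channels
print(channel_var)   # close to 1 for each of the 3 channels
```

Because the statistics come from the batch itself, a very small batch gives noisy estimates, which is the instability mentioned above.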

(6) Layer Normalization(2016):

can normalize input values across the features of each sample, in many types of NN, to a similar scale to accelerate training.
is an alternative to Batch Normalization layer which doesn't depend on the batch size.
is stable with small batch sizes, so it doesn't lead to increased training time.
is good with RNN.
is LayerNorm() in PyTorch.
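For comparison with batch normalization, here is a small sketch of `LayerNorm()` normalizing over the last dimension of each sample. The shapes (a sequence of 5 steps with 16 features) are assumptions for illustration:

```python
import torch
from torch import nn

torch.manual_seed(0)

# (batch, sequence length, features), deliberately off-scale.
x = torch.randn(2, 5, 16) * 3 + 7

ln = nn.LayerNorm(normalized_shape=16)  # normalize over the 16 features
y = ln(x)

# Statistics are taken per sample and per step, over the feature dim.
feature_mean = y.mean(dim=-1)
feature_var = y.var(dim=-1, unbiased=False)

print(feature_mean.shape)  # torch.Size([2, 5])

# The batch size doesn't matter: a single sample works the same way.
y_single = ln(x[:1])
```

Each position is normalized on its own, so the result for one sample doesn't change when other samples are added to or removed from the batch.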

(7) Dropout Layer(2012):

(8) Embedding Layer:
