
Super Kai (Kazuya Ito)


The layers in PyTorch (1)


*My other post explains Recurrent Layer, LSTM, GRU and Transformer.

A layer is a collection of nodes(neurons) that performs a specific task.

Basically, a Neural Network(NN) consists of 3 types of layers as shown below:

  • Input Layer:

    • is the 1st layer, which accepts data and passes it to a hidden layer.
  • Hidden Layer:

    • is a layer between the input layer and the output layer.
    • A neural network can have zero or more hidden layers.
  • Output layer:

    • is the last layer, which holds the result.

*A Single Layer Neural Network, or Perceptron, has only an input layer and an output layer, with no hidden layer.
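
As a rough sketch of this structure, the snippet below builds a small network with one hidden layer and a single-layer network for comparison. The sizes (4 inputs, 8 hidden nodes, 3 outputs) and the ReLU() activation are assumptions chosen for illustration. Note that PyTorch has no separate input layer module: the input is simply the tensor passed to the first layer.

```python
import torch
from torch import nn

# Hypothetical sizes: 4 input features, 8 hidden nodes, 3 outputs.
mlp = nn.Sequential(
    nn.Linear(4, 8),  # input -> hidden layer
    nn.ReLU(),        # activation (an assumption for this sketch)
    nn.Linear(8, 3),  # hidden -> output layer
)

# A single-layer network: the inputs connect directly to the output.
perceptron = nn.Linear(4, 1)

x = torch.randn(5, 4)  # a batch of 5 samples with 4 features each
out_mlp = mlp(x)
out_perceptron = perceptron(x)

print(out_mlp.shape)         # torch.Size([5, 3])
print(out_perceptron.shape)  # torch.Size([5, 1])
```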

Popular layers are shown below. *Some of them can themselves be Neural Networks or models:

(1) Fully-connected Layer:

  • connects every neuron in one layer to every neuron in the next layer.
  • is also called Linear Layer, Dense Layer or Affine Layer.
  • is Linear() in PyTorch.
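
A minimal sketch of Linear(), assuming 4 input features and 3 output features (both sizes are arbitrary). It also checks that the layer computes an affine map, which is where the name Affine Layer comes from.

```python
import torch
from torch import nn

torch.manual_seed(0)

linear = nn.Linear(in_features=4, out_features=3)
x = torch.randn(2, 4)  # a batch of 2 samples with 4 features each
y = linear(x)

print(y.shape)              # torch.Size([2, 3])
print(linear.weight.shape)  # torch.Size([3, 4]): one weight per input/output pair

# The layer computes y = x @ W^T + b.
manual = x @ linear.weight.T + linear.bias
print(torch.allclose(y, manual, atol=1e-6))
```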

(2) Convolutional Layer(1982):

  • can make data stand out by extracting features from it with filters(kernels). *Depending on the kernel size, stride and padding, extracting features can also downsample the data, which reduces computation.
  • is used in a Convolutional Neural Network(CNN): *Memos:
    • There are 1D CNN, 2D CNN and 3D CNN.
    • 1D CNN is for 1D data such as time series, audio, text, etc.
    • 2D CNN is for 2D data such as a 2D image.
    • 3D CNN is for 3D data such as video, Magnetic Resonance Imaging(MRI), Computed Tomography(CT) scans, etc.
  • is Conv1d(), Conv2d() or Conv3d() in PyTorch: *Memos:
    • Conv1d() is for 1D data.
    • Conv2d() is for 2D data.
    • Conv3d() is for 3D data.
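
The following sketch applies Conv2d() to a hypothetical 32x32 RGB image. The channel counts, kernel size, stride and padding are assumptions picked to show how they change the output size.

```python
import torch
from torch import nn

x = torch.randn(1, 3, 32, 32)  # (batch, channels, height, width)

# 8 filters of size 3x3, no padding: each side shrinks by 2.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
y = conv(x)
print(y.shape)  # torch.Size([1, 8, 30, 30])

# stride=2 roughly halves the spatial size (downsampling).
conv_s2 = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3,
                    stride=2, padding=1)
y_s2 = conv_s2(x)
print(y_s2.shape)  # torch.Size([1, 8, 16, 16])
```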

(3) Transposed Convolutional Layer:

  • can upsample data.
  • is used for CNN.
  • is also called Deconvolutional Layer.
  • is ConvTranspose1d(), ConvTranspose2d() or ConvTranspose3d() in PyTorch: *Memos:
    • ConvTranspose1d() is for 1D data.
    • ConvTranspose2d() is for 2D data.
    • ConvTranspose3d() is for 3D data.
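
A small sketch of upsampling with ConvTranspose2d(), assuming a 16x16 feature map with 8 channels. With kernel_size=2 and stride=2 (values chosen for illustration), the height and width double.

```python
import torch
from torch import nn

x = torch.randn(1, 8, 16, 16)  # (batch, channels, height, width)

up = nn.ConvTranspose2d(in_channels=8, out_channels=3,
                        kernel_size=2, stride=2)
y = up(x)

print(y.shape)  # torch.Size([1, 3, 32, 32])
```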

(4) Pooling Layer:

  • can downsample(reduce) data while keeping its features, which reduces computation. *It downsamples differently from Convolutional Layer: it has no learnable weights.
  • is used for CNN.
  • Max pooling, Average pooling and Min pooling are popular. *Max, Average or Min pooling takes the maximum(brightest), average or minimum(darkest) value(pixel) respectively from each window(kernel) slid over an image.
  • is MaxPool1d(), MaxPool2d(), MaxPool3d(), AvgPool1d(), AvgPool2d() or AvgPool3d() in PyTorch: *Memos:
    • PyTorch has no built-in min pooling layer.
    • MaxPool1d() and AvgPool1d() are for 1D data.
    • MaxPool2d() and AvgPool2d() are for 2D data.
    • MaxPool3d() and AvgPool3d() are for 3D data.
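
To make the three pooling types concrete, here is a sketch on a hand-written 4x4 input with 2x2 windows. The min pooling line is a workaround (negate, max pool, negate again), not a PyTorch layer.

```python
import torch
from torch import nn

# One image, one channel, 4x4 pixels.
x = torch.tensor([[[[1., 2., 5., 6.],
                    [3., 4., 7., 8.],
                    [9., 10., 13., 14.],
                    [11., 12., 15., 16.]]]])

max_pool = nn.MaxPool2d(kernel_size=2)
avg_pool = nn.AvgPool2d(kernel_size=2)

y_max = max_pool(x)
y_avg = avg_pool(x)
y_min = -max_pool(-x)  # workaround: min(x) == -max(-x)

print(y_max)  # tensor([[[[ 4.,  8.], [12., 16.]]]])
print(y_avg)  # tensor([[[[ 2.5,  6.5], [10.5, 14.5]]]])
print(y_min)  # tensor([[[[ 1.,  5.], [ 9., 13.]]]])
```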

(5) Batch Normalization Layer(2015):

  • can normalize input values channel by channel across a batch to a similar scale, which accelerates(speeds up) training.
  • is unstable with small batch sizes, which can lengthen training time.
  • is used for CNN.
  • is not well suited to RNN.
  • is BatchNorm1d(), BatchNorm2d() or BatchNorm3d() in PyTorch: *Memos:
    • BatchNorm1d() is for 1D data.
    • BatchNorm2d() is for 2D data.
    • BatchNorm3d() is for 3D data.
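
A sketch of the channel-by-channel normalization with BatchNorm2d(), assuming a batch of 8 feature maps with 3 channels whose values are deliberately shifted and scaled. In training mode, each channel ends up with a mean near 0 and a variance near 1 across the batch.

```python
import torch
from torch import nn

torch.manual_seed(0)

# 8 samples, 3 channels, 4x4 pixels, with mean around 10 and std around 5.
x = torch.randn(8, 3, 4, 4) * 5 + 10

bn = nn.BatchNorm2d(num_features=3)  # one mean/variance pair per channel
bn.train()                           # use batch statistics
y = bn(x)

# Statistics are taken over the batch and spatial dimensions.
channel_mean = y.mean(dim=(0, 2, 3))
channel_var = y.var(dim=(0, 2, 3), unbiased=False)

print(channel_mean)  # each value is close to 0
print(channel_var)   # each value is close to 1
```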

(6) Layer Normalization(2016):

  • can normalize input values layer by layer, across the features of each sample, in many types of NN to a similar scale, which accelerates training.
  • was proposed as an alternative to Batch Normalization layer that does not depend on the batch size.
  • is stable with small batch sizes, so it doesn't lengthen training time.
  • is well suited to RNN.
  • is LayerNorm() in PyTorch.
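
A sketch of LayerNorm(), assuming a hypothetical sequence input with 6 features per position. Each position is normalized over its own features, so the result for one sample does not change with the batch size.

```python
import torch
from torch import nn

torch.manual_seed(0)

# 2 samples, 5 positions, 6 features, with mean around 7 and std around 3.
x = torch.randn(2, 5, 6) * 3 + 7

ln = nn.LayerNorm(normalized_shape=6)  # normalize over the last dimension
y = ln(x)

feature_mean = y.mean(dim=-1)
feature_var = y.var(dim=-1, unbiased=False)
print(feature_mean)  # each value is close to 0
print(feature_var)   # each value is close to 1

# The same sample alone (batch size 1) gives the same output.
y_single = ln(x[:1])
print(torch.allclose(y[:1], y_single, atol=1e-5))
```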

(7) Dropout Layer(2012):

  • can reduce overfitting by randomly dropping out nodes during training.
  • is Dropout() in PyTorch.
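
A sketch of Dropout() with an assumed drop probability of 0.5. In training mode, the surviving values are scaled by 1 / (1 - p) so the expected sum stays the same. In evaluation mode, the layer passes the input through unchanged.

```python
import torch
from torch import nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
y_train = drop(x)  # each value is either 0.0 (dropped) or 2.0 (kept, scaled)

drop.eval()
y_eval = drop(x)   # dropout is disabled during evaluation

print(y_train.unique())        # tensor([0., 2.])
print(torch.equal(y_eval, x))  # True
```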

(8) Embedding Layer:

  • can convert categorical data(e.g. word indices) to numerical data(dense vectors).
  • is used for NLP.
  • is Embedding() in PyTorch.
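
A sketch of Embedding(), assuming a hypothetical vocabulary of 10 tokens mapped to 4-dimensional vectors. The token ids are made up for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

emb = nn.Embedding(num_embeddings=10, embedding_dim=4)

# 2 sentences of 3 token ids each (ids must be in the range 0..9).
ids = torch.tensor([[1, 2, 3],
                    [4, 5, 1]])
vecs = emb(ids)

print(vecs.shape)  # torch.Size([2, 3, 4])

# The same id always maps to the same learnable vector.
print(torch.equal(vecs[0, 0], vecs[1, 2]))  # True
```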
