Aayush Dineshkumar Jain

Convolutional Neural Network

What is a Convolutional Neural Network?

A Convolutional Neural Network (CNN) is a class of deep learning models that takes images as input, identifies the edges and features that differentiate the images, and outputs a result such as a class label.

How does a CNN work?

A CNN uses filters and max pooling to identify horizontal and vertical edges and to keep only the features that carry relevant information.

Filters:

The input image is convolved with a filter to extract the edges. Several standard filters exist, for example the Sobel filter and the Scharr filter.
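As a rough illustration of what such filters look like, the sketch below writes the standard 3x3 Sobel and Scharr kernels for vertical edges as NumPy arrays and applies each to a single 3x3 patch. The patch values and the helper name `respond` are made up for illustration.

```python
import numpy as np

# Standard 3x3 kernels that respond to vertical edges
# (a change in intensity from left to right).
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
scharr_x = np.array([[ -3, 0,  3],
                     [-10, 0, 10],
                     [ -3, 0,  3]])

# Hypothetical patch: dark (0) on the left, bright (1) on the right.
edge_patch = np.array([[0, 0, 1],
                       [0, 0, 1],
                       [0, 0, 1]])
# Hypothetical flat patch with no edge at all.
flat_patch = np.ones((3, 3), dtype=int)

def respond(patch, kernel):
    """Multiply element-wise and sum: one step of the convolution."""
    return int(np.sum(patch * kernel))

sobel_edge = respond(edge_patch, sobel_x)    # 1 + 2 + 1 = 4
scharr_edge = respond(edge_patch, scharr_x)  # 3 + 10 + 3 = 16
sobel_flat = respond(flat_patch, sobel_x)    # kernel entries sum to 0
print(sobel_edge, scharr_edge, sobel_flat)   # 4 16 0
```

Both kernels give a large response on the edge and zero on the flat patch. The Scharr kernel simply weights the same pattern more heavily.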

Consider a 7x7 input image and a 3x3 filter.

Figure: Convolutional function

The 1st 3x3 block of the input image is multiplied element by element with the filter (the Feature Detector in the image), and the products are summed to obtain the 1st value of the Feature map. This is shown below:

(0x0)+(0x1)+(0x0)+(0x0)+(1x0)+(0x1)+(0x1)+(0x0)+(0x1)=0

Similarly for 2nd block in 1st row,

(0x0)+(1x1)+(0x0)+(0x0)+(0x0)+(0x1)+(0x1)+(0x0)+(0x1)=1

In the same way, the filter moves across the image with a stride of one and a value is calculated for each block. The resulting feature map represents the edges of the image.

The size of the Feature map = size of the input image - size of the filter + 1 (for a stride of one and no padding). In the given example this is 7 - 3 + 1 = 5, so the feature map is 5x5.
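The sliding multiply-and-sum described above can be sketched in a few lines of NumPy. The filter below matches the multipliers used in the sums above. The image values are hypothetical (the figure's exact pixels are not reproduced here), and `conv2d_valid` is an illustrative helper name, not a library function.

```python
import numpy as np

def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` with no padding and return the feature map."""
    n, k = image.shape[0], kernel.shape[0]
    out = (n - k) // stride + 1          # with stride 1 this is n - k + 1
    feature_map = np.zeros((out, out), dtype=image.dtype)
    for i in range(out):
        for j in range(out):
            r, c = i * stride, j * stride
            block = image[r:r + k, c:c + k]             # current k x k block
            feature_map[i, j] = np.sum(block * kernel)  # multiply and sum
    return feature_map

# Hypothetical 7x7 binary image (a checkerboard), NOT the one from the figure.
image = np.arange(49).reshape(7, 7) % 2
# 3x3 filter read off from the multipliers in the sums above.
kernel = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [1, 0, 1]])

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (5, 5)
```

With a 7x7 input and a 3x3 filter the result has shape (5, 5), in line with the size formula.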

But this can create a problem: because the output is smaller than the input, some features or information, particularly at the borders, is lost. When that information is important, padding is applied to the input image.

After padding by one pixel on every side, the 7x7 matrix becomes 9x9, so the feature map size is 9 - 3 + 1 = 7, i.e. 7x7. The output is the same size as the original input, which means there is no loss of information.
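A minimal sketch of the padding arithmetic, assuming zero padding of one pixel on every side (the text does not say which values are used for padding, so zeros are an assumption here). `np.pad` is a real NumPy function, and the image values are made up.

```python
import numpy as np

image = np.ones((7, 7), dtype=int)   # hypothetical 7x7 input
# Add a one-pixel border of zeros on all four sides (assumed padding value).
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

filter_size = 3
feature_map_size = padded.shape[0] - filter_size + 1

print(padded.shape)       # (9, 9)
print(feature_map_size)   # 7
```

In Keras, the same effect is requested by passing `padding='same'` to `Conv2D`.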

Maxpooling Layer:

In max pooling, suppose the pooling window is 2x2. The maximum value from each 2x2 block of the layer's input is taken, and these maxima form the output matrix of the max-pooling layer. This output keeps the most important features, i.e. those that carry the most information about the input.

Figure: MaxPooling Layer
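The 2x2 max-pooling step can be sketched with a NumPy reshape. The input values and the helper name `max_pool2d` are hypothetical. The sketch assumes non-overlapping windows and drops any leftover row or column that does not fill a whole window.

```python
import numpy as np

def max_pool2d(feature_map, pool=2):
    """Take the maximum of each non-overlapping pool x pool block."""
    h, w = feature_map.shape
    h, w = h - h % pool, w - w % pool   # drop rows/cols that do not fill a block
    # Split rows and columns into (block index, position inside block).
    blocks = feature_map[:h, :w].reshape(h // pool, pool, w // pool, pool)
    return blocks.max(axis=(1, 3))      # maximum inside each block

# Hypothetical 4x4 input to the pooling layer.
feature_map = np.array([[1, 3, 2, 0],
                        [4, 6, 1, 1],
                        [0, 2, 9, 5],
                        [1, 1, 3, 7]])
pooled = max_pool2d(feature_map)
print(pooled)
# [[6 2]
#  [2 9]]
```

Each output value is the largest entry of one 2x2 block, so the 4x4 input shrinks to 2x2.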

An advantage of filters and max pooling is that we keep the important features of the input image while its size is reduced.

Now let’s see the code for filters and max pooling using Keras:

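from tensorflow.keras import layers, models

model = models.Sequential()
# Layer 1: 64 filters of size 3x3 on a 32x32 image with 3 channels
model.add(layers.Conv2D(64, (3, 3), activation='relu', input_shape=(32, 32, 3)))
# Layer 2: 2x2 max pooling halves the height and width
model.add(layers.MaxPooling2D((2, 2)))
# Layers 3 and 4: 32 filters of size 3x3, then 2x2 max pooling
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
# Layer 5: a final convolution with 32 filters
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
# Flatten the feature maps, then a hidden Dense layer and a 10-class output
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))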

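In the code above, the 1st layer contains 64 filters of size 3x3, and the input shape (32,32,3) represents a 32x32 image with 3 channels. Each filter produces one output channel, so the output shape of layer 1 is (30,30,64). The 2nd layer is a max-pooling layer of size 2x2, which halves the height and width, so its output shape is (15,15,64). Similarly, the 3rd layer has 32 filters of size 3x3 and an output shape of (13,13,32), and the 4th layer (max pooling) has an output shape of (6,6,32). The 5th layer is another convolution with 32 filters, which gives (4,4,32). The last 3 layers are the Flatten layer, which turns this into a vector of 4x4x32 = 512 values, a Dense layer with 64 units, and the output Dense layer with 10 units, one per class.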
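The layer shapes of the model above can be checked without TensorFlow by applying the size formula layer by layer. This is a plain-Python sketch, and the helper names `conv_out` and `pool_out` are made up for illustration. It assumes unpadded convolutions with stride one and non-overlapping pooling, which are the Keras defaults used in the model. The last number in each convolution's output shape is its number of filters.

```python
def conv_out(size, kernel, stride=1):
    """Height/width after an unpadded convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool):
    """Height/width after non-overlapping pooling (leftover rows are dropped)."""
    return size // pool

size, channels = 32, 3   # input shape (32, 32, 3)
shapes = []
# The first five layers of the model, in order.
for kind, arg in [("conv", 64), ("pool", 2), ("conv", 32), ("pool", 2), ("conv", 32)]:
    if kind == "conv":
        size, channels = conv_out(size, 3), arg   # arg = number of 3x3 filters
    else:
        size = pool_out(size, arg)                # arg = pool size; channels unchanged
    shapes.append((size, size, channels))

flattened = size * size * channels   # length of the vector after Flatten

print(shapes)     # [(30, 30, 64), (15, 15, 64), (13, 13, 32), (6, 6, 32), (4, 4, 32)]
print(flattened)  # 512
```

Calling `model.summary()` on the Keras model prints the same shapes, with an extra leading batch dimension.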
