Unleashing the Power of Computer Vision: Your First CNN in TensorFlow Made Simple!

#ai #machinelearning #datascience #computervision

What is a Convolutional Neural Network?

A Convolutional Neural Network (CNN) is a multilayered artificial neural network designed to detect features in images.

Why was CNN built and what problems does it solve?

Convolutional Neural Networks (CNNs) were built to help computers "see" and understand images more effectively. Before CNNs, computers struggled to recognize objects or patterns in pictures because images contain a huge amount of data. The design of CNNs is loosely inspired by how the human visual system processes images.

Imagine looking at a picture of a cat. When you see the image, your brain identifies different parts of the cat, like its ears, eyes, and nose. CNNs work similarly. They use specialized layers to look for small patterns in the image, like edges or shapes, and then combine them to recognize more complex features like a cat's face.
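
To make the idea of "looking for small patterns" concrete, here is a minimal pure-Python sketch of what a single convolutional filter does. The 5x5 image and the Sobel-like filter values are illustrative assumptions; in a real CNN the filter values are learned during training, and the framework's implementation is far more optimized than this loop.

```python
def convolve2d(image, kernel):
    """Slide kernel over image (no padding, stride 1) and return the feature map.

    Like deep learning frameworks, this does not flip the kernel
    (strictly speaking it is a cross-correlation).
    """
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# A 5x5 grayscale "image": dark (0) on the left, bright (1) on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# A hand-written vertical-edge filter (Sobel-like), chosen only for illustration.
vertical_edge = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)  # each row prints [4, 4, 0]
```

The output is large where the 3x3 window straddles the dark-to-bright boundary and zero over the uniformly bright region, which is what "detecting an edge" means here.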

By using CNNs, computers can automatically learn what's important in an image without needing humans to tell them. This makes it possible for computers to do tasks like identifying objects, detecting diseases in medical images, or even driving cars on their own. CNNs have revolutionized computer vision and made many amazing technologies possible.

Here are some of the problems that CNNs solve:

Image classification: CNNs can be trained to classify images into different categories, such as cats, dogs, cars, and people.
Object detection: CNNs can be trained to detect objects in images. This is useful for applications such as self-driving cars and facial recognition.
Segmentation: CNNs can be used to segment images, which means dividing them into different parts. This is useful for applications such as medical image analysis and scene understanding.
Super-resolution: CNNs can be used to improve the resolution of images. This is useful for applications such as image restoration and video compression.

Classifying Food Images with CNN

I will show you how to create a simple CNN model with TensorFlow, using a pizza-and-steak subset of the Food101 dataset. We will build the model hands-on in the following steps.

Step 1: Install TensorFlow

pip install tensorflow

Check that TensorFlow is properly installed by printing its version:

import tensorflow as tf
print(tf.__version__)

Step 2: Download and extract the dataset on our local machine

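import zipfile

# Download the dataset (the leading "!" runs a shell command inside a notebook)
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/pizza_steak.zip

# Unzip the file into the current working directory
zip_ref = zipfile.ZipFile('pizza_steak.zip')
zip_ref.extractall()
zip_ref.close()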

The line zip_ref = zipfile.ZipFile('pizza_steak.zip') creates a ZipFile object named zip_ref, which lets us work with the contents of pizza_steak.zip. zipfile.ZipFile() takes the path of the ZIP file as its argument and, by default, opens it for reading.

Calling zip_ref.extractall() on that object extracts all the contents of the archive into the current working directory.

Finally, zip_ref.close() closes the ZipFile object once extraction is complete, which frees up system resources. It is good practice to close the object when you are done working with it.
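
As a variation on the snippet above, the same extraction can be written with a with statement, which closes the archive automatically even if extraction fails. The sketch below is self-contained: it builds a tiny stand-in archive (demo.zip, a hypothetical name with placeholder files, not the real dataset) so it can run without a download. The helper name extract_zip is also just for illustration.

```python
import os
import tempfile
import zipfile

def extract_zip(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted member names."""
    # The with block closes the archive automatically, so no close() call is needed.
    with zipfile.ZipFile(zip_path) as zip_ref:
        zip_ref.extractall(dest_dir)
        return sorted(zip_ref.namelist())

# Build a tiny stand-in archive so the example runs without downloading anything.
work_dir = tempfile.mkdtemp()
demo_zip = os.path.join(work_dir, "demo.zip")
with zipfile.ZipFile(demo_zip, "w") as zf:
    zf.writestr("pizza_steak/train/pizza/a.txt", "placeholder")
    zf.writestr("pizza_steak/train/steak/b.txt", "placeholder")

members = extract_zip(demo_zip, work_dir)
print(members)
```

For the real dataset, the call would be extract_zip('pizza_steak.zip', '.') after the download has finished.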

Step 3: We preprocess our images by converting them into numerical representations. This enables our model to effectively detect features in the images.

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Set the seed for reproducibility
tf.random.set_seed(42)

# Rescale pixel values from the 0-255 range to the 0-1 range
train_datagen = ImageDataGenerator(rescale=1./255)
valid_datagen = ImageDataGenerator(rescale=1./255)

# Set up paths to our data directories
train_dir = 'pizza_steak/train'
test_dir = 'pizza_steak/test'

# Import data from directories and turn it into batches
train_data = train_datagen.flow_from_directory(directory=train_dir,
                                               batch_size=32,
                                               target_size=(224, 224),
                                               class_mode='binary',
                                               seed=42)
# The test directory is used as our validation data
valid_data = valid_datagen.flow_from_directory(directory=test_dir,
                                               batch_size=32,
                                               target_size=(224, 224),
                                               class_mode='binary',
                                               seed=42)

To prepare our data for training, we use the flow_from_directory method on the data generators. For the training data, we create batches of 32 images each, resizing them to a target size of 224x224 pixels. The class_mode is set to 'binary' since we are performing a binary classification (pizza or steak). The same process is applied to the validation data.
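
You may wonder where the labels come from, since we never pass them explicitly. flow_from_directory treats each subdirectory as a class and assigns label indices in alphanumeric order of the subdirectory names; the resulting mapping is available as train_data.class_indices. The sketch below mimics that idea in plain Python. It is not Keras's implementation, the helper name infer_class_indices is made up for illustration, and it assumes the layout is one subfolder per class (pizza and steak).

```python
import os
import tempfile

def infer_class_indices(directory):
    """Map each class subdirectory to an integer label in alphanumeric order."""
    class_names = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(class_names)}

# Build a stand-in directory tree (assumed layout: one subfolder per class).
root = tempfile.mkdtemp()
for class_name in ("steak", "pizza"):
    os.makedirs(os.path.join(root, "train", class_name))

class_indices = infer_class_indices(os.path.join(root, "train"))
print(class_indices)  # {'pizza': 0, 'steak': 1}
```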

By dividing the data into batches, we efficiently feed the numerical representations of the images into our neural network, enabling it to learn patterns from the data during the training process.
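
The two numbers doing the work here are the rescale factor and the batch size. The small sketch below shows the arithmetic behind both: rescale=1./255 multiplies every raw pixel value by 1/255 so it lands between 0 and 1, and the number of batches per epoch, which is what len(train_data) returns, is the image count divided by the batch size, rounded up. The image count of 1500 is a hypothetical figure used only for illustration, and the helper names are made up.

```python
import math

def rescale_pixels(pixels, scale=1 / 255):
    """Multiply raw 0-255 pixel values by scale, as rescale=1./255 does."""
    return [value * scale for value in pixels]

def batches_per_epoch(num_images, batch_size=32):
    """Number of batches needed to see every image once; the last may be smaller."""
    return math.ceil(num_images / batch_size)

scaled = rescale_pixels([0, 128, 255])
print(scaled)

# 1500 is a hypothetical image count, not taken from the dataset.
num_batches = batches_per_epoch(1500)
print(num_batches)  # 47
```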

Step 4: Construct the CNN model with the capability to identify distinctive features within our images.

# Build a CNN model
model_1 = tf.keras.Sequential([
    tf.keras.layers.Conv2D(filters=10,
                           kernel_size=3,
                           activation='relu',
                           input_shape=(224, 224, 3)),  # height, width, color channels
    tf.keras.layers.Conv2D(10, 3, activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=2,
                              padding='valid'),
    tf.keras.layers.Conv2D(10, 3, activation='relu'),
    tf.keras.layers.Conv2D(10, 3, activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation='sigmoid')  # one probability for the binary output
])

# Compile our model
model_1.compile(loss=tf.keras.losses.binary_crossentropy,
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy'])

# Fit our model
model_1.fit(train_data,
            epochs=5,
            steps_per_epoch=len(train_data),
            validation_data=valid_data,
            validation_steps=len(valid_data))
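
It can help to trace how the image size changes as it passes through the layers above. With Keras's default 'valid' padding and a stride of 1, each 3x3 Conv2D trims 2 pixels from the height and width, and the 2x2 max pooling halves them. The sketch below works through that arithmetic in plain Python, without TensorFlow; the helper names are made up for illustration, and calling model_1.summary() should report the same shapes.

```python
def conv_output(size, kernel_size=3):
    """Output width/height of a Conv2D layer with 'valid' padding and stride 1."""
    return size - kernel_size + 1

def pool_output(size, pool_size=2):
    """Output width/height of MaxPool2D with 'valid' padding and stride = pool_size."""
    return size // pool_size

size = 224
size = conv_output(size)   # first Conv2D:  222
size = conv_output(size)   # second Conv2D: 220
size = pool_output(size)   # MaxPool2D:     110
size = conv_output(size)   # third Conv2D:  108
size = conv_output(size)   # fourth Conv2D: 106

filters = 10
flattened = size * size * filters   # values fed into the Dense layer
dense_params = flattened + 1        # one weight per input plus one bias

print(size, flattened, dense_params)  # 106 112360 112361
```

Most of the model's parameters therefore sit in the final Dense layer, not in the convolutional layers.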

After training for 5 epochs, the model reaches an accuracy of 0.8920 on the training dataset and 0.8600 on the validation dataset. The corresponding loss values are 0.2982 and 0.3116, respectively.
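
To get a feel for what these loss values mean, here is a minimal sketch of binary cross-entropy, the loss we compiled the model with. The labels and predicted probabilities are made-up values for illustration, not outputs of the model. Because the mean loss is the negative mean log-probability assigned to the correct class, exp(-loss) gives the geometric-mean probability the model puts on the right answer.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical labels (0 or 1) and sigmoid outputs, for illustration only.
labels = [1, 0, 1, 0]
predictions = [0.9, 0.2, 0.6, 0.1]
example_loss = binary_crossentropy(labels, predictions)
print(round(example_loss, 4))

# Geometric-mean probability on the correct class for the validation loss reported above.
typical_confidence = math.exp(-0.3116)
print(round(typical_confidence, 3))
```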

Several approaches can improve the model's performance and lower its loss: training for more epochs, trying different optimizers, and adding more training data. Each one is worth testing separately so you can see which actually improves the validation accuracy.
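
One inexpensive way to get more training examples without collecting new photos is data augmentation, where existing images are slightly altered to create new ones. This is one interpretation of "more data", not something the model above uses. The sketch below shows the simplest case, a horizontal flip, on a tiny stand-in image in plain Python; in Keras the same effect is available through the horizontal_flip=True argument of ImageDataGenerator.

```python
def horizontal_flip(image):
    """Return a mirrored copy of an image stored as a list of pixel rows."""
    return [row[::-1] for row in image]

# A tiny 2x3 stand-in image; the real images would be 224x224 with 3 channels.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

# The augmented set holds the original plus its mirrored copy.
augmented = [image, horizontal_flip(image)]
print(augmented[1])  # [[3, 2, 1], [6, 5, 4]]
```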

Conclusion

As we conclude this article, I hope you found the journey into the world of Convolutional Neural Networks insightful and exciting. If you've enjoyed exploring the power of CNNs for image classification, then you'll definitely want to catch me on my next adventure - a captivating style transfer project!

In the upcoming article, we'll dive into the fascinating realm of neural style transfer, where we blend the artistry of different images to create mesmerizing masterpieces. You'll witness how CNNs can transform ordinary photographs into extraordinary works of art, seamlessly merging the style of one image with the content of another.

Be prepared to embark on an artistic journey filled with creativity and innovation. Whether you're a developer, an artist, or simply a curious mind, the magic of style transfer will captivate you.

So stay tuned, as we uncover the secrets behind this enchanting process in our next article. Until then, keep exploring the incredible possibilities of deep learning and computer vision!