Aman Gupta

Introduction to TensorFlow for Deep Learning

Week 1:

  • Traditional coding: rules + data = answers
  • ML: answers + data = rules (the computer figures out the rules)
  • Keras - a high-level API in TensorFlow
  • Neural network - a set of functions that can learn patterns
  • Dense - defines a layer of fully connected neurons
  • Loss function - measures how good or bad the guess was and passes that to the optimiser (e.g. mean squared error)
  • Optimiser function - figures out the next guess (e.g. SGD, which stands for stochastic gradient descent)
  • Importing libraries

    import tensorflow as tf
    import numpy as np
    from tensorflow import keras
  • Defining and compiling a model

    # Build a simple Sequential model
    model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
    # Compile the model
    model.compile(optimizer='sgd', loss='mean_squared_error')
  • Providing data

    # Declare model inputs and outputs for training
    xs = np.array([-1.0,  0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
    ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)
  • Training and prediction

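    # Train the model
    model.fit(xs, ys, epochs=500)
    # Make a prediction (10.0 is an example input; the training data follows y = 2x - 1)
    print(model.predict(np.array([[10.0]])))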
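
To make the loss/optimiser loop less of a black box, here is a minimal NumPy sketch of what "measure the guess, then make a better guess" could look like for this data. It is plain full-batch gradient descent on mean squared error, not what Keras does internally; the function name `fit_line` and the learning rate of 0.05 are my own choices for illustration.

```python
import numpy as np

# Same training data as in the notes; it follows y = 2x - 1
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

def fit_line(xs, ys, learning_rate=0.05, epochs=500):
    """Fit y = w * x + b by plain gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # initial guess
    for _ in range(epochs):
        error = (w * xs + b) - ys            # how far off each guess is
        grad_w = 2.0 * np.mean(error * xs)   # d(MSE)/dw
        grad_b = 2.0 * np.mean(error)        # d(MSE)/db
        w -= learning_rate * grad_w          # the optimiser's next guess
        b -= learning_rate * grad_b
    loss = np.mean(((w * xs + b) - ys) ** 2)
    return w, b, loss

w, b, loss = fit_line(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}, loss = {loss:.6f}")
print(f"prediction for x = 10: {w * 10 + b:.3f}")
```

The learned values should end up very close to w = 2 and b = -1, so the prediction for 10 lands near 19.
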
  • For visualisation and some fun:

Week 2: intro to CV

  • Flatten layer - converts multidimensional data into a one-dimensional array
  • Neurons - can be thought of as variables in a function
  • Using the Fashion MNIST dataset to classify images
  • Loading dataset

    # Load the Fashion MNIST dataset
    fmnist = tf.keras.datasets.fashion_mnist 
    # Load the training and test split of the Fashion MNIST dataset
    (training_images, training_labels), (test_images, test_labels) = fmnist.load_data()
  • Visualise the dataset

    import numpy as np
    import matplotlib.pyplot as plt
    # You can put any index from 0 to 59999 here
    index = 0
    # Set number of characters per row when printing (320 is an example width)
    np.set_printoptions(linewidth=320)
    # Print the label and image
    print(f'LABEL: {training_labels[index]}')
    print(f'\nIMAGE PIXEL ARRAY:\n {training_images[index]}')
    # Visualize the image
    plt.imshow(training_images[index])
    plt.show()
  • Normalising the data (good practice that usually gives more accurate results)

    # Normalize the pixel values of the train and test images
    training_images  = training_images / 255.0
    test_images = test_images / 255.0
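
As a quick check on what normalising and the Flatten layer do to the array shapes and values, here is a small NumPy sketch. The random `images` array is a hypothetical stand-in for a batch of Fashion MNIST images, not the real dataset.

```python
import numpy as np

# Hypothetical stand-in for a batch of two 28x28 greyscale images (values 0-255)
rng = np.random.default_rng(seed=0)
images = rng.integers(0, 256, size=(2, 28, 28)).astype(np.float64)

# Normalise: scale pixel values from [0, 255] into [0, 1]
normalised = images / 255.0

# Flatten: keep the batch dimension, unroll each 28x28 image into 784 values
flattened = normalised.reshape(len(normalised), -1)

print(images.shape, '->', flattened.shape)
print('min:', normalised.min(), 'max:', normalised.max())
```
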
  • Model

    # Build the classification model
    model = tf.keras.models.Sequential([tf.keras.layers.Flatten(), 
                                        tf.keras.layers.Dense(128, activation=tf.nn.relu), 
                                        tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
    # Compile the model
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    # Train the model
    model.fit(training_images, training_labels, epochs=5)
    # Evaluate the model on unseen data
    model.evaluate(test_images, test_labels)
  • Sequential - defines a sequence of layers in the neural network.

  • Each layer of neurons needs an activation function to tell it what to do. There are a lot of options, but just use these for now. ReLU effectively means:

    if x > 0: 
      return x
    else:
      return 0
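
The same rule applied to a whole array at once could look like this in NumPy. The helper name `relu` is mine; Keras has its own implementation behind `activation='relu'`.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: negative values become 0, the rest pass through."""
    return np.maximum(x, 0)

values = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(values))
```
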
  • Softmax - takes a list of values and scales these so the sum of all elements will be equal to 1. When applied to model outputs, you can think of the scaled values as the probability for that class. For example, in your classification model which has 10 units in the output dense layer, having the highest value at index = 4 means that the model is most confident that the input clothing image is a coat. If it is at index = 5, then it is a sandal, and so forth.

    # Declare sample inputs and convert to a tensor
    inputs = np.array([[1.0, 3.0, 4.0, 2.0]])
    inputs = tf.convert_to_tensor(inputs)
    print(f'input to softmax function: {inputs.numpy()}')
    # Feed the inputs to a softmax activation function
    outputs = tf.keras.activations.softmax(inputs)
    print(f'output of softmax function: {outputs.numpy()}')
    # Get the sum of all values after the softmax
    sum = tf.reduce_sum(outputs)
    print(f'sum of outputs: {sum}')
    # Get the index with highest value
    prediction = np.argmax(outputs)
    print(f'class with highest probability: {prediction}')
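
For reference, the scaling that softmax performs can be written out by hand. This is a minimal NumPy sketch of the standard formula (exponentiate, then divide by the sum); subtracting the maximum first is a common numerical-stability trick and does not change the result. The function name `softmax` here is my own helper, not the Keras one.

```python
import numpy as np

def softmax(values):
    """Scale a list of values so they are positive and sum to 1."""
    values = np.asarray(values, dtype=float)
    shifted = values - np.max(values)  # avoids overflow in exp
    exps = np.exp(shifted)
    return exps / np.sum(exps)

probs = softmax([1.0, 3.0, 4.0, 2.0])
print(f'probabilities: {probs}')
print(f'sum: {probs.sum()}')
print(f'index of largest value: {np.argmax(probs)}')
```

The largest input (4.0, at index 2) gets the largest probability, so the ordering of the inputs is preserved.
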
  • Predicting - model.predict returns, for each image, the probabilities of all the labels. We choose the largest one as our classification; its value indicates how confident the model is about that classification

    classifications = model.predict(test_images)
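
To turn those probabilities into a readable answer, something like the following sketch could be used. The `classifications` array below is a made-up example of two rows of model output, and `CLASS_NAMES` is the standard Fashion MNIST label order (index 4 is a coat, index 5 is a sandal).

```python
import numpy as np

# Fashion MNIST label names, indexed by class id
CLASS_NAMES = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Hypothetical model output for two images (each row sums to 1)
classifications = np.array([
    [0.01, 0.01, 0.05, 0.02, 0.80, 0.01, 0.07, 0.01, 0.01, 0.01],
    [0.02, 0.01, 0.01, 0.01, 0.01, 0.90, 0.01, 0.01, 0.01, 0.01],
])

# Pick the most probable class for each row, and how confident the model is
predicted = np.argmax(classifications, axis=1)
confidence = np.max(classifications, axis=1)

for class_id, conf in zip(predicted, confidence):
    print(f'{CLASS_NAMES[class_id]} (confidence {conf:.2f})')
```
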
  • Callback - stops the training of the model when a certain criterion is met

    class myCallback(tf.keras.callbacks.Callback):
      def on_epoch_end(self, epoch, logs=None):
        # logs can be None, so fall back to an empty dict
        logs = logs or {}
        if logs.get('accuracy', 0) >= 0.6: # Experiment with changing this value
          print("\nReached 60% accuracy so cancelling training!")
          self.model.stop_training = True
    callbacks = myCallback()
    fmnist = tf.keras.datasets.fashion_mnist
    (training_images, training_labels), (test_images, test_labels) = fmnist.load_data()
    # Normalize the pixel values
    training_images, test_images = training_images / 255.0, test_images / 255.0
    model = tf.keras.models.Sequential([
      # Flatten the 28x28 images before the Dense layers
      tf.keras.layers.Flatten(),
      tf.keras.layers.Dense(512, activation=tf.nn.relu),
      tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    # Train with the callback so training can stop early
    model.fit(training_images, training_labels, epochs=5, callbacks=[callbacks])
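
The idea behind the callback is just "check a condition after every epoch and stop if it holds". A framework-free simulation of that control flow might look like this; `train_with_early_stop` and the accuracy values are hypothetical and only illustrate the logic, no model is trained.

```python
def train_with_early_stop(epoch_accuracies, threshold=0.6):
    """Simulate a training loop that stops once accuracy reaches the threshold.

    epoch_accuracies: hypothetical accuracy reported at the end of each epoch.
    Returns the number of epochs that actually ran.
    """
    epochs_run = 0
    for accuracy in epoch_accuracies:
        epochs_run += 1
        if accuracy >= threshold:
            # Same role as setting self.model.stop_training = True
            break
    return epochs_run

# Made-up accuracies for five planned epochs
simulated = [0.42, 0.55, 0.63, 0.70, 0.74]
print(train_with_early_stop(simulated))
```
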

Week 3: CNN

  • Convolution - passing multiple filter windows over the image to produce convolved outputs. The idea is that some filters will highlight certain features of the image (vertical lines, edges)
  • Pooling - a way to compress an image (each 2x2 block becomes 1 pixel), reducing its size while keeping most of the information intact
  • During training, the network tries many different filters and learns which ones work best. The information gets filtered as a result, so the later layers are trained on the useful subset of information instead of the whole image
  • Loading dataset

    import tensorflow as tf
    # Load the Fashion MNIST dataset
    fmnist = tf.keras.datasets.fashion_mnist
    (training_images, training_labels), (test_images, test_labels) = fmnist.load_data()
    # Normalize the pixel values
    training_images = training_images / 255.0
    test_images = test_images / 255.0
  • Preprocessing the data - image data is commonly represented as 3-dimensional arrays (not counting the batch dimension), where the third dimension holds the colour channels (for example RGB values). Fashion MNIST images are greyscale, so we add a single channel dimension. Since the data is a numpy.ndarray, we can use functions like expand_dims and divide

    # Reshape the images to add an extra (channel) dimension
    images = np.expand_dims(images, axis=-1)
    # Normalize pixel values
    images = np.divide(images, 255.0)
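
A tiny check of what the extra dimension looks like, using a hypothetical batch of five blank 28x28 images in place of the real data:

```python
import numpy as np

# Hypothetical batch of five 28x28 greyscale images
images = np.zeros((5, 28, 28))

# Add a trailing channel dimension: (batch, height, width) -> (batch, height, width, 1)
with_channel = np.expand_dims(images, axis=-1)

# reshape gives the same result when the target shape is spelled out
reshaped = images.reshape(5, 28, 28, 1)

print(images.shape, '->', with_channel.shape)
```
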
  • Model - in the Conv2D layer the arguments are (number of filters, size of the window, activation function, shape of the input). In the MaxPooling2D layer the argument is the size of the window; it is called “max” because we take the largest value of the pixels in each window

    # Define the model
    model = tf.keras.models.Sequential([
      # Add convolutions and max pooling
      tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
      tf.keras.layers.MaxPooling2D(2, 2),
      tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
      tf.keras.layers.MaxPooling2D(2, 2),
      # Add the same layers as before
      tf.keras.layers.Flatten(),
      tf.keras.layers.Dense(128, activation='relu'),
      tf.keras.layers.Dense(10, activation='softmax')
    ])
    # Print the model summary
    model.summary()
    # Use same settings
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    # Train the model
    print('\nMODEL TRAINING:')
    model.fit(training_images, training_labels, epochs=5)
    # Evaluate on the test set
    print('\nMODEL EVALUATION:')
    test_loss = model.evaluate(test_images, test_labels)
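
To see what a single filter and a single pooling step do, here is a hand-rolled NumPy sketch on a tiny 6x6 image. It slides the window without flipping the kernel, which is also how Conv2D applies its filters. The helper names, the toy image and the vertical-edge filter are all my own illustrative choices; in a real network the filter values are learned.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image; no padding, so the output shrinks."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            window = image[row:row + kh, col:col + kw]
            output[row, col] = np.sum(window * kernel)
    return output

def max_pool_2x2(feature_map):
    """Keep the largest value of every non-overlapping 2x2 block."""
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2  # drop an odd trailing row/column
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# Toy 6x6 image: dark left half, bright right half
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked filter that responds to dark-to-bright vertical edges
vertical_edge = np.array([[-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0]])

features = np.maximum(convolve2d_valid(image, vertical_edge), 0)  # conv + ReLU
pooled = max_pool_2x2(features)

print(features)  # 4x4: strong response only around the edge
print(pooled)    # 2x2: half the size, edge response kept
```
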
  • Summary of the model - the first convolution outputs 26x26 instead of 28x28 because the 3x3 window cannot be centred on the edge pixels, and each 2x2 pooling halves the dimensions
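
The size arithmetic can be checked with a few lines of plain Python. This sketch assumes 3x3 convolutions without padding and 2x2 pooling, stacked as conv, pool, conv, pool; the helper names are mine.

```python
def conv_output_size(size, kernel=3):
    """A k x k window without padding loses (k - 1) pixels per dimension."""
    return size - kernel + 1

def pool_output_size(size, pool=2):
    """Pooling with a p x p window divides the size by p (rounding down)."""
    return size // pool

size = 28
sizes = [size]
for layer in ['conv', 'pool', 'conv', 'pool']:
    size = conv_output_size(size) if layer == 'conv' else pool_output_size(size)
    sizes.append(size)

print(sizes)  # [28, 26, 13, 11, 5]
```
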

  • Code to visualise convolutions

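    import matplotlib.pyplot as plt
    from tensorflow.keras import models
    # Example values: indices of three test images and the filter to display
    FIRST_IMAGE = 0
    SECOND_IMAGE = 1
    THIRD_IMAGE = 2
    CONVOLUTION_NUMBER = 1
    f, axarr = plt.subplots(3,4)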
    layer_outputs = [layer.output for layer in model.layers]
    activation_model = tf.keras.models.Model(inputs = model.input, outputs = layer_outputs)
    for x in range(0,4):
      f1 = activation_model.predict(test_images[FIRST_IMAGE].reshape(1, 28, 28, 1))[x]
      axarr[0,x].imshow(f1[0, : , :, CONVOLUTION_NUMBER], cmap='inferno')
      f2 = activation_model.predict(test_images[SECOND_IMAGE].reshape(1, 28, 28, 1))[x]
      axarr[1,x].imshow(f2[0, : , :, CONVOLUTION_NUMBER], cmap='inferno')
      f3 = activation_model.predict(test_images[THIRD_IMAGE].reshape(1, 28, 28, 1))[x]
      axarr[2,x].imshow(f3[0, : , :, CONVOLUTION_NUMBER], cmap='inferno')

Week 4: using real world images

  • Downloading dataset and unzipping it

    import zipfile
    # Unzip the dataset
    local_zip = './'
    zip_ref = zipfile.ZipFile(local_zip, 'r')
    zip_ref.extractall('./horse-or-human')
    zip_ref.close()
    # Download the validation set

    # Unzip validation set
    local_zip = './'
    zip_ref = zipfile.ZipFile(local_zip, 'r')
    zip_ref.extractall('./validation-horse-or-human')
    zip_ref.close()
  • Extracting the data with paths

    import os
    # Directory with our training horse pictures
    train_horse_dir = os.path.join('./horse-or-human/horses')
    # Directory with our training human pictures
    train_human_dir = os.path.join('./horse-or-human/humans')
    # To view the images
    train_horse_names = os.listdir(train_horse_dir)
    train_human_names = os.listdir(train_human_dir)
    # To print the length of the data
    print('total training horse images:', len(os.listdir(train_horse_dir)))
    print('total training human images:', len(os.listdir(train_human_dir)))
    # Directory with validation horse pictures
    validation_horse_dir = os.path.join('./validation-horse-or-human/horses')
    # Directory with validation human pictures
    validation_human_dir = os.path.join('./validation-horse-or-human/humans')

    validation_horse_names = os.listdir(validation_horse_dir)
    print(f'VAL SET HORSES: {validation_horse_names[:10]}')

    validation_human_names = os.listdir(validation_human_dir)
    print(f'VAL SET HUMANS: {validation_human_names[:10]}')

    print(f'total validation horse images: {len(os.listdir(validation_horse_dir))}')
    print(f'total validation human images: {len(os.listdir(validation_human_dir))}')
  • To visualise the dataset

    %matplotlib inline
    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg
    # Parameters for our graph; we'll output images in a 4x4 configuration
    nrows = 4
    ncols = 4
    # Index for iterating over images
    pic_index = 0
    # Set up matplotlib fig, and size it to fit 4x4 pics
    fig = plt.gcf()
    fig.set_size_inches(ncols * 4, nrows * 4)
    pic_index += 8
    next_horse_pix = [os.path.join(train_horse_dir, fname) 
                    for fname in train_horse_names[pic_index-8:pic_index]]
    next_human_pix = [os.path.join(train_human_dir, fname) 
                    for fname in train_human_names[pic_index-8:pic_index]]
    for i, img_path in enumerate(next_horse_pix+next_human_pix):
      # Set up subplot; subplot indices start at 1
      sp = plt.subplot(nrows, ncols, i + 1)
      sp.axis('Off') # Don't show axes (or gridlines)
      img = mpimg.imread(img_path)
      plt.imshow(img)
    plt.show()
  • Model - we use sigmoid as the activation function of the output layer because it's a binary classification

    model = tf.keras.models.Sequential([
        # Note the input shape is the desired size of the image 300x300 with 3 bytes color
        # This is the first convolution
        tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(300, 300, 3)),
        tf.keras.layers.MaxPooling2D(2, 2),
        # The second convolution
        tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2, 2),
        # The third convolution
        tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2, 2),
        # The fourth convolution
        tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2, 2),
        # The fifth convolution
        tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2, 2),
        # Flatten the results to feed into a DNN
        tf.keras.layers.Flatten(),
        # 512 neuron hidden layer
        tf.keras.layers.Dense(512, activation='relu'),
        # Only 1 output neuron. It will contain a value from 0-1 where 0 for 1 class ('horses') and 1 for the other ('humans')
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
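
To see why a single sigmoid output is enough for two classes, here is a minimal NumPy sketch of the sigmoid function, the 0.5 decision threshold, and the binary cross-entropy loss that pairs with it. The `logits` and `labels` values are made up for illustration, and the helper names are mine rather than the Keras implementations.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; clipping avoids log(0)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Hypothetical raw outputs of the last neuron for three images
logits = np.array([-3.0, 0.0, 2.5])
probs = sigmoid(logits)

# Values above 0.5 are read as class 1 ('humans'), the rest as class 0 ('horses')
predicted = (probs > 0.5).astype(int)
print(probs, predicted)

# The loss is small for confident correct guesses and large for confident wrong ones
labels = np.array([0.0, 1.0])
good = binary_crossentropy(labels, np.array([0.1, 0.9]))
bad = binary_crossentropy(labels, np.array([0.9, 0.1]))
print(f'good guess loss: {good:.3f}, bad guess loss: {bad:.3f}')
```
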
  • Model summary

  • Running the model - we use binary_crossentropy loss because it's a binary classification problem and the final activation is a sigmoid. We will use the RMSprop optimizer with a learning rate of 0.001.
  • In this case, using the RMSprop optimization algorithm is preferable to stochastic gradient descent (SGD), because RMSprop automates learning-rate tuning for us. (Other optimizers, such as Adam and Adagrad, also automatically adapt the learning rate during training, and would work equally well here.)
  • We use ImageDataGenerator to automatically label the images based on their directories and subdirectories

    from tensorflow.keras.optimizers import RMSprop
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    # Compile with binary_crossentropy loss and RMSprop at a learning rate of 0.001
    model.compile(loss='binary_crossentropy',
                  optimizer=RMSprop(learning_rate=0.001),
                  metrics=['accuracy'])
    # All images will be rescaled by 1./255
    train_datagen = ImageDataGenerator(rescale=1/255)
    validation_datagen = ImageDataGenerator(rescale=1/255)
    # Flow training images in batches of 128 using train_datagen generator
    train_generator = train_datagen.flow_from_directory(
            './horse-or-human/',  # This is the source directory for training images
            target_size=(300, 300),  # All images will be resized to 300x300
            batch_size=128,
            # Since we use binary_crossentropy loss, we need binary labels
            # Should be one of "binary", "categorical" or "sparse"
            class_mode='binary')
    # Flow validation images in batches of 128 using validation_datagen generator
    validation_generator = validation_datagen.flow_from_directory(
            './validation-horse-or-human/',  # This is the source directory for validation images
            target_size=(300, 300),  # All images will be resized to 300x300
            batch_size=128,
            # Since you use binary_crossentropy loss, you need binary labels
            class_mode='binary')
    # Train the model (epochs=15 is an example value)
    history = model.fit(
          train_generator,
          epochs=15,
          validation_data = validation_generator)
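
The "labels from directories" idea can be sketched with the standard library alone. As far as I know, flow_from_directory numbers the classes by sorting the subdirectory names, so 'horses' becomes 0 and 'humans' becomes 1; the sketch below mimics only that labelling step on a throwaway temporary folder with empty placeholder files. `label_images_by_directory` is a hypothetical helper, and no image data is actually loaded.

```python
import os
import tempfile

def label_images_by_directory(root):
    """Assign an integer label to every file based on its parent subdirectory."""
    class_names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    class_indices = {name: index for index, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        class_dir = os.path.join(root, name)
        for fname in sorted(os.listdir(class_dir)):
            samples.append((os.path.join(class_dir, fname), class_indices[name]))
    return class_indices, samples

# Build a throwaway directory tree with empty placeholder files
with tempfile.TemporaryDirectory() as root:
    layout = {'humans': ['a.png'], 'horses': ['b.png', 'c.png']}
    for class_name, fnames in layout.items():
        os.makedirs(os.path.join(root, class_name))
        for fname in fnames:
            open(os.path.join(root, class_name, fname), 'w').close()
    class_indices, samples = label_images_by_directory(root)

print(class_indices)
print([label for _, label in samples])
```
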
  • Model prediction

    import numpy as np
    from google.colab import files
    from tensorflow.keras.utils import load_img, img_to_array
    uploaded = files.upload()
    for fn in uploaded.keys():
      # predicting images
      path = '/content/' + fn
      img = load_img(path, target_size=(300, 300))
      x = img_to_array(img)
      x /= 255
      x = np.expand_dims(x, axis=0)
      images = np.vstack([x])
      classes = model.predict(images, batch_size=10)
      if classes[0]>0.5:
        print(fn + " is a human")
      else:
        print(fn + " is a horse")
  • Visualising intermediate steps in the model

    import numpy as np
    import random
    from tensorflow.keras.utils import img_to_array, load_img
    # Define a new Model that will take an image as input, and will output
    # intermediate representations for all layers in the previous model after
    # the first.
    successive_outputs = [layer.output for layer in model.layers[1:]]
    visualization_model = tf.keras.models.Model(inputs = model.input, outputs = successive_outputs)
    # Prepare a random input image from the training set.
    horse_img_files = [os.path.join(train_horse_dir, f) for f in train_horse_names]
    human_img_files = [os.path.join(train_human_dir, f) for f in train_human_names]
    img_path = random.choice(horse_img_files + human_img_files)
    img = load_img(img_path, target_size=(300, 300))  # this is a PIL image
    x = img_to_array(img)  # Numpy array with shape (300, 300, 3)
    x = x.reshape((1,) + x.shape)  # Numpy array with shape (1, 300, 300, 3)
    # Scale by 1/255
    x /= 255
    # Run the image through the network, thus obtaining all
    # intermediate representations for this image.
    successive_feature_maps = visualization_model.predict(x)
    # These are the names of the layers, so you can have them as part of the plot
    layer_names = [layer.name for layer in model.layers[1:]]
    # Display the representations
    for layer_name, feature_map in zip(layer_names, successive_feature_maps):
      if len(feature_map.shape) == 4:
        # Just do this for the conv / maxpool layers, not the fully-connected layers
        n_features = feature_map.shape[-1]  # number of features in feature map
        # The feature map has shape (1, size, size, n_features)
        size = feature_map.shape[1]
        # Tile the images in this matrix
        display_grid = np.zeros((size, size * n_features))
        for i in range(n_features):
          x = feature_map[0, :, :, i]
          x -= x.mean()
          x /= x.std()
          x *= 64
          x += 128
          x = np.clip(x, 0, 255).astype('uint8')
          # Tile each filter into this big horizontal grid
          display_grid[:, i * size : (i + 1) * size] = x
        # Display the grid
        scale = 20. / n_features
        plt.figure(figsize=(scale * n_features, scale))
        plt.title(layer_name)
        plt.grid(False)
        plt.imshow(display_grid, aspect='auto', cmap='viridis')
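
The per-filter rescaling in the loop above (centre, divide by the standard deviation, stretch to the 0-255 range) can be isolated into a small helper. This is a sketch with one addition of my own: it skips the division when the standard deviation is zero, which can happen when a filter outputs the same value everywhere. The name `to_displayable` and the test arrays are hypothetical.

```python
import numpy as np

def to_displayable(feature):
    """Rescale one feature map into uint8 pixel values for plotting."""
    feature = feature.astype(np.float64)  # astype returns a copy, input is untouched
    feature -= feature.mean()
    std = feature.std()
    if std > 0:                           # guard against a constant feature map
        feature /= std
    feature = feature * 64 + 128
    return np.clip(feature, 0, 255).astype('uint8')

rng = np.random.default_rng(seed=0)
random_map = rng.normal(size=(8, 8))      # stand-in for a real feature map
flat_map = np.zeros((8, 8))               # a filter that did not activate at all

print(to_displayable(random_map).dtype)
print(np.unique(to_displayable(flat_map)))
```
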

  • Extra stuff

    # To load an image and convert it into NumPy format
    import os
    import numpy as np
    from tensorflow.keras.utils import load_img, img_to_array
    # Load the first example of a happy face (happy_dir is the directory holding the happy-face images)
    sample_image = load_img(os.path.join(happy_dir, os.listdir(happy_dir)[0]))
    # Convert the image into its numpy array representation
    sample_array = img_to_array(sample_image)
    print(f"Each image has shape: {sample_array.shape}")
    print(f"The maximum pixel value used is: {np.max(sample_array)}")
