Saaransh Gupta

Posted on Aug 25

ResNet vs EfficientNet vs VGG vs NasNet

#machinelearning #keras #python #deeplearning

As a student, I've witnessed firsthand the frustration caused by our university's inefficient lost and found system. The current process, reliant on individual emails for each found item, often leads to delays and missed connections between lost belongings and their owners.

Driven by a desire to improve this experience for myself and my fellow students, I've embarked on a project to explore the potential of deep learning in revolutionizing our lost and found system. In this blog post, I'll share my journey of evaluating pretrained models - ResNet, EfficientNet, VGG, and NasNet - to automate the identification and categorization of lost items.

Through a comparative analysis, I aim to pinpoint the most suitable model for integrating into our system, ultimately creating a faster, more accurate, and user-friendly lost and found experience for everyone on campus.

ResNet (Inception-ResNet V2)

Inception-ResNet V2 is a powerful convolutional neural network architecture available in Keras, combining the Inception architecture's strengths with residual connections from ResNet. This hybrid model aims to achieve high accuracy in image classification tasks while maintaining computational efficiency.

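Training Dataset: ImageNet
Input size: 299 x 299

Preprocessing function

import numpy as np
from tensorflow.keras.utils import load_img, img_to_array
# Assumed import; the aliases match the names used in this post
from tensorflow.keras.applications.inception_resnet_v2 import InceptionResNetV2, preprocess_input as preprocess_input_resnet, decode_predictions as decode_predictions_resnet

def readyForResNet(fileName):
    pic = load_img(fileName, target_size=(299, 299))
    pic_array = img_to_array(pic)
    # Add a batch dimension: (299, 299, 3) -> (1, 299, 299, 3)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_resnet(expanded)

Predicting

# Assumed setup: Keras model with pretrained ImageNet weights
inception_model_resnet = InceptionResNetV2(weights="imagenet")
data1 = readyForResNet(test_file)  # test_file: path to the image to classify
prediction = inception_model_resnet.predict(data1)
res1 = decode_predictions_resnet(prediction, top=2)  # two most likely classes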
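For reference, Keras `decode_predictions` returns one list per input image, and each entry in that list is a `(class_id, class_name, score)` tuple. Below is a minimal sketch of turning that structure into readable labels. The helper `top_labels`, the `min_score` cutoff, and the sample values are all hypothetical and are not results from this post.

```python
def top_labels(decoded, min_score=0.0):
    """Return (class_name, score) pairs for the first image of a decode_predictions result."""
    first_image = decoded[0]  # one list of tuples per image in the batch
    return [
        (name, float(score))
        for _class_id, name, score in first_image
        if score >= min_score
    ]


# Hypothetical value shaped like decode_predictions(prediction, top=2);
# the class ids are placeholders, not real ImageNet ids.
sample = [[("n00000001", "umbrella", 0.91), ("n00000002", "backpack", 0.02)]]
print(top_labels(sample, min_score=0.05))
```

A cutoff like `min_score` could help a lost and found system avoid showing low-confidence guesses, though the right threshold would have to be found by experiment.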

VGG (Visual Geometry Group)

VGG (Visual Geometry Group) is a family of deep convolutional neural network architectures known for their simplicity and effectiveness in image classification tasks. These models, particularly VGG16 and VGG19, gained popularity due to their strong performance in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2014.

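Training Dataset: ImageNet
Input size: 224 x 224

Preprocessing function

import numpy as np
from tensorflow.keras.utils import load_img, img_to_array
# Assumed import; the aliases match the names used in this post
from tensorflow.keras.applications.vgg19 import VGG19, preprocess_input as preprocess_input_vgg19, decode_predictions as decode_predictions_vgg19

def readyForVGG(fileName):
    pic = load_img(fileName, target_size=(224, 224))
    pic_array = img_to_array(pic)
    # Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_vgg19(expanded)

Predicting

# Assumed setup: Keras model with pretrained ImageNet weights
inception_model_vgg19 = VGG19(weights="imagenet")
data2 = readyForVGG(test_file)  # test_file: path to the image to classify
prediction = inception_model_vgg19.predict(data2)
res2 = decode_predictions_vgg19(prediction, top=2)  # two most likely classes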

EfficientNet

EfficientNet is a family of convolutional neural network architectures that achieve state-of-the-art accuracy on image classification tasks while being significantly smaller and faster than previous models. This efficiency is achieved through a novel compound scaling method that balances network depth, width, and resolution.
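To make compound scaling concrete, here is a small sketch that uses the coefficients reported in the EfficientNet paper for the B0 baseline (alpha = 1.2 for depth, beta = 1.1 for width, gamma = 1.15 for resolution). The function names are my own, and the released B1 to B7 variants may not follow these exact multipliers, so treat this as an illustration of the idea only.

```python
# Coefficients reported in the EfficientNet paper for the B0 baseline
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_factor(phi):
    """Approximate growth in FLOPs: (alpha * beta^2 * gamma^2) ** phi."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


for phi in range(4):
    depth, width, resolution = compound_scale(phi)
    print(
        f"phi={phi}: depth x{depth:.2f}, width x{width:.2f}, "
        f"resolution x{resolution:.2f}, FLOPs x{flops_factor(phi):.2f}"
    )
```

Because alpha * beta^2 * gamma^2 is close to 2, each step of phi roughly doubles the compute cost, which is why the larger variants are so much slower than B0.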

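Training Dataset: ImageNet
Input size: 480 x 480

Preprocessing function

import numpy as np
from tensorflow.keras.utils import load_img, img_to_array
# preprocess_input_EF, decode_predictions_EF and inception_model_EF are assumed to come
# from the EfficientNet variant in tensorflow.keras.applications that accepts 480 x 480 input

def readyForEF(fileName):
    pic = load_img(fileName, target_size=(480, 480))
    pic_array = img_to_array(pic)
    # Add a batch dimension: (480, 480, 3) -> (1, 480, 480, 3)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_EF(expanded)

Predicting

data3 = readyForEF(test_file)  # test_file: path to the image to classify
prediction = inception_model_EF.predict(data3)
res3 = decode_predictions_EF(prediction, top=2)  # two most likely classes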

NasNet

NasNet (Neural Architecture Search Network) represents a groundbreaking approach in deep learning where the architecture of the neural network itself is discovered through an automated search process. This search process aims to find the optimal combination of layers and connections to achieve high performance on a given task.

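Training Dataset: ImageNet
Input size: 224 x 224

Preprocessing function

import numpy as np
from tensorflow.keras.utils import load_img, img_to_array
# Assumed import: NASNetMobile, whose ImageNet weights expect 224 x 224 input (NASNetLarge expects 331 x 331)
from tensorflow.keras.applications.nasnet import NASNetMobile, preprocess_input as preprocess_input_NN, decode_predictions as decode_predictions_NN

def readyForNN(fileName):
    pic = load_img(fileName, target_size=(224, 224))
    pic_array = img_to_array(pic)
    # Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_NN(expanded)

Predicting

# Assumed setup: Keras model with pretrained ImageNet weights
inception_model_NN = NASNetMobile(weights="imagenet")
data4 = readyForNN(test_file)  # test_file: path to the image to classify
prediction = inception_model_NN.predict(data4)
res4 = decode_predictions_NN(prediction, top=2)  # two most likely classes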
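Since all four models are pretrained on ImageNet, their decoded predictions use the same class names and can be compared directly. The sketch below counts how many models agree on the top label for one image. The helper `top1_votes` and every value in `results` are hypothetical stand-ins for `res1` to `res4`, not output from my runs.

```python
from collections import Counter


def top1_votes(results):
    """Count how many models picked each class name as their top prediction."""
    # decoded[0] is the first image, [0] its best guess, [1] the class name
    return Counter(decoded[0][0][1] for decoded in results.values())


# Hypothetical values shaped like res1..res4; class ids are placeholders
results = {
    "InceptionResNetV2": [[("n00000001", "umbrella", 0.91), ("n00000002", "backpack", 0.03)]],
    "VGG19": [[("n00000001", "umbrella", 0.62), ("n00000002", "backpack", 0.20)]],
    "EfficientNet": [[("n00000001", "umbrella", 0.88), ("n00000002", "backpack", 0.04)]],
    "NasNet": [[("n00000002", "backpack", 0.48), ("n00000001", "umbrella", 0.40)]],
}
votes = top1_votes(results)
print(votes.most_common(1))
```

Disagreement between models on the same photo is a cheap signal that an item may need a human to look at it.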

Showdown

Accuracy

The table summarizes the reported accuracy scores of the models above. EfficientNet B7 leads with the highest accuracy, followed closely by NasNet-Large and Inception-ResNet V2, while the VGG models score lower. For my application, I want a model that balances processing time and accuracy.
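Published accuracy numbers come from ImageNet benchmarks, so it can be worth checking them against a handful of labeled photos of real lost items. Here is a minimal sketch of a top-k accuracy check over decoded predictions. The function `top_k_accuracy` and the sample data are hypothetical, and it assumes the true labels are written as ImageNet class names.

```python
def top_k_accuracy(decoded_batch, true_labels, k=2):
    """Fraction of images whose true label is among the top-k predicted class names."""
    if len(decoded_batch) != len(true_labels):
        raise ValueError("need exactly one true label per image")
    if not true_labels:
        return 0.0
    hits = 0
    for decoded, truth in zip(decoded_batch, true_labels):
        names = [name for _class_id, name, _score in decoded[:k]]
        hits += truth in names
    return hits / len(true_labels)


# Hypothetical decoded predictions, one inner list per image
decoded_batch = [
    [("id", "umbrella", 0.90), ("id", "backpack", 0.05)],
    [("id", "wallet", 0.60), ("id", "purse", 0.30)],
    [("id", "sunglasses", 0.50), ("id", "sunglass", 0.40)],
    [("id", "water_bottle", 0.70), ("id", "pop_bottle", 0.20)],
]
true_labels = ["umbrella", "purse", "laptop", "water_bottle"]

print(top_k_accuracy(decoded_batch, true_labels, k=1))
print(top_k_accuracy(decoded_batch, true_labels, k=2))
```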

Time

As we can see, EfficientNetB0 gives the fastest results, but InceptionResNetV2 is the better package once accuracy is taken into account.
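For anyone who wants to reproduce a timing comparison like this, below is one possible way to measure average prediction time. It is a sketch, not the exact method behind my numbers: `average_latency` is a hypothetical helper, and `fake_predict` is a stand-in for a real `model.predict` call so the example runs without any model loaded.

```python
import time


def average_latency(predict_fn, sample, runs=5, warmup=1):
    """Average wall-clock seconds per call of predict_fn(sample)."""
    for _ in range(warmup):
        predict_fn(sample)  # the first call often includes one-time setup cost
    start = time.perf_counter()
    for _ in range(runs):
        predict_fn(sample)
    return (time.perf_counter() - start) / runs


def fake_predict(batch):
    """Stand-in for model.predict; sleeps briefly to simulate work."""
    time.sleep(0.01)
    return batch


print(f"{average_latency(fake_predict, [0]):.4f} s per call")
```

With a real model, `predict_fn` would be something like `inception_model_resnet.predict` and `sample` the preprocessed array, so every model is timed on the same image.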

Summary

For my smart lost and found system, I decided to go with InceptionResNetV2. While EfficientNet B7 looked tempting with its top-notch accuracy, I was concerned about its computational demands. In a university setting, where resources might be limited and real-time performance is often desirable, I felt it was important to strike a balance between accuracy and efficiency. InceptionResNetV2 seemed like the perfect fit - it offers strong performance without being overly computationally intensive.

Plus, the fact that it's pretrained on ImageNet gives me confidence that it can handle the diverse range of objects people might lose. And let's not forget how easy it is to work with in Keras! That definitely made my decision easier.

Overall, I believe InceptionResNetV2 provides the right mix of accuracy, efficiency, and practicality for my project. I'm excited to see how it performs in helping reunite lost items with their owners!
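The trade-off described in this summary can be written down as a simple rule: pick the most accurate model that still fits a latency budget. The sketch below shows that rule. The function `pick_model`, the model names, and all numbers are placeholders I made up for illustration, not measurements from this post.

```python
def pick_model(candidates, max_latency_s):
    """Return the name of the most accurate candidate within the latency budget, or None."""
    within = {
        name: stats
        for name, stats in candidates.items()
        if stats["latency_s"] <= max_latency_s
    }
    if not within:
        return None
    return max(within, key=lambda name: within[name]["accuracy"])


# Placeholder values only; not measurements from this post
candidates = {
    "model_a": {"accuracy": 0.70, "latency_s": 0.05},
    "model_b": {"accuracy": 0.80, "latency_s": 0.20},
    "model_c": {"accuracy": 0.84, "latency_s": 0.90},
}
print(pick_model(candidates, max_latency_s=0.5))
```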
