DEV Community

Saaransh Gupta

ResNet vs EfficientNet vs VGG vs NasNet

As a student, I've witnessed firsthand the frustration caused by our university's inefficient lost and found system. The current process, reliant on individual emails for each found item, often leads to delays and missed connections between lost belongings and their owners.

Driven by a desire to improve this experience for myself and my fellow students, I've embarked on a project to explore the potential of deep learning in revolutionizing our lost and found system. In this blog post, I'll share my journey of evaluating pretrained models - ResNet, EfficientNet, VGG, and NasNet - to automate the identification and categorization of lost items.

Through a comparative analysis, I aim to pinpoint the most suitable model for integrating into our system, ultimately creating a faster, more accurate, and user-friendly lost and found experience for everyone on campus.

ResNet

Inception-ResNet V2 is a powerful convolutional neural network architecture available in Keras, combining the Inception architecture's strengths with residual connections from ResNet. This hybrid model aims to achieve high accuracy in image classification tasks while maintaining computational efficiency.

Training Dataset: ImageNet
Image Format: 299 x 299

Preprocessing function

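import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array

def readyForResNet(fileName):
    # Load the image, resized to the 299 x 299 input the model expects
    pic = load_img(fileName, target_size=(299, 299))
    pic_array = img_to_array(pic)
    # Add a batch axis: (299, 299, 3) -> (1, 299, 299, 3)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_resnet(expanded)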
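
For context on the last line of that function: the post does not show where `preprocess_input_resnet` comes from, but assuming it is an alias for `tensorflow.keras.applications.inception_resnet_v2.preprocess_input`, it rescales pixel values from [0, 255] to [-1, 1]. Here is a minimal pure-Python sketch of that arithmetic. The name `scale_to_unit_range` is hypothetical and only for illustration.

```python
def scale_to_unit_range(pixel_value):
    """Map a pixel value in [0, 255] to [-1, 1] (illustrative helper, not a Keras API)."""
    return pixel_value / 127.5 - 1.0


# Black, mid-gray, and white pixel values
for value in (0, 127.5, 255):
    print(value, "->", scale_to_unit_range(value))
```

The real function applies the same operation to the whole image array at once.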

Predicting

# Preprocess the test image, run the model, and keep the two most likely ImageNet labels
data1 = readyForResNet(test_file)
prediction = inception_model_resnet.predict(data1)
res1 = decode_predictions_resnet(prediction, top=2)
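
If `decode_predictions_resnet` is the Keras `decode_predictions` function (an assumption, since the import is not shown), `res1` holds one list per image, and each entry is a `(wordnet_id, label, score)` tuple. The sketch below shows one way to pull out only the labels the model is reasonably sure about. The helper name `confident_labels`, the 0.5 threshold, and the example values are all made up for illustration.

```python
def confident_labels(decoded, min_score=0.5):
    """Return (label, score) pairs at or above min_score for the first image.

    `decoded` is expected to look like the output of Keras' decode_predictions:
    one list per image, each holding (wordnet_id, label, score) tuples.
    """
    first_image = decoded[0]
    return [(label, float(score)) for _, label, score in first_image if score >= min_score]


# Made-up data in the same shape as res1 (not real model output)
example = [[("n00000001", "backpack", 0.82), ("n00000002", "purse", 0.07)]]
print(confident_labels(example))  # [('backpack', 0.82)]
```

For a lost and found system, a threshold like this could decide whether an item is tagged automatically or left for a human to label.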

VGG (Visual Geometry Group)

VGG (Visual Geometry Group) is a family of deep convolutional neural network architectures known for their simplicity and effectiveness in image classification tasks. These models, particularly VGG16 and VGG19, gained popularity due to their strong performance in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2014.

Training Dataset: ImageNet
Image Format: 224 x 224

Preprocessing function

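def readyForVGG(fileName):
    # Load the image, resized to the 224 x 224 input VGG expects
    pic = load_img(fileName, target_size=(224, 224))
    pic_array = img_to_array(pic)
    # Add a batch axis: (224, 224, 3) -> (1, 224, 224, 3)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_vgg19(expanded)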
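
VGG preprocessing is not the same as the Inception-style one. Assuming `preprocess_input_vgg19` is an alias for `tensorflow.keras.applications.vgg19.preprocess_input`, it converts RGB to BGR and subtracts the ImageNet channel means, without rescaling. Here is a rough per-pixel sketch of that step. The helper `vgg_style_pixel` is hypothetical, and the real function works on whole arrays.

```python
# Per-channel ImageNet means in BGR order, as used by Keras' "caffe" preprocessing mode
IMAGENET_MEAN_BGR = (103.939, 116.779, 123.68)


def vgg_style_pixel(rgb):
    """Convert one (R, G, B) pixel to zero-centered (B, G, R) without rescaling."""
    r, g, b = rgb
    bgr = (b, g, r)
    return tuple(channel - mean for channel, mean in zip(bgr, IMAGENET_MEAN_BGR))


print(vgg_style_pixel((255.0, 0.0, 0.0)))  # a pure red pixel
```

This is why each model needs its own preprocessing function: feeding VGG an image scaled for another model would quietly hurt its predictions.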

Predicting

data2 = readyForVGG(test_file)
prediction = inception_model_vgg19.predict(data2)
res2 = decode_predictions_vgg19(prediction, top=2)

EfficientNet

EfficientNet is a family of convolutional neural network architectures that achieve state-of-the-art accuracy on image classification tasks while being significantly smaller and faster than previous models. This efficiency is achieved through a novel compound scaling method that balances network depth, width, and resolution.
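
To make compound scaling a bit more concrete, here is a small sketch of the idea as described in the EfficientNet paper: a single coefficient phi scales depth, width, and resolution together using fixed base values (1.2, 1.1, and 1.15 in the paper). The function names are my own, and the published B1 to B7 variants use rounded values, so treat this as an illustration rather than the exact recipe.

```python
# Base coefficients reported in the EfficientNet paper
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_factor(phi):
    """Approximate growth in compute: depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, compute x{flops_factor(phi):.2f}")
```

Each step up in phi roughly doubles the compute, which is why the larger EfficientNet variants get expensive quickly.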

Training Dataset: ImageNet
Image Format: 480 x 480

Preprocessing function

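def readyForEF(fileName):
    # Load the image, resized to the 480 x 480 input used here
    pic = load_img(fileName, target_size=(480, 480))
    pic_array = img_to_array(pic)
    # Add a batch axis: (480, 480, 3) -> (1, 480, 480, 3)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_EF(expanded)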

Predicting

data3 = readyForEF(test_file)
prediction = inception_model_EF.predict(data3)
res3 = decode_predictions_EF(prediction, top=2)

NasNet

NasNet (Neural Architecture Search Network) represents a groundbreaking approach in deep learning where the architecture of the neural network itself is discovered through an automated search process. This search process aims to find the optimal combination of layers and connections to achieve high performance on a given task.

Training Dataset: ImageNet
Image Format: 224 x 224

Preprocessing function

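def readyForNN(fileName):
    # Load the image, resized to the 224 x 224 input used here
    pic = load_img(fileName, target_size=(224, 224))
    pic_array = img_to_array(pic)
    # Add a batch axis: (224, 224, 3) -> (1, 224, 224, 3)
    expanded = np.expand_dims(pic_array, axis=0)
    return preprocess_input_NN(expanded)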

Predicting

data4 = readyForNN(test_file)
prediction = inception_model_NN.predict(data4)
res4 = decode_predictions_NN(prediction, top=2)

Showdown

Accuracy

Claimed Accuracies

The table summarizes the claimed accuracy scores of the models above. EfficientNet B7 leads with the highest accuracy, followed closely by NasNet-Large and Inception-ResNet V2. The VGG models have lower accuracies. For my application, I want a model that balances processing time and accuracy.
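
One way to make "balance" concrete is to set a latency budget and take the most accurate model that fits within it. This is only a sketch of that idea, not the exact procedure I followed. The function `pick_model` and its input format are hypothetical.

```python
def pick_model(candidates, max_latency_ms):
    """Return the name of the most accurate candidate within the latency budget.

    `candidates` maps model name -> (accuracy, latency_ms).
    Returns None if no candidate is fast enough.
    """
    within_budget = {
        name: stats for name, stats in candidates.items() if stats[1] <= max_latency_ms
    }
    if not within_budget:
        return None
    # Among the models that are fast enough, prefer the highest accuracy
    return max(within_budget, key=lambda name: within_budget[name][0])
```

To use it, fill `candidates` with the accuracy and per-image latency you measured for each model on your own hardware, then pass the longest wait you can accept.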

Time

Relative Times

As we can see, EfficientNetB0 gives the fastest results, but InceptionResNetV2 is the better package once accuracy is taken into account.
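
If you want to reproduce this kind of comparison, here is a minimal timing sketch. It assumes each model exposes a `predict`-style callable and times it with `time.perf_counter`. The helper names `average_seconds` and `relative_times` are hypothetical, and the numbers you get will depend heavily on your hardware.

```python
import time


def average_seconds(predict_fn, batch, runs=5):
    """Average wall-clock seconds per call of predict_fn(batch), after one warm-up call."""
    predict_fn(batch)  # warm-up: the first call often includes one-time setup cost
    start = time.perf_counter()
    for _ in range(runs):
        predict_fn(batch)
    return (time.perf_counter() - start) / runs


def relative_times(seconds_by_model):
    """Express each model's time as a multiple of the fastest one."""
    fastest = min(seconds_by_model.values())
    return {name: seconds / fastest for name, seconds in seconds_by_model.items()}
```

For example, `average_seconds(inception_model_resnet.predict, data1)` would time the first model from this post, and `relative_times` turns the collected averages into ratios where the fastest model is 1.0.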

Summary

For my smart lost and found system, I decided to go with InceptionResNetV2. While EfficientNet B7 looked tempting with its top-notch accuracy, I was concerned about its computational demands. In a university setting, where resources might be limited and real-time performance is often desirable, I felt it was important to strike a balance between accuracy and efficiency. InceptionResNetV2 seemed like the perfect fit - it offers strong performance without being overly computationally intensive.

Plus, the fact that it's pretrained on ImageNet gives me confidence that it can handle the diverse range of objects people might lose. And let's not forget how easy it is to work with in Keras! That definitely made my decision easier.

Overall, I believe InceptionResNetV2 provides the right mix of accuracy, efficiency, and practicality for my project. I'm excited to see how it performs in helping reunite lost items with their owners!
