Karan Bhardwaj

Streamlining Instance Segmentation: A Guide to Utilizing Detectron2 on Google Colab and Performing Inference on Video

Instance segmentation, a challenging task in computer vision that involves detecting and delineating individual objects within an image or video, has seen significant advancements in recent years. One such advancement is Detectron2, a flexible and efficient framework developed by Facebook AI Research. In this guide, we'll explore how to leverage the power of Detectron2 within the Google Colab environment to perform instance segmentation on videos.

Step 1: Check GPU availability

Make sure the notebook is connected to a GPU runtime: open the Runtime menu, select "Change runtime type", and choose a GPU as the hardware accelerator.

Then confirm that the GPU is accessible by running:

!nvidia-smi

If the output is a table listing a GPU (its name, driver version, and memory usage), you are all set.
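
The same check can be done programmatically, for example to stop a notebook early when no GPU is attached. This is a minimal sketch using only the standard library; the helper name `gpu_visible` is hypothetical, and it assumes that a successful `nvidia-smi` exit code means a GPU is available:

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA driver tools found, so assume there is no GPU runtime
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("GPU visible:", gpu_visible())
```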

Step 2: Install detectron2

Run this single command to directly install detectron2.

!python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
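
Since the install builds the package from source, it can be worth confirming that it succeeded before moving on. A minimal sketch, assuming only that the package is importable under the name `detectron2`; the helper `is_installed` is hypothetical:

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the named top-level package can be found for import."""
    return importlib.util.find_spec(package) is not None


print("detectron2 installed:", is_installed("detectron2"))
```

If this prints `False` right after the install finishes, restarting the Colab runtime and running the check again is a reasonable first step.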

Step 3: Import libraries

Import the required libraries.

# COMMON LIBRARIES
import os
import cv2

from google.colab.patches import cv2_imshow

# VISUALIZATION
from detectron2.utils.visualizer import Visualizer
from detectron2.utils.visualizer import ColorMode

# CONFIGURATION
from detectron2 import model_zoo
from detectron2.config import get_cfg

# INFERENCE
from detectron2.engine import DefaultPredictor

Step 4: Initialize the predictor

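Choose a model from the Detectron2 model zoo to suit your requirements; the available models are listed in MODEL_ZOO.md in the Detectron2 repository. This guide uses a Mask R-CNN with a ResNet-50 FPN backbone trained on COCO.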

cfg = get_cfg()
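# Load the config bundled with the installed package; a relative path into a
# repository clone does not exist after a pip install
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
# Minimum confidence score for a detection to be kept
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
# Pretrained COCO weights, downloaded on first use
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"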
predictor = DefaultPredictor(cfg)
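
`SCORE_THRESH_TEST` sets the minimum confidence a detection needs in order to be kept at inference time. The effect can be illustrated without running the model. In this sketch the detections are made-up values and `filter_by_score` is a hypothetical helper, not part of Detectron2; it assumes a strict greater-than comparison:

```python
def filter_by_score(detections, threshold=0.5):
    """Keep (label, score) pairs whose score exceeds the threshold."""
    return [(label, score) for label, score in detections if score > threshold]


# Made-up detections for illustration only
sample = [("person", 0.98), ("dog", 0.62), ("chair", 0.31)]
kept = filter_by_score(sample, threshold=0.5)
print(kept)
```

Raising the threshold gives fewer but more confident detections, while lowering it lets more uncertain ones through.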

Step 5: Inference on Video

Set video_path to the location of your video in the following code, then run it. The output is a video with the predicted instance masks drawn on each frame, saved to /content/output.mp4.

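import imageio
import numpy

# Needed to look up class names and colors for the visualizer
from detectron2.data import MetadataCatalog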

# Load video
video_path = "path_to_your_video.mp4"
cap = cv2.VideoCapture(video_path)

# Initialize video writer
fps = cap.get(cv2.CAP_PROP_FPS)
output_path = '/content/output.mp4'
writer = imageio.get_writer(output_path, fps=fps)

# Perform instance segmentation on each frame
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
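    outputs = predictor(frame)
    # Move the predictions to the CPU once and reuse them below
    instances = outputs["instances"].to("cpu")

    # Predicted class index for each instance (Optional)
    pred_classes = instances.pred_classes.numpy()

    # Per-instance boolean masks, shape (N, H, W) (Optional)
    pred_masks = instances.pred_masks.numpy()

    # OpenCV reads frames as BGR; the Visualizer expects RGB
    v = Visualizer(frame[:, :, ::-1], metadata=MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=0.8)
    # get_image() returns RGB, which is the channel order imageio expects
    frame = v.draw_instance_predictions(instances).get_image()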
    # Write processed frame to output video
    writer.append_data(frame)

# Release video resources
cap.release()
writer.close()
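
The optional `pred_classes` array holds integer class indices, not names. Below is a minimal sketch of turning those indices into per-frame counts, using plain lists so that it runs without the model. `count_classes` is a hypothetical helper, and in the loop above the names would presumably come from `MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes`:

```python
from collections import Counter


def count_classes(class_ids, class_names):
    """Map integer class indices to names and count each name."""
    return Counter(class_names[int(i)] for i in class_ids)


# Made-up values for illustration: three class names and one frame's predictions
names = ["person", "bicycle", "car"]
frame_ids = [0, 0, 2, 0]
print(count_classes(frame_ids, names))
```

Called once per frame inside the loop, this would give a running summary of which objects appear in the video.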