DEV Community

Super Kai (Kazuya Ito)

What is Computer Vision? (1)

*My post explains Keypoint Detection(Landmark Detection), Image Matching, Object Tracking, Stereo Matching, Video Prediction, Optical Flow and Image Captioning.

Computer Vision is the technology that enables a computer to understand and analyze visual data such as images and videos.

There are many Computer Vision technologies as shown below:

(1) Image Classification(Recognition):

  • can classify an entire image into one or more classes(labels) chosen from two or more classes(labels): *Memos:
    • The image can be one frame in a video.
    • There is also Video Classification to classify an entire video into one or more classes(labels) chosen from two or more classes(labels).
  • has the method Single-Label Classification, which has two methods: Binary Classification and Multi-Class Classification.
  • has the method Multi-Label Classification.

*Memos:

  • Binary Classification can classify an entire image into a single class(label) from two classes(labels).
  • Multi-Class Classification can classify an entire image into a single class(label) from more than two classes(labels).
  • Multi-Label Classification can classify an entire image into zero or more classes(labels) at the same time from two or more classes(labels).
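
The difference between the three methods can be sketched with the raw scores a classifier might output. This is only a minimal sketch, not a real model: the scores, the labels and the helper names (`single_label`, `multi_label`) are made up for illustration, and it assumes the common setup where Single-Label Classification picks the class with the highest softmax probability while Multi-Label Classification thresholds an independent sigmoid per class.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(score):
    """Turn one raw score into an independent probability."""
    return 1.0 / (1.0 + math.exp(-score))


def single_label(scores, labels):
    """Binary or Multi-Class Classification: exactly one class wins."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best]


def multi_label(scores, labels, threshold=0.5):
    """Multi-Label Classification: every class above the threshold is kept."""
    return [lab for s, lab in zip(scores, labels) if sigmoid(s) >= threshold]


# Hypothetical scores for one image (assumed values, not from a real model).
binary = single_label([0.2, 1.5], ["cat", "dog"])                  # 2 classes
multi_class = single_label([0.1, 2.0, -1.0], ["cat", "dog", "bird"])  # 3 classes
multi = multi_label([1.2, -0.7, 0.4], ["person", "car", "tree"])

print(binary, multi_class, multi)
```

With these assumed scores, the single-label calls each return one class, while the multi-label call can return several classes or none.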

(2) Object Localization:

  • can localize the objects and regions of interest in an image with bounding boxes. *The image can be one frame in a video.
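
To make the idea of a bounding box concrete, here is a small sketch that computes the tightest box around the non-zero pixels of a tiny binary mask. It is only an illustration of the box format: a real localizer predicts boxes with a trained model, and the `bounding_box` helper and the `(x_min, y_min, x_max, y_max)` pixel-index convention are assumptions of this example.

```python
def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) around the non-zero pixels.

    Coordinates are inclusive pixel indices. Returns None if the mask
    is empty or contains no object pixel.
    """
    if not mask:
        return None
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        return None
    return (cols[0], rows[0], cols[-1], rows[-1])


# A hypothetical 4x5 image where 1 marks the pixels of one object.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

box = bounding_box(mask)
print(box)
```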

(3) Object Detection:

  • can localize and classify the objects and regions of interest in an image with classes(labels) and bounding boxes. *The image can be one frame in a video.
  • is the combination of Object Localization and Image Classification(Recognition).
  • is used for Object Tracking.
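
A detection can be thought of as a class(label) plus a bounding box, and one simple way such detections can feed Object Tracking is to link boxes between two frames by their overlap. The sketch below assumes this overlap-based linking (Intersection over Union with greedy matching), which is only one of many possible approaches. The `Detection` class, the `match` helper, the threshold and the boxes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max) as continuous coordinates


def iou(a, b):
    """Intersection over Union of two boxes, between 0.0 and 1.0."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match(prev, curr, min_iou=0.3):
    """Greedily link each previous detection to the best-overlapping
    unused current detection that has the same label."""
    links, used = {}, set()
    for i, p in enumerate(prev):
        best_j, best = None, min_iou
        for j, c in enumerate(curr):
            if j in used or c.label != p.label:
                continue
            score = iou(p.box, c.box)
            if score >= best:
                best_j, best = j, score
        if best_j is not None:
            links[i] = best_j
            used.add(best_j)
    return links


# Hypothetical detections in two consecutive video frames.
frame1 = [Detection("car", (10, 10, 50, 40)), Detection("person", (60, 20, 80, 70))]
frame2 = [Detection("person", (62, 21, 82, 71)), Detection("car", (14, 11, 54, 41))]

links = match(frame1, frame2)  # maps index in frame1 to index in frame2
print(links)
```

Matching only detections with the same label shows why the classification part of Object Detection matters for tracking: the box alone would not tell a car from a person standing at the same spot.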

(4) Image Segmentation:

  • can classify each pixel of an image, which outlines objects and regions more precisely than the bounding boxes of Object Detection, differentiating stuff and things with colors: *Memos:
    • Stuff is uncountable regions(classes) such as sky, sea, forest, road, grass, landscape, etc.
    • Things are countable objects(classes) such as car, tree, person, animal, flower, etc.
  • has the popular methods Semantic Segmentation, Instance Segmentation and Panoptic Segmentation: *Memos:
    • Semantic Segmentation is good at differentiating stuff but not good at differentiating things because it labels each pixel with a class without separating individual objects of the same class.
    • Instance Segmentation is good at differentiating things but not good at differentiating stuff because it separates individual objects and usually leaves stuff unlabeled.
    • Panoptic Segmentation:
      • is good at differentiating both stuff and things.
      • is the combination of Semantic Segmentation and Instance Segmentation.
  • is used for Medical Imaging(CT scans, MRIs, X-rays, etc.), Autonomous Vehicles, Satellite Imagery, Agriculture, Robotics, Surveillance, Industrial Inspection, Face Recognition, etc.
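
How the three methods relate can be sketched on a tiny hand-made image. The semantic map gives every pixel a class id, the instance map gives every pixel of a thing an object id, and pairing the two per pixel gives a panoptic-style result. The class ids, the maps and the `panoptic` helper are assumptions made up for this example, not the output of a real model.

```python
# Hypothetical class ids: 0 and 1 are stuff, 2 is a thing.
CLASSES = {0: "sky", 1: "road", 2: "car"}

# Semantic map: one class id per pixel. The two cars share id 2,
# so this map alone cannot tell them apart.
semantic = [
    [0, 0, 0, 0],
    [2, 2, 0, 2],
    [1, 1, 1, 1],
]

# Instance map: one object id per pixel of a thing, 0 elsewhere.
# This map alone says nothing about sky or road.
instance = [
    [0, 0, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 0, 0],
]


def panoptic(semantic, instance):
    """Pair each pixel's class id with its instance id."""
    return [
        [(c, i) for c, i in zip(s_row, i_row)]
        for s_row, i_row in zip(semantic, instance)
    ]


pan = panoptic(semantic, instance)

# Every distinct (class id, instance id) pair is one segment.
segments = sorted({pair for row in pan for pair in row})
for class_id, instance_id in segments:
    print(CLASSES[class_id], instance_id)
```

In this toy image, the combined map contains four segments: sky, road and two separate cars.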
