What is Computer Vision? (1)

#python #pytorch #computervision #deeplearning

*My post explains Keypoint Detection (Landmark Detection), Image Matching, Object Tracking, Stereo Matching, Video Prediction, Optical Flow and Image Captioning.

Computer Vision is the technology that enables a computer to understand and analyze visual data such as images and videos.

There are many Computer Vision tasks, as shown below:

(1) Image Classification (Recognition):

can classify an image into one or more classes (categories) out of a set of candidate classes, based on the main object or region of interest: *Memos:
- The image can be one frame of a video.
- There is also Fine-Grained Image Classification, which classifies an image into one or more subclasses (subcategories) based on the main object or region of interest. E.g. an image is first classified into the superclass (supercategory) **Dog**, then into the subclass **Golden Retriever**, **Bulldog** or **Poodle**.
- There is also Scene Classification, which classifies an image into one or more classes based on the whole image.
- There is also Video Classification, which classifies a video into one or more classes based on the whole video.
has the method Single-Label Classification, which has two variants: Binary Classification and Multi-Class Classification.
has the method Multi-Label Classification.

*Memos:

- Binary Classification can classify an image into a single class out of two classes.
- Multi-Class Classification can classify an image into a single class out of more than two classes.
- Multi-Label Classification can classify an image into multiple classes at the same time out of two or more classes.
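
The difference between the three methods shows up at the output stage, and a minimal sketch may help. The class names, the raw scores and the `0.5` threshold below are made-up values for illustration, not the output of a real model, and plain Python is used so that the sketch runs without any library:

```python
import math

def sigmoid(x):
    # Squash one raw score into an independent probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    # Turn raw scores into probabilities that sum to 1 (the classes compete).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def binary_classify(score, classes=("cat", "dog"), threshold=0.5):
    # Binary Classification: a single class out of two classes.
    return classes[1] if sigmoid(score) >= threshold else classes[0]

def multi_class_classify(scores, classes):
    # Multi-Class Classification: the single most probable class.
    probs = softmax(scores)
    return classes[probs.index(max(probs))]

def multi_label_classify(scores, classes, threshold=0.5):
    # Multi-Label Classification: every class whose own probability passes the threshold.
    return [c for c, s in zip(classes, scores) if sigmoid(s) >= threshold]

classes = ["cat", "dog", "bird"]  # hypothetical class names
scores = [2.0, 1.5, -3.0]         # hypothetical raw model outputs (logits)

print(binary_classify(-1.2))                  # cat
print(multi_class_classify(scores, classes))  # cat
print(multi_label_classify(scores, classes))  # ['cat', 'dog']
```

With the same scores, Multi-Class Classification keeps only the top class, while Multi-Label Classification keeps every class that is likely enough on its own.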

(2) Object Localization:

can localize the objects and regions of interest in an image with bounding boxes. *The image can be one frame of a video.
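
As a minimal sketch of what a bounding box is, assume a tiny made-up image stored as a nested list in which non-zero pixels belong to the object. The `(x_min, y_min, x_max, y_max)` format and the `bounding_box` helper are illustrative choices only. A real localizer predicts boxes with a trained model rather than reading them off the pixels like this:

```python
def bounding_box(image):
    # Collect the coordinates of every non-zero (object) pixel.
    points = [(x, y)
              for y, row in enumerate(image)
              for x, value in enumerate(row)
              if value != 0]
    if not points:
        return None  # nothing to localize
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Tightest axis-aligned box as (x_min, y_min, x_max, y_max), inclusive.
    return (min(xs), min(ys), max(xs), max(ys))

image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

print(bounding_box(image))  # (1, 1, 3, 2)
```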

(3) Object Detection:

can localize and classify the objects and regions of interest in an image with classes and bounding boxes. *The image can be one frame of a video.
is the combination of Object Localization and Image Classification (Recognition).
is used for Object Tracking.
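
The "localization plus classification" idea can be sketched with two toy stand-ins. Everything here is an assumption made for illustration: each object is given its own non-zero pixel value, `localize` just reads the boxes off the pixels, and `classify` guesses the class from the box shape. A real detector uses a trained model for both steps:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    box: tuple   # (x_min, y_min, x_max, y_max), inclusive
    label: str   # class name

def localize(image):
    # Toy Object Localization: one tight box per distinct non-zero pixel value.
    boxes = {}
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value == 0:
                continue
            x0, y0, x1, y1 = boxes.get(value, (x, y, x, y))
            boxes[value] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return [boxes[v] for v in sorted(boxes)]

def classify(box):
    # Toy Image Classification (assumption): wide boxes are cars, the rest are people.
    x0, y0, x1, y1 = box
    width, height = x1 - x0 + 1, y1 - y0 + 1
    return "car" if width > height else "person"

def detect(image):
    # Object Detection: localize every object, then classify each box.
    return [Detection(box, classify(box)) for box in localize(image)]

image = [
    [0, 0, 0, 0, 0, 2],
    [1, 1, 1, 0, 0, 2],
    [1, 1, 1, 0, 0, 2],
]

for detection in detect(image):
    print(detection.label, detection.box)
# car (0, 1, 2, 2)
# person (5, 0, 5, 2)
```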

(4) Image Segmentation:

can do Object Detection more precisely by labeling each pixel, differentiating stuff and things with colors: *Memos:
- Stuff refers to uncountable regions (classes) such as sky, sea, forest, road, grass, landscape, etc.
- Things are countable objects (classes) such as car, tree, person, animal, flower, etc.
has the popular methods Semantic Segmentation, Instance Segmentation and Panoptic Segmentation: *Memos:
- Semantic Segmentation labels every pixel with a class, so it is good at differentiating stuff but not good at differentiating individual things of the same class.
- Instance Segmentation separates each individual object, so it is good at differentiating things but not good at differentiating stuff.
- Panoptic Segmentation:
  - is good at differentiating both stuff and things.
  - is the combination of Semantic Segmentation and Instance Segmentation.
is used for Medical Imaging (CT scans, MRIs, X-rays, etc.), Autonomous Vehicles, Satellite Imagery, Agriculture, Robotics, Surveillance, Industrial Inspection, Face Recognition, etc.
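
To make the three methods concrete, here is a minimal sketch on a made-up 4x5 label map. The class ids, the `STUFF` and `THINGS` tables and the `instance_ids` helper are hypothetical. Connected-region labeling is only a stand-in for a real Instance Segmentation model, and it would wrongly merge two cars that touch each other:

```python
from collections import deque

STUFF = {0: "sky", 1: "road"}  # hypothetical stuff classes
THINGS = {2: "car"}            # hypothetical thing class

# Semantic Segmentation output: one class id per pixel.
# Both cars get the same id 2, so they cannot be told apart here.
semantic = [
    [0, 0, 0, 0, 0],
    [2, 2, 0, 2, 2],
    [2, 2, 1, 2, 2],
    [1, 1, 1, 1, 1],
]

def instance_ids(semantic, thing_ids):
    # Toy Instance Segmentation: each connected region of a thing class
    # gets its own id (1, 2, ...). Stuff pixels keep 0, meaning "no instance".
    h, w = len(semantic), len(semantic[0])
    ids = [[0] * w for _ in range(h)]
    next_id = 1
    for y in range(h):
        for x in range(w):
            if semantic[y][x] not in thing_ids or ids[y][x]:
                continue
            cls = semantic[y][x]
            ids[y][x] = next_id
            queue = deque([(x, y)])
            while queue:  # breadth-first flood fill over 4-connected neighbors
                cx, cy = queue.popleft()
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and ids[ny][nx] == 0 and semantic[ny][nx] == cls):
                        ids[ny][nx] = next_id
                        queue.append((nx, ny))
            next_id += 1
    return ids

instances = instance_ids(semantic, THINGS)

# Panoptic Segmentation: combine both into (class id, instance id) per pixel.
panoptic = [list(zip(sem_row, ins_row))
            for sem_row, ins_row in zip(semantic, instances)]

print(instances[1])  # [1, 1, 0, 2, 2]
print(panoptic[1])   # [(2, 1), (2, 1), (0, 0), (2, 2), (2, 2)]
```

In the panoptic row, the two cars share the class id `2` but carry different instance ids, while the sky pixel keeps its class id with no instance.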