Juan Cruz Martinez

Posted on • Originally published at livecodestream.dev on

Essential OpenCV Functions to Get You Started into Computer Vision

Computer Vision is a field of artificial intelligence that trains computers to interpret and understand the visual world. Many projects in the field work with images from cameras and videos, combining techniques such as image processing and deep learning models.

OpenCV is a library designed to solve common computer vision problems. It's hugely popular in the field and works well both for learning and in production. The library has interfaces for multiple languages, including Python, Java, and C++.

Throughout this article, we will cover different (common) functions inside OpenCV, their applications, and how you can get started with each one. Although the examples are in Python, the concepts and functions are the same across the supported languages.

What exactly are we going to learn today?

  • Reading, writing and displaying images
  • Changing color spaces
  • Resizing images
  • Image rotation
  • Edge Detection

Reading, writing and displaying images

Before we can do anything with computer vision, we need to be able to read images and understand how computers process them. The only information computers can process is binary (0 and 1); this includes text, images, and video.

How do computers work with images?

To understand how a computer "understands" an image, you can picture a matrix the size of the image, where each cell is assigned a value representing the color of the image at that position.

Let’s take an example with an image in greyscale:

For this particular case, we can assign each block (or pixel) in the image a numeric value (which the computer stores in binary). The value could come from any range, but by convention 0 is black, 255 is white, and the integers in between represent intensity levels.

When we work with color images, things differ a bit depending on the library and how we choose to represent the colors. We will talk more about that later in the post; however, the approaches all share roughly the same idea: using different channels to represent the colors, with RGB (red, green, and blue) being one of the most popular options. With RGB we need 3 channels to build each pixel, so our 2D matrix becomes a 3D matrix with a depth of 3, where each channel holds the intensity of a particular color, and mixing them produces the final color of the pixel.
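The matrix idea above can be sketched directly with NumPy, the array library OpenCV's Python bindings use to represent images (this snippet is my illustration, not part of the original post):

```python
import numpy as np

# A tiny 2x2 greyscale "image": one value (0-255) per pixel
grey = np.array([[0, 255],
                 [128, 64]], dtype=np.uint8)
print(grey.shape)  # (2, 2) - rows x columns

# The same idea in RGB: three channels per pixel,
# so the 2D matrix gains a depth of 3
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = [255, 0, 0]  # top-left pixel is pure red
print(rgb.shape)  # (2, 2, 3) - rows x columns x channels
```

This is exactly the shape of the arrays that `cv2.imread` returns, which is why image operations in OpenCV are, at heart, matrix operations.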

Working with images using OpenCV

Let's now jump into the code to perform three of the most important operations when dealing with images: reading, showing, and saving.

import cv2
import matplotlib.pyplot as plt

# Reading the image
image = cv2.imread('sample1.jpg')

# Showing the image
plt.imshow(image)
plt.show()

# Saving the image
cv2.imwrite('sample1_output.jpg', image)


If you run this code, you will get one image saved to disk and another displayed as a plot.

The image on the left is the one we plotted, while the one on the right is the image saved to disk. Size differences aside (due to the plot), the image on the left looks strange and bluish, but why is it different? (By the way, the image on the right is the correct one.)

The reason the image on the left has strange colors is how OpenCV reads images by default. OpenCV's imread() function reads images using BGR channel order, as opposed to the RGB order used by the plot function. This is normal for OpenCV, and there are ways to fix it, which we will discuss next.


Changing color spaces

What is a color space? In our previous example, we saw how computers process images, and that representing colors requires channels which, when combined, give the final color of each pixel. The configuration in which these channels are set is a color space. Without realizing it, we have already covered two different color spaces in our previous code snippet: RGB and BGR. There are more, each with very particular and interesting properties; other popular color spaces include LAB, YCrCb, HLS, and HSV.

Since each color space has its own properties, some algorithms or techniques may work better in one space than in others, so converting an image between color spaces is important. Thankfully, OpenCV provides a very easy-to-use function for exactly this purpose.

Meet cvtColor. Let's see how we can use it to fix our plot above:

import cv2
import matplotlib.pyplot as plt

# Reading the image
image = cv2.imread('sample1.jpg')

# Change color space
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Showing the image
plt.imshow(image)
plt.show()


And we now get a beautiful brown dog:

Let’s explore some other color spaces:

import cv2
import matplotlib.pyplot as plt

# Reading the image
original = cv2.imread('sample1.jpg')

fig = plt.figure(figsize=(8, 2))
axarr = fig.subplots(1, 3)

# Change color space
image = cv2.cvtColor(original, cv2.COLOR_BGR2RGB)
axarr[0].imshow(image)

image = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
axarr[1].imshow(image)

image = cv2.cvtColor(original, cv2.COLOR_BGR2LAB)
axarr[2].imshow(image)

plt.show()



Resizing images

Now that we can load, show, and change the color space of images, the next thing to focus on is resizing. Resizing matters in computer vision because most ML models work with fixed-size inputs. The exact size depends on the model, but to make sure our images fit, we need to resize them accordingly.

OpenCV offers a practical method for this called resize. Let's see an example of how to use it.

import cv2

# Reading the image
original = cv2.imread('sample1.jpg')

# Resize
resized = cv2.resize(original, (200, 200))

print(original.shape)
print(resized.shape)


Which outputs:

(1100, 1650, 3)
(200, 200, 3)


Image rotation

A crucial part of training a model is the dataset used to train it. If the dataset doesn't have enough samples, and well-distributed ones, the trained model is likely to fail. But sometimes we don't have a big enough dataset, or it doesn't cover all the situations we want the model to handle, so we run processes that alter the images we already have to generate new ones.

There are many scenarios where rotating images to various angles can help make our model more robust, but we won't cover them all here. Instead, I'd like to show you how to use OpenCV to rotate images.

Let's see an example of OpenCV's rotate function:

import cv2
import matplotlib.pyplot as plt

# Reading the image
image = cv2.imread('sample1.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

image = cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)

plt.imshow(image)
plt.show()


Even though this method is super easy to use, it restricts us to a few fixed options; we can't rotate by an arbitrary angle. For more control over the rotation, we can use getRotationMatrix2D and warpAffine instead.

import cv2
import matplotlib.pyplot as plt

# Reading the image
image = cv2.imread('sample1.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

rows, cols = image.shape[:2]
deg = 45
# (cols/2, rows/2) is the center of rotation for the image
# M is the 2x3 rotation matrix for that center, angle, and scale
M = cv2.getRotationMatrix2D((cols/2, rows/2), deg, 1)
image = cv2.warpAffine(image, M, (cols, rows))

plt.imshow(image)
plt.show()



Edge Detection

Edges are the points in an image where the image brightness changes sharply or has discontinuities. Such discontinuities generally correspond to:

  • Discontinuities in depth
  • Discontinuities in surface orientation
  • Changes in material properties
  • Variations in scene illumination

Edges are a very useful feature of an image and can be used as part of an ML pipeline; we have already seen examples of how edges can help us detect shapes or lines on a road.

OpenCV provides us with the Canny function for this task; here is how to use it:

import cv2
import matplotlib.pyplot as plt

# Reading the image
original = cv2.imread('sample2.jpg')

fig = plt.figure(figsize=(6, 2))
axarr = fig.subplots(1, 2)

axarr[0].imshow(cv2.cvtColor(original, cv2.COLOR_BGR2RGB))

threshold1 = 50
threshold2 = 200
grey = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
image = cv2.Canny(grey, threshold1, threshold2)
axarr[1].imshow(image, cmap='gray')

plt.show()



Summary

OpenCV is a great library for working with images and videos, providing a ton of useful tools and functions for everything from the simplest to the most complex scenarios. The functions we reviewed today are just a few from the gallery. If you are interested in exploring further, look at the library docs and samples; there's a lot there, from simple image handling like transposing to more advanced features like contour detection, feature detection, and even face detection.

I hope you enjoyed reading this article. Please join the conversation and share your favorite OpenCV functions.

Thanks for reading!


If you like the story, please don't forget to subscribe to our newsletter so we can stay connected: https://livecodestream.dev/subscribe
