Applying filters to images is not a new concept to anyone. We take a picture, make a few changes to it, and now it looks cooler. But where does Artificial Intelligence come in? Let’s try out a fun use for Unsupervised Machine Learning with K Means Clustering.
I’ve written before about K Means Clustering, so I will assume you’re familiar with the algorithm this time. If you’re not, this is the in-depth introduction I wrote.
And I also tried my hand at image compression (well, reconstruction) with autoencoders, to varying degrees of success.
This time, however, my goal is not to reconstruct the best possible image, but just to see the effects of recreating a picture with the fewest possible colors.
Instead of making the picture look as similar to the original as possible, I just want us to look at it and say “neat!”.
So how do we do this? I’m glad you asked.
How to do image filters with K Means Clustering
First of all, it’s always good to remember an image is just a vector of pixels. Each pixel is a tuple of three integer values between 0 and 255 (an unsigned byte each), which are the red, green and blue (RGB) components of that pixel’s color.
We want to use K Means clustering to find the k colors that best characterize an image. That just means we can treat each pixel as a single data point (in 3-dimensional space), and cluster them.
So first, we’ll want to turn an image into a vector of pixels in Python. Here’s how we do it.
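A minimal sketch of that conversion, assuming the image has already been loaded as a NumPy array of shape (height, width, 3), for instance with an image library. The body of `vector_of_pixels` below is a guess at the list-based approach described in this post, not the code from the repository.

```python
import numpy as np

def vector_of_pixels(image):
    """Collect the pixels of a (height, width, 3) image row by row.

    Hypothetical reconstruction: builds a Python list first, then
    converts it into an (n_pixels, 3) array.
    """
    pixels = []
    for row in image:
        for pixel in row:
            pixels.append(pixel)
    return np.array(pixels)

# Tiny 2x2 stand-in for a real picture
image = np.array([[[255, 0, 0], [250, 10, 10]],
                  [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

pixels = vector_of_pixels(image)
print(pixels.shape)  # (4, 3)
```

Each row of `pixels` is one data point in RGB space, which is the shape a clustering model expects.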
As an aside, I don’t think the vector_of_pixels function needs to use a Python list. I’m sure there has to be some way to flatten a NumPy array; I just couldn’t find one (at least not one that did it in the order I wanted).
If you can think of any way, let me know in the comments!
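For reference, NumPy’s `reshape` seems to do the job: `image.reshape(-1, 3)` turns a (height, width, 3) array into one row per pixel, in row-major order (left to right, then top to bottom). Whether that is the order wanted here is an assumption; the sketch below only checks it against a plain nested loop on a made-up array.

```python
import numpy as np

# Hypothetical 2x3 image whose channel values encode the pixel position
image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# One row per pixel, in row-major order
flat = image.reshape(-1, 3)

# The same pixels collected with an explicit nested loop
looped = np.array([pixel for row in image for pixel in row])

print(flat.shape)                    # (6, 3)
print(np.array_equal(flat, looped))  # True

# Reshaping back restores the original image layout
restored = flat.reshape(image.shape)
```

The same call in reverse (`flat.reshape(image.shape)`) is handy later, when the clustered pixels have to be turned back into a picture.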
The next step is fitting the model to the image, so that it clusters the pixels into k colors. Then, it’s just a matter of assigning the corresponding cluster color to each position in the image.
For instance, maybe our pic has only three colors: two reddish ones and a greenish one. If we fit that to 2 clusters, all the reddish pixels would turn into a single shade of red (since they get clustered together), and the other ones would stay greenish.
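The three-color example can be sketched end to end. This is a plain NumPy version of Lloyd’s algorithm standing in for a library K Means implementation (the code in the repository may well differ, since the post’s title mentions Dask). The function name `kmeans_colors`, the farthest-point initialization and the sample colors are all assumptions made to keep the example small and deterministic.

```python
import numpy as np

def kmeans_colors(pixels, k, iterations=20):
    """Lloyd's algorithm on an (n, 3) array of RGB pixels.

    Returns (centroids, labels). Illustrative stand-in, not a library API.
    """
    points = pixels.astype(float)

    # Farthest-point initialization (an assumption, chosen for determinism)
    centroids = [points[0]]
    for _ in range(1, k):
        dists = np.min(
            [np.sum((points - c) ** 2, axis=1) for c in centroids], axis=0
        )
        centroids.append(points[np.argmax(dists)])
    centroids = np.array(centroids)

    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each pixel
        distances = np.sum(
            (points[:, None, :] - centroids[None, :, :]) ** 2, axis=2
        )
        labels = np.argmin(distances, axis=1)

        # Update step: move each centroid to the mean of its pixels
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)

        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return centroids, labels

# 2x3 picture with two reddish colors and one greenish color
image = np.array([[[200, 30, 30], [230, 60, 50], [30, 180, 40]],
                  [[230, 60, 50], [30, 180, 40], [200, 30, 30]]],
                 dtype=np.uint8)

pixels = image.reshape(-1, 3)
centroids, labels = kmeans_colors(pixels, k=2)

# Replace every pixel with the color of its cluster
filtered = centroids[labels].round().astype(np.uint8).reshape(image.shape)
print(filtered[0, 0], filtered[0, 2])
```

Both reddish colors end up as the same averaged red, and the greenish pixels keep their own color, so the filtered picture has exactly two colors.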
But enough with the explanations, let’s see the program in action!
As usual, you are free to run it yourself with any pic you want; here’s the GitHub repository with the code.
The results
We will apply the filter to pictures of kittens, taken from the awesome “Cats vs Dogs” Kaggle dataset.
We’ll start with a picture of a cat, and apply the filter with different values for k. Here’s the original picture:
First, let’s check how many colors this picture originally had.
With just one line of NumPy, we count the unique values a pixel takes on this picture. This image in particular has 243 different colors, even though it has a total of 166167 pixels.
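A sketch of how such a count can be done, assuming the image is a (height, width, 3) array of unsigned bytes; the exact one-liner behind the numbers above isn’t shown here. One thing worth knowing: `np.unique` without `axis` flattens the array and counts distinct channel values (at most 256 for 8-bit images), which is a different quantity from the number of distinct RGB triples.

```python
import numpy as np

# Hypothetical 2x2 image with three distinct colors (red appears twice)
image = np.array([[[255, 0, 0], [255, 0, 0]],
                  [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

pixels = image.reshape(-1, 3)

# axis=0 treats each row (one RGB triple) as a single value
n_colors = len(np.unique(pixels, axis=0))

# Without axis, the array is flattened and single channel values are counted
n_channel_values = len(np.unique(image))

print(n_colors)          # 3
print(n_channel_values)  # 2
```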
Now, let’s see the result of clustering it to 2, 5 and 10 different colors only.
Did you notice a trend? Each color we add has diminishing returns. The difference between having 2 colors and having 5 is a lot bigger than the difference between 5 and 10. However, with 10 colors, the flat areas are smaller, and we have more granularity. Moving on to 15 and 24 colors!
Moving on to a different picture: Here’s the original (256 different colors) and here’s a compressed one (24 colors again).
As an interesting note, the “compressed” image weighs 18 KB and the uncompressed one 16 KB. I don’t really know why this is, since compressors are pretty complicated beasts, but I would love to read your theories in the comments.
Conclusions
We were able to make new images with only 10% of the originals’ colors, and they still looked very similar to the originals. We also got some cool looking filters thanks to K Means clustering. Can you think of any other fun application for clustering? Do you think other clustering techniques could have yielded more interesting results?
If you want to answer any of these questions, feel free to contact me on Twitter, Medium or Dev.to.
Are you interested in starting a career in Data Science? Do you want to be an awesome Machine Learning professional? Check out my recommended reading list: “3 Machine Learning Books that will Help You Level Up as a Data Scientist”. One of them actually taught me what I know about K Means Clustering.
The post K Means Clustering with Dask (Editing Pictures of Kittens) appeared first on Data Stuff.
Top comments (5)
Are you doing the thing that 8-bit GIF and PNG images do? There are some algorithms to do that. There are some tricks involving moving pixels around that let you get better pictures with fewer colors.
What do GIFs and PNGs do? I'm just clustering the colors and then replacing each pixel with its corresponding cluster's color.
That's one of the ways to reduce image complexity (number of colors) when creating 8-bit GIF and PNG images.
You seem young (that's not an insult, God forbid!) and might not know what I'm talking about. Grab an old version of Adobe Photoshop, the older the better. They have (or rather: had) this cool dialog "save for web" where you could apply different compression settings and compare images, for example which is better, 50% JPG or 24 color GIF, and play with those settings. There were 3 ways of reducing the number of colors, one of them reminds me of what you did here.
I know that! I didn't know about the Photoshop bit (never been the artist type), but I remember there was a format (I think it was GIF?) where you actually defined a small set of colors and you'd only use those, so the image was lighter.
That's why I was surprised when the reduced images were heavier than the original ones! I was expecting JPG to somehow profit from the reduction.
GIF, indeed. And eight-bit PNGs. They use a simplified palette of colors (so 1 byte per pixel instead of 3 or 4) to reduce the size, then apply some kind of dictionary compression (I'm not being strict here) to reduce it further, so that large areas of the same color can be effectively compressed.
JPG, on the other hand, benefits from the cosine transform: it's about expressing the image as a set of waves. Flat areas of solid color with sharp edges between them are difficult to compress, because the edges need many waves that cancel each other out. Quite a lot of math is happening there. In general, JPG compresses by dropping details and keeping simple gradients. The whole JPG algorithm is well designed and its pitfalls are interesting to explore. You experienced such a case.
You can do an experiment and compress a picture with JPG at 75% quality over and over again. After enough iterations you should see some artifacts and distortions.