DEV Community

Nukala Suraj
Understanding KNNs

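KNN, or K Nearest Neighbors, is one of the simplest Machine Learning algorithms and, largely because of that simplicity, also one of the slowest at prediction time, since every new point has to be compared against all the stored points 😂

Nonetheless, it's a great algorithm to start your ML journey with.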

A Machine Learning algorithm can be used for:

  1. Regression (predict a numeric value based on past data)
  2. Classification (sort an item into one of its possible classes)

KNN can be used for both; however, it's most commonly used for classification problems.

The Basics/Pre-Requisites

If you've used the Pythagorean theorem before, that's really all you need to know to understand the algorithm, and this algo's gonna be a cakewalk for you.
Pythagoras Theorem
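
Here is a minimal sketch of how the Pythagorean theorem turns into a distance between two points on a graph. The function name `euclidean_distance` and the sample points are made up for illustration.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two 2D points (x, y)."""
    dx = p[0] - q[0]  # horizontal side of the right triangle
    dy = p[1] - q[1]  # vertical side of the right triangle
    # The hypotenuse is the distance: sqrt(dx^2 + dy^2)
    return math.sqrt(dx ** 2 + dy ** 2)

print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

The classic 3-4-5 triangle makes it easy to check the result by hand.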

The Algorithm

Consider a graph with a few plotted points.
Source: dslytics

Classification with KNN

Let's say we need to find out whether the star falls under class A or class B.

What are Nearest Neighbors in KNN

The Nearest Neighbors are the points that are closest to the star.
We find the distance from each point to the star and sort the points by that distance in ascending order.
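
A rough sketch of that step in Python, assuming each point is stored as a tuple `(x, y, class)`. The coordinates below are invented for the example and are not the ones from the graph.

```python
import math

# Hypothetical labelled points: (x, y, class). Not the data from the figure.
points = [(1, 1, "A"), (2, 3, "A"), (5, 4, "B"), (6, 6, "B"), (4, 6, "B")]
star = (4, 4)  # the unknown point we want to classify

def distance_to_star(point):
    # math.dist (Python 3.8+) computes the Euclidean distance
    return math.dist(star, point[:2])

# Ascending order: the closest point comes first
neighbors_by_distance = sorted(points, key=distance_to_star)

for p in neighbors_by_distance:
    print(p, round(distance_to_star(p), 2))
```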

What is K in KNN

K is the number of closest neighbors we take into account.
So,
for K=3, we would consider the top 3 nearest neighbors for classification
for K=6, we would consider the top 6 nearest neighbors for classification
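
In code, picking the top K is just a slice of the sorted list. This is a small sketch with a hypothetical helper `k_nearest` and made-up points stored as `(x, y, class)`.

```python
import math

def k_nearest(points, query, k):
    """Return the k points closest to query, nearest first."""
    ranked = sorted(points, key=lambda p: math.dist(query, p[:2]))
    return ranked[:k]  # keep only the first k entries

# Hypothetical data, not the points from the figure
points = [(1, 1, "A"), (2, 3, "A"), (5, 4, "B"), (6, 6, "B"), (4, 6, "B")]
star = (4, 4)

print(k_nearest(points, star, 3))
# [(5, 4, 'B'), (4, 6, 'B'), (2, 3, 'A')]
```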

Choosing K for KNN

The predicted class may vary a lot from one value of K to another, so we need to choose an optimal (best) value.

There is no formula for the optimal value of K, but as a rule of thumb a good K tends to be near the square root of the total number of points.

So start with K=√n and play around by increasing or decreasing K until you are happy with the results.

Here we have 10 total points,
so K = √10 ≈ 3.16, which we round to 3.
We would expect a good result around K = 3.
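
The square-root starting point can be sketched like this. The helper name `initial_k` is an assumption for illustration, and the result is only a first guess to tune from, not a guaranteed best value.

```python
import math

def initial_k(n_points):
    """Rule-of-thumb starting value for K: sqrt(n), rounded, at least 1."""
    return max(1, round(math.sqrt(n_points)))

print(initial_k(10))  # 3
```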

The Classifying

Now that we have the top K nearest neighbors, all we have to do is find the mode of their classes.

We find the class that the majority of those points belong to and voilà, we've classified the star.

For K=3, Yellow is the mode, so the star falls under Class B.
For K=6, Purple is the mode, so the star falls under Class A.
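
Putting the pieces together, here is a minimal sketch of KNN classification. The function `knn_classify` and the 10 points are hypothetical; the points were chosen so that K=3 and K=6 give different answers, like in the example above, but they are not the coordinates from the graph.

```python
import math
from collections import Counter

def knn_classify(points, query, k):
    """Predict the class of query as the mode of its k nearest neighbors."""
    ranked = sorted(points, key=lambda p: math.dist(query, p[:2]))
    labels = [label for _, _, label in ranked[:k]]
    # most_common(1) returns [(label, count)] for the most frequent label
    return Counter(labels).most_common(1)[0][0]

# Hypothetical (x, y, class) points placed on the axes so distances are easy to read
points = [
    (1, 0, "B"), (0, 1.5, "A"), (-2, 0, "B"), (0, -3, "A"), (3.5, 0, "A"),
    (0, 4, "A"), (-5, 0, "B"), (0, -6, "B"), (7, 0, "B"), (0, 8, "A"),
]
star = (0, 0)

print(knn_classify(points, star, 3))  # B
print(knn_classify(points, star, 6))  # A
```

With K=3 the neighbors are B, A, B, so B wins. With K=6 three more A points join in, and A becomes the majority.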

Regression with KNN

We were able to find the class of the star.

But what if each of the points also had a value (say, a price)?

With classification you found the class of the point, but you still don't know its price (let's say the points represent cars).

Regression is almost exactly the same thing as classification, except here we find the mean of the values of the top K nearest neighbors instead of the mode of their classes.
mean = (sum of the values of the K nearest neighbors) / K
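
A minimal sketch of KNN regression, assuming each point is stored as `(x, y, price)`. The function name `knn_regress` and the car prices are invented for the example.

```python
import math
from statistics import mean

def knn_regress(points, query, k):
    """Predict a value for query as the mean value of its k nearest neighbors."""
    ranked = sorted(points, key=lambda p: math.dist(query, p[:2]))
    return mean(value for _, _, value in ranked[:k])

# Hypothetical cars: (x, y, price)
cars = [(1, 1, 10000), (2, 3, 18000), (5, 4, 20000), (6, 6, 30000), (4, 6, 22000)]
star = (4, 4)

print(knn_regress(cars, star, 3))  # 20000
```

The three nearest cars cost 20000, 22000 and 18000, and their mean is 20000.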

And there we have it: we've predicted both the class and the value of an unknown data point from a data set.

Closing Thoughts

This might not seem like ML, but this is a very commonly used ML algorithm.

ML is really easy when you have a solid understanding of your MATH.

Thanks for the read

Top comments (4)

MirAli Mobasheri

Thanks for the visualization👍

Nukala Suraj

Glad it was of help

Matt Curcio

Good starter article!

Nukala Suraj

😁
wanted to make it as simple as possible