What are the criteria for setting the number of clusters? Is it just picked by vibes, or is there a standard approach to selecting it?
These are very valid questions to ask as a machine learning expert. In this article we will look at the best way to pick the number of clusters.
A widely accepted method for picking the number of clusters is the elbow method.
Remember that clustering is about:
Minimizing the distance between points in a cluster
Maximizing the distance between clusters
For k-means, these two happen at the same time. The distance between points in a cluster is measured using the within-cluster sum of squares, or WCSS.
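To make the definition concrete, here is a minimal sketch of how WCSS could be computed by hand: for every point, take the squared Euclidean distance to the centroid of its cluster, then add everything up. The function name `wcss` and the tiny dataset are made up for illustration; they are not from any library.

```python
def wcss(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its cluster's centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        center = centroids[label]
        total += sum((p - c) ** 2 for p, c in zip(point, center))
    return total


# Hypothetical toy data: two clusters of two points each.
points = [(1.0, 1.0), (1.0, 3.0), (8.0, 8.0), (10.0, 8.0)]
labels = [0, 0, 1, 1]                    # cluster index of each point
centroids = [(1.0, 2.0), (9.0, 8.0)]     # mean of each cluster

print(wcss(points, labels, centroids))   # prints 4.0
```

Every point here sits exactly one unit away from its centroid, so each contributes 1 and the total is 4. If you use scikit-learn, a fitted `KMeans` model reports this same quantity as its `inertia_` attribute.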
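WCSS is a measure developed within the ANOVA framework. For a fixed number of clusters, the lower the WCSS, the tighter the clusters. But WCSS keeps shrinking as we add more clusters, and it hits zero when every point is its own cluster, so minimizing it alone can't pick the number for us. Instead, we plot WCSS against the number of clusters and look for the elbow: the point where the curve bends and adding another cluster stops giving a big drop. The elbow of the graph shows the best number of clusters to use.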
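The elbow search can be sketched in a few lines of code. The example below is a rough illustration, not a production implementation: it runs a basic k-means (Lloyd's algorithm) in plain Python for k = 1 to 6 and prints the WCSS for each k. The helper names (`squared_distance`, `mean`, `kmeans_wcss`), the restart and iteration counts, and the toy dataset with three obvious groups are all assumptions made for this sketch.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_wcss(points, k, iterations=50, restarts=5, seed=0):
    """Run a basic k-means several times and return the lowest WCSS found."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(restarts):
        # Start from k distinct data points chosen at random.
        centroids = rng.sample(points, k)
        for _ in range(iterations):
            # Assignment step: put each point in the cluster of its nearest centroid.
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
                clusters[nearest].append(p)
            # Update step: move each centroid to the mean of its cluster
            # (an empty cluster keeps its previous centroid).
            new_centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
            if new_centroids == centroids:
                break
            centroids = new_centroids
        total = sum(min(squared_distance(p, c) for c in centroids) for p in points)
        best = min(best, total)
    return best


# Hypothetical toy data: three well-separated groups of four points.
points = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
    (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0),
    (20.0, 0.0), (21.0, 0.0), (20.0, 1.0), (21.0, 1.0),
]

results = {k: kmeans_wcss(points, k) for k in range(1, 7)}
for k, value in results.items():
    print(k, round(value, 2))
```

On data like this, you would expect the WCSS to fall sharply up to k = 3 and then flatten out, which is the elbow. In practice you would more likely fit scikit-learn's `KMeans` for each k, collect `inertia_`, and plot the values to find the bend by eye.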
The screenshot below shows the best solution for this: