DEV Community

Rajaniraiyn R
Rajaniraiyn R

Posted on

AHC Clustering Algorithm Demystified

AHC stands for Agglomerative Hierarchical Clustering, which is a hierarchical clustering algorithm that creates a tree of clusters by merging smaller clusters into larger ones. The algorithm does not require specifying the number of clusters in advance, unlike k-means or GMM. Instead, it allows choosing any number of clusters by cutting the tree at the right level.


Step 1: Treat each data point as a singleton cluster and calculate the distance matrix between all pairs of clusters.
Step 2: Find the pair of clusters with the smallest distance and merge them into a new cluster. Update the distance matrix by removing the rows and columns of the merged clusters and adding a row and column for the new cluster.
Step 3: Repeat step 2 until there is only one cluster left, which is the root of the tree.


# Input: data points X, distance measure D
# Output: cluster tree T

# Step 1: Initialize n singleton clusters and n X n distance matrix
C = singleton_clusters(X)
M = distance_matrix(C, D)

# Initialize cluster tree
T = empty_tree()

# Loop until there is only one cluster left
while C.size > 1:

    # Step 2: Find the pair of clusters with the smallest distance and merge them
    i, j = argmin(M)
    new_cluster = merge(C[i], C[j])

    # Update the distance matrix
    M = update_matrix(M, i, j, new_cluster, D)

    # Update the cluster list
    C = update_list(C, i, j, new_cluster)

    # Step 3: Add the merged cluster to the tree
    T = add_node(T, new_cluster)

# Return the final cluster tree
return T
Enter fullscreen mode Exit fullscreen mode


  • It can capture the hierarchy and structure of the data, unlike k-means or GMM which produce flat clusters.
  • It can handle clusters with different shapes and sizes, unlike k-means which assumes spherical clusters.
  • It can choose any number of clusters by cutting the tree at the right level, unlike k-means or GMM which need k as an input.


  • It is computationally expensive, as it requires calculating and updating a large distance matrix at each iteration.
  • It is sensitive to noise and outliers, as they can affect the distance measure and the merging process.
  • It is difficult to handle different cluster densities, as it may merge clusters too early or too late depending on the distance measure.

References and Further Reading

This will be a multipart series which will follow up with more clustering algorithms with their working, pseudocode, advantages and disadvantages.

Please stay tuned for more such content.

If you liked this post, please share it with your friends and fellow developers. And don’t forget to follow us for more programming tutorials and examples! 😊

And also,
have a look👀 @ my Portfolio
code👨‍💻 together @ Github
connect🔗 @ LinkedIn

Top comments (0)