Does Anyone Know of Any *Discrete* Clustering Algorithms?


I'm doing some research into clustering algorithms and every source I seem to find discusses 2D (or higher-dimensional) clustering of continuous data. The nearest thing I've found to what I'm looking for is this article which discusses discrete-continuous clustering (where the x and y axes are quantized into cells, but the z axis is allowed to vary continuously).

Has anyone come across any algorithms which perform cluster analysis of purely discrete data? Specifically 2D?


How about single-linkage clustering or complete-linkage clustering? Both are forms of hierarchical clustering; you just have to choose a distance metric that works on the grid, such as Manhattan distance.
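A minimal sketch of the single-linkage idea on integer grid cells, in plain Python. Cutting a single-linkage dendrogram at a threshold gives the same clusters as linking every pair of points whose distance is at most that threshold, which is what this does. The function names, the sample cells, and the threshold are made up for illustration.

```python
from itertools import combinations


def manhattan(p, q):
    """Manhattan (city-block) distance between two 2D grid cells."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def single_linkage(points, threshold):
    """Clusters from cutting a single-linkage dendrogram at `threshold`.

    Two points share a cluster if a chain of points connects them in
    which each consecutive pair is at most `threshold` apart.
    """
    parent = list(range(len(points)))  # union-find forest over point indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if manhattan(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i, p in enumerate(points):
        clusters.setdefault(find(i), []).append(p)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical occupied cells on an integer grid.
cells = [(0, 0), (0, 1), (1, 1), (5, 5), (5, 6), (9, 0)]
clusters = single_linkage(cells, threshold=1)
print(clusters)
```

With a threshold of 1, only cells that share an edge are linked, so this behaves like 4-connected component labelling. Complete linkage would need a real agglomerative merge loop instead of this shortcut.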

Actually, shouldn't it be possible to adapt any clustering algorithm? Take k-means as an example: first, choose an appropriate distance metric as above; second, adjust the calculation of the prototypes so that each prototype is a point on the grid.
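One way to sketch that adaptation, under some assumptions: use Manhattan distance for the assignment step, and replace the mean with the coordinate-wise lower median, which minimizes total Manhattan distance and always lands on an integer grid coordinate. Strictly speaking this makes it a k-medians variant rather than k-means. The function name, the starting prototypes, and the sample cells are hypothetical.

```python
from statistics import median_low


def manhattan(p, q):
    """Manhattan (city-block) distance between two 2D grid cells."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def grid_kmedians(points, prototypes, max_iter=100):
    """Lloyd-style iteration with Manhattan distance and grid prototypes."""
    prototypes = list(prototypes)
    groups = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest prototype.
        groups = [[] for _ in prototypes]
        for p in points:
            nearest = min(range(len(prototypes)),
                          key=lambda k: manhattan(p, prototypes[k]))
            groups[nearest].append(p)

        # Update step: the lower median of each coordinate is always one
        # of the observed integer values, so the prototype stays on the grid.
        updated = []
        for group, old in zip(groups, prototypes):
            if group:
                updated.append((median_low([x for x, _ in group]),
                                median_low([y for _, y in group])))
            else:
                updated.append(old)  # keep an empty cluster's prototype

        if updated == prototypes:  # converged
            break
        prototypes = updated
    return prototypes, groups


# Hypothetical grid cells and hand-picked starting prototypes.
cells = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8), (8, 9), (9, 8)]
prototypes, groups = grid_kmedians(cells, prototypes=[(0, 0), (9, 9)])
print(prototypes)
```

If the prototype has to be an actual data point rather than just any grid cell, a k-medoids update (pick the member with the smallest total distance to the rest of its cluster) would be the closer fit.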

 

Can you be a bit more specific about what your data looks like? Are x and y categorical features and z continuous? At some point I had an SO thread about combining data-specific distance functions in a nearest-neighbor search. I can't find it anymore, but it would be sort of like def custom_distance(a, b): return dice(a_categorical, b_categorical) + euclidean(a_continuous, b_continuous), where dice and euclidean are the functions in scipy.spatial.distance.
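A rough, dependency-free sketch of that combined distance, assuming each record stores its boolean categorical features first and its continuous features after them. The `n_categorical` parameter and the sample records are hypothetical, and the two helpers are hand-written stand-ins for the SciPy functions of the same names.

```python
import math


def dice(u, v):
    """Dice dissimilarity between two equal-length boolean vectors."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    differ = sum(1 for a, b in zip(u, v) if bool(a) != bool(b))
    denom = 2 * both + differ
    # Treating two all-False vectors as identical is a choice made here.
    return differ / denom if denom else 0.0


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def custom_distance(x, y, n_categorical):
    """Dice on the first `n_categorical` entries plus Euclidean on the rest."""
    return (dice(x[:n_categorical], y[:n_categorical])
            + euclidean(x[n_categorical:], y[n_categorical:]))


# Hypothetical records: three boolean features followed by one continuous one.
a = (1, 0, 1, 2.0)
b = (1, 1, 0, 5.0)
d = custom_distance(a, b, n_categorical=3)
print(d)
```

The two terms live on different scales (Dice is bounded by 1, Euclidean is not), so in practice the continuous features probably need rescaling or a weight before the sum is meaningful.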

It looks sort of like: members.cbio.mines-paristech.fr/~j...

Found it! Hopefully something in this thread is helpful.
datascience.stackexchange.com/ques...


Andrew
Got a Ph.D. looking for dark matter, but not finding any. Now I code full-time. Je parle un peu français. He/Him. dogs > cats