The Ultimate Categorization of Clustering Algorithms
6 types of clustering algorithms in a single frame.
Clustering is one of the core branches of unsupervised learning in ML.
It involves grouping data points together based on their inherent patterns or characteristics.
By identifying similarities (and dissimilarities) within a dataset, clustering helps in revealing underlying structures, discovering hidden patterns, and gaining insights into the data.
While centroid-based clustering is the most common class, there's a whole world of algorithms beyond it that we should all be aware of.
Centroid-based: Cluster data points based on proximity to centroids.
Connectivity-based: Cluster points by successively merging (or splitting) groups based on the proximity between clusters.
Density-based: Cluster points based on the density of the regions they lie in.
Graph-based: Cluster points based on graph distance.
Distribution-based: Cluster points based on their likelihood of belonging to the same distribution.
Compression-based: Transform data to a lower-dimensional space and then perform clustering.
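To make the six categories concrete, here is a minimal sketch that fits one scikit-learn estimator per category on the same toy dataset. The pairing of estimator to category is my own illustrative choice (for example, PCA followed by KMeans standing in for "compression-based"), and the blob centers, `eps`, and `n_neighbors` values are assumptions picked for this toy data, not recommendations.

```python
from sklearn.cluster import (
    AgglomerativeClustering,
    DBSCAN,
    KMeans,
    SpectralClustering,
)
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture
from sklearn.pipeline import make_pipeline

# Toy data (assumption): three well-separated blobs in 5 dimensions.
centers = [
    [0, 0, 0, 0, 0],
    [10, 10, 10, 10, 10],
    [-10, 10, -10, 10, -10],
]
X, y_true = make_blobs(
    n_samples=300, centers=centers, cluster_std=0.5, random_state=0
)

# One illustrative estimator per category; hyperparameters are guesses for this data.
models = {
    "centroid (KMeans)": KMeans(n_clusters=3, n_init=10, random_state=0),
    "connectivity (Agglomerative)": AgglomerativeClustering(
        n_clusters=3, linkage="ward"
    ),
    "density (DBSCAN)": DBSCAN(eps=1.5, min_samples=5),
    "graph (Spectral)": SpectralClustering(
        n_clusters=3, affinity="nearest_neighbors", n_neighbors=10, random_state=0
    ),
    "distribution (GaussianMixture)": GaussianMixture(
        n_components=3, random_state=0
    ),
    # Compress to 2 dimensions first, then cluster in the reduced space.
    "compression (PCA + KMeans)": make_pipeline(
        PCA(n_components=2),
        KMeans(n_clusters=3, n_init=10, random_state=0),
    ),
}

# Every estimator above exposes fit_predict, which returns one label per point.
labels = {name: model.fit_predict(X) for name, model in models.items()}

# Compare each labeling to the known blob assignment (1.0 means identical grouping).
for name, pred in labels.items():
    print(f"{name:32s} ARI = {adjusted_rand_score(y_true, pred):.2f}")
```

Note that DBSCAN is the odd one out here: it does not take the number of clusters as input and may label some points as noise (`-1`). On real data, where the groups are not this cleanly separated, the six approaches can disagree substantially, which is the reason to know more than one of them.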
Over to you: What other clustering algorithms would you include here?
This post was inspired by the clustering documentation of scikit-learn (Sklearn). If you wish to learn more, I highly recommend reading it: Sklearn Clustering Guide.
Find the code for my tips here: GitHub.