Clustering: A conceptual approach

Sergei Vassilvitskii and Suresh Venkatasubramanian

An easy definition of the problem is:

Clustering is the process of grouping items into clusters, so that items in the same cluster are similar to each other.

Each underlined word in the above definition is subject to interpretation and design choice. The choices the modeler makes determine what clustering problem she ends up with, what kind of patterns she will be looking for, and what kinds of algorithms she will use.
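As a small illustration of how one such choice changes the outcome, here is a minimal sketch in which the only thing that varies is the notion of "similar". The helper names (`euclidean`, `manhattan`, `assign`), the centers, and the points are all hypothetical and chosen for illustration. They are not taken from the book, and nearest-center assignment is just one simple way to group items.

```python
import math

def euclidean(p, q):
    # Straight-line distance between two 2D points.
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan(p, q):
    # Sum of absolute coordinate differences between two 2D points.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def assign(points, centers, distance):
    """Group each point under the index of its closest center.

    `distance` is the modeler's chosen notion of (dis)similarity.
    """
    clusters = {i: [] for i in range(len(centers))}
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: distance(p, centers[i]))
        clusters[nearest].append(p)
    return clusters

# Hypothetical data: two fixed centers and three items to group.
centers = [(3, 3), (5, 0)]
points = [(0, 0), (4, 4), (6, -1)]

by_euclidean = assign(points, centers, euclidean)
by_manhattan = assign(points, centers, manhattan)

print(by_euclidean)
print(by_manhattan)
```

With these particular values, the point (0, 0) is grouped with the first center under Euclidean distance and with the second under Manhattan distance, so the same items and the same centers yield two different clusterings.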

In this book, we will focus on a conceptual understanding of clustering. We will explain how one might make design choices in clustering, and what those choices mean for the patterns one is looking for.

A first draft (8/30/2018)

We have a draft of part I of the book! Comments would be greatly appreciated: please email them to All suggestions will be acknowledged in the book when we're done.

(This book started as an occasional series of essays on clustering.)
