Bisecting k-Means is like a combination of k-Means and hierarchical clustering.
It starts with all objects in a single cluster.
The pseudocode of the algorithm is displayed below:
Basic Bisecting K-means Algorithm for finding K clusters
- Pick a cluster to split.
- Find 2 sub-clusters using the basic k-Means algorithm (Bisecting step)
- Repeat step 2, the bisecting step, for ITER times and take the split that produces the clustering with the highest overall similarity.
- Repeat steps 1, 2 and 3 until the desired number of clusters is reached.
The critical part is which cluster to choose for splitting. And there are different ways to proceed, for example, you can choose the biggest cluster or the cluster with the worst quality or a combination of both.
Source: "A comparison of document clustering techniques", M. Steinbach, G. Karypis and V. Kumar. Workshop on Text Mining, KDD, 2000. [pdf]
Source: "A comparison of document clustering techniques", M. Steinbach, G. Karypis and V. Kumar. Workshop on Text Mining, KDD, 2000. [pdf]