K-Means: (20 points) Recall that the K-means algorithm minimizes the squared
Euclidean distance from each point to its cluster mean (the
objective function is given on (8.1) on p.500 of the book).
Assume you are given a clustering with k clusters. In this
problem, you will construct an algorithm that will extend the
given clustering to a
clustering with (k+1) clusters from this clustering.
One way to create a mean for the cluster (k+1) is to choose a vector at
random as the mean. Show that if you choose such a vector and
perform a half step of k-means (the assignment step - step
(3) from Algorithm 8.1 in the book), then the k-means objective
function will not increase (i.e. it will either decrease or
stay the same).
If you continue running the algorithm over the (k+1) clusters, the final
objective function will be smaller than for that of the
original clustering over k points. Argue why this is true.
Give high level pseudo-code for an algorithm that starts with k = 1 clusters
and iteratively increases the number of clusters, using the
scheme given above.
Do you think your algorithm will work well in practice? Why or why not?
Can you suggest any improvements?