Defining cluster centres in SPSS K-means cluster

A student asked how to define initial cluster centres in SPSS K-means clustering, and it proved surprisingly hard to find a reference to this online. It turns out to be very easy, but I’m posting it here to save everyone else the trouble of working it out from scratch.

SPSS offers both Hierarchical clustering and K-means clustering. K-means clustering is often used to ‘fine tune’ the results of Hierarchical clustering, taking the cluster solution from Hierarchical clustering as its input.

The easiest way to set this up is to read the cluster centres in from an external SPSS data file; the problem is finding out how this data file should be formatted.

The answer is that SPSS requires one row of data for each cluster, and one column of cluster means for each variable. The first column must be called CLUSTER_ and simply holds the cluster number for each row. So for a two-cluster solution with five variables it should look like this:

CLUSTER_ Var_A Var_B Var_C Var_D Var_E
1 2.99 3.00 2.99 2.83 2.87
2 2.15 2.72 2.13 1.87 2.52
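As a rough illustration of how such a table could be produced, here is a minimal Python sketch that averages each variable within each cluster and writes the result with CLUSTER_ as the first column. It assumes the cases carry a cluster membership column saved from the hierarchical analysis; the column name CLU2_1, the function names and the example values are made up for illustration, and the CSV output would still need to be opened in SPSS and saved as an SPSS data file.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def cluster_centres(rows, membership_col, variables):
    """Return one dict per cluster: CLUSTER_ plus the mean of each variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[int(row[membership_col])].append(row)
    centres = []
    for cluster in sorted(groups):
        centre = {"CLUSTER_": cluster}
        for var in variables:
            centre[var] = mean(float(r[var]) for r in groups[cluster])
        centres.append(centre)
    return centres


def write_centres(centres, variables, handle):
    """Write the centres as CSV with CLUSTER_ as the first column."""
    writer = csv.DictWriter(handle, fieldnames=["CLUSTER_"] + list(variables))
    writer.writeheader()
    writer.writerows(centres)


# Made-up cases; CLU2_1 is an assumed name for the saved membership column.
cases = [
    {"CLU2_1": 1, "Var_A": 3.0, "Var_B": 3.5},
    {"CLU2_1": 1, "Var_A": 2.0, "Var_B": 2.5},
    {"CLU2_1": 2, "Var_A": 1.0, "Var_B": 4.0},
]
variables = ["Var_A", "Var_B"]
centres = cluster_centres(cases, "CLU2_1", variables)

buffer = io.StringIO()
write_centres(centres, variables, buffer)
print(buffer.getvalue())
```

The same idea extends to any number of clusters and variables: one output row per cluster, one column per clustering variable.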

The K-means clustering procedure can then be pointed to this file by ticking the Cluster Centers ‘Read initial’ option and telling SPSS where the ‘External data file’ is saved. Note that the ‘Number of Clusters’ also has to be set to the same number as defined in the data file.
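Because the number of clusters in the dialog has to agree with the file, it can be worth checking the table before pointing SPSS at it. The sketch below is one possible check, assuming the centres have been exported to CSV text; the function name check_centres_file is hypothetical and this is not part of SPSS itself.

```python
import csv
import io


def check_centres_file(handle, n_clusters, variables):
    """Return a list of problems found in a centres table (empty if none)."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    problems = []
    # The first column has to be the cluster number, named CLUSTER_.
    if not header or header[0] != "CLUSTER_":
        problems.append("first column must be CLUSTER_")
    # Every clustering variable needs its own column of means.
    missing = [v for v in variables if v not in header]
    if missing:
        problems.append("missing variables: " + ", ".join(missing))
    # One row per cluster, matching the 'Number of Clusters' setting.
    rows = list(reader)
    if len(rows) != n_clusters:
        problems.append(f"{len(rows)} rows but {n_clusters} clusters requested")
    return problems


table = """CLUSTER_,Var_A,Var_B,Var_C,Var_D,Var_E
1,2.99,3.00,2.99,2.83,2.87
2,2.15,2.72,2.13,1.87,2.52
"""
variables = ["Var_A", "Var_B", "Var_C", "Var_D", "Var_E"]

ok = check_centres_file(io.StringIO(table), 2, variables)
mismatch = check_centres_file(io.StringIO(table), 3, variables)
print(ok)
print(mismatch)
```

With the two-cluster table from this post, asking for two clusters reports no problems, while asking for three flags the row count mismatch.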

See Jane Clatworthy’s paper here for further details on different clustering methods.


6 responses to “Defining cluster centres in SPSS K-means cluster”

  1. I am so grateful for this explanation – you have no idea! It has been impossible to find this information anywhere else. I still have a query, though, in regard to the initial cluster centers. Are these the mean values of the variables (used for clustering) for each cluster obtained in the hierarchical cluster analysis? I’m using Ward’s method and squared Euclidean distance. Thank you in advance if you have time to respond.

  2. Thank you very much for posting this!

  3. I was breaking my head trying to figure out how to place the initial cluster centers into a file. Your article saved me. Thank you very much.

  4. Thank you very much! It’s very useful!
