MATLAB_KMEANS
Data Clustering with MATLAB's KMEANS() Function


MATLAB_KMEANS is a MATLAB library which illustrates how MATLAB's kmeans() command can be used to handle the K-Means problem, which organizes a set of N points in M dimensions into K clusters.

Because kmeans() is supplied as a MATLAB M-file (it is part of the Statistics Toolbox rather than a compiled built-in), you can examine its source code by starting MATLAB and then typing

edit kmeans
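
As a rough illustration of the kind of call this library demonstrates, the sketch below clusters N points in M dimensions into K clusters. It is not taken from the library's own sample code: the data is synthetic, the variable names and the values of M and K are invented for the example, and it assumes the Statistics and Machine Learning Toolbox (formerly the Statistics Toolbox) is installed.

```matlab
% Minimal sketch: cluster N points in M dimensions into K clusters.
% Assumption: the Statistics and Machine Learning Toolbox is available.
% The data and the values of m and k are made up for illustration.

rng ( 123456789 );            % fix the seed so the run is repeatable

m = 2;                        % spatial dimension
k = 3;                        % number of clusters

% Build three loose blobs; kmeans() expects one point per ROW (N by M).
x = [ randn( 40, m );
      randn( 30, m ) + 5.0;
      randn( 30, m ) + repmat( [ 5.0, -5.0 ], 30, 1 ) ];

n = size ( x, 1 );            % number of points

% idx(i)  is the cluster index of point i;
% c(j,:)  is the center of cluster j;
% sumd(j) is the within-cluster sum of point-to-center distances
%         (squared Euclidean distance by default).
[ idx, c, sumd ] = kmeans ( x, k, 'Replicates', 5 );

fprintf ( 1, '  Clustered %d points in %d dimensions.\n', n, m );
fprintf ( 1, '  Total of within-cluster sums = %g\n', sum ( sumd ) );
for j = 1 : k
  fprintf ( 1, '  Cluster %d has %d points\n', j, sum ( idx == j ) );
end
```

The 'Replicates' option reruns the algorithm from several starting configurations and keeps the best result, which helps because K-Means can settle into a local minimum that depends on the initial centers.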

Licensing:

The computer code and data files described and made available on this web page are distributed under the GNU LGPL license.

Languages:

MATLAB_KMEANS is available in a MATLAB version.

Related Data and Programs:

ASA058, a MATLAB library which implements the K-means algorithm of Sparks.

ASA136, a MATLAB library which implements the Hartigan and Wong clustering algorithm.

CITIES, a MATLAB library which handles various problems associated with a set of "cities" on a map.

CITIES, a dataset directory which contains sets of data defining groups of cities.

IMAGE_QUANTIZATION, a MATLAB library which demonstrates how the KMEANS algorithm can be used to reduce the number of colors or shades of gray in an image.

KMEANS, a MATLAB library which contains several different algorithms for the K-Means problem, which organizes a set of N points in M dimensions into K clusters.

KMEANS_FAST, a MATLAB library, by Charles Elkan, which contains several different algorithms for the K-Means problem, which organizes a set of N points in M dimensions into K clusters.

LORENZ_CLUSTER, a MATLAB library which takes a set of N points on a trajectory of solutions to the Lorenz equations, and applies the K-means algorithm to organize the data into K clusters.

MATLAB_KMEANS, MATLAB programs which illustrate the use of MATLAB's kmeans() function for clustering a set of N points in M dimensions into K clusters.

SAMMON_DATA, a MATLAB program which generates six sets of M-dimensional data for cluster analysis.

SPAETH, a dataset directory which contains a set of test data.

SPAETH2, a dataset directory which contains a set of test data.

Reference:

  1. John Hartigan, Manchek Wong,
    Algorithm AS 136: A K-Means Clustering Algorithm,
    Applied Statistics,
    Volume 28, Number 1, 1979, pages 100-108.
  2. Wendy Martinez, Angel Martinez,
    Computational Statistics Handbook with MATLAB,
    Chapman and Hall / CRC, 2002.
  3. David Sparks,
    Algorithm AS 58: Euclidean Cluster Analysis,
    Applied Statistics,
    Volume 22, Number 1, 1973, pages 126-130.

Examples and Tests:

There are data files read by the sample code:


Last revised on 01 September 2013.