SPAETH2
Cluster Analysis Tools
SPAETH2 is a FORTRAN90 library which analyzes data by grouping it into clusters.
The current implementation of the code is "under development": some things work, and some don't.
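To make "grouping data into clusters" concrete, here is a minimal sketch of one common approach: assign every point to its nearest center, recompute each center as the mean of its members, and repeat until nothing changes. It is written in Python for brevity and is not taken from the SPAETH2 source; the names `nearest_center` and `batch_cluster`, the squared Euclidean distance, and the stopping rule are all assumptions made for illustration.

```python
# Illustrative sketch only; hypothetical names, not the SPAETH2 Fortran code.

def nearest_center(point, centers):
    """Return the index of the center closest to point (squared Euclidean distance)."""
    best_index = 0
    best_dist = None
    for j, c in enumerate(centers):
        d = sum((p - q) ** 2 for p, q in zip(point, c))
        if best_dist is None or d < best_dist:
            best_index, best_dist = j, d
    return best_index


def batch_cluster(points, centers, max_iter=100):
    """Alternate assignment and centroid update until assignments stop changing."""
    centers = [list(c) for c in centers]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest_center(p, centers) for p in points]
        if new_assignment == assignment:
            break  # no point changed cluster, so the iteration has converged
        assignment = new_assignment
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                dim = len(members[0])
                centers[j] = [sum(m[k] for m in members) / len(members)
                              for k in range(dim)]
    return assignment, centers


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (9.0, 9.0), (10.0, 9.0), (9.0, 10.0)]
assignment, centers = batch_cluster(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(assignment)  # [0, 0, 0, 1, 1, 1]
```

The result depends on the starting centers, so a different initial guess may produce a different grouping.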
Licensing:
The computer code and data files made available on this web page are distributed under the GNU LGPL license.
Languages:
SPAETH2 is available in a FORTRAN90 version.
Related Data and Programs:
ASA058, a FORTRAN90 library which implements the K-means algorithm of Sparks.
ASA136, a FORTRAN90 library which implements the Hartigan and Wong clustering algorithm.
CITIES, a dataset directory which contains sets of information about cities and the distances between them.
CITIES, a FORTRAN90 library which handles various problems associated with a set of "cities" on a map.
KMEANS, a FORTRAN90 library which contains several different algorithms for the K-Means problem.
LAU_NP, a FORTRAN90 library which implements heuristic algorithms for various NP-hard combinatorial problems.
POINT_MERGE, a FORTRAN90 library which considers N points in M dimensional space, and counts or indexes the unique or "tolerably unique" items.
SPAETH, a FORTRAN90 library which can cluster data according to various principles.
SPAETH, a dataset directory which contains datasets for cluster analysis.
SPAETH2, a dataset directory which contains datasets for cluster analysis.
Reference:
- Helmuth Spaeth, Cluster Analysis Algorithms for Data Reduction and Classification of Objects, Ellis Horwood, 1980, QA278 S6813.
- Helmuth Spaeth, Cluster Dissection and Analysis, Theory, FORTRAN Programs, Examples, Ellis Horwood, 1985, QA278 S68213.
Source Code:
Examples and Tests:
List of Routines:
- CH_CAP capitalizes a single character.
- CH_EQI is a case insensitive comparison of two characters for equality.
- CH_TO_DIGIT returns the integer value of a base 10 digit.
- CLUDIA clusters data for which a distance matrix has been supplied.
- CLUSTA solves the multiple location problem in N dimensions.
- CLUSTER_CENTROIDS determines the centroids of a clustering.
- CLUSTER_MEDIANS determines the medians of a clustering.
- CLUSTER_MEDIAN_DISTANCE finds the cluster median distance.
- CLUSTER_POPULATION sets the cluster populations from the assignment array.
- CLUSTER_VARIANCE determines the variances associated with a clustering.
- COLPER seeks a column permutation which maximizes the "bond energy".
- DATA_D_READ reads a real data set stored in a file.
- DATA_D_PRINT prints a real data set.
- DATA_D_SHOW makes a typewriter plot of a real data set.
- DATA_SIZE counts the size of a data set stored in a file.
- DIF_INVERSE returns the inverse of the second difference matrix.
- DISMEA constructs a set of hierarchical clusters.
- DIVGOW constructs a set of hierarchical clusters by doubling.
- EMEANS clusters data using a variant of the K-Means algorithm for L1 norms.
- GET_UNIT returns a free FORTRAN unit number.
- HIERCL implements seven agglomerative hierarchical clustering methods.
- HMEANS clusters data using the H-Means algorithm.
- I4_FACTORIAL computes the factorial N!
- I4_SWAP swaps two integer values.
- I4VEC_INDICATOR sets an integer vector to the indicator vector A(I)=I.
- I4VEC_PERML generates permutations of a vector in lexicographic order.
- I4VEC_PERMS generates permutations of a vector in lexicographic order.
- JOINER uses a very simple cluster assignment algorithm.
- KMEANS clusters data using the K-Means algorithm.
- LEADER uses a very simple cluster assignment algorithm.
- LINKER constructs a minimal tree for a symmetric distance matrix.
- ORDERED clusters one-dimensional ordered data into NC clusters.
- PROFILE seeks an optimal variable ordering for a set of data.
- R8_SWAP swaps two R8's.
- R8MAT_DET computes the determinant of an R8MAT.
- R8MAT_PRINT prints an R8MAT.
- R8VEC_ASCENDS determines if a double precision vector is (weakly) ascending.
- R8VEC_SORT_BUBBLE_A ascending bubble sorts an R8VEC.
- RANDP randomly partitions a set of M items into N clusters.
- S_TO_R8 reads an R8 from a string.
- S_WORD_COUNT counts the number of "words" in a string.
- STANDN solves the single location problem in N dimensions.
- TIMESTAMP prints the current YMDHMS date as a time stamp.
- TRANSF transforms a data set to have zero mean and unit variance.
- URAND returns a pseudo-random number uniformly distributed in [0,1].
- WMEANS clusters data using the determinant criterion.
- ZWEIGO organizes a set of data into two clusters.
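Several entries in the list above describe simple bookkeeping on a clustering: rescaling data to zero mean and unit variance, and computing populations, centroids, and variances from an assignment array. The Python sketch below shows one way such quantities might be computed. It is not the library's Fortran code, and the function names, argument order, 0-based cluster labels, and the choice to divide by N rather than N-1 in the variance are all assumptions.

```python
import math

# Illustrative sketch only; hypothetical names, not the SPAETH2 routines.

def standardize(data):
    """Shift and scale each column to zero mean and unit variance.

    Assumption: population variance (divide by n); a constant column
    is only centered, to avoid dividing by zero.
    """
    n = len(data)
    dim = len(data[0])
    means = [sum(row[k] for row in data) / n for k in range(dim)]
    stds = []
    for k in range(dim):
        var = sum((row[k] - means[k]) ** 2 for row in data) / n
        stds.append(math.sqrt(var) if var > 0.0 else 1.0)
    return [[(row[k] - means[k]) / stds[k] for k in range(dim)] for row in data]


def cluster_summary(data, assignment, cluster_num):
    """Return (population, centroid, variance) lists, one entry per cluster."""
    dim = len(data[0])
    population = [0] * cluster_num
    centroid = [[0.0] * dim for _ in range(cluster_num)]
    # Count members and accumulate coordinate sums per cluster.
    for row, j in zip(data, assignment):
        population[j] += 1
        for k in range(dim):
            centroid[j][k] += row[k]
    for j in range(cluster_num):
        if population[j] > 0:
            centroid[j] = [s / population[j] for s in centroid[j]]
    # Variance here is the mean squared distance of members to their centroid.
    variance = [0.0] * cluster_num
    for row, j in zip(data, assignment):
        variance[j] += sum((row[k] - centroid[j][k]) ** 2 for k in range(dim))
    for j in range(cluster_num):
        if population[j] > 0:
            variance[j] /= population[j]
    return population, centroid, variance


data = [[1.0, 2.0], [3.0, 2.0], [10.0, 0.0], [12.0, 4.0]]
assignment = [0, 0, 1, 1]
population, centroid, variance = cluster_summary(data, assignment, 2)
print(population)  # [2, 2]
print(centroid)    # [[2.0, 2.0], [11.0, 2.0]]
print(variance)    # [1.0, 5.0]
scaled = standardize(data)
```

Standardizing first can matter when the columns are measured in very different units, since distance-based clustering otherwise lets the largest-scale column dominate.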
Last revised on 13 November 2006.