MATMUL
A Matrix Multiplication Benchmark
MATMUL
is a C program which
compares various methods for computing the matrix product
A * B = C.
MATMUL can do this for a variety of matrix sizes, and for different
arithmetic types (real, complex, double precision, integer, even logical!).
There are many algorithms built in, including the
simple triple DO loop (actually not so simple; there are 6 ways to
set it up), some unrolling techniques, and the level 1 and 2 BLAS
routines.
MATMUL is interactive, so the user can easily pursue any
line of inquiry that seems promising. New algorithms or locally
available methods are not too hard to add.
Licensing:
The computer code and data files described and made available on this web page
are distributed under
the GNU LGPL license.
Languages:
MATMUL is available in
a C version and
a FORTRAN90 version.
Related Data and Programs:
LINPACK_BENCH,
a C program which
measures the time needed to factor and solve a linear system.
MDBNCH,
a FORTRAN77 program which
is a benchmark code for a molecular dynamics calculation.
MEMORY_TEST,
a C program which
declares and uses a sequence of larger
and larger vectors, to see how big a vector can be used on a given
machine and compiler.
MXV,
a C program which
compares the performance of (DO I, DO J) loops, (DO J, DO I) loops,
and MATMUL for computing the product of an MxN matrix A and an N vector X.
SUM_MILLION,
a C program which
sums the integers from 1 to 1,000,000, as a demonstration of how
to rate a computer's speed.
TIMER,
a C program which
demonstrates how to compute CPU time or elapsed time.
Reference:

John Burkardt, Paul Puglielli,
Pittsburgh Supercomputing Center,
MATMUL: An Interactive Matrix Multiplication Benchmark
List of Routines:

MAIN is the main program for MATMUL.

CH_CAP capitalizes a single character.

C4_IJK computes A = B*C using index order IJK and complex arithmetic.

C4_MATMUL computes A = B*C using FORTRAN90 MATMUL and C4 arithmetic.

C4_SET initializes the A, B and C matrices using C4 arithmetic.

DOMETHOD calls a specific multiplication routine.

GETSHO determines what items the user wishes to print out.

HEADER prints out a header for the results.

HELLO says hello to the user.

HELP prints a list of the available commands.

INIT initializes data.

I4_MATMUL computes A = B*C using FORTRAN90 MATMUL and I4 arithmetic.

I4_IJK multiplies A = B*C using index order IJK, using I4 arithmetic.

I46_IJK multiplies A = B*C using index order IJK, and I46 arithmetic.

I4_SET initializes the A, B and C matrices using I4 arithmetic.

L_IJK "multiplies" A = B*C using index order IJK, using logical data.

L_SET initializes the A, B and C matrices using "logical arithmetic".

MATMUL_CPU_TIMER computes total CPU seconds.

MATMUL_REAL_TIMER returns a reading of the real time clock.

MULT carries out the matrix multiplication, using the requested method.

N_GET determines the problem sizes desired by the user.

N_STEP is used when a set of values of N is being generated.

ORDER_GET reads a new value of order from the user.

ORDER_LIST_PRINT prints the list of choices for the algorithm.

PRINTR prints out those parameters the user wants to see.

R4_DOT_PRODUCT multiplies A = B*C using DOT_PRODUCT and R4 arithmetic.

R4_MATMUL computes A = B*C using FORTRAN90 MATMUL and R4 arithmetic.

R4_IJ sets A = B*C using index order IJ with implicit K and R4 arithmetic.

R4_IJK multiplies A = B*C using index order IJK and R4 arithmetic.

R4_IJK_IMPLICIT sets A = B*C, index order IJK, implicit loops, R4 arithmetic.

R4_IJK_M multiplies A = B*C using index order IJK and R4 arithmetic.

R4_IJK_S sets A = B*C, index order IJK, no Cray vectorization, R4 arithmetic.

R4_IJK_I2 sets A = B*C, index order IJK, unrolling I 2 times, R4 arithmetic.

R4_IJK_I4 sets A = B*C, index order IJK, unrolling I 4 times, R4 arithmetic.

R4_IJK_I8 sets A = B*C, index order IJK, unrolling I 8 times, R4 arithmetic.

R4_IJK_J4 sets A = B*C, index order IJK, unrolling on J, R4 arithmetic.

R4_IJK_K4 sets A = B*C, index order IJK, unrolling on K, R4 arithmetic.

R4_IKJ multiplies A = B*C using index order IKJ, R4 arithmetic.

R4_IKJ_DOT multiplies A = B*C, index order IKJ, DOT_PRODUCT, R4 arithmetic.

R4_JIK multiplies A = B*C using index order JIK, R4 arithmetic.

R4_JIK_IMPLICIT sets A = B*C, index order JIK, implicit loops, R4 arithmetic.

R4_JKI multiplies A = B*C using index order JKI, R4 arithmetic.

R4_JKI_IMPLICIT sets A = B*C, index order JKI, implicit loops, R4 arithmetic.

R4_KIJ multiplies A = B*C using index order KIJ, R4 arithmetic.

R4_KIJ_DOT sets A = B*C using index order KIJ and DOT_PRODUCT, R4 arithmetic.

R4_KJI multiplies A = B*C using index order KJI, R4 arithmetic.

R4_KJI_IMPLICIT sets A = B*C, index order KJI, implicit loops, R4 arithmetic.

R4_KJI_M sets A = B*C using index order KJI and multitasking, R4 arithmetic.

R4_MXMA multiplies A = B*C using optimized MXMA, R4 arithmetic.

R4_SAXPYC multiplies A = B*C columnwise, using optimized SAXPY, R4 arithmetic.

R4_SAXPYR multiplies A = B*C "rowwise", using optimized SAXPY, R4 arithmetic.

R4_SDOT multiplies A = B*C using optimized SDOT, R4 arithmetic.

R4_SET initializes the A, B and C matrices using R4 arithmetic.

R4_SGEMM multiplies A = B*C using optimized SGEMM, R4 arithmetic.

R4_SGEMMS multiplies A = B*C using optimized SGEMMS, R4 arithmetic.

R4_TAXPYC sets A = B*C columnwise, using source code SAXPY, R4 arithmetic.

R4_TAXPYR multiplies A = B*C rowwise using source code SAXPY, R4 arithmetic.

R4_TDOT multiplies A = B * C using source code SDOT, R4 arithmetic.

R4_TGEMM multiplies A = B*C using source code SGEMM, R4 arithmetic.

R8_MATMUL computes A = B*C using FORTRAN90 MATMUL and R8 arithmetic.

R8_IJK multiplies A = B*C using index order IJK and R8 arithmetic.

R8_SET initializes the matrices A, B and C using R8 arithmetic.

REPORT reports the results for each multiplication experiment.

S_BLANK_DELETE removes blanks from a string, left justifying the remainder.

S_CAP replaces any lowercase letters by uppercase ones in a string.

S_EQI is a case insensitive comparison of two strings for equality.

S_TO_I4 reads an I4 from a string.

TAXPY is unoptimized standard BLAS routine SAXPY.

TDOT computes the inner product of two vectors.

TERBLA is the source code for the BLAS error handler.

TGEMM is a source code copy of SGEMM, a BLAS matrix * matrix routine.

TGEMVF is a source code copy of BLAS SGEMVF, a matrix * vector routine.

TIMESTAMP prints the current YMDHMS date as a time stamp.

TLSAME is a source code copy of BLAS LSAME, testing character equality.
Last revised on 22 November 2008.