HB Files
Harwell Boeing Sparse Matrix File Format
HB is a data directory which
contains examples of files in the "HB" or "Harwell Boeing"
Sparse Matrix File Format, used to store a sparse matrix in a file.
Note that the Rutherford Boeing format is an updated, more flexible
version of the Harwell Boeing format.
The space required to
represent the matrix is reduced by using a compressed column
storage format. If the matrix is read from the file into memory,
it is common to use the same compressed column storage to
represent the matrix in memory.
A matrix is an m by n rectangular array of numbers.
A sparse matrix is one in which "most" of the entries are zero.
If the matrix is sparse enough, then it is often much more efficient
not to allocate space for the full m*n set of entries, but
rather to keep track of the location and value of the nonzero entries.
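As a rough illustration of compressed column storage, the sketch below converts a small dense matrix into three arrays: column pointers, row indices, and values. The function name to_compressed_column and the example matrix are invented for illustration; the 1-based indexing is an assumption that follows the FORTRAN convention used by HB files.

```python
def to_compressed_column(dense):
    """Convert a dense matrix (a list of rows) to compressed column storage.

    Returns (colptr, rowind, values), all using 1-based indices
    (an assumption matching the FORTRAN orientation of HB files).
    """
    nrow = len(dense)
    ncol = len(dense[0]) if nrow else 0
    colptr = [1]   # colptr[j] is the position where column j starts
    rowind = []    # row index of each stored entry
    values = []    # value of each stored entry
    for j in range(ncol):
        for i in range(nrow):
            if dense[i][j] != 0:
                rowind.append(i + 1)
                values.append(dense[i][j])
        # The next column starts one past the entries stored so far.
        colptr.append(len(values) + 1)
    return colptr, rowind, values


# A made-up 3 by 3 matrix with 5 nonzero entries.
dense = [
    [11.0, 0.0, 13.0],
    [0.0, 22.0, 0.0],
    [31.0, 0.0, 33.0],
]
colptr, rowind, values = to_compressed_column(dense)
print(colptr)   # [1, 3, 4, 6]
print(rowind)   # [1, 3, 2, 1, 3]
print(values)   # [11.0, 31.0, 22.0, 13.0, 33.0]
```

Only the 5 nonzero values are stored, together with 5 row indices and NCOL+1 = 4 column pointers, instead of all m*n = 9 entries.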
HB File Characteristics:
-
ASCII text;
-
Fixed-column layout: each data item in the file must occur in particular columns;
-
Oriented to FORTRAN, in its explicit use of FORTRAN FORMAT codes.
HB files begin with a header of 4, or sometimes 5, lines:
-
Line 1:
-
-
TITLE, (72 characters)
-
KEY, (8 characters)
-
Line 2:
-
-
TOTCRD, integer, total number of data lines, (14 characters)
-
PTRCRD, integer, number of data lines for pointers, (14 characters)
-
INDCRD, integer, number of data lines for row or variable
indices, (14 characters)
-
VALCRD, integer, number of data lines for numerical values
of matrix entries, (14 characters)
-
RHSCRD, integer, number of data lines for right hand side vectors,
starting guesses, and solutions, (14 characters)
-
Line 3:
-
-
MXTYPE, matrix type (see table), (3 characters)
-
blank space, (11 characters)
-
NROW, integer, number of rows or variables, (14 characters)
-
NCOL, integer, number of columns or elements, (14 characters)
-
NNZERO, integer, number of row or variable indices. For
"assembled" matrices, this is just the number of nonzero entries.
(14 characters)
-
NELTVL, integer, number of elemental matrix entries. For
"assembled" matrices, this is 0.
(14 characters)
-
Line 4:
-
-
PTRFMT, FORTRAN I/O format for pointers, (16 characters)
-
INDFMT, FORTRAN I/O format for row or variable indices, (16 characters)
-
VALFMT, FORTRAN I/O format for matrix entries, (20 characters)
-
RHSFMT, FORTRAN I/O format for right hand sides, initial guesses,
and solutions, (20 characters)
-
Line 5: (only present if 0 < RHSCRD!)
-
-
RHSTYP, describes the right hand side information, (3 characters)
-
blank space, (11 characters)
-
NRHS, integer, the number of right hand sides, (14 characters)
-
NRHSIX, integer, number of row indices, (14 characters)
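Because every header field sits in fixed columns, the header can be read by slicing each line at the widths listed above. The following is a minimal sketch, not a reference reader: the function name parse_hb_header is hypothetical, the sample header is made up, and treating a blank integer field as 0 is an assumption.

```python
def parse_hb_header(lines):
    """Parse the 4 or 5 header lines of an HB file into a dictionary.

    `lines` is a list of strings holding at least the header lines.
    """
    def text(line, start, width):
        # Slice one fixed-width field; short lines are padded with blanks.
        return line.rstrip("\n").ljust(start + width)[start:start + width]

    def integer(line, start, width):
        field = text(line, start, width).strip()
        return int(field) if field else 0   # assumption: blank means 0

    h = {}
    # Line 1: TITLE (72 characters), KEY (8 characters).
    h["TITLE"] = text(lines[0], 0, 72).rstrip()
    h["KEY"] = text(lines[0], 72, 8).strip()
    # Line 2: five integers of 14 characters each.
    for k, name in enumerate(["TOTCRD", "PTRCRD", "INDCRD", "VALCRD", "RHSCRD"]):
        h[name] = integer(lines[1], 14 * k, 14)
    # Line 3: MXTYPE (3), blank (11), then four integers of 14 characters.
    h["MXTYPE"] = text(lines[2], 0, 3).upper()
    for k, name in enumerate(["NROW", "NCOL", "NNZERO", "NELTVL"]):
        h[name] = integer(lines[2], 14 + 14 * k, 14)
    # Line 4: two formats of 16 characters, then two of 20 characters.
    h["PTRFMT"] = text(lines[3], 0, 16).strip()
    h["INDFMT"] = text(lines[3], 16, 16).strip()
    h["VALFMT"] = text(lines[3], 32, 20).strip()
    h["RHSFMT"] = text(lines[3], 52, 20).strip()
    # Line 5 is only present if 0 < RHSCRD.
    if h["RHSCRD"] > 0:
        h["RHSTYP"] = text(lines[4], 0, 3).upper()
        h["NRHS"] = integer(lines[4], 14, 14)
        h["NRHSIX"] = integer(lines[4], 28, 14)
    return h


# A made-up header for a 3 by 3 real unsymmetric assembled matrix.
sample = [
    "Small example matrix".ljust(72) + "EXAMPLE1",
    "".join(str(n).rjust(14) for n in (3, 1, 1, 1, 0)),
    "RUA".ljust(14) + "".join(str(n).rjust(14) for n in (3, 3, 5, 0)),
    "(4I5)".ljust(16) + "(5I5)".ljust(16) + "(5E15.7)".ljust(20) + "".ljust(20),
]
header = parse_hb_header(sample)
print(header["KEY"], header["MXTYPE"], header["NROW"], header["NNZERO"])
```

The MXTYPE and RHSTYP fields are deliberately not stripped, since the position of each character within the 3-character field carries meaning.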
Each character of the MXTYPE variable specifies a separate
fact about the matrix:
-
R, C or P indicates that the matrix
values are real, complex, or that only the
pattern of nonzeroes is going to be supplied. Note
that if complex arithmetic is specified, then any data vectors
included in the file will also be assumed to be complex.
FORTRAN I/O treats a complex number as a simple pair of real numbers.
Thus, a line that records the single complex number 12+17i
would look like
12.0 17.0
-
U, S, H, Z or R indicates that
the matrix is unsymmetric, symmetric, Hermitian, skew symmetric,
or rectangular. Each of these facts implies something about how
the nonzero elements of the matrix are stored in the file.
-
U: if the matrix is unsymmetric (and square), then every nonzero
element of the matrix corresponds to an entry in the file.
-
S: if the matrix is symmetric (which implies that it is square,
and which typically only occurs for
real arithmetic), then half of the nonzero off-diagonal elements don't
need to be stored in the file. A user need only specify the
diagonal elements and the off-diagonal elements on one side of the
diagonal, typically those beneath it.
A program reading the file must, correspondingly, assume that
a value associated with one off-diagonal element should also
be assigned to its corresponding transposed location.
-
H: if the matrix is Hermitian (which implies that it is square,
and which typically only occurs for
complex arithmetic), then half of the nonzero off-diagonal elements don't
need to be stored in the file. A user need only specify the
diagonal elements and the off-diagonal elements on one side of the
diagonal, typically those beneath it.
A program reading the file must, correspondingly, assume that
the complex conjugate of a value associated with one off-diagonal
element should be assigned to its corresponding transposed location.
-
Z: if the matrix is skew symmetric (which implies that it is square,
and which typically only occurs for
real arithmetic), then the diagonal is zero, and only half of the
nonzero off-diagonal elements need to be stored. (I believe
that the Z code is only appropriate for a real matrix,
and that the case of a skew Hermitian matrix is not provided for!)
-
R: if the matrix is rectangular, then every nonzero element of the matrix
must be stored. In effect, this is the same as the unsymmetric case.
-
A indicates that the matrix is "assembled" (the typical case)
while E indicates that the matrix is a finite element matrix
that is going to be described as the "sum" of a set of smaller
matrices.
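The three MXTYPE characters described above can be decoded with simple lookup tables. The sketch below is illustrative only: the names decode_mxtype and mirrored_value are hypothetical, and the second function encodes the usual mathematical rule for the entry implied at the transposed location (same value for symmetric, complex conjugate for Hermitian, negated value for skew symmetric).

```python
VALUE_KIND = {"R": "real", "C": "complex", "P": "pattern"}
STRUCTURE = {"U": "unsymmetric", "S": "symmetric", "H": "Hermitian",
             "Z": "skew symmetric", "R": "rectangular"}
ASSEMBLY = {"A": "assembled", "E": "elemental"}


def decode_mxtype(mxtype):
    """Return (value kind, structure, assembly) for a 3-character MXTYPE."""
    code = mxtype.upper()
    if len(code) != 3:
        raise ValueError("MXTYPE must have exactly 3 characters")
    try:
        return VALUE_KIND[code[0]], STRUCTURE[code[1]], ASSEMBLY[code[2]]
    except KeyError:
        raise ValueError("unrecognized MXTYPE: %r" % mxtype) from None


def mirrored_value(structure, value):
    """Value implied at (j, i) by a stored off-diagonal entry at (i, j).

    Returns None when the structure implies no mirrored entry.
    """
    if structure == "symmetric":
        return value
    if structure == "Hermitian":
        return complex(value).conjugate()
    if structure == "skew symmetric":
        return -value
    return None   # unsymmetric or rectangular: every entry is stored


print(decode_mxtype("RUA"))   # ('real', 'unsymmetric', 'assembled')
print(decode_mxtype("CHA"))   # ('complex', 'Hermitian', 'assembled')
print(mirrored_value("skew symmetric", 2.5))   # -2.5
print(mirrored_value("Hermitian", 12 + 17j))   # (12-17j)
```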
Each character of the RHSTYP specifies a separate fact about
the right hand side information. (Of course, if there are no right
hand sides, (RHSCRD = 0) then there is no header line 5, and hence
no need to worry about this variable!)
-
F means that all vectors will be listed as "full" vectors,
that is, as a list of NROW numbers; M means that,
instead, all vectors will be listed in the same format as the matrix.
The M option only makes sense if the matrix is being presented
in unassembled format.
-
G means that one or more starting guesses or approximate
solution vectors are being supplied. If no guess vectors are
supplied, this character should be blank (or, actually, anything
but a G).
-
X means that one or more exact solution vectors are being
supplied. If no exact solution vectors are supplied, leave this
character blank.
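The RHSTYP characters can be decoded in the same spirit. This sketch assumes the characters are positional, in the order listed above (first F or M, second G or not, third X or not); the function name decode_rhstyp and the dictionary keys are invented for illustration.

```python
def decode_rhstyp(rhstyp):
    """Decode the 3-character RHSTYP field into a small dictionary."""
    code = rhstyp.upper().ljust(3)   # pad so missing characters read as blank
    if code[0] not in ("F", "M"):
        raise ValueError("first RHSTYP character must be F or M")
    return {
        "storage": "full" if code[0] == "F" else "matrix format",
        "has_guess": code[1] == "G",   # anything but G means no guesses
        "has_exact": code[2] == "X",   # anything but X means no exact solutions
    }


print(decode_rhstyp("F  "))
print(decode_rhstyp("FGX"))
```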
The Data Records
After the header lines comes the data. This data is organized into four
distinct sets: pointers, indices, matrix values, and right hand side
information. The number of lines devoted to each set of information was
specified in header line number 2. It is common for the final set of data
to be omitted. In a few cases, the third set, which describes matrix
values, is omitted. In that case, the file only has information about
where the nonzero entries of the matrix are, but does not actually
specify what those values are. Such a matrix file is called a "pattern"
matrix, and in this case the first character
of MXTYPE is usually given as P for "pattern only".
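Each data set is read with the FORTRAN format given in header line 4. Outside of FORTRAN, a reader has to interpret that format itself. The sketch below handles only the simplest case, a single repeated descriptor such as (16I5) or (5E15.7); scale factors such as 1P and nested groups are not handled. The names parse_simple_format and read_section are hypothetical, and the sample lines are made up.

```python
import re


def parse_simple_format(fmt):
    """Parse a FORTRAN format like '(16I5)' into (repeat, kind, width)."""
    pattern = r"\(\s*(\d*)\s*([IEDFG])\s*(\d+)(?:\.\d+)?(?:E\d+)?\s*\)"
    m = re.fullmatch(pattern, fmt.strip().upper())
    if m is None:
        raise ValueError("unsupported format: %r" % fmt)
    repeat = int(m.group(1)) if m.group(1) else 1
    return repeat, m.group(2), int(m.group(3))


def read_section(lines, fmt, count):
    """Read `count` numbers from fixed-width `lines` described by `fmt`."""
    repeat, kind, width = parse_simple_format(fmt)
    if kind == "I":
        convert = int
    else:
        # FORTRAN may write exponents with D; Python expects E.
        convert = lambda s: float(s.upper().replace("D", "E"))
    items = []
    for line in lines:
        line = line.rstrip("\n")
        for k in range(repeat):
            if len(items) == count:
                return items
            # Each item occupies exactly `width` columns.
            items.append(convert(line[k * width:(k + 1) * width]))
    return items


# Made-up data lines: 4 pointers in (4I5), then 5 values in (5E15.7).
pointer_lines = ["    1    3    4    6"]
value_lines = ["".join("%15.7E" % v for v in (11.0, 31.0, 22.0, 13.0, 33.0))]
print(read_section(pointer_lines, "(4I5)", 4))
print(read_section(value_lines, "(5E15.7)", 5))
```

In a full reader, the first PTRCRD data lines would be passed with PTRFMT, the next INDCRD lines with INDFMT, and the next VALCRD lines with VALFMT, with the item counts taken from header line 3 (NCOL+1 pointers and NNZERO indices for an assembled matrix).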
Licensing:
The computer code and data files described and made available on this web page
are distributed under
the GNU LGPL license.
Related Programs and Data:
DLAP_IO,
a FORTRAN90 library which
reads and writes DLAP sparse matrix files;
HB_IO,
a C++ library which
reads and writes sparse linear systems stored in the Harwell-Boeing Sparse Matrix format.
HB_TO_MM,
a MATLAB program which
converts a sparse matrix from Harwell-Boeing to Matrix Market format.
HB_TO_MSM,
a MATLAB program which
converts a sparse matrix stored in a
Harwell Boeing file to MATLAB sparse matrix format;
HB_TO_ST,
a FORTRAN77 program which
converts a sparse matrix from Harwell-Boeing to sparse triplet format.
HBSMC,
a dataset directory which
contains the Harwell Boeing Sparse Matrix Collection;
MM_TO_HB,
a MATLAB program which
reads the sparse matrix information from an MM Matrix Market file
and writes a corresponding HB Harwell Boeing file.
MSM_TO_HB,
a MATLAB program which
writes a MATLAB sparse matrix to a Harwell Boeing (HB) file;
RB,
a data directory which
contains examples of RB files,
the Rutherford Boeing sparse matrix file format;
ST_TO_HB,
a FORTRAN90 program which
converts a sparse matrix file from ST format to
Harwell Boeing format (HB);
SUPERLU,
a C program which
applies a fast direct solution method to a sparse linear system.
Reference:
-
Iain Duff, Roger Grimes, John Lewis,
User's Guide for the Harwell-Boeing Sparse Matrix Collection,
Technical Report TR/PA/92/86,
CERFACS, October 1992.
-
Iain Duff, Roger Grimes, John Lewis,
Sparse Matrix Test Problems,
ACM Transactions on Mathematical Software,
Volume 15, Number 1, March 1989, pages 1-14.
-
Web site:
http://math.nist.gov/MatrixMarket/data/Harwell-Boeing/
Sample Files:
Last revised on 24 April 2010.