GETWGT
Dirichlet Mixture Estimation
GETWGT is a FORTRAN90 library which handles Dirichlet mixture estimation.
The main requirement in the design of this library was that there be
a single routine with a simple interface through which the user
interacts. This routine, called getwgt(), accepts a set of
nucleic acid counts, updates an internal Dirichlet mixture model,
and returns its current estimate for the pseudocounts. All other
transactions and information are hidden from the user.
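The single-entry-point design described above can be illustrated with a short sketch. It is not the library's FORTRAN90 code or interface: the class name `MixtureEstimator`, the method `update`, and the helper `log_dirichlet_multinomial` are hypothetical names chosen for this example. The update rule is a quasi-Bayes step in the spirit of the Smith and Makov paper listed under Reference, and reporting the pseudocounts as a responsibility-weighted blend of the component parameters is likewise an assumption; the exact rule GETWGT applies is not stated on this page.

```python
import math


def log_dirichlet_multinomial(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial density."""
    n_total = sum(counts)
    a_total = sum(alpha)
    log_p = (math.lgamma(n_total + 1) + math.lgamma(a_total)
             - math.lgamma(n_total + a_total))
    for n, a in zip(counts, alpha):
        log_p += math.lgamma(n + a) - math.lgamma(a) - math.lgamma(n + 1)
    return log_p


class MixtureEstimator:
    """Hypothetical stand-in for the state hidden behind a getwgt()-style call."""

    def __init__(self, components, prior):
        # One Dirichlet parameter vector per mixture component.
        self.components = [list(c) for c in components]
        # Dirichlet parameters describing the (unknown) mixture weights.
        self.weight_params = list(prior)

    def update(self, counts):
        """Absorb one count vector and return a pseudocount estimate."""
        total = sum(self.weight_params)
        weights = [w / total for w in self.weight_params]

        # Posterior probability that each component produced these counts,
        # computed in log space and shifted by the maximum for stability.
        logs = [math.log(w) + log_dirichlet_multinomial(counts, c)
                for w, c in zip(weights, self.components)]
        top = max(logs)
        post = [math.exp(v - top) for v in logs]
        norm = sum(post)
        post = [p / norm for p in post]

        # Assumed quasi-Bayes step: add the posterior probabilities
        # to the Dirichlet parameters of the mixture weights.
        self.weight_params = [w + p for w, p in zip(self.weight_params, post)]

        # Assumed output: posterior-weighted blend of component parameters.
        return [sum(p * c[i] for p, c in zip(post, self.components))
                for i in range(len(counts))]
```

A caller would create one `MixtureEstimator` from the component parameters and then call `update` once per observed count vector, which mirrors the idea that all other state stays hidden from the user.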
The program requires a data file containing the parameters of the
Dirichlet densities that make up the mixture.
This file must be named weights.txt. Two files are
provided here as possible sources of this data. These files
were obtained from
the UCSC computational biology page,
where more information, papers, and data are available.
Licensing:
The computer code and data files made available on this web page
are distributed under
the GNU LGPL license.
Languages:
GETWGT is available in
a FORTRAN90 version.
Related Data and Programs:
ASA266,
a FORTRAN90 library which
evaluates various properties of a Dirichlet distribution.
BDMLIB,
a FORTRAN90 library which
estimates the weights in a Dirichlet mixture based on sample data.
Reference:
- William Cody, Kenneth Hillstrom,
  Chebyshev Approximations for the Natural Logarithm of the Gamma Function,
  Mathematics of Computation,
  Volume 21, Number 98, April 1967, pages 198-203.
- Brian Everitt, David Hand,
  Finite Mixture Distributions,
  Chapman and Hall, 1981.
- Kenneth Lange,
  Mathematical and Statistical Methods for Genetic Analysis,
  Springer, 1997,
  ISBN: 0387953892,
  LC: QH438.4.M33.L36.
- AFM Smith, Udi Makov,
  A Quasi-Bayes Sequential Procedure for Mixtures,
  Journal of the Royal Statistical Society, Series B,
  Volume 40, Number 1, 1978, pages 106-112.
List of Routines:
- GETWGT updates the Dirichlet mixture weights based on a set of counts.
- CH_CAP capitalizes a single character.
- CH_EQI is a case insensitive comparison of two characters for equality.
- CH_NEXT "reads" space-separated characters from a string, one at a time.
- CH_TO_DIGIT returns the value of a base 10 digit.
- COMP_PARAM_PRINT prints the parameters for the mixture components.
- DIRICHLET_MEAN returns the means of the Dirichlet PDF.
- DIRICHLET_MULTINOMIAL_PDF evaluates a Dirichlet Multinomial PDF.
- EVENT_PROCESS updates the mixture weight distribution parameters.
- GET_UNIT returns a free FORTRAN unit number.
- I4_NEXT "reads" integers from a string, one at a time.
- MIXTURE_READ reads the Dirichlet mixture parameters from a file.
- R8_GAMMA_LOG evaluates the logarithm of the gamma function.
- R8_NEXT "reads" real numbers from a string, one at a time.
- R8VEC_PRINT prints an R8VEC.
- R8VEC_UNIT_SUM normalizes an R8VEC to have unit sum.
- S_BEGIN is TRUE if one string matches the beginning of the other.
- S_TO_I4 reads an I4 from a string.
- S_TO_R8_OLD reads an R8 value from a string.
- TIMESTAMP prints the current YMDHMS date as a time stamp.
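Two of the numerical routines listed above are closely related, which a small sketch makes concrete: the mean of a Dirichlet density with parameters alpha is alpha rescaled to unit sum. The function names `unit_sum` and `dirichlet_mean` below are hypothetical and only echo R8VEC_UNIT_SUM and DIRICHLET_MEAN; the actual FORTRAN90 argument lists are not shown on this page.

```python
def unit_sum(values):
    """Return a copy of values rescaled so that the entries sum to 1."""
    total = sum(values)
    if total == 0.0:
        raise ValueError("cannot normalize a vector whose entries sum to zero")
    return [v / total for v in values]


def dirichlet_mean(alpha):
    """Mean of a Dirichlet density: each parameter divided by the parameter sum."""
    return unit_sum(alpha)
```

For example, parameters `[1.0, 2.0, 1.0]` give the mean `[0.25, 0.5, 0.25]`.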
Last revised on 16 July 2013.