CVT_DATASET is a C++ program which creates a CVT dataset and writes it to a file.
The program is interactive, and allows the user to choose the parameters that define the dataset.
Normally, data is computed in the unit hypercube, with uniform density. However, if you wish to work in a more interesting geometry, or to control the density function, it is necessary to modify the USER routine in the CVT library, and then direct CVT_DATASET to use that routine for initialization and sampling.
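As an illustration of what a replacement sampling routine for a different geometry might do, here is a minimal sketch that draws points uniformly from the unit disk by rejection. The function name `sample_unit_disk`, its signature, and the use of `std::mt19937` are assumptions made for this example; they are not the interface of the USER routine in the CVT library.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sampler (not the CVT library's USER routine).
// Returns n points drawn uniformly from the disk x^2 + y^2 <= 1.
// Layout: r[2*j] is the x coordinate of point j, r[2*j+1] is its y coordinate.
std::vector<double> sample_unit_disk(std::size_t n, std::mt19937& rng)
{
    std::uniform_real_distribution<double> unif(-1.0, 1.0);
    std::vector<double> r;
    r.reserve(2 * n);
    while (r.size() < 2 * n) {
        // Propose a point in the square [-1,1]^2 and keep it only if it
        // falls inside the disk.
        const double x = unif(rng);
        const double y = unif(rng);
        if (x * x + y * y <= 1.0) {
            r.push_back(x);
            r.push_back(y);
        }
    }
    assert(r.size() == 2 * n);
    return r;
}
```

Rejection keeps the density uniform over the disk; a nonuniform density could be handled the same way by accepting each proposed point with a probability proportional to the desired density.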
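The data that the user may set includes the spatial dimension, the number of points to compute, the seed for the random number generator, the initialization method, the number of iterations, the convergence tolerance, the sampling method, the number of sample points used on each iteration, and the number of sample points created at a time.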
A "CVT" is a Centroidal Voronoi Tessellation. Essentially, a CVT is a set of sample points in a (finite) region with the property that each point is the centroid of its Voronoi subregion. A "random" set of sample points will not have this property. However, it is possible to begin with a random set of sample points, and drive it towards a CVT set, by applying an iterative refinement process.
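The centroid property and the iterative refinement are easiest to see in one dimension, where everything can be computed exactly. The following sketch assumes generators in the interval [0,1] with uniform density; the function names are invented for this example and do not come from the CVT library. In 1D, the Voronoi subregion of a generator is the interval bounded by the midpoints to its neighbors, and its centroid is the midpoint of that interval.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One exact refinement step for generators in [0,1] with uniform density.
// Each generator is replaced by the centroid (midpoint) of its Voronoi
// interval. The result is returned in sorted order.
std::vector<double> lloyd_step_1d(std::vector<double> g)
{
    assert(!g.empty());
    std::sort(g.begin(), g.end());
    const std::size_t n = g.size();
    std::vector<double> next(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Cell boundaries: midpoints to the neighbors, or the ends of [0,1].
        const double left  = (i == 0)     ? 0.0 : 0.5 * (g[i - 1] + g[i]);
        const double right = (i + 1 == n) ? 1.0 : 0.5 * (g[i] + g[i + 1]);
        next[i] = 0.5 * (left + right);
    }
    return next;
}

// Largest distance between a generator and the centroid of its cell.
// This is zero exactly when the generators form a CVT.
double cvt_residual_1d(std::vector<double> g)
{
    std::sort(g.begin(), g.end());
    const std::vector<double> c = lloyd_step_1d(g);
    double worst = 0.0;
    for (std::size_t i = 0; i < g.size(); ++i) {
        worst = std::max(worst, std::fabs(g[i] - c[i]));
    }
    return worst;
}
```

Starting from three arbitrary points such as 0.10, 0.15 and 0.90, repeated calls to `lloyd_step_1d` drive the set towards 1/6, 1/2 and 5/6, the centers of three equal subintervals.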
Generating a CVT dataset is necessarily more complicated than generating a quasirandom sequence. An iteration is involved, so there must be an initial assignment for the generators, and then a number of iterations. Moreover, in each iteration, estimates must be made of the volume and location of the Voronoi subregions. This is typically done by Monte Carlo sampling. The accuracy of the resulting CVT depends in part on the number of sampling points and the number of iterations taken.
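A single iteration with Monte Carlo estimation might be sketched as follows. This is an illustration under stated assumptions, not the code used by CVT_DATASET: the function name `cvt_step_monte_carlo`, the flat storage layout, and the use of `std::mt19937` are all choices made for this example, and sample points are drawn one at a time rather than in batches. Each sample point is assigned to its nearest generator, and each generator is then moved to the mean of the sample points assigned to it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Coordinate d of generator j is stored at gen[j * dim + d].

// One Monte Carlo refinement step in the unit hypercube with uniform density.
// Each generator is replaced by the mean of the sample points nearest to it,
// which estimates the centroid of its Voronoi subregion.
// Returns the Euclidean norm of the overall generator displacement.
double cvt_step_monte_carlo(std::vector<double>& gen, std::size_t dim,
                            std::size_t sample_num, std::mt19937& rng)
{
    assert(dim > 0 && !gen.empty() && gen.size() % dim == 0);
    const std::size_t n = gen.size() / dim;
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    std::vector<double> sum(gen.size(), 0.0);   // coordinate sums per generator
    std::vector<std::size_t> count(n, 0);       // samples per generator
    std::vector<double> p(dim);

    for (std::size_t s = 0; s < sample_num; ++s) {
        for (std::size_t d = 0; d < dim; ++d) p[d] = unif(rng);

        // Find the generator nearest to the sample point p.
        std::size_t best = 0;
        double best_d2 = std::numeric_limits<double>::infinity();
        for (std::size_t j = 0; j < n; ++j) {
            double d2 = 0.0;
            for (std::size_t d = 0; d < dim; ++d) {
                const double diff = gen[j * dim + d] - p[d];
                d2 += diff * diff;
            }
            if (d2 < best_d2) { best_d2 = d2; best = j; }
        }
        for (std::size_t d = 0; d < dim; ++d) sum[best * dim + d] += p[d];
        ++count[best];
    }

    double change2 = 0.0;
    for (std::size_t j = 0; j < n; ++j) {
        if (count[j] == 0) continue;  // no samples landed here: leave it alone
        for (std::size_t d = 0; d < dim; ++d) {
            const double c = sum[j * dim + d] / static_cast<double>(count[j]);
            const double diff = c - gen[j * dim + d];
            change2 += diff * diff;
            gen[j * dim + d] = c;
        }
    }
    return std::sqrt(change2);
}
```

Because the centroids are only estimated, the returned displacement does not go to zero; it levels off at a noise floor that shrinks as `sample_num` grows. This is one way to see why the accuracy depends on both the number of sample points and the number of iterations.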
A reasonable set of input data might be:
  2           spatial dimension is 2
  10          compute 10 points
  123456789   seed for random numbers
  uniform     initialize by UNIFORM
  40          40 iterations
  0.0         zero tolerance; won't stop early
  uniform     sample using UNIFORM
  10000       use 10,000 sample points on each iteration
  1000        create 1,000 sample points at a time
  -1          stop; don't want to define another set.
Once these parameters are set, the program generates the data and writes it to a file. The user may then specify another set of input data, or terminate the program.
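The sample input can be thought of as a sequence of records, each read in the order shown above, with a value of -1 for the spatial dimension ending the session. The sketch below parses such a sequence from a stream containing only the values (without the explanatory comments). The struct `CvtInput`, its field names, and the function `read_cvt_input` are hypothetical and are not taken from the program's source.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical container for one set of input data.
struct CvtInput {
    int dim_num = 0;       // spatial dimension
    int n = 0;             // number of points to compute
    int seed = 0;          // seed for random numbers
    std::string init;      // initialization method, e.g. "uniform"
    int it_max = 0;        // number of iterations
    double tolerance = 0;  // 0.0 means the iteration won't stop early
    std::string sample;    // sampling method, e.g. "uniform"
    int sample_num = 0;    // sample points on each iteration
    int batch = 0;         // sample points created at a time
};

// Reads one set of input data. Returns false if the spatial dimension is
// less than 1 (for example the terminating -1) or if any value is missing.
bool read_cvt_input(std::istream& in, CvtInput& p)
{
    if (!(in >> p.dim_num) || p.dim_num < 1) return false;
    return static_cast<bool>(in >> p.n >> p.seed >> p.init >> p.it_max
                                >> p.tolerance >> p.sample >> p.sample_num
                                >> p.batch);
}
```

Calling `read_cvt_input` in a loop until it returns false mirrors the interactive session: one dataset per successful read, and the -1 ends the loop.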
The computer code and data files described and made available on this web page are distributed under the GNU LGPL license.
CVT_DATASET is available in a C++ version, a FORTRAN90 version, and a MATLAB version.
CCVT_BOX, a C++ program which computes a CVT with some points forced to lie on the boundary.
CVT, a C++ library which creates a CVT dataset.
CVT, a dataset directory which contains a collection of datasets created by CVT_DATASET (along with the commands used to create them).
FAURE_DATASET, a C++ program which creates a Faure quasirandom dataset.
GRID_DATASET, a C++ program which creates a grid sequence and writes it to a file.
LATIN_CENTER_DATASET, a C++ program which creates a Latin Center Hypercube dataset.
LATIN_EDGE_DATASET, a C++ program which creates a Latin Edge Hypercube dataset.
LATIN_RANDOM_DATASET, a C++ program which creates a Latin Random Hypercube dataset.
NIEDERREITER2_DATASET, a C++ program which creates a Niederreiter quasirandom dataset with base 2.
NORMAL_DATASET, a C++ program which generates a dataset of multivariate normal pseudorandom values and writes them to a file.
SOBOL_DATASET, a C++ program which computes a Sobol quasirandom sequence and writes it to a file.
UNIFORM_DATASET, a C++ program which generates a dataset of uniform pseudorandom values and writes them to a file.
VAN_DER_CORPUT_DATASET, a C++ program which creates a van der Corput quasirandom sequence and writes it to a file.
Test 1 computes 85 CVT points in 2 dimensions, using uniform initialization, a seed of 123456789, 40 iterations, a zero tolerance, uniform sampling, 10,000 sample points in batches of 1000:
Test 2 repeats Test 1, but with 80 iterations:
Test 3 repeats Test 1, but with 1,000,000 sample points:
Test 4 repeats Test 1, but with Halton sampling:
Test 5 repeats Test 1, but with Grid sampling:
Test 6 repeats Test 1, but with Random sampling:
Test 7 repeats Test 1, but with a seed of 987654321:
Test 8 repeats Test 1, but with a batch size of 5:
Test 9 computes 100 CVT points in 3 dimensions, using uniform initialization, a seed of 123456789, 40 iterations, a tolerance of 0.000001, uniform sampling, 10,000 sample points in batches of 1000:
Test 10 investigates the unstable CVT formed by a Cartesian grid of 100 points in 2D. Starting from this unstable solution, the iteration proceeds towards a more "hexagonal" pattern:
Test 11 shows how the user may specify the initial point locations in a file. 15 points are specified in 2D:
Test 12:
Test 13:
Test 14 shows how the user may refer to the USER routine for a different geometry. The default USER routine is set up to sample the unit circle in 2D. 100 points are requested:
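Test 10 starts from a Cartesian grid, which is itself a CVT: each grid point is the centroid of its square Voronoi subregion. The sketch below checks that fixed-point property numerically, assuming the 100 generators sit at the centers of a 10 by 10 array of squares in the unit square. It uses a regular lattice of sample points instead of random sampling so that the result is deterministic; the function names are invented for this example. It checks only that the grid is a fixed point and does not demonstrate the instability itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Generators at the centers of an m-by-m array of squares in [0,1]^2.
// Layout: g[2*j] is x, g[2*j+1] is y.
std::vector<double> grid_generators(std::size_t m)
{
    assert(m > 0);
    std::vector<double> g;
    g.reserve(2 * m * m);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < m; ++j) {
            g.push_back((i + 0.5) / m);
            g.push_back((j + 0.5) / m);
        }
    }
    return g;
}

// Largest distance between a generator and the estimated centroid of its
// Voronoi subregion, using a regular k-by-k lattice of sample points.
double max_centroid_offset(const std::vector<double>& g, std::size_t k)
{
    const std::size_t n = g.size() / 2;
    std::vector<double> sx(n, 0.0), sy(n, 0.0);
    std::vector<std::size_t> cnt(n, 0);
    for (std::size_t a = 0; a < k; ++a) {
        for (std::size_t b = 0; b < k; ++b) {
            const double x = (a + 0.5) / k;
            const double y = (b + 0.5) / k;
            // Assign the sample point to its nearest generator.
            std::size_t best = 0;
            double best_d2 = std::numeric_limits<double>::infinity();
            for (std::size_t j = 0; j < n; ++j) {
                const double dx = g[2 * j] - x;
                const double dy = g[2 * j + 1] - y;
                const double d2 = dx * dx + dy * dy;
                if (d2 < best_d2) { best_d2 = d2; best = j; }
            }
            sx[best] += x;
            sy[best] += y;
            ++cnt[best];
        }
    }
    double worst = 0.0;
    for (std::size_t j = 0; j < n; ++j) {
        if (cnt[j] == 0) continue;  // no samples assigned to this generator
        const double dx = sx[j] / cnt[j] - g[2 * j];
        const double dy = sy[j] / cnt[j] - g[2 * j + 1];
        worst = std::max(worst, std::hypot(dx, dy));
    }
    return worst;
}
```

For the unperturbed grid with `k = 200`, the offset is zero up to rounding error. Shifting a single generator slightly makes the offset clearly nonzero, so the next refinement step would move the points.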