REJOIN
Rejoin or Split Distributed Data


REJOIN is a FORTRAN90 library which demonstrates a way to split or merge data files for parallel computations. In the case considered here, each processor writes its data to a separate "parallel" file, and after program execution, these files are to be gathered into a single "sequential" file.

The problem was simple. Suppose that a logical array of NPX by NPY processors was used, and that the computation was associated with a physical array of NX_GLOBAL by NY_GLOBAL grid points. The original physical array was divided up among the processors, and each handled its own piece. (There was some communication between pieces, via what amounted to internal boundary conditions.) Once the computation was done, each processor wrote out its information to a file, whose name had the processor ID embedded in it. In order to conveniently graph or analyze the data, it was desired to rejoin all this data into a single file, as though one processor had handled it. This is what REJOIN does, for the data from a specific program.
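The rejoin step can be illustrated with a minimal Python sketch. This is not the library's FORTRAN90 code: it assumes the physical array is split into contiguous blocks that are as equal in size as possible, and it keeps the pieces in memory (keyed by processor coordinates) instead of reading them from files. The names `block_range` and `rejoin` are hypothetical.

```python
def block_range(n_global, n_parts, p):
    """Half-open index range [lo, hi) owned by part p of n_parts.

    Assumes an even-as-possible block split; REJOIN's actual layout may differ.
    """
    base, extra = divmod(n_global, n_parts)
    lo = p * base + min(p, extra)
    hi = lo + base + (1 if p < extra else 0)
    return lo, hi


def rejoin(pieces, npx, npy, nx_global, ny_global):
    """Assemble per-processor blocks into one global array (a list of rows).

    pieces[(px, py)] holds the block of processor (px, py) as a list of rows.
    """
    full = [[None] * ny_global for _ in range(nx_global)]
    for px in range(npx):
        i_lo, i_hi = block_range(nx_global, npx, px)
        for py in range(npy):
            j_lo, j_hi = block_range(ny_global, npy, py)
            block = pieces[(px, py)]
            for i in range(i_lo, i_hi):
                # Copy one local row into its place in the global row.
                full[i][j_lo:j_hi] = block[i - i_lo]
    return full


# Example: a 4 by 6 array held by 2 by 3 processors, each owning a 2 by 2 block.
# The value stored at global position (i, j) is 10 * i + j.
pieces = {
    (px, py): [[10 * i + j for j in range(2 * py, 2 * py + 2)]
               for i in range(2 * px, 2 * px + 2)]
    for px in range(2) for py in range(3)
}
full = rejoin(pieces, 2, 3, 4, 6)
```

After the call, `full` holds the data exactly as a single processor would have stored it, which is the form that is convenient for graphing or analysis.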

Of course, no sooner was code written to rejoin data from a logical array of NPX by NPY processors, than a request was made for a new feature, the ability to reshape the data. Reshaping the data means writing a new set of files corresponding to a different processor configuration of, say, MPX by MPY processors.

Reshaping allows you to run on one processor configuration on Tuesday, save the data, and resume the calculation on Wednesday on a different processor configuration. It turns out that the easiest (though not most efficient!) way to do this is simply to add a "split" option to the code, which splits up a sequential data file into an arbitrary set of parallel data files.
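A split operation of this kind might look like the following Python sketch. It is an illustration only, not the library's FORTRAN90 routine: the even-as-possible block layout, the in-memory dictionary standing in for the parallel files, and the names `block_range` and `split` are all assumptions.

```python
def block_range(n_global, n_parts, p):
    """Half-open index range [lo, hi) owned by part p of n_parts.

    Assumes an even-as-possible block split; REJOIN's actual layout may differ.
    """
    base, extra = divmod(n_global, n_parts)
    lo = p * base + min(p, extra)
    hi = lo + base + (1 if p < extra else 0)
    return lo, hi


def split(full, mpx, mpy):
    """Cut a global array (a list of rows) into mpx by mpy blocks.

    Returns a dictionary mapping processor coordinates (px, py) to a block.
    """
    nx_global = len(full)
    ny_global = len(full[0])
    pieces = {}
    for px in range(mpx):
        i_lo, i_hi = block_range(nx_global, mpx, px)
        for py in range(mpy):
            j_lo, j_hi = block_range(ny_global, mpy, py)
            # Slice out the rows, then the columns, owned by (px, py).
            pieces[(px, py)] = [row[j_lo:j_hi] for row in full[i_lo:i_hi]]
    return pieces


# Example: split a 4 by 6 sequential array among 2 by 2 processors.
full = [[10 * i + j for j in range(6)] for i in range(4)]
pieces = split(full, 2, 2)
```

Because the processor counts are ordinary arguments, the same function produces pieces for any target configuration.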

Once we have a split option, the reshape option can be carried out in two steps:

  1. "rejoin" the NPX by NPY data into a single sequential file;
  2. "split" the sequential data into MPX by MPY data.

See the SPLIT_SAVE routine for details.
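The two-step reshape can be sketched in Python as a rejoin followed by a split. This is a hedged illustration, not the SPLIT_SAVE routine: the block layout, the in-memory pieces, and the names `block_range`, `rejoin`, `split` and `reshape` are assumptions made for the example.

```python
def block_range(n_global, n_parts, p):
    """Half-open index range [lo, hi) owned by part p (assumed even split)."""
    base, extra = divmod(n_global, n_parts)
    lo = p * base + min(p, extra)
    return lo, lo + base + (1 if p < extra else 0)


def rejoin(pieces, npx, npy, nx_global, ny_global):
    """Step 1: gather npx by npy blocks into one global array."""
    full = [[None] * ny_global for _ in range(nx_global)]
    for px in range(npx):
        i_lo, i_hi = block_range(nx_global, npx, px)
        for py in range(npy):
            j_lo, j_hi = block_range(ny_global, npy, py)
            for i in range(i_lo, i_hi):
                full[i][j_lo:j_hi] = pieces[(px, py)][i - i_lo]
    return full


def split(full, mpx, mpy):
    """Step 2: cut the global array into mpx by mpy blocks."""
    pieces = {}
    for px in range(mpx):
        i_lo, i_hi = block_range(len(full), mpx, px)
        for py in range(mpy):
            j_lo, j_hi = block_range(len(full[0]), mpy, py)
            pieces[(px, py)] = [row[j_lo:j_hi] for row in full[i_lo:i_hi]]
    return pieces


def reshape(pieces, npx, npy, nx_global, ny_global, mpx, mpy):
    """Reshape NPX by NPY data into MPX by MPY data via a sequential array."""
    return split(rejoin(pieces, npx, npy, nx_global, ny_global), mpx, mpy)


# Example: data saved on 2 by 3 processors is reshaped for 4 by 1 processors.
original = [[10 * i + j for j in range(6)] for i in range(4)]
old_pieces = split(original, 2, 3)
new_pieces = reshape(old_pieces, 2, 3, 4, 6, 4, 1)
```

The intermediate array holds the entire data set at once, which is why this route is easy but not the most efficient: a direct piece-to-piece transfer would avoid the sequential copy at the cost of more bookkeeping.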

Licensing:

The computer code and data files described and made available on this web page are distributed under the GNU LGPL license.

Languages:

REJOIN is available in a FORTRAN90 version.

Related Data and Programs:

FILE_MERGE, a FORTRAN90 program which merges two sorted files.

Source Code:

Examples and Tests:

REJOIN_PRB "rejoins" a set of files from a parallel run.

SPLIT_PRB splits sequential data into separate files.

U_O_F converts unformatted data to formatted data.

You may copy the single "sequential" data file:

You may copy the 8 "pieces" into which the file was divided:

List of Routines:


Last revised on 18 May 2003.