checkpoint
Checkpointing is controlled entirely by the driver and the following
parameters, set by the user:
The parameter chkpt_period controls the frequency of the
checkpoint output. When chkpt_period = 0, checkpointing is
disabled. This is the default value.
The checkpointing routines write alternately to two files, "hello.sdf"
and "goodbye.sdf". This ensures that there is always a complete
checkpointed state to recover from, even if a simulation is killed
while it is writing out a checkpoint. Each new checkpoint overwrites
the previous contents of the file it is written to, so at most one
"hello" state and one "goodbye" state exist at any time.
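
The alternating scheme can be illustrated with a short sketch. This is not
the driver's code: the functions checkpoint_target and write_checkpoints
are hypothetical, the sketch assumes the first checkpoint goes to
"hello.sdf", and it writes plain bytes rather than real sdf data.

```python
import os
import tempfile

# File names taken from the text above. The real driver may add prefixes
# or per-processor suffixes; that is not covered here.
CHECKPOINT_FILES = ("hello.sdf", "goodbye.sdf")


def checkpoint_target(n_written):
    """Return the file that checkpoint number n_written (0-based) goes to."""
    return CHECKPOINT_FILES[n_written % 2]


def write_checkpoints(directory, states):
    """Write each state in turn, alternating between the two files."""
    for n, state in enumerate(states):
        path = os.path.join(directory, checkpoint_target(n))
        # Opening with "wb" overwrites the older copy stored in this file;
        # the other file is left untouched during the write.
        with open(path, "wb") as f:
            f.write(state)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        write_checkpoints(d, [b"step 10", b"step 20", b"step 30"])
        for name in CHECKPOINT_FILES:
            with open(os.path.join(d, name), "rb") as f:
                print(name, f.read())
```

After three checkpoints, "hello.sdf" holds the third state and "goodbye.sdf"
the second. Had the run been killed during the third write, "goodbye.sdf"
would still contain a complete state to restart from.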
The parameter chkpt_readstate controls which checkpointing files
to use for restarting a simulation -- either the "hello" or "goodbye" files.
- chkpt_readstate = 1 reads the "hello" state.
- chkpt_readstate = 2 reads the "goodbye" state.
Notes
- When restarting a checkpointed state in parallel, you must always
run on the same number of processors as the checkpointed state.
- When restarting from a checkpointed state, changes to the parameter
file have no effect, except for changes to the parameters described above.
- With the exception of the "hello" and "goodbye" sdf files, all
output data in the form of sdf files should be moved to
a separate directory prior to restarting from a checkpointed state lest
that data be overwritten and lost.
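
The last note can be carried out by hand or with a small script. The
following is a minimal sketch, not part of the package: the function
archive_sdf_output and the directory name "pre_restart" are hypothetical,
and it assumes the checkpoint files are named exactly "hello.sdf" and
"goodbye.sdf" and sit in the run directory.

```python
import shutil
from pathlib import Path

# Assumed checkpoint file names, taken from the text above. Adjust them if
# the driver adds prefixes or per-processor suffixes.
CHECKPOINT_NAMES = {"hello.sdf", "goodbye.sdf"}


def archive_sdf_output(run_dir, archive_dir):
    """Move every *.sdf file except the checkpoint files into archive_dir.

    Returns the names of the files that were moved, in sorted order.
    """
    run_dir, archive_dir = Path(run_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for path in sorted(run_dir.glob("*.sdf")):
        if path.name in CHECKPOINT_NAMES:
            continue  # the restart needs these left in place
        shutil.move(str(path), str(archive_dir / path.name))
        moved.append(path.name)
    return moved


if __name__ == "__main__":
    print(archive_sdf_output(".", "pre_restart"))
```

Run from the simulation directory before restarting, this leaves only the
two checkpoint files among the sdf files there.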
See Also
chkpt_control
Examples