Rationale: This message mode is useful for applications that keep precise track of the current state of each process and want to minimize the rollback needed after recovery.
Advice to users: If an application wants to receive, after the recovery operation, a message that was initiated before an error occurred, it has to reconstruct the communicators in exactly the same order as before.
Advice to implementors: An MPI implementation has to ensure that two sequences that create communicators in an identical manner under different generation counts produce the same communicator/context IDs.
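One way to picture this requirement is an allocator whose state depends only on the order of communicator creation, not on the generation count. The following is a minimal sketch in plain Python, not part of any MPI implementation: the class `ContextIdAllocator`, the helper `build_communicators`, the starting ID of 2, and the communicator names are all hypothetical. It also assumes that derived communicators do not survive a recovery, so the counter can simply restart.

```python
class ContextIdAllocator:
    """Toy allocator: context IDs depend only on creation order."""

    def __init__(self, first_free_id=2):
        # Assumption: IDs below first_free_id are reserved for
        # predefined communicators.
        self.first_free_id = first_free_id
        self.next_id = first_free_id

    def create_communicator(self):
        # Hand out the next free context ID.
        ctx = self.next_id
        self.next_id += 1
        return ctx

    def recover(self):
        # Assumption: derived communicators are gone after recovery,
        # so allocation restarts from the same first ID.
        self.next_id = self.first_free_id


def build_communicators(alloc, names):
    # Create communicators in the given order and record their IDs.
    return {name: alloc.create_communicator() for name in names}


alloc = ContextIdAllocator()
before = build_communicators(alloc, ["row", "col", "halo"])

alloc.recover()  # a new generation begins
after = build_communicators(alloc, ["row", "col", "halo"])

alloc.recover()
reordered = build_communicators(alloc, ["col", "row", "halo"])
```

In this model, `before` and `after` map each name to the same ID, while `reordered` does not, which is why the application has to rebuild its communicators in the original order.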
Blocking operations: A send operation that returned MPI_SUCCESS will deliver the data, even if a failure occurs before the data reaches the destination. If the return code of the send operation is MPI_ERR_OTHER, the operation has to be repeated after the recovery procedure.
Non-blocking operations: If a non-blocking point-to-point operation returned MPI_SUCCESS to a process that has not failed, then the operation will finish successfully. If the corresponding Wait/Test operation returns MPI_ERR_OTHER, the user has to re-post the Wait/Test operation after recovery.
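Both rules lead to the same control flow in user code: run the operation, and if it reports MPI_ERR_OTHER, recover and issue it again. The sketch below models that loop in plain Python without calling any MPI library. The helper `retry_after_recovery`, the class `ScriptedOp`, the attempt limit, and the numeric values of the two constants are illustrative assumptions only.

```python
# Placeholder values for illustration; real codes come from the MPI library.
MPI_SUCCESS = 0
MPI_ERR_OTHER = 1


class ScriptedOp:
    """Toy stand-in for a send or Wait/Test call with scripted results."""

    def __init__(self, codes):
        self.codes = list(codes)
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self.codes.pop(0)


def retry_after_recovery(operation, recover, max_attempts=3):
    """Repeat `operation` after each recovery until it succeeds."""
    for _ in range(max_attempts):
        rc = operation()
        if rc == MPI_SUCCESS:
            return rc
        # The operation reported a failure: recover, then try again.
        recover()
    return MPI_ERR_OTHER


recoveries = []  # one entry per recovery that was run


def recover():
    recoveries.append("recovered")


# One failure followed by success: the operation is issued twice.
send_op = ScriptedOp([MPI_ERR_OTHER, MPI_SUCCESS])
send_rc = retry_after_recovery(send_op, recover)

# A Wait/Test that keeps failing exhausts the attempt limit.
wait_op = ScriptedOp([MPI_ERR_OTHER] * 3)
wait_rc = retry_after_recovery(wait_op, recover)
```

The attempt limit is a design choice of this sketch; the text itself only states that the operation must be repeated or re-posted after recovery.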
posted messages are discarded as soon as a recovery operation has been started.
Rationale: This message mode is useful for all applications that, on error, go back to the last consistent state of the application. For example, going from iteration 432 (when the error occurred) back to iteration 400 (the last checkpoint) means that any message from iteration 432 would be out of place and would disturb the restarted computation.
FTMPI_MSG_MODE_CONT: In this mode, the generation count is not used for message matching. Thus, a message sent from process a to process b before a failure occurred will be delivered after the recovery operation. All operations that returned MPI_SUCCESS to a non-failing process will finish successfully after recovery.
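The effect of ignoring the generation count during matching can be illustrated with a toy model. This is a sketch under stated assumptions, not the library's matching logic: the `Message` record, the function `deliverable_after_recovery`, and the mode labels `"CONT"` and `"DISCARD"` are hypothetical, with `"DISCARD"` standing in for a mode that drops messages from earlier generations.

```python
from dataclasses import dataclass


@dataclass
class Message:
    src: int
    dst: int
    generation: int  # generation count at the time of sending
    payload: str


def deliverable_after_recovery(pending, current_generation, mode):
    """Return the pending messages that may still be matched."""
    if mode == "CONT":
        # Generation count is ignored: everything stays deliverable.
        return list(pending)
    if mode == "DISCARD":
        # Only messages from the current generation survive.
        return [m for m in pending if m.generation == current_generation]
    raise ValueError(f"unknown mode: {mode}")


# A message sent from process 0 to process 1 in generation 0,
# still in flight when a failure triggers recovery.
pending = [Message(src=0, dst=1, generation=0, payload="iteration 432")]
new_generation = 1  # assumption: recovery increments the generation count

kept_cont = deliverable_after_recovery(pending, new_generation, "CONT")
kept_discard = deliverable_after_recovery(pending, new_generation, "DISCARD")
```

In the continuing mode the pre-failure message is still delivered after recovery, whereas a generation-aware match filters it out.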