There are special challenges associated with writing reliable
numerical software on networks containing heterogeneous
processors, that is, processors which may do floating point
arithmetic differently. This includes not just machines
with completely different floating point formats and
semantics (e.g. Cray versus workstations running IEEE
standard floating point arithmetic), but even supposedly
identical machines running with different compilers or
even just different compiler options or runtime environments. The basic problem
occurs when making *data dependent branches* on different
processors.
The flow of an algorithm is usually data dependent and so
slight variations in the data may lead to different processors executing
completely different sections of code.

A simple example of where an algorithm might not work correctly is an iteration where the stopping criterion depends on the value of the machine precision. If the precision varies from process to process, different processes may have significantly different stopping criteria. In particular, the stopping criterion used by the most accurate process may never be satisfied if it depends on data computed less accurately by other processes.
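
The failure described above can be illustrated with a minimal sketch, written in Python rather than the Fortran of ScaLAPACK. The two precision values, the halving iteration, and the function name `converges` are assumptions made purely for illustration: the error is assumed to stall at the rounding level of the least accurately computed data.

```python
# Hypothetical precisions for two processes (assumed values for illustration).
EPS_ACCURATE = 2.0 ** -53   # e.g. a process working in IEEE double precision
EPS_COARSE = 2.0 ** -24     # e.g. a process working in IEEE single precision


def converges(local_eps, data_eps, max_iter=200):
    """Return True if the test `error <= local_eps` is ever satisfied.

    The error halves at each step but cannot drop below `data_eps`,
    the rounding level of the least accurately computed data.
    """
    error = 1.0
    for _ in range(max_iter):
        error = max(error / 2.0, data_eps)  # error floor set by the coarse data
        if error <= local_eps:
            return True
    return False


# Each process tests against its own precision, but the shared data
# carry an error of about EPS_COARSE.
accurate_stops = converges(EPS_ACCURATE, EPS_COARSE)  # criterion is never met
coarse_stops = converges(EPS_COARSE, EPS_COARSE)      # criterion is met
```

In this sketch the less accurate process stops while the more accurate one runs until its iteration limit, which is the mismatch the paragraph describes.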

Many such problems can be eliminated by using the *largest* machine
precision among all participating processes, that is, the precision of
the least accurate process. In LAPACK, the routine `DLAMCH` returns the
(double precision) machine precision (as well as other machine
parameters). In ScaLAPACK this is replaced by `PDLAMCH`, which returns
the largest value over all the processes, replacing the uniprocessor
value returned by `DLAMCH`. Similarly, one should use the smallest
overflow threshold and the largest underflow threshold over the
processes being used. In a non-homogeneous environment the ScaLAPACK
routine `PDLAMCH` runs the LAPACK routine `DLAMCH` on each process and
computes the relevant maximum or minimum value. We refer to these
machine parameters as the *multiprocessor machine parameters*.
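
The reduction behind the multiprocessor machine parameters can be sketched as follows. This is not the `PDLAMCH` implementation, which gathers the values by communicating between processes; it is a single-process Python illustration in which the class `MachineParams`, the function `multiprocessor_params`, and the values assigned to the second process are all hypothetical.

```python
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineParams:
    eps: float        # machine precision
    overflow: float   # overflow threshold
    underflow: float  # underflow threshold


def multiprocessor_params(per_process):
    """Combine per-process parameters into the most conservative set."""
    return MachineParams(
        eps=max(p.eps for p in per_process),              # largest precision
        overflow=min(p.overflow for p in per_process),    # smallest overflow threshold
        underflow=max(p.underflow for p in per_process),  # largest underflow threshold
    )


# Double precision values of the machine running this sketch.
local = MachineParams(sys.float_info.epsilon, sys.float_info.max, sys.float_info.min)
# A hypothetical second process with coarser, single-precision-like values (assumed).
other = MachineParams(2.0 ** -24, 3.0e38, 1.2e-38)

combined = multiprocessor_params([local, other])
```

Every process would then use `combined` in place of its own local values, so that tests depending on machine parameters are evaluated against the same thresholds everywhere.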

It should be noted that if the code contains communication between processes within an iteration, it will not complete if one process converges before the others. In a heterogeneous environment, the only way to guarantee termination is to have one process make the convergence decision and broadcast that decision. Further problems and suggested solutions are discussed in [12, 6].
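
The difference between independent and broadcast convergence decisions can be sketched in a few lines of Python. This is a single-process simulation with no real communication; the function names and the residual and tolerance values are assumptions chosen for illustration, and in a real code the broadcast would be a communication call.

```python
def local_decisions(residuals, tols):
    """Each process applies its own test to its own view of the residual."""
    return [r <= t for r, t in zip(residuals, tols)]


def broadcast_decision(residuals, tols, root=0):
    """Only `root` applies the test; every process adopts its answer."""
    decision = residuals[root] <= tols[root]
    return [decision for _ in residuals]


# Hypothetical iteration step: two processes with different tolerances.
residuals = [1.0e-8, 1.0e-8]
tols = [1.0e-7, 1.0e-16]

independent = local_decisions(residuals, tols)  # the processes disagree
agreed = broadcast_decision(residuals, tols)    # all follow the root's decision
```

With independent decisions the first process would leave the loop while the second waits for a message that never arrives; with the broadcast, both leave at the same step.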

Thu Jul 25 15:38:00 EDT 1996