Parallelizing the preconditioner solve

next up previous contents index
Next: Block factorization methods Up: Point incomplete factorizations Previous: Vectorization of the

Parallelizing the preconditioner solve


The algorithms for vectorization outlined above can be used on parallel computers. For instance, variables on a wavefront  can be processed in parallel, by dividing the wavefront over processors. More radical approaches for increasing the parallelism in incomplete factorizations are based on a renumbering of the problem variables. For instance, on rectangular domains one could start numbering the variables from all four corners simultaneously, thereby creating four simultaneous wavefronts, and therefore four-fold parallelism (see Dongarra, et al. [71], Van der Vorst [204][202]) . The most extreme case is the red/black ordering (or for more general matrices the multi-color ordering) which gives the absolute minimum number of sequential steps.

Multi-coloring is also an attractive method for vector computers. Since points of one color are uncoupled, they can be processed as one vector; see Doi [68], Melhem [154], and Poole and Ortega [176].

However, for such ordering strategies there is usually a trade-off between the degree of parallelism and the resulting number of iterations. The reason for this is that a different ordering may give rise to a different error matrix, in particular the norm of the error matrix may vary considerably between orderings. See experimental results by Duff and Meurant [79] and a partial explanation of them by Eijkhout [85].    

Jack Dongarra
Mon Nov 20 08:52:54 EST 1995