In addition to the usual matrix-vector product, inner products, and vector updates, the preconditioned GMRES method (see §) has a kernel in which one new vector, *Av_j*, is orthogonalized against the previously built orthogonal set {*v_1*, *v_2*, ..., *v_j*}. In our version this is done using Level 1 BLAS, which may be quite inefficient. To incorporate Level 2 BLAS we can apply either Householder orthogonalization or classical Gram-Schmidt twice (which mitigates classical Gram-Schmidt's potential instability; see Saad [179]). Both approaches significantly increase the computational work, but classical Gram-Schmidt has the advantage that all inner products can be performed simultaneously; that is, their communication can be packaged. This may increase the efficiency of the computation significantly.
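The twice-applied classical Gram-Schmidt kernel can be sketched as follows. This is a minimal NumPy illustration (not the book's implementation, and the function name `cgs2_orthogonalize` is ours): note how each pass computes all inner products at once as one matrix-vector product `Q.T @ w`, which is exactly the operation whose communication can be packaged into a single reduction on a distributed machine.

```python
import numpy as np

def cgs2_orthogonalize(Q, w):
    """Orthogonalize w against the orthonormal columns of Q using
    classical Gram-Schmidt applied twice (CGS2).

    Each pass forms all inner products simultaneously as Q.T @ w
    (a Level 2 BLAS operation), then subtracts all projections in
    one update Q @ h. Two passes mitigate the potential instability
    of a single classical Gram-Schmidt sweep.
    """
    for _ in range(2):
        h = Q.T @ w        # all inner products at once
        w = w - Q @ h      # subtract all projections at once
    return w / np.linalg.norm(w)

# Small usage example with a random orthonormal set.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((50, 5)))
v = rng.standard_normal(50)
v_new = cgs2_orthogonalize(Q, v)
```

After the call, `v_new` has unit norm and is orthogonal to the columns of `Q` to roughly machine precision.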

Another way to obtain more parallelism and data locality is to generate a basis {*v*, *Av*, ..., *A^m v*} for the Krylov subspace first, and to orthogonalize this set afterwards; this is called *m*-step GMRES(*m*) (see Kim and Chronopoulos [137]). (Compare this to the GMRES method in §, where each new vector is immediately orthogonalized to all previous vectors.) This approach does not increase the computational work and, in contrast to CG, the numerical instability due to generating a possibly near-dependent set is not necessarily a drawback.
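The basis-first strategy can be sketched in a few lines of NumPy (a simplified illustration under our own assumptions, not the *m*-step GMRES(*m*) algorithm of Kim and Chronopoulos): the loop performs only matrix-vector products, which parallelize well and need no inner-product synchronization, and the whole set is orthogonalized afterwards in one step, here via a QR factorization.

```python
import numpy as np

def m_step_basis(A, v, m):
    """Generate the Krylov basis {v, Av, ..., A^m v} first, then
    orthogonalize the whole set afterwards with a single QR
    factorization.

    The generation loop contains only matrix-vector products (good
    parallelism and data locality, no inner products interleaved).
    The power basis may be nearly dependent, which QR tolerates as
    long as the set remains numerically full rank.
    """
    V = np.empty((v.size, m + 1))
    V[:, 0] = v
    for j in range(m):
        V[:, j + 1] = A @ V[:, j]   # matrix-vector products only
    Q, _ = np.linalg.qr(V)          # orthogonalize everything at once
    return Q

# Usage example on a random matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 40))
v = rng.standard_normal(40)
Q = m_step_basis(A, v, 4)
```

The returned `Q` spans the same Krylov subspace as {*v*, *Av*, ..., *A^m v*} and has orthonormal columns.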