To derive a block form of Cholesky
factorization, we partition the matrices as shown in Figure
4, in which the diagonal blocks of *A* and *U* are square,
but of differing sizes. We assume that the first block has already been
factored as $A_{11} = U_{11}^T U_{11}$, and that we now want to determine
the second block column of *U* consisting of the blocks $U_{12}$ and
$U_{22}$.
Equating submatrices in the second block of columns, we obtain

$$A_{12} = U_{11}^T U_{12}, \qquad A_{22} = U_{12}^T U_{12} + U_{22}^T U_{22}.$$
Hence, since $U_{11}$ has already been computed, we can compute $U_{12}$ as
the solution to the equation

$$U_{11}^T U_{12} = A_{12}$$
by a call to the Level 3 BLAS routine STRSM; and then we can compute $U_{22}$
from

$$U_{22}^T U_{22} = A_{22} - U_{12}^T U_{12}.$$
This involves first updating the symmetric submatrix $A_{22}$ (forming
$A_{22} - U_{12}^T U_{12}$) by a call to the Level 3 BLAS routine SSYRK,
and then computing its Cholesky factorization.
Since Fortran 77 does not allow recursion, this factorization is performed by
a separate routine that uses Level 2 rather than Level 3 BLAS, named SPOTF2
in Figure 3.
In this way, successive blocks of columns of *U* are computed.
The LAPACK-style code for the block algorithm is shown in Figure
3.
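The three steps above can be sketched in pure Python (the original routine is in Fortran 77). This is a minimal illustration, not the LAPACK implementation: the function names `chol_unblocked` and `chol_blocked` are made up for this sketch, and the loops stand in for what LAPACK delegates to STRSM, SSYRK, and SPOTF2.

```python
import math

def chol_unblocked(A):
    """Unblocked Cholesky (the role SPOTF2 plays): return the upper
    triangular U with A = U^T U, for a symmetric positive definite A
    given as a list of rows."""
    n = len(A)
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal entry: subtract the contributions of the rows above.
        U[i][i] = math.sqrt(A[i][i] - sum(U[k][i] ** 2 for k in range(i)))
        # Remaining entries of row i of U.
        for j in range(i + 1, n):
            U[i][j] = (A[i][j]
                       - sum(U[k][i] * U[k][j] for k in range(i))) / U[i][i]
    return U

def chol_blocked(A, nb):
    """Blocked Cholesky: process block columns of width nb, following the
    three steps described in the text."""
    n = len(A)
    U = [[0.0] * n for _ in range(n)]
    for j0 in range(0, n, nb):
        j1 = min(j0 + nb, n)
        # Step 1 (STRSM's role): solve U11^T U12 = A12 for U12 by forward
        # substitution; U11 = U[:j0][:j0] was computed in earlier steps.
        for c in range(j0, j1):
            for i in range(j0):
                s = A[i][c] - sum(U[k][i] * U[k][c] for k in range(i))
                U[i][c] = s / U[i][i]
        # Step 2 (SSYRK's role): form the updated block A22 - U12^T U12.
        S = [[A[j0 + i][j0 + c]
              - sum(U[k][j0 + i] * U[k][j0 + c] for k in range(j0))
              for c in range(j1 - j0)] for i in range(j1 - j0)]
        # Step 3 (SPOTF2's role): factor the small block S = U22^T U22.
        U22 = chol_unblocked(S)
        for i in range(j1 - j0):
            for c in range(i, j1 - j0):
                U[j0 + i][j0 + c] = U22[i][c]
    return U
```

Note that the last block column may be narrower than `nb` when `nb` does not divide *n*, which is why the diagonal blocks in the partition are square but of differing sizes.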
This code runs at 49 megaflops on an IBM 3090, more than double the speed of
the LINPACK code.
On a CRAY Y-MP, the use of Level 3 BLAS squeezes a little more performance
out of one processor, but makes a large improvement when using all 8
processors.
Table 2 summarizes the results.

**Figure 3:** The body of the ``LAPACK-style'' routine SPOFA for block Cholesky
factorization. In this code fragment, `nb` denotes the width of
the blocks.

**Figure 2:** The body of the ``LAPACK-style'' routine SPOFA for Cholesky
factorization.

**Figure 1:** The body of the LINPACK routine SPOFA for Cholesky factorization.

**Figure 4:** Partitioning of *A*, $U^T$, and *U* into blocks. It is assumed that
the first block has already been factored as $A_{11} = U_{11}^T U_{11}$, and we
next want to determine the block column consisting of $U_{12}$ and $U_{22}$.
Note that the diagonal blocks of *A* and *U* are square matrices.

**Table 2:** Speed (Megaflops) of Cholesky Factorization for *n* = 500
