Suppose the matrix A is partitioned as shown in Figure
5, and we seek a factorization A=LU, where the partitioning
of L and U is also shown in Figure 5. Then we may write,
where is , is , is , and is . and are lower triangular matrices with 1s on the main diagonal, and and are upper triangular matrices.
Figure 5: Block LU factorization of the partitioned matrix A. is , is , is , and is . and are lower triangular matrices with 1's on the main diagonal, and and are upper triangular matrices.
Equations 1 and 2 taken together perform an LU
factorization on the first panel of A (i.e., and
). Once this is completed,
the matrices , , and are known, and the lower
triangular system in Eq. 3 can be solved to give .
Finally, we rearrange Eq. 4 as,
From this equation we see that the problem of finding and reduces to finding the LU factorization of the matrix . This can be done by applying the steps outlined above to instead of to A. Repeating these steps K times, where
where denotes the least integer greater than or equal to x, we obtain the LU factorization of the original matrix A. For an in-place algorithm, A is overwritten by L and U - the 1s on the diagonal of L do not need to be stored explicitly. Similarly, when A is updated by Eq. 5 this may also be done in place.
After k of these K steps, the first kr columns of L and the first kr rows of U have been evaluated, and matrix A has been updated to the form shown in Figure 6, in which panel B is and C is . Step k+1 then proceeds as follows,
Figure 6: Stage k+1 of the block LU factorization algorithm showing how the panels B and C, and the trailing submatrix E are updated. The trapezoidal submatrices L and U have already been factored in previous steps. L has kr columns, and U has kr rows. In the step shown another r columns of L and r rows of U are evaluated.
The LAPACK implementation of this form of LU factorization uses the Level 3 BLAS routines xTRSM and xGEMM to perform the triangular solve and rank-r update. We can regard the algorithm as acting on matrices that have been partitioned into blocks of elements, as shown in Figure 7.
Figure 7: Block-partitioned matrix A. Each block consists of matrix elements.