LU Factorization



next up previous
Next: QR Factorization Up: Factorization Routines Previous: Factorization Routines

LU Factorization

 

The LU factorization applies a sequence of Gaussian eliminations to form , where and are matrices, and is an matrix. is unit lower triangular (lower triangular with 1's on the main diagonal), is upper triangular, and is a permutation matrix, which is stored in a vector.

At the -th step of the computation (), it is assumed that the submatrix of (, ) is to be partitioned as follows,

where the block is , is , is , and is . is a unit lower triangular matrix, and is an upper triangular matrix.

At first, a sequence of Gaussian eliminations is performed on the first panel of (i.e., and ). Once this is completed, the matrices , , and are known, and we can rearrange the block equations,

The LU factorization can be done by recursively applying the steps outlined above to the matrix . Figure 3 shows a snapshot of the block LU factorization. It shows how the column panel, and , and the row panel, and , are computed, and how the trailing submatrix is updated. In the figure, the shaded areas represent data for which the corresponding computations are completed. Later, row interchanges will be applied to and .

  
Figure 3: A snapshot of block LU factorization.

The computation of the above steps in the LAPACK routine, DGETRF, involves the following operations:

  1. DGETF2: Apply the LU factorization on an column panel of (i.e., and ).

  2. DLASWP: Apply row interchanges to the left and the right of the panel.

  3. DTRSM: Compute the row panel of ,

  4. DGEMM: Update the rest of the matrix, ,

The corresponding parallel implementation of the ScaLAPACK routine, PDGETRF, proceeds as follows:

  1. PDGETF2: The current column of processes performs the LU factorization on an panel of (i.e., and ).

  2. PDLASWP: All processes apply row interchanges to the left and the right of the current panel.

  3. PDTRSM: is broadcast along the current row of processes, which converts the row panel to .

  4. PDGEMM: The column panel is broadcast rowwise across all columns of processes. The row panel is broadcast columnwise down all rows of processes. Then, all processes update their local portions of the matrix, .



next up previous
Next: QR Factorization Up: Factorization Routines Previous: Factorization Routines



Susan Ostrouchov
Fri Apr 28 09:37:26 EDT 1995