Cholesky factorization factors an
, symmetric, positive-definite
matrix
into
the product of a lower triangular matrix
and its transpose, i.e.,
(or
, where
is upper triangular).
It is assumed that the lower triangular portion of
is stored
in the lower triangle of a two-dimensional array
and that the computed elements of
overwrite the given elements of
.
At the
-th step, we partition the
matrices
,
,
and
, and write the system as

where the block
is
,
is
,
and
is
.
and
are lower triangular.
The block-partitioned form of
Cholesky factorization may be inferred inductively as follows.
If we assume that
, the lower triangular Cholesky factor
of
, is known, we can rearrange the block equations,

A snapshot of the block Cholesky factorization algorithm in
Figure 5 shows
how the column panel
(
and
) is computed
and how the trailing submatrix
is updated.
The factorization can be done by recursively applying the steps
outlined above to the
matrix
.
Figure 5:
A snapshot of block Cholesky factorization.
In the right-looking version of the LAPACK routine, the computation of the above steps involves the following operations:
.
,
The parallel implementation of the corresponding ScaLAPACK routine, PDPOTRF, proceeds as follows:
, which has the
diagonal block
,
performs the Cholesky factorization of
.
performs
, and
sets a flag if
is not positive definite.
broadcasts the flag to all other processes
so that the computation can be stopped if
is not positive definite.
is broadcast columnwise by
down all rows in
the current column of processes, which
computes the column of blocks of
.
is broadcast rowwise across
all columns of processes and then transposed.
Now, processes have their own portions of
and
.
They update their local portions of the matrix
.