The traditional
algorithm for *QR*
factorization
is based on the use
of elementary Householder
matrices of the general
form

where *v* is a column
vector and is
a scalar. This leads
to an algorithm with
very good vector
performance, especially
if coded to use Level 2
PBLAS.

The key to developing
a distributed block form
of this algorithm is to
represent a product of
*b* elementary Householder
matrices of order *n* as
a block form of a Householder
matrix .
This can be done in various
ways. ScaLAPACK uses the
following form [31]:

where *V* is an *n* by *b*
matrix whose columns are the
individual vectors associated with
the Householder matrices , and *T*
is an upper triangular matrix
of order *b*. Extra work is
required to compute the
elements of *T*, but once again
this is compensated for by the
greater speed of applying the
block form. Table 7
summarizes results obtained with
the ScaLAPACK routine
PDGEQRF .

**Table 7:** Speed in Megaflop/s of PDGEQRF for Square Matrices of
Order *N*

Sat Feb 1 08:18:10 EST 1997