Concurrent matrix algorithms were among the
first to be studied on the hypercubes at Caltech [Fox:82a], and
they have also been intensively studied at other institutions, notably
Yale [Ipsen:87b], [Johnsson:87b;89a],
[Saad:85a], and Oak Ridge National
Laboratory [Geist:86a;89a], [Romine:87a;90a].
The
motivation for this interest is the fact that matrix algorithms play a
prominent role in many scientific and engineering computations. In
this chapter, we study the so-called *full* or *dense* (and
closely related *banded*) matrix algorithms,
where essentially all elements of the matrix are nonzero. In
Chapters 9 and 12, we will treat the more common case
of problems that, if formulated as matrix equations, are represented
by *sparse* matrices. Here, most of the elements of the matrix
are zero; one can apply full matrix algorithms to such sparse cases,
but there are much better algorithms that exploit the sparseness to
reduce the computational complexity. Within CP, we found two
classes of important problems that needed full matrix algorithms. In
Sections 8.2 and 8.3, we cover two chemical scattering
problems, which involve relatively small full matrices, where the
matrix rows and columns are labelled by the different reaction
channels. The currently most important real-world use of full matrix
algorithms is computational electromagnetic simulations
[Edelman:92a]. These are used by the defense industry to design
aircraft and other military vehicles with low radar cross sections.
Solutions of large sets of linear equations come from the method of
moments approach to electromagnetic equations [Wang:91b]. We
investigated this method successfully at JPL [Simoni:89a], but in
this book, we only describe (in Section 9.4) the alternative
finite-element approach to electromagnetic simulations. Such
a sparse matrix formulation will be more efficient
for large electromagnetic problems, but the moment method, with its
associated full matrix, is probably the most common and numerically
reliable approach.

Early work at Caltech on full matrices (1983 to 1987) focused on specific algorithms, such as matrix multiplication, matrix-vector products, and LU decomposition. A major issue in determining the optimal algorithm for these problems is choosing a decomposition which has good load balance and low communication overhead. Many matrix algorithms proceed in a series of steps in which rows and/or columns are successively made inactive. The scattered decomposition described in Section 8.1.2 is usually used to balance the load in such cases. The block decomposition, also described in Section 8.1.2, generally minimizes the amount of data communicated, but results in sending several short messages rather than a few longer messages. Thus, a block decomposition is optimal for a multiprocessor with low message latency, or startup cost, such as the Caltech/JPL Mark II hypercube. For machines with high message latency, such as the Intel iPSC/1, a row decomposition may be preferable. The best decomposition, therefore, depends crucially on the characteristics of the concurrent hardware.
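The trade-off can be illustrated with a small sketch (hypothetical Python, not taken from the CP libraries), which counts how many active rows each processor still owns after the leading rows of the matrix have been made inactive, as happens step by step in LU decomposition:

```python
# Sketch: two row decompositions of an n x n matrix over p processors.
# Illustrative only; the names block_owner/scattered_owner are ours.

def block_owner(row, n, p):
    """Block decomposition: contiguous slabs of n/p rows per processor."""
    return row // (n // p)

def scattered_owner(row, n, p):
    """Scattered (cyclic) decomposition: row i goes to processor i mod p."""
    return row % p

def active_load(owner, n, p, inactive):
    """Active rows held by each processor after the first
    `inactive` rows have been eliminated."""
    load = [0] * p
    for row in range(inactive, n):
        load[owner(row, n, p)] += 1
    return load

if __name__ == "__main__":
    n, p = 16, 4
    for k in (0, 8):
        print(f"rows eliminated: {k:2d}",
              " block:", active_load(block_owner, n, p, k),
              " scattered:", active_load(scattered_owner, n, p, k))
```

With sixteen rows on four processors, eliminating the first eight rows leaves the block decomposition with loads of [0, 0, 4, 4], while the scattered decomposition retains [2, 2, 2, 2]; this is the load-balancing property that motivates the scattered decomposition of Section 8.1.2.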

In recent years (1988-1990), interest has centered on the development of libraries of concurrent linear algebra routines. As discussed in Section 8.1.7, two approaches have been followed at Caltech. One approach, by Fox et al., has led to a library of routines that are optimal for low-latency, homogeneous hypercubes, such as the Caltech/JPL Mark II hypercubes. In contrast, Van de Velde has developed a library of routines that are generally suboptimal, but which can be ported to a wider range of multiprocessor architectures and are suitable for incorporation into programs with dynamically changing data distributions.

- 8.1.1 Matrix Decomposition
- 8.1.2 Basic Matrix Arithmetic
- 8.1.3 Matrix Multiplication for Banded Matrices
- 8.1.4 Systems of Linear Equations
- 8.1.5 The Gauss-Jordan Method
- 8.1.6 Other Matrix Algorithms
- 8.1.7 Concurrent Linear Algebra Libraries
- 8.1.8 Problem Structure
- 8.1.9 Conclusions
