The BLAS (Basic Linear Algebra Subprograms) [14, 15, 26] include subroutines for common linear algebra computations such as dot-products, matrix-vector multiplication, and matrix-matrix multiplication. As is well known, using matrix-matrix multiplication tuned for a particular architecture can effectively mask the effects of the memory hierarchy (cache misses, TLB misses, etc.), and permit floating point operations to be performed at the top speed of the machine.

As mentioned before, LAPACK, or Linear Algebra PACKage
[1], is a collection of routines for linear
system solving, linear least squares problems, and eigenproblems.
High performance is attained by using algorithms that do most
of their work in calls to the BLAS, with an emphasis on matrix-matrix
multiplication. Each routine has one or more *performance
tuning parameters*, such as the sizes of the blocks operated
on by the BLAS. These parameters are machine dependent, and
are obtained from a table at run-time.

The LAPACK routines are designed for single processors. LAPACK can also accommodate shared memory machines, provided parallel BLAS are available (in other words, the only parallelism is implicit in calls to BLAS). Extensive performance results for LAPACK can be found in the second edition of the users' guide [1].

The BLACS (Basic Linear Algebra Communication Subprograms)
[18] are a message passing library designed for linear
algebra. The computational model consists of a one or two dimensional
grid of processes, where each process stores matrices and vectors.
The BLACS include synchronous send/receive routines to transfer a matrix
or submatrix from one process to another, as well as routines to broadcast
submatrices to many processes and to compute global reductions (sums, maxima, and minima).
There are also routines to construct, change, or query the process grid.
Since several ScaLAPACK algorithms require broadcasts or reductions
among different subsets of processes, the BLACS permit a process
to be a member of several overlapping or disjoint process grids,
each one labeled by a *context*. Some message passing systems,
such as MPI [27, 33], also include this context concept.
(MPI calls this a communicator.) The BLACS
provide facilities for safe interoperation of system contexts and
BLACS contexts.
