Can we provide portable software for computations in dense linear algebra that is efficient on a wide range of modern high-performance computers? If so, how? Answering these questions -- and providing the desired software -- has been the goal of the LAPACK project.
LINPACK and EISPACK [92,54] have for many years provided high-quality portable software for linear algebra, but on modern high-performance computers they often achieve only a small fraction of the machines' peak performance. Therefore, LAPACK has been designed to supersede LINPACK and EISPACK, principally by achieving much greater efficiency -- but at the same time also adding extra functionality, using some new or improved algorithms, and integrating the two sets of algorithms into a single package.
LAPACK was originally targeted to achieve good performance on single-processor vector machines and on shared-memory multiprocessor machines with a modest number of powerful processors. Since the start of the project, another class of machines has emerged for which LAPACK software is equally well suited: the high-performance "super-scalar" workstations. (LAPACK is intended to be used across the whole spectrum of modern computers, but when considering performance, the emphasis is on machines at the more powerful end of the spectrum.)
Here we discuss the main factors that affect the performance of linear algebra software on these classes of machines.