Scalable High Performance Libraries in Linear Algebra

Kinds of Problems, Algorithms, and Parallelism

Why Are We Still Doing This?

The Importance of Exploiting Mathematical Structure

Changing Algorithmic Approach

Software Technology & Performance

Different Architectures

High-Performance Computing Directions

Performance Issues - Cache & Bandwidth

Where Do the Flops Go? Memory Hierarchy

BLAS for Performance

How To Get Performance From Commodity Processors?

Code Generation Strategy

ATLAS 500x500 DGEMM Across Various Architectures

Recursive Approach for Other Level 3 BLAS

500x500 Recursive BLAS on 433 MHz DEC 21164

LAPACK

ScaLAPACK for Distributed Memory

Choosing a Data Distribution

Possible Data Layouts

New Algorithms/Software

Parallelism in ScaLAPACK

ScaLAPACK - What’s Included?

Heterogeneous Computing

Sparse Direct Methods

Example: SuperLU (X. S. Li and J. Demmel, UCB)

Sparse Gaussian Elimination

Many Sparse Direct Solvers

Iterative Solvers - Krylov Subspace Methods

Decision Tree

Iterative Solvers

Three Good Starting Places

DOE ASCI Program: Stockpile Stewardship via Numerical Simulation and Non-nuclear Experiments

Application needs drive the platforms and software infrastructure development

ASCI is Developing Application Codes that Scale to Thousands of Processors

Scalable Algorithms are the Key to Terascale Simulation

Multigrid for Scalable Solvers

Numerical Libraries

Research Directions

Conclusions

Thanks to…

Email: dongarra@cs.utk.edu

Home Page: http://www.netlib.org/utk/people/JackDongarra/