References


1: M. Aboelaze, N. Chrisochoides, and E. Houstis. ``The Parallelization of Level 2 and 3 BLAS Operations on Distributed Memory Machines''. Technical Report CSD-TR-91-007, Purdue University, West Lafayette, IN, 1991.
2: E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov, and D. Sorensen. ``LAPACK Users' Guide, Second Edition''. SIAM, Philadelphia, PA, 1995.
3: R. Brent and P. Strazdins. ``Implementation of BLAS Level 3 and LINPACK Benchmark on the AP1000''. Fujitsu Scientific and Technical Journal, 5(1):61-70, 1993.
4: J. Choi, J. Demmel, I. Dhillon, J. Dongarra, S. Ostrouchov, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley. ``ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance''. Technical Report UT CS-95-283, LAPACK Working Note #95, University of Tennessee, 1995.
5: J. Choi, J. Demmel, I. Dhillon, J. Dongarra, S. Ostrouchov, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley. ``Installation Guide for ScaLAPACK''. Technical Report UT CS-95-280, LAPACK Working Note #93, University of Tennessee, 1995.
6: J. Choi, J. Dongarra, S. Ostrouchov, A. Petitet, D. Walker, and R. C. Whaley. ``The Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines''. Technical Report UT CS-94-246, LAPACK Working Note #80, University of Tennessee, 1994.
7: J. Choi, J. Dongarra, R. Pozo, and D. Walker. ``ScaLAPACK: A Scalable Linear Algebra Library for Distributed Memory Concurrent Computers''. In Proceedings of Fourth Symposium on the Frontiers of Massively Parallel Computation (McLean, Virginia), pages 120-127. IEEE Computer Society Press, Los Alamitos, California, 1992. (also LAPACK Working Note #55).
8: J. Choi, J. Dongarra, and D. Walker. ``Parallel Matrix Transpose Algorithms on Distributed Memory Concurrent Computers''. In Proceedings of Fourth Symposium on the Frontiers of Massively Parallel Computation (McLean, Virginia), pages 245-252. IEEE Computer Society Press, Los Alamitos, California, 1993. (also LAPACK Working Note #65).
9: J. Choi, J. Dongarra, and D. Walker. ``PB-BLAS: A Set of Parallel Block Basic Linear Algebra Subroutines''. In ``Proceedings of the Scalable High Performance Computing Conference'', pages 534-541, Knoxville, TN, 1994. IEEE Computer Society Press.
10: J. Choi, J. Dongarra, and D. Walker. ``PUMMA: Parallel Universal Matrix Multiplication Algorithms on Distributed Memory Concurrent Computers''. Concurrency: Practice and Experience, 6(7):543-570, 1994. (also LAPACK Working Note #57).
11: J. Dongarra, J. Du Croz, I. Duff, and S. Hammarling. ``A Set of Level 3 Basic Linear Algebra Subprograms''. ACM Transactions on Mathematical Software, 16(1):1-17, 1990.
12: J. Dongarra, J. Du Croz, S. Hammarling, and R. Hanson. ``Algorithm 656: An Extended Set of Basic Linear Algebra Subprograms: Model Implementation and Test Programs''. ACM Transactions on Mathematical Software, 14(1):18-32, 1988.
13: J. Dongarra, R. van de Geijn, and D. Walker. ``Scalability Issues in the Design of a Library for Dense Linear Algebra''. Journal of Parallel and Distributed Computing, 22(3):523-537, 1994. (also LAPACK Working Note #43).
14: J. Dongarra and R. C. Whaley. ``A User's Guide to the BLACS v1.0''. Technical Report UT CS-95-281, LAPACK Working Note #94, University of Tennessee, 1995.
15: A. Elster. ``Basic Matrix Subprograms for Distributed Memory Systems''. In D. Walker and Q. Stout, editors, ``Proceedings of the Fifth Distributed Memory Computing Conference'', pages 311-316. IEEE Press, 1990.
16: R. Falgout, A. Skjellum, S. Smith, and C. Still. ``The Multicomputer Toolbox Approach to Concurrent BLAS''. Submitted to Concurrency: Practice and Experience, 1993. (preprint).
17: Message Passing Interface Forum. ``MPI: A Message Passing Interface Standard''. International Journal of Supercomputer Applications and High Performance Computing, 8(3-4), 1994.
18: G. Fox, M. Johnson, G. Lyzenga, S. Otto, J. Salmon, and D. Walker. ``Solving Problems on Concurrent Processors'', volume 1. Prentice Hall, Englewood Cliffs, NJ, 1988.
19: G. Fox, S. Otto, and A. Hey. ``Matrix Algorithms on a Hypercube I: Matrix Multiplication''. Parallel Computing, 3:17-31, 1987.
20: A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam. ``PVM: Parallel Virtual Machine. A User's Guide and Tutorial for Networked Parallel Computing''. The MIT Press, Cambridge, Massachusetts, 1994.
21: R. Hanson, F. Krogh, and C. Lawson. ``A Proposal for Standard Linear Algebra Subprograms''. ACM SIGNUM Newsl., 8(16), 1973.
22: S. Huss-Lederman, E. Jacobson, A. Tsao, and G. Zhang. ``Matrix Multiplication on the Intel Touchstone DELTA''. Concurrency: Practice and Experience, 6(7):571-594, 1994.
23: C. Koelbel, D. Loveman, R. Schreiber, G. Steele, and M. Zosel. ``The High Performance Fortran Handbook''. The MIT Press, Cambridge, Massachusetts, 1994.
24: C. Lawson, R. Hanson, D. Kincaid, and F. Krogh. ``Basic Linear Algebra Subprograms for Fortran Usage''. ACM Transactions on Mathematical Software, 5(3):308-323, 1979.
25: H. Schildt. ``The Annotated ANSI C Standard. American National Standard for Programming Languages - C. ANSI/ISO 9899-1990''. Osborne, Berkeley, CA, 1990.
26: R. van de Geijn and J. Watts. ``SUMMA: Scalable Universal Matrix Multiplication Algorithm''. Technical Report UT CS-95-286, LAPACK Working Note #96, University of Tennessee, 1995.
27: R. C. Whaley. ``Basic Linear Algebra Communication Subprograms: Analysis and Implementation Across Multiple Parallel Architectures''. Technical Report UT CS-94-234, LAPACK Working Note #73, University of Tennessee, 1994.

Jack Dongarra
Thu Aug 3 07:53:00 EDT 1995