Papers: See my Google Scholar page for a close-to-comprehensive list of publications and my ResearchGate page for the full text of many.  (My DBLP page provides a good list too, organized nicely by year, and with a co-author index.)


- 2021 -

A Set of Batched Basic Linear Algebra Subprograms, Abdelfattah, A., T. Costa, J. Dongarra, M. Gates, A. Haidar, S. Hammarling, N. J. Higham, J. Kurzak, P. Luszczek, S. Tomov, et al.,  ACM Transactions on Mathematical Software, accepted October 2020.

Efficient Exascale Discretizations: High-Order Finite Element Methods, Kolev, Tzanio; Fischer, Paul; Min, Misun; Dongarra, Jack; Brown, Jed; Dobrev, Veselin; Warburton, Timothy; Tomov, Stanimire; Shephard, Mark; Abdelfattah, Ahmad; Barra, Valeria; Beams, Natalie; Camier, Jean-Sylvain; Chalmers, Noel; Dudouit, Yohann; Karakus, Ali; Karlin, Ian; Kerkemeier, Stefan; Lan, Yu-Hsiang; Medina, David; Merzari, Elia; Obabko, Aleksandr; Pazner, Will; Rathnayake, Thilina; Smith, Cameron; Spies, Lukas; Świrydowicz, Kasia; Thompson, Jeremy; Tomboulides, Ananias; Tomov, Vladimir, accepted in International Journal of High Performance Computing Applications, May 2021.

A PDF version is available.

A Survey of Numerical Linear Algebra Methods Utilizing Mixed Precision Arithmetic, Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin Carson, Terry Cojean, Jack Dongarra, Alyson Fox, Mark Gates, Nicholas J. Higham, Xiaoye S. Li, Jennifer Loe, Piotr Luszczek, Srikara Pranesh, Siva Rajamanickam, Tobias Ribizel, Barry F. Smith, Kasia Swirydowicz, Stephen Thomas, Stanimire Tomov, Yaohung M. Tsai, and Ulrike Meier Yang, International Journal of High Performance Computing Applications, February 2021. https://journals.sagepub.com/doi/10.1177/10943420211003313

A PDF version is available.

Distributed-Memory Multi-GPU Block-Sparse Tensor Contraction for Electronic Structure, Herault, T., Y. Robert, G. Bosilca, R. Harrison, C. Lewis, E. Valeev, and J. Dongarra, 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021, accepted December 2020.

A PDF version is available.

Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems, Cao, Q., Y. Pei, K. Akbudak, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, 35th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2021), Portland, OR, IEEE, May 2021, accepted December 2020.

A PDF version is available.

Revisiting Credit Distribution Algorithms for Distributed Termination Detection, George Bosilca, Aurelien Bouteiller, Thomas Herault, Valentin Le Fèvre, Yves Robert and Jack Dongarra, IPDPS-APDCM2021 (Workshop on Advances in Parallel and Distributed Computational Models), accepted March 2021.

A PDF version is available.

Accelerating Geostatistical Modeling and Prediction With Mixed-Precision Computations: A High-Productivity Approach with PaRSEC, Sameh Abdulah, George Bosilca, Qinglei Cao, Jack Dongarra, Marc Genton, David Keyes, Hatem Ltaief, Yu Pei, Ying Sun, accepted in IEEE Transactions on Parallel and Distributed Computing, May 2021.

A PDF version is available.

- 2020 -

Harnessing the Computing Continuum for Programming Our World, Beckman, P., J. Dongarra, N. Ferrier, G. Fox, T. Moore, D. Reed, and M. Beck, Fog Computing: Theory and Practice, John Wiley & Sons, Inc., 2020. DOI: 10.1002/9781119551713.ch7
A PDF version is available.


MAGMA Templates for Scalable Linear Algebra on Emerging Architectures, Al Farhan, M., A. Abdelfattah, S. Tomov, M. Gates, D. Sukkari, A. Haidar, R. Rosenberg, and J. Dongarra, The International Journal of High Performance Computing Applications, vol. 34, issue 6, pp. 645-658, November 2020. DOI: https://doi.org/10.1177/1094342020938421
A PDF version is available.


Design, Optimization, and Benchmarking of Dense Linear Algebra Algorithms on AMD GPUs, Brown, C., A. Abdelfattah, S. Tomov, and J. Dongarra, 2020 IEEE High Performance Extreme Computing Virtual Conference: IEEE, September 2020.

A PDF version is available.

HAN: A Hierarchical AutotuNed Collective Communication Framework, Luo, X., W. Wu, G. Bosilca, Y. Pei, Q. Cao, T. Patinyasakdikul, D. Zhong, and J. Dongarra, IEEE Cluster Conference, Kobe, Japan, Best Paper Award, IEEE Computer Society Press, September 2020.

A PDF version is available.

Flexible Data Redistribution in a Task-Based Runtime System, Cao, Q., G. Bosilca, W. Wu, D. Zhong, A. Bouteiller, and J. Dongarra, IEEE International Conference on Cluster Computing (Cluster 2020), Kobe, Japan, IEEE, September 2020. DOI: https://doi.org/10.1109/CLUSTER49012.2020.00032
A PDF version is available.

Evaluating the Performance of NVIDIA’s A100 Ampere GPU for Sparse and Batched Computations, Anzt, H., Y. M. Tsai, A. Abdelfattah, T. Cojean, and J. Dongarra, 2020 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS): IEEE, November 2020.

A PDF version is available.

High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs, Beams, N., A. Abdelfattah, S. Tomov, J. Dongarra, T. Kolev, and Y. Dudouit, 2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA): IEEE, November 2020.

A PDF version is available.

Replacing Pivoting in Distributed Gaussian Elimination with Randomized Techniques, Lindquist, N., P. Luszczek, and J. Dongarra, 2020 IEEE/ACM 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA): IEEE, November 2020.
A PDF version is available.

Using Advanced Vector Extensions AVX-512 for MPI Reduction, Zhong, D., Q. Cao, G. Bosilca, and J. Dongarra, EuroMPI/USA '20: 27th European MPI Users' Group Meeting, Austin, TX, September 2020. DOI: https://doi.org/10.1145/3416315.3416316

A PDF version is available.

Mixed-Precision Iterative Refinement using Tensor Cores on GPUs to Accelerate Solution of Linear Systems, Haidar, A., H. Bayraktar, S. Tomov, J. Dongarra, and N. J. Higham,  Proceedings of the Royal Society A, vol. 476, issue 2243, November 2020. DOI: https://doi.org/10.1098/rspa.2020.0110
A PDF version is available.


Matrix Multiplication on Batches of Small Matrices in Half and Half-Complex Precisions, Abdelfattah, A., J. Dongarra, and S. Tomov,  Journal of Parallel and Distributed Computing, vol. 145, pp. 188–201, November 2020. DOI: https://doi.org/10.1016/j.jpdc.2020.07.001
A PDF version is available.
 

Translational Process: Mathematical Software Perspective, Dongarra, J., M. Gates, P. Luszczek, and S. Tomov, Journal of Computational Science, August 2020. DOI: https://doi.org/10.1016/j.jocs.2020.101216
A PDF version is available.


Scalable Data Generation for Evaluating Mixed-Precision Solvers, Luszczek, P., Y. Tsai, N. Lindquist, H. Anzt, and J. Dongarra, 2020 IEEE High Performance Extreme Computing Conference (HPEC): IEEE, September 2020.
A PDF version is available.


Extreme-Scale Task-Based Cholesky Factorization Toward Climate and Weather Prediction Applications, Cao, Q., Y. Pei, K. Akbudak, A. Mikhalev, G. Bosilca, H. Ltaief, D. Keyes, and J. Dongarra, The Platform for Advanced Scientific Computing (PASC) Conference (PASC20), pp. 2:1-2:11. DOI: https://doi.org/10.1145/3394277.3401846

A PDF version is available.

Numerical Algorithms for High-Performance Computational Science, Dongarra, J., L. Grigori, and N. J. Higham, Philosophical Transactions of the Royal Society A, vol. 378, issue 2166, 2020. DOI: 10.1098/rsta.2019.0066
A PDF version is available.

FFT-ECP API and High-Performance Library Prototype for 2-D and 3-D FFTs on Large-Scale Heterogeneous Systems with GPUs, Tomov, S., A. Ayala, A. Haidar, and J. Dongarra, no. FFT-ECP STML13-27, Innovative Computing Laboratory, University of Tennessee, January 2020.
A PDF version is available.

Formulation of Requirements for new PAPI++ Software Package: Part I: Survey Results, Jagode, H., A. Danalis, and J. Dongarra, PAPI++ Working Notes, no. 1, ICL-UT-20-02, Innovative Computing Laboratory, University of Tennessee Knoxville, January 2020.
A PDF version is available.

Project-Based Research and Training in High Performance Data Sciences, Data Analytics, and Machine Learning, Wong, K., S. Tomov, and J. Dongarra, The Journal of Computational Science Education, vol. 11, issue 1, pp. 36-44, January 2020. DOI: 10.22369/issn.2153-4136/11/1/7
A PDF version is available.

Performance Tuning SLATE, Gates, M., A. Charara, A. YarKhan, D. Sukkari, M. Al Farhan, and J. Dongarra, SLATE Working Notes, no. 14, ICL-UT-20-01, Innovative Computing Laboratory, University of Tennessee, January 2020.
A PDF version is available.

Load-balancing Sparse Matrix Vector Product Kernels on GPUs, Anzt, H., Y-C. Chen, T. Cojean, J. Dongarra, G. Flegar, R. Nayak, E. S. Quintana-Orti, Y. Tsai, and W. Wang, ACM Transactions on Parallel Computing, issue 2, March 2020. DOI: 10.1145/3380930
A PDF version is available.

Asynchronous SGD for DNN Training on Shared-Memory Parallel Architectures, Lopez, F., E. Chow, S. Tomov, and J. Dongarra, Innovative Computing Laboratory Technical Report, no. ICL-UT-20-04, University of Tennessee, Knoxville, March 2020.
A PDF version is available.

Reducing the Amount of out-of-core Data Access for GPU-Accelerated Randomized SVD, Lu, Y., I. Yamazaki, F. Ino, Y. Matsushita, S. Tomov, and J. Dongarra, Concurrency and Computation: Practice and Experience, April 2020. DOI: 10.1002/cpe.5754
A PDF version is available.

Using Arm Scalable Vector Extension to optimize Open MPI, Zhong, D., P. Shamis, Q. Cao, G. Bosilca, and J. Dongarra, 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID 2020), Melbourne, Australia, IEEE/ACM, May 2020.

Asynchronous SGD for DNN training on Shared-memory Parallel Architectures, Lopez, F., E. Chow, S. Tomov, and J. Dongarra, Workshop on Scalable Deep Learning over Parallel And Distributed Infrastructures (ScaDL 2020), May 2020.
A PDF version is available.

Mixed-Precision Solution of Linear Systems Using Accelerator-Based Computing, Haidar, A., H. Bayraktar, S. Tomov, J. Dongarra, and N. J. Higham, Innovative Computing Laboratory Technical Report, no. ICL-UT-20-05, University of Tennessee, May 2020.
A PDF version is available.

Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime, Pei, Y., Q. Cao, G. Bosilca, P. Luszczek, V. Eijkhout, and J. Dongarra, 21st IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2020), New Orleans, LA, IEEE, May 2020.
A PDF version is available.

Twenty Years of Computational Science, Krzhizhanovskaya, V., G. Závodszky, M. Lees, J. Dongarra, P. Sloot, S. Brissos, and J. Teixeira, International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020.

heFFTe: Highly Efficient FFT for Exascale, Ayala, A., S. Tomov, A. Haidar, and J. Dongarra, International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, June 2020.

Investigating the Benefit of FP16-enabled Mixed-precision Solvers for Symmetric Positive Definite Matrices using GPUs, Abdelfattah, A., S. Tomov, and J. Dongarra, International Conference on Computational Science (ICCS 2020), Amsterdam, Netherlands, Elsevier, June 2020.

Report on the Fujitsu Fugaku System, Dongarra, J., Innovative Computing Laboratory Technical Report, no. ICL-UT-20-06, University of Tennessee, June 2020.
A PDF version is available.

Improving the Performance of the GMRES method using Mixed-Precision Techniques, Lindquist, N., P. Luszczek, and J. Dongarra, Smoky Mountains Computational Sciences & Engineering Conference (SMC2020), August 2020.

SLATE Users' Guide, Gates, M., A. Charara, J. Kurzak, A. YarKhan, M. Al Farhan, D. Sukkari, and J. Dongarra, SLATE Working Notes, no. 10, ICL-UT-19-01: Innovative Computing Laboratory, University of Tennessee, July 2020.