2017

Preconditioned Krylov solvers on GPUs, Hartwig Anzt, Mark Gates, Jack Dongarra, Moritz Kreutzer, Gerhard Wellein, Martin Köhler, Parallel Computing, DOI:10.1016/j.parco.2017.05.006, June 2017. A pdf version is available.
Evaluation of Directive-based Performance Portable Programming Models, M. Graham Lopez, Wayne Joubert, Veronica Vergara Larrea, Oscar Hernandez, Azzam Haidar, Stanimire Tomov, Jack Dongarra, International Journal of High Performance Computing and Networking, accepted May 2017. A pdf version is available.
A Framework for Out of Memory SVD Algorithms, K. Kabir, A. Haidar, S. Tomov, A. Bouteiller, J. Dongarra, in Kunkel J., Yokota R., Balaji P., Keyes D. (eds) High Performance Computing, ISC 2017. Lecture Notes in Computer Science, vol 10266. Springer, Frankfurt, Germany, June 19-21, 2017, DOI:10.1007/978-3-319-58667-0_9. A pdf version is available.
Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioner Generation on GPUs, Hartwig Anzt, Jack Dongarra, Goran Flegar and Enrique S. Quintana-Orti, PMAM'17: Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, Pages 1-10, Austin, TX, USA, February 04-08, 2017, ISBN: 978-1-4503-4883-6, DOI:10.1145/3026937.3026940. A pdf version is available.
High-Performance Cholesky Factorization for GPU-Only Execution, Azzam Haidar, Ahmad Abdelfattah, Stanimire Tomov and Jack Dongarra, GPGPU-10: Proceedings of the General Purpose GPUs, Pages 42-52, Austin, TX, USA, February 04-08, 2017, DOI:10.1145/3038228.3038237. A pdf version is available.
Updating Incomplete Factorization Preconditioners for Model Order Reduction, Hartwig Anzt, Edmond Chow, Jens Saak, and Jack Dongarra, Numerical Algorithms, November 2016, Volume 73, Issue 3, pp 611–630, DOI:10.1007/s11075-016-0110-2. A pdf version is available.
Accelerating NWChem Coupled Cluster through dataflow-based Execution, A. Danalis, H. Jagode, and J. Dongarra, The International Journal of High Performance Computing Applications, 2017, DOI:10.1177/1094342016672543. A pdf version is available.
On the Performance and Energy Efficiency of Sparse Linear Algebra on GPU, Hartwig Anzt, Stanimire Tomov, and Jack Dongarra, International Journal of High Performance Computing, 2017, DOI:10.1177/1094342016672081. A pdf version is available.
Solving Dense Symmetric Indefinite Systems using GPUs, M. Baboulin, J. Dongarra, A. Remy, S. Tomov, I. Yamazaki, Concurrency and Computation: Practice and Experience, 2017, DOI:10.1002/cpe.4055. A pdf version is available.
Fine-grained Bit-Flip Protection for Relaxation Methods, H. Anzt, J. Dongarra, and E. Quintana-Orti, Journal of Computational Science, 2017, DOI:10.1016/j.jocs.2016.11.013. A pdf version is available.
Fast Cholesky Factorization on GPUs for Batch and Native Modes in MAGMA, Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, and Jack Dongarra, Journal of Computational Science, Volume 20, May 2017, Pages 85–93, DOI:10.1016/j.jocs.2016.12.009. A pdf version is available.
With Extreme Computing, the Rules Have Changed, Jack Dongarra, Stanimire Tomov, Piotr Luszczek, Jakub Kurzak, Mark Gates, Ichitaro Yamazaki, Hartwig Anzt, Azzam Haidar, and Ahmad Abdelfattah, IEEE CISE, April 2017, DOI:10.1109/MCSE.2017.48. A pdf version is available.
Structure-aware Linear Solver for Real-time Convex Optimization for Embedded Systems, I. Yamazaki, S. Tomov, J. Dongarra, IEEE Embedded Systems Letters, May 2017, DOI: 10.1109/LES.2017.2700401. A pdf version is available.
Design and Implementation of the PULSAR Programming System for Large Scale Computing, J. Kurzak, P. Luszczek, I. Yamazaki, Y. Robert, J. Dongarra, Supercomputing Frontiers and Innovations, 2017, DOI:10.14529/jsfi170101. A pdf version is available.
Bringing High Performance Computing to Big Data Algorithms, H. Anzt, J. Dongarra, M. Gates, J. Kurzak, P. Luszczek, S. Tomov, I. Yamazaki, in Handbook of Big Data Technologies, Editors: Albert Y. Zomaya, Sherif Sakr, ISBN: 978-3-319-49339-8 (Print) 978-3-319-49340-4 (Online), DOI:10.1007/978-3-319-49340-4, Springer, 2017. A pdf version is available.
Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices, Tingxing Dong, Azzam Haidar, Stanimire Tomov and Jack Dongarra, ICCS’17, ETH Zurich, Procedia Computer Science, Volume 108, 2017, Pages 1008–1018, DOI:10.1016/j.procs.2017.05.237. A pdf version is available.
Factorization and Inversion of a Million Matrices using GPUs: Challenges and Countermeasures, Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, and Jack Dongarra, ICCS’17, ETH Zurich, Procedia Computer Science, Volume 108, 2017, Pages 606-615, DOI:10.1016/j.procs.2017.05.250. A pdf version is available.
The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems, Jack Dongarra, Sven Hammarling, Nick Higham, Samuel Relton, Pedro Valero-Lara and Mawussi Zounon, ICCS’17, ETH Zurich, Procedia Computer Science, Volume 108, 2017, Pages 495-504, DOI:10.1016/j.procs.2017.05.138. A pdf version is available.
Novel HPC Techniques to Batch Execution of Many Variable Size BLAS Computations on GPUs, Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov and Jack Dongarra, ICS 2017 Chicago, June 14 2017, DOI:10.1145/3079079.3079103. A pdf version is available.
Bidiagonalization and R-Bidiagonalization: Parallel Tiled Algorithms, Critical Paths and Distributed-Memory Implementation, Mathieu Faverge, Julien Langou, Yves Robert, Jack J. Dongarra, 2017 IPDPS Conference, DOI:10.1109/IPDPS.2017.46. A pdf version is available.
Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning, Hartwig Anzt, Jack Dongarra, Goran Flegar, Enrique S. Quintana-Ortí, Andrés E. Tomás, Procedia Computer Science, Volume 108, pp 1783-1792, 2017, International Conference on Computational Science, ICCS 2017, 12-14 June 2017, Zurich, Switzerland, ISSN 1877-0509, DOI:10.1016/j.procs.2017.05.186. A pdf version is available.
Batched Gauss-Jordan Elimination for Block-Jacobi Preconditioner Generation on GPUs, Hartwig Anzt, Jack Dongarra, Goran Flegar and Enrique S. Quintana-Orti, accepted PMAM 2017, December 2016. A pdf version is available.

2016

Report on the Sunway TaihuLight System, Jack Dongarra, University of Tennessee, Department of Electrical Engineering and Computer Science Tech Report UT-EECS-16-742, June 2016. A pdf version is available.
Sunway TaihuLight Supercomputer Makes Its Appearance, Jack Dongarra, The National Science Review 2016 3: 265-266, September 2016, DOI: 10.1093/nsr/nww044. A pdf version is available.
On the Performance and Energy Efficiency of Sparse Linear Algebra on GPUs, H. Anzt, S. Tomov, and J. Dongarra, The International Journal of High Performance Computing Applications, DOI: 10.1177/1094342016672081. A pdf version is available.
Performance Tuning and Optimization Techniques of Fixed and Variable Size Batched Cholesky Factorization on GPUs, A. Abdelfattah, A. Haidar, S. Tomov, and J. Dongarra, International Conference on Computational Science (ICCS'16), San Diego, CA, June 2016. A pdf version is available.
High-Performance Tensor Contractions for GPUs, A. Abdelfattah, M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, International Conference on Computational Science (ICCS'16), San Diego, CA, June 2016. A pdf version is available.
Efficiency of General Krylov Methods on GPUs – An Experimental Study, Hartwig Anzt, Jack Dongarra, Moritz Kreutzer, Gerhard Wellein, Martin Köhler, AsHES Workshop, IPDPS, 2016. A pdf version is available.
On the Development of Variable Size Batched Computation for Heterogeneous Parallel Architectures, Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack Dongarra, The 17th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2016), IPDPS 2016, Chicago, IL, IEEE, May 2016. A pdf version is available.
GPU-Aware Non-contiguous Data Movement In Open MPI, W. Wu, G. Bosilca, R. vandeVaart, S. Jeaugey, and J. Dongarra, The 25th International Symposium on High Performance Distributed Computing (HPDC2016). A pdf version is available.
Creating a Standardised Set of Batched BLAS Routines, Jack Dongarra, Sven Hammarling, Nicholas J. Higham, Samuel D. Relton, Pedro Valero-Lara and Mawussi Zounon, in the Proceedings of the Fourth Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE4, 2016), Gabrielle Allen, Jeffrey Carver et al, volume 1686, CEUR Workshop Proceedings, http://ceur-ws.org/Vol-1686/WSSSPE4_paper_3.pdf. A pdf version is available.
Hessenberg Reduction with Transient Error Resilience on GPU-Based Hybrid Architectures, Y. Jia, P. Luszczek, and J. Dongarra, The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES) 2016, May 2016, Chicago. DOI: 10.1109/IPDPSW.2016.34. A pdf version is available.
Non-GPU-resident Dense Symmetric Indefinite Factorization, I. Yamazaki, S. Tomov, and J. Dongarra, Concurrency and Computation: Practice and Experience, DOI: 10.1002/cpe.4012, November 2016. A pdf version is available.
A New Metric for Ranking High Performance Computing Systems, Jack Dongarra, Michael A. Heroux, and Piotr Luszczek, National Science Review, Volume 3, Issue 1, March 2016, pp 30-35, DOI: 10.1093/nsr/nwv084. A pdf version is available.
Assessing the Cost of Redistribution followed by a Computational Kernel: Complexity and Performance Results, Julien Herrmann, George Bosilca, Thomas Hérault, Loris Marchal, Yves Robert, and Jack Dongarra, Parallel Computing, Volume 52, February 2016, pp. 22–41, DOI: 10.1016/j.parco.2015.09.005. A pdf version is available.
Optimization and Performance Evaluation of the IDR Iterative Krylov Solver on GPUs, Hartwig Anzt, Moritz Kreutzer, Eduardo Ponce, Gregory D. Peterson, Gerhard Wellein, Jack Dongarra, The International Journal of High Performance Computing Applications, 1–11, 2016, DOI: 10.1177/1094342016646844. A pdf version is available.
Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs, Anzt, H., B. Haugen, J. Kurzak, P. Luszczek, and J. Dongarra, Concurrency and Computation: Practice and Experience, vol. 27, issue 17, pp. 5096-5113, DOI: 10.1002/cpe.3516. A pdf version is available.
High Performance Conjugate Gradient Benchmark: A new Metric for Ranking High Performance Computing Systems, J. Dongarra, M. Heroux, P. Luszczek, The International Journal of High Performance Computing Applications, Volume 30 Issue 1, Spring 2016. DOI: 10.1177/1094342015593158. A pdf version is available.
Assessing the Cost of Redistribution followed by a Computational Kernel: Complexity and Performance Results, Herrmann, J., G. Bosilca, T. Herault, L. Marchal, Y. Robert, and J. Dongarra, Parallel Computing, vol. 52, pp. 22-41, February 2016. DOI: 10.1016/j.parco.2015.09.005. A pdf version is available.
Updating Incomplete Factorization Preconditioners for Model Order Reduction, Hartwig Anzt, Edmond Chow, Jens Saak, and Jack Dongarra, accepted in Numerical Algorithms, January 2016. A pdf version is available.
Stability and Performance of Various Singular Value QR Implementations and Case-studies with Adaptive Mixed Precision on Multicore CPU with GPUs, Ichitaro Yamazaki, Stanimire Tomov, and Jack Dongarra, accepted TOMS, February 2016. A pdf version is available.
Performance Optimization of Sparse Matrix-Vector Multiplication for Multi-component PDE-based Applications using GPUs, Ahmad Abdelfattah, Hatem Ltaief, David Keyes, and Jack Dongarra, accepted Concurrency and Computation: Practice and Experience, April 2016. A pdf version is available.
Porting the PLASMA Numerical Library to the OpenMP Standard, Asim YarKhan, Jakub Kurzak, Piotr Luszczek, and Jack Dongarra, accepted in International Journal of Parallel Programming, May 2016. A pdf version is available.
Domain Overlap for Iterative Sparse Triangular Solves on GPUs, Hartwig Anzt, Edmond Chow, Daniel Szyld, and Jack Dongarra, Software for Exascale Computing, Leibniz Supercomputing Centre, Munich, Germany, Volume 113 of the series Lecture Notes in Computational Science and Engineering pp 527-545, Jan 25–27, 2016. DOI: 10.1007/978-3-319-40528-5_24. A pdf version is available.
Performance, Design, and Autotuning of Batched GEMM for GPUs, Ahmad Abdelfattah, Azzam Haidar, Stanimire Tomov, Jack Dongarra, High Performance Computing, Volume 9697 of the series Lecture Notes in Computer Science pp 21-38, 2016, DOI: 10.1007/978-3-319-41321-1_2. A pdf version is available.
Accelerating the Conjugate Gradient Algorithm with GPU in CFD Simulations, Hartwig Anzt, Marc Baboulin, Jack Dongarra, Yvan Fournier, Frank Hulsemann, Amal Khabou and Yushan Wang, VECPAR 2016. A pdf version is available.
Task-Based Cholesky Decomposition on Knights Corner using OpenMP, Joseph Dorris, Jakub Kurzak, Piotr Luszczek, Asim YarKhan, Jack Dongarra, awarded the Best Paper Award at the P^3MA workshop co-located with ISC, High Performance Computing, Volume 9945 of the series Lecture Notes in Computer Science pp 544-562, DOI: 10.1007/978-3-319-46079-6_37. A pdf version is available.
LU, QR, and Cholesky Factorizations: Programming Model, Performance Analysis and Optimization Techniques for the Intel Knights Landing Xeon Phi, Azzam Haidar, Stanimire Tomov, Konstantin Arturov, Murat Guney, Shane Story, Jack Dongarra, 2016 IEEE High Performance Extreme Computing Conference (HPEC '16), Twentieth Annual HPEC Conference, 13-15 September 2016, Waltham, MA USA. A pdf version is available.
Performance Analysis and Acceleration of Explicit Integration for Large Kinetic Networks using Batched GPU Computations, A. Haidar, B. Brock, S. Tomov, M. Guidry, J. Billings, D. Shyles, J. Dongarra, 2016 IEEE High Performance Extreme Computing Conference (HPEC '16), September 13-15, 2016. A pdf version is available.
Failure Detection and Propagation in HPC systems, George Bosilca, Aurelien Bouteiller, Amina Guermouche, Thomas Herault, Yves Robert, Pierre Sens, Jack Dongarra, nominated for Best Paper, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Salt Lake City, Utah, IEEE Press, pp. 27:1-27:11, November 2016. A pdf version is available.
Performance-Portable Autotuning of OpenCL Kernels for Convolutional Layers of Deep Neural Networks, Yaohung Tsai, Piotr Luszczek, Jakub Kurzak and Jack Dongarra, in the Machine Learning and HPC Environments Workshop associated with SC16, November 2016. A pdf version is available.
Batched Generation of Incomplete Sparse Approximate Inverses on GPUs, H. Anzt, E. Chow, T. Huckle, J. Dongarra, Proceedings of the 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, pp. 49–56, November 2016. A pdf version is available.
Towards Achieving Performance Portability Using Directives for Accelerators, M. Lopez, V. Larrea, W. Joubert, O. Hernandez, A. Haidar, S. Tomov, and J. Dongarra, The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16), Third Workshop on Accelerator Programming Using Directives (WACCPD), Salt Lake City, Utah, Innovative Computing Laboratory, University of Tennessee, November 2016. A pdf version is available.
Performance Analysis and Acceleration of Explicit Integration for Large Kinetic Networks using Batched GPU Computations, A. Haidar, B. Brock, S. Tomov, M. Guidry, J. Billings, D. Shyles, and J. Dongarra, 2016 IEEE High Performance Extreme Computing Conference (HPEC '16), Waltham, MA, IEEE, September 2016. A pdf version is available.
Power Management and Event Verification in PAPI, H. Jagode, A. YarKhan, A. Danalis, and J. Dongarra, Tools for High Performance Computing 2015: Proceedings of the 9th International Workshop on Parallel Tools for High Performance Computing, September 2015, Dresden, Germany, Springer International Publishing, pp. 41-51, 2016. A pdf version is available.
Search Space Generation and Pruning System for Autotuners, Piotr Luszczek, Mark Gates, Jakub Kurzak, Anthony Danalis, and Jack Dongarra, the 30th IEEE International Parallel & Distributed Processing Symposium, Chicago, IL, IEEE, May 2016. A pdf version is available.
High-performance Matrix-Matrix Multiplications of Very Small Matrices, I. Masliah, A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, and J. Dongarra, 22nd International European Conference on Parallel and Distributed Computing (Euro-Par'16), Grenoble, France, Springer International Publishing, August 2016. A pdf version is available.
Heterogeneous Streaming, C. Newburn, et al., The Sixth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), IPDPS 2016, Chicago, IL, IEEE, May 2016. A pdf version is available.
CUDA-aware non-contiguous data movement in Open MPI, Wei Wu, George Bosilca, Rolf vandeVaart, and Jack Dongarra, 25th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'16), Kyoto, Japan, ACM, June 2016. A pdf version is available.

2015

Exascale Computing and Big Data: The Next Frontier, Daniel A. Reed and Jack Dongarra, Communications of the ACM, Vol. 58 No. 7, Pages 56-68, DOI: 10.1145/2699414. A pdf version is available.
Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures, M. Baboulin, J. Dongarra, A. Rémy, S. Tomov, I. Yamazaki, the Proceedings of the 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015), Volume 9573 of the series Lecture Notes in Computer Science pp 86-95, DOI: 10.1007/978-3-319-32149-3_9. A pdf version is available.
Accelerating Collaborative Filtering Using Concepts from High Performance Computing, Mark Gates, Hartwig Anzt, Jakub Kurzak, and Jack Dongarra, 2015 IEEE International Conference on Big Data (IEEE BigData, November 2015). DOI: 10.1109/BigData.2015.7363811. A pdf version is available.
Strengthening compute and data intensive capacities of Armenia, H. Astsatryan, V. Sahakyan, Y. Shoukourian, P.H. Cros, M. Dayde, J. Dongarra, P. Oster, in RoEduNet International Conference - Networking in Education and Research (RoEduNet NER), 2015 14th, pp. 28-33, 24-26 Sept. 2015, DOI: 10.1109/RoEduNet.2015.7311823. A pdf version is available.
Parallel Programming Models for Dense Linear Algebra on Heterogeneous Systems, M. Abalenkovs, A. Abdelfattah, J. Dongarra, M. Gates, A. Haidar, J. Kurzak, P. Luszczek, S. Tomov, I. Yamazaki, A. YarKhan, Supercomputing Frontiers and Innovations, Volume 2, Number 4, pages 67-86, 2015, DOI: 10.14529/jsfi1504. A pdf version is available.
The TOP500 List of Supercomputers and Progress in High Performance Computing, Erich Strohmaier, Hans W. Meuer, Jack Dongarra, Horst D. Simon, IEEE Computer, No. 11 - Nov. (2015 vol. 48), pp. 42–49, http://doi.ieeecomputersociety.org/10.1109/MC.2015.338. A pdf version is available.
Implementation and Tuning of Batched Cholesky Factorization and Solve for NVIDIA GPUs, Jakub Kurzak, Hartwig Anzt, Mark Gates, and Jack Dongarra, IEEE Transactions on Parallel and Distributed Systems, no. 1045–9219, November 2015. A pdf version is available.
Mixing LU-QR Factorization Algorithms to Design High-Performance Dense Linear Algebra Solvers, Mathieu Faverge, Julien Herrmann, Julien Langou, Bradley Lowery, Yves Robert, and Jack Dongarra, Journal on Parallel and Distributed Computing, Volume 85, November 2015, pp. 32–46, http://dx.doi.org/10.1016/j.jpdc.201. A pdf version is available.
A Scalable Approach to Solving Dense Linear Algebra Problems on Hybrid CPU-GPU Systems, Fengguang Song and Jack Dongarra, Concurrency and Computation: Practice and Experience, Volume 27, Issue 14, 25 September 2015, pp. 3702–3723, DOI: 10.1002/cpe.3403. A pdf version is available.
A Survey of Recent Developments in Parallel Implementations of Gaussian Elimination, Simplice Donfack, Jack Dongarra, Mathieu Faverge, Mark Gates, Jakub Kurzak, Piotr Luszczek, and Ichitaro Yamazaki, Concurrency and Computation: Practice and Experience, Volume 27, Issue 5, pp. 1292–1309, 10 April 2015, http://dx.doi.org/10.1002/cpe.3306. A pdf version is available.
Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs, Hartwig Anzt, Blake Haugen, Jakub Kurzak, Piotr Luszczek, and Jack Dongarra, Concurrency and Computation: Practice and Experience, Volume 27, Issue 17, December 2015, pp. 5096–5113, http://dx.doi.org/10.1002/cpe.3516. A pdf version is available.
Mixed-Precision Cholesky QR Factorization and its Case Studies on Multicore CPUs with Multiple GPUs, Ichitaro Yamazaki, Stanimire Tomov, and Jack Dongarra, SIAM J. Sci. Comput. 37(3) (2015), pp. C307-C330, http://dx.doi.org/10.1137/14M0973773. A pdf version is available.
A New Metric for Ranking High Performance Computing Systems, Jack Dongarra, Michael A. Heroux, and Piotr Luszczek, National Science Review, January 2016, DOI: 10.1093/nsr/nwv084. A pdf version is available.
Computing Low-rank Approximation of a Dense Matrix on Multicore CPUs with a GPU and its Application to Solving a Hierarchically Semiseparable Linear System of Equations, Ichitaro Yamazaki, Stanimire Tomov and Jack Dongarra, Scientific Programming, vol. 2015, Article ID 246019, 17 pages, 2015, http://dx.doi.org/10.1155/2015/246019. A pdf version is available.
Batched Matrix Computations on Hardware Accelerators Based on GPUs, Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, and Jack Dongarra, The International Journal of High Performance Computing Applications, May 2015 29: 193-208, first published on February 9, 2015, http://dx.doi.org/10.1177/1094342014567546. A pdf version is available.
PaRSEC in Practice: Optimizing a Legacy Chemistry Application through Distributed Task-Based Execution, Anthony Danalis, Heike Jagode, George Bosilca and Jack Dongarra, to appear IEEE Cluster 2015, Chicago, Illinois, USA, Sept. 8-11, 2015. A pdf version is available.
Random Sampling to Update Partial Singular Value Decomposition on a Hybrid CPU/GPU Cluster, Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack Dongarra, to appear SC15, November 2015. A pdf version is available.
Performance of Random Sampling for Computing Low-rank Approximations of a Dense Matrix on GPUs, Théo Mary, Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Jack Dongarra, to appear SC15, November 2015. A pdf version is available.
Practical Scalable Consensus for Pseudo-Synchronous Distributed Systems, Thomas Herault, Aurelien Bouteiller, George Bosilca, Marc Gamell, Keita Teranishi, Manish Parashar, Jack Dongarra, to appear SC15, November 2015. A pdf version is available.
Efficient Implementation Of Quantum Materials Simulations On Distributed CPU-GPU Systems, Raffaele Solcà, Anton Kozhevnikov, Azzam Haidar, Stanimire Tomov, Thomas C. Schulthess, Jack Dongarra, to appear SC15, finalist for the Best Paper Award, November 2015. A pdf version is available.
Dense Symmetric Indefinite Factorization on GPU Accelerated Architecture, Marc Baboulin, Jack Dongarra, Adrien Remy, Stanimire Tomov, and Ichitaro Yamazaki, to appear PPAM 2015, Krakow, Poland, 2015. A pdf version is available.
Plan B: Interruption of Ongoing MPI Operations to Support Failure Recovery, Aurelien Bouteiller, George Bosilca and Jack Dongarra, to appear EUROMPI Conference, September 2015. A pdf version is available.
Flexible Linear Algebra Development and Scheduling with Cholesky Factorization, Azzam Haidar, Asim YarKhan, Chongxiao Cao, Piotr Luszczek, Stanimire Tomov, Jack Dongarra, 17th IEEE International Conference on High Performance Computing and Communications, New York, New York, August 2015. A pdf version is available.
Iterative Sparse Triangular Solves for Preconditioning, Hartwig Anzt, Edmond Chow and Jack Dongarra, to appear in Euro-Par 2015, Vienna, Austria, August 2015. A pdf version is available.
Design for a Soft Error Resilient Dynamic Task-based Runtime, Chongxiao Cao, George Bosilca, Thomas Herault, and Jack Dongarra, 29th IEEE International Parallel & Distributed Processing Symposium, Hyderabad, India, May 2015. A pdf version is available.
Hierarchical DAG Scheduling for Hybrid Distributed Systems, Wei Wu, George Bosilca, Aurelien Bouteiller, Mathieu Faverge, and Jack Dongarra, 29th IEEE International Parallel & Distributed Processing Symposium, Hyderabad, India, May 2015. A pdf version is available.
Performance Analysis and Optimisation of Two-Sided Factorization Algorithms for Heterogeneous Platform, International Conference on Computational Science 2015, ICCS 2015, Computational Science at the Gates of Nature, edited by Slawomir Koziel, Leifur Leifsson, Michael Lees, Valeria V. Krzhizhanovskaya, Jack Dongarra and Peter M.A. Sloot. DOI:10.1016/j.procs.2015.05.222. A pdf version is available.
Accelerating the LOBPCG method on GPUs using a blocked sparse matrix vector product, H. Anzt, S. Tomov, and J. Dongarra, in Spring Simulation Multi-Conference 2015 (SpringSim15), 2015. A pdf version is available.
Performance Analysis and Design of a Hessenberg Reduction using Stabilized Blocked Elementary Transformations for New Architecture, Khairul Kabir, Azzam Haidar, Stanimire Tomov, Jack Dongarra. Best Paper Award at 2015 Spring Simulation Multiconference, 23rd High Performance Computing Symposium (HPC 2015). A pdf version is available.
Energy Efficiency and Performance Frontiers for Sparse Computations on GPU Supercomputers, Hartwig Anzt, Stan Tomov, and Jack Dongarra, PMAM '15 Proceedings of the Sixth International Workshop on Programming Models and Applications for Multicores and Manycores, ACM New York, NY, USA 2015, DOI:10.1145/2712386.2712387. A pdf version is available.
Towards Batched Linear Solvers on Accelerated Hardware Platforms, Azzam Haidar, Piotr Luszczek, Stanimire Tomov, and Jack Dongarra, in Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, San Francisco, CA, February 7-11, 2015. DOI:10.1145/2688500.2688534. A pdf version is available.
Optimization for Performance and Energy for Batched Matrix Computations on GPUs, Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, and Jack Dongarra, 8th Workshop on General Purpose Processing Using GPUs (GPGPU 8), San Francisco, February 7, 2015. DOI:10.1145/2716282.2716288. A pdf version is available.
Optimizing Krylov Subspace Solvers on Graphics Processing Units, Hartwig Anzt, Stanimire Tomov, Piotr Luszczek, Ichitaro Yamazaki, Jack Dongarra, and William Sawyer, Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International, pp 941-949, DOI: 10.1109/IPDPSW.2014.107. A pdf version is available.
Experiences in Autotuning Matrix Multiplication for Energy Minimization on GPUs, Hartwig Anzt, Blake Haugen, Jakub Kurzak, Piotr Luszczek, and Jack Dongarra, accepted in Concurrency and Computation: Practice and Experience, March 2015. DOI: 10.1002/cpe.3516. A pdf version is available.
Mixing LU-QR Factorization Algorithms to Design High-Performance Dense Linear Algebra Solvers, Mathieu Faverge, Julien Herrmann, Julien Langou, Bradley Lowery, Yves Robert, and Jack Dongarra, accepted in Journal on Parallel and Distributed Computing, March 2015. http://dx.doi.org/10.1016/j.jpdc.201 A pdf version is available.
Mixed-Precision Cholesky QR Factorization and its Case Studies on Multicore CPUs with Multiple GPUs, I. Yamazaki, S. Tomov, and J. Dongarra, SIAM J. Sci. Comput., Volume 37, Issue 3, DOI:10.1137/14M0973773. A pdf version is available.
Updating Incomplete Factorization Preconditioners for Model Order Reduction, Hartwig Anzt, Edmond Chow, Jens Saak, and Jack Dongarra, to appear in Parallel Computing. A pdf version is available.
A Survey of Recent Developments in Parallel Implementations of Gaussian Elimination, Simplice Donfack, Jack Dongarra, Mathieu Faverge, Mark Gates, Jakub Kurzak, Piotr Luszczek, Ichitaro Yamazaki, submitted to Concurrency and Computation: Practice and Experience, Volume 27, Issue 5, pages 1292-1309, April 2015. DOI: 10.1002/cpe.3306. A pdf version is available.
Computing Low-rank Approximation of a Dense Matrix on Multicore CPUs with a GPU and its Application to Solving a Hierarchically Semiseparable Linear System of Equations, Ichitaro Yamazaki, Stanimire Tomov and Jack Dongarra, Scientific Programming, vol. 2015, Article ID 246019, 17 pages, 2015. http://dx.doi.org/10.1155/2015/246019. A pdf version is available.
Acceleration of GPU-based Krylov Solvers via Data Transfer Reduction, Hartwig Anzt, Stanimire Tomov, Piotr Luszczek, William Sawyer and Jack Dongarra, The International Journal of High Performance Computing Applications, accepted April 2015, http://dx.doi.org/10.1177/1094342015580139. A pdf version is available.
Algorithm-based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures and Accuracy, Aurelien Bouteiller, Thomas Herault, George Bosilca, Peng Du, and Jack Dongarra, ACM Transactions on Parallel Computing, Volume 1 Issue 2, January 2015, http://dx.doi.org/10.1145/2686892. A pdf version is available.
HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Xeon Phi, Jack Dongarra, Mark Gates, Azzam Haidar, Yulu Jia, Khairul Kabir, Piotr Luszczek, and Stanimire Tomov, Scientific Programming, Volume 2015 (2015), Article ID 502593, 11 pages, http://dx.doi.org/10.1155/2015/502593. A pdf version is available.
Batched Matrix Computations on Hardware Accelerators Based on GPUs, Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, and Jack Dongarra, The International Journal of High Performance Computing Applications, May 2015 29: 193-208, first published on February 9, 2015, http://dx.doi.org/10.1177/1094342014567546. A pdf version is available.
Composing Resilience Techniques: ABFT, Periodic and Incremental Checkpointing, George Bosilca, Aurelien Bouteiller, Thomas Herault, Yves Robert, and Jack Dongarra, International Journal of Networking and Computing, Volume 5, Number 1, pages 2-25, January 2015. A pdf version is available.
Exascale Computing and Big Data: The Next Frontier, Daniel A. Reed and Jack Dongarra, accepted in Communications of the ACM, Vol. 58 No. 7, Pages 56-68, DOI: 10.1145/2699414. A pdf version is available.

2014

Unified Model for Assessing Checkpointing Protocols at Extreme-Scale, George Bosilca, Aurelien Bouteiller, Elisabeth Brunet, Franck Cappello, Jack Dongarra, Amina Guermouche, Thomas Herault, Yves Robert, Frederic Vivien, and Dounia Zaidouni, Concurrency and Computation: Practice and Experience, Volume 26, Issue 17, pp. 2772–2791, 10 December 2014, DOI: 10.1002/cpe.3173. A pdf version is available.
Performance of Various Computers Using Standard Linear Equations Software, (Linpack Benchmark Report), Jack J. Dongarra, University of Tennessee Computer Science Technical Report, CS-89-85, 2014. A postscript version is available.
Parallel Simulation of Superscalar Scheduling, Blake Haugen, Piotr Luszczek, Jakub Kurzak, Asim YarKhan, and Jack Dongarra, ICPP'14: International Conference on Parallel Processing, Minneapolis, MN, 2014, DOI: 10.1109/ICPP.2014.21. A pdf version is available.
Performance and Portability with OpenCL for Throughput-Oriented HPC Workloads Across Accelerators, Coprocessors, and Multicore Processors, Azzam Haidar, Chongxiao Cao, Ichitaro Yamazaki, Jack Dongarra, Mark Gates, Piotr Luszczek, and Stan Tomov, ScalA 2014, ACM, New Orleans, LA, November 17, 2014, DOI:10.1109/ScalA.2014.8. A pdf version is available.
Access-averse Framework for Computing Low-rank Matrix Approximations, Ichitaro Yamazaki, Theo Mary, Jakub Kurzak, Stanimire Tomov, and Jack Dongarra, First International Workshop on High Performance Big Graph Data Management, Analysis, and Mining (in conjunction with IEEE BigData'14), October 27, 2014, Bethesda, MD, Pages: 70-77, DOI: 10.1109/BigData.2014.7004374. A pdf version is available.
PTG: An Abstraction for Unhindered Parallelism, Anthony Danalis, George Bosilca, Aurelien Bouteiller, Thomas Herault, and Jack Dongarra, WOLFHPC '14 Proceedings of the Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, Pages 21-30, SC14 Workshop, New Orleans, LA, November 17, 2014, DOI:10.1109/WOLFHPC.2014.8. A pdf version is available.
Deflation Strategies to Improve the Convergence of Communication-Avoiding GMRES, Ichitaro Yamazaki, Stanimire Tomov, and Jack Dongarra, ScalA2014, Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA), New Orleans, LA, November 17, 2014. DOI:10.1109/ScalA.2014.6. A pdf version is available.
Power Monitoring with PAPI for Extreme Scale Architectures and Dataflow-based Programming Models, Heike McCraw, James Ralph, Anthony Danalis, and Jack Dongarra, Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications (HPCMASPA 2014), IEEE Cluster 2014, IEEE, Madrid, Spain, September 2014. DOI: 10.1109/CLUSTER.2014.6968672 A pdf version is available. LU Factorization of Small Matrices: Accelerating Batched DGETRF on the GPU, Tingxing Dong, Azzam Haidar, Piotr Luszczek, James Austin Harris, Stanimire Tomov, and Jack Dongarra, High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS), Paris, France, 2014, DOI:10.1109/HPCC.2014.30 A pdf version is available. A Step towards Energy Efficient Computing: Redesigning A Hydrodynamic Application on CPU-GPU, Tingxing Dong, Veselin Dobrev, Tzanio Kolev, Robert Rieben, Stanimire Tomov, and Jack Dongarra, 28th IEEE International Parallel & Distributed Processing Symposium, 2014, DOI: 10.1109/IPDPS.2014.103 A pdf version is available. clMAGMA: High Performance Dense Linear Algebra with OpenCL, Chongxiao Cao, Jack Dongarra, Peng Du, Mark Gates, Piotr Luszczek, Stanimire Tomov, IWOCL '14, May 12-13, 2014, Bristol, United Kingdom. A pdf version is available. A Scalable Approach to Solving Dense Linear Algebra Problems on Hybrid CPU-GPU Systems, Fengguang Song and Jack Dongarra, accepted in Concurrency and Computation: Practice and Experience, August 2014. DOI: 10.1002/cpe.3403 A pdf version is available. DOE: Assessment of Workforce Development Needs in Office of Science Research Disciplines, DOE ASCAC Subcommittee Report, B. Chapman, et al., July 2014. A pdf version is available. Top Ten Exascale Research Challenges, DOE ASCAC Subcommittee Report, 2014, R. Lucas, et al. A pdf version is available.
Applied Mathematics Research for Exascale Computing, Jack Dongarra (co-chair, Oak Ridge National Laboratory) and Jeffrey Hittinger (co-chair, Lawrence Livermore National Laboratory), et al., DOE Report for the Office of Science, Advanced Scientific Computing Research, 2014. A pdf version is available. Unified Model for Assessing Checkpointing Protocols at Extreme-Scale, George Bosilca, Aurelien Bouteiller, Elisabeth Brunet, Franck Cappello, Jack Dongarra, Amina Guermouche, Thomas Herault, Yves Robert, Frederic Vivien, and Dounia Zaidouni, accepted in Concurrency and Computation: Practice and Experience, Volume 26, Issue 17, pages 2772-2791, 10 December 2014, DOI: 10.1002/cpe.3173. A pdf version is available. Accelerating Numerical Dense Linear Algebra Calculations with GPUs, Jack Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, and Ichitaro Yamazaki, pp. 3-28, in Numerical Computations with GPUs, edited by Volodymyr Kindratenko, Springer, 2014, DOI:10.1007/978-3-319-06548-9_1. A pdf version is available. Looking Back at Dense Linear Algebra Software, Piotr Luszczek, Jakub Kurzak, and Jack Dongarra, Journal of Parallel and Distributed Computing, pp 2548-2560, 2014. http://dx.doi.org/10.1016/j.jpdc.2013.10.005 A pdf version is available. A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks, Azzam Haidar, Stanimire Tomov, Jack Dongarra, Raffaele Solcà, Thomas Schulthess, International Journal of High Performance Computing Applications, volume 28, number 2, pp 196-209, 2014. DOI: 10.1177/1094342013502097 A pdf version is available. Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization, J. Dongarra, M. Faverge, P. Luszczek, Concurrency and Computation: Practice and Experience, Volume 26, Issue 7, pp 1408-1431, DOI: 10.1002/cpe.3110, 2014. A pdf version is available.
Model-Driven One-Sided Factorizations on Multicore, Accelerated Systems, Jack Dongarra, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Asim YarKhan, Supercomputing Frontiers and Innovations, volume 1, number 1, 2014. A pdf version is available. Performance and Reliability Tradeoffs for the Double Checkpointing Algorithm, Jack Dongarra, Thomas Herault and Yves Robert, The International Journal of Networking and Computing, Vol 4 No 1, pp. 23-41, 2014. A pdf version is available. An Efficient Distributed Randomized Algorithm For Solving Large Dense Symmetric Indefinite Linear Systems, Marc Baboulin, Dulceneia Becker, George Bosilca, Anthony Danalis, and Jack Dongarra, Parallel Computing, Volume 40 Issue 7, July 2014, pp 213-223. DOI: 10.1016/j.parco.2013.12.003 A pdf version is available. HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi, Jack Dongarra, Mark Gates, Azzam Haidar, Yulu Jia, Khairul Kabir, Piotr Luszczek, and Stanimire Tomov, Volume 2015 (2015), Article ID 502593, Scientific Programming. DOI: 10.1155/2015/502593 A pdf version is available. Exascale Computing and Big Data: The Next Frontier, Daniel A. Reed and Jack Dongarra, DOI: 10.1145/2699414, Communications of the ACM, Vol. 58 No. 7, Pages 56-68, July 2015. A pdf version is available. Communication-Avoiding Symmetric-Indefinite Factorization, G. Ballard, D. Becker, J. Demmel, J. Dongarra, A. Druinsky, I. Peled, O. Schwartz, S. Toledo, and I. Yamazaki, DOI:10.1137/130929060, SIAM J. Matrix Anal. Appl. 35(4): 1364-1460 (2014). A pdf version is available. Algorithm-based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures and Accuracy, Aurelien Bouteiller, Thomas Herault, George Bosilca, Peng Du, and Jack Dongarra, DOI: 10.1145/2686892, ACM Transactions on Parallel Computing, Volume 1 Issue 2, January 2015. A pdf version is available.
Assessing the Cost of Redistribution followed by a Computational Kernel: Complexity and Performance Results, Julien Herrmann, George Bosilca, Thomas Herault, Loris Marchal, Yves Robert, Jack Dongarra, submitted to Parallel Computing, May 2014. A pdf version is available. Optimizing Krylov Subspace Solvers on Graphics Processing Units, Hartwig Anzt, Stanimire Tomov, Piotr Luszczek, Ichitaro Yamazaki, Jack Dongarra, and William Sawyer, submitted to International Journal of High Performance Computing Applications, 2014. A pdf version is available. A Scalable Approach to Solving Dense Linear Algebra Problems on Hybrid CPU-GPU Systems, Fengguang Song and Jack Dongarra, DOI: 10.1002/cpe.3403, Concurrency and Computation: Practice and Experience, October 2014. A pdf version is available. LAPACK, CRC Handbook on Linear Algebra, Second Edition, Zhaojun Bai, James Demmel, Jack Dongarra, Julien Langou, and Jenny Wang, Editor Leslie Hogben, CRC Press, ISBN 9781466507289, 2014. A pdf version is available. Accelerating Numerical Dense Linear Algebra Calculations with GPUs, Jack Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, and Ichitaro Yamazaki, to appear in Numerical Computations with GPUs, edited by Volodymyr Kindratenko, Springer, 2014. A pdf version is available. Computing Least Squares Condition Numbers on Hybrid Multicore/GPU Systems, M. Baboulin, J. Dongarra, and R. Lacroix, Proceedings for the Applied Mathematics, Modeling and Computational Science (AMMCS) conference, Vol. 117 (2015). A pdf version is available. New Multi-Stage Algorithm for Symmetric Eigenvalues and Eigenvectors Achieves Two-Fold Speedup, A. Haidar, P. Luszczek, J. Dongarra, Best Paper Award, Workshop on Parallel and Distributed Scientific and Engineering Computing, Phoenix, AZ, May 2014. A pdf version is available.
Designing LU-QR Hybrid Solvers for Performance and Stability, Mathieu Faverge, Julien Herrmann, Julien Langou, Bradley Lowery, Yves Robert, and Jack Dongarra, 28th IEEE International Parallel & Distributed Processing Symposium, 2014. A pdf version is available. Redesigning A Hydrodynamic Application on CPU-GPU, Tingxing Dong, Veselin Dobrev, Tzanio Kolev, Robert Rieben, Stanimire Tomov, Jack Dongarra, 28th IEEE International Parallel & Distributed Processing Symposium. A pdf version is available. Improving the Performance of CA-GMRES on Multicores with Multiple GPUs, I. Yamazaki, H. Anzt, S. Tomov, M. Hoemmen, and J. Dongarra, 28th IEEE International Parallel & Distributed Processing Symposium. A pdf version is available. Unified Development for Mixed Multi-GPU and Multi-Coprocessor Environments using a Lightweight Runtime Environment, A. Haidar, C. Cao, J. Dongarra, P. Luszczek, S. Tomov, A. YarKhan, K. Kabir, 28th IEEE International Parallel & Distributed Processing Symposium. A pdf version is available. Mixed-Precision Orthogonalization Scheme and Adaptive Step Size for CA-GMRES on GPUs, Best Paper Award, Ichitaro Yamazaki, Stanimire Tomov, Tingxing Dong and Jack Dongarra, VECPAR 2014, June 30 - July 3, 2014, Eugene, Oregon. A pdf version is available. Accelerating computation of eigenvectors in the nonsymmetric eigenvalue problem, Mark Gates, Azzam Haidar and Jack Dongarra, VECPAR 2014, June 30 - July 3, 2014, Eugene, Oregon. A pdf version is available. Self-Adaptive Multiprecision Preconditioners on Multicore and Manycore Architectures, Hartwig Anzt, Dimitar Lukarski, Stan Tomov and Jack Dongarra, VECPAR 2014, June 30 - July 3, 2014, Eugene, Oregon. A pdf version is available. Hybrid Multi-Elimination ILU Preconditioners on GPUs, Dimitar Lukarski, Hartwig Anzt, Stanimire Tomov, and Jack Dongarra, 23rd Heterogeneity in Computing Workshop (HCW 2014), in Proc. of IPDPS 2014, Phoenix, Arizona, May 19-23, 2014. A pdf version is available.
Optimizing Krylov Subspace Solvers on Graphics Processing Units, Hartwig Anzt, Stanimire Tomov, Piotr Luszczek, Ichitaro Yamazaki, Jack Dongarra, and William Sawyer, The Third International Workshop on Accelerators and Hybrid Exascale Systems (AsHES), May 19, 2014, Phoenix, AZ, part of IPDPS Conference. A pdf version is available. MIAMI: A Framework for Application Performance Diagnosis, G. Marin, J. Dongarra, and D. Terpstra, ISPASS-2014, 2014 IEEE International Symposium on Performance Analysis of Systems and Software, March 23-25, 2014, Hyatt Regency Hotel, Monterey, CA. A pdf version is available. Assessing the Impact of ABFT and Checkpoint Composite Strategies, Bosilca, G., Bouteiller, A., Herault, T., Robert, Y., Dongarra, J., IPDPSW, APDCM 2014, Phoenix, AZ, May 2014. A pdf version is available. Dynamically balanced synchronization-avoiding LU factorization with multicore and GPUs, Simplice Donfack, Stanimire Tomov and Jack Dongarra, Fourth International Workshop on Accelerators and Hybrid Exascale Systems, May 19, 2014. A pdf version is available. Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime, Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack Dongarra, Parallel Processing Letters, Volume 24, Number 4, December 2014, DOI: 10.1142/S0129626414420043. A pdf version is available. Scaling Up Matrix Computations on Shared-Memory Manycore Systems with 1000 CPU Cores, Fengguang Song and Jack Dongarra, ICS '14: Proceedings of the 28th ACM International Conference on Supercomputing, pp 333-342, ACM New York, NY, USA, ISBN: 9781450326421, DOI: 10.1145/2597652.2597670 A pdf version is available. Heterogeneous Acceleration for Linear Algebra in Multi-Coprocessor Environments, Azzam Haidar, Piotr Luszczek, Stanimire Tomov and Jack Dongarra, VECPAR 2014, June 30 - July 3, 2014, Eugene, Oregon, accepted March 2014. A pdf version is available.
A Fast Batched Cholesky Factorization on a GPU, Tingxing Dong, Azzam Haidar, Stanimire Tomov and Jack Dongarra, 43rd International Conference on Parallel Processing (ICPP2014), Minneapolis, USA, September 9-12, 2014. A pdf version is available. clMAGMA: High Performance Dense Linear Algebra with OpenCL, Chongxiao Cao, Jack Dongarra, Peng Du, Mark Gates, Piotr Luszczek, Stanimire Tomov, The International Workshop on OpenCL, Bristol University, England, May 12-13, 2014. A pdf version is available. Utilizing Dataflow-based Execution for Coupled Cluster Methods, Heike McCraw, Anthony Danalis, Thomas Herault, George Bosilca, Jack Dongarra, Karol Kowalski, Theresa L. Windus, Poster at Clusters 2014. A pdf version is available.  2013 Trip Report to Changsha and the Tianhe-2 Supercomputer, J. Dongarra, June 3, 2013. A pdf version is available. Extending the Scope of the Checkpoint-on-Failure Protocol for Forward Recovery in Standard MPI, Wesley Bland, Peng Du, Aurelien Bouteiller, Thomas Herault, George Bosilca, and Jack J. Dongarra, Concurrency and Computation: Practice and Experience, Volume 25, Issue 17, pp. 2381–2393, 2013, DOI: 10.1002/cpe.3100. A pdf version is available. Toward a New Metric for Ranking High Performance Computing Systems, M. Heroux and J. Dongarra, UTK EECS Tech Report and Sandia National Labs Report SAND2013-4744, June 2013. A pdf version is available.
A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks, Azzam Haidar, Stanimire Tomov, Jack Dongarra, Raffaele Solcà, Thomas Schulthess, International Journal of High Performance Computing Applications, accepted July 2013. A pdf version is available. PaRSEC: A programming paradigm exploiting heterogeneity for enhancing scalability, George Bosilca, Aurelien Bouteiller, Anthony Danalis, Mathieu Faverge, Thomas Herault, Jack J. Dongarra, accepted in IEEE Computing in Science and Engineering, September 2013. A pdf version is available. Unified Model for Assessing Checkpointing Protocols at Extreme-Scale, George Bosilca, Aurelien Bouteiller, Elisabeth Brunet, Franck Cappello, Jack Dongarra, Amina Guermouche, Thomas Herault, Yves Robert, Frederic Vivien, and Dounia Zaidouni, accepted in Concurrency and Computation: Practice and Experience, October 2013. A pdf version is available. Tridiagonalization of a Dense Symmetric Matrix On Multiple GPUs and Its Application to Symmetric Eigenvalue Problems, Ichitaro Yamazaki, Tingxing Dong, Raffaele Solcà, Stanimire Tomov, Jack Dongarra, Thomas Schulthess, Concurrency and Computation: Practice and Experience, published online, October 2013, DOI: 10.1002/cpe.3152 A pdf version is available. Post-Failure Recovery of MPI Communication Capability: Design and Rationale, Wesley Bland, Aurelien Bouteiller, Thomas Herault, George Bosilca and Jack J. Dongarra, International Journal of High Performance Computing Applications, Volume 27, Issue 3, Fall 2013, pp 44254, DOI: 10.1177/1094342013488238. A pdf version is available. Toward High Performance Divide and Conquer Eigensolver for Dense Symmetric Matrices, Azzam Haidar, Hatem Ltaief, and Jack Dongarra, SIAM SISC, Vol. 34, No. 6, pp. C249-C274. A pdf version is available. Accelerating Linear System Solutions Using Randomization Techniques, Marc Baboulin, Jack Dongarra, Julien Herrmann, and Stanimire Tomov, ACM TOMS, Vol.
39, No 2 (2013). A pdf version is available. Level-3 Cholesky Factorization Routines Improve Performance of Many Cholesky Algorithms, Fred G. Gustavson, Jerzy Wasniewski, Jack J. Dongarra, J. Herrero, and J. Langou, ACM Transactions on Mathematical Software (TOMS), Vol. 39, No 2 (2013). A pdf version is available. High Performance Bidiagonal Reduction using Tile Algorithms on Homogeneous Multicore Architectures, H. Ltaief, P. Luszczek, and J. Dongarra, ACM Transactions on Mathematical Software, Volume 39, Issue 3, April 2013. A pdf version is available. An Evaluation of User-Level Failure Mitigation support in MPI, Aurelien Bouteiller, Wesley Bland, Thomas Herault, Joshua Hursey, George Bosilca and Jack Dongarra, Recent Advances in the Message Passing Interface, Lecture Notes in Computer Science Volume 7490, 2012, pp 193-203, ISSN: 0010-485X, April 2013. A pdf version is available. Kernel-Assisted and Topology-Aware Collective Communications on Multicore/Manycore Platforms, Teng Ma, George Bosilca, Aurelien Bouteiller, Jack Dongarra, Journal of Parallel and Distributed Computing, Volume 73, Issue 7, pp. 1000-1010, July 2013. (Best paper award IPDPS 2013 Conference) A pdf version is available. BlackjackBench: Portable Hardware Characterization with Automated Results Analysis, Anthony Danalis, Piotr Luszczek, Gabriel Marin, Jeffrey S. Vetter and Jack Dongarra, Computer Journal, 2013; DOI: 10.1093/comjnl/bxt057. A pdf version is available. Enabling Workflows in GridSolve: Request Sequencing and Service Trading, Yinan Li, Asim YarKhan, Jack Dongarra, Keith Seymour, and Aurélie Hurault, The Journal of Supercomputing, June 2013, Volume 64, Issue 3, pp 1133-1152. A pdf version is available. Correlated Set Coordination in Fault Tolerant Message Logging Protocols, A. Bouteiller, T. Herault, G. Bosilca, J. Dongarra, Concurrency and Computation: Practice and Experience, Volume 25, Issue 4, pages 572-585, 2013. A pdf version is available.
LU Factorization with Partial Pivoting for a Multicore System with Accelerators, J. Kurzak, P. Luszczek, and J. Dongarra, IEEE Transactions on Parallel and Distributed Systems, August 2013 (vol. 24 no. 8), pp. 1613-1621. A pdf version is available. Soft Error Resilient QR Factorization for Hybrid System with GPGPU, P. Du, P. Luszczek, S. Tomov, and J. Dongarra, accepted in Journal of Computational Science, January 2013. A pdf version is available. Hierarchical QR factorization algorithms for multicore cluster systems, Jack Dongarra, Mathieu Faverge, Thomas Herault, Mathias Jacquelin, Julien Langou, Yves Robert, Parallel Computing, Volume 39, Issues 4-5, April-May 2013, Pages 212–232. A pdf version is available. A Block-Asynchronous Relaxation Method for Graphics Processing Units, Hartwig Anzt, Stanimire Tomov, Jack Dongarra, Vincent Heuveline, Journal of Parallel and Distributed Computing, Online June 6, 2013, http://dx.doi.org/10.1016/j.bbr.2011.03.031 A pdf version is available. Extending the Scope of the Checkpoint-on-Failure Protocol for Forward Recovery in Standard MPI, Wesley Bland, Peng Du, Aurelien Bouteiller, Thomas Herault, George Bosilca, Jack J. Dongarra, accepted in Concurrency and Computation: Practice and Experience, June 2013. A pdf version is available. Achieving Numerical Accuracy and High Performance using Recursive Tile LU Factorization, J. Dongarra, M. Faverge, P. Luszczek, accepted in Concurrency and Computation: Practice and Experience, July 2013. A pdf version is available. Optimizing Memory-Bound Numerical Kernels on GPU Hardware Accelerators, A. Abdelfattah, J. Dongarra, D. Keyes, and H. Ltaief, 10th International Meeting on High-Performance Computing for Computational Science (VECPAR 2012), Lecture Notes in Computer Science 7851, pp 72-79, 2013. A pdf version is available.
Programming the LU Factorization for a Multicore System with Accelerators, Jakub Kurzak, Piotr Luszczek, Mathieu Faverge, and Jack Dongarra, 10th International Meeting on High-Performance Computing for Computational Science (VECPAR 2012), Lecture Notes in Computer Science 7851, pp 28-35, 2013. A pdf version is available. Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach, George Bosilca, Aurelien Bouteiller, Anthony Danalis, Thomas Herault, Piotr Luszczek, and Jack J. Dongarra, in the book Scalable Computing and Communications: Theory and Practice, edited by Samee U. Khan, Lizhe Wang, and Albert Y. Zomaya, Publisher John Wiley & Sons, ISBN: 9781118162651, 2013. A pdf version is available. Keeneland: Computational Science Using Heterogeneous GPU Computing, J. Vetter, R. Glassbrook, K. Schwan, S. Yalamanchili, M. Horton, A. Gavrilovska, M. Slawinska, J. Meredith, P. Roth, K. Spafford, S. Tomov, J. Wynkoop, Ed. Jeffrey S. Vetter, Contemporary High Performance Computing: From Petascale Toward Exascale, Taylor and Francis, Boca Raton, CRC Computational Science Series, 2013. A pdf version is available. HPC Challenge: Design, History, and Implementation Highlights, J. Dongarra and P. Luszczek, Ed. Jeffrey S. Vetter, Contemporary High Performance Computing: From Petascale Toward Exascale, Taylor and Francis, Boca Raton, CRC Computational Science Series, 2013, ISBN: 9781466568341. A pdf version is available. Multithreading in the PLASMA Library, Jakub Kurzak, Piotr Luszczek, Asim YarKhan, Mathieu Faverge, Julien Langou, Henricus Bouwmeester, and Jack Dongarra, in Multi- and Many-Core Processing: Architecture, Programming, Algorithms, & Applications, edited by Mohamed Ahmed, Reda A. Ammar, Sanguthevar Rajasekaran, Series: Chapman & Hall/CRC Computer & Information Science Series, published by Taylor & Francis, 2013. A pdf version is available.
Looking Back at Dense Linear Algebra Software, Piotr Luszczek, Jakub Kurzak, and Jack Dongarra, submitted to Journal of Parallel and Distributed Computing, August 2013. A pdf version is available. Scalable Dense Linear Algebra on Heterogeneous Hardware, George Bosilca, Aurelien Bouteiller, Anthony Danalis, Thomas Herault, Jakub Kurzak, Piotr Luszczek, Stan Tomov, Jack Dongarra, to appear in the book HPC: Transition Towards Exascale Processing, in the series Advances in Parallel Computing, IOS Press. A pdf version is available. LAPACK, CRC Handbook on Linear Algebra, Second Edition, Zhaojun Bai, James Demmel, Jack Dongarra, Julien Langou, and Jenny Wang, Editor Leslie Hogben, CRC Press, to appear 2013. A pdf version is available. Revisiting the Double Checkpointing Algorithm, Jack Dongarra, Thomas Herault and Yves Robert, 15th Workshop on Advances in Parallel and Distributed Computational Models, at the IEEE International Parallel & Distributed Processing Symposium 2013, Boston MA, January 2013. A pdf version is available. Implementing a Blocked Aasen's Algorithm with a Dynamic Scheduler on Multicore Architectures, Ichitaro Yamazaki, Dulceneia Becker, Jack Dongarra, Alex Druinsky, Inon Peled, Sivan Toledo, Grey Ballard, James Demmel, and Oded Schwartz, 15th Workshop on Advances in Parallel and Distributed Computational Models, at the IEEE International Parallel & Distributed Processing Symposium 2013 (Best Paper Award), Boston MA, January 2013. A pdf version is available. Virtual Systolic Array for QR Decomposition, Jakub Kurzak, Piotr Luszczek, Mark Gates, Ichitaro Yamazaki, and Jack Dongarra, 15th Workshop on Advances in Parallel and Distributed Computational Models, at the IEEE International Parallel & Distributed Processing Symposium 2013, Boston MA, January 2013. A pdf version is available. clMAGMA: High Performance Dense Linear Algebra with OpenCL, C.
Cao, Jack Dongarra, Peng Du, Mark Gates, Piotr Luszczek, Stanimire Tomov, International Workshop on OpenCL (IWOCL), GATech, May 13-14, 2013. A pdf version is available. A Parallel solver for Incompressible Fluid Flows, Y. Wang, M. Baboulin, J. Dongarra, J. Falcou, Y. Fraigneau, and O. Le Maitre, International Conference on Computational Science, ICCS 2013, Barcelona, Spain, May 2013. A pdf version is available. Leading Edge Hybrid Multi-GPU Algorithms for Generalized Eigenproblems in Electronic Structure Calculations, Azzam Haidar, Raffaele Solca, Mark Gates, Stanimire Tomov, Thomas Schulthess, and Jack Dongarra, International Supercomputing Conference ISC, Germany, Lecture Notes in Computer Science, Volume 7905, 2013, pp 67-80. A pdf version is available. Beyond the CPU: Hardware Performance Counter Monitoring on Blue Gene/Q, Dan Terpstra, Kris Davis, Heike McCraw, Jack Dongarra, International Supercomputing Conference ISC, Germany, Lecture Notes in Computer Science, Volume 7905, 2013, pp 213-225. A pdf version is available. Toward a scalable multi-GPU eigensolver via compute-intensive kernels and efficient communication, Azzam Haidar, Mark Gates, Stanimire Tomov, Jack Dongarra, ICS '13: Proceedings of the 27th international ACM conference on International conference on supercomputing, Pages 223-232, ACM New York, NY, USA, June 2013, Eugene, Oregon. A pdf version is available. Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi, Jack Dongarra, Mark Gates, Azzam Haidar, Yulu Jia, Khairul Kabir, Piotr Luszczek and Stan Tomov, to appear in the PPAM Conference 2013, Warsaw, Poland, September 2013. A pdf version is available. Standards for Graph Algorithm Primitives, Tim Mattson et al., to appear HPEC'2013, Boston, September 10, 2013. A pdf version is available.
Implementing a Systolic Algorithm for QR Factorization on Multicore Clusters with PaRSEC, Guillaume Aupy, Mathieu Faverge, Yves Robert, Jakub Kurzak, Piotr Luszczek, and Jack Dongarra, accepted in the 6th Workshop on Productivity and Performance held in conjunction with Euro-Par 2013, Aachen, Germany, August 26 or 27, 2013. A pdf version is available. Parallel Reduction to Hessenberg Form with Algorithm-based Fault Tolerance, Yulu Jia, George Bosilca, Piotr Luszczek, and Jack J. Dongarra, accepted in SC2013, July 2013. A pdf version is available.  2012 Autotuning GEMMs for Fermi, Jakub Kurzak, Stanimire Tomov, and Jack Dongarra, IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 11, November 2012, pp 2045-2057. A pdf version is available. Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture, Jack Dongarra, Hatem Ltaief, Piotr Luszczek, and Vince M. Weaver, The 2nd International Conference on Cloud and Green Computing (CGC 2012), pp 274-281, ISBN: 9781467330275, November 13, 2012, Xiangtan, Hunan, China. A pdf version is available. A Novel Hybrid CPU-GPU Generalized Eigensolver for Electronic Structure Calculations Based on Fine Grained Memory Aware Tasks, Raffaele Solcà, Azzam Haidar, Stanimire Tomov, Jack Dongarra, and Thomas C. Schulthess, SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, 2012, Pages 1338-1339, IEEE Computer Society, Washington, DC, USA. A pdf version is available. Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures, A. Haidar, H. Ltaief, A. YarKhan, J. Dongarra, Concurrency and Computation: Practice and Experience, Volume 24, Issue 3, pages 305–321, 10 March 2012. A pdf version is available.
From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming, P. Du, R. Weber, P. Luszczek, S. Tomov, G. Peterson, and J. Dongarra, Parallel Computing, Volume 38, Issue 8, August 2012, pp. 391-407. A pdf version is available. High-performance computing systems: Status and Outlook, Jack Dongarra and A. J. van der Steen, Acta Numerica (2012), pp. 1-96. A pdf version is available. An Implementation of the Tile QR Factorization for a GPU and Multiple CPUs, Jakub Kurzak, Rajib Nath, Peng Du, and Jack Dongarra, in Applied Parallel and Scientific Computing, PARA 2010, Editor Kristjan Jonasson, Springer, LNCS, Volume 7133, pp 248-257, 2012. A pdf version is available. DAGuE: A generic distributed DAG engine for high performance computing, G. Bosilca, A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier, J. Dongarra, Parallel Computing, Volume 38, Issues 1-2, pp. 37–51, 2012. A pdf version is available. Divide and Conquer on Hybrid GPU-Accelerated Multicore Systems, Christof Vömel, Stanimire Tomov, and Jack Dongarra, SIAM J. Sci. Comput. Volume 34, pp. C70-C82, 2012. A pdf version is available. A Comprehensive Study of Task Coalescing for Selecting Parallelism Granularity in a Two-Stage Bidiagonal Reduction, A. Haidar, H. Ltaief, P. Luszczek, and J. Dongarra, 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Shanghai, China, May 2012. A pdf version is available. A Tiled Parallel Solver For Symmetric Indefinite Systems On Multicore Architectures, Marc Baboulin, D. Becker, and J. Dongarra, 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Shanghai, China, May 2012. A pdf version is available. Algorithm-Based Fault Tolerance for Dense Matrix Factorization, Peng Du, Aurelien Bouteiller, George Bosilca, Jack J. Dongarra, Thomas Herault, 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), February 25-29, 2012, New Orleans, LA. A pdf version is available.
Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems, Hartwig Anzt, Stan Tomov, Mark Gates, Jack Dongarra, and Vincent Heuveline, Procedia Computer Science, Proceedings of the International Conference on Computational Science, ICCS 2012, Volume 9, 2012, Pages 7–16. A pdf version is available. From Serial Loops to Parallel Execution on Distributed Systems, Anthony Danalis, Aurelien Bouteiller, George Bosilca, Jack J. Dongarra, Thomas Herault, submitted to PPoPP 2012. A pdf version is available. HierKNEM: An Adaptive Framework for Kernel-Assisted and Topology-Aware Collective Communications on Manycore Clusters, (Best Paper), Teng Ma, G. Bosilca, A. Bouteiller, J. Dongarra, 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Shanghai, China, May 2012. A pdf version is available. Weighted Block-Asynchronous Relaxation for GPU-Accelerated Systems, Hartwig Anzt, Jack Dongarra, and Vincent Heuveline, submitted to SIAM Journal on Computing, March 2012. A pdf version is available. Dense Linear Algebra on Accelerated Multicore Hardware, Jack Dongarra, Jakub Kurzak, Piotr Luszczek, and Stanimire Tomov, in High Performance Scientific Computing: Algorithms and Applications, Editors Michael W. Berry, Kyle A. Gallivan, Efstratios Gallopoulos, Ananth Grama, Bernard Philippe, Yousef Saad and Faisal Saied, Springer, 2012. A pdf version is available. Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures using Tree Reduction, H. Ltaief, P. Luszczek, and J. Dongarra, in Lecture Notes in Computer Science, Volume 7203, 2012, Parallel Processing and Applied Mathematics 9th International Conference, PPAM 2011, Torun, Poland, September 11-14, 2011, Part I, Roman Wyrzykowski, Jack Dongarra, Konrad Karczewski and Jerzy Wasniewski, pp 661-670, 2012. A pdf version is available. Reducing the Amount of Pivoting in Symmetric Indefinite Systems, D. Becker, M. Baboulin, J.
Dongarra, in Lecture Notes in Computer Science, Volume 7203, 2012, Parallel Processing and Applied Mathematics 9th International Conference, PPAM 2011, Torun, Poland, September 11-14, 2011, Part I, Roman Wyrzykowski, Jack Dongarra, Konrad Karczewski and Jerzy Wasniewski, pp 133-142, 2012. A pdf version is available. Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems, Hartwig Anzt, Stan Tomov, Mark Gates, Jack Dongarra, and Vincent Heuveline, International Conference on Computational Science (ICCS) 2012, May 2012, Omaha NE. A pdf version is available. One-sided dense matrix factorizations on a multicore with multiple GPU accelerators in MAGMA, Ichitaro Yamazaki, Stanimire Tomov, and Jack Dongarra, International Conference on Computational Science, ICCS 2012, Omaha NE. A pdf version is available. A Class of Communication-Avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines, Marc Baboulin, Simplice Donfack, Jack Dongarra, Laura Grigori, Adrien Rémy, Stanimire Tomov, International Conference on Computational Science, ICCS 2012, Omaha NE. A pdf version is available. High Performance Dense Linear System Solver with Resilience to Multiple Soft Errors, P. Du, P. Luszczek, and J. Dongarra, International Conference on Computational Science, ICCS 2012, Omaha NE. A pdf version is available. Enabling and Scaling Matrix Computations on Heterogeneous Multi-Core and Multi-GPU Systems, Fengguang Song and Jack Dongarra, ICS 2012 Conference, 26th International Conference on Supercomputing, 25-29 June 2012, San Servolo Island, Venice, Italy. A pdf version is available. A Scalable Framework for Heterogeneous GPU-Based Clusters, F. Song and J. Dongarra, ACM Symposium on Parallelism in Algorithms and Architectures (SPAA '12), Pittsburgh, USA, January 2012. A pdf version is available.
A CheckpointonFailure Protocol for AlgorithmBased Recovery in Standard MPI, Wesley Bland, Peng Du, Aurelien Bouteiller, Thomas Herault, George Bosilca, and Jack J. Dongarra, EuroPar 2012 Parallel Processing, Lecture Notes in Computer Science Volume 7484, 2012, pp 477488, as a distinguished paper. A pdf version is available. From Serial Loops to Parallel Execution on Distributed Systems, Anthony Danalis, Aurelien Bouteiller, George Bosilca, Jack J. Dongarra, Thomas Herault, EuroPar 2012 Parallel Processing, Lecture Notes in Computer Science Volume 7484, 2012, pp 246257. A pdf version is available. Power Profiling of Cholesky and QR Factorizations on Distributed Memory Systems, George Bosilca, Jack Dongarra, and Hatem Ltaief, accepted at EnAHPC 2012: Third International Conference on EnergyAware High Performance Computing, September 1214, 2012. A pdf version is available. Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture, Jack Dongarra, Hatem Ltaief, Piotr Luszczek, and Vince M. Weaver, submitted to The 2nd International Conference on Cloud and Green Computing (CGC 2012), November 13, 2012, Xiangtan, Hunan, China. A pdf version is available. Anatomy of a Globally Recursive Embedded LINPACK Benchmark, Piotr Luszczek and Jack Dongarra, accepted in 2012 IEEE High Performance Extreme Computing Conference, Waltham, Massachusetts, September 2012. A pdf version is available. Weights for BlockAsynchronous Iteration on GPUAccelerated Systems, Hartwig Anzt, Stanimire Tomov, Jack Dongarra, and Vincent Heuveline, to appear in the 10th HeteroPar'2012 (Tenth International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Platforms), Rhodes Island, Greece, August 2012. A pdf version is available. GPUAccelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement, H. Anzt, P. Luszczek, J. Dongarra, V. 
Heuveline, EuroPar 2012 Parallel Processing, Lecture Notes in Computer Science Volume 7484, 2012, pp 908919, Rhodes Island, Greece, August 2012. A pdf version is available.  2011 HighPerformance HighResolution SemiLagrangian Tracer Transport on a Sphere, T. White and J. Dongarra, Journal of Computational Physics, Volume 230 Issue 17, July, 2011, pp 67786799. A pdf version is available. A Class of Hybrid LAPACK Algorithms for Multicore and GPU Architectures, M. Horton, S. Tomov, and J. Dongarra, to appear in 2011 Symposium on Application Accelerators in High Performance Computing, 1921 July, 2011, Knoxville TN. A pdf version is available. Algorithmbased Fault Tolerance Method for Soft Error Resilience in HighPerformance Linpack, Peng Du, Piotr Luszczek, and Jack Dongarra, IEEE Cluster 2011, September 2630, Austin, TX. A pdf version is available. Analysis of Dynamically Scheduled Tile Algorithms for Dense Linear Algebra on Multicore Architectures, Azzam Haidar, Hatem Ltaief, Asim YarKhan and Jack Dongarra, IPDPS 2011, Anchorage, AK, May 2011. A pdf version is available. BLAS for GPUs, R. Nath, S. Tomov, and J. Dongarra, pp 5780, in Scientific Computing with Multicore and Accelerators, Edited by Jakub Kurzak, David Bader, and Jack Dongarra, Chapman & Hall/CRC Computational Science Series, ISBN 9781439825365, 2011. A pdf version is available. Changes in Dense Linear Algebra Kernels, Decadeslong perspective, Piotr Luszczek, Jakub Kurzak, and Jack Dongarra, pp 313342, in Solving the Schrödinger equation: has everything been tried? Editor Paul Popelier, Imperial College Press, 2011, ISBN13 9781848167247. A pdf version is available. DAGuE: A generic distributed DAG engine for high performance computing, G. Bosilca, A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier, J. Dongarra, Parallel and Distributed Processing Workshops and Phd Forum (IPDPSW), 2011 IEEE International Symposium on, pp. 11511158, 1620 May 2011, ISSN: 15302075. A pdf version is available. 
Dense Linear Algebra for Hybrid GPUBased Systems, S. Tomov and J. Dongarra, pp 3756, in Scientific Computing with Multicore and Accelerators, Edited by Jakub Kurzak, David Bader, and Jack Dongarra, Chapman & Hall/CRC Computational Science Series, ISBN 9781439825365, 2011. A pdf version is available. Evaluation of the HPC Challenge Benchmarks in Virtualized Environments, P. Luszczek, E. Meek, S. Moore, D. Terpstra, J. Dongarra, 6th Workshop on Virtualization in HighPerformance Cloud Computing (VHPC '11) as part of EuroPar 2011, Bordeaux, France. A pdf version is available. Exploiting FineGrain Parallelism in Recursive LU Factorization, Jack Dongarra, Mathieu Faverge, Hatem Ltaief, Piotr Luszczek, International Conference on Parallel Computing, 30 August – 2 September 2011, Ghent, Belgium. A pdf version is available. Flexible Development of Dense Linear Algebra Algorithms on Massively Parallel Architectures with DPLASMA, George Bosilca, Aurelien Bouteiller, Anthony Danalis, Mathieu Faverge, Azzam Haidar, Thomas Herault, Jakub Kurzak, Julien Langou, Pierre Lemarinier, Hatem Ltaief, Piotr Luszczek, Asim YarKhan, Jack Dongarra, 12th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC11), May 1620, 2011, Anchorage, Alaska, USA. A pdf version is available. Fully Empirical Autotuned Dense QR Factorization For Multicore Architectures, E. Agullo, J. Dongarra, R. Nath, S. Tomov, EuroPar 2011. A pdf version is available. High Performance Matrix Inversion Based on LU Factorization for Multicore Architectures, J. Dongarra, M. Faverge, H. Ltaief, P. Luszczek, 4th Workshop on ManyTask Computing on Grids and Supercomputers (MTAGS) 2011, Colocated with Supercomputing/SC 2011, Seattle, Washington, November 14th, 2011. 
A pdf version is available. Impact of KernelAssisted MPI Communication over Scientific Applications: CPMD and FFTW, T. Ma, A. Bouteiller, G. Bosilca, J. Dongarra, EuroMPI2011, September 1921, 2011, Santorini, Greece. A pdf version is available. Implementing Matrix Factorization on the Cell B.E., J. Kurzak and J. Dongarra, pp. 2135, in Scientific Computing with Multicore and Accelerators, Edited by Jakub Kurzak, David Bader, and Jack Dongarra, Chapman & Hall/CRC Computational Science Series, ISBN 9781439825365, 2011. A pdf version is available. Implementing Matrix Multiplication on the Cell B.E., W. Alvaro, J. Kurzak, and J. Dongarra, pp 320, in Scientific Computing with Multicore and Accelerators, Edited by Jakub Kurzak, David Bader, and Jack Dongarra, Chapman & Hall/CRC Computational Science Series, ISBN 9781439825365, 2011. A pdf version is available. Improvement of parallelization efficiency of batch pattern BP training algorithm using Open MPI, Volodymyr Turchenko, Lucio Grandinetti, George Bosilca and Jack J. Dongarra, International Conference on Computational Science, ICCS 2010, Amsterdam, The Netherlands, June 2010. A pdf version is available. Keeneland: Bringing Heterogeneous GPU Computing to the Computational Science Community, J.S. Vetter, R. Glassbrook, J. Dongarra, K. Schwan, B. Loftis, S. McNally, J. Meredith, J. Rogers, P. Roth, K. Spafford, and S. Yalamanchili, IEEE Computing in Science and Engineering, 13(5):905, 2011, ISSN: 15219615. A pdf version is available. Level3 Cholesky Factorization Routines Improve Performance of Many Cholesky Algorithms, Fred G. Gustavson, Jerzy Wasniewski, Jack J. Dongarra, J. Herrero, and J. Langou, accepted in ACM TOMS, June 2011. A pdf version is available. 
LU Factorization for Acceleratorbased Systems, Emmanuel Agullo, Cédric Augonnet, Jack Dongarra, Mathieu Faverge, Julien Langou, Hatem Ltaief, Stanimire Tomov, The 9th ACS/IEEE International Conference on Computer Systems and Applications AICCSA 2011, June 27th – June 30th, 2011, Sharm ElSheikh, Egypt. A pdf version is available. Multithreading in the PLASMA Library, Jakub Kurzak, Piotr Luszczek, Asim YarKhan, Mathieu Faverge, Julien Langou, Henricus Bouwmeester, and Jack Dongarra, in Multi and ManyCore Technologies: Architecture, Programming, Algorithms, & Applications, published by Taylor & Francis, 2011. A pdf version is available. OMPIO: A Modular Software Architecture for MPI I/O, Mohamad Chaarawi, Edgar Gabriel, Rainer Keller, Richard Graham, George Bosilca and Jack Dongarra, EuroMPI2011, September 1921, 2011, Santorini, Greece. A pdf version is available. On Scalability for MPI Runtime Systems, George Bosilca, Thomas Herault, Ala Rezmerita and Jack Dongarra, The International Workshop on Runtime and Operating Systems for Supercomputers, May 31, 2011. A pdf version is available. Optimizing Symmetric Dense MatrixVector Multiplication on GPUs, Jakub Kurzak, Jack Dongarra, and Rajib Nath, IEEE/ACM SC11 Conference, Seattle, WA, November 2011. A pdf version is available. Overlapping Computation and Communication for Advection on Hybrid Parallel Computers, J. White and J. Dongarra, IPDPS 2011, Anchorage, AK, May 2011. A pdf version is available. Parallel Reduction to Condensed Forms for Symmetric Eigenvalue Problems using Aggregated FineGrained and MemoryAware Kernels, Hatem Ltaief, Azzam Haidar, and Jack Dongarra, IEEE/ACM SC11 Conference, Seattle, WA, November 2011. A pdf version is available. Performance Portability of a GPU Enabled Factorization with the DAGuE Framework, Aurelien Bouteiller, George Bosilca, Jack J. 
Dongarra, Thomas Herault, Pierre Lemarinier, Stanimire Tomov and Narapat Ohm Saengpatsa, IEEE Cluster: workshop on Parallel Programming on Accelerator Clusters (PPAC), June 24, 2011. A pdf version is available. Profiling High Performance Dense Linear Algebra Algorithms on Multicore Architectures for Power and Energy Efficiency, Hatem Ltaief, Piotr Luszczek and Jack Dongarra, the International Conference on EnergyAware High Performance Computing, September 0709, 2011, Hamburg, Germany. A pdf version is available. QCGOMPI: MPI Applications on Grids, Emmanuel Agullo, Camille Coti, Thomas Herault, Julien Langou, Sylvain Peyronnet, Ala Rezmerita, Franck Cappello, Jack Dongarra, Future Generation Computer Systems, Volume 27, Issue 4, pp 357369, April 2011. A pdf version is available. QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment, Emmanuel Agullo, Camille Coti, Jack Dongarra, Thomas Herault, and Julien Langou, UTCS10651, January 6, 2010. A pdf version is available. QR Factorization on a Multicore Node Enhanced with Multiple GPU Accelerators, E. Agullo, C. Augonnet, J. Dongarra, M. Faverge, H. Ltaief, S. Thibault, S. Tomov, IPDPS 2011, Anchorage, AK, May 2011. A pdf version is available. Recent Advances in the Message Passing Interface 18th European MPI Users' Group Meeting, EuroMPI 2011, Santorini, Greece, September 1821, 2011, Yiannis Cotronis, Anthony Danalis, Dimitrios S. Nikolopoulos, and Jack Dongarra (Eds.) Springer, LNCS, Volume 6960, 2011, ISSN 03029743, ISBN 9783642244483. Rectangular Full Packed Format for Cholesky's Algorithm: Factorization, Solution, and Inverse, Fred G. Gustavson, Jerzy Wasniewski, Jack J. Dongarra, and J. Langou, ACM TOMS, Volume 37, Number 2, 2011, pp. 181:1821, 2011, ISSN 00983500. A pdf version is available. Reducing the Amount of Pivoting in Symmetric Indefinite Systems, D. Becker, M. Baboulin, J. Dongarra, to appear in PPAM, October 2011. A pdf version is available. 
Scalable Runtime for MPI: Efficiently Building the Communication Infrastructure, G. Bosilca, T. Herault, P. Lemarinier, A. Rezmerita, and J. Dongarra, EuroMPI2011, September 1921, 2011, Santorini, Greece. A pdf version is available. Scientific Computing with Multicore and Accelerators, Edited by Jakub Kurzak, David Bader, and Jack Dongarra, Chapman & Hall/CRC Computational Science Series, ISBN 9781439825365, 2011. Soft Error Resilient QR Factorization for Hybrid System with GPGPU, P. Du, P. Luszczek, S. Tomov, and J. Dongarra, Workshop on Latest Advances in Scalable Algorithms for LargeScale Systems (ScalA) held in conjunction with the 24th IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC) 2011, November 14, 2011, Seattle, WA, USA. A pdf version is available. Solving the Generalized Symmetric Eigenvalue Problem using Tile Algorithms on Multicore Architectures, Hatem Ltaief, Piotr Luszczek, and Jack Dongarra, International Conference on Parallel Computing, 30 August – 2 September 2011, Ghent, Belgium. A pdf version is available. The International Exascale Software Roadmap, J. Dongarra, P. Beckman, et al., International Journal of High Performance Computing, Volume 25, Number 1, pp. 360, 2011, ISSN 10943420. A pdf version is available. Toward High Performance and Conquer Eigensolver for Dense Symmetric Matrices, Azzam Haidar, Hatem Ltaief, and Jack Dongarra, submitted to SIAM SISC, February 2011. A pdf version is available. Towards an efficient tile matrix inversion of symmetric positive definite matrices on multicore architectures, Agullo, E., Bouwmeester, H., Dongarra, J., Kurzak, J., Langou, J., and Rosenberg, L., In Proceedings of the 9th International Meeting on High Performance Computing for Computational Science, VECPAR'10, Berkeley, CA, June 2225 2011. A pdf version is available. 
Tracebased Performance Analysis for the Petascale Simulation Code FLASH, Heike Jagode, Jack Dongarra, Andreas Knupfer, Matthias Jurenz, Matthias S. Muller, and Wolfgang E. Nagel, International Journal of High Performance Computing, Volume 25, Number 4, Winter 2011, pp. 428439, ISSN 10943420. A pdf version is available. TwoStage Tridiagonal Reduction for Dense Symmetric Matrices using Tile Algorithms on Multicore Architectures, Piotr Luszczek, Hatem Ltaief, and Jack Dongarra, IPDPS 2011, Anchorage, AK, May 2011. A pdf version is available.  2010 Accelerating the Reduction to Upper Hessenberg, Tridiagonal, and Bidiagonal Forms Through Hybrid GPUBased Computing, S. Tomov, R. Nath, and J. Dongarra, Parallel Computing, Volume 36, Number 12, 2010, pp. 45654. A pdf version is available. An Improved MAGMA GEMM for Fermi GPUs, Rajib Nath, Stanimire Tomov, and Jack Dongarra, International Journal of High Performance Computing Applications, Volume 24, Number 4, 2010, pp 511515, ISSN 10943420. A pdf version is available. Dense Linear Algebra Solvers for Multicore with GPU Accelerators, Stanimire Tomov, Rajib Nath, Hatem Ltaief, and Jack Dongarra, Proceedings of IPDPS 2010: 24th IEEE International Parallel and Distributed Processing Symposium, Atlanta, GA, April 2010. A pdf version is available. Empirical Performance Tuning of Dense Linear Algebra Software, Jack Dongarra and Shirley Moore, pp 255272, in Performance Tuning of Scientific Applications, David H. Bailey, Robert F. Lucas, Samuel W. Williams, Editors, Chapman & Hall/CRC Computational Science Series, ISBN 9781439815694, 2010. A pdf version is available. Faster, Cheaper, Better - a Hybridization Methodology to Develop Linear Algebra Software for GPUs, Emmanuel Agullo, Cedric Augonnet, Jack Dongarra, Hatem Ltaief, Raymond Namyst, Samuel Thibault, and Stanimire Tomov, Nvidia GPU Gems, Morgan Kaufmann (Ed.), 2010. A pdf version is available. Hybrid Multicore Cholesky Factorization with Multiple GPU Accelerators, H. 
Ltaief, S. Tomov, R. Nath, and J. Dongarra, submitted to IEEE Transaction on Parallel and Distributed Computing, March 2010. A pdf version is available. Parallel Band TwoSided Matrix Bidiagonalization for Multicore Architectures, Hatem Ltaief, Jakub Kurzak, and Jack Dongarra, IEEE Transactions on Parallel and Distributed Systems, April 2010, pp 417423. A pdf version is available. Redesigning the Message Logging Model for High Performance, A. Bouteiller, G. Bosilca, and J. Dongarra, Concurrency and Computation: Practice and Experience, Volume 22, Number 15, November 2010, pp 21962212, ISSN 15320626. A pdf version is available. Scheduling Linear Algebra Operations on Multicore Processors, Jakub Kurzak, Hatem Ltaief, Jack Dongarra, and Rosa M. Badia, Concurrency and Computation: Practice and Experience, Vol. 22, no. 1, pp. 1544, January, 2010. A pdf version is available. Scheduling Twosided Transformations using AlgorithmsbyTiles on Multicore Architectures, H. Ltaief, J. Kurzak, J. Dongarra, and R. Badia, Scientific Programming, Volume 18, Number 1, pp 3550, 2010, ISSN 10589244. A pdf version is available. SelfHealing Network for Scalable FaultTolerant Runtime Environments, T. Angskun, G. Fagg, G. Bosilca, J. PjesivacGrbovic, and J. Dongarra, Future Generation Computer Systems, Volume 26, Issue 3, pp 479485, March 2010, ISSN 0167739X, 2010. A pdf version is available. SmartGridRPC: The new RPC model for high performance Grid computing and its implementation in SmartGridSolve, T. Brady, A. Lastovetsky, K. Seymour, M. Guidolin, and J. Dongarra, Concurrency Practice and Experience, pp 24672487, Volume 22 Number 18, ISSN 15320626, 2010. A pdf version is available. Towards Dense Linear Algebra for Hybrid GPU Accelerated Manycore Systems, Parallel Computing, Volume 36, Issues 56, pp 232240, 2010, ISSN 01678191. A pdf version is available. 
Reliability and Performance Modeling and Analysis for Grid Computing, YuanShun Dai, Jack Dongarra, in Handbook of Research on Scalable Computing Technologies, Editors KuanChing Li, ChingHsien Hsu, Laurence Tianruo Yang, Jack Dongarra, Hans Zima, IGI Global, 2010. A pdf version is available. Transparent CrossPlatform Access to Software Services using GridSolve and GridRPC, Keith Seymour, Asim YarKhan, and Jack Dongarra, to appear in Cloud Computing and Software Services: Theory and Techniques, editors Syed Ahson and Mohammad Ilyas, 2010, CRC Press. A pdf version is available.  2009 A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures, Alfredo Buttari, Julien Langou, Jakub Kurzak, and Jack Dongarra, Parallel Computing, Volume 35, Issue 1, pp 3853, 2009, ISSN: 01678191. A pdf version is available. Accelerating Scientific Computations with Mixed Precision Algorithms, Marc Baboulin, Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Julie Langou, Julien Langou, Piotr Luszczek, and Stanimire Tomov, Computer Physics Communications 180 (2009) 25262533. A pdf version is available. Accelerating TimeToSolution for Computational Science and Engineering, J. Demmel, J. Dongarra, A. Fox, S. Williams, V. Volkov, and K. Yelick, SciDAC Review, Winter 2009, pp 4657. A pdf version is available. Algorithmic Based Fault Tolerance Applied to High Performance Computing, Jack J. Dongarra, George Bosilca, Remi Delmas, and Julien Langou, Journal of Parallel and Distributed Computing, Volume 69, pp 410416, 2009. A pdf version is available. Computing the Conditioning of the Components of a Linear Least Squares Solution, Marc Baboulin, Jack Dongarra, and Julien Langou, Numerical Linear Algebra with Applications, July 2009, Volume 16 Issue 7, p 517533. A pdf version is available. 
Highly Scalable SelfHealing Algorithms for High Performance Scientific Computing, Zizhong Chen and J. Dongarra, IEEE Transactions on Computing, Volume 58, Number 11, November 2009, pp 15121524, ISSN 00189340. A pdf version is available. Optimizing Matrix Multiplication for a ShortVector SIMD Architecture - CELL Processor, Wesley Alvaro, Jakub Kurzak, and Jack Dongarra, Parallel Computing, Volume 35, pp 138150, 2009. A pdf version is available. Paravirtualization Effect on Single and Multithreaded MemoryIntensive Linear Algebra Software, Lamia Youseff, Keith Seymour, Haihang You, Dmitrii Zagorodnov, Jack Dongarra, and Rich Wolski, Cluster Computing Journal, Volume 12, Number 2 / June, 2009, pp 101122, ISSN 13867857. A pdf version is available. QR Factorization for the CELL Processor, Jakub Kurzak and Jack Dongarra, Scientific Programming, Volume 17, Issue 12, January 2009, pp 3142, ISSN: 10589244. A pdf version is available. Scheduling Linear Algebra Operations on Multicore Processors, Jakub Kurzak, Hatem Ltaief, Jack Dongarra, and Rosa Badia, to appear in Trends in High Performance and Large Scale Computing, editors L. Grandinetti, G. Joubert, and W. Gentzsch, IOP Press, to be published in 2009. A pdf version is available. The International Exascale Software Project: A Call to Cooperative Action by the Global High Performance Community, Jack Dongarra, Pete Beckman, Patrick Aerts, Frank Cappello, Thomas Lippert, Satoshi Matsuoka, Paul Messina, Terry Moore, Rick Stevens, Anne Trefethen, Mateo Valero, Volume 23, Number 4, Winter 2009, International Journal of High Performance Computer Applications, pp 309322, ISSN 10943420. A pdf version is available. The Problem with the Linpack Benchmark Matrix Generator, Julien Langou and Jack Dongarra, International Journal of High Performance Computer Applications, Volume 23, Number 1, Spring 2009, pp 5–14. A pdf version is available.  
2008 A Comparison of Search Techniques for Empirical Code Optimization, Keith Seymour, Haihang You, and Jack Dongarra, submitted to The Third International Workshop on Automatic Performance Tuning, October 1st, 2008, Tsukuba International Congress Center, Epochal Tsukuba, Japan. A pdf version is available. A Tribute to Gene Golub, Jack Dongarra, Computing in Science and Engineering, IEEE, March/April 2008, pp 5. A pdf version is available. AlgorithmBased Fault Tolerance for FailStop Failures, Zizhong Chen and Jack Dongarra, IEEE Transactions on Parallel and Distributed Systems, Vol. 19, No. 12, December, 2008. A pdf version is available. Interactive GridAccess Using Gridsolve and Giggle, M. Hardt, K. Seymour, J. Dongarra, M. Zapf, and N.V. Ruiter, Computing and Informatics, Vol. 27, No. 2, pp 233248, 2008, ISSN 13359150. A pdf version is available. Interior State Computation of Nano Structures, Andrew Canning, Jack Dongarra, Julien Langou, Osni Marques, Stanimire Tomov, Christof Voemel, and LinWang Wang, PARA 2008, 9th International Workshop on StateoftheArt in Scientific and Parallel Computing, May 1316, 2008, Trondheim, Norway. A pdf version is available. Netlib and NANet: Building a Scientific Computing Community, J. Dongarra, G. Golub, E. Grosse, C. Moler, K. Moore, IEEE Annals of the History of Computing, Volume 3 Number 2, April–June 2008, pp 30–41. A pdf version is available. Parallel Tiled QR Factorization for Multicore Architectures, Alfredo Buttari, Julien Langou, Jakub Kurzak, and Jack Dongarra, Concurrency and Computation: Practice and Experience, 2008; 20:15731590. A pdf version is available. Revisiting Matrix Product on MasterWorker Platforms, Jack Dongarra, Jean-François Pineau, Yves Robert, Zhiao Shi and Frédéric Vivien, International Journal of Foundations of Computer Science (IJFCS), Volume 19, Number 6, December 2008, pp 13171336. A pdf version is available. 
Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization, Jakub Kurzak, Alfredo Buttari, and Jack Dongarra, IEEE Transactions on Parallel and Distributed Systems, Volume 19, Number 9, September 2008, pp 1–11. A pdf version is available. Some Issues in Dense Linear Algebra for Multicore and Special Purpose Architectures, Marc Baboulin, Stan Tomov and Jack Dongarra, PARA 2008, 9th International Workshop on StateoftheArt in Scientific and Parallel Computing, EECS Tech Report UTCS08615, LAWN #200, May 1316, 2008, Trondheim, Norway. A pdf version is available. StateoftheArt Eigensolvers for Electronic Structure Calculations of Large Scale NanoSystems, Christof Vomel, Stanimire Z. Tomov, Osni A. Marques, A. Canning, LinWang Wang, and Jack J. Dongarra, Journal of Computational Physics, Volume 227, Issue 15 (July 2008), pages 71137124. A pdf version is available. The PlayStation 3 for High Performance Scientific Computing, Jakub Kurzak, Alfredo Buttari, Piotr Luszczek, and Jack Dongarra, Computing in Science and Engineering, IEEE, May/June 2008, pp 8083. A pdf version is available. Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64bit Accuracy, Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Piotr Luszczek, and Stanimire Tomov, ACM Transactions on Mathematical Software, Volume 34 Number 4, July 2008, pp 1–22. A pdf version is available.  2007 Automatic Analysis of Inefficiency Patterns in Parallel Applications, Felix Wolf, Bernd Mohr, Jack Dongarra, Shirley Moore, Concurrency and Computation: Practice and Experience, Volume 19, Issue 11, pp 14811496, August 2007. A pdf version is available. Implementation of Mixed Precision in Solving Systems of Linear Equations on the Cell Processor, Jakub Kurzak, Jack Dongarra, Concurrency and Computation: Practice and Experience, Volume 19, Issue 10, pp 13711385, July 2007. A pdf version is available. 
Improved Runtime and Transfer Time Prediction Mechanisms in a Network Enabled Servers Middleware, Emmanuel Jeannot, Keith Seymour, Asim YarKhan, and Jack J. Dongarra, Parallel Processing Letters, March 2007, Volume 17, Number 1, pp 4759, ISSN 01296264. A pdf version is available. Performance Analysis of MPI Collective Operations, Jelena Pjesivac-Grbović, Thara Angskun, George Bosilca, Graham E. Fagg, Edgar Gabriel, and Jack J. Dongarra, Cluster Computing Journal, Volume 10, pp 127143, 2007. A pdf version is available. Recovery Patterns for Iterative Methods in a Parallel Unstable Environment, G. Bosilca, Z. Chen, J. Dongarra, and J. Langou, SIAM Journal on Scientific Computing, pp 102116, Volume 30, Number 1, 2007. A pdf version is available. Scalability Analysis of the SPEC OpenMP Benchmarks on LargeScale Shared Memory Multiprocessors, K. Fuerlinger, M. Gerndt, J. Dongarra, in Lecture Notes in Computer Science, Volumes 44874490, Computational Science - ICCS 2007, 7th International Conference, Beijing, China, May 27–30, 2007, Editors Yong Shi, Geert Dick van Albada, Jack Dongarra, and Peter M.A. Sloot, ISBN10 354072589X, ISSN 03029743, Springer Berlin / Heidelberg, 2007. A pdf version is available. The Impact of Multicore on Computational Science Software, Jack Dongarra, Dennis Gannon, Geoffrey Fox, and Ken Kennedy, CTWatch Quarterly, Volume 3 Number 1, February 2007, (Unreviewed). A pdf version is available. The Use of Bulk States to Accelerate the Band Edge State Calculation of a Semiconductor Quantum Dot, Christof Vomel, Stanimire Z. Tomov, LinWang Wang, Osni A. Marques, and Jack J. Dongarra, Journal of Computational Physics, Volume 223, Number 2, pp 774782, ISSN 00219991, 2007. A pdf version is available.  
2006 ConjugateGradient Eigenvalue Solvers in Computing Electronic Properties of Nanostructure Architectures, Stanimire Tomov, Julien Langou, Andrew Canning, LinWang Wang, and Jack Dongarra, The International Journal of Computational Science and Engineering, Volume 2, Number 3/4, pp 205212, 2006, ISSN 17427185. A pdf version is available. Design and Implementation of the HPC Challenge Benchmark Suite, Piotr Luszczek, Jack Dongarra, Jeremy Kepner, CTWatch Quarterly, November 2006, Volume 2, Number 4A, http://www.ctwatch.org/quarterly/archives/november2006/ (Unreviewed). A pdf version is available. NanoPSE: A Nanoscience Problem Solving Environment for Atomistic Electronic Structure of Semiconductor Nanostructures, W. B. Jones, G. Bester, A. Canning, A. Franceschetti, P. A. Graf, K. Kim, J. Langou, L.W. Wang, J. Dongarra, and A. Zunger, in "the Proceedings of Science Discovery through Advanced Computing (SciDAC 2005)", Journal of Physics: Conference Series 16, 277282, 2005. A pdf version is available. Predicting the Electronic Properties of 3D, MillionAtom Semiconductor Nanostructure Architectures, A. Zunger, A. Franceschetti, G. Bester, W.B. Jones, Kwiseon Kim, P. A. Graf, LW. Wang, A. Canning, O. Marques, C. Voemel, J. Dongarra, J. Langou and S. Tomov, Journal of Physics: 46 (2006) 292298. A pdf version is available. Scheduling Workflow Applications on Processors with Different Capabilities, Zhiao Shi and Jack Dongarra, Future Generation Computing Systems, Volume 22, pp 665675, 2006. A pdf version is available. Recent Developments in GridSolve, Asim YarKhan, Keith Seymour, Kiran Sagi, Zhiao Shi, and Jack Dongarra, International Journal of High Performance Applications and Supercomputing, Volume 20 Number 1 Spring 2006, ISSN 10943420, pp 131132. A pdf version is available. Self Adapting Numerical Software (SANS) Effort, George Bosilca, Zizhong Chen, Jack Dongarra, Victor Eijkhout, Graham E. 
Fagg, Erika Fuentes, Julien Langou, Piotr Luszczek, Jelena PjesivacGrbovic, Keith Seymour, Haihang You, and Sathish S. Vadhiyar, IBM Journal of Research and Development, pp. 223238, Volume 50, Number 2/3, 2006. A pdf version is available. Trends in HighPerformance Computing, Jack Dongarra, January/February 2006, IEEE Circuits & Devices Magazine, pp 2227, ISSN 87553996. A pdf version is available. TwentyPlus Years of Netlib and NANet, Part 1 and 2, SIAM News, pp 13, Volume 39, Number 3&4, April & May 2006 (Unreviewed news article). A pdf version is available.  2005 A Not So Simple Matter of Software, Jack Dongarra, NCSA Access, Summer 2005 (nonrefereed magazine publication). A pdf version is available. A Scalable Approach to MPI Application Performance Analysis, Shirley Moore, Felix Wolf, Jack Dongarra, Sameer Shende, Patricia Teller, and Bernd Mohr, Volume 3666, Recent Advances in Parallel Virtual Machine and Messaging Passing Interface Users' Group Meeting Euro PVMMPI 2005, pp 309316, Springer Heidelberg, 2005, ISSN: 03029743. A pdf version is available. An Asynchronous Algorithm on NetSolve Global Computing System, Jack Dongarra, Nahid Emad, S. A. Shahzadeh Fazeli, Future Generation Computing Systems, Vol. 22, No. 3, pp 279290, 2005. A pdf version is available. Biological Sequence Alignment on the Computational Grid using the GrADS Framework, Asim YarKhan and Jack Dongarra, Future Generation Computer Systems, Volume 21, Issue 6, pp 980986, June 2005. A pdf version is available. Condition Numbers of Gaussian Random Matrices, Zizhong Chen and Jack Dongarra, SIAM Matrix Analysis and Applications, Volume 27, Number 3, pp 603620, 2005. A pdf version is available. Evaluating Dynamic Communicators and OneSided Operations for Current MPI Libraries, Edgar Gabriel, Graham E. Fagg, and Jack J. Dongarra, International Journal of High Performance Computing Applications, Volume 19, Number 1, pp 6781, Spring 2005, ISSN 10943420. A pdf version is available. 
Hash Functions for Datatype Signatures in MPI, George Bosilca, Jack Dongarra, Graham Fagg, and Julien Langou, Lecture Notes in Computer Science, Volume 3666, Recent Advances in Parallel Virtual Machine and Messaging Passing Interface Users' Group Meeting Euro PVMMPI 2005, pp 7683, Springer Heidelberg, 2005, ISSN: 03029743. A pdf version is available. High Performance Computing: Clusters, Constellations, MPPs, and Future Directions, Jack Dongarra, Thomas Sterling, Horst Simon, and Erich Strohmaier, Computing in Science and Engineering, Volume 7, Number 2, March/April 2005, pp. 5159, ISSN 15219615. A pdf version is available. New Grid Scheduling and Rescheduling Methods in the GrADS Project, F. Berman, H. Casanova, A. Chien, K. Cooper, H. Dail, A. Dasgupta, W. Deng, J. Dongarra, L. Johnsson, K. Kennedy, C. Koelbel, B. Liu, X. Liu, A. Mandal, G. Marin, M. Mazina, J. MellorCrummey, C. Mendes, A. Olugbile, M. Patel, D. Reed, Z. Shi, O. Sievert, H. Xia, and A. YarKhan, International Journal of Parallel Programming, Vol. 33, No. 2, June 2005. A pdf version is available. Process FaultTolerance: Semantics, Design and Applications for High Performance Computing, Graham E. Fagg, Edgar Gabriel, Zizhong Chen, Thara Angskun, George Bosilca, Jelena PjesivacGrbovic, and Jack J. Dongarra, International Journal for High Performance Applications and Supercomputing, Vol. 19, No. 4, pp 465478, 2005. A pdf version is available. Recent Trends in the Marketplace of High Performance Computing, Erich Strohmaier, Jack J. Dongarra, Hans W. Meuer, and Horst D. Simon, Parallel Computing, Volume 31, Issues 34, pp 261273, MarchApril 2005. A pdf version is available. Scalable Fault Tolerant MPI: Extending the Recovery Algorithm, Graham E. Fagg, Thara Angskun, George Bosilca, Jelena PjesivacGrbovic, and Jack J. 
Dongarra, Lecture Notes in Computer Science, Volume 3666, Recent Advances in Parallel Virtual Machine and Message Passing Interface Users' Group Meeting Euro PVMMPI 2005, pp 6775, Springer Heidelberg, 2005, ISSN: 03029743. A pdf version is available. Scanning the Special Issue on Program Generation Optimization and Platform Adaptation, J.M.F. Moura, M. Puschel, D. Padua, and J. Dongarra, Proceedings of the IEEE, Volume 93, Number 2, February 2005, pp 211215, ISSN 00189219. A pdf version is available. Self Adapting Linear Algebra Algorithms and Software, Jim Demmel, Jack Dongarra, Victor Eijkhout, Erika Fuentes, Antoine Petitet, Rich Vuduc, R. Clint Whaley, Katherine Yelick, Proceedings of the IEEE, Volume 93, Number 2, February 2005, pp 293312, ISSN 00189219. A pdf version is available. Self Adaptivity in Grid Computing, S. Vadhiyar and J. Dongarra, Concurrency and Computation: Practice and Experience, Volume 17, Issue 24, 2005, pp. 235257. A pdf version is available. The Component Structure of a SelfAdapting Numerical Software System, Victor Eijkhout, Erika Fuentes, Thomas Eidson, and Jack Dongarra, International Journal of Parallel Programming, Vol. 33, No. 2, June 2005. A pdf version is available. The Top500 and Computational Science, A not so simple matter of software, Jack Dongarra, Scientific Computing, pp 1416, August 2005 (nonrefereed magazine publication). A pdf version is available.  2004 Simplified Grid Computing through Spreadsheets and NetSolve, David Abramson, Jack Dongarra, Eric Meek, Paul Roe, Zhiao Shi, High Performance Computing and Grid in Asia Pacific Region, 2004. Proceedings. Seventh International Conference, 2222 July 2004 DOI: 10.1109/HPCASIA.2004.1324012 A pdf version is available. Building and Using a Fault Tolerant MPI Implementation, Graham E. Fagg and Jack J. Dongarra, International Journal of High Performance Applications and Supercomputing, Volume 18, Number 3, Fall 2004, pp 353362, ISSN 10943420. A pdf version is available. 
GrADSolve  A Gridbased RPC system for Remote Invocation of Parallel Software, Sathish Vadhiyar and Jack Dongarra, Journal of Parallel and Distributed Computing, 64(6):774783, June 2004, ISSN 07437315. A pdf version is available. Self Adapting Software for Numerical Linear Algebra and LAPACK for Clusters, Z. Chen, J. Dongarra, P. Luszczek, and K. Roche, Parallel Computing 29(1112):17231743, November/December 2003, ISSN 01678191. A pdf version is available. The Virtual Instrument: Support for Gridenabled MCell Simulations, Henri Casanova, Thomas Bartol, Francine Berman, Erhan Gokcay, Adam Birnbaum, Jack Dongarra, Mark Ellisman, Marcio Faerman, Michelle Miller, Graziano Obertelli, Stuart Pomerantz, Terry Sejnowski, Joel Stiles, Rich Wolski, International Journal of High Performance Computing Applications, Volume 18, Number 1, Spring 2004, pp 318, ISSN 10943420. A pdf version is available. Toward an Accurate Model for Collective Communications, Sathish Vadhiyar, Graham Fagg, Jack Dongarra, International Journal of High Performance Computing Applications, Volume 18, Number 1, Spring 2004, pp 159166, ISSN 10943420. A pdf version is available. Trends in High Performance Computing, Jack Dongarra, The Computer Journal, 47(4):399403, The British Computer Society, 2004. A pdf version is available.  2003 Self Adaptability in Grid Computing, S. Vadhiyar and J. Dongarra, Concurrency and Computation: Practice and Experience, January 2003, ISSN 15320634. A pdf version is available. Selfadapting Numerical Algorithm for Next Generation Applications, J. Dongarra and V. Eijkhout, International Journal of High Performance Computing Applications 17(2):125132, Summer 2003, ISSN 10943420. A pdf version is available. Selfadapting Numerical Software and Automatic Tuning of Heuristics, Jack Dongarra and Victor Eijkhout, Lecture Notes in Computer Science, Volume 2660, SpringerVerlag Heidelberg, pp 759  770, ISSN: 03029743, June 2003. A pdf version is available. 
SRS: A Framework for Developing Malleable and Migratable Parallel Applications for Distributed Systems, S. S. Vadhiyar and J. J. Dongarra, Parallel Processing Letters 13(2):291312, June 2003, ISSN 01296264. A pdf version is available. The LINPACK Benchmark: Past, Present, and Future, J. J. Dongarra, P. Luszczek, and A. Petitet, Concurrency and Computation: Practice and Experience 15(9):803820, August 2003, ISSN 15320634. A pdf version is available.  2002 A Parallel Implementation of the Nonsymmetric QR Algorithm for Distributed Memory Architectures, G. Henry, D. Watkins, and J. Dongarra, SIAM Journal on Scientific Computing 24(1):284311, January 2003, ISSN 10648275. A pdf version is available. An Updated Set of Basic Linear Algebra Subprograms (BLAS), L. S. Blackford, J. Demmel, J. Dongarra, I. Duff, S. Hammarling, G. Henry, M. Heroux, L. Kaufman, A. Lumsdaine, A. Petitet, R. Pozo, K. Remington, and R. C. Whaley, ACM Transactions on Mathematical Software 28(2):135151, June 2002, ISSN 00983500. A pdf version is available. Automatic Translation of Fortran to JVM Bytecode, K. Seymour and J. Dongarra, Concurrency and Computation: Practice and Experience 15(35):207222, March/April 2003, ISSN 15320626 (print), 15320634 (electronic). A pdf version is available. Basic Linear Algebra Subprograms Technical (BLAST) Forum Standard, Special Issue  Part I, International Journal of High Performance Computing Applications 16(1):1111, Spring 2002, ISSN 10943420. A pdf version is available. Basic Linear Algebra Subprograms Technical (BLAST) Forum Standard, Special Issue  Part II, International Journal of High Performance Computing Applications 16(2):115199, Spring 2002, ISSN 10943420. A pdf version is available. HARNESS Fault Tolerant MPI Design, Usage and Performance Issues, G. E. Fagg and J. J. Dongarra, Future Generation Computer Systems 18(8):11271142, October 2002, ISSN 0167739X. A pdf version is available. Innovations of the NetSolve Grid Computing System, D. C. Arnold, H. 
Casanova, and J. Dongarra, Concurrency and Computation: Practice and Experience, Special Issue: Grid Computing Environments 14(1315):14571479, November/December 2002, ISSN 15320626 (print), 15320634 (electronic). A pdf version is available. Middleware for the Use of Storage in Communication, M. Beck, D. Arnold, A. Bassi, F. Berman, H. Casanova, J. Dongarra, T. Moore, G. Obertelli, J. Plank, M. Swany, S. Vadhiyar, and R. Wolski, Parallel Computing 28(12):17731788, December 2002, ISSN 01678191. A pdf version is available. NetBuild: Transparent CrossPlatform Access to Computational Software Libraries, K. Moore and J. Dongarra, Concurrency and Computation: Practice and Experience 14(1315):14451456, November/December 2002, ISSN 15320626 (print), 15320634 (electronic). A pdf version is available.  2001 A Comparison of Parallel Solvers for Diagonally Dominant and General NarrowBanded Linear Systems, P. Arbenz, A. Cleary, J. Dongarra, and M. Hegland, Parallel and Distributed Computing Practices, Special Issue: Parallel Numerical Linear Algebra 2(4):385400, November 1999, ISSN 10972803. A pdf version is available. Automated Empirical Optimization of Software and the ATLAS Project, R. Whaley, A. Petitet, and J. Dongarra, Parallel Computing 27(12):325, January 2001, ISSN 01678191. A pdf version is available. Biannual Top500 Computer Lists Track Changing Environments for Scientific Computing, J. Dongarra, H. Meuer, H. Simon, and E. Strohmaier, SIAM News 34(9), November 2001, ISSN 00361445. A pdf version is available. HARNESS and Fault Tolerant MPI, G. Fagg, A. Bukovsky, and J. Dongarra, Parallel Computing 27(11):14791496, October 2001, ISSN 01678191. A pdf version is available. High Performance Computing Trends, J. J. Dongarra, H. W. Meuer, H. D. Simon, and E. Strohmaier, HERMIS 2:155163, November 2001, ISSN 11087609. A pdf version is available. Iterative Solver Benchmark, J. Dongarra, V. Eijkhout, and H. van der Vorst, Scientific Programming 9(4):223231, 2001, ISSN 10589244. 
A pdf version is available. Measuring Computer Performance: A Practitioner's Guide, Book Review by D. Lilja, Cambridge University Press (ISBN 0521641055), SIAM Review 43(2):383384, 2001, ISSN 00361445. A pdf version is available. NetworkEnabled Solvers: A Step Toward GridBased Computing, J. Dongarra, SIAM News 34(10), December 2001, ISSN 00361445. A pdf version is available. Numerical Libraries and the Grid, A. Petitet, S. Blackford, J. Dongarra, B. Ellis, G. Fagg, K. Roche, and S. Vadhiyar, International Journal of High Performance Computing Applications 15(4):359374, Winter 2001, ISSN 10943420. A pdf version is available. Numerical Libraries and Tools for Scalable Parallel Cluster Computing, J. Dongarra, S. Moore, and A. Trefethen, International Journal of High Performance Computing Applications 15(2):175180, Summer 2001, ISSN 10943420. A pdf version is available. On the Convergence of Computational and Data Grids, D. C. Arnold, S. S. Vadhiyar, and J. J. Dongarra, Parallel Processing Letters 11(23):187202, June/September 2001, ISSN 01296264. A pdf version is available. Recursive Approach in Sparse Matrix LU Factorization, J. Dongarra, V. Eijkhout, and P. Luszczek, Scientific Programming 9(1):5160, 2001, ISSN 10589244. A pdf version is available. Telescoping Languages: A Strategy for Automatic Generation of Scientific ProblemSolving Systems from Annotated Libraries, K. Kennedy, B. Broom, K. Cooper, J. Dongarra, R. Fowler, D. Gannon, L. Johnsson, J. MellorCrummey, and L. Torczon, Journal of Parallel and Distributed Computing 61(12):18031826, December 2001, ISSN 07437315. A pdf version is available. The GrADS Project: Software Support for HighLevel Grid Application Development, F. Berman, A. Chien, K. Cooper, J. Dongarra, I. Foster, D. Gannon, L. Johnsson, K. Kennedy, C. Kesselman, J. MellorCrummey, D. Reed, L. Torczon, and R. Wolski, International Journal of High Performance Computing Applications 15(4):327344, Winter 2001, ISSN 10943420. 
A pdf version is available. The Quest for Petascale Computing, J. Dongarra and D. Walker, Computing in Science and Engineering 3(3):3239, May/June 2001, ISSN 15219615. A pdf version is available.  2000 A Portable Programming Interface for Performance Evaluation on Modern Processors, S. Browne, J Dongarra, N. Garner, G. Ho, and P. Mucci, International Journal of High Performance Computing Applications 14(3):189204, Fall 2000, ISSN 10943420. A pdf version is available. The Design And Implementation Of The Parallel OutOfCore Scalapack LU, QR, And Cholesky Factorization Routines, E. D'Azevedo and J. Dongarra, Concurrency: Practice and Experience 12(15):14811493, 2000, ISSN 10403108. A pdf version is available.  1999 A Comparison Of Parallel Solvers For General Narrow Banded Linear Systems, P. Arbenz, A. Cleary, J. Dongarra, and M. Hegland, Parallel and Distributed Computing Practices 2(4):385400, December 1999, ISSN 10972803. A pdf version is available. A Parallel Divide and Conquer algorithm for the Symmetric Eigenvalue Problem, F. Tisseur and J. Dongarra, SIAM Journal on Scientific Computing 6(20):22232236, 1999, ISSN 10648275. A pdf version is available. Adaptive Scheduling for Task Farming with Grid Middleware, H. Casanova, M. Kim, J. Plank, and J. Dongarra, International Journal of High Performance Computing Applications 13(3):231240, Fall 1999, ISSN 10943420. A pdf version is available. Algorithmic Issues on Heterogeneous Computing Platforms, Pierre Boulet, J. Dongarra, F. Rastello, Y. Robert, and F. Vivien, Parallel Processing Letters 9(2):197213, 1999, ISSN 01296264. A pdf version is available. Algorithmic Redistribution Methods for BlockCyclic Decompositions, A. P. Petitet and J. J. Dongarra, IEEE Transactions on Parallel and Distributed Systems 10(12):201220, 1999, ISSN 10459219. A pdf version is available. Atlanta Organizers Put Mathematics to Work For the Math Sciences Community, M. Berry and J. Dongarra, SIAM News 32(6), July/August 1999, ISSN 00361445. 
A pdf version is available. Deploying Fault Tolerance and Task Migration with NetSolve, J. S. Plank, H. Casanova, M. Beck, and J. J. Dongarra, Future Generation Computer Systems 15(56):745755, October 1999, ISSN 0167739X. A pdf version is available. Experiences with Windows NT as a Cluster Computing Platform for Parallel Computing, M. Fischer and J. Dongarra, Parallel and Distributed Computing Practices, Special Issue: Cluster Computing 2(2):119128, June 1999, ISSN 10972803. A pdf version is available. HARNESS: A Next Generation Distributed Virtual Machine, M. Beck, J. J. Dongarra, G. E. Fagg, G. A. Geist, P. Gray, J. Kohl, M. Migliardi, K. Moore, T. Moore, P. Papadopoulous, S. L. Scott, and V. Sunderam, Future Generation Computer Systems 15(56):571582, October 1999, ISSN 0167739X. A pdf version is available. JLAPACK  Compiling LAPACK Fortran to Java, D. Doolin, J. Dongarra, and K. Seymour, Scientific Programming 7(2):111138, 1999, ISSN 10589244. A pdf version is available. Logistical Quality of Service in NetSolve, M. Beck, H. Casanova, J. Dongarra, T. Moore, J. Plank, F. Berman, and R. Wolski, Computer Communications 22(11):10341044, 1999, ISSN 01403664. A pdf version is available. Numerical Linear Algebra Algorithms and Software, J. Dongarra and V. Eijkhout, Journal of Computational and Applied Mathematics 123(12):489514, November 1, 2000, ISSN 03770427. A pdf version is available. Scalable Networked Information Processing Environment (SNIPE), G. E. Fagg, K. Moore, and J. J. Dongarra, Future Generation Computer Systems 15(56):595605, October 1999, ISSN 0167739X. A pdf version is available. Static Tiling For Heterogeneous Computing Platforms, P. Boulet, J. Dongarra, Y. Robert, and F. Vivien, Parallel Computing 25(5):547568, 1999, ISSN 01678191. A pdf version is available. Stochastic Performance Prediction for Iterative Algorithms in Distributed Environments, H. Casanova, M. Thomason, and J. 
Dongarra, Journal of Parallel and Distributed Computing 58(1):6891, July 1999, ISSN 07437315. A pdf version is available. The Marketplace for HighPerformance Computers, E. Strohmaier, J. Dongarra, H. Meuer, and H. Simon, Parallel Computing 25(1314):15171545, December 1999, ISSN 01678191. A pdf version is available. Tiling On Systems with Communication/Computation Overlap, P.Y. Calland, J. Dongarra, and Y. Robert, Concurrency: Practice and Experience 11(3):139153, 1999, ISSN 10403108. A pdf version is available.  1998 Applying NetSolve's Network Enabled Server, H. Casanova and J. Dongarra, IEEE Computational Science and Engineering 5(3):5767, July/September 1998, ISSN 10709924. A pdf version is available. Determining the Idle Time of a Tiling: New Results, F. Desprez, J. Dongarra, F. Rastello, and Yves Robert, Journal of Computing and Information Science in Engineering (Special Issue on Compiler Techniques for HighPerformance Computing) 14(1):167190, March 1998, ISSN 15309827. A pdf version is available. Developing Numerical Libraries in Java, R. F. Boisvert, J. J. Dongarra, R. Pozo, K. A. Remington, and G. W. Stewart, Concurrency: Practice and Experience 10(1113):11171129, 1998, ISSN 10403108. A pdf version is available. National HPCC Software Exchange (NHSE): Uniting the High Performance Computing and Communications Community, S. Browne, J. Dongarra, J. Horner, P. McMahan, S. Wells, DLib Magazine (Electronic), May 1998, ISSN 10829873. A pdf version is available. Programming Tools and Environments, J. Saltz, A. Sussman, S. Graham, J. Demmel, S. Baden, and J. Dongarra, Communications of the ACM 41(11):6473, November 1998, ISSN 00010782 A pdf version is available. Scheduling BlockCyclic Array Redistribution, F. Desprez, J. Dongarra, A. Petitet, C. Randriamaro, and Y. Robert, IEEE Transactions on Parallel and Distributed Systems 9(2):192205, February 1998, ISSN 10459219. A pdf version is available. 
Using Agentbased Software for Scientific Computing in the NetSolve System, H. Casanova and J. Dongarra, Parallel Computing 24(1213):17771790, November 1998, ISSN 01678191. A pdf version is available.  1997 Changing Technologies of HPC, J. J. Dongarra, H. W. Meuer, H. D. Simon, and E. Strohmaier, Future Generation Computer Systems 12(5):461474, April 1997, ISSN 0167739X. A pdf version is available. Fault Tolerant Matrix Operations for Networks of Workstations Using Diskless Checkpointing, J. Plank, Y. Kim, and J. Dongarra, Journal of Parallel and Distributed Computing 43(2):125138, 1997, ISSN 07437315. A pdf version is available. Java Access to Numerical Libraries, H. Casanova, J. Dongarra, and D. Doolin, Concurrency: Practice and Experience 9(11):12791291, 1997, ISSN 10403108. A pdf version is available. Key Concepts for Parallel Out of Core LU Factorization, J. Dongarra, S. Hammarling, and D. Walker, Parallel Computing 23(12):4970, April 1997, ISSN 01678191. A pdf version is available. MessagePassing Performance of Various Computers, J. Dongarra and T. Dunigan, Concurrency: Practice and Experience 9(10):915926, 1997, ISSN 10403108. A pdf version is available. NetSolve: A NetworkEnabled Server for Solving Computational Science Problems, H. Casanova and J. Dongarra, The International Journal of Supercomputer Applications and High Performance Computing 11(3):212223, Fall 1997, ISSN 10783482. A pdf version is available. Practical Experience in the Numerical Dangers of Heterogeneous Computing, L. S. Blackford, A. Cleary, J. Demmel, J. Dongarra, I. Dhillon, S. Hammarling, A. Petitet, H. Ren, K. Stanley, and R. C. Whaley, ACM Transactions on Mathematical Software 23(2):133147, June 1997, ISSN 00983500. A pdf version is available. The Spectral Decomposition of Nonsymmetric Matrices on Distributed Memory Computers, J. Bai, J. Demmel, J. Dongarra, A. Petitet, H. Robinson, and K. Stanley, SIAM Journal on Scientific Computing 18(5):14461461, 1997, ISSN 01965204. 
A pdf version is available. Top500 Supercomputer Sites, J. Dongarra, H. W. Meuer and E. Strohmaier, Supercomputer 67:89120, 1997, ISSN 01687875. A pdf version is available.  1996 A Message Passing Standard for MPP and Workstations, J. Dongarra, S. W. Otto, M. Snir, and D. Walker, Communications of the ACM 39(7):8490, July 1996, ISSN 00010782. A pdf version is available. Algorithmic Bombardment for the Iterative Solution of Linear Systems: A PolyIterative Approach, R. Barrett, M. Berry, J. Dongarra, V. Eijkhout, and C. Romine, Journal of Computational and Applied Mathematics 74(12):91110, November 1996, ISSN 03770427. A pdf version is available. Chebyshev tau  QZ Algorithm Methods for Calculating Spectra of Hydrodynamic Stability Problems, J. Dongarra, B. Straughan and D. W. Walker, Applied Numerical Mathematics 22(4):399435, 1996, ISSN 01689274. A pdf version is available. Future Linear Algebra Libraries, J. Dongarra, IEEE Computational Science and Engineering 3(2):3840, Summer 1996, ISSN 10709924. A pdf version is available. LAPACK for Fortran90, J. Dongarra, J. Du Croz, S. Hammarling, J. Wasniewski, A. Zemla, Applied Mathematics and Computer Science 6(2):101109, 1996, ISSN 1641876X. A pdf version is available. MPI: A Standard Message Passing Interface, J. Dongarra and D. Walker, Supercomputer 12(1):5668, January 1996, ISSN 01687875. Overview of HighPerformance Computers, A. van der Steen and J. Dongarra, Electronic Journal of the NHSE Review 1(1), 1996, HTML. PBBLAS: A Set of Parallel Block Basic Linear Algebra Subroutines, J. Choi, J. Dongarra, and D. Walker, Concurrency: Practice and Experience 8(7):517535, September 1996, ISSN 10403108. A pdf version is available. PVMPI: An Integration of PVM and MPI Systems, G. Fagg and J. Dongarra, Calculateurs Parallèles 8(2):151166, 1996, Hermes, ISSN 12603198. A pdf version is available. ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers  Design Issues and Performance, J. Choi, J. Demmel, J. 
Dongarra, I. Dhillon, S. Ostrouchov, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley, Computer Physics Communications 97(12):115, August 1996, ISSN 00104655. A pdf version is available. The Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines, J. Choi, J. J. Dongarra, L. S. Ostrouchov, A. P. Petitet, D. W. Walker and R. C. Whaley, Scientific Programming 5(3):173184, Fall 1996, ISSN 10589244. A pdf version is available.  1995 A Highly Parallel Algorithm for the Reduction of a Nonsymmetric Matrix to Block UpperHessenberg Form, M. W. Berry, J. Dongarra, and Y. Kim, Parallel Computing 21(8):11891212, August 1995, ISSN 01678191. A pdf version is available. Parallel Matrix Transpose Algorithms on Distributed Memory Concurrent Computers, J. Choi, J. Dongarra, and D. Walker, Parallel Computing 21(9):13871405, 1995, ISSN 01678191. A pdf version is available. Performance Study of LU Factorization with Low Communication Overhead on Multiprocessors, F. Desprez, J. Dongarra, and B. Tourancheau, Parallel Processing Letters 5(2):157169, June 1995, ISSN 01296264. A pdf version is available. Recent Enhancements to PVM, A. Beguelin, J. Dongarra, A. Geist, R. Manchek, and V. Sunderam, International Journal of Supercomputer Applications and High Performance Computing 9(2):108127, Summer 1995, ISSN 10783482. A pdf version is available. Software Distribution Using XNETLIB, J. Dongarra, T. Rowan and R. Wade, ACM Transactions on Mathematical Software 21(1):7988, March 1995, ISSN 00983500. A pdf version is available. Software Libraries for Linear Algebra Computations on High Performance Computers, J. Dongarra and D. Walker, SIAM Review 37(2):151180, June 1995, ISSN 00361445. A pdf version is available. The Design of a Parallel, Dense Linear Algebra Software Library: Reduction to Hessenberg, Tridiagonal, and Bidiagonal Form, J. Choi, J. Dongarra, and D. Walker, Numerical Algorithms 10(34):379400, 1995, ISSN 10171398. A pdf version is available. 
The National HPCC Software Exchange, S. Browne, J. Dongarra, S. Green, K. Moore, T. Rowan, R. Wade, G. Fox, K. Hawick K. Kennedy, J. Pool, R. Stevens, B. Olsen, and T. Disz, IEEE Computational Science and Engineering 2(2):6269, Summer 1995, ISSN 10709924. A pdf version is available. The Netlib Mathematical Software Repository, S. Browne, J. Dongarra, E. Grosse, and T. Rowan, DLib Magazine, Electronic Journal, September 1995, ISSN 10829873, http://www.dlib.org/dlib/september95/netlib/09browne.html. A pdf version is available. The ParkBench Benchmark Collection, J. Dongarra and T. Hey, Supercomputer 11(23):94115, June 1995, ISSN 01687875. Top500 Supercomputer Sites, J. Dongarra, H. Meuer and E. Strohmaier, Supercomputer 11(23):133194, June 1995, ISSN 01687875. A pdf version is available.  1994 CRPC Research into Linear Algebra Software for HighPerformance Computers, J. Choi, J. J. Dongarra, R. Pozo, D. C. Sorensen, and D. W. Walker, International Journal of Supercomputing Applications 8(2):99118, Summer 1994, ISSN 08902720. A pdf version is available. Experiences with CODE and HeNCE in Visual Programming for Parallel Computing, J. C. Browne, J. Dongarra, S. I. Hyder, K. Moore, and P. Newton, IEEE Parallel and Distributed Technology 3(1):7583, Spring 1994, ISSN 10636552. A pdf version is available. HeNCE: A Heterogeneous Network Computing Environment, A. Beguelin, J. J. Dongarra, G. A. Geist, R. Manchek, and K. Moore, Scientific Programming 3(1):4960, Spring 1994, ISSN 10589244. A pdf version is available. MPI: A Message Passing Interface Standard, Special Issue, International Journal of Supercomputer Applications 8(34):159416, Fall/Winter 1994, ISSN 08902720. A pdf version is available. PARKBENCH Report  1: Public International Benchmarks for Parallel Computers, PARKBENCH Committee (assembled by R. Hockney and M. Berry, with contributions from D. Bailey, M. Berry, J. Dongarra, V. Getov, T. Haupt, T. Hey, R. Hockney, and D. 
Walker), Scientific Programming 3(2):101146, 1994, ISSN 10599244. A pdf version is available. PDS: A Performance Database Server, M. W. Berry, J. Dongarra, B. H. LaRose, and T. Letsche, Scientific Programming 3(2):147156, 1994, ISSN 10599244. A pdf version is available. PUMMA: Parallel Universal Matrix Multiplication Algorithms on Distributed Memory Concurrent Computers, J. Choi, J. J. Dongarra, and D. W. Walker, Concurrency: Practice and Experience 6(7):543570, October 1994, ISSN 10403108. A pdf version is available. Scalability Issues in the Design of a Library for Dense Linear Algebra, J. J. Dongarra, R. A. van de Geijn, and D. W. Walker, Journal of Parallel and Distributed Computing 22(3):523537, September 1994, ISSN 07437315. A pdf version is available. The PVM Concurrent Computing System: Evolution, Experiences, and Trends, V. S. Sunderam, J. Dongarra, G. A. Geist, and R Manchek, Parallel Computing 20(4):531545, March 31, 1994, ISSN 01678191. A pdf version is available.  1993 A Parallel Algorithm for the NonSymmetric Eigenvalue Problem, J. J. Dongarra and M. Sidani, SIAM Journal on Scientific Computing 14(3):542569, May 1993, ISSN 10648275. A pdf version is available. Integrated PVM Framework Supports Heterogeneous Network Computing, J. Dongarra, G. A. Geist, R. Manchek, and V. S. Sunderam, Computers in Physics 7(2):166175, April 1993, ISSN 08956111. A pdf version is available. Linear Algebra Libraries for HighPerformance Computers: A Personal Perspective, J. Dongarra, IEEE Parallel and Distributed Technology: Systems and Applications 1(1):1724, February 1993, ISSN 10636552. A pdf version is available. Performance of LAPACK: A Portable Library of Numerical Linear Algebra Routines, E. C. Anderson and J. Dongarra, Proceedings of the IEEE 81(8):10941102, August 1993, ISSN 00189219. A pdf version is available. Supporting Heterogeneous Network Computing: PVM, J. Dongarra, A. Geist, R. Manchek, and V. 
Sunderam, Chemical Design Automation News 8(910):3642, September/October 1993, ISSN 08866716. A pdf version is available. Visualization and Debugging in a Heterogeneous Environment, A. Beguelin, J. Dongarra, A. Geist, and V. Sunderam, IEEE Computer 26(6):8895, June 1993, ISSN 00189162. A pdf version is available.  1992 ALGORITHM 710; FORTRAN Subroutines for Computing the Eigenvalues and Eigenvectors of a General Matrix by Reduction to General Tridiagonal Form, J. J. Dongarra, G. A. Geist, and C. H. Romine, ACM Transactions on Mathematical Software 18(4):392400, December 1992, ISSN 00983500. A pdf version is available. Generalized QR Factorization and Its Applications, E. Anderson, Z. Bai, and J. Dongarra, Linear Algebra and Its Applications 162164:243271, February 1992, ISSN 00243795. A pdf version is available. Numerical Considerations in Computing Invariant Subspaces, J. J. Dongarra, S. Hammarling and J. H. Wilkinson, SIAM Journal on Matrix Analysis and Applications 13(1):145161, January 1992, ISSN 08954798. A pdf version is available. Performance of Various Computers Using Standard Sparse Linear Equations Solving Techniques, J. J. Dongarra and H. A. van der Vorst, Supercomputer 9(5):1729, September 1992, ISSN 01687875. A pdf version is available. Reduction to Condensed Form for the Eigenvalue Problem on Distributed Memory Architectures, J. J. Dongarra and R. A. van de Geijn, Parallel Computing 18(9):973982, September 1992, ISSN 01678191. A pdf version is available.  1991 A Comparative Study of Automatic Vectorizing Compilers, D. Levine, D. Callahan, and J. Dongarra, Parallel Computing, 17(1011):12231244, December 1991, ISSN 01678191. A pdf version is available. Opening the Door to Heterogeneous Network Supercomputing, A. Beguelin, J. Dongarra, A. Geist, R. Manchek, and V. Sunderam, Supercomputing Review 4(9):4445, September 1991, ISSN 10486836. A pdf version is available. 
Parallel Loops  A Test Suite for Parallelizing Compilers: Description and Example Results, J. Dongarra, M. Furtney, S. Reinhardt and J. Russell, Parallel Computing 17(1011):12471257, December 1991, ISSN 01678191. A pdf version is available. Special Report: 1990 Gordon Bell Prize Winners, J. Dongarra, A. H. Karp, K. Miura, and H. Simon, IEEE Software 8(3):9297, 102, May/June 1991, ISSN 07407459. A pdf version is available. The IBM RISC System/6000 and Linear Algebra Operations, J. Dongarra, P. Mayes and G. Radicati di Brozolo, Supercomputer 8(4):1530, July 1991, ISSN 01687875. A pdf version is available.  1990 A Set of Level 3 Basic Linear Algebra Subprograms, J. J. Dongarra, J. Du Croz, S. Hammarling, and I. S. Duff, ACM Transactions on Mathematical Software 16(1):117, March 1990, ISSN 00983500. A pdf version is available. A Tool to Aid in the Design, Implementation, and Understanding of Matrix Algorithms for Parallel Processors, J. Dongarra, O. Brewer, J. A. Kohl, and S. Fineberg, Journal of Parallel and Distributed Computing 9(2):185202, June 1990, ISSN 07437315. A pdf version is available. Algorithm 679; A Set of Level 3 Basic Linear Algebra Subprogram: Model Implementation and Test Programs, J. J. Dongarra, J. Du Croz, S. Hammarling, and I. S. Duff, ACM Transactions on Mathematical Software 16(1):1828, March 1990, ISSN 00983500. A pdf version is available.  1989 Block Reduction of Matrices to Condensed Forms for Eigenvalue Computations, J. J. Dongarra, S. J. Hammarling, and D. C. Sorensen, Journal of Computational and Applied Mathematics 27(12):215227, September 1989, ISSN 03770427. A pdf version is available. Shopping for Mathematical Software Electronically, J. Dongarra and E. Grosse, IEEE Potentials 8(1):3738, February 1989, ISSN 02786648. A pdf version is available.  1988 Algorithm 656: An Extended Set of Basic Linear Algebra Subprograms: Model Implementation and Test Programs, J. J. Dongarra, J. Du Croz, S. Hammarling, R. J. 
Hanson, ACM Transactions on Mathematical Software 14(1):1832, March 1988, ISSN 00983500. A pdf version is available. An Extended Set of Fortran Basic Linear Algebra Subprograms, J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson, ACM Transactions on Mathematical Software 14(1): 117, March 1988, ISSN 00983500. A pdf version is available. Programming Methodology and Performance Issues for Advanced Computer Architectures, J. J. Dongarra, D. C. Sorensen, K. Connolly, and J. Patterson, Parallel Computing 8(13):4158, October 1988, ISSN 01678191. A pdf version is available. Tools to Aid in the Analysis of Memory Access Patterns for FORTRAN Programs, O. Brewer, J. Dongarra, and D. Sorensen, Parallel Computing 9(1):2535, December 1988, ISSN 01678191. A pdf version is available.  1987 A Fully Parallel Algorithm for the Symmetric Eigenvalue Problem, J. J. Dongarra and D. C. Sorensen, SIAM Journal on Scientific and Statistical Computing 8(2):139154, March 1987, ISSN 01965204. A pdf version is available. A Portable Environment for Developing Parallel FORTRAN Programs, J. J. Dongarra and D. C. Sorensen, Parallel Computing 5(12):175186, July 1987, ISSN 01678191. A pdf version is available. Computer Benchmarking: Paths and Pitfalls, J. Dongarra, J. Martin, and J. Worlton, IEEE Spectrum 24(7): 3843, June 1987, ISSN 00189235. A pdf version is available. Distribution of Mathematical Software via Electronic Mail, J. J. Dongarra and E. Grosse, Communications of the ACM 30(5):403407, May 1987, ISSN 00010782. A pdf version is available. Solving Banded Systems on a Parallel Processor, J. J. Dongarra and L. Johnsson, Parallel Computing 5(12):219246, July 1987, ISSN 01678191. A pdf version is available.  1986 How Do the "Minisupers" Stack Up?, J. J. Dongarra, IEEE Computer 19(3):93, 100, March 1986, ISSN 00189162. A pdf version is available. Implementing Dense Linear Algebra Algorithms Using Multitasking on the CRAY XMP4 (Or Approaching the Gigaflop), J. J. Dongarra and T. 
Hewitt, SIAM Journal on Statistical and Scientific Computing 7(1):347350, January 1986, ISSN 01965204. A pdf version is available. Implementation of Some Concurrent Algorithms for Matrix Factorization, J. J. Dongarra, A. H. Sameh, and D. C. Sorensen, Parallel Computing 3(1):2534, March 1986, ISSN 01678191. A pdf version is available. Linear Algebra on HighPerformance Computers, J. Dongarra and D. Sorensen, Applied Mathematics and Computation 20(12):5788, September 1986, ISSN 00963003. A pdf version is available. Squeezing the Most out of High Performance Computers for Finding the Eigenvalues, J. Dongarra, L. Kaufman, and S. Hammarling, Linear Algebra and Its Applications 77:113136, May 1986, ISSN 00243795. A pdf version is available.  1985 A Proposal for an Extended Set of Fortran Basic Linear Algebra Subprograms, J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson, ACM SIGNUM Newsletter 20(1):218, January 1985, ISSN 01635778. A pdf version is available. Algorithm Design for Different Computer Architectures, J. J. Dongarra, B. T. Smith, and D. Sorensen, IEEE Software 2(4):7980, July 1985. A pdf version is available.  1984 A Collection of Parallel Linear Equations Routines for the Denelcor HEP, J. J. Dongarra and R. E Hiromoto, Parallel Computing 1(2):133142, December 1984, ISSN 01678191. A pdf version is available. EISPACK  A Collection for Solving Eigenvalue Problems, J. Dongarra and C. Moler, in Sources and Development of Mathematical Software, W. R. Cowell, ed., pp. 6887, PrenticeHall: Upper Saddle River, NY, 1984, ISBN 0138235015. A pdf version is available. Implementing Linear Algebra Algorithms for Dense Matrices on a Vector Pipeline Machine, J. J. Dongarra, F. G. Gustavson and A. Karp, SIAM Review 26(1):91112, January 1984, ISSN 00361445. A pdf version is available. Multiprocessing Linear Algebra Algorithms on the CRAY XMP2: Experiences with Small Granularity, S. S. Chen, J. J. Dongarra, and C. C. 
Hsiung, Journal of Parallel and Distributed Computing 1(1):22-31, August 1984, ISSN 0743-7315. A pdf version is available.
On Some Parallel Banded System Solvers, J. J. Dongarra and A. H. Sameh, Parallel Computing 1(3):223-235, December 1984. A pdf version is available.
Performances comparés de 80 ordinateurs sur des programmes Fortran, J. J. Dongarra, Technique et Science Informatiques 3(5):355-360, 1984, ISSN 0752-4072. A pdf version is available.
Solving the Secular Equation Including Spin Orbit Coupling for Systems with Inversion and Time Reversal Symmetry, J. J. Dongarra, J. R. Gabriel, D. D. Koelling, and J. H. Wilkinson, Journal of Computational Physics 54(2):278-288, May 1984, ISSN 0021-9991. A pdf version is available.
Squeezing the Most out of an Algorithm in CRAY FORTRAN, J. J. Dongarra and S. C. Eisenstat, ACM Transactions on Mathematical Software 10(3):219-230, September 1984, ISSN 0098-3500. A pdf version is available.
The Eigenvalue Problem for Hermitian Matrices with Time Reversal Symmetry, J. J. Dongarra, J. R. Gabriel, D. D. Koelling, and J. H. Wilkinson, Linear Algebra and Its Applications 60:27-42, August 1984, ISSN 0024-3795. A pdf version is available.

1983
Improving the Accuracy of Computed Eigenvalues and Eigenvectors, J. J. Dongarra, C. B. Moler, and J. H. Wilkinson, SIAM Journal on Numerical Analysis 20(1):23-45, February 1983, ISSN 0036-1429. A pdf version is available.
Improving the Accuracy of Computed Singular Values, J. J. Dongarra, SIAM Journal on Scientific and Statistical Computing 4(4):712-719, December 1983, ISSN 0196-5204. A pdf version is available.
Performance of Various Computers Using Standard Linear Equations Software in a Fortran Environment, J. J. Dongarra, ACM SIGARCH Computer Architecture News 11(5):22-27, December 1983, ISSN 0163-5964. A pdf version is available.

1982
Algorithm 589: SICEDR: A FORTRAN Subroutine for Improving the Accuracy of Computed Matrix Eigenvalues, J. J.
Dongarra, ACM Transactions on Mathematical Software 8(4):371-375, December 1982, ISSN 0098-3500. A pdf version is available.

1979
Unrolling Loops in Fortran, J. Dongarra and A. R. Hinds, Software: Practice and Experience 9(3):219-226, March 1979, ISSN 0038-0644. A pdf version is available.



