LAPACK Wishlist
updated on Tue December 10 2013
maintained by J. Langou, U. Colorado Denver
maintained by J. Langou, U. of Tennessee

(*) remove unnecessary transpositions from lapacke_?_work layer
  o Lawrence Mulholland, NAG, DEC-10-2013
  o use tricks à la CBLAS layer to remove unnecessary transpositions from lapacke
  o see: forum topic 4469

(*) add inplace transposition algorithm
  o Julien, DEC-10-2013
  o add inplace transposition algorithm to LAPACK and use it in LAPACKE (if appropriate)
  o see: forum topic 4469

(*) ScaLAPACK :: PDLARFB
  o Keita Teranishi, Cray, 16-12-10
  o fact: PDLARFB does not use PBLAS (it relies on BLACS and BLAS)
  o todo: investigate why and, if using PBLAS is better, use PBLAS ...
  o see: forum topic 2094

(*) LAPACK/ScaLAPACK
  o Nichols A. Romero, Argonne Leadership Computing Facility, 12-21-10
  o include Jack Poulson's PDSYNTRD algorithm (see his master's thesis)
  o include "faster Householder algorithm" (see "Accumulating Householder Transformations, Revisited" by Joffrain, Low, Quintana-Ortí, van de Geijn, and Van Zee)
  o see: forum topic 2113

(*) LAPACK :: DSPEVR routine
  Currently, LAPACK contains many flavors of driver routines to solve eigenproblems, for example:
    DSYEV{D/R/X}
    DSTEV{D/R/X}
    DSPEV{D/X}
  DSPEVR is missing from this list.
  Request made by a user on the LAPACK mailing list: see email

(*) ScaLAPACK :: P[SDCZ]LATRS is not the ScaLAPACK equivalent of LAPACK [SDCZ]LATRS
  o Jill Reese, Mathworks, 04/13/2010
  o ScaLAPACK P[SDCZ]LATRS is not the ScaLAPACK equivalent of LAPACK [SDCZ]LATRS: it is a wrapper on top of P[SDCZ]TRSV. In other words, there is no check to prevent possible overflow.
  o As a consequence, the numerical behavior of the LAPACK and ScaLAPACK routines can be quite different.

(*) LAPACK :: support multiple couples (c,d) in xGGLSE
  o date: Sep 06 2009, "kyewong"
  o problem description: The interface of xGGLSE only supports one couple (c,d).
    When solving a linear equality-constrained least squares problem, most of the work is in the factorization of the matrices A and B (with xGGRQF). If you have several couples (c,d), you do not want to repeat the factorization for each couple; you want to reuse it.
  o learn more: see http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=2&t=1615
  o This is a rather easy task. Do not forget to add testing of the new functionality in the TESTING directory!

(*) LAPACK :: interface (and/or) source code with 64-bit integer
(*) ScaLAPACK :: interface (and/or) source code with 64-bit integer
  o interfaces and source code using 64-bit integers would fix bug0020
  o source code using 64-bit integers would enable the packed format routines to work for N > sqrt(2^31)
  o interfaces using 64-bit integers are needed for N > 2^31 (possible for large vectors in the ScaLAPACK context)
  o possibility of compiling LAPACK with a flag that sets all integers to 64-bit

(*) LAPACK :: New routines: ILAxLV, scan a vector for trailing zeros.
(*) LAPACK :: Use the vector scanning routines in xLARFG and xLARFP.
  o see Jason's email: jason_20090323_001.txt and jason_20090323_002.txt

(*) LAPACK :: multishift QZ with early aggressive deflation
  o Bo Kågström and Daniel Kressner. Multishift variants of the QZ algorithm with aggressive early deflation. SIAM J. Matrix Anal. Appl., 29(1):199-227, 2006.
  o This will also fix the problem described in: http://www-math.cudenver.edu/~langou/lapack-3.2/lapack_known_issues.html#QZ

(*) LAPACK :: block reordering algorithm
  o Daniel Kressner. Block algorithms for reordering standard and generalized Schur forms. ACM Trans. Math. Software, 32(4):521-532, 2006.

(*) LAPACK :: Extra-precise iterative refinement for overdetermined least squares
  o James Demmel, Yozo Hida, Xiaoye S. Li, and E. Jason Riedy. Extra-precise Iterative Refinement for Overdetermined Least Squares Problems.
    LAPACK Working Note 188, May 2007.

(*) LAPACK :: accurate and efficient Givens rotations
  o David Bindel, James Demmel, William Kahan, and Osni Marques. On computing Givens rotations reliably and efficiently. ACM Transactions on Mathematical Software (TOMS), Volume 28, Issue 2, 2002, pages 206-238.
  o http://www.cs.berkeley.edu/~demmel/Givens/

(*) LAPACK :: BLAS 2.5
  o Gary W. Howell, James Demmel, Charles T. Fulton, Sven Hammarling, and Karen Marmol. Cache efficient bidiagonalization using BLAS 2.5 operators. ACM Transactions on Mathematical Software (TOMS), Volume 34, Issue 3, 2008.

(*) LAPACK :: support more matrix types for extra-precise iterative refinement.
  o Matrix types SB (symmetric band), PB (positive definite band), HB (Hermitian band), and packed storage. Tridiagonal types such as GT (general tridiagonal) are also on the wish list, but first we need to derive adequate test cases.

(*) LAPACK :: make xLARFB thread friendly
  o Take into account the comments of Robert van de Geijn (Univ. of Texas at Austin) concerning the interface of xLARFB. The interface labels V as input/output, which is not at all convenient for multithreaded implementations.

(*) LAPACK :: Change the default Cholesky factorization in SRC from right-looking to left-looking
  o Change the default Cholesky factorization in SRC from right-looking to left-looking, move left-looking to the VARIANTS directory.

(*) LAPACK :: Add some recursive variants for QR and Cholesky.
  o Add some recursive variants for QR and Cholesky.

(*) LAPACK :: Remove IEEE=.FALSE. in DQDS
  o See: http://www.netlib.org/lapack/lapack-3.2.html#_9_10_bug_fixes_for_the_bidiagonal_svd_routine_that_fix_some_rare_convergence_failures
  o Remove IEEE=.FALSE. in DQDS (DLASQ3, SLASQ3 of Osni).
  o Note: Collin Engstrom sent us an email on Tue 21 Jul 2009 to tell us that he was able to reproduce the numerical failure as described with LAPACK but not with CLAPACK. Sounds like fun to debug!
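
Note on the 64-bit integer items: the thresholds quoted there can be checked with a little arithmetic. The sketch below is in Python and is not part of LAPACK. It assumes that the default INTEGER is a 32-bit signed type and that packed-storage index computations form a product of order N*(N+1) before halving it; the second point is an assumption about typical code, not a statement about any specific routine. All names (packed_length, fits_int32, largest_n) are hypothetical.

```python
INT32_MAX = 2**31 - 1  # largest value of a 32-bit signed integer (assumed default INTEGER)


def packed_length(n: int) -> int:
    """Number of stored elements of an n-by-n matrix in packed storage."""
    return n * (n + 1) // 2


def fits_int32(value: int) -> bool:
    """True if a non-negative value is representable as a 32-bit signed integer."""
    return 0 <= value <= INT32_MAX


def largest_n(quantity) -> int:
    """Largest n >= 0 for which quantity(n) fits in 32 bits.

    Assumes quantity is increasing in n and that quantity(0) fits.
    """
    lo, hi = 0, 1
    while fits_int32(quantity(hi)):  # grow hi until quantity(hi) overflows
        hi *= 2
    while hi - lo > 1:  # bisect: quantity(lo) fits, quantity(hi) does not
        mid = (lo + hi) // 2
        if fits_int32(quantity(mid)):
            lo = mid
        else:
            hi = mid
    return lo


# Largest N whose intermediate product N*(N+1) still fits in 32 bits.
N_PRODUCT = largest_n(lambda n: n * (n + 1))
# Largest N whose packed array length N*(N+1)/2 still fits in 32 bits.
N_LENGTH = largest_n(packed_length)
```

Under these assumptions N_PRODUCT is 46340, which is the integer part of sqrt(2^31), and N_LENGTH is 65535. Beyond the first bound an index expression can overflow even though the packed array itself is still addressable; beyond the second, the array length itself no longer fits.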