===================
ERRATA in ScaLAPACK
===================

VERSION 1.0 : February 28, 1995
PATCHES:
VERSION 1.1 : March 20, 1995
VERSION 1.2 : May 10, 1996
PATCHES:
VERSION 1.3 : June 5, 1996
VERSION 1.4 : November 17, 1996
VERSION 1.5 : May 1, 1997
PATCHES:
VERSION 1.6 : November 15, 1997

DATE: July 31, 2001

This errata file lists errata for the ScaLAPACK Users' Guide and
errata for the ScaLAPACK code itself.  It also lists machine-specific
installation hints, details of the updates, and any outstanding bugs
to be fixed in the upcoming release.

===============================================
Errata in ScaLAPACK Users' Guide, First Edition
===============================================

page
----

8:      Change to netlib format ( .tar.gz --> .tgz )
        Section 1.7
        change scalapack.tar.gz -> scalapack.tgz

9:      Change to netlib format ( .tar.gz --> .tgz )
        Section 1.8
        change manpages.tar.gz -> manpages.tgz

13-14:  Change to netlib format ( .tar.gz --> .tgz )
        Section 2.1, numerous references to .tar.gz files must be
        replaced by .tgz

14:     Change to netlib format ( .shar -> .tgz )
            http://www.netlib.org/blas/blas.shar
        should be changed to
            http://www.netlib.org/blas/blas.tgz
        and
            sh blas.shar
        should be changed to
            gunzip -c blas.tgz | tar xvf -

20:     Typographical error in the 8th row of the coefficient matrix A.
        The 8th row is listed as
            -s -c -a -l -a -p -a c k
        and should be replaced by:
            -s +c -a -l -a -p -a c k
        Likewise, this correction must be propagated to the 8th row of
        Figure 2.1.
            -s +c -a -l -a -p -a c k

72:     Section 4.4.2
        In the text beginning "Table 4.10 illustrates ...", it
        erroneously says NB=3.  This should be NB=8, as noted in the
        caption of Table 4.10.

90:     Section 4.6.7, third paragraph
        inherit --> inherent

116:    Erroneous use of BLAS instead of BLACS in the first sentence of
        Section 5.4.1.  The first sentence should say:
        "Users should choose vendor-supplied BLACS optimized for their
        computer; these BLACS will be the fastest BLACS implementation."

===================
Errata in ScaLAPACK
===================

We have tested ScaLAPACK on heterogeneous clusters of workstations
using pvm3.4 and mpich version 1.1, and on the IBM SP-2, Intel Paragon,
SGI Power Challenge, SGI Origin 2000, and Cray T3E.  We tested on the
IBM SP-2 using IBM MPI, on the SGI Power Challenge and Origin 2000
using SGI MPI v3.0, and on the Cray T3E using Cray MPI
(mpt.1.2.0.0.6beta).  The workstations used were AIX46K, ALPHA, HPPA,
LINUX, SGI64, SUN4, and SUN4SOL2.

*** !! IMPORTANT !! *********************************************

Note to PGF77 and NAGF90 compiler users:

We have unconfirmed reports of Fortran/C interoperability problems
(string handling) with ScaLAPACK on the nagf90 and pgf77 compilers.
Until this is resolved, users should avoid these two compilers in
conjunction with ScaLAPACK and the BLACS.

*** !! IMPORTANT !! *********************************************

#######################################
# MACHINE-SPECIFIC INSTALLATION HINTS #
#######################################

1.  For HPPA, you may need to compile all routines in
    SCALAPACK/PBLAS/SRC/INTERNAL/PBBLAS/ with NO optimization.
    In addition, the user must modify SCALAPACK/SRC/pslabad.f and
    pdlabad.f to remove the IF-conditional and always take the
    square root.

2.  For NEC SX-4, the user must modify SCALAPACK/SRC/pslabad.f and
    pdlabad.f to remove the IF-conditional and always take the
    square root.  In addition, you must include the -DNO_IEEE flag in
    the CDEFS definition in the SLmake.inc.

3.  For Cray T3E, the user must modify SCALAPACK/SRC/pslabad.f and
    pdlabad.f to remove the IF-conditional and always take the
    square root.  In addition, you must include the -DNO_IEEE flag in
    the CDEFS definition in the SLmake.inc.

    Also for the Cray T3E ONLY, the user must reset the defined size
    of an integer and a real in the ScaLAPACK Test Suite.
    Set the variables INTGSZ = 8, REALSZ = 8, and CPLXSZ = 16 in all
    SCALAPACK/TESTING/LIN/p*driver.f,
    SCALAPACK/TESTING/EIG/p*driver.f,
    SCALAPACK/TESTING/EIG/p*sepreq.f,
    SCALAPACK/TESTING/EIG/p*gsepreq.f,
    SCALAPACK/PBLAS/TESTING/p*tst.f, and
    SCALAPACK/PBLAS/TIMING/p*tim.f
    testing/timing programs!  By default, INTGSZ = 4, REALSZ = 4, and
    CPLXSZ = 8 for all other architectures.

###################################
# TO BE FIXED IN THE NEXT RELEASE #
###################################

-------------  -------  -----------------------------------------
DIRECTORY      ROUTINE  DESCRIPTION OF CHANGE
-------------  -------  -----------------------------------------

SCALAPACK/PBLAS/SRC/

  psaxpy_.c, pdaxpy_.c, pcaxpy_.c, pzaxpy_.c
    On lines 362-363, there is a call to its corresponding serial
    routine _axpy.  The very last argument in the call should be
    "incy", not "&desc_Y[LLD_]".

  pcdotc_.c, pzdotc_.c
    On line 471, &ixcol should be &iycol.

  pcdotu_.c, pzdotu_.c
    On line 472, &ixcol should be &iycol.

SCALAPACK/PBLAS/SRC/INTERNAL/PBBLAS/

Bug description:
================

In the rank-k and rank-2k Hermitian update PBBLAS routines, when C is
just a block, BETA is one, and ( ALPHA is zero or NQ/NP is zero ), the
call to the Level 3 BLAS _HERK or _HER2K does not zero the imaginary
part of the diagonals of C, and simply returns.  This causes a problem,
for example, when the process owning C does not own any of A/B; in
such a case one has to zero those imaginary parts.

==========================================================================

pbcherk.f

    line 283:
          INTRINSIC          CMPLX, MAX, MIN
    should be
          INTRINSIC          CMPLX, MAX, MIN, REAL

    line 770:
          CALL CHERK( UPLO, TRANS, N, NQ, ALPHA, A,LDA, BETA, C,LDC )
    should be
          CALL CHERK( UPLO, TRANS, N, NQ, ALPHA, A,LDA, BETA, C,LDC )
          IF( ( BETA.EQ.RONE ).AND.
         $    ( ( ALPHA.EQ.RZERO ).OR.( NQ.EQ.0 ) ) ) THEN
             DO 80 JJ = 1, N
                C( JJ, JJ ) = CMPLX( REAL( C( JJ, JJ ) ), RZERO )
       80    CONTINUE
          END IF

    line 824:
          CALL CHERK( UPLO, TRANS, N, NP, ALPHA, A,LDA, BETA, C,LDC )
    should be
          CALL CHERK( UPLO, TRANS, N, NP, ALPHA, A,LDA, BETA, C,LDC )
          IF( ( BETA.EQ.RONE ).AND.
         $    ( ( ALPHA.EQ.RZERO ).OR.( NP.EQ.0 ) ) ) THEN
             DO 90 JJ = 1, N
                C( JJ, JJ ) = CMPLX( REAL( C( JJ, JJ ) ), RZERO )
       90    CONTINUE
          END IF

==========================================================================

pbcher2k.f

    line 332:
          INTRINSIC          CONJG, CMPLX, MAX, MIN
    should be
          INTRINSIC          CONJG, CMPLX, MAX, MIN, REAL

    line 1164:
          CALL CHER2K( UPLO, TRANS, N, NQ, ALPHA, A, LDA, B, LDB,
         $             BETA, C, LDC )
    should be
          CALL CHER2K( UPLO, TRANS, N, NQ, ALPHA, A, LDA, B, LDB,
         $             BETA, C, LDC )
          IF( ( BETA.EQ.RONE ).AND.
         $    ( ( ALPHA.EQ.ZERO ).OR.( NQ.EQ.0 ) ) ) THEN
             DO 150 JJ = 1, N
                C( JJ, JJ ) = CMPLX( REAL( C( JJ, JJ ) ), RZERO )
      150    CONTINUE
          END IF

    line 1223:
          CALL CHER2K( UPLO, TRANS, N, NP, ALPHA, A, LDA, B, LDB,
         $             BETA, C, LDC )
    should be
          CALL CHER2K( UPLO, TRANS, N, NP, ALPHA, A, LDA, B, LDB,
         $             BETA, C, LDC )
          IF( ( BETA.EQ.RONE ).AND.
         $    ( ( ALPHA.EQ.ZERO ).OR.( NP.EQ.0 ) ) ) THEN
             DO 160 JJ = 1, N
                C( JJ, JJ ) = CMPLX( REAL( C( JJ, JJ ) ), RZERO )
      160    CONTINUE
          END IF

==========================================================================

For pbzherk.f and pbzher2k.f, the changes are essentially the same,
with the following substitutions:

    CMPLX -> DCMPLX, REAL -> DBLE

==========================================================================

SCALAPACK/INSTALL/

  SLmake.LINUX
    Accidentally missing definitions of F77, CC, CCFLAGS, SRCFLAG,
    F77LOADER, CCLOADER, F77LOADFLAGS, CCLOADFLAGS.

SCALAPACK/SRC/

  psgebrd.f
    Avoid 0-th element reference on line 376.
    Replace with:
          IF( JS.GT.0 )
         $   CALL PSELSET( A, I+JB-1, J+JB, DESCA, E( JS ) )

    Avoid 0-th element reference on line 380.
    Replace with:
          IF( JS.GT.0 )
         $   CALL PSELSET( A, I+JB, J+JB-1, DESCA, E( JS ) )

  pcgebrd.f
    Avoid 0-th element reference on line 378.
    Replace with:
          IF( JS.GT.0 )
         $   CALL PCELSET( A, I+JB-1, J+JB, DESCA, CMPLX( E( JS ) ) )

    Avoid 0-th element reference on line 382.
    Replace with:
          IF( JS.GT.0 )
         $   CALL PCELSET( A, I+JB, J+JB-1, DESCA, CMPLX( E( JS ) ) )

SCALAPACK/SRC/

  psgeqpf.f
    Avoid 0-th element reference on lines 460 and 465.
    Replace lines 459-468 with:
          IF( (JJA+NQ-JJ).GT.0 ) THEN
             IF( MYROW.EQ.ICURROW ) THEN
                CALL SGEBS2D( ICTXT, 'Columnwise', ' ', 1, JJA+NQ-JJ,
         $                    A( II+( MIN( JJA+NQ-1, JJ )-1 )*LDA ),
         $                    LDA )
                CALL SCOPY( JJA+NQ-JJ, A( II+( MIN( JJA+NQ-1, JJ )
         $                  -1)*LDA ), LDA, WORK( IPW+MIN( JJA+NQ-1,
         $                  JJ )-1 ), 1 )
             ELSE
                CALL SGEBR2D( ICTXT, 'Columnwise', ' ', JJA+NQ-JJ, 1,
         $                    WORK( IPW+MIN( JJA+NQ-1, JJ )-1 ),
         $                    MAX( 1, NQ ), ICURROW, MYCOL )
             END IF
          END IF

SCALAPACK/SRC/

  pcgeqpf.f
    Avoid 0-th element reference on lines 485 and 490.
    Replace lines 484-493 with:
          IF( (JJA+NQ-JJ).GT.0 ) THEN
             IF( MYROW.EQ.ICURROW ) THEN
                CALL CGEBS2D( ICTXT, 'Columnwise', ' ', 1, JJA+NQ-JJ,
         $                    A( II+( MIN( JJA+NQ-1, JJ )-1 )*LDA ),
         $                    LDA )
                CALL CCOPY( JJA+NQ-JJ, A( II+( MIN( JJA+NQ-1, JJ )
         $                  -1)*LDA ), LDA, WORK( MIN( JJA+NQ-1, JJ ) ),
         $                  1 )
             ELSE
                CALL CGEBR2D( ICTXT, 'Columnwise', ' ', JJA+NQ-JJ, 1,
         $                    WORK( MIN( JJA+NQ-1, JJ ) ), MAX( 1, NQ ),
         $                    ICURROW, MYCOL )
             END IF
          END IF

SCALAPACK/SRC/

  psgeqpf.f/ pcgeqpf.f
    Line 484:
         $                    J+LL-JJ+1, DESCA, 1 )
    should be
         $                    J+LL-JJ+2, DESCA, 1 )

  psgesvx.f/ pcgesvx.f
    Line 502 of psgesvx.f should be
          NQ = NUMROC( N+ICOFFA, DESCA( NB_ ), MYCOL, IACOL,
         $             NPCOL )
    instead of
          NQ = NUMROC( N+ICOFFA, DESCB( NB_ ), MYCOL, IACOL,
         $             NPCOL )
    Likewise, lines 503-504 of pcgesvx.f should be
          NQ = NUMROC( N+ICOFFA, DESCA( NB_ ), MYCOL, IACOL,
         $             NPCOL )
    instead of
          NQ = NUMROC( N+ICOFFA, DESCB( NB_ ), MYCOL, IACOL,
         $             NPCOL )

SCALAPACK/SRC/

  pslahqr.f
    Copy the H11 line so that it is defined on both sides of the
    loop.  That is, line 2440 should be the same as line 2435.

SCALAPACK/SRC/

  pslarf.f
    Avoid 0-th element reference on line 487.
    Replace with:
          IF( IOFFC.GT.0 )
         $   CALL SGEMV( 'Transpose', MP, NQ, ONE,
         $               C( IOFFC ), LDC, WORK, 1, ZERO,
         $               WORK( IPW ), 1 )

    Avoid 0-th element reference on line 501.
    Replace with:
          IF( IOFFC.GT.0 )
         $   CALL SGER( MP, NQ, -TAULOC, WORK, 1, WORK( IPW ),
         $              1, C( IOFFC ), LDC )

    Avoid 0-th element reference on line 592.
    Replace with:
          IF( IOFFV.GT.0 .AND. IOFFC.GT.0 )
         $   CALL SGER( MP, NQ, -TAULOC, WORK, 1,
         $              V( IOFFV ), LDV, C( IOFFC ), LDC )

    Avoid 0-th element reference on line 531.
    Replace with:
          IF( IOFFC.GT.0 )
         $   CALL SGEMV( 'Transpose', MP, NQ, ONE,
         $               C( IOFFC ), LDC, WORK, 1, ZERO,
         $               WORK( IPW ), 1 )

    Avoid 0-th element reference on line 545.
    Replace with:
          IF( IOFFC.GT.0 )
         $   CALL SGER( MP, NQ, -TAULOC, WORK, 1, WORK( IPW ),
         $              1, C( IOFFC ), LDC )

    Avoid 0-th element reference on line 712.
    Replace with:
          IF( IOFFV.GT.0 )
         $   CALL SCOPY( NQ, V( IOFFV ), LDV, WORK, 1 )

    Avoid 0-th element reference on line 746.
    Replace with:
          IF( IOFFC.GT.0 )
         $   CALL SGER( MP, NQ, -TAULOC, WORK( IPW ), 1, WORK,
         $              1, C( IOFFC ), LDC )

  pclarf.f
    Avoid 0-th element reference on line 488.
    Replace with:
          IF( IOFFC.GT.0 )
         $   CALL CGEMV( 'Conjugate transpose', MP, NQ, ONE,
         $               C( IOFFC ), LDC, WORK, 1, ZERO,
         $               WORK( IPW ), 1 )

    Avoid 0-th element reference on line 502.
    Replace with:
          IF( IOFFC.GT.0 )
         $   CALL CGERC( MP, NQ, -TAULOC, WORK, 1, WORK( IPW ),
         $               1, C( IOFFC ), LDC )

  pclarf.f
    Avoid 0-th element reference on line 595.
    Replace with:
          IF( IOFFV.GT.0 .AND. IOFFC.GT.0 )
         $   CALL CGERC( MP, NQ, -TAULOC, WORK, 1,
         $               V( IOFFV ), LDV, C( IOFFC ),
         $               LDC )

    Avoid 0-th element reference on line 531.
    Replace with:
          IF( IOFFC.GT.0 )
         $   CALL CGEMV( 'Conjugate transpose', MP, NQ, ONE,
         $               C( IOFFC ), LDC, WORK, 1, ZERO,
         $               WORK( IPW ), 1 )

    Avoid 0-th element reference on line 547.
    Replace with:
          IF( IOFFC.GT.0 )
         $   CALL CGERC( MP, NQ, -TAULOC, WORK, 1, WORK( IPW ),
         $               1, C( IOFFC ), LDC )

    Avoid 0-th element reference on line 715.
    Replace with:
          IF( IOFFV.GT.0 )
         $   CALL CCOPY( NQ, V( IOFFV ), LDV, WORK, 1 )

    Avoid 0-th element reference on line 749.
    Replace with:
          IF( IOFFC.GT.0 )
         $   CALL CGERC( MP, NQ, -TAULOC, WORK( IPW ), 1, WORK,
         $               1, C( IOFFC ), LDC )

SCALAPACK/SRC/

  pslarz.f
    Avoid 0-th element reference on line 676.
    Replace with:
          IF( MPC2.GT.0 .AND. NQV.GT.0 )
         $   CALL SGER( MPC2, NQV, -TAULOC, WORK, 1,
         $              V( IOFFV ), LDV, C( IOFFC2 ),
         $              LDC )

SCALAPACK/SRC/

  pclarz.f
    Avoid 0-th element reference on line 677.
    Replace with:
          IF( MPC2.GT.0 .AND. NQV.GT.0 )
         $   CALL CGERC( MPC2, NQV, -TAULOC, WORK, 1,
         $               V( IOFFV ), LDV, C( IOFFC2 ),
         $               LDC )

SCALAPACK/SRC/

  pslarzb.f
    Avoid 0-th element reference on line 602.
    Replace with:
          IF( IOFFC2.GT.0 )
         $   CALL SGEMM( 'No transpose', 'No transpose', MPC2, NQC2, K,
         $               -ONE, WORK( IPW ), LW, WORK( IPV ), LV, ONE,
         $               C( IOFFC2 ), LDC )

SCALAPACK/SRC/

  pclarzb.f
    Avoid 0-th element reference on line 615.
    Replace with:
          IF( IOFFC2.GT.0 )
         $   CALL CGEMM( 'No transpose', 'No transpose', MPC2, NQC2, K,
         $               -ONE, WORK( IPW ), LW, WORK( IPV ), LV, ONE,
         $               C( IOFFC2 ), LDC )

SCALAPACK/SRC/

  psorg2l.f/ pcung2l.f
    Code correction as follows:
    Line 258:
          TAUJ = TAU( MIN( JJ, NQA0 ) )
    should be:
          IACOL = INDXG2P( J, DESCA( NB_ ), MYCOL, DESCA( CSRC_ ),
         $                 NPCOL )
          IF( MYCOL.EQ.IACOL )
         $   TAUJ = TAU( MIN( JJ, NQA0 ) )

  psorgl2.f/ pcungl2.f
    Line 257:
          TAUI = TAU( MIN( II, KP ) )
    should be:
          IAROW = INDXG2P( I, DESCA( MB_ ), MYROW, DESCA( RSRC_ ),
         $                 NPROW )
          IF( MYROW.EQ.IAROW )
         $   TAUI = TAU( MIN( II, KP ) )

  psorg2r.f/ pcung2r.f
    Line 258:
          TAUJ = TAU( MIN( JJ, KQ ) )
    should be:
          IACOL = INDXG2P( J, DESCA( NB_ ), MYCOL, DESCA( CSRC_ ),
         $                 NPCOL )
          IF( MYCOL.EQ.IACOL )
         $   TAUJ = TAU( MIN( JJ, KQ ) )

  psorgr2.f/ pcungr2.f
    Line 258:
          TAUI = TAU( MIN( II, MP ) )
    should be:
          IAROW = INDXG2P( I, DESCA( MB_ ), MYROW, DESCA( RSRC_ ),
         $                 NPROW )
          IF( MYROW.EQ.IAROW )
         $   TAUI = TAU( MIN( II, MP ) )

SCALAPACK/SRC/

  psposvx.f/ pcposvx.f
    Line 426 of psposvx.f should be
          NQ = NUMROC( N+ICOFFA, DESCA( NB_ ), MYCOL, IACOL, NPCOL )
    instead of
          NQ = NUMROC( N+ICOFFA, DESCB( NB_ ), MYCOL, IACOL, NPCOL )
    Likewise, line 425 of pcposvx.f should be
          NQ = NUMROC( N+ICOFFA, DESCA( NB_ ), MYCOL, IACOL, NPCOL )
    instead of
          NQ = NUMROC( N+ICOFFA, DESCB( NB_ ), MYCOL, IACOL, NPCOL )

SCALAPACK/SRC/

  pssyevx.f/ pdsyevx.f
    Input error checking corrections:
    Line 513 should be replaced by:
          INFO = -( 800+CTXT_ )
    Line 516 should be replaced by:
          INFO = -( 2100+CTXT_ )
    Line 520 should be replaced by:
          CALL CHK1MAT( N, 4, N, 4, IA, JA, DESCA, 8, INFO )
    Line 522 should be replaced by:
         $   CALL CHK1MAT( N, 4, N, 4, IZ, JZ, DESCZ, 21, INFO )
    Lines 702-703 should be replaced by:
          CALL PCHK2MAT( N, 4, N, 4, IA, JA, DESCA, 8, N, 4, N, 4, IZ,
         $               JZ, DESCZ, 21, 4, IDUM1, IDUM2, INFO )
    Line 706 should be replaced by:
          CALL PCHK1MAT( N, 4, N, 4, IA, JA, DESCA, 8, 4, IDUM1,

SCALAPACK/SRC/

  pcheevx.f/ pzheevx.f
    Input error checking corrections:
    Line 532 should be replaced by:
          INFO = -( 800+CTXT_ )
    Line 535 should be replaced by:
          INFO = -( 2100+CTXT_ )
    Line 538 should be replaced by:
          CALL CHK1MAT( N, 4, N, 4, IA, JA, DESCA, 8, INFO )
    Line 541 should be replaced by:
         $   CALL CHK1MAT( N, 4, N, 4, IZ, JZ, DESCZ, 21, INFO )
    Lines 732-733 should be replaced by:
          CALL PCHK2MAT( N, 4, N, 4, IA, JA, DESCA, 8, N, 4, N, 4, IZ,
         $               JZ, DESCZ, 21, 4, IDUM1, IDUM2, INFO )
    Line 735 should be replaced by:
          CALL PCHK1MAT( N, 4, N, 4, IA, JA, DESCA, 8, 4, IDUM1,

SCALAPACK/TESTING/EIG/

  pssepsubtst.f
    Avoid 0-th element reference by replacing line 254 with:
          NORMWIN = SAFMIN / EPS
          IF( N.GE.1 )
         $   NORMWIN = MAX( ABS( WIN( 1 ) ), ABS( WIN( N ) ), NORMWIN )

SCALAPACK/TESTING/EIG/

  pzbrdinfo.f
    Typo in FORMAT statement line 257
    (complex single --> complex double)

  pzhrdinfo.f
    Typo in FORMAT statement line 261
    (complex single --> complex double)

  pztrdinfo.f
    Typo in FORMAT statement line 262
    (complex single --> complex double)

*******************************************************************
*        Changes to ScaLAPACK code to make it run on T3E          *
*******************************************************************

SCALAPACK/SLmake.inc:
    add -DNO_IEEE to CDEFS
    add -dp to compilation flags

SCALAPACK/SRC/p?stein.f:
    as first executable line, add:
          onenrm = 0.0E0

SCALAPACK/SRC/p?latrz.f:
    add
          AII = ZERO
    right after quick return so it is initialized on nodes that
    do not do stuff in P_LARFG

SCALAPACK/SRC/p?labad.f:
    comment out
          IF( LOG10( LARGE ).GT.2000.D0 ) THEN
    because T3E has IEEE hardware, but not arithmetic

SCALAPACK/SRC/pslaiect.c
    as first line, add:
        #define float double

SCALAPACK/PBLAS/SRC/pblas.h
    in original file, move lines 92 and 93 to line 119 (:92,93mo119)
    this moves the typedef for complex below the #define float double

    at line 27, add:
        #ifdef T3E
        #define _MACH_ _T3D_
        #endif

SCALAPACK/PBLAS/TESTING/p*tst.f:
    Set REALSZ = 8 and CPLXSZ = 16!!

SCALAPACK/PBLAS/TIMING/p*tim.f:
    Set REALSZ = 8 and CPLXSZ = 16!!

SCALAPACK/TESTING/LIN/p*driver.f:
    Set INTGSZ = 8, REALSZ = 8, and CPLXSZ = 16!!

SCALAPACK/TESTING/EIG/p*driver.f:
    Set INTGSZ = 8, REALSZ = 8, and CPLXSZ = 16!!

SCALAPACK/TESTING/EIG/p*sepreq.f:
    Set INTGSZ = 8, REALSZ = 8, and CPLXSZ = 16!!

SCALAPACK/TESTING/EIG/p*gsepreq.f:
    Set INTGSZ = 8, REALSZ = 8, and CPLXSZ = 16!!

SCALAPACK/TESTING/EIG/p[c,z]sepsubtst.f:
    change line 737 from:
          IF( MISSSMALLEST .AND. ( WIN( MYIL-1 ).LT.VL+NORMWIN*
         $    FIVE*THRESH*EPS ) )MISSSMALLEST = .FALSE.
    to:
          if (misssmallest)
         $  misssmallest = win(myil-1) .ge.
         $                 vl+normwin*five*thresh*eps

SCALAPACK/TESTING/EIG/p[d,s]sepsubtst.f:
    change line 714 from:
          IF( MISSSMALLEST .AND. ( WIN( MYIL-1 ).LT.VL+NORMWIN*
         $    FIVE*THRESH*EPS ) )MISSSMALLEST = .FALSE.
    to:
          if (misssmallest)
         $  misssmallest = win(myil-1) .ge.
         $                 vl+normwin*five*thresh*eps

SCALAPACK/TESTING/EIG/p*gsepsubtst.f:
    Same change as for p*sepsubtst.f.

#######################################
# HETEROGENEOUS NETWORKS OF COMPUTERS #
#######################################

4.  Unresolved issues remain in heterogeneous computing.
    As a result, the ScaLAPACK library is not as robust in a
    heterogeneous environment as it is in a homogeneous environment.
    The following is a list of known problems when ScaLAPACK is run
    on a heterogeneous network of computers.  For further details,
    please refer to LAPACK Working Note 112:
        http://www.netlib.org/lapack/lawns/lawn112.ps

5.  Because the ALPHA handles denormalized numbers differently than
    other architectures, the least squares executables
    SCALAPACK/TESTING/x_ls will fail when run on a heterogeneous
    network.  However, the least squares executables will run
    homogeneously on ALPHAs.

6.  The symmetric eigensolvers may have trouble on heterogeneous
    networks when a subset of eigenvalues is chosen by value
    (i.e., RANGE='V') and one of the limits of that range (VL or VU)
    is within a couple ulps of an actual eigenvalue.  This is not a
    problem on homogeneous systems.  When this happens, some
    processes will return INFO <> 0.  This can happen when running
    the test code.  In every case that we have seen, the answer is
    correct despite the spurious error message.

7.  The symmetric eigensolver test will often hang on a
    heterogeneous system which includes an ALPHA.  The symmetric
    eigensolver test will occasionally hang on a network of
    heterogeneous computers which does not include an ALPHA, such as
    a network which includes an HPPA and an RS6K.

###################################
#       DETAILS OF UPDATES        #
###################################

Note:  Unless otherwise stated, changes to single precision routines
       (names beginning with PS or PC) apply also to the
       corresponding double precision routines (names beginning
       with PD and PZ).

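The naming rule in the note above can be sketched in C as follows.  This
is only an illustration: double_precision_name is a hypothetical helper
introduced here, not a ScaLAPACK routine, and it covers only names that
begin with ps or pc (the PBBLAS names such as pbcherk.f follow a
different pattern and are not handled).

```c
#include <string.h>

/* Hypothetical helper, not part of ScaLAPACK.
 * Maps a single precision routine name ("psgesvx", "pcheevx") to the
 * corresponding double precision name ("pdgesvx", "pzheevx").
 * Returns 0 on success; returns -1 and leaves out untouched if the
 * name does not start with ps/pc or does not fit in out. */
int double_precision_name(const char *name, char *out, size_t out_size)
{
    size_t len = strlen(name);
    char second;

    if (len < 2 || len + 1 > out_size)
        return -1;
    if (name[0] != 'p' && name[0] != 'P')
        return -1;

    /* Second letter carries the precision: S -> D, C -> Z. */
    switch (name[1]) {
    case 's': second = 'd'; break;
    case 'S': second = 'D'; break;
    case 'c': second = 'z'; break;
    case 'C': second = 'Z'; break;
    default:  return -1;
    }

    memcpy(out, name, len + 1);   /* copy including the terminator */
    out[1] = second;
    return 0;
}
```

For example, calling the helper on "psorg2l" would yield "pdorg2l", and
on "pcung2l" would yield "pzung2l", matching the routine pairs listed in
this file.
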
UPDATE (Version 1.6)

------------  -------  -------------  --------------------------
DIRECTORY     ROUTINE  LAST MODIFIED  DESCRIPTION OF CHANGE
------------  -------  -------------  --------------------------

SRC/

  psdb*.f/ pcdb*.f
    Several modifications to source code

  psdt*.f/ pcdt*.f
    Several modifications to source code

  psgb*.f/ pcgb*.f
    Several modifications to source code

  psgeqlf.f/ pcgeqlf.f
    Missing from input error-checking:
          IF( LWORK.EQ.-1 ) THEN
             IDUM1( 1 ) = -1
          ELSE
             IDUM1( 1 ) = 1
          END IF
          IDUM2( 1 ) = 9

  psgerfs.f/ pcgerfs.f
    Uninitialized variables sometimes caused failures in p_gerfs on
    SGI systems.  The fix is to change:
          IDUM2( 3 ) = 3
          IF( LWORK.EQ.-1 ) THEN
             IDUM1( 5 ) = -1
          ELSE
             IDUM1( 5 ) = 1
          END IF
          IDUM2( 5 ) = 24
    to:
          IDUM2( 3 ) = 3
          IF( LWORK.EQ.-1 ) THEN
             IDUM1( 4 ) = -1
          ELSE
             IDUM1( 4 ) = 1
          END IF
          IDUM2( 4 ) = 24

  psgesvd.f
    Corrections to input error-checking.

  pslacp2.f/ pclacp2.f
    Error when UPLO = 'All', and the operands A and B are not exactly
    distributed the same way, but still correctly aligned, PxLACP2
    fails for certain values of the parameters M and N.

  pslapiv.f/ pclapiv.f
    Incorrectly used DESCA instead of DESCIP.
    Lines 234 and 235 should be:
          IPT = MOD( JP-1, DESCIP(NB_) )
          DESCPT(M_) = N + IPT + NPROW*DESCIP(NB_)
    Line 237 should be:
          DESCPT(MB_) = DESCIP(NB_)
    Line 294 should be:
          JPT = MOD( IP-1, DESCIP(MB_) )
    Line 296 should be:
          DESCPT(N_) = M + JPT + NPCOL*DESCIP(MB_)
    Line 298 should be:
          DESCPT(NB_) = DESCIP(MB_)

  psorm2l.f/ pcunm2l.f
    Incorrectly used DESCA instead of DESCC.
    Line 416 of psorm2l.f
         $                JC, DESCA, WORK )
    should be
         $                JC, DESCC, WORK )
    Lines 425 and 428 of pcunm2l.f
         $                IC, JC, DESCA, WORK )
    should be
         $                IC, JC, DESCC, WORK )

  pcporfs.f
    BERR and FERR were erroneously declared as COMPLEX instead of
    REAL!

  psstebz.f
    Error in LWORK=-1 return, should retain INFO=0 for LWORK=-1 or
    LIWORK=-1.
    Add the line
          LQUERY = ( LWORK.EQ.-1 .OR. LIWORK.EQ.-1 )
    Change line 346 from
          ELSE IF( LWORK.LT.MAX( 5*N, 7 ) ) THEN
    to
          ELSE IF( LWORK.LT.MAX( 5*N, 7 ).AND..NOT.LQUERY) THEN
    Change line 348 from
          ELSE IF( LIWORK.LT.MAX( 4*N, 14, NPROW*NPCOL ) ) THEN
    to
          ELSE IF( LIWORK.LT.MAX( 4*N, 14, NPROW*NPCOL )
         $         .AND. .NOT.LQUERY ) THEN

  psstein.f/ pcstein.f
    Error in LWORK=-1 return, should retain INFO=0 if LWORK=-1.
    Add the line
          LQUERY = ( LWORK.EQ.-1 .OR. LIWORK.EQ.-1 )
    Change line 347 from
          ELSE IF( MAXVEC.LT.LOAD ) THEN
    to
          ELSE IF( MAXVEC.LT.LOAD.AND..NOT.LQUERY) THEN
    Change line 349 from
          ELSE IF( LIWORK.LT.3*N+P+1 ) THEN
    to
          ELSE IF( LIWORK.LT.3*N+P+1.AND..NOT.LQUERY) THEN

  pssyev.f
    Documentation corrections and corrections to the input
    error-checking.

  pssyevx.f/ pcheevx.f
    Documentation corrections and corrections to the input
    error-checking.

TOOLS/

  SL_gridreshape.c
    The following include statement should be added:
        #include

  reshape.c
    The following include statement should be added:
        #include

TESTING/LIN/

  p*db*.f/ p*dt*.f/ p*gb*.f/ p*pb*.f/ p*pt*.f
    Miscellaneous corrections

TESTING/EIG/

  pshrddriver.f/ pchrddriver.f
    Error in workspace calculation
    Change Line 281 from:
          WORKSIZ = NB*NB + NB*IHLP + ITEMP + IPOSTPAD
    to
          WORKSIZ = MAX( NB*NB + NB*IHLP + ITEMP, NB * NP ) +
         $          IPOSTPAD

  pssepchk.f/ pcsepchk.f
    Error in documentation describing tests performed and LWORK
    calculation

  pssepsubtst.f/ pcsepsubtst.f
    "

  pssqpsubtst.f/ pcsqpsubtst.f
    "

###################################
#       REPORTED PROBLEMS         #
###################################

The following testing problems have been noted, but their cause is
currently unknown.

1.  For ALPHA, testing failures in SCALAPACK/TESTING/xsnep, xdnep,
    xssep, and/or xdsep, depending upon the version of OSF, under
    investigation.

2.  During testing on the IBM SP-2, using IBM MPI, we encountered
    testing failures in SCALAPACK/TESTING/xsnep.

The following bugs have been identified.

1.  On the SP-2, using MPLBLACS, hangs occur in x_pbllt, x_ptllt,
    x_dtlu, x_dblu due to an error in MPL.
    These hangs do not occur when using the MPIBLACS.  Therefore,
    users of the banded codes are urged to use the MPIBLACS.
    Here is a small MPL routine demonstrating the problem:

*****
      program tst
      integer k, iam, Np, ictxt, i, j
      call mpc_environ(Np, Iam);
      k = Iam + 100
      print*,'start'
      if (iam.eq.1) then
         call mp_send(Iam, 4, 0, 2, i)
         call mp_send(k, 4, 0, 3, j)
         print*,mp_status(i)
         print*,mp_status(j)
      else if (iam .eq. 0) then
         call mp_brecv(k, 4, 1, 3, j)
         call mp_brecv(k, 4, 1, 2, j)
      end if
      print*,'done'
      stop
      end
*****
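
As we read the test program, process 1 sends a type-2 message followed
by a type-3 message, and process 0 posts blocking receives for type 3
and then type 2, so the program can finish only if receives are matched
by message type rather than strictly by arrival order.  The following C
sketch models that matching rule with a toy in-memory mailbox.  The
Mailbox type and the mailbox_post/mailbox_take functions are
hypothetical names introduced for illustration; they are not MPL, MPI,
or BLACS calls, and the sketch does not reproduce the MPL error itself.

```c
#define MAILBOX_CAP 8

/* One pending message: a type (tag) and a one-integer payload. */
typedef struct {
    int type;
    int payload;
} Msg;

/* Toy mailbox that keeps pending messages in arrival order.
 * Illustrative only; not an MPL or BLACS data structure. */
typedef struct {
    Msg queue[MAILBOX_CAP];
    int count;
} Mailbox;

/* Append a message; returns 0 on success, -1 if the mailbox is full. */
int mailbox_post(Mailbox *mb, int type, int payload)
{
    if (mb->count >= MAILBOX_CAP)
        return -1;
    mb->queue[mb->count].type = type;
    mb->queue[mb->count].payload = payload;
    mb->count++;
    return 0;
}

/* Remove the oldest pending message of the requested type.
 * Returns 0 and stores the payload on a match; returns -1 when no
 * message of that type is pending (a blocking receive would wait). */
int mailbox_take(Mailbox *mb, int type, int *payload)
{
    int i, j;

    for (i = 0; i < mb->count; i++) {
        if (mb->queue[i].type == type) {
            *payload = mb->queue[i].payload;
            /* Close the gap so the remaining order is preserved. */
            for (j = i + 1; j < mb->count; j++)
                mb->queue[j - 1] = mb->queue[j];
            mb->count--;
            return 0;
        }
    }
    return -1;
}
```

In this model, posting type 2 and then type 3 and taking type 3 before
type 2 succeeds for both messages, which is the behaviour the test
program appears to depend on.
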