|
ScaLAPACK
2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
|
00001 SUBROUTINE PDLAED3( ICTXT, K, N, NB, D, DROW, DCOL, RHO, DLAMDA, 00002 $ W, Z, U, LDU, BUF, INDX, INDCOL, INDROW, 00003 $ INDXR, INDXC, CTOT, NPCOL, INFO ) 00004 * 00005 * -- ScaLAPACK auxiliary routine (version 1.7) -- 00006 * University of Tennessee, Knoxville, Oak Ridge National Laboratory, 00007 * and University of California, Berkeley. 00008 * December 31, 1998 00009 * 00010 * .. Scalar Arguments .. 00011 INTEGER DCOL, DROW, ICTXT, INFO, K, LDU, N, NB, NPCOL 00012 DOUBLE PRECISION RHO 00013 * .. 00014 * .. Array Arguments .. 00015 INTEGER CTOT( 0: NPCOL-1, 4 ), INDCOL( * ), 00016 $ INDROW( * ), INDX( * ), INDXC( * ), INDXR( * ) 00017 DOUBLE PRECISION BUF( * ), D( * ), DLAMDA( * ), U( LDU, * ), 00018 $ W( * ), Z( * ) 00019 * .. 00020 * 00021 * Purpose 00022 * ======= 00023 * 00024 * PDLAED3 finds the roots of the secular equation, as defined by the 00025 * values in D, W, and RHO, between 1 and K. It makes the 00026 * appropriate calls to SLAED4 00027 * 00028 * This code makes very mild assumptions about floating point 00029 * arithmetic. It will work on machines with a guard digit in 00030 * add/subtract, or on those binary machines without guard digits 00031 * which subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or Cray-2. 00032 * It could conceivably fail on hexadecimal or decimal machines 00033 * without guard digits, but we know of none. 00034 * 00035 * Arguments 00036 * ========= 00037 * 00038 * ICTXT (global input) INTEGER 00039 * The BLACS context handle, indicating the global context of 00040 * the operation on the matrix. The context itself is global. 00041 * 00042 * K (output) INTEGER 00043 * The number of non-deflated eigenvalues, and the order of the 00044 * related secular equation. 0 <= K <=N. 00045 * 00046 * N (input) INTEGER 00047 * The dimension of the symmetric tridiagonal matrix. N >= 0. 00048 * 00049 * NB (global input) INTEGER 00050 * The blocking factor used to distribute the columns of the 00051 * matrix. NB >= 1. 00052 * 00053 * D (input/output) DOUBLE PRECISION array, dimension (N) 00054 * On entry, D contains the eigenvalues of the two submatrices to 00055 * be combined. 00056 * On exit, D contains the trailing (N-K) updated eigenvalues 00057 * (those which were deflated) sorted into increasing order. 00058 * 00059 * DROW (global input) INTEGER 00060 * The process row over which the first row of the matrix D is 00061 * distributed. 0 <= DROW < NPROW. 00062 * 00063 * DCOL (global input) INTEGER 00064 * The process column over which the first column of the 00065 * matrix D is distributed. 0 <= DCOL < NPCOL. 00066 * 00067 * Q (input/output) DOUBLE PRECISION array, dimension (LDQ, N) 00068 * On entry, Q contains the eigenvectors of two submatrices in 00069 * the two square blocks with corners at (1,1), (N1,N1) 00070 * and (N1+1, N1+1), (N,N). 00071 * On exit, Q contains the trailing (N-K) updated eigenvectors 00072 * (those which were deflated) in its last N-K columns. 00073 * 00074 * LDQ (input) INTEGER 00075 * The leading dimension of the array Q. LDQ >= max(1,NQ). 00076 * 00077 * RHO (global input/output) DOUBLE PRECISION 00078 * On entry, the off-diagonal element associated with the rank-1 00079 * cut which originally split the two submatrices which are now 00080 * being recombined. 00081 * On exit, RHO has been modified to the value required by 00082 * PDLAED3. 00083 * 00084 * DLAMDA (global output) DOUBLE PRECISION array, dimension (N) 00085 * A copy of the first K eigenvalues which will be used by 00086 * SLAED3 to form the secular equation. 00087 * 00088 * W (global output) DOUBLE PRECISION array, dimension (N) 00089 * The first k values of the final deflation-altered z-vector 00090 * which will be passed to SLAED3. 00091 * 00092 * Z (global input) DOUBLE PRECISION array, dimension (N) 00093 * On entry, Z contains the updating vector (the last 00094 * row of the first sub-eigenvector matrix and the first row of 00095 * the second sub-eigenvector matrix). 00096 * On exit, the contents of Z have been destroyed by the updating 00097 * process. 00098 * 00099 * U (global output) DOUBLE PRECISION array 00100 * global dimension (N, N), local dimension (LDU, NQ). 00101 * Q contains the orthonormal eigenvectors of the symmetric 00102 * tridiagonal matrix. 00103 * 00104 * LDU (input) INTEGER 00105 * The leading dimension of the array U. 00106 * 00107 * QBUF (workspace) DOUBLE PRECISION array, dimension 3*N 00108 * 00109 * 00110 * INDX (workspace) INTEGER array, dimension (N) 00111 * The permutation used to sort the contents of DLAMDA into 00112 * ascending order. 00113 * 00114 * INDCOL (workspace) INTEGER array, dimension (N) 00115 * 00116 * 00117 * INDROW (workspace) INTEGER array, dimension (N) 00118 * 00119 * 00120 * INDXR (workspace) INTEGER array, dimension (N) 00121 * 00122 * 00123 * INDXC (workspace) INTEGER array, dimension (N) 00124 * 00125 * CTOT (workspace) INTEGER array, dimension( NPCOL, 4) 00126 * 00127 * NPCOL (global input) INTEGER 00128 * The total number of columns over which the distributed 00129 * submatrix is distributed. 00130 * 00131 * INFO (output) INTEGER 00132 * = 0: successful exit. 00133 * < 0: if INFO = -i, the i-th argument had an illegal value. 00134 * > 0: The algorithm failed to compute the ith eigenvalue. 00135 * 00136 * ===================================================================== 00137 * 00138 * .. Parameters .. 00139 DOUBLE PRECISION ONE 00140 PARAMETER ( ONE = 1.0D+0 ) 00141 * .. 00142 * .. Local Scalars .. 00143 INTEGER COL, GI, I, IINFO, IIU, IPD, IU, J, JJU, JU, 00144 $ KK, KL, KLC, KLR, MYCOL, MYKL, MYKLR, MYROW, 00145 $ NPROW, PDC, PDR, ROW 00146 DOUBLE PRECISION AUX, TEMP 00147 * .. 00148 * .. External Functions .. 00149 INTEGER INDXG2L 00150 DOUBLE PRECISION DLAMC3, DNRM2 00151 EXTERNAL INDXG2L, DLAMC3, DNRM2 00152 * .. 00153 * .. External Subroutines .. 00154 EXTERNAL BLACS_GRIDINFO, DCOPY, DGEBR2D, DGEBS2D, 00155 $ DGERV2D, DGESD2D, DLAED4 00156 * .. 00157 * .. Intrinsic Functions .. 00158 INTRINSIC MOD, SIGN, SQRT 00159 * .. 00160 * .. Executable Statements .. 00161 * 00162 * Test the input parameters. 00163 * 00164 INFO = 0 00165 * 00166 * Quick return if possible 00167 * 00168 IF( K.EQ.0 ) 00169 $ RETURN 00170 * 00171 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 00172 * 00173 ROW = DROW 00174 COL = DCOL 00175 DO 20 I = 1, N, NB 00176 DO 10 J = 0, NB - 1 00177 IF( I+J.LE.N ) THEN 00178 INDROW( I+J ) = ROW 00179 INDCOL( I+J ) = COL 00180 END IF 00181 10 CONTINUE 00182 ROW = MOD( ROW+1, NPROW ) 00183 COL = MOD( COL+1, NPCOL ) 00184 20 CONTINUE 00185 * 00186 MYKL = CTOT( MYCOL, 1 ) + CTOT( MYCOL, 2 ) + CTOT( MYCOL, 3 ) 00187 KLR = MYKL / NPROW 00188 IF( MYROW.EQ.DROW ) THEN 00189 MYKLR = KLR + MOD( MYKL, NPROW ) 00190 ELSE 00191 MYKLR = KLR 00192 END IF 00193 PDC = 1 00194 COL = DCOL 00195 30 CONTINUE 00196 IF( MYCOL.NE.COL ) THEN 00197 PDC = PDC + CTOT( COL, 1 ) + CTOT( COL, 2 ) + CTOT( COL, 3 ) 00198 COL = MOD( COL+1, NPCOL ) 00199 GO TO 30 00200 END IF 00201 PDR = PDC 00202 KL = KLR + MOD( MYKL, NPROW ) 00203 ROW = DROW 00204 40 CONTINUE 00205 IF( MYROW.NE.ROW ) THEN 00206 PDR = PDR + KL 00207 KL = KLR 00208 ROW = MOD( ROW+1, NPROW ) 00209 GO TO 40 00210 END IF 00211 * 00212 DO 50 I = 1, K 00213 DLAMDA( I ) = DLAMC3( DLAMDA( I ), DLAMDA( I ) ) - DLAMDA( I ) 00214 Z( I ) = ONE 00215 50 CONTINUE 00216 IF( MYKLR.GT.0 ) THEN 00217 KK = PDR 00218 DO 80 I = 1, MYKLR 00219 CALL DLAED4( K, KK, DLAMDA, W, BUF, RHO, BUF( K+I ), IINFO ) 00220 IF( IINFO.NE.0 ) THEN 00221 INFO = KK 00222 END IF 00223 * 00224 * ..Compute part of z 00225 * 00226 DO 60 J = 1, KK - 1 00227 Z( J ) = Z( J )*( BUF( J ) / 00228 $ ( DLAMDA( J )-DLAMDA( KK ) ) ) 00229 60 CONTINUE 00230 Z( KK ) = Z( KK )*BUF( KK ) 00231 DO 70 J = KK + 1, K 00232 Z( J ) = Z( J )*( BUF( J ) / 00233 $ ( DLAMDA( J )-DLAMDA( KK ) ) ) 00234 70 CONTINUE 00235 KK = KK + 1 00236 80 CONTINUE 00237 * 00238 IF( MYROW.NE.DROW ) THEN 00239 CALL DCOPY( K, Z, 1, BUF, 1 ) 00240 CALL DGESD2D( ICTXT, K+MYKLR, 1, BUF, K+MYKLR, DROW, MYCOL ) 00241 ELSE 00242 IPD = 2*K + 1 00243 CALL DCOPY( MYKLR, BUF( K+1 ), 1, BUF( IPD ), 1 ) 00244 IF( KLR.GT.0 ) THEN 00245 IPD = MYKLR + IPD 00246 ROW = MOD( DROW+1, NPROW ) 00247 DO 100 I = 1, NPROW - 1 00248 CALL DGERV2D( ICTXT, K+KLR, 1, BUF, K+KLR, ROW, 00249 $ MYCOL ) 00250 CALL DCOPY( KLR, BUF( K+1 ), 1, BUF( IPD ), 1 ) 00251 DO 90 J = 1, K 00252 Z( J ) = Z( J )*BUF( J ) 00253 90 CONTINUE 00254 IPD = IPD + KLR 00255 ROW = MOD( ROW+1, NPROW ) 00256 100 CONTINUE 00257 END IF 00258 END IF 00259 END IF 00260 * 00261 IF( MYROW.EQ.DROW ) THEN 00262 IF( MYCOL.NE.DCOL .AND. MYKL.NE.0 ) THEN 00263 CALL DCOPY( K, Z, 1, BUF, 1 ) 00264 CALL DCOPY( MYKL, BUF( 2*K+1 ), 1, BUF( K+1 ), 1 ) 00265 CALL DGESD2D( ICTXT, K+MYKL, 1, BUF, K+MYKL, MYROW, DCOL ) 00266 ELSE IF( MYCOL.EQ.DCOL ) THEN 00267 IPD = 2*K + 1 00268 COL = DCOL 00269 KL = MYKL 00270 DO 120 I = 1, NPCOL - 1 00271 IPD = IPD + KL 00272 COL = MOD( COL+1, NPCOL ) 00273 KL = CTOT( COL, 1 ) + CTOT( COL, 2 ) + CTOT( COL, 3 ) 00274 IF( KL.NE.0 ) THEN 00275 CALL DGERV2D( ICTXT, K+KL, 1, BUF, K+KL, MYROW, COL ) 00276 CALL DCOPY( KL, BUF( K+1 ), 1, BUF( IPD ), 1 ) 00277 DO 110 J = 1, K 00278 Z( J ) = Z( J )*BUF( J ) 00279 110 CONTINUE 00280 END IF 00281 120 CONTINUE 00282 DO 130 I = 1, K 00283 Z( I ) = SIGN( SQRT( -Z( I ) ), W( I ) ) 00284 130 CONTINUE 00285 * 00286 END IF 00287 END IF 00288 * 00289 * Diffusion 00290 * 00291 IF( MYROW.EQ.DROW .AND. MYCOL.EQ.DCOL ) THEN 00292 CALL DCOPY( K, Z, 1, BUF, 1 ) 00293 CALL DCOPY( K, BUF( 2*K+1 ), 1, BUF( K+1 ), 1 ) 00294 CALL DGEBS2D( ICTXT, 'All', ' ', 2*K, 1, BUF, 2*K ) 00295 ELSE 00296 CALL DGEBR2D( ICTXT, 'All', ' ', 2*K, 1, BUF, 2*K, DROW, DCOL ) 00297 CALL DCOPY( K, BUF, 1, Z, 1 ) 00298 END IF 00299 * 00300 * Copy of D at the good place 00301 * 00302 KLC = 0 00303 KLR = 0 00304 DO 140 I = 1, K 00305 GI = INDX( I ) 00306 D( GI ) = BUF( K+I ) 00307 COL = INDCOL( GI ) 00308 ROW = INDROW( GI ) 00309 IF( COL.EQ.MYCOL ) THEN 00310 KLC = KLC + 1 00311 INDXC( KLC ) = I 00312 END IF 00313 IF( ROW.EQ.MYROW ) THEN 00314 KLR = KLR + 1 00315 INDXR( KLR ) = I 00316 END IF 00317 140 CONTINUE 00318 * 00319 * Compute eigenvectors of the modified rank-1 modification. 00320 * 00321 IF( MYKL.NE.0 ) THEN 00322 DO 180 J = 1, MYKL 00323 KK = INDXC( J ) 00324 JU = INDX( KK ) 00325 JJU = INDXG2L( JU, NB, J, J, NPCOL ) 00326 CALL DLAED4( K, KK, DLAMDA, W, BUF, RHO, AUX, IINFO ) 00327 IF( IINFO.NE.0 ) THEN 00328 INFO = KK 00329 END IF 00330 IF( K.EQ.1 .OR. K.EQ.2 ) THEN 00331 DO 150 I = 1, KLR 00332 KK = INDXR( I ) 00333 IU = INDX( KK ) 00334 IIU = INDXG2L( IU, NB, J, J, NPROW ) 00335 U( IIU, JJU ) = BUF( KK ) 00336 150 CONTINUE 00337 GO TO 180 00338 END IF 00339 * 00340 DO 160 I = 1, K 00341 BUF( I ) = Z( I ) / BUF( I ) 00342 160 CONTINUE 00343 TEMP = DNRM2( K, BUF, 1 ) 00344 DO 170 I = 1, KLR 00345 KK = INDXR( I ) 00346 IU = INDX( KK ) 00347 IIU = INDXG2L( IU, NB, J, J, NPROW ) 00348 U( IIU, JJU ) = BUF( KK ) / TEMP 00349 170 CONTINUE 00350 * 00351 180 CONTINUE 00352 END IF 00353 * 00354 190 CONTINUE 00355 * 00356 RETURN 00357 * 00358 * End of PDLAED3 00359 * 00360 END