|
ScaLAPACK
2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
|
00001 * 00002 * 00003 SUBROUTINE PSSYGS2( IBTYPE, UPLO, N, A, IA, JA, DESCA, B, IB, JB, 00004 $ DESCB, INFO ) 00005 * 00006 * -- ScaLAPACK routine (version 1.7) -- 00007 * University of Tennessee, Knoxville, Oak Ridge National Laboratory, 00008 * and University of California, Berkeley. 00009 * May 1, 1997 00010 * 00011 * .. Scalar Arguments .. 00012 CHARACTER UPLO 00013 INTEGER IA, IB, IBTYPE, INFO, JA, JB, N 00014 * .. 00015 * .. Array Arguments .. 00016 INTEGER DESCA( * ), DESCB( * ) 00017 REAL A( * ), B( * ) 00018 * .. 00019 * 00020 * Purpose 00021 * ======= 00022 * 00023 * PSSYGS2 reduces a real symmetric-definite generalized eigenproblem 00024 * to standard form. 00025 * 00026 * In the following sub( A ) denotes A( IA:IA+N-1, JA:JA+N-1 ) and 00027 * sub( B ) denotes B( IB:IB+N-1, JB:JB+N-1 ). 00028 * 00029 * If IBTYPE = 1, the problem is sub( A )*x = lambda*sub( B )*x, 00030 * and sub( A ) is overwritten by inv(U**T)*sub( A )*inv(U) or 00031 * inv(L)*sub( A )*inv(L**T) 00032 * 00033 * If IBTYPE = 2 or 3, the problem is sub( A )*sub( B )*x = lambda*x or 00034 * sub( B )*sub( A )*x = lambda*x, and sub( A ) is overwritten by 00035 * U*sub( A )*U**T or L**T*sub( A )*L. 00036 * 00037 * sub( B ) must have been previously factorized as U**T*U or L*L**T by 00038 * PSPOTRF. 00039 * 00040 * Notes 00041 * ===== 00042 * 00043 * Each global data object is described by an associated description 00044 * vector. This vector stores the information required to establish 00045 * the mapping between an object element and its corresponding process 00046 * and memory location. 00047 * 00048 * Let A be a generic term for any 2D block cyclicly distributed array. 00049 * Such a global array has an associated description vector DESCA. 00050 * In the following comments, the character _ should be read as 00051 * "of the global array". 00052 * 00053 * NOTATION STORED IN EXPLANATION 00054 * --------------- -------------- -------------------------------------- 00055 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 00056 * DTYPE_A = 1. 00057 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 00058 * the BLACS process grid A is distribu- 00059 * ted over. The context itself is glo- 00060 * bal, but the handle (the integer 00061 * value) may vary. 00062 * M_A (global) DESCA( M_ ) The number of rows in the global 00063 * array A. 00064 * N_A (global) DESCA( N_ ) The number of columns in the global 00065 * array A. 00066 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 00067 * the rows of the array. 00068 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 00069 * the columns of the array. 00070 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 00071 * row of the array A is distributed. 00072 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the 00073 * first column of the array A is 00074 * distributed. 00075 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 00076 * array. LLD_A >= MAX(1,LOCr(M_A)). 00077 * 00078 * Let K be the number of rows or columns of a distributed matrix, 00079 * and assume that its process grid has dimension p x q. 00080 * LOCr( K ) denotes the number of elements of K that a process 00081 * would receive if K were distributed over the p processes of its 00082 * process column. 00083 * Similarly, LOCc( K ) denotes the number of elements of K that a 00084 * process would receive if K were distributed over the q processes of 00085 * its process row. 00086 * The values of LOCr() and LOCc() may be determined via a call to the 00087 * ScaLAPACK tool function, NUMROC: 00088 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 00089 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 00090 * An upper bound for these quantities may be computed by: 00091 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 00092 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 00093 * 00094 * Arguments 00095 * ========= 00096 * 00097 * IBTYPE (global input) INTEGER 00098 * = 1: compute inv(U**T)*sub( A )*inv(U) or 00099 * inv(L)*sub( A )*inv(L**T); 00100 * = 2 or 3: compute U*sub( A )*U**T or L**T*sub( A )*L. 00101 * 00102 * UPLO (global input) CHARACTER 00103 * = 'U': Upper triangle of sub( A ) is stored and sub( B ) is 00104 * factored as U**T*U; 00105 * = 'L': Lower triangle of sub( A ) is stored and sub( B ) is 00106 * factored as L*L**T. 00107 * 00108 * N (global input) INTEGER 00109 * The order of the matrices sub( A ) and sub( B ). N >= 0. 00110 * 00111 * A (local input/local output) REAL pointer into the 00112 * local memory to an array of dimension (LLD_A, LOCc(JA+N-1)). 00113 * On entry, this array contains the local pieces of the 00114 * N-by-N symmetric distributed matrix sub( A ). If UPLO = 'U', 00115 * the leading N-by-N upper triangular part of sub( A ) contains 00116 * the upper triangular part of the matrix, and its strictly 00117 * lower triangular part is not referenced. If UPLO = 'L', the 00118 * leading N-by-N lower triangular part of sub( A ) contains 00119 * the lower triangular part of the matrix, and its strictly 00120 * upper triangular part is not referenced. 00121 * 00122 * On exit, if INFO = 0, the transformed matrix, stored in the 00123 * same format as sub( A ). 00124 * 00125 * IA (global input) INTEGER 00126 * A's global row index, which points to the beginning of the 00127 * submatrix which is to be operated on. 00128 * 00129 * JA (global input) INTEGER 00130 * A's global column index, which points to the beginning of 00131 * the submatrix which is to be operated on. 00132 * 00133 * DESCA (global and local input) INTEGER array of dimension DLEN_. 00134 * The array descriptor for the distributed matrix A. 00135 * 00136 * B (local input) REAL pointer into the local memory 00137 * to an array of dimension (LLD_B, LOCc(JB+N-1)). On entry, 00138 * this array contains the local pieces of the triangular factor 00139 * from the Cholesky factorization of sub( B ), as returned by 00140 * PSPOTRF. 00141 * 00142 * IB (global input) INTEGER 00143 * B's global row index, which points to the beginning of the 00144 * submatrix which is to be operated on. 00145 * 00146 * JB (global input) INTEGER 00147 * B's global column index, which points to the beginning of 00148 * the submatrix which is to be operated on. 00149 * 00150 * DESCB (global and local input) INTEGER array of dimension DLEN_. 00151 * The array descriptor for the distributed matrix B. 00152 * 00153 * INFO (global output) INTEGER 00154 * = 0: successful exit 00155 * < 0: If the i-th argument is an array and the j-entry had 00156 * an illegal value, then INFO = -(i*100+j), if the i-th 00157 * argument is a scalar and had an illegal value, then 00158 * INFO = -i. 00159 * 00160 * ===================================================================== 00161 * 00162 * .. Parameters .. 00163 INTEGER BLOCK_CYCLIC_2D, DLEN_, DTYPE_, CTXT_, M_, N_, 00164 $ MB_, NB_, RSRC_, CSRC_, LLD_ 00165 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 00166 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 00167 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 00168 REAL ONE, HALF 00169 PARAMETER ( ONE = 1.0E+0, HALF = 0.5E+0 ) 00170 * .. 00171 * .. Local Scalars .. 00172 LOGICAL UPPER 00173 INTEGER IACOL, IAROW, IBCOL, IBROW, ICOFFA, ICOFFB, 00174 $ ICTXT, IIA, IIB, IOFFA, IOFFB, IROFFA, IROFFB, 00175 $ JJA, JJB, K, LDA, LDB, MYCOL, MYROW, NPCOL, 00176 $ NPROW 00177 REAL AKK, BKK, CT 00178 * .. 00179 * .. External Subroutines .. 00180 EXTERNAL BLACS_EXIT, BLACS_GRIDINFO, CHK1MAT, INFOG2L, 00181 $ PXERBLA, SAXPY, SSCAL, SSYR2, STRMV, STRSV 00182 * .. 00183 * .. Intrinsic Functions .. 00184 INTRINSIC MOD 00185 * .. 00186 * .. External Functions .. 00187 LOGICAL LSAME 00188 INTEGER INDXG2P 00189 EXTERNAL LSAME, INDXG2P 00190 * .. 00191 * .. Executable Statements .. 00192 * This is just to keep ftnchek happy 00193 IF( BLOCK_CYCLIC_2D*CSRC_*CTXT_*DLEN_*DTYPE_*LLD_*MB_*M_*NB_*N_* 00194 $ RSRC_.LT.0 )RETURN 00195 * 00196 * Get grid parameters 00197 * 00198 ICTXT = DESCA( CTXT_ ) 00199 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 00200 * 00201 * Test the input parameters. 00202 * 00203 INFO = 0 00204 IF( NPROW.EQ.-1 ) THEN 00205 INFO = -( 700+CTXT_ ) 00206 ELSE 00207 UPPER = LSAME( UPLO, 'U' ) 00208 CALL CHK1MAT( N, 3, N, 3, IA, JA, DESCA, 7, INFO ) 00209 CALL CHK1MAT( N, 3, N, 3, IB, JB, DESCB, 11, INFO ) 00210 IF( INFO.EQ.0 ) THEN 00211 IAROW = INDXG2P( IA, DESCA( MB_ ), MYROW, DESCA( RSRC_ ), 00212 $ NPROW ) 00213 IBROW = INDXG2P( IB, DESCB( MB_ ), MYROW, DESCB( RSRC_ ), 00214 $ NPROW ) 00215 IACOL = INDXG2P( JA, DESCA( NB_ ), MYCOL, DESCA( CSRC_ ), 00216 $ NPCOL ) 00217 IBCOL = INDXG2P( JB, DESCB( NB_ ), MYCOL, DESCB( CSRC_ ), 00218 $ NPCOL ) 00219 IROFFA = MOD( IA-1, DESCA( MB_ ) ) 00220 ICOFFA = MOD( JA-1, DESCA( NB_ ) ) 00221 IROFFB = MOD( IB-1, DESCB( MB_ ) ) 00222 ICOFFB = MOD( JB-1, DESCB( NB_ ) ) 00223 IF( IBTYPE.LT.1 .OR. IBTYPE.GT.3 ) THEN 00224 INFO = -1 00225 ELSE IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN 00226 INFO = -2 00227 ELSE IF( N.LT.0 ) THEN 00228 INFO = -3 00229 ELSE IF( N+ICOFFA.GT.DESCA( NB_ ) ) THEN 00230 INFO = -3 00231 ELSE IF( IROFFA.NE.0 ) THEN 00232 INFO = -5 00233 ELSE IF( ICOFFA.NE.0 ) THEN 00234 INFO = -6 00235 ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN 00236 INFO = -( 700+NB_ ) 00237 ELSE IF( IROFFB.NE.0 .OR. IBROW.NE.IAROW ) THEN 00238 INFO = -9 00239 ELSE IF( ICOFFB.NE.0 .OR. IBCOL.NE.IACOL ) THEN 00240 INFO = -10 00241 ELSE IF( DESCB( MB_ ).NE.DESCA( MB_ ) ) THEN 00242 INFO = -( 1100+MB_ ) 00243 ELSE IF( DESCB( NB_ ).NE.DESCA( NB_ ) ) THEN 00244 INFO = -( 1100+NB_ ) 00245 ELSE IF( ICTXT.NE.DESCB( CTXT_ ) ) THEN 00246 INFO = -( 1100+CTXT_ ) 00247 END IF 00248 END IF 00249 END IF 00250 * 00251 IF( INFO.NE.0 ) THEN 00252 CALL PXERBLA( ICTXT, 'PSSYGS2', -INFO ) 00253 CALL BLACS_EXIT( ICTXT ) 00254 RETURN 00255 END IF 00256 * 00257 * Quick return if possible 00258 * 00259 IF( N.EQ.0 .OR. ( MYROW.NE.IAROW .OR. MYCOL.NE.IACOL ) ) 00260 $ RETURN 00261 * 00262 * Compute local information 00263 * 00264 LDA = DESCA( LLD_ ) 00265 LDB = DESCB( LLD_ ) 00266 CALL INFOG2L( IA, JA, DESCA, NPROW, NPCOL, MYROW, MYCOL, IIA, JJA, 00267 $ IAROW, IACOL ) 00268 CALL INFOG2L( IA, JA, DESCA, NPROW, NPCOL, MYROW, MYCOL, IIB, JJB, 00269 $ IBROW, IBCOL ) 00270 * 00271 IF( IBTYPE.EQ.1 ) THEN 00272 * 00273 IF( UPPER ) THEN 00274 * 00275 IOFFA = IIA + JJA*LDA 00276 IOFFB = IIB + JJB*LDB 00277 * 00278 * Compute inv(U')*sub( A )*inv(U) 00279 * 00280 DO 10 K = 1, N 00281 * 00282 * Update the upper triangle of 00283 * A(ia+k-1:ia+n-a,ia+k-1:ia+n-1) 00284 * 00285 AKK = A( IOFFA-LDA ) 00286 BKK = B( IOFFB-LDB ) 00287 AKK = AKK / BKK**2 00288 A( IOFFA-LDA ) = AKK 00289 IF( K.LT.N ) THEN 00290 CALL SSCAL( N-K, ONE / BKK, A( IOFFA ), LDA ) 00291 CT = -HALF*AKK 00292 CALL SAXPY( N-K, CT, B( IOFFB ), LDB, A( IOFFA ), 00293 $ LDA ) 00294 CALL SSYR2( UPLO, N-K, -ONE, A( IOFFA ), LDA, 00295 $ B( IOFFB ), LDB, A( IOFFA+1 ), LDA ) 00296 CALL SAXPY( N-K, CT, B( IOFFB ), LDB, A( IOFFA ), 00297 $ LDA ) 00298 CALL STRSV( UPLO, 'Transpose', 'Non-unit', N-K, 00299 $ B( IOFFB+1 ), LDB, A( IOFFA ), LDA ) 00300 END IF 00301 * 00302 * A( IOFFA ) -> A( K, K+1 ) 00303 * B( IOFFB ) -> B( K, K+1 ) 00304 * 00305 IOFFA = IOFFA + LDA + 1 00306 IOFFB = IOFFB + LDB + 1 00307 * 00308 10 CONTINUE 00309 * 00310 ELSE 00311 * 00312 IOFFA = IIA + 1 + ( JJA-1 )*LDA 00313 IOFFB = IIB + 1 + ( JJB-1 )*LDB 00314 * 00315 * Compute inv(L)*sub( A )*inv(L') 00316 * 00317 DO 20 K = 1, N 00318 * 00319 * Update the lower triangle of 00320 * A(ia+k-1:ia+n-a,ia+k-1:ia+n-1) 00321 * 00322 AKK = A( IOFFA-1 ) 00323 BKK = B( IOFFB-1 ) 00324 AKK = AKK / BKK**2 00325 A( IOFFA-1 ) = AKK 00326 * 00327 IF( K.LT.N ) THEN 00328 CALL SSCAL( N-K, ONE / BKK, A( IOFFA ), 1 ) 00329 CT = -HALF*AKK 00330 CALL SAXPY( N-K, CT, B( IOFFB ), 1, A( IOFFA ), 1 ) 00331 CALL SSYR2( UPLO, N-K, -ONE, A( IOFFA ), 1, 00332 $ B( IOFFB ), 1, A( IOFFA+LDA ), LDA ) 00333 CALL SAXPY( N-K, CT, B( IOFFB ), 1, A( IOFFA ), 1 ) 00334 CALL STRSV( UPLO, 'No transpose', 'Non-unit', N-K, 00335 $ B( IOFFB+LDB ), LDB, A( IOFFA ), 1 ) 00336 END IF 00337 * 00338 * A( IOFFA ) -> A( K+1, K ) 00339 * B( IOFFB ) -> B( K+1, K ) 00340 * 00341 IOFFA = IOFFA + LDA + 1 00342 IOFFB = IOFFB + LDB + 1 00343 * 00344 20 CONTINUE 00345 * 00346 END IF 00347 * 00348 ELSE 00349 * 00350 IF( UPPER ) THEN 00351 * 00352 IOFFA = IIA + ( JJA-1 )*LDA 00353 IOFFB = IIB + ( JJB-1 )*LDB 00354 * 00355 * Compute U*sub( A )*U' 00356 * 00357 DO 30 K = 1, N 00358 * 00359 * Update the upper triangle of A(ia:ia+k-1,ja:ja+k-1) 00360 * 00361 AKK = A( IOFFA+K-1 ) 00362 BKK = B( IOFFB+K-1 ) 00363 CALL STRMV( UPLO, 'No transpose', 'Non-unit', K-1, 00364 $ B( IIB+( JJB-1 )*LDB ), LDB, A( IOFFA ), 1 ) 00365 CT = HALF*AKK 00366 CALL SAXPY( K-1, CT, B( IOFFB ), 1, A( IOFFA ), 1 ) 00367 CALL SSYR2( UPLO, K-1, ONE, A( IOFFA ), 1, B( IOFFB ), 1, 00368 $ A( IIA+( JJA-1 )*LDA ), LDA ) 00369 CALL SAXPY( K-1, CT, B( IOFFB ), 1, A( IOFFA ), 1 ) 00370 CALL SSCAL( K-1, BKK, A( IOFFA ), 1 ) 00371 A( IOFFA+K-1 ) = AKK*BKK**2 00372 * 00373 * A( IOFFA ) -> A( 1, K ) 00374 * B( IOFFB ) -> B( 1, K ) 00375 * 00376 IOFFA = IOFFA + LDA 00377 IOFFB = IOFFB + LDB 00378 * 00379 30 CONTINUE 00380 * 00381 ELSE 00382 * 00383 IOFFA = IIA + ( JJA-1 )*LDA 00384 IOFFB = IIB + ( JJB-1 )*LDB 00385 * 00386 * Compute L'*sub( A )*L 00387 * 00388 DO 40 K = 1, N 00389 * 00390 * Update the lower triangle of A(ia:ia+k-1,ja:ja+k-1) 00391 * 00392 AKK = A( IOFFA+( K-1 )*LDA ) 00393 BKK = B( IOFFB+( K-1 )*LDB ) 00394 CALL STRMV( UPLO, 'Transpose', 'Non-unit', K-1, 00395 $ B( IIB+( JJB-1 )*LDB ), LDB, A( IOFFA ), 00396 $ LDA ) 00397 CT = HALF*AKK 00398 CALL SAXPY( K-1, CT, B( IOFFB ), LDB, A( IOFFA ), LDA ) 00399 CALL SSYR2( UPLO, K-1, ONE, A( IOFFA ), LDA, B( IOFFB ), 00400 $ LDB, A( IIA+( JJA-1 )*LDA ), LDA ) 00401 CALL SAXPY( K-1, CT, B( IOFFB ), LDB, A( IOFFA ), LDA ) 00402 CALL SSCAL( K-1, BKK, A( IOFFA ), LDA ) 00403 A( IOFFA+( K-1 )*LDA ) = AKK*BKK**2 00404 * 00405 * A( IOFFA ) -> A( K, 1 ) 00406 * B( IOFFB ) -> B( K, 1 ) 00407 * 00408 IOFFA = IOFFA + 1 00409 IOFFB = IOFFB + 1 00410 * 00411 40 CONTINUE 00412 * 00413 END IF 00414 * 00415 END IF 00416 * 00417 RETURN 00418 * 00419 * End of PSSYGS2 00420 * 00421 END