|
ScaLAPACK
2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
|
00001 SUBROUTINE PZLASCL( TYPE, CFROM, CTO, M, N, A, IA, JA, DESCA, 00002 $ INFO ) 00003 * 00004 * -- ScaLAPACK auxiliary routine (version 1.7) -- 00005 * University of Tennessee, Knoxville, Oak Ridge National Laboratory, 00006 * and University of California, Berkeley. 00007 * May 1, 1997 00008 * 00009 * .. Scalar Arguments .. 00010 CHARACTER TYPE 00011 INTEGER IA, INFO, JA, M, N 00012 DOUBLE PRECISION CFROM, CTO 00013 * .. 00014 * .. Array Arguments .. 00015 INTEGER DESCA( * ) 00016 COMPLEX*16 A( * ) 00017 * .. 00018 * 00019 * Purpose 00020 * ======= 00021 * 00022 * PZLASCL multiplies the M-by-N complex distributed matrix sub( A ) 00023 * denoting A(IA:IA+M-1,JA:JA+N-1) by the real scalar CTO/CFROM. This 00024 * is done without over/underflow as long as the final result 00025 * CTO * A(I,J) / CFROM does not over/underflow. TYPE specifies that 00026 * sub( A ) may be full, upper triangular, lower triangular or upper 00027 * Hessenberg. 00028 * 00029 * Notes 00030 * ===== 00031 * 00032 * Each global data object is described by an associated description 00033 * vector. This vector stores the information required to establish 00034 * the mapping between an object element and its corresponding process 00035 * and memory location. 00036 * 00037 * Let A be a generic term for any 2D block cyclicly distributed array. 00038 * Such a global array has an associated description vector DESCA. 00039 * In the following comments, the character _ should be read as 00040 * "of the global array". 00041 * 00042 * NOTATION STORED IN EXPLANATION 00043 * --------------- -------------- -------------------------------------- 00044 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 00045 * DTYPE_A = 1. 00046 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 00047 * the BLACS process grid A is distribu- 00048 * ted over. The context itself is glo- 00049 * bal, but the handle (the integer 00050 * value) may vary. 00051 * M_A (global) DESCA( M_ ) The number of rows in the global 00052 * array A. 00053 * N_A (global) DESCA( N_ ) The number of columns in the global 00054 * array A. 00055 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 00056 * the rows of the array. 00057 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 00058 * the columns of the array. 00059 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 00060 * row of the array A is distributed. 00061 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the 00062 * first column of the array A is 00063 * distributed. 00064 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 00065 * array. LLD_A >= MAX(1,LOCr(M_A)). 00066 * 00067 * Let K be the number of rows or columns of a distributed matrix, 00068 * and assume that its process grid has dimension p x q. 00069 * LOCr( K ) denotes the number of elements of K that a process 00070 * would receive if K were distributed over the p processes of its 00071 * process column. 00072 * Similarly, LOCc( K ) denotes the number of elements of K that a 00073 * process would receive if K were distributed over the q processes of 00074 * its process row. 00075 * The values of LOCr() and LOCc() may be determined via a call to the 00076 * ScaLAPACK tool function, NUMROC: 00077 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 00078 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 00079 * An upper bound for these quantities may be computed by: 00080 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 00081 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 00082 * 00083 * Arguments 00084 * ========= 00085 * 00086 * TYPE (global input) CHARACTER 00087 * TYPE indices the storage type of the input distributed 00088 * matrix. 00089 * = 'G': sub( A ) is a full matrix, 00090 * = 'L': sub( A ) is a lower triangular matrix, 00091 * = 'U': sub( A ) is an upper triangular matrix, 00092 * = 'H': sub( A ) is an upper Hessenberg matrix. 00093 * 00094 * CFROM (global input) DOUBLE PRECISION 00095 * CTO (global input) DOUBLE PRECISION 00096 * The distributed matrix sub( A ) is multiplied by CTO/CFROM. 00097 * A(I,J) is computed without over/underflow if the final 00098 * result CTO * A(I,J) / CFROM can be represented without 00099 * over/underflow. CFROM must be nonzero. 00100 * 00101 * M (global input) INTEGER 00102 * The number of rows to be operated on i.e the number of rows 00103 * of the distributed submatrix sub( A ). M >= 0. 00104 * 00105 * N (global input) INTEGER 00106 * The number of columns to be operated on i.e the number of 00107 * columns of the distributed submatrix sub( A ). N >= 0. 00108 * 00109 * A (local input/local output) COMPLEX*16 pointer into the 00110 * local memory to an array of dimension (LLD_A,LOCc(JA+N-1)). 00111 * This array contains the local pieces of the distributed 00112 * matrix sub( A ). On exit, this array contains the local 00113 * pieces of the distributed matrix multiplied by CTO/CFROM. 00114 * 00115 * IA (global input) INTEGER 00116 * The row index in the global array A indicating the first 00117 * row of sub( A ). 00118 * 00119 * JA (global input) INTEGER 00120 * The column index in the global array A indicating the 00121 * first column of sub( A ). 00122 * 00123 * DESCA (global and local input) INTEGER array of dimension DLEN_. 00124 * The array descriptor for the distributed matrix A. 00125 * 00126 * INFO (local output) INTEGER 00127 * = 0: successful exit 00128 * < 0: If the i-th argument is an array and the j-entry had 00129 * an illegal value, then INFO = -(i*100+j), if the i-th 00130 * argument is a scalar and had an illegal value, then 00131 * INFO = -i. 00132 * 00133 * ===================================================================== 00134 * 00135 * .. Parameters .. 00136 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 00137 $ LLD_, MB_, M_, NB_, N_, RSRC_ 00138 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 00139 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 00140 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 00141 DOUBLE PRECISION ONE, ZERO 00142 PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) 00143 * .. 00144 * .. Local Scalars .. 00145 LOGICAL DONE 00146 INTEGER IACOL, IAROW, ICOFFA, ICTXT, ICURCOL, ICURROW, 00147 $ IIA, II, INXTROW, IOFFA, IROFFA, ITYPE, J, JB, 00148 $ JJA, JJ, JN, KK, LDA, LL, MYCOL, MYROW, MP, 00149 $ NPCOL, NPROW, NQ 00150 DOUBLE PRECISION BIGNUM, CFROM1, CFROMC, CTO1, CTOC, MUL, SMLNUM 00151 * .. 00152 * .. External Subroutines .. 00153 EXTERNAL BLACS_GRIDINFO, CHK1MAT, INFOG2L, PXERBLA 00154 * .. 00155 * .. External Functions .. 00156 LOGICAL LSAME, DISNAN 00157 INTEGER ICEIL, NUMROC 00158 DOUBLE PRECISION PDLAMCH 00159 EXTERNAL DISNAN, ICEIL, LSAME, NUMROC, PDLAMCH 00160 * .. 00161 * .. Intrinsic Functions .. 00162 INTRINSIC ABS, MIN, MOD 00163 * .. 00164 * .. Executable Statements .. 00165 * 00166 * Get grid parameters 00167 * 00168 ICTXT = DESCA( CTXT_ ) 00169 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 00170 * 00171 * Test the input parameters 00172 * 00173 IF( NPROW.EQ.-1 ) THEN 00174 INFO = -907 00175 ELSE 00176 INFO = 0 00177 CALL CHK1MAT( M, 4, N, 6, IA, JA, DESCA, 9, INFO ) 00178 IF( INFO.EQ.0 ) THEN 00179 IF( LSAME( TYPE, 'G' ) ) THEN 00180 ITYPE = 0 00181 ELSE IF( LSAME( TYPE, 'L' ) ) THEN 00182 ITYPE = 1 00183 ELSE IF( LSAME( TYPE, 'U' ) ) THEN 00184 ITYPE = 2 00185 ELSE IF( LSAME( TYPE, 'H' ) ) THEN 00186 ITYPE = 3 00187 ELSE 00188 ITYPE = -1 00189 END IF 00190 IF( ITYPE.EQ.-1 ) THEN 00191 INFO = -1 00192 ELSE IF( CFROM.EQ.ZERO .OR. DISNAN(CFROM) ) THEN 00193 INFO = -4 00194 ELSE IF( DISNAN(CTO) ) THEN 00195 INFO = -5 00196 END IF 00197 END IF 00198 END IF 00199 * 00200 IF( INFO.NE.0 ) THEN 00201 CALL PXERBLA( ICTXT, 'PZLASCL', -INFO ) 00202 RETURN 00203 END IF 00204 * 00205 * Quick return if possible 00206 * 00207 IF( N.EQ.0 .OR. M.EQ.0 ) 00208 $ RETURN 00209 * 00210 * Get machine parameters 00211 * 00212 SMLNUM = PDLAMCH( ICTXT, 'S' ) 00213 BIGNUM = ONE / SMLNUM 00214 * 00215 CFROMC = CFROM 00216 CTOC = CTO 00217 * 00218 * Compute local indexes 00219 * 00220 LDA = DESCA( LLD_ ) 00221 IROFFA = MOD( IA-1, DESCA( MB_ ) ) 00222 ICOFFA = MOD( JA-1, DESCA( NB_ ) ) 00223 JN = MIN( ICEIL( JA, DESCA( NB_ ) ) * DESCA( NB_ ), JA+N-1 ) 00224 CALL INFOG2L( IA, JA, DESCA, NPROW, NPCOL, MYROW, MYCOL, IIA, JJA, 00225 $ IAROW, IACOL ) 00226 MP = NUMROC( M+IROFFA, DESCA( MB_ ), MYROW, IAROW, NPROW ) 00227 IF( MYROW.EQ.IAROW ) 00228 $ MP = MP - IROFFA 00229 NQ = NUMROC( N+ICOFFA, DESCA( NB_ ), MYCOL, IACOL, NPCOL ) 00230 IF( MYCOL.EQ.IACOL ) 00231 $ NQ = NQ - ICOFFA 00232 * 00233 10 CONTINUE 00234 CFROM1 = CFROMC*SMLNUM 00235 IF( CFROM1.EQ.CFROMC ) THEN 00236 ! CFROMC is an inf. Multiply by a correctly signed zero for 00237 ! finite CTOC, or a NaN if CTOC is infinite. 00238 MUL = CTOC / CFROMC 00239 DONE = .TRUE. 00240 CTO1 = CTOC 00241 ELSE 00242 CTO1 = CTOC / BIGNUM 00243 IF( CTO1.EQ.CTOC ) THEN 00244 ! CTOC is either 0 or an inf. In both cases, CTOC itself 00245 ! serves as the correct multiplication factor. 00246 MUL = CTOC 00247 DONE = .TRUE. 00248 CFROMC = ONE 00249 ELSE IF( ABS( CFROM1 ).GT.ABS( CTOC ) .AND. CTOC.NE.ZERO ) THEN 00250 MUL = SMLNUM 00251 DONE = .FALSE. 00252 CFROMC = CFROM1 00253 ELSE IF( ABS( CTO1 ).GT.ABS( CFROMC ) ) THEN 00254 MUL = BIGNUM 00255 DONE = .FALSE. 00256 CTOC = CTO1 00257 ELSE 00258 MUL = CTOC / CFROMC 00259 DONE = .TRUE. 00260 END IF 00261 END IF 00262 * 00263 IOFFA = ( JJA - 1 ) * LDA 00264 ICURROW = IAROW 00265 ICURCOL = IACOL 00266 * 00267 IF( ITYPE.EQ.0 ) THEN 00268 * 00269 * Full matrix 00270 * 00271 DO 30 JJ = JJA, JJA+NQ-1 00272 DO 20 II = IIA, IIA+MP-1 00273 A( IOFFA+II ) = A( IOFFA+II ) * MUL 00274 20 CONTINUE 00275 IOFFA = IOFFA + LDA 00276 30 CONTINUE 00277 * 00278 ELSE IF( ITYPE.EQ.1 ) THEN 00279 * 00280 * Lower triangular matrix 00281 * 00282 II = IIA 00283 JJ = JJA 00284 JB = JN-JA+1 00285 * 00286 IF( MYCOL.EQ.ICURCOL ) THEN 00287 IF( MYROW.EQ.ICURROW ) THEN 00288 DO 50 LL = JJ, JJ + JB -1 00289 DO 40 KK = II+LL-JJ, IIA+MP-1 00290 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00291 40 CONTINUE 00292 IOFFA = IOFFA + LDA 00293 50 CONTINUE 00294 ELSE 00295 DO 70 LL = JJ, JJ + JB -1 00296 DO 60 KK = II, IIA+MP-1 00297 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00298 60 CONTINUE 00299 IOFFA = IOFFA + LDA 00300 70 CONTINUE 00301 END IF 00302 JJ = JJ + JB 00303 END IF 00304 * 00305 IF( MYROW.EQ.ICURROW ) 00306 $ II = II + JB 00307 ICURROW = MOD( ICURROW+1, NPROW ) 00308 ICURCOL = MOD( ICURCOL+1, NPCOL ) 00309 * 00310 * Loop over remaining block of columns 00311 * 00312 DO 120 J = JN+1, JA+N-1, DESCA( NB_ ) 00313 JB = MIN( JA+N-J, DESCA( NB_ ) ) 00314 * 00315 IF( MYCOL.EQ.ICURCOL ) THEN 00316 IF( MYROW.EQ.ICURROW ) THEN 00317 DO 90 LL = JJ, JJ + JB -1 00318 DO 80 KK = II+LL-JJ, IIA+MP-1 00319 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00320 80 CONTINUE 00321 IOFFA = IOFFA + LDA 00322 90 CONTINUE 00323 ELSE 00324 DO 110 LL = JJ, JJ + JB -1 00325 DO 100 KK = II, IIA+MP-1 00326 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00327 100 CONTINUE 00328 IOFFA = IOFFA + LDA 00329 110 CONTINUE 00330 END IF 00331 JJ = JJ + JB 00332 END IF 00333 * 00334 IF( MYROW.EQ.ICURROW ) 00335 $ II = II + JB 00336 ICURROW = MOD( ICURROW+1, NPROW ) 00337 ICURCOL = MOD( ICURCOL+1, NPCOL ) 00338 * 00339 120 CONTINUE 00340 * 00341 ELSE IF( ITYPE.EQ.2 ) THEN 00342 * 00343 * Upper triangular matrix 00344 * 00345 II = IIA 00346 JJ = JJA 00347 JB = JN-JA+1 00348 * 00349 IF( MYCOL.EQ.ICURCOL ) THEN 00350 IF( MYROW.EQ.ICURROW ) THEN 00351 DO 140 LL = JJ, JJ + JB -1 00352 DO 130 KK = IIA, MIN(II+LL-JJ,IIA+MP-1) 00353 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00354 130 CONTINUE 00355 IOFFA = IOFFA + LDA 00356 140 CONTINUE 00357 ELSE 00358 DO 160 LL = JJ, JJ + JB -1 00359 DO 150 KK = IIA, MIN(II-1,IIA+MP-1) 00360 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00361 150 CONTINUE 00362 IOFFA = IOFFA + LDA 00363 160 CONTINUE 00364 END IF 00365 JJ = JJ + JB 00366 END IF 00367 * 00368 IF( MYROW.EQ.ICURROW ) 00369 $ II = II + JB 00370 ICURROW = MOD( ICURROW+1, NPROW ) 00371 ICURCOL = MOD( ICURCOL+1, NPCOL ) 00372 * 00373 * Loop over remaining block of columns 00374 * 00375 DO 210 J = JN+1, JA+N-1, DESCA( NB_ ) 00376 JB = MIN( JA+N-J, DESCA( NB_ ) ) 00377 * 00378 IF( MYCOL.EQ.ICURCOL ) THEN 00379 IF( MYROW.EQ.ICURROW ) THEN 00380 DO 180 LL = JJ, JJ + JB -1 00381 DO 170 KK = IIA, MIN(II+LL-JJ,IIA+MP-1) 00382 A( IOFFA+KK ) = A( IOFFA+KK )*MUL 00383 170 CONTINUE 00384 IOFFA = IOFFA + LDA 00385 180 CONTINUE 00386 ELSE 00387 DO 200 LL = JJ, JJ + JB -1 00388 DO 190 KK = IIA, MIN(II-1,IIA+MP-1) 00389 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00390 190 CONTINUE 00391 IOFFA = IOFFA + LDA 00392 200 CONTINUE 00393 END IF 00394 JJ = JJ + JB 00395 END IF 00396 * 00397 IF( MYROW.EQ.ICURROW ) 00398 $ II = II + JB 00399 ICURROW = MOD( ICURROW+1, NPROW ) 00400 ICURCOL = MOD( ICURCOL+1, NPCOL ) 00401 * 00402 210 CONTINUE 00403 * 00404 ELSE IF( ITYPE.EQ.3 ) THEN 00405 * 00406 * Upper Hessenberg matrix 00407 * 00408 II = IIA 00409 JJ = JJA 00410 JB = JN-JA+1 00411 * 00412 * Only one process row 00413 * 00414 IF( NPROW.EQ.1 ) THEN 00415 * 00416 * Handle first block of columns separately 00417 * 00418 IF( MYCOL.EQ.ICURCOL ) THEN 00419 DO 230 LL = JJ, JJ+JB-1 00420 DO 220 KK = IIA, MIN( II+LL-JJ+1, IIA+MP-1 ) 00421 A( IOFFA+KK ) = A( IOFFA+KK )*MUL 00422 220 CONTINUE 00423 IOFFA = IOFFA + LDA 00424 230 CONTINUE 00425 JJ = JJ + JB 00426 END IF 00427 * 00428 ICURCOL = MOD( ICURCOL+1, NPCOL ) 00429 * 00430 * Loop over remaining block of columns 00431 * 00432 DO 260 J = JN+1, JA+N-1, DESCA( NB_ ) 00433 JB = MIN( JA+N-J, DESCA( NB_ ) ) 00434 * 00435 IF( MYCOL.EQ.ICURCOL ) THEN 00436 DO 250 LL = JJ, JJ+JB-1 00437 DO 240 KK = IIA, MIN( II+LL-JJ+1, IIA+MP-1 ) 00438 A( IOFFA+KK ) = A( IOFFA+KK )*MUL 00439 240 CONTINUE 00440 IOFFA = IOFFA + LDA 00441 250 CONTINUE 00442 JJ = JJ + JB 00443 END IF 00444 * 00445 II = II + JB 00446 ICURCOL = MOD( ICURCOL+1, NPCOL ) 00447 * 00448 260 CONTINUE 00449 * 00450 ELSE 00451 * 00452 * Handle first block of columns separately 00453 * 00454 INXTROW = MOD( ICURROW+1, NPROW ) 00455 IF( MYCOL.EQ.ICURCOL ) THEN 00456 IF( MYROW.EQ.ICURROW ) THEN 00457 DO 280 LL = JJ, JJ + JB -1 00458 DO 270 KK = IIA, MIN(II+LL-JJ+1,IIA+MP-1) 00459 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00460 270 CONTINUE 00461 IOFFA = IOFFA + LDA 00462 280 CONTINUE 00463 ELSE 00464 DO 300 LL = JJ, JJ + JB -1 00465 DO 290 KK = IIA, MIN(II-1,IIA+MP-1) 00466 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00467 290 CONTINUE 00468 IOFFA = IOFFA + LDA 00469 300 CONTINUE 00470 IF( MYROW.EQ.INXTROW .AND. II.LE.IIA+MP-1 ) 00471 $ A( II+(JJ+JB-2)*LDA ) = A( II+(JJ+JB-2)*LDA ) * MUL 00472 END IF 00473 JJ = JJ + JB 00474 END IF 00475 * 00476 IF( MYROW.EQ.ICURROW ) 00477 $ II = II + JB 00478 ICURROW = INXTROW 00479 ICURROW = MOD( ICURROW+1, NPROW ) 00480 ICURCOL = MOD( ICURCOL+1, NPCOL ) 00481 * 00482 * Loop over remaining block of columns 00483 * 00484 DO 350 J = JN+1, JA+N-1, DESCA( NB_ ) 00485 JB = MIN( JA+N-J, DESCA( NB_ ) ) 00486 * 00487 IF( MYCOL.EQ.ICURCOL ) THEN 00488 IF( MYROW.EQ.ICURROW ) THEN 00489 DO 320 LL = JJ, JJ + JB -1 00490 DO 310 KK = IIA, MIN( II+LL-JJ+1, IIA+MP-1 ) 00491 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00492 310 CONTINUE 00493 IOFFA = IOFFA + LDA 00494 320 CONTINUE 00495 ELSE 00496 DO 340 LL = JJ, JJ + JB -1 00497 DO 330 KK = IIA, MIN( II-1, IIA+MP-1 ) 00498 A( IOFFA+KK ) = A( IOFFA+KK ) * MUL 00499 330 CONTINUE 00500 IOFFA = IOFFA + LDA 00501 340 CONTINUE 00502 IF( MYROW.EQ.INXTROW .AND. II.LE.IIA+MP-1 ) 00503 $ A( II+(JJ+JB-2)*LDA ) = A( II+(JJ+JB-2)*LDA ) * 00504 $ MUL 00505 END IF 00506 JJ = JJ + JB 00507 END IF 00508 * 00509 IF( MYROW.EQ.ICURROW ) 00510 $ II = II + JB 00511 ICURROW = INXTROW 00512 ICURROW = MOD( ICURROW+1, NPROW ) 00513 ICURCOL = MOD( ICURCOL+1, NPCOL ) 00514 * 00515 350 CONTINUE 00516 * 00517 END IF 00518 * 00519 END IF 00520 * 00521 IF( .NOT.DONE ) 00522 $ GO TO 10 00523 * 00524 RETURN 00525 * 00526 * End of PZLASCL 00527 * 00528 END