|
ScaLAPACK
2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
|
00001 SUBROUTINE PCMAX1( N, AMAX, INDX, X, IX, JX, DESCX, INCX ) 00002 * 00003 * -- ScaLAPACK auxiliary routine (version 1.7) -- 00004 * University of Tennessee, Knoxville, Oak Ridge National Laboratory, 00005 * and University of California, Berkeley. 00006 * May 25, 2001 00007 * 00008 * .. Scalar Arguments .. 00009 INTEGER INDX, INCX, IX, JX, N 00010 COMPLEX AMAX 00011 * .. 00012 * .. Array Arguments .. 00013 INTEGER DESCX( * ) 00014 COMPLEX X( * ) 00015 * .. 00016 * 00017 * Purpose 00018 * ======= 00019 * 00020 * PCMAX1 computes the global index of the maximum element in absolute 00021 * value of a distributed vector sub( X ). The global index is returned 00022 * in INDX and the value is returned in AMAX, 00023 * 00024 * where sub( X ) denotes X(IX:IX+N-1,JX) if INCX = 1, 00025 * X(IX,JX:JX+N-1) if INCX = M_X. 00026 * 00027 * Notes 00028 * ===== 00029 * 00030 * Each global data object is described by an associated description 00031 * vector. This vector stores the information required to establish 00032 * the mapping between an object element and its corresponding process 00033 * and memory location. 00034 * 00035 * Let A be a generic term for any 2D block cyclicly distributed array. 00036 * Such a global array has an associated description vector DESCA. 00037 * In the following comments, the character _ should be read as 00038 * "of the global array". 00039 * 00040 * NOTATION STORED IN EXPLANATION 00041 * --------------- -------------- -------------------------------------- 00042 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 00043 * DTYPE_A = 1. 00044 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 00045 * the BLACS process grid A is distribu- 00046 * ted over. The context itself is glo- 00047 * bal, but the handle (the integer 00048 * value) may vary. 00049 * M_A (global) DESCA( M_ ) The number of rows in the global 00050 * array A. 00051 * N_A (global) DESCA( N_ ) The number of columns in the global 00052 * array A. 00053 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 00054 * the rows of the array. 00055 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 00056 * the columns of the array. 00057 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 00058 * row of the array A is distributed. 00059 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the 00060 * first column of the array A is 00061 * distributed. 00062 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 00063 * array. LLD_A >= MAX(1,LOCr(M_A)). 00064 * 00065 * Let K be the number of rows or columns of a distributed matrix, 00066 * and assume that its process grid has dimension p x q. 00067 * LOCr( K ) denotes the number of elements of K that a process 00068 * would receive if K were distributed over the p processes of its 00069 * process column. 00070 * Similarly, LOCc( K ) denotes the number of elements of K that a 00071 * process would receive if K were distributed over the q processes of 00072 * its process row. 00073 * The values of LOCr() and LOCc() may be determined via a call to the 00074 * ScaLAPACK tool function, NUMROC: 00075 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 00076 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 00077 * An upper bound for these quantities may be computed by: 00078 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 00079 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 00080 * 00081 * Because vectors may be viewed as a subclass of matrices, a 00082 * distributed vector is considered to be a distributed matrix. 00083 * 00084 * When the result of a vector-oriented PBLAS call is a scalar, it will 00085 * be made available only within the scope which owns the vector(s) 00086 * being operated on. Let X be a generic term for the input vector(s). 00087 * Then, the processes which receive the answer will be (note that if 00088 * an operation involves more than one vector, the processes which re- 00089 * ceive the result will be the union of the following calculation for 00090 * each vector): 00091 * 00092 * If N = 1, M_X = 1 and INCX = 1, then one can't determine if a process 00093 * row or process column owns the vector operand, therefore only the 00094 * process of coordinate {RSRC_X, CSRC_X} receives the result; 00095 * 00096 * If INCX = M_X, then sub( X ) is a vector distributed over a process 00097 * row. Each process part of this row receives the result; 00098 * 00099 * If INCX = 1, then sub( X ) is a vector distributed over a process 00100 * column. Each process part of this column receives the result; 00101 * 00102 * Based on PCAMAX from Level 1 PBLAS. The change is to use the 00103 * 'genuine' absolute value. 00104 * 00105 * The serial version was contributed to LAPACK by Nick Higham for use 00106 * with CLACON. 00107 * 00108 * Arguments 00109 * ========= 00110 * 00111 * N (global input) pointer to INTEGER 00112 * The number of components of the distributed vector sub( X ). 00113 * N >= 0. 00114 * 00115 * AMAX (global output) pointer to REAL 00116 * The absolute value of the largest entry of the distributed 00117 * vector sub( X ) only in the scope of sub( X ). 00118 * 00119 * INDX (global output) pointer to INTEGER 00120 * The global index of the element of the distributed vector 00121 * sub( X ) whose real part has maximum absolute value. 00122 * 00123 * X (local input) COMPLEX array containing the local 00124 * pieces of a distributed matrix of dimension of at least 00125 * ( (JX-1)*M_X + IX + ( N - 1 )*abs( INCX ) ) 00126 * This array contains the entries of the distributed vector 00127 * sub( X ). 00128 * 00129 * IX (global input) INTEGER 00130 * The row index in the global array X indicating the first 00131 * row of sub( X ). 00132 * 00133 * JX (global input) INTEGER 00134 * The column index in the global array X indicating the 00135 * first column of sub( X ). 00136 * 00137 * DESCX (global and local input) INTEGER array of dimension DLEN_. 00138 * The array descriptor for the distributed matrix X. 00139 * 00140 * INCX (global input) INTEGER 00141 * The global increment for the elements of X. Only two values 00142 * of INCX are supported in this version, namely 1 and M_X. 00143 * INCX must not be zero. 00144 * 00145 * ===================================================================== 00146 * 00147 * .. Parameters .. 00148 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 00149 $ LLD_, MB_, M_, NB_, N_, RSRC_ 00150 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 00151 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 00152 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 00153 COMPLEX ZERO 00154 PARAMETER ( ZERO = ( 0.0E+0, 0.0E+0 ) ) 00155 * .. 00156 * .. Local Scalars .. 00157 CHARACTER CBTOP, CCTOP, RBTOP, RCTOP 00158 INTEGER ICOFF, ICTXT, IDUMM, IIX, IROFF, IXCOL, IXROW, 00159 $ JJX, LCINDX, LDX, MAXPOS, MYCOL, MYROW, NP, 00160 $ NPCOL, NPROW, NQ 00161 * .. 00162 * .. Local Arrays .. 00163 COMPLEX WORK( 2 ) 00164 * .. 00165 * .. External Subroutines .. 00166 EXTERNAL BLACS_GRIDINFO, CCOMBAMAX1, CGAMX2D, 00167 $ IGEBR2D, IGEBS2D, INFOG2L, PCTREECOMB, 00168 $ PB_TOPGET 00169 * .. 00170 * .. External Functions .. 00171 LOGICAL LSAME 00172 INTEGER ICMAX1, INDXL2G, NUMROC 00173 EXTERNAL ICMAX1, INDXL2G, NUMROC 00174 * .. 00175 * .. Intrinsic Functions .. 00176 INTRINSIC ABS, CMPLX, MOD, NINT, REAL 00177 * .. 00178 * .. Executable Statements .. 00179 * 00180 * Get grid parameters 00181 * 00182 ICTXT = DESCX( CTXT_ ) 00183 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 00184 * 00185 * Quick return if possible. 00186 * 00187 INDX = 0 00188 AMAX = ZERO 00189 IF( N.LE.0 ) 00190 $ RETURN 00191 * 00192 * Retrieve local information for vector X. 00193 * 00194 LDX = DESCX( LLD_ ) 00195 CALL INFOG2L( IX, JX, DESCX, NPROW, NPCOL, MYROW, MYCOL, IIX, JJX, 00196 $ IXROW, IXCOL ) 00197 * 00198 IF( INCX.EQ.1 .AND. DESCX( M_ ).EQ.1 .AND. N.EQ.1 ) THEN 00199 INDX = JX 00200 AMAX = X( IIX+(JJX-1)*LDX ) 00201 RETURN 00202 END IF 00203 * 00204 * Find the maximum value and its index 00205 * 00206 IF( INCX.EQ.DESCX( M_ ) ) THEN 00207 * 00208 IF( MYROW.EQ.IXROW ) THEN 00209 * 00210 ICOFF = MOD( JX-1, DESCX( NB_ ) ) 00211 NQ = NUMROC( N+ICOFF, DESCX( NB_ ), MYCOL, IXCOL, NPCOL ) 00212 IF( MYCOL.EQ.IXCOL ) 00213 $ NQ = NQ-ICOFF 00214 * 00215 CALL PB_TOPGET( ICTXT, 'Broadcast', 'Rowwise', RBTOP ) 00216 * 00217 IF( LSAME( RBTOP, ' ' ) ) THEN 00218 * 00219 IF( NQ.GT.0 ) THEN 00220 LCINDX = JJX-1+ICMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX ) 00221 WORK( 1 ) = X( IIX+(LCINDX-1)*LDX ) 00222 WORK( 2 ) = CMPLX( REAL( INDXL2G( LCINDX, $ DESCX( NB_ ), MYCOL, DESCX( CSRC_ ), NPCOL ) ) ) 00223 ELSE 00224 WORK( 1 ) = ZERO 00225 WORK( 2 ) = ZERO 00226 END IF 00227 * 00228 CALL PCTREECOMB( ICTXT, 'Row', 2, WORK, -1, MYCOL, 00229 $ CCOMBAMAX1 ) 00230 * 00231 AMAX = WORK( 1 ) 00232 IF( AMAX.EQ.ZERO ) THEN 00233 INDX = JX 00234 ELSE 00235 INDX = NINT( REAL( WORK( 2 ) ) ) 00236 END IF 00237 * 00238 ELSE 00239 * 00240 CALL PB_TOPGET( ICTXT, 'Combine', 'Rowwise', RCTOP ) 00241 * 00242 IF( NQ.GT.0 ) THEN 00243 LCINDX = JJX-1+ICMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX ) 00244 AMAX = X( IIX + (LCINDX-1)*LDX ) 00245 ELSE 00246 AMAX = ZERO 00247 END IF 00248 * 00249 * Find the maximum value 00250 * 00251 CALL CGAMX2D( ICTXT, 'Rowwise', RCTOP, 1, 1, AMAX, 1, 00252 $ IDUMM, MAXPOS, 1, -1, MYROW ) 00253 * 00254 IF( AMAX.NE.ZERO ) THEN 00255 * 00256 * Broadcast corresponding global index 00257 * 00258 IF( MYCOL.EQ.MAXPOS ) THEN 00259 INDX = INDXL2G( LCINDX, DESCX( NB_ ), MYCOL, 00260 $ DESCX( CSRC_ ), NPCOL ) 00261 CALL IGEBS2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX, 00262 $ 1 ) 00263 ELSE 00264 CALL IGEBR2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX, 00265 $ 1, MYROW, MAXPOS ) 00266 END IF 00267 * 00268 ELSE 00269 * 00270 INDX = JX 00271 * 00272 END IF 00273 * 00274 END IF 00275 * 00276 END IF 00277 * 00278 ELSE 00279 * 00280 IF( MYCOL.EQ.IXCOL ) THEN 00281 * 00282 IROFF = MOD( IX-1, DESCX( MB_ ) ) 00283 NP = NUMROC( N+IROFF, DESCX( MB_ ), MYROW, IXROW, NPROW ) 00284 IF( MYROW.EQ.IXROW ) 00285 $ NP = NP-IROFF 00286 * 00287 CALL PB_TOPGET( ICTXT, 'Broadcast', 'Columnwise', CBTOP ) 00288 * 00289 IF( LSAME( CBTOP, ' ' ) ) THEN 00290 * 00291 IF( NP.GT.0 ) THEN 00292 LCINDX = IIX-1+ICMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 ) 00293 WORK( 1 ) = X( LCINDX + (JJX-1)*LDX ) 00294 WORK( 2 ) = CMPLX( REAL( INDXL2G( LCINDX, $ DESCX( MB_ ), MYROW, DESCX( RSRC_ ), NPROW ) ) ) 00295 ELSE 00296 WORK( 1 ) = ZERO 00297 WORK( 2 ) = ZERO 00298 END IF 00299 * 00300 CALL PCTREECOMB( ICTXT, 'Column', 2, WORK, -1, MYCOL, 00301 $ CCOMBAMAX1 ) 00302 * 00303 AMAX = WORK( 1 ) 00304 IF( AMAX.EQ.ZERO ) THEN 00305 INDX = IX 00306 ELSE 00307 INDX = NINT( REAL( WORK( 2 ) ) ) 00308 END IF 00309 * 00310 ELSE 00311 * 00312 CALL PB_TOPGET( ICTXT, 'Combine', 'Columnwise', CCTOP ) 00313 * 00314 IF( NP.GT.0 ) THEN 00315 LCINDX = IIX-1+ICMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 ) 00316 AMAX = X( LCINDX + (JJX-1)*LDX ) 00317 ELSE 00318 AMAX = ZERO 00319 END IF 00320 * 00321 * Find the maximum value 00322 * 00323 CALL CGAMX2D( ICTXT, 'Columnwise', CCTOP, 1, 1, AMAX, 1, 00324 $ MAXPOS, IDUMM, 1, -1, MYCOL ) 00325 * 00326 IF( AMAX.NE.ZERO ) THEN 00327 * 00328 * Broadcast corresponding global index 00329 * 00330 IF( MYROW.EQ.MAXPOS ) THEN 00331 INDX = INDXL2G( LCINDX, DESCX( MB_ ), MYROW, 00332 $ DESCX( RSRC_ ), NPROW ) 00333 CALL IGEBS2D( ICTXT, 'Columnwise', CBTOP, 1, 1, 00334 $ INDX, 1 ) 00335 ELSE 00336 CALL IGEBR2D( ICTXT, 'Columnwise', CBTOP, 1, 1, 00337 $ INDX, 1, MAXPOS, MYCOL ) 00338 END IF 00339 * 00340 ELSE 00341 * 00342 INDX = IX 00343 * 00344 END IF 00345 * 00346 END IF 00347 * 00348 END IF 00349 * 00350 END IF 00351 * 00352 RETURN 00353 * 00354 * End of PCMAX1 00355 * 00356 END 00357 * 00358 SUBROUTINE CCOMBAMAX1 ( V1, V2 ) 00359 * 00360 * -- ScaLAPACK auxiliary routine (version 1.7) -- 00361 * University of Tennessee, Knoxville, Oak Ridge National Laboratory, 00362 * and University of California, Berkeley. 00363 * May 1, 1997 00364 * 00365 * .. Array Arguments .. 00366 COMPLEX V1( 2 ), V2( 2 ) 00367 * .. 00368 * 00369 * Purpose 00370 * ======= 00371 * 00372 * CCOMBAMAX1 finds the element having maximum real part absolute 00373 * value as well as its corresponding globl index. 00374 * 00375 * Arguments 00376 * ========= 00377 * 00378 * V1 (local input/local output) COMPLEX array of 00379 * dimension 2. The first maximum absolute value element and 00380 * its global index. V1(1) = AMAX, V1(2) = INDX. 00381 * 00382 * V2 (local input) COMPLEX array of dimension 2. 00383 * The second maximum absolute value element and its global 00384 * index. V2(1) = AMAX, V2(2) = INDX. 00385 * 00386 * ===================================================================== 00387 * 00388 * .. Intrinsic Functions .. 00389 INTRINSIC ABS, REAL 00390 * .. 00391 * .. Executable Statements .. 00392 * 00393 IF( ABS( REAL( V1( 1 ) ) ).LT.ABS( REAL( V2( 1 ) ) ) ) THEN 00394 V1( 1 ) = V2( 1 ) 00395 V1( 2 ) = V2( 2 ) 00396 END IF 00397 * 00398 RETURN 00399 * 00400 * End of CCOMBAMAX1 00401 * 00402 END 00403