|
ScaLAPACK
2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
|
00001 SUBROUTINE PDLACP3( M, I, A, DESCA, B, LDB, II, JJ, REV ) 00002 IMPLICIT NONE 00003 * 00004 * -- ScaLAPACK routine (version 1.7) -- 00005 * University of Tennessee, Knoxville, Oak Ridge National Laboratory, 00006 * and University of California, Berkeley. 00007 * May 25, 2001 00008 * 00009 * .. Scalar Arguments .. 00010 INTEGER I, II, JJ, LDB, M, REV 00011 * .. 00012 * .. Array Arguments .. 00013 INTEGER DESCA( * ) 00014 DOUBLE PRECISION A( * ), B( LDB, * ) 00015 * .. 00016 * 00017 * Purpose 00018 * ======= 00019 * 00020 * PDLACP3 is an auxiliary routine that copies from a global parallel 00021 * array into a local replicated array or vise versa. Notice that 00022 * the entire submatrix that is copied gets placed on one node or 00023 * more. The receiving node can be specified precisely, or all nodes 00024 * can receive, or just one row or column of nodes. 00025 * 00026 * Notes 00027 * ===== 00028 * 00029 * Each global data object is described by an associated description 00030 * vector. This vector stores the information required to establish 00031 * the mapping between an object element and its corresponding process 00032 * and memory location. 00033 * 00034 * Let A be a generic term for any 2D block cyclicly distributed array. 00035 * Such a global array has an associated description vector DESCA. 00036 * In the following comments, the character _ should be read as 00037 * "of the global array". 00038 * 00039 * NOTATION STORED IN EXPLANATION 00040 * --------------- -------------- -------------------------------------- 00041 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 00042 * DTYPE_A = 1. 00043 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 00044 * the BLACS process grid A is distribu- 00045 * ted over. The context itself is glo- 00046 * bal, but the handle (the integer 00047 * value) may vary. 00048 * M_A (global) DESCA( M_ ) The number of rows in the global 00049 * array A. 00050 * N_A (global) DESCA( N_ ) The number of columns in the global 00051 * array A. 00052 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 00053 * the rows of the array. 00054 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 00055 * the columns of the array. 00056 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 00057 * row of the array A is distributed. 00058 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the 00059 * first column of the array A is 00060 * distributed. 00061 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 00062 * array. LLD_A >= MAX(1,LOCr(M_A)). 00063 * 00064 * Let K be the number of rows or columns of a distributed matrix, 00065 * and assume that its process grid has dimension p x q. 00066 * LOCr( K ) denotes the number of elements of K that a process 00067 * would receive if K were distributed over the p processes of its 00068 * process column. 00069 * Similarly, LOCc( K ) denotes the number of elements of K that a 00070 * process would receive if K were distributed over the q processes of 00071 * its process row. 00072 * The values of LOCr() and LOCc() may be determined via a call to the 00073 * ScaLAPACK tool function, NUMROC: 00074 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 00075 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 00076 * An upper bound for these quantities may be computed by: 00077 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 00078 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 00079 * 00080 * Arguments 00081 * ========= 00082 * 00083 * M (global input) INTEGER 00084 * M is the order of the square submatrix that is copied. 00085 * M >= 0. 00086 * Unchanged on exit 00087 * 00088 * I (global input) INTEGER 00089 * A(I,I) is the global location that the copying starts from. 00090 * Unchanged on exit. 00091 * 00092 * A (global input/output) DOUBLE PRECISION array, dimension 00093 * (DESCA(LLD_),*) 00094 * On entry, the parallel matrix to be copied into or from. 00095 * On exit, if REV=1, the copied data. 00096 * Unchanged on exit if REV=0. 00097 * 00098 * DESCA (global and local input) INTEGER array of dimension DLEN_. 00099 * The array descriptor for the distributed matrix A. 00100 * 00101 * B (local input/output) DOUBLE PRECISION array of size (LDB,M) 00102 * If REV=0, this is the global portion of the array 00103 * A(I:I+M-1,I:I+M-1). 00104 * If REV=1, this is the unchanged on exit. 00105 * 00106 * LDB (local input) INTEGER 00107 * The leading dimension of B. 00108 * 00109 * II (global input) INTEGER 00110 * By using REV 0 & 1, data can be sent out and returned again. 00111 * If REV=0, then II is destination row index for the node(s) 00112 * receiving the replicated B. 00113 * If II>=0,JJ>=0, then node (II,JJ) receives the data 00114 * If II=-1,JJ>=0, then all rows in column JJ receive the 00115 * data 00116 * If II>=0,JJ=-1, then all cols in row II receive the data 00117 * If II=-1,JJ=-1, then all nodes receive the data 00118 * If REV<>0, then II is the source row index for the node(s) 00119 * sending the replicated B. 00120 * 00121 * JJ (global input) INTEGER 00122 * Similar description as II above 00123 * 00124 * REV (global input) INTEGER 00125 * Use REV = 0 to send global A into locally replicated B 00126 * (on node (II,JJ)). 00127 * Use REV <> 0 to send locally replicated B from node (II,JJ) 00128 * to its owner (which changes depending on its location in 00129 * A) into the global A. 00130 * 00131 * Implemented by: G. Henry, May 1, 1997 00132 * 00133 * ===================================================================== 00134 * 00135 * .. Parameters .. 00136 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 00137 $ LLD_, MB_, M_, NB_, N_, RSRC_ 00138 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 00139 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 00140 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 00141 DOUBLE PRECISION ZERO 00142 PARAMETER ( ZERO = 0.0D+0 ) 00143 * .. 00144 * .. Local Scalars .. 00145 INTEGER COL, CONTXT, HBL, IAFIRST, ICOL1, ICOL2, IDI, 00146 $ IDJ, IFIN, III, IROW1, IROW2, ISTOP, ISTOPI, 00147 $ ISTOPJ, ITMP, JAFIRST, JJJ, LDA, MYCOL, MYROW, 00148 $ NPCOL, NPROW, ROW 00149 * .. 00150 * .. External Functions .. 00151 INTEGER NUMROC 00152 EXTERNAL NUMROC 00153 * .. 00154 * .. External Subroutines .. 00155 EXTERNAL BLACS_GRIDINFO, DGEBR2D, DGEBS2D, DGERV2D, 00156 $ DGESD2D, INFOG1L 00157 * .. 00158 * .. Intrinsic Functions .. 00159 INTRINSIC MIN, MOD 00160 * .. 00161 * .. Executable Statements .. 00162 * 00163 IF( M.LE.0 ) 00164 $ RETURN 00165 * 00166 HBL = DESCA( MB_ ) 00167 CONTXT = DESCA( CTXT_ ) 00168 LDA = DESCA( LLD_ ) 00169 IAFIRST = DESCA( RSRC_ ) 00170 JAFIRST = DESCA( CSRC_ ) 00171 * 00172 CALL BLACS_GRIDINFO( CONTXT, NPROW, NPCOL, MYROW, MYCOL ) 00173 * 00174 IF( REV.EQ.0 ) THEN 00175 DO 20 IDI = 1, M 00176 DO 10 IDJ = 1, M 00177 B( IDI, IDJ ) = ZERO 00178 10 CONTINUE 00179 20 CONTINUE 00180 END IF 00181 * 00182 IFIN = I + M - 1 00183 * 00184 IF( MOD( I+HBL, HBL ).NE.0 ) THEN 00185 ISTOP = MIN( I+HBL-MOD( I+HBL, HBL ), IFIN ) 00186 ELSE 00187 ISTOP = I 00188 END IF 00189 IDJ = I 00190 ISTOPJ = ISTOP 00191 IF( IDJ.LE.IFIN ) THEN 00192 30 CONTINUE 00193 IDI = I 00194 ISTOPI = ISTOP 00195 IF( IDI.LE.IFIN ) THEN 00196 40 CONTINUE 00197 ROW = MOD( ( IDI-1 ) / HBL + IAFIRST, NPROW ) 00198 COL = MOD( ( IDJ-1 ) / HBL + JAFIRST, NPCOL ) 00199 CALL INFOG1L( IDI, HBL, NPROW, ROW, IAFIRST, IROW1, ITMP ) 00200 IROW2 = NUMROC( ISTOPI, HBL, ROW, IAFIRST, NPROW ) 00201 CALL INFOG1L( IDJ, HBL, NPCOL, COL, JAFIRST, ICOL1, ITMP ) 00202 ICOL2 = NUMROC( ISTOPJ, HBL, COL, JAFIRST, NPCOL ) 00203 IF( ( MYROW.EQ.ROW ) .AND. ( MYCOL.EQ.COL ) ) THEN 00204 IF( ( II.EQ.-1 ) .AND. ( JJ.EQ.-1 ) ) THEN 00205 * 00206 * Send the message to everyone 00207 * 00208 IF( REV.EQ.0 ) THEN 00209 CALL DGEBS2D( CONTXT, 'All', ' ', IROW2-IROW1+1, 00210 $ ICOL2-ICOL1+1, A( ( ICOL1-1 )*LDA+ 00211 $ IROW1 ), LDA ) 00212 END IF 00213 END IF 00214 IF( ( II.EQ.-1 ) .AND. ( JJ.NE.-1 ) ) THEN 00215 * 00216 * Send the message to Column MYCOL which better be JJ 00217 * 00218 IF( REV.EQ.0 ) THEN 00219 CALL DGEBS2D( CONTXT, 'Col', ' ', IROW2-IROW1+1, 00220 $ ICOL2-ICOL1+1, A( ( ICOL1-1 )*LDA+ 00221 $ IROW1 ), LDA ) 00222 END IF 00223 END IF 00224 IF( ( II.NE.-1 ) .AND. ( JJ.EQ.-1 ) ) THEN 00225 * 00226 * Send the message to Row MYROW which better be II 00227 * 00228 IF( REV.EQ.0 ) THEN 00229 CALL DGEBS2D( CONTXT, 'Row', ' ', IROW2-IROW1+1, 00230 $ ICOL2-ICOL1+1, A( ( ICOL1-1 )*LDA+ 00231 $ IROW1 ), LDA ) 00232 END IF 00233 END IF 00234 IF( ( II.NE.-1 ) .AND. ( JJ.NE.-1 ) .AND. 00235 $ ( ( MYROW.NE.II ) .OR. ( MYCOL.NE.JJ ) ) ) THEN 00236 * 00237 * Recv/Send the message to (II,JJ) 00238 * 00239 IF( REV.EQ.0 ) THEN 00240 CALL DGESD2D( CONTXT, IROW2-IROW1+1, ICOL2-ICOL1+1, 00241 $ A( ( ICOL1-1 )*LDA+IROW1 ), LDA, II, 00242 $ JJ ) 00243 ELSE 00244 CALL DGERV2D( CONTXT, IROW2-IROW1+1, ICOL2-ICOL1+1, 00245 $ B( IDI-I+1, IDJ-I+1 ), LDB, II, JJ ) 00246 END IF 00247 END IF 00248 IF( REV.EQ.0 ) THEN 00249 DO 60 JJJ = ICOL1, ICOL2 00250 DO 50 III = IROW1, IROW2 00251 B( IDI+III-IROW1+1-I, IDJ+JJJ-ICOL1+1-I ) 00252 $ = A( ( JJJ-1 )*LDA+III ) 00253 50 CONTINUE 00254 60 CONTINUE 00255 ELSE 00256 DO 80 JJJ = ICOL1, ICOL2 00257 DO 70 III = IROW1, IROW2 00258 A( ( JJJ-1 )*LDA+III ) = B( IDI+III-IROW1+1-I, 00259 $ IDJ+JJJ-ICOL1+1-I ) 00260 70 CONTINUE 00261 80 CONTINUE 00262 END IF 00263 ELSE 00264 IF( ( II.EQ.-1 ) .AND. ( JJ.EQ.-1 ) ) THEN 00265 IF( REV.EQ.0 ) THEN 00266 CALL DGEBR2D( CONTXT, 'All', ' ', IROW2-IROW1+1, 00267 $ ICOL2-ICOL1+1, B( IDI-I+1, IDJ-I+1 ), 00268 $ LDB, ROW, COL ) 00269 END IF 00270 END IF 00271 IF( ( II.EQ.-1 ) .AND. ( JJ.EQ.MYCOL ) ) THEN 00272 IF( REV.EQ.0 ) THEN 00273 CALL DGEBR2D( CONTXT, 'Col', ' ', IROW2-IROW1+1, 00274 $ ICOL2-ICOL1+1, B( IDI-I+1, IDJ-I+1 ), 00275 $ LDB, ROW, COL ) 00276 END IF 00277 END IF 00278 IF( ( II.EQ.MYROW ) .AND. ( JJ.EQ.-1 ) ) THEN 00279 IF( REV.EQ.0 ) THEN 00280 CALL DGEBR2D( CONTXT, 'Row', ' ', IROW2-IROW1+1, 00281 $ ICOL2-ICOL1+1, B( IDI-I+1, IDJ-I+1 ), 00282 $ LDB, ROW, COL ) 00283 END IF 00284 END IF 00285 IF( ( II.EQ.MYROW ) .AND. ( JJ.EQ.MYCOL ) ) THEN 00286 IF( REV.EQ.0 ) THEN 00287 CALL DGERV2D( CONTXT, IROW2-IROW1+1, ICOL2-ICOL1+1, 00288 $ B( IDI-I+1, IDJ-I+1 ), LDB, ROW, 00289 $ COL ) 00290 ELSE 00291 CALL DGESD2D( CONTXT, IROW2-IROW1+1, ICOL2-ICOL1+1, 00292 $ B( IDI-I+1, IDJ-I+1 ), LDB, ROW, 00293 $ COL ) 00294 * CALL DGESD2D(CONTXT, IROW2-IROW1+1, ICOL2-ICOL1+1, 00295 * $ A((ICOL1-1)*LDA+IROW1),LDA, ROW, COL) 00296 END IF 00297 END IF 00298 END IF 00299 IDI = ISTOPI + 1 00300 ISTOPI = MIN( ISTOPI+HBL, IFIN ) 00301 IF( IDI.LE.IFIN ) 00302 $ GO TO 40 00303 END IF 00304 IDJ = ISTOPJ + 1 00305 ISTOPJ = MIN( ISTOPJ+HBL, IFIN ) 00306 IF( IDJ.LE.IFIN ) 00307 $ GO TO 30 00308 END IF 00309 RETURN 00310 * 00311 * End of PDLACP3 00312 * 00313 END