|
ScaLAPACK
2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
|
00001 SUBROUTINE PDLACPY( UPLO, M, N, A, IA, JA, DESCA, B, IB, JB, 00002 $ DESCB ) 00003 * 00004 * -- ScaLAPACK auxiliary routine (version 1.7) -- 00005 * University of Tennessee, Knoxville, Oak Ridge National Laboratory, 00006 * and University of California, Berkeley. 00007 * May 1, 1997 00008 * 00009 * .. Scalar Arguments .. 00010 CHARACTER UPLO 00011 INTEGER IA, IB, JA, JB, M, N 00012 * .. 00013 * .. Array Arguments .. 00014 INTEGER DESCA( * ), DESCB( * ) 00015 DOUBLE PRECISION A( * ), B( * ) 00016 * .. 00017 * 00018 * Purpose 00019 * ======= 00020 * 00021 * PDLACPY copies all or part of a distributed matrix A to another 00022 * distributed matrix B. No communication is performed, PDLACPY 00023 * performs a local copy sub( A ) := sub( B ), where sub( A ) denotes 00024 * A(IA:IA+M-1,JA:JA+N-1) and sub( B ) denotes B(IB:IB+M-1,JB:JB+N-1). 00025 * 00026 * Notes 00027 * ===== 00028 * 00029 * Each global data object is described by an associated description 00030 * vector. This vector stores the information required to establish 00031 * the mapping between an object element and its corresponding process 00032 * and memory location. 00033 * 00034 * Let A be a generic term for any 2D block cyclicly distributed array. 00035 * Such a global array has an associated description vector DESCA. 00036 * In the following comments, the character _ should be read as 00037 * "of the global array". 00038 * 00039 * NOTATION STORED IN EXPLANATION 00040 * --------------- -------------- -------------------------------------- 00041 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 00042 * DTYPE_A = 1. 00043 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 00044 * the BLACS process grid A is distribu- 00045 * ted over. The context itself is glo- 00046 * bal, but the handle (the integer 00047 * value) may vary. 00048 * M_A (global) DESCA( M_ ) The number of rows in the global 00049 * array A. 00050 * N_A (global) DESCA( N_ ) The number of columns in the global 00051 * array A. 00052 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 00053 * the rows of the array. 00054 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 00055 * the columns of the array. 00056 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 00057 * row of the array A is distributed. 00058 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the 00059 * first column of the array A is 00060 * distributed. 00061 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 00062 * array. LLD_A >= MAX(1,LOCr(M_A)). 00063 * 00064 * Let K be the number of rows or columns of a distributed matrix, 00065 * and assume that its process grid has dimension p x q. 00066 * LOCr( K ) denotes the number of elements of K that a process 00067 * would receive if K were distributed over the p processes of its 00068 * process column. 00069 * Similarly, LOCc( K ) denotes the number of elements of K that a 00070 * process would receive if K were distributed over the q processes of 00071 * its process row. 00072 * The values of LOCr() and LOCc() may be determined via a call to the 00073 * ScaLAPACK tool function, NUMROC: 00074 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 00075 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 00076 * An upper bound for these quantities may be computed by: 00077 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 00078 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 00079 * 00080 * Arguments 00081 * ========= 00082 * 00083 * UPLO (global input) CHARACTER 00084 * Specifies the part of the distributed matrix sub( A ) to be 00085 * copied: 00086 * = 'U': Upper triangular part is copied; the strictly 00087 * lower triangular part of sub( A ) is not referenced; 00088 * = 'L': Lower triangular part is copied; the strictly 00089 * upper triangular part of sub( A ) is not referenced; 00090 * Otherwise: All of the matrix sub( A ) is copied. 00091 * 00092 * M (global input) INTEGER 00093 * The number of rows to be operated on i.e the number of rows 00094 * of the distributed submatrix sub( A ). M >= 0. 00095 * 00096 * N (global input) INTEGER 00097 * The number of columns to be operated on i.e the number of 00098 * columns of the distributed submatrix sub( A ). N >= 0. 00099 * 00100 * A (local input) DOUBLE PRECISION pointer into the local memory 00101 * to an array of dimension (LLD_A, LOCc(JA+N-1) ). This array 00102 * contains the local pieces of the distributed matrix sub( A ) 00103 * to be copied from. 00104 * 00105 * IA (global input) INTEGER 00106 * The row index in the global array A indicating the first 00107 * row of sub( A ). 00108 * 00109 * JA (global input) INTEGER 00110 * The column index in the global array A indicating the 00111 * first column of sub( A ). 00112 * 00113 * DESCA (global and local input) INTEGER array of dimension DLEN_. 00114 * The array descriptor for the distributed matrix A. 00115 * 00116 * B (local output) DOUBLE PRECISION pointer into the local memory 00117 * to an array of dimension (LLD_B, LOCc(JB+N-1) ). This array 00118 * contains on exit the local pieces of the distributed matrix 00119 * sub( B ) set as follows: 00120 * 00121 * if UPLO = 'U', B(IB+i-1,JB+j-1) = A(IA+i-1,JA+j-1), 00122 * 1<=i<=j, 1<=j<=N; 00123 * if UPLO = 'L', B(IB+i-1,JB+j-1) = A(IA+i-1,JA+j-1), 00124 * j<=i<=M, 1<=j<=N; 00125 * otherwise, B(IB+i-1,JB+j-1) = A(IA+i-1,JA+j-1), 00126 * 1<=i<=M, 1<=j<=N. 00127 * 00128 * IB (global input) INTEGER 00129 * The row index in the global array B indicating the first 00130 * row of sub( B ). 00131 * 00132 * JB (global input) INTEGER 00133 * The column index in the global array B indicating the 00134 * first column of sub( B ). 00135 * 00136 * DESCB (global and local input) INTEGER array of dimension DLEN_. 00137 * The array descriptor for the distributed matrix B. 00138 * 00139 * ===================================================================== 00140 * 00141 * .. Parameters .. 00142 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 00143 $ LLD_, MB_, M_, NB_, N_, RSRC_ 00144 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 00145 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 00146 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 00147 * .. 00148 * .. Local Scalars .. 00149 INTEGER I, IAA, IBB, IBLK, IN, ITMP, J, JAA, JBB, 00150 $ JBLK, JN, JTMP 00151 * .. 00152 * .. External Subroutines .. 00153 EXTERNAL PDLACP2 00154 * .. 00155 * .. External Functions .. 00156 LOGICAL LSAME 00157 INTEGER ICEIL 00158 EXTERNAL ICEIL, LSAME 00159 * .. 00160 * .. Intrinsic Functions .. 00161 INTRINSIC MIN, MOD 00162 * .. 00163 * .. Executable Statements .. 00164 * 00165 IF( M.EQ.0 .OR. N.EQ.0 ) 00166 $ RETURN 00167 * 00168 IN = MIN( ICEIL( IA, DESCA( MB_ ) ) * DESCA( MB_ ), IA+M-1 ) 00169 JN = MIN( ICEIL( JA, DESCA( NB_ ) ) * DESCA( NB_ ), JA+N-1 ) 00170 * 00171 IF( M.LE.( DESCA( MB_ ) - MOD( IA-1, DESCA( MB_ ) ) ) .OR. 00172 $ N.LE.( DESCA( NB_ ) - MOD( JA-1, DESCA( NB_ ) ) ) ) THEN 00173 CALL PDLACP2( UPLO, M, N, A, IA, JA, DESCA, 00174 $ B, IB, JB, DESCB ) 00175 ELSE 00176 * 00177 IF( LSAME( UPLO, 'U' ) ) THEN 00178 CALL PDLACP2( UPLO, IN-IA+1, N, A, IA, JA, DESCA, 00179 $ B, IB, JB, DESCB ) 00180 DO 10 I = IN+1, IA+M-1, DESCA( MB_ ) 00181 ITMP = I-IA 00182 IBLK = MIN( DESCA( MB_ ), M-ITMP ) 00183 IBB = IB + ITMP 00184 JBB = JB + ITMP 00185 JAA = JA + ITMP 00186 CALL PDLACP2( UPLO, IBLK, N-ITMP, A, I, JAA, DESCA, 00187 $ B, IBB, JBB, DESCB ) 00188 10 CONTINUE 00189 ELSE IF( LSAME( UPLO, 'L' ) ) THEN 00190 CALL PDLACP2( UPLO, M, JN-JA+1, A, IA, JA, DESCA, 00191 $ B, IB, JB, DESCB ) 00192 DO 20 J = JN+1, JA+N-1, DESCA( NB_ ) 00193 JTMP = J-JA 00194 JBLK = MIN( DESCA( NB_ ), N-JTMP ) 00195 IBB = IB + JTMP 00196 JBB = JB + JTMP 00197 IAA = IA + JTMP 00198 CALL PDLACP2( UPLO, M-JTMP, JBLK, A, IAA, J, DESCA, 00199 $ B, IBB, JBB, DESCB ) 00200 20 CONTINUE 00201 ELSE 00202 IF( M.LE.N ) THEN 00203 CALL PDLACP2( UPLO, IN-IA+1, N, A, IA, JA, DESCA, 00204 $ B, IB, JB, DESCB ) 00205 DO 30 I = IN+1, IA+M-1, DESCA( MB_ ) 00206 ITMP = I-IA 00207 IBLK = MIN( DESCA( MB_ ), M-ITMP ) 00208 IBB = IB+ITMP 00209 CALL PDLACP2( UPLO, IBLK, N, A, I, JA, DESCA, 00210 $ B, IBB, JB, DESCB ) 00211 30 CONTINUE 00212 ELSE 00213 CALL PDLACP2( UPLO, M, JN-JA+1, A, IA, JA, DESCA, 00214 $ B, IB, JB, DESCB ) 00215 DO 40 J = JN+1, JA+N-1, DESCA( NB_ ) 00216 JTMP = J-JA 00217 JBLK = MIN( DESCA( NB_ ), N-JTMP ) 00218 JBB = JB+JTMP 00219 CALL PDLACP2( UPLO, M, JBLK, A, IA, J, DESCA, 00220 $ B, IB, JBB, DESCB ) 00221 40 CONTINUE 00222 END IF 00223 END IF 00224 * 00225 END IF 00226 * 00227 RETURN 00228 * 00229 * End of PDLACPY 00230 * 00231 END