|
ScaLAPACK
2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
|
00001 SUBROUTINE PCLASCHK( SYMM, DIAG, N, NRHS, X, IX, JX, DESCX, 00002 $ IASEED, IA, JA, DESCA, IBSEED, ANORM, RESID, 00003 $ WORK ) 00004 * 00005 * -- ScaLAPACK auxiliary routine (version 1.7) -- 00006 * University of Tennessee, Knoxville, Oak Ridge National Laboratory, 00007 * and University of California, Berkeley. 00008 * May 1, 1997 00009 * 00010 * .. Scalar Arguments .. 00011 CHARACTER DIAG, SYMM 00012 INTEGER IA, IASEED, IBSEED, IX, JA, JX, N, NRHS 00013 REAL ANORM, RESID 00014 * .. 00015 * .. Array Arguments .. 00016 INTEGER DESCA( * ), DESCX( * ) 00017 COMPLEX WORK( * ), X( * ) 00018 * .. 00019 * 00020 * Purpose 00021 * ======= 00022 * 00023 * PCLASCHK computes the residual 00024 * || sub( A )*sub( X ) - B || / (|| sub( A ) ||*|| sub( X ) ||*eps*N) 00025 * to check the accuracy of the factorization and solve steps in the 00026 * LU and Cholesky decompositions, where sub( A ) denotes 00027 * A(IA:IA+N-1,JA,JA+N-1), sub( X ) denotes X(IX:IX+N-1, JX:JX+NRHS-1). 00028 * 00029 * Notes 00030 * ===== 00031 * 00032 * Each global data object is described by an associated description 00033 * vector. This vector stores the information required to establish 00034 * the mapping between an object element and its corresponding process 00035 * and memory location. 00036 * 00037 * Let A be a generic term for any 2D block cyclicly distributed array. 00038 * Such a global array has an associated description vector DESCA. 00039 * In the following comments, the character _ should be read as 00040 * "of the global array". 00041 * 00042 * NOTATION STORED IN EXPLANATION 00043 * --------------- -------------- -------------------------------------- 00044 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 00045 * DTYPE_A = 1. 00046 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 00047 * the BLACS process grid A is distribu- 00048 * ted over. The context itself is glo- 00049 * bal, but the handle (the integer 00050 * value) may vary. 00051 * M_A (global) DESCA( M_ ) The number of rows in the global 00052 * array A. 00053 * N_A (global) DESCA( N_ ) The number of columns in the global 00054 * array A. 00055 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 00056 * the rows of the array. 00057 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 00058 * the columns of the array. 00059 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 00060 * row of the array A is distributed. 00061 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the 00062 * first column of the array A is 00063 * distributed. 00064 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 00065 * array. LLD_A >= MAX(1,LOCr(M_A)). 00066 * 00067 * Let K be the number of rows or columns of a distributed matrix, 00068 * and assume that its process grid has dimension p x q. 00069 * LOCr( K ) denotes the number of elements of K that a process 00070 * would receive if K were distributed over the p processes of its 00071 * process column. 00072 * Similarly, LOCc( K ) denotes the number of elements of K that a 00073 * process would receive if K were distributed over the q processes of 00074 * its process row. 00075 * The values of LOCr() and LOCc() may be determined via a call to the 00076 * ScaLAPACK tool function, NUMROC: 00077 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 00078 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 00079 * An upper bound for these quantities may be computed by: 00080 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 00081 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 00082 * 00083 * Arguments 00084 * ========= 00085 * 00086 * SYMM (global input) CHARACTER 00087 * if SYMM = 'H', sub( A ) is a hermitian distributed matrix, 00088 * otherwise sub( A ) is a general distributed matrix. 00089 * 00090 * DIAG (global input) CHARACTER 00091 * If DIAG = 'D', sub( A ) is diagonally dominant. 00092 * 00093 * N (global input) INTEGER 00094 * The number of columns to be operated on, i.e. the number of 00095 * columns of the distributed submatrix sub( A ). N >= 0. 00096 * 00097 * NRHS (global input) INTEGER 00098 * The number of right-hand-sides, i.e the number of columns 00099 * of the distributed matrix sub( X ). NRHS >= 0. 00100 * 00101 * X (local input) COMPLEX pointer into the local memory 00102 * to an array of dimension (LLD_X,LOCc(JX+NRHS-1). This array 00103 * contains the local pieces of the answer vector(s) sub( X ) of 00104 * sub( A ) sub( X ) - B, split up over a column of processes. 00105 * 00106 * IX (global input) INTEGER 00107 * The row index in the global array X indicating the first 00108 * row of sub( X ). 00109 * 00110 * JX (global input) INTEGER 00111 * The column index in the global array X indicating the 00112 * first column of sub( X ). 00113 * 00114 * DESCX (global and local input) INTEGER array of dimension DLEN_. 00115 * The array descriptor for the distributed matrix X. 00116 * 00117 * IASEED (global input) INTEGER 00118 * The seed number to generate the original matrix Ao. 00119 * 00120 * IA (global input) INTEGER 00121 * The row index in the global array A indicating the first 00122 * row of sub( A ). 00123 * 00124 * JA (global input) INTEGER 00125 * The column index in the global array A indicating the 00126 * first column of sub( A ). 00127 * 00128 * DESCA (global and local input) INTEGER array of dimension DLEN_. 00129 * The array descriptor for the distributed matrix A. 00130 * 00131 * IBSEED (global input) INTEGER 00132 * The seed number to generate the original matrix B. 00133 * 00134 * ANORM (global input) REAL 00135 * The 1-norm or infinity norm of the distributed matrix 00136 * sub( A ). 00137 * 00138 * RESID (global output) REAL 00139 * The residual error: 00140 * ||sub( A )*sub( X )-B|| / (||sub( A )||*||sub( X )||*eps*N). 00141 * 00142 * WORK (local workspace) COMPLEX array, dimension (LWORK) 00143 * LWORK >= MAX(1,Np)*NB_X + Nq*NB_X + MAX( MAX(NQ*MB_A,2*NB_X), 00144 * NB_X * NUMROC( NUMROC(N,MB_X,0,0,NPCOL), MB_X, 0, 0, LCMQ ) ) 00145 * 00146 * ===================================================================== 00147 * 00148 * .. Parameters .. 00149 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 00150 $ LLD_, MB_, M_, NB_, N_, RSRC_ 00151 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 00152 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 00153 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 00154 COMPLEX ZERO, ONE 00155 PARAMETER ( ONE = ( 1.0E+0, 0.0E+0 ), 00156 $ ZERO = ( 0.0E+0, 0.0E+0 ) ) 00157 * .. 00158 * .. Local Scalars .. 00159 INTEGER IACOL, IAROW, IB, ICOFF, ICTXT, ICURCOL, IDUMM, 00160 $ II, IIA, IIX, IOFFX, IPA, IPB, IPW, IPX, IROFF, 00161 $ IXCOL, IXROW, J, JBRHS, JJ, JJA, JJX, LDX, 00162 $ MYCOL, MYROW, NP, NPCOL, NPROW, NQ 00163 REAL DIVISOR, EPS, RESID1 00164 COMPLEX BETA 00165 * .. 00166 * .. External Subroutines .. 00167 EXTERNAL BLACS_GRIDINFO, CGAMX2D, CGEMM, CGSUM2D, 00168 $ CLASET, PBCTRAN, PCMATGEN, SGEBR2D, 00169 $ SGEBS2D, SGERV2D, SGESD2D 00170 * .. 00171 * .. External Functions .. 00172 INTEGER ICAMAX, NUMROC 00173 REAL PSLAMCH 00174 EXTERNAL ICAMAX, NUMROC, PSLAMCH 00175 * .. 00176 * .. Intrinsic Functions .. 00177 INTRINSIC ABS, MAX, MIN, MOD, REAL 00178 * .. 00179 * .. Executable Statements .. 00180 * 00181 * Get needed initial parameters 00182 * 00183 ICTXT = DESCA( CTXT_ ) 00184 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 00185 * 00186 EPS = PSLAMCH( ICTXT, 'eps' ) 00187 RESID = 0.0E+0 00188 DIVISOR = ANORM * EPS * REAL( N ) 00189 * 00190 CALL INFOG2L( IA, JA, DESCA, NPROW, NPCOL, MYROW, MYCOL, IIA, JJA, 00191 $ IAROW, IACOL ) 00192 CALL INFOG2L( IX, JX, DESCX, NPROW, NPCOL, MYROW, MYCOL, IIX, JJX, 00193 $ IXROW, IXCOL ) 00194 IROFF = MOD( IA-1, DESCA( MB_ ) ) 00195 ICOFF = MOD( JA-1, DESCA( NB_ ) ) 00196 NP = NUMROC( N+IROFF, DESCA( MB_ ), MYROW, IAROW, NPROW ) 00197 NQ = NUMROC( N+ICOFF, DESCA( NB_ ), MYCOL, IACOL, NPCOL ) 00198 * 00199 LDX = MAX( 1, NP ) 00200 IPB = 1 00201 IPX = IPB + NP * DESCX( NB_ ) 00202 IPA = IPX + NQ * DESCX( NB_ ) 00203 * 00204 IF( MYROW.EQ.IAROW ) 00205 $ NP = NP - IROFF 00206 IF( MYCOL.EQ.IACOL ) 00207 $ NQ = NQ - ICOFF 00208 * 00209 ICURCOL = IXCOL 00210 * 00211 * Loop over the rhs 00212 * 00213 DO 40 J = 1, NRHS, DESCX( NB_ ) 00214 JBRHS = MIN( DESCX( NB_ ), NRHS-J+1 ) 00215 * 00216 * Transpose x from ICURCOL to all rows 00217 * 00218 IOFFX = IIX + ( JJX - 1 ) * DESCX( LLD_ ) 00219 CALL PBCTRAN( ICTXT, 'Column', 'Transpose', N, JBRHS, 00220 $ DESCX( MB_ ), X( IOFFX ), DESCX( LLD_ ), ZERO, 00221 $ WORK( IPX ), JBRHS, IXROW, ICURCOL, -1, IACOL, 00222 $ WORK( IPA ) ) 00223 * 00224 * Regenerate B in IXCOL 00225 * 00226 IF( MYCOL.EQ.ICURCOL ) THEN 00227 CALL PCMATGEN( ICTXT, 'N', 'N', DESCX( M_ ), DESCX( N_ ), 00228 $ DESCX( MB_ ), DESCX( NB_ ), WORK( IPB ), LDX, 00229 $ IXROW, IXCOL, IBSEED, IIX-1, NP, JJX-1, 00230 $ JBRHS, MYROW, MYCOL, NPROW, NPCOL ) 00231 BETA = ONE 00232 ELSE 00233 BETA = ZERO 00234 END IF 00235 * 00236 IF( NQ.GT.0 ) THEN 00237 DO 10 II = IIA, IIA+NP-1, DESCA( MB_ ) 00238 IB = MIN( DESCA( MB_ ), IIA+NP-II ) 00239 * 00240 * Regenerate ib rows of the matrix A(IA:IA+N-1,JA:JA+N-1). 00241 * 00242 CALL PCMATGEN( ICTXT, SYMM, DIAG, DESCA( M_ ), 00243 $ DESCA( N_ ), DESCA( MB_ ), DESCA( NB_ ), 00244 $ WORK( IPA ), IB, DESCA( RSRC_ ), 00245 $ DESCA( CSRC_ ), IASEED, II-1, IB, 00246 $ JJA-1, NQ, MYROW, MYCOL, NPROW, NPCOL ) 00247 * 00248 * Compute B <= B - A * X. 00249 * 00250 CALL CGEMM( 'No transpose', 'Transpose', IB, JBRHS, NQ, 00251 $ -ONE, WORK( IPA ), IB, WORK( IPX ), JBRHS, 00252 $ BETA, WORK( IPB+II-IIA ), LDX ) 00253 * 00254 10 CONTINUE 00255 * 00256 ELSE IF( MYCOL.NE.ICURCOL ) THEN 00257 * 00258 CALL CLASET( 'All', NP, JBRHS, ZERO, ZERO, WORK( IPB ), 00259 $ LDX ) 00260 * 00261 END IF 00262 * 00263 * Add B rowwise to ICURCOL 00264 * 00265 CALL CGSUM2D( ICTXT, 'Row', ' ', NP, JBRHS, WORK( IPB ), LDX, 00266 $ MYROW, ICURCOL ) 00267 * 00268 IF( MYCOL.EQ.ICURCOL ) THEN 00269 * 00270 * Figure || A * X - B || & || X || 00271 * 00272 IPW = IPA + JBRHS 00273 DO 20 JJ = 0, JBRHS - 1 00274 IF( NP.GT.0 ) THEN 00275 II = ICAMAX( NP, WORK( IPB+JJ*LDX ), 1 ) 00276 WORK( IPA+JJ ) = ABS( WORK( IPB+II-1+JJ*LDX ) ) 00277 WORK( IPW+JJ ) = ABS( X( IOFFX + ICAMAX( NP, 00278 $ X( IOFFX + JJ*DESCX( LLD_ ) ), 1 )-1+JJ* 00279 $ DESCX( LLD_ ) ) ) 00280 ELSE 00281 WORK( IPA+JJ ) = ZERO 00282 WORK( IPW+JJ ) = ZERO 00283 END IF 00284 20 CONTINUE 00285 * 00286 * After CGAMX2D computation, 00287 * WORK(IPB) has the maximum of || Ax - b ||, and 00288 * WORK(IPX) has the maximum of || X ||. 00289 * 00290 CALL CGAMX2D( ICTXT, 'Column', ' ', 1, 2*JBRHS, 00291 $ WORK( IPA ), 1, IDUMM, IDUMM, -1, 0, ICURCOL ) 00292 * 00293 * Calculate residual = ||Ax-b|| / (||x||*||A||*eps*N) 00294 * 00295 IF( MYROW.EQ.0 ) THEN 00296 DO 30 JJ = 0, JBRHS - 1 00297 RESID1 = REAL( WORK( IPA+JJ ) ) / 00298 $ ( REAL( WORK( IPW+JJ ) )*DIVISOR ) 00299 IF( RESID.LT.RESID1 ) 00300 $ RESID = RESID1 00301 30 CONTINUE 00302 IF( MYCOL.NE.0 ) 00303 $ CALL SGESD2D( ICTXT, 1, 1, RESID, 1, 0, 0 ) 00304 END IF 00305 * 00306 ELSE IF( MYROW.EQ.0 .AND. MYCOL.EQ.0 ) THEN 00307 * 00308 CALL SGERV2D( ICTXT, 1, 1, RESID1, 1, 0, ICURCOL ) 00309 IF( RESID.LT.RESID1 ) 00310 $ RESID = RESID1 00311 * 00312 END IF 00313 * 00314 IF( MYCOL.EQ.ICURCOL ) 00315 $ JJX = JJX + JBRHS 00316 ICURCOL = MOD( ICURCOL+1, NPCOL ) 00317 * 00318 40 CONTINUE 00319 * 00320 IF( MYROW.EQ.0 .AND. MYCOL.EQ.0 ) THEN 00321 CALL SGEBS2D( ICTXT, 'All', ' ', 1, 1, RESID, 1 ) 00322 ELSE 00323 CALL SGEBR2D( ICTXT, 'All', ' ', 1, 1, RESID, 1, 0, 0 ) 00324 END IF 00325 * 00326 RETURN 00327 * 00328 * End of PCLASCHK 00329 * 00330 END