|
ScaLAPACK
2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
|
00001 SUBROUTINE PSORGR2( M, N, K, A, IA, JA, DESCA, TAU, WORK, LWORK, 00002 $ INFO ) 00003 * 00004 * -- ScaLAPACK routine (version 1.7) -- 00005 * University of Tennessee, Knoxville, Oak Ridge National Laboratory, 00006 * and University of California, Berkeley. 00007 * May 25, 2001 00008 * 00009 * .. Scalar Arguments .. 00010 INTEGER IA, INFO, JA, K, LWORK, M, N 00011 * .. 00012 * .. Array Arguments .. 00013 INTEGER DESCA( * ) 00014 REAL A( * ), TAU( * ), WORK( * ) 00015 * .. 00016 * 00017 * Purpose 00018 * ======= 00019 * 00020 * PSORGR2 generates an M-by-N real distributed matrix Q denoting 00021 * A(IA:IA+M-1,JA:JA+N-1) with orthonormal rows, which is defined as the 00022 * last M rows of a product of K elementary reflectors of order N 00023 * 00024 * Q = H(1) H(2) . . . H(k) 00025 * 00026 * as returned by PSGERQF. 00027 * 00028 * Notes 00029 * ===== 00030 * 00031 * Each global data object is described by an associated description 00032 * vector. This vector stores the information required to establish 00033 * the mapping between an object element and its corresponding process 00034 * and memory location. 00035 * 00036 * Let A be a generic term for any 2D block cyclicly distributed array. 00037 * Such a global array has an associated description vector DESCA. 00038 * In the following comments, the character _ should be read as 00039 * "of the global array". 00040 * 00041 * NOTATION STORED IN EXPLANATION 00042 * --------------- -------------- -------------------------------------- 00043 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 00044 * DTYPE_A = 1. 00045 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 00046 * the BLACS process grid A is distribu- 00047 * ted over. The context itself is glo- 00048 * bal, but the handle (the integer 00049 * value) may vary. 00050 * M_A (global) DESCA( M_ ) The number of rows in the global 00051 * array A. 00052 * N_A (global) DESCA( N_ ) The number of columns in the global 00053 * array A. 00054 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 00055 * the rows of the array. 00056 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 00057 * the columns of the array. 00058 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 00059 * row of the array A is distributed. 00060 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the 00061 * first column of the array A is 00062 * distributed. 00063 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 00064 * array. LLD_A >= MAX(1,LOCr(M_A)). 00065 * 00066 * Let K be the number of rows or columns of a distributed matrix, 00067 * and assume that its process grid has dimension p x q. 00068 * LOCr( K ) denotes the number of elements of K that a process 00069 * would receive if K were distributed over the p processes of its 00070 * process column. 00071 * Similarly, LOCc( K ) denotes the number of elements of K that a 00072 * process would receive if K were distributed over the q processes of 00073 * its process row. 00074 * The values of LOCr() and LOCc() may be determined via a call to the 00075 * ScaLAPACK tool function, NUMROC: 00076 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 00077 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 00078 * An upper bound for these quantities may be computed by: 00079 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 00080 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 00081 * 00082 * Arguments 00083 * ========= 00084 * 00085 * M (global input) INTEGER 00086 * The number of rows to be operated on i.e the number of rows 00087 * of the distributed submatrix Q. M >= 0. 00088 * 00089 * N (global input) INTEGER 00090 * The number of columns to be operated on i.e the number of 00091 * columns of the distributed submatrix Q. N >= M >= 0. 00092 * 00093 * K (global input) INTEGER 00094 * The number of elementary reflectors whose product defines the 00095 * matrix Q. M >= K >= 0. 00096 * 00097 * A (local input/local output) REAL pointer into the 00098 * local memory to an array of dimension (LLD_A,LOCc(JA+N-1)). 00099 * On entry, the i-th row must contain the vector which defines 00100 * the elementary reflector H(i), IA+M-K <= i <= IA+M-1, as 00101 * returned by PSGERQF in the K rows of its distributed 00102 * matrix argument A(IA+M-K:IA+M-1,JA:*). On exit, this array 00103 * contains the local pieces of the M-by-N distributed matrix Q. 00104 * 00105 * IA (global input) INTEGER 00106 * The row index in the global array A indicating the first 00107 * row of sub( A ). 00108 * 00109 * JA (global input) INTEGER 00110 * The column index in the global array A indicating the 00111 * first column of sub( A ). 00112 * 00113 * DESCA (global and local input) INTEGER array of dimension DLEN_. 00114 * The array descriptor for the distributed matrix A. 00115 * 00116 * TAU (local input) REAL, array, dimension LOCr(IA+M-1) 00117 * This array contains the scalar factors TAU(i) of the 00118 * elementary reflectors H(i) as returned by PSGERQF. 00119 * TAU is tied to the distributed matrix A. 00120 * 00121 * WORK (local workspace/local output) REAL array, 00122 * dimension (LWORK) 00123 * On exit, WORK(1) returns the minimal and optimal LWORK. 00124 * 00125 * LWORK (local or global input) INTEGER 00126 * The dimension of the array WORK. 00127 * LWORK is local input and must be at least 00128 * LWORK >= NqA0 + MAX( 1, MpA0 ), where 00129 * 00130 * IROFFA = MOD( IA-1, MB_A ), ICOFFA = MOD( JA-1, NB_A ), 00131 * IAROW = INDXG2P( IA, MB_A, MYROW, RSRC_A, NPROW ), 00132 * IACOL = INDXG2P( JA, NB_A, MYCOL, CSRC_A, NPCOL ), 00133 * MpA0 = NUMROC( M+IROFFA, MB_A, MYROW, IAROW, NPROW ), 00134 * NqA0 = NUMROC( N+ICOFFA, NB_A, MYCOL, IACOL, NPCOL ), 00135 * 00136 * INDXG2P and NUMROC are ScaLAPACK tool functions; 00137 * MYROW, MYCOL, NPROW and NPCOL can be determined by calling 00138 * the subroutine BLACS_GRIDINFO. 00139 * 00140 * If LWORK = -1, then LWORK is global input and a workspace 00141 * query is assumed; the routine only calculates the minimum 00142 * and optimal size for all work arrays. Each of these 00143 * values is returned in the first entry of the corresponding 00144 * work array, and no error message is issued by PXERBLA. 00145 * 00146 * 00147 * INFO (local output) INTEGER 00148 * = 0: successful exit 00149 * < 0: If the i-th argument is an array and the j-entry had 00150 * an illegal value, then INFO = -(i*100+j), if the i-th 00151 * argument is a scalar and had an illegal value, then 00152 * INFO = -i. 00153 * 00154 * ===================================================================== 00155 * 00156 * .. Parameters .. 00157 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 00158 $ LLD_, MB_, M_, NB_, N_, RSRC_ 00159 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 00160 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 00161 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 00162 REAL ONE, ZERO 00163 PARAMETER ( ONE = 1.0E+0, ZERO = 0.0E+0 ) 00164 * .. 00165 * .. Local Scalars .. 00166 LOGICAL LQUERY 00167 CHARACTER COLBTOP, ROWBTOP 00168 INTEGER IACOL, IAROW, I, ICTXT, II, LWMIN, MP, MPA0, 00169 $ MYCOL, MYROW, NPCOL, NPROW, NQA0 00170 REAL TAUI 00171 * .. 00172 * .. External Subroutines .. 00173 EXTERNAL BLACS_ABORT, BLACS_GRIDINFO, CHK1MAT, PSELSET, 00174 $ PSLARF, PSLASET, PSSCAL, PB_TOPGET, 00175 $ PB_TOPSET, PXERBLA 00176 * .. 00177 * .. External Functions .. 00178 INTEGER INDXG2L, INDXG2P, NUMROC 00179 EXTERNAL INDXG2L, INDXG2P, NUMROC 00180 * .. 00181 * .. Intrinsic Functions .. 00182 INTRINSIC MAX, MIN, MOD, REAL 00183 * .. 00184 * .. Executable Statements .. 00185 * 00186 * Get grid parameters 00187 * 00188 ICTXT = DESCA( CTXT_ ) 00189 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 00190 * 00191 * Test the input parameters 00192 * 00193 INFO = 0 00194 IF( NPROW.EQ.-1 ) THEN 00195 INFO = -(700+CTXT_) 00196 ELSE 00197 CALL CHK1MAT( M, 1, N, 2, IA, JA, DESCA, 7, INFO ) 00198 IF( INFO.EQ.0 ) THEN 00199 IAROW = INDXG2P( IA, DESCA( MB_ ), MYROW, DESCA( RSRC_ ), 00200 $ NPROW ) 00201 IACOL = INDXG2P( JA, DESCA( NB_ ), MYCOL, DESCA( CSRC_ ), 00202 $ NPCOL ) 00203 MPA0 = NUMROC( M+MOD( IA-1, DESCA( MB_ ) ), DESCA( MB_ ), 00204 $ MYROW, IAROW, NPROW ) 00205 NQA0 = NUMROC( N+MOD( JA-1, DESCA( NB_ ) ), DESCA( NB_ ), 00206 $ MYCOL, IACOL, NPCOL ) 00207 LWMIN = NQA0 + MAX( 1, MPA0 ) 00208 * 00209 WORK( 1 ) = REAL( LWMIN ) 00210 LQUERY = ( LWORK.EQ.-1 ) 00211 IF( N.LT.M ) THEN 00212 INFO = -2 00213 ELSE IF( K.LT.0 .OR. K.GT.M ) THEN 00214 INFO = -3 00215 ELSE IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN 00216 INFO = -10 00217 END IF 00218 END IF 00219 END IF 00220 IF( INFO.NE.0 ) THEN 00221 CALL PXERBLA( ICTXT, 'PSORGR2', -INFO ) 00222 CALL BLACS_ABORT( ICTXT, 1 ) 00223 RETURN 00224 ELSE IF( LQUERY ) THEN 00225 RETURN 00226 END IF 00227 * 00228 * Quick return if possible 00229 * 00230 IF( M.LE.0 ) 00231 $ RETURN 00232 * 00233 CALL PB_TOPGET( ICTXT, 'Broadcast', 'Rowwise', ROWBTOP ) 00234 CALL PB_TOPGET( ICTXT, 'Broadcast', 'Columnwise', COLBTOP ) 00235 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Rowwise', ' ' ) 00236 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Columnwise', 'I-ring' ) 00237 * 00238 IF( K.LT.M ) THEN 00239 * 00240 * Initialise rows ia:ia+m-k-1 to rows of the unit matrix 00241 * 00242 CALL PSLASET( 'All', M-K, N-M, ZERO, ZERO, A, IA, JA, DESCA ) 00243 CALL PSLASET( 'All', M-K, M, ZERO, ONE, A, IA, JA+N-M, DESCA ) 00244 * 00245 END IF 00246 * 00247 TAUI = ZERO 00248 MP = NUMROC( IA+M-1, DESCA( MB_ ), MYROW, DESCA( RSRC_ ), NPROW ) 00249 * 00250 DO 10 I = IA+M-K, IA+M-1 00251 * 00252 * Apply H(i) to A(ia:i,ja:ja+n-k+i-1) from the right 00253 * 00254 CALL PSELSET( A, I, JA+N-M+I-IA, DESCA, ONE ) 00255 CALL PSLARF( 'Right', I-IA, I-IA+N-M+1, A, I, JA, DESCA, 00256 $ DESCA( M_ ), TAU, A, IA, JA, DESCA, WORK ) 00257 II = INDXG2L( I, DESCA( MB_ ), MYROW, DESCA( RSRC_ ), NPROW ) 00258 IAROW = INDXG2P( I, DESCA( MB_ ), MYROW, DESCA( RSRC_ ), 00259 $ NPROW ) 00260 IF( MYROW.EQ.IAROW ) 00261 $ TAUI = TAU( MIN( II, MP ) ) 00262 CALL PSSCAL( I-IA+N-M, -TAUI, A, I, JA, DESCA, DESCA( M_ ) ) 00263 CALL PSELSET( A, I, JA+N-M+I-IA, DESCA, ONE-TAUI ) 00264 * 00265 * Set A(i,ja+n-m+i-ia+1:ja+n-1) to zero 00266 * 00267 CALL PSLASET( 'All', 1, IA+M-1-I, ZERO, ZERO, A, I, 00268 $ JA+N-M+I-IA+1, DESCA ) 00269 * 00270 10 CONTINUE 00271 * 00272 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Rowwise', ROWBTOP ) 00273 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Columnwise', COLBTOP ) 00274 * 00275 WORK( 1 ) = REAL( LWMIN ) 00276 * 00277 RETURN 00278 * 00279 * End of PSORGR2 00280 * 00281 END