|
ScaLAPACK
2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
|
00001 SUBROUTINE PDGGRQF( M, P, N, A, IA, JA, DESCA, TAUA, B, IB, JB, 00002 $ DESCB, TAUB, WORK, LWORK, INFO ) 00003 * 00004 * -- ScaLAPACK routine (version 1.7) -- 00005 * University of Tennessee, Knoxville, Oak Ridge National Laboratory, 00006 * and University of California, Berkeley. 00007 * May 1, 1997 00008 * 00009 * .. Scalar Arguments .. 00010 INTEGER IA, IB, INFO, JA, JB, LWORK, M, N, P 00011 * .. 00012 * .. Array Arguments .. 00013 INTEGER DESCA( * ), DESCB( * ) 00014 DOUBLE PRECISION A( * ), B( * ), TAUA( * ), TAUB( * ), WORK( * ) 00015 * .. 00016 * 00017 * Purpose 00018 * ======= 00019 * 00020 * PDGGRQF computes a generalized RQ factorization of 00021 * an M-by-N matrix sub( A ) = A(IA:IA+M-1,JA:JA+N-1) 00022 * and a P-by-N matrix sub( B ) = B(IB:IB+P-1,JB:JB+N-1): 00023 * 00024 * sub( A ) = R*Q, sub( B ) = Z*T*Q, 00025 * 00026 * where Q is an N-by-N orthogonal matrix, Z is a P-by-P orthogonal 00027 * matrix, and R and T assume one of the forms: 00028 * 00029 * if M <= N, R = ( 0 R12 ) M, or if M > N, R = ( R11 ) M-N, 00030 * N-M M ( R21 ) N 00031 * N 00032 * 00033 * where R12 or R21 is upper triangular, and 00034 * 00035 * if P >= N, T = ( T11 ) N , or if P < N, T = ( T11 T12 ) P, 00036 * ( 0 ) P-N P N-P 00037 * N 00038 * 00039 * where T11 is upper triangular. 00040 * 00041 * In particular, if sub( B ) is square and nonsingular, the GRQ 00042 * factorization of sub( A ) and sub( B ) implicitly gives the RQ 00043 * factorization of sub( A )*inv( sub( B ) ): 00044 * 00045 * sub( A )*inv( sub( B ) ) = (R*inv(T))*Z' 00046 * 00047 * where inv( sub( B ) ) denotes the inverse of the matrix sub( B ), 00048 * and Z' denotes the transpose of matrix Z. 00049 * 00050 * Notes 00051 * ===== 00052 * 00053 * Each global data object is described by an associated description 00054 * vector. This vector stores the information required to establish 00055 * the mapping between an object element and its corresponding process 00056 * and memory location. 00057 * 00058 * Let A be a generic term for any 2D block cyclicly distributed array. 00059 * Such a global array has an associated description vector DESCA. 00060 * In the following comments, the character _ should be read as 00061 * "of the global array". 00062 * 00063 * NOTATION STORED IN EXPLANATION 00064 * --------------- -------------- -------------------------------------- 00065 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, 00066 * DTYPE_A = 1. 00067 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating 00068 * the BLACS process grid A is distribu- 00069 * ted over. The context itself is glo- 00070 * bal, but the handle (the integer 00071 * value) may vary. 00072 * M_A (global) DESCA( M_ ) The number of rows in the global 00073 * array A. 00074 * N_A (global) DESCA( N_ ) The number of columns in the global 00075 * array A. 00076 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute 00077 * the rows of the array. 00078 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute 00079 * the columns of the array. 00080 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first 00081 * row of the array A is distributed. 00082 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the 00083 * first column of the array A is 00084 * distributed. 00085 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local 00086 * array. LLD_A >= MAX(1,LOCr(M_A)). 00087 * 00088 * Let K be the number of rows or columns of a distributed matrix, 00089 * and assume that its process grid has dimension p x q. 00090 * LOCr( K ) denotes the number of elements of K that a process 00091 * would receive if K were distributed over the p processes of its 00092 * process column. 00093 * Similarly, LOCc( K ) denotes the number of elements of K that a 00094 * process would receive if K were distributed over the q processes of 00095 * its process row. 00096 * The values of LOCr() and LOCc() may be determined via a call to the 00097 * ScaLAPACK tool function, NUMROC: 00098 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), 00099 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). 00100 * An upper bound for these quantities may be computed by: 00101 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A 00102 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A 00103 * 00104 * Arguments 00105 * ========= 00106 * 00107 * M (global input) INTEGER 00108 * The number of rows to be operated on i.e the number of 00109 * rows of the distributed submatrix sub( A ). M >= 0. 00110 * 00111 * P (global input) INTEGER 00112 * The number of rows to be operated on i.e the number of 00113 * rows of the distributed submatrix sub( B ). P >= 0. 00114 * 00115 * N (global input) INTEGER 00116 * The number of columns to be operated on i.e the number of 00117 * columns of the distributed submatrices sub( A ) and sub( B ). 00118 * N >= 0. 00119 * 00120 * A (local input/local output) DOUBLE PRECISION pointer into the 00121 * local memory to an array of dimension (LLD_A, LOCc(JA+N-1)). 00122 * On entry, the local pieces of the M-by-N distributed matrix 00123 * sub( A ) which is to be factored. On exit, if M <= N, the 00124 * upper triangle of A( IA:IA+M-1, JA+N-M:JA+N-1 ) contains the 00125 * M by M upper triangular matrix R; if M >= N, the elements on 00126 * and above the (M-N)-th subdiagonal contain the M by N upper 00127 * trapezoidal matrix R; the remaining elements, with the array 00128 * TAUA, represent the orthogonal matrix Q as a product of 00129 * elementary reflectors (see Further Details). 00130 * 00131 * IA (global input) INTEGER 00132 * The row index in the global array A indicating the first 00133 * row of sub( A ). 00134 * 00135 * JA (global input) INTEGER 00136 * The column index in the global array A indicating the 00137 * first column of sub( A ). 00138 * 00139 * DESCA (global and local input) INTEGER array of dimension DLEN_. 00140 * The array descriptor for the distributed matrix A. 00141 * 00142 * TAUA (local output) DOUBLE PRECISION array, dimension LOCr(IA+M-1) 00143 * This array contains the scalar factors of the elementary 00144 * reflectors which represent the orthogonal unitary matrix Q. 00145 * TAUA is tied to the distributed matrix A (see Further 00146 * Details). 00147 * 00148 * B (local input/local output) DOUBLE PRECISION pointer into the 00149 * local memory to an array of dimension (LLD_B, LOCc(JB+N-1)). 00150 * On entry, the local pieces of the P-by-N distributed matrix 00151 * sub( B ) which is to be factored. On exit, the elements on 00152 * and above the diagonal of sub( B ) contain the min(P,N) by N 00153 * upper trapezoidal matrix T (T is upper triangular if P >= N); 00154 * the elements below the diagonal, with the array TAUB, 00155 * represent the orthogonal matrix Z as a product of elementary 00156 * reflectors (see Further Details). 00157 * 00158 * IB (global input) INTEGER 00159 * The row index in the global array B indicating the first 00160 * row of sub( B ). 00161 * 00162 * JB (global input) INTEGER 00163 * The column index in the global array B indicating the 00164 * first column of sub( B ). 00165 * 00166 * DESCB (global and local input) INTEGER array of dimension DLEN_. 00167 * The array descriptor for the distributed matrix B. 00168 * 00169 * TAUB (local output) DOUBLE PRECISION array, dimension 00170 * LOCc(JB+MIN(P,N)-1). This array contains the scalar factors 00171 * TAUB of the elementary reflectors which represent the 00172 * orthogonal matrix Z. TAUB is tied to the distributed matrix 00173 * B (see Further Details). 00174 * 00175 * WORK (local workspace/local output) DOUBLE PRECISION array, 00176 * dimension (LWORK) 00177 * On exit, WORK(1) returns the minimal and optimal LWORK. 00178 * 00179 * LWORK (local or global input) INTEGER 00180 * The dimension of the array WORK. 00181 * LWORK is local input and must be at least 00182 * LWORK >= MAX( MB_A * ( MpA0 + NqA0 + MB_A ), 00183 * MAX( (MB_A*(MB_A-1))/2, (PpB0 + NqB0)*MB_A ) + 00184 * MB_A * MB_A, 00185 * NB_B * ( PpB0 + NqB0 + NB_B ) ), where 00186 * 00187 * IROFFA = MOD( IA-1, MB_A ), ICOFFA = MOD( JA-1, NB_A ), 00188 * IAROW = INDXG2P( IA, MB_A, MYROW, RSRC_A, NPROW ), 00189 * IACOL = INDXG2P( JA, NB_A, MYCOL, CSRC_A, NPCOL ), 00190 * MpA0 = NUMROC( M+IROFFA, MB_A, MYROW, IAROW, NPROW ), 00191 * NqA0 = NUMROC( N+ICOFFA, NB_A, MYCOL, IACOL, NPCOL ), 00192 * 00193 * IROFFB = MOD( IB-1, MB_B ), ICOFFB = MOD( JB-1, NB_B ), 00194 * IBROW = INDXG2P( IB, MB_B, MYROW, RSRC_B, NPROW ), 00195 * IBCOL = INDXG2P( JB, NB_B, MYCOL, CSRC_B, NPCOL ), 00196 * PpB0 = NUMROC( P+IROFFB, MB_B, MYROW, IBROW, NPROW ), 00197 * NqB0 = NUMROC( N+ICOFFB, NB_B, MYCOL, IBCOL, NPCOL ), 00198 * 00199 * and NUMROC, INDXG2P are ScaLAPACK tool functions; 00200 * MYROW, MYCOL, NPROW and NPCOL can be determined by calling 00201 * the subroutine BLACS_GRIDINFO. 00202 * 00203 * If LWORK = -1, then LWORK is global input and a workspace 00204 * query is assumed; the routine only calculates the minimum 00205 * and optimal size for all work arrays. Each of these 00206 * values is returned in the first entry of the corresponding 00207 * work array, and no error message is issued by PXERBLA. 00208 * 00209 * INFO (global output) INTEGER 00210 * = 0: successful exit 00211 * < 0: If the i-th argument is an array and the j-entry had 00212 * an illegal value, then INFO = -(i*100+j), if the i-th 00213 * argument is a scalar and had an illegal value, then 00214 * INFO = -i. 00215 * 00216 * Further Details 00217 * =============== 00218 * 00219 * The matrix Q is represented as a product of elementary reflectors 00220 * 00221 * Q = H(ia) H(ia+1) . . . H(ia+k-1), where k = min(m,n). 00222 * 00223 * Each H(i) has the form 00224 * 00225 * H(i) = I - taua * v * v' 00226 * 00227 * where taua is a real scalar, and v is a real vector with 00228 * v(n-k+i+1:n) = 0 and v(n-k+i) = 1; v(1:n-k+i-1) is stored on exit in 00229 * A(ia+m-k+i-1,ja:ja+n-k+i-2), and taua in TAUA(ia+m-k+i-1). 00230 * To form Q explicitly, use ScaLAPACK subroutine PDORGRQ. 00231 * To use Q to update another matrix, use ScaLAPACK subroutine PDORMRQ. 00232 * 00233 * The matrix Z is represented as a product of elementary reflectors 00234 * 00235 * Z = H(jb) H(jb+1) . . . H(jb+k-1), where k = min(p,n). 00236 * 00237 * Each H(i) has the form 00238 * 00239 * H(i) = I - taub * v * v' 00240 * 00241 * where taub is a real scalar, and v is a real vector with 00242 * v(1:i-1) = 0 and v(i) = 1; v(i+1:p) is stored on exit in 00243 * B(ib+i:ib+p-1,jb+i-1), and taub in TAUB(jb+i-1). 00244 * To form Z explicitly, use ScaLAPACK subroutine PDORGQR. 00245 * To use Z to update another matrix, use ScaLAPACK subroutine PDORMQR. 00246 * 00247 * Alignment requirements 00248 * ====================== 00249 * 00250 * The distributed submatrices sub( A ) and sub( B ) must verify some 00251 * alignment properties, namely the following expression should be true: 00252 * 00253 * ( NB_A.EQ.NB_B .AND. ICOFFA.EQ.ICOFFB .AND. IACOL.EQ.IBCOL ) 00254 * 00255 * ===================================================================== 00256 * 00257 * .. Parameters .. 00258 INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, 00259 $ LLD_, MB_, M_, NB_, N_, RSRC_ 00260 PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, 00261 $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, 00262 $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) 00263 * .. Local Scalars .. 00264 LOGICAL LQUERY 00265 INTEGER IACOL, IAROW, IBCOL, IBROW, ICOFFA, ICOFFB, 00266 $ ICTXT, IROFFA, IROFFB, LWMIN, MPA0, MYCOL, 00267 $ MYROW, NPCOL, NPROW, NQA0, NQB0, PPB0 00268 * .. 00269 * .. External Subroutines .. 00270 EXTERNAL BLACS_GRIDINFO, CHK1MAT, PCHK2MAT, PDGEQRF, 00271 $ PDGERQF, PDORMRQ, PXERBLA 00272 * .. 00273 * .. Local Arrays .. 00274 INTEGER IDUM1( 1 ), IDUM2( 1 ) 00275 * .. 00276 * .. External Functions .. 00277 INTEGER INDXG2P, NUMROC 00278 EXTERNAL INDXG2P, NUMROC 00279 * .. 00280 * .. Intrinsic Functions .. 00281 INTRINSIC DBLE, INT, MAX, MIN, MOD 00282 * .. 00283 * .. Executable Statements .. 00284 * 00285 * Get grid parameters 00286 * 00287 ICTXT = DESCA( CTXT_ ) 00288 CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) 00289 * 00290 * Test the input parameters 00291 * 00292 INFO = 0 00293 IF( NPROW.EQ.-1 ) THEN 00294 INFO = -707 00295 ELSE 00296 CALL CHK1MAT( M, 1, N, 3, IA, JA, DESCA, 7, INFO ) 00297 CALL CHK1MAT( P, 2, N, 3, IB, JB, DESCB, 12, INFO ) 00298 IF( INFO.EQ.0 ) THEN 00299 IROFFA = MOD( IA-1, DESCA( MB_ ) ) 00300 ICOFFA = MOD( JA-1, DESCA( NB_ ) ) 00301 IROFFB = MOD( IB-1, DESCB( MB_ ) ) 00302 ICOFFB = MOD( JB-1, DESCB( NB_ ) ) 00303 IAROW = INDXG2P( IA, DESCA( MB_ ), MYROW, DESCA( RSRC_ ), 00304 $ NPROW ) 00305 IACOL = INDXG2P( JA, DESCA( NB_ ), MYCOL, DESCA( CSRC_ ), 00306 $ NPCOL ) 00307 IBROW = INDXG2P( IB, DESCB( MB_ ), MYROW, DESCB( RSRC_ ), 00308 $ NPROW ) 00309 IBCOL = INDXG2P( JB, DESCB( NB_ ), MYCOL, DESCB( CSRC_ ), 00310 $ NPCOL ) 00311 MPA0 = NUMROC( M+IROFFA, DESCA( MB_ ), MYROW, IAROW, NPROW ) 00312 NQA0 = NUMROC( N+ICOFFA, DESCA( NB_ ), MYCOL, IACOL, NPCOL ) 00313 PPB0 = NUMROC( P+IROFFB, DESCB( MB_ ), MYROW, IBROW, NPROW ) 00314 NQB0 = NUMROC( N+ICOFFB, DESCB( NB_ ), MYCOL, IBCOL, NPCOL ) 00315 LWMIN = MAX( DESCA( MB_ ) * ( MPA0 + NQA0 + DESCA( MB_ ) ), 00316 $ MAX( MAX( ( DESCA( MB_ )*( DESCA( MB_ ) - 1 ) ) / 2, 00317 $ ( PPB0 + NQB0 ) * DESCA( MB_ ) ) + 00318 $ DESCA( MB_ ) * DESCA( MB_ ), 00319 $ DESCB( NB_ ) * ( PPB0 + NQB0 + DESCB( NB_ ) ) ) ) 00320 * 00321 WORK( 1 ) = DBLE( LWMIN ) 00322 LQUERY = ( LWORK.EQ.-1 ) 00323 IF( IACOL.NE.IBCOL .OR. ICOFFA.NE.ICOFFB ) THEN 00324 INFO = -11 00325 ELSE IF( DESCA( NB_ ).NE.DESCB( NB_ ) ) THEN 00326 INFO = -1204 00327 ELSE IF( ICTXT.NE.DESCB( CTXT_ ) ) THEN 00328 INFO = -1207 00329 ELSE IF( LWORK.LT.LWMIN .AND. .NOT.LQUERY ) THEN 00330 INFO = -15 00331 END IF 00332 END IF 00333 IF( LWORK.EQ.-1 ) THEN 00334 IDUM1( 1 ) = -1 00335 ELSE 00336 IDUM1( 1 ) = 1 00337 END IF 00338 IDUM2( 1 ) = 15 00339 CALL PCHK2MAT( M, 1, N, 3, IA, JA, DESCA, 7, P, 2, N, 3, IB, 00340 $ JB, DESCB, 12, 1, IDUM1, IDUM2, INFO ) 00341 END IF 00342 * 00343 IF( INFO.NE.0 ) THEN 00344 CALL PXERBLA( ICTXT, 'PDGGRQF', -INFO ) 00345 RETURN 00346 ELSE IF( LQUERY ) THEN 00347 RETURN 00348 END IF 00349 * 00350 * RQ factorization of M-by-N matrix sub( A ): sub( A ) = R*Q 00351 * 00352 CALL PDGERQF( M, N, A, IA, JA, DESCA, TAUA, WORK, LWORK, INFO ) 00353 LWMIN = INT( WORK( 1 ) ) 00354 * 00355 * Update sub( B ) := sub( B )*Q' 00356 * 00357 CALL PDORMRQ( 'Right', 'Transpose', P, N, MIN( M, N ), A, 00358 $ MAX( IA, IA+M-N ), JA, DESCA, TAUA, B, IB, JB, 00359 $ DESCB, WORK, LWORK, INFO ) 00360 LWMIN = MAX( LWMIN, INT( WORK( 1 ) ) ) 00361 * 00362 * QR factorization of P-by-N matrix sub( B ): sub( B ) = Z*T 00363 * 00364 CALL PDGEQRF( P, N, B, IB, JB, DESCB, TAUB, WORK, LWORK, INFO ) 00365 WORK( 1 ) = DBLE( MAX( LWMIN, INT( WORK( 1 ) ) ) ) 00366 * 00367 RETURN 00368 * 00369 * End of PDGGRQF 00370 * 00371 END