SCALAPACK 2.2.2
LAPACK: Linear Algebra PACKage
#define static2 static
– ScaLAPACK routine (version 1.7) – Oak Ridge National Laboratory, Univ. of Tennessee, and Univ. of California, Berkeley. October 31, 1994.
SUBROUTINE PSTRMR2D(UPLO, DIAG, M, N, $ A, IA, JA, ADESC, $ B, IB, JB, BDESC,
PSTRMR2D copies a submatrix of A onto a submatrix of B. A and B can have different distributions: they can be on different processor grids, they can have different blocksizes, and the beginning of the area to be copied can be at different places in A and B.
The parameters can be confusing when the grids of A and B are partially or completely disjoint. If a processor calls this routine but is not in the A context or not in the B context, then ADESC[CTXT] or BDESC[CTXT], respectively, must be equal to -1 so that the routine recognises this situation.
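As a rough illustration of the -1 convention, the sketch below fills in the context entry of a descriptor depending on whether the calling process has coordinates in the grid. The `desc_t` struct and both helper names are hypothetical and only mirror the descriptor fields listed on this page; treating negative grid coordinates as "not a member" is an assumption based on how BLACS grid queries usually report non-members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 9-entry descriptor; field names are illustrative only. */
typedef struct {
    int dtype, ctxt, m, n, mb, nb, rsrc, csrc, lld;
} desc_t;

/* Store the real context handle for grid members, -1 otherwise.
 * Assumption: myrow < 0 or mycol < 0 means "not in this grid". */
void set_context_or_skip(desc_t *d, int ctxt, int myrow, int mycol)
{
    assert(d != NULL);
    d->ctxt = (myrow < 0 || mycol < 0) ? -1 : ctxt;
}

/* Returns 1 if the descriptor says this process belongs to the grid. */
int in_context(const desc_t *d)
{
    assert(d != NULL);
    return d->ctxt != -1;
}
```

A process that belongs to only one of the two grids would call the first helper once per descriptor before invoking the redistribution routine.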
The submatrix to be copied is assumed to be trapezoidal, so only its upper or its lower part is copied. The other part is left unchanged.
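To make the selection concrete, here is a minimal sketch of which entries would be copied for given UPLO and DIAG values. It assumes a square n-by-n submatrix and 0-based indices; how the diagonal is aligned when M and N differ is not described in the text, so that case is deliberately not modelled. The function name `is_copied` is hypothetical.

```c
#include <assert.h>
#include <ctype.h>

/* Returns 1 if 0-based entry (i, j) of an n-by-n submatrix is copied.
 * Assumption: square case only (M == N). */
int is_copied(char uplo, char diag, int i, int j, int n)
{
    int upper, with_diag;
    assert(i >= 0 && i < n && j >= 0 && j < n);
    upper = toupper((unsigned char)uplo) == 'U';
    with_diag = toupper((unsigned char)diag) == 'N';
    if (i == j)
        return with_diag;          /* diagonal is copied only for DIAG = 'N' */
    return upper ? (i < j) : (i > j);
}
```

With UPLO = 'U' and DIAG = 'U', for instance, only the strictly upper entries are selected and every other entry of B keeps its previous value.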
A description vector is associated with each 2D block-cyclically distributed matrix. This vector stores the information required to establish the mapping between a matrix entry and its corresponding process and memory location.
In the following comments, the character _ should be read as "of the distributed matrix". Let A be a generic term for any 2D block-cyclically distributed matrix. Its description vector is DESC_A:
NOTATION         STORED IN        EXPLANATION
---------------  ---------------  ------------------------------------
DT_A   (global)  DESCA( DT_ )     The descriptor type.
CTXT_A (global)  DESCA( CTXT_ )   The BLACS context handle, indicating
                                  the BLACS process grid A is
                                  distributed over. The context itself
                                  is global, but the handle (the
                                  integer value) may vary.
M_A    (global)  DESCA( M_ )      The number of rows in the
                                  distributed matrix A.
N_A    (global)  DESCA( N_ )      The number of columns in the
                                  distributed matrix A.
MB_A   (global)  DESCA( MB_ )     The blocking factor used to
                                  distribute the rows of A.
NB_A   (global)  DESCA( NB_ )     The blocking factor used to
                                  distribute the columns of A.
RSRC_A (global)  DESCA( RSRC_ )   The process row over which the first
                                  row of the matrix A is distributed.
CSRC_A (global)  DESCA( CSRC_ )   The process column over which the
                                  first column of A is distributed.
LLD_A  (local)   DESCA( LLD_ )    The leading dimension of the local
                                  array storing the local blocks of
                                  the distributed matrix A.
                                  LLD_A >= MAX(1,LOCp(M_A)).
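The mapping that the descriptor encodes can be sketched in one dimension with the standard block-cyclic formulas. The helpers below are hypothetical (they are not ScaLAPACK routines) and use 0-based global indices, whereas the routine's IA, JA, IB, JB arguments are 1-based. `local_count` plays the role of LOCp in the bound LLD_A >= MAX(1,LOCp(M_A)), with `nb` standing for MB_A or NB_A and `src` for RSRC_A or CSRC_A.

```c
#include <assert.h>

/* Process (row or column) that owns 0-based global index g. */
int owner_of(int g, int nb, int src, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    return (src + g / nb) % nprocs;
}

/* 0-based local index of global index g on its owning process. */
int local_index(int g, int nb, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    return (g / (nb * nprocs)) * nb + g % nb;
}

/* Number of the n global rows (or columns) stored on process iproc. */
int local_count(int n, int nb, int iproc, int src, int nprocs)
{
    int dist, nblocks, count, extra;
    assert(n >= 0 && nb > 0 && nprocs > 0);
    dist = (nprocs + iproc - src) % nprocs;  /* distance from source process */
    nblocks = n / nb;                        /* number of full blocks */
    count = (nblocks / nprocs) * nb;         /* full rounds every process gets */
    extra = nblocks % nprocs;                /* leftover full blocks */
    if (dist < extra)
        count += nb;                         /* one more full block */
    else if (dist == extra)
        count += n % nb;                     /* the trailing partial block */
    return count;
}
```

For 10 rows with blocking factor 2 over 3 process rows starting at process row 0, the blocks go to processes 0, 1, 2, 0, 1, so the three processes hold 4, 4 and 2 rows respectively.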
The parameters of the routine changed in April 1996: there is a new last argument. It must be a context encompassing all processors involved in the initial and final distributions.
Be aware that all processors included in this context must call the redistribution routine.
UPLO (input) CHARACTER*1. On entry, UPLO specifies whether we should copy the upper part or the lower part of the defined submatrix: UPLO = 'U' or 'u' copy the upper triangular part. UPLO = 'L' or 'l' copy the lower triangular part. Unchanged on exit.
DIAG (input) CHARACTER*1. On entry, DIAG specifies whether we should copy the diagonal. DIAG = 'U' or 'u' do NOT copy the diagonal of the submatrix. DIAG = 'N' or 'n' DO copy the diagonal of the submatrix. Unchanged on exit.
M (input) INTEGER. On entry, M specifies the number of rows of the submatrix to be copied. M must be at least zero. Unchanged on exit.
N (input) INTEGER. On entry, N specifies the number of columns of the submatrix to be copied. N must be at least zero. Unchanged on exit.
A (input) REAL On entry, the source matrix. Unchanged on exit.
IA,JA (input) INTEGER. On entry, the coordinates of the beginning of the submatrix of A to copy. 1 <= IA <= M_A - M + 1, 1 <= JA <= N_A - N + 1. Unchanged on exit.
ADESC (input) A description vector (see Notes above). If the current processor is not part of the context of A, ADESC[CTXT] must be equal to -1.
B (output) REAL. On entry, the destination matrix. The portion corresponding to the defined submatrix is updated.
IB,JB (input) INTEGER. On entry, the coordinates of the beginning of the submatrix of B that will be updated. 1 <= IB <= M_B - M + 1, 1 <= JB <= N_B - N + 1. Unchanged on exit.
BDESC (input) B description vector (see Notes above). For processors not part of the context of B, BDESC[CTXT] must be equal to -1.
CTXT (input) A context encompassing at least all processors included in either the A context or the B context.
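The bounds stated for IA, JA, IB and JB can be checked before the call with a small helper. This is a sketch only: `submatrix_fits` is a hypothetical name, and it simply encodes the inequalities given in the argument list using the 1-based coordinates of the Fortran interface.

```c
#include <assert.h>

/* Returns 1 if an m-by-n submatrix starting at 1-based (i0, j0)
 * lies inside a global matrix of rows x cols entries, i.e.
 * 1 <= i0 <= rows - m + 1 and 1 <= j0 <= cols - n + 1. */
int submatrix_fits(int m, int n, int i0, int j0, int rows, int cols)
{
    assert(rows >= 0 && cols >= 0);
    if (m < 0 || n < 0)
        return 0;                  /* M and N must be at least zero */
    return i0 >= 1 && i0 <= rows - m + 1 &&
           j0 >= 1 && j0 <= cols - n + 1;
}
```

A caller would apply it once with (IA, JA, M_A, N_A) and once with (IB, JB, M_B, N_B).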
Memory requirement: for the processors belonging to grid 0, one buffer of size block 0; for the processors belonging to grid 1, also one buffer of size block 1.
Created March 1993 by B. Tourancheau (See sccs for modifications).