ScaLAPACK  2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
pzlapv2.f
Go to the documentation of this file.
00001       SUBROUTINE PZLAPV2( DIREC, ROWCOL, M, N, A, IA, JA, DESCA, IPIV,
00002      $                    IP, JP, DESCIP )
00003 *
00004 *  -- ScaLAPACK auxiliary routine (version 1.7) --
00005 *     University of Tennessee, Knoxville, Oak Ridge National Laboratory,
00006 *     and University of California, Berkeley.
00007 *     May 1, 1997
00008 *
00009 *     .. Scalar Arguments ..
00010       CHARACTER          DIREC, ROWCOL
00011       INTEGER            IA, IP, JA, JP, M, N
00012 *     ..
00013 *     .. Array Arguments ..
00014       INTEGER            DESCA( * ), DESCIP( * ), IPIV( * )
00015       COMPLEX*16         A( * )
00016 *     ..
00017 *
00018 *  Purpose
00019 *  =======
00020 *
00021 *  PZLAPV2 applies either P (permutation matrix indicated by IPIV)
00022 *  or inv( P ) to a M-by-N distributed matrix sub( A ) denoting
00023 *  A(IA:IA+M-1,JA:JA+N-1), resulting in row or column pivoting.  The
00024 *  pivot vector should be aligned with the distributed matrix A.  For
00025 *  pivoting the rows of sub( A ), IPIV should be distributed along a
00026 *  process column and replicated over all process rows.  Similarly,
00027 *  IPIV should be distributed along a process row and replicated over
00028 *  all process columns for column pivoting.
00029 *
00030 *  Notes
00031 *  =====
00032 *
00033 *  Each global data object is described by an associated description
00034 *  vector.  This vector stores the information required to establish
00035 *  the mapping between an object element and its corresponding process
00036 *  and memory location.
00037 *
00038 *  Let A be a generic term for any 2D block cyclicly distributed array.
00039 *  Such a global array has an associated description vector DESCA.
00040 *  In the following comments, the character _ should be read as
00041 *  "of the global array".
00042 *
00043 *  NOTATION        STORED IN      EXPLANATION
00044 *  --------------- -------------- --------------------------------------
00045 *  DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
00046 *                                 DTYPE_A = 1.
00047 *  CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
00048 *                                 the BLACS process grid A is distribu-
00049 *                                 ted over. The context itself is glo-
00050 *                                 bal, but the handle (the integer
00051 *                                 value) may vary.
00052 *  M_A    (global) DESCA( M_ )    The number of rows in the global
00053 *                                 array A.
00054 *  N_A    (global) DESCA( N_ )    The number of columns in the global
00055 *                                 array A.
00056 *  MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
00057 *                                 the rows of the array.
00058 *  NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
00059 *                                 the columns of the array.
00060 *  RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
00061 *                                 row of the array A is distributed.
00062 *  CSRC_A (global) DESCA( CSRC_ ) The process column over which the
00063 *                                 first column of the array A is
00064 *                                 distributed.
00065 *  LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
00066 *                                 array.  LLD_A >= MAX(1,LOCr(M_A)).
00067 *
00068 *  Let K be the number of rows or columns of a distributed matrix,
00069 *  and assume that its process grid has dimension p x q.
00070 *  LOCr( K ) denotes the number of elements of K that a process
00071 *  would receive if K were distributed over the p processes of its
00072 *  process column.
00073 *  Similarly, LOCc( K ) denotes the number of elements of K that a
00074 *  process would receive if K were distributed over the q processes of
00075 *  its process row.
00076 *  The values of LOCr() and LOCc() may be determined via a call to the
00077 *  ScaLAPACK tool function, NUMROC:
00078 *          LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
00079 *          LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
00080 *  An upper bound for these quantities may be computed by:
00081 *          LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
00082 *          LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
00083 *
00084 *  Arguments
00085 *  =========
00086 *
00087 *  DIREC   (global input) CHARACTER
00088 *          Specifies in which order the permutation is applied:
00089 *            = 'F' (Forward) Applies pivots Forward from top of matrix.
00090 *                  Computes P * sub( A );
00091 *            = 'B' (Backward) Applies pivots Backward from bottom of
00092 *                  matrix. Computes inv( P ) * sub( A ).
00093 *
00094 *  ROWCOL  (global input) CHARACTER
00095 *          Specifies if the rows or columns are to be permuted:
00096 *            = 'R' Rows will be permuted,
00097 *            = 'C' Columns will be permuted.
00098 *
00099 *  M       (global input) INTEGER
00100 *          The number of rows to be operated on, i.e. the number of rows
00101 *          of the distributed submatrix sub( A ). M >= 0.
00102 *
00103 *  N       (global input) INTEGER
00104 *          The number of columns to be operated on, i.e. the number of
00105 *          columns of the distributed submatrix sub( A ). N >= 0.
00106 *
00107 *  A       (local input/local output) COMPLEX*16 pointer into the
00108 *          local memory to an array of dimension (LLD_A, LOCc(JA+N-1)).
00109 *          On entry, this local array contains the local pieces of the
00110 *          distributed matrix sub( A ) to which the row or columns
00111 *          interchanges will be applied. On exit, this array contains
00112 *          the local pieces of the permuted distributed matrix.
00113 *
00114 *  IA      (global input) INTEGER
00115 *          The row index in the global array A indicating the first
00116 *          row of sub( A ).
00117 *
00118 *  JA      (global input) INTEGER
00119 *          The column index in the global array A indicating the
00120 *          first column of sub( A ).
00121 *
00122 *  DESCA   (global and local input) INTEGER array of dimension DLEN_.
00123 *          The array descriptor for the distributed matrix A.
00124 *
00125 *  IPIV    (input) INTEGER array, dimension >= LOCr(M_A)+MB_A if
00126 *          ROWCOL = 'R', LOCc(N_A)+NB_A otherwise. It contains
00127 *          the pivoting information. IPIV(i) is the global row (column),
00128 *          local row (column) i was swapped with.  The last piece of the
00129 *          array of size MB_A (resp. NB_A) is used as workspace. IPIV is
00130 *          tied to the distributed matrix A.
00131 *
00132 *  IP      (global input) INTEGER
00133 *          IPIV's global row index, which points to the beginning of the
00134 *          submatrix which is to be operated on.
00135 *
00136 *  JP      (global input) INTEGER
00137 *          IPIV's global column index, which points to the beginning of
00138 *          the submatrix which is to be operated on.
00139 *
00140 *  DESCIP  (global and local input) INTEGER array of dimension 8
00141 *          The array descriptor for the distributed matrix IPIV.
00142 *
00143 *  =====================================================================
00144 *
00145 *     .. Parameters ..
00146       INTEGER            BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
00147      $                   LLD_, MB_, M_, NB_, N_, RSRC_
00148       PARAMETER          ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1,
00149      $                     CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6,
00150      $                     RSRC_ = 7, CSRC_ = 8, LLD_ = 9 )
00151 *     ..
00152 *     .. Local Scalars ..
00153       LOGICAL            FORWRD, ROWPVT
00154       INTEGER            I, IB, ICTXT, ICURCOL, ICURROW, IIP, IP1, ITMP,
00155      $                   IPVWRK, J, JB, JJP, JP1, K, MA, MBA, MYCOL,
00156      $                   MYROW, NBA, NPCOL, NPROW
00157 *     ..
00158 *     .. External Subroutines ..
00159       EXTERNAL           BLACS_GRIDINFO, IGEBS2D, IGEBR2D, INFOG2L,
00160      $                   PZSWAP
00161 *     ..
00162 *     .. External Functions ..
00163       LOGICAL            LSAME
00164       INTEGER            ICEIL, NUMROC
00165       EXTERNAL           ICEIL, LSAME, NUMROC
00166 *     ..
00167 *     .. Intrinsic Functions ..
00168       INTRINSIC          MIN, MOD
00169 *     ..
00170 *     .. Executable Statements ..
00171 *
00172       ROWPVT = LSAME( ROWCOL, 'R' )
00173       IF( ROWPVT ) THEN
00174          IF( M.LE.1 .OR. N.LT.1 )
00175      $      RETURN
00176       ELSE
00177          IF( M.LT.1 .OR. N.LE.1 )
00178      $      RETURN
00179       END IF
00180       FORWRD = LSAME( DIREC, 'F' )
00181 *
00182 *
00183 *     Get grid and matrix parameters
00184 *
00185       MA    = DESCA( M_ )
00186       MBA   = DESCA( MB_ )
00187       NBA   = DESCA( NB_ )
00188       ICTXT = DESCA( CTXT_ )
00189       CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )
00190 *
00191 *     If I'm applying pivots from beginning to end (e.g., repeating
00192 *     pivoting done earlier).  Thus this section computes P * sub( A ).
00193 *
00194       IF( FORWRD ) THEN
00195          CALL INFOG2L( IP, JP, DESCIP, NPROW, NPCOL, MYROW, MYCOL,
00196      $                 IIP, JJP, ICURROW, ICURCOL )
00197 *
00198 *        If I'm pivoting the rows of sub( A )
00199 *
00200          IF( ROWPVT ) THEN
00201             IPVWRK = NUMROC( DESCIP( M_ ), DESCIP( MB_ ), MYROW,
00202      $                       DESCIP( RSRC_ ), NPROW ) + 1 -
00203      $                       DESCIP( MB_ )
00204 *
00205 *          Loop over rows of sub( A )
00206 *
00207             I = IA
00208             IB = MIN( M, ICEIL( IA, MBA ) * MBA - IA + 1 )
00209    10       CONTINUE
00210 *
00211 *              Find local pointer into IPIV, and broadcast this block's
00212 *              pivot information to everyone in process column
00213 *
00214                IF( MYROW.EQ.ICURROW ) THEN
00215                   CALL IGEBS2D( ICTXT, 'Columnwise', ' ', IB, 1,
00216      $                          IPIV( IIP ), IB )
00217                   ITMP = IIP
00218                   IIP = IIP + IB
00219                ELSE
00220                   ITMP = IPVWRK
00221                   CALL IGEBR2D( ICTXT, 'Columnwise', ' ', IB, 1,
00222      $                          IPIV( ITMP ), IB, ICURROW, MYCOL )
00223                END IF
00224 *
00225 *              Pivot the block of rows
00226 *
00227                DO 20 K = I, I+IB-1
00228                   IP1 = IPIV( ITMP ) - IP + IA
00229                   IF( IP1.NE.K )
00230      $               CALL PZSWAP( N, A, K, JA, DESCA, MA, A, IP1, JA,
00231      $                            DESCA, MA )
00232                   ITMP = ITMP + 1
00233    20          CONTINUE
00234 *
00235 *              Go on to next row of processes, increment row counter,
00236 *              and figure number of rows to pivot next
00237 *
00238                ICURROW = MOD( ICURROW+1, NPROW )
00239                I = I + IB
00240                IB = MIN( MBA, M-I+IA )
00241             IF( IB .GT. 0 ) GOTO 10
00242 *
00243 *        If I am pivoting the columns of sub( A )
00244 *
00245          ELSE
00246             IPVWRK = NUMROC( DESCIP( N_ ), DESCIP( NB_ ), MYCOL,
00247      $                       DESCIP( CSRC_ ), NPCOL ) + 1 -
00248      $                       DESCIP( NB_ )
00249 *
00250 *          Loop over columns of sub( A )
00251 *
00252             J = JA
00253             JB = MIN( N, ICEIL( JA, NBA ) * NBA - JA + 1 )
00254    30       CONTINUE
00255 *
00256 *              Find local pointer into IPIV, and broadcast this block's
00257 *              pivot information to everyone in process row
00258 *
00259                IF( MYCOL.EQ.ICURCOL ) THEN
00260                   CALL IGEBS2D( ICTXT, 'Rowwise', ' ', JB, 1,
00261      $                          IPIV( JJP ), JB )
00262                   ITMP = JJP
00263                   JJP = JJP + JB
00264                ELSE
00265                   ITMP = IPVWRK
00266                   CALL IGEBR2D( ICTXT, 'Rowwise', ' ', JB, 1,
00267      $                          IPIV( ITMP ), JB, MYROW, ICURCOL )
00268                END IF
00269 *
00270 *              Pivot the block of columns
00271 *
00272                DO 40 K = J, J+JB-1
00273                   JP1 = IPIV( ITMP ) - JP + JA
00274                   IF( JP1.NE.K )
00275      $               CALL PZSWAP( M, A, IA, K, DESCA, 1, A, IA, JP1,
00276      $                            DESCA, 1 )
00277                   ITMP = ITMP + 1
00278    40          CONTINUE
00279 *
00280 *              Go on to next column of processes, increment column
00281 *              counter, and figure number of columns to pivot next
00282 *
00283                ICURCOL = MOD( ICURCOL+1, NPCOL )
00284                J = J + JB
00285                JB = MIN( NBA, N-J+JA )
00286             IF( JB .GT. 0 ) GOTO 30
00287          END IF
00288 *
00289 *     If I want to apply pivots in reverse order, i.e. reversing
00290 *     pivoting done earlier.  Thus this section computes
00291 *     inv( P ) * sub( A ).
00292 *
00293       ELSE
00294 *
00295 *        If I'm pivoting the rows of sub( A )
00296 *
00297          IF( ROWPVT ) THEN
00298             CALL INFOG2L( IP+M-1, JP, DESCIP, NPROW, NPCOL, MYROW,
00299      $                    MYCOL, IIP, JJP, ICURROW, ICURCOL )
00300 *
00301             IPVWRK = NUMROC( DESCIP( M_ ), DESCIP( MB_ ), MYROW,
00302      $                       DESCIP( RSRC_ ), NPROW ) + 1 -
00303      $                       DESCIP( MB_ )
00304 *
00305 *           If I'm not in the current process row, my IIP points out
00306 *           past end of pivot vector (since I don't own a piece of the
00307 *           last row). Adjust IIP so it points at last pivot entry.
00308 *
00309             IF( MYROW.NE.ICURROW ) IIP = IIP - 1
00310 *
00311 *           Loop over rows in reverse order, starting at last row
00312 *
00313             I = IA + M - 1
00314             IB = MOD( I, MBA )
00315             IF( IB .EQ. 0 ) IB = MBA
00316             IB = MIN( IB, M )
00317    50       CONTINUE
00318 *
00319 *              Find local pointer into IPIV, and broadcast this block's
00320 *              pivot information to everyone in process column
00321 *
00322                IF( MYROW.EQ.ICURROW ) THEN
00323                   ITMP = IIP
00324                   IIP = IIP - IB
00325                   CALL IGEBS2D( ICTXT, 'Columnwise', ' ', IB, 1,
00326      $                          IPIV( IIP+1 ), IB )
00327                ELSE
00328                   CALL IGEBR2D( ICTXT, 'Columnwise', ' ', IB, 1,
00329      $                          IPIV( IPVWRK ), IB, ICURROW, MYCOL )
00330                   ITMP = IPVWRK + IB - 1
00331                END IF
00332 *
00333 *              Pivot the block of rows
00334 *
00335                DO 60 K = I, I-IB+1, -1
00336                   IP1 = IPIV( ITMP ) - IP + IA
00337                   IF( IP1.NE.K )
00338      $               CALL PZSWAP( N, A, K, JA, DESCA, MA, A, IP1, JA,
00339      $                            DESCA, MA )
00340                   ITMP = ITMP - 1
00341    60          CONTINUE
00342 *
00343 *              Go to previous row of processes, decrement row counter,
00344 *              and figure number of rows to be pivoted next
00345 *
00346                ICURROW = MOD( NPROW+ICURROW-1, NPROW )
00347                I = I - IB
00348                IB = MIN( MBA, I-IA+1 )
00349             IF( IB .GT. 0 ) GOTO 50
00350 *
00351 *        Otherwise, I'm pivoting the columns of sub( A )
00352 *
00353          ELSE
00354             CALL INFOG2L( IP, JP+N-1, DESCIP, NPROW, NPCOL, MYROW,
00355      $                    MYCOL, IIP, JJP, ICURROW, ICURCOL )
00356             IPVWRK = NUMROC( DESCIP( N_ ), DESCIP( NB_ ), MYCOL,
00357      $                       DESCIP( CSRC_ ), NPCOL ) + 1 -
00358      $                       DESCIP( NB_ )
00359 *
00360 *           If I'm not in the current process column, my JJP points out
00361 *           past end of pivot vector (since I don't own a piece of the
00362 *           last column). Adjust JJP so it points at last pivot entry.
00363 *
00364             IF( MYCOL.NE.ICURCOL ) JJP = JJP - 1
00365 *
00366 *          Loop over columns in reverse order starting at last column
00367 *
00368             J = JA + N - 1
00369             JB = MOD( J, NBA )
00370             IF( JB .EQ. 0 ) JB = NBA
00371             JB = MIN( JB, N )
00372    70       CONTINUE
00373 *
00374 *              Find local pointer into IPIV, and broadcast this block's
00375 *              pivot information to everyone in process row
00376 *
00377                IF( MYCOL.EQ.ICURCOL ) THEN
00378                   ITMP = JJP
00379                   JJP = JJP - JB
00380                   CALL IGEBS2D( ICTXT, 'Rowwise', ' ', JB, 1,
00381      $                          IPIV( JJP+1 ), JB )
00382                ELSE
00383                   CALL IGEBR2D( ICTXT, 'Rowwise', ' ', JB, 1,
00384      $                          IPIV( IPVWRK ), JB, MYROW, ICURCOL )
00385                   ITMP = IPVWRK + JB - 1
00386                END IF
00387 *
00388 *              Pivot a block of columns
00389 *
00390                DO 80 K = J, J-JB+1, -1
00391                   JP1 = IPIV( ITMP ) - JP + JA
00392                   IF( JP1.NE.K )
00393      $               CALL PZSWAP( M, A, IA, K, DESCA, 1, A, IA, JP1,
00394      $                            DESCA, 1 )
00395                   ITMP = ITMP - 1
00396    80          CONTINUE
00397 *
00398 *              Go to previous row of processes, decrement row counter,
00399 *              and figure number of rows to be pivoted next
00400 *
00401                ICURCOL = MOD( NPCOL+ICURCOL-1, NPCOL )
00402                J = J - JB
00403                JB = MIN( NBA, J-JA+1 )
00404             IF( JB .GT. 0 ) GOTO 70
00405          END IF
00406 *
00407       END IF
00408 *
00409       RETURN
00410 *
00411 *     End PZLAPV2
00412 *
00413       END