ScaLAPACK  2.0.2
ScaLAPACK: Scalable Linear Algebra PACKage
pcmax1.f
Go to the documentation of this file.
00001       SUBROUTINE PCMAX1( N, AMAX, INDX, X, IX, JX, DESCX, INCX )
00002 *
00003 *  -- ScaLAPACK auxiliary routine (version 1.7) --
00004 *     University of Tennessee, Knoxville, Oak Ridge National Laboratory,
00005 *     and University of California, Berkeley.
00006 *     May 25, 2001
00007 *
00008 *     .. Scalar Arguments ..
00009       INTEGER            INDX, INCX, IX, JX, N
00010       COMPLEX            AMAX
00011 *     ..
00012 *     .. Array Arguments ..
00013       INTEGER            DESCX( * )
00014       COMPLEX            X( * )
00015 *     ..
00016 *
00017 *  Purpose
00018 *  =======
00019 *
00020 *  PCMAX1 computes the global index of the maximum element in absolute
00021 *  value of a distributed vector sub( X ). The global index is returned
00022 *  in INDX and the value is returned in AMAX,
00023 *
00024 *  where sub( X ) denotes X(IX:IX+N-1,JX) if INCX = 1,
00025 *                         X(IX,JX:JX+N-1) if INCX = M_X.
00026 *
00027 *  Notes
00028 *  =====
00029 *
00030 *  Each global data object is described by an associated description
00031 *  vector.  This vector stores the information required to establish
00032 *  the mapping between an object element and its corresponding process
00033 *  and memory location.
00034 *
00035 *  Let A be a generic term for any 2D block cyclicly distributed array.
00036 *  Such a global array has an associated description vector DESCA.
00037 *  In the following comments, the character _ should be read as
00038 *  "of the global array".
00039 *
00040 *  NOTATION        STORED IN      EXPLANATION
00041 *  --------------- -------------- --------------------------------------
00042 *  DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
00043 *                                 DTYPE_A = 1.
00044 *  CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
00045 *                                 the BLACS process grid A is distribu-
00046 *                                 ted over. The context itself is glo-
00047 *                                 bal, but the handle (the integer
00048 *                                 value) may vary.
00049 *  M_A    (global) DESCA( M_ )    The number of rows in the global
00050 *                                 array A.
00051 *  N_A    (global) DESCA( N_ )    The number of columns in the global
00052 *                                 array A.
00053 *  MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
00054 *                                 the rows of the array.
00055 *  NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
00056 *                                 the columns of the array.
00057 *  RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
00058 *                                 row of the array A is distributed.
00059 *  CSRC_A (global) DESCA( CSRC_ ) The process column over which the
00060 *                                 first column of the array A is
00061 *                                 distributed.
00062 *  LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
00063 *                                 array.  LLD_A >= MAX(1,LOCr(M_A)).
00064 *
00065 *  Let K be the number of rows or columns of a distributed matrix,
00066 *  and assume that its process grid has dimension p x q.
00067 *  LOCr( K ) denotes the number of elements of K that a process
00068 *  would receive if K were distributed over the p processes of its
00069 *  process column.
00070 *  Similarly, LOCc( K ) denotes the number of elements of K that a
00071 *  process would receive if K were distributed over the q processes of
00072 *  its process row.
00073 *  The values of LOCr() and LOCc() may be determined via a call to the
00074 *  ScaLAPACK tool function, NUMROC:
00075 *          LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
00076 *          LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
00077 *  An upper bound for these quantities may be computed by:
00078 *          LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
00079 *          LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
00080 *
00081 *  Because vectors may be viewed as a subclass of matrices, a
00082 *  distributed vector is considered to be a distributed matrix.
00083 *
00084 *  When the result of a vector-oriented PBLAS call is a scalar, it will
00085 *  be made available only within the scope which owns the vector(s)
00086 *  being operated on.  Let X be a generic term for the input vector(s).
00087 *  Then, the processes which receive the answer will be (note that if
00088 *  an operation involves more than one vector, the processes which re-
00089 *  ceive the result will be the union of the following calculation for
00090 *  each vector):
00091 *
00092 *  If N = 1, M_X = 1 and INCX = 1, then one can't determine if a process
00093 *  row or process column owns the vector operand, therefore only the
00094 *  process of coordinate {RSRC_X, CSRC_X} receives the result;
00095 *
00096 *  If INCX = M_X, then sub( X ) is a vector distributed over a process
00097 *  row. Each process part of this row receives the result;
00098 *
00099 *  If INCX = 1, then sub( X ) is a vector distributed over a process
00100 *  column. Each process part of this column receives the result;
00101 *
00102 *  Based on PCAMAX from Level 1 PBLAS. The change is to use the
00103 *  'genuine' absolute value.
00104 *
00105 *  The serial version was contributed to LAPACK by Nick Higham for use
00106 *  with CLACON.
00107 *
00108 *  Arguments
00109 *  =========
00110 *
00111 *  N       (global input) pointer to INTEGER
00112 *          The number of components of the distributed vector sub( X ).
00113 *          N >= 0.
00114 *
00115 *  AMAX    (global output) pointer to REAL
00116 *          The absolute value of the largest entry of the distributed
00117 *          vector sub( X ) only in the scope of sub( X ).
00118 *
00119 *  INDX    (global output) pointer to INTEGER
00120 *          The global index of the element of the distributed vector
00121 *          sub( X ) whose real part has maximum absolute value.
00122 *
00123 *  X       (local input) COMPLEX array containing the local
00124 *          pieces of a distributed matrix of dimension of at least
00125 *              ( (JX-1)*M_X + IX + ( N - 1 )*abs( INCX ) )
00126 *          This array contains the entries of the distributed vector
00127 *          sub( X ).
00128 *
00129 *  IX      (global input) INTEGER
00130 *          The row index in the global array X indicating the first
00131 *          row of sub( X ).
00132 *
00133 *  JX      (global input) INTEGER
00134 *          The column index in the global array X indicating the
00135 *          first column of sub( X ).
00136 *
00137 *  DESCX   (global and local input) INTEGER array of dimension DLEN_.
00138 *          The array descriptor for the distributed matrix X.
00139 *
00140 *  INCX    (global input) INTEGER
00141 *          The global increment for the elements of X. Only two values
00142 *          of INCX are supported in this version, namely 1 and M_X.
00143 *          INCX must not be zero.
00144 *
00145 *  =====================================================================
00146 *
00147 *     .. Parameters ..
00148       INTEGER            BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
00149      $                   LLD_, MB_, M_, NB_, N_, RSRC_
00150       PARAMETER          ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1,
00151      $                     CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6,
00152      $                     RSRC_ = 7, CSRC_ = 8, LLD_ = 9 )
00153       COMPLEX            ZERO
00154       PARAMETER          ( ZERO = ( 0.0E+0, 0.0E+0 ) )
00155 *     ..
00156 *     .. Local Scalars ..
00157       CHARACTER          CBTOP, CCTOP, RBTOP, RCTOP
00158       INTEGER            ICOFF, ICTXT, IDUMM, IIX, IROFF, IXCOL, IXROW,
00159      $                   JJX, LCINDX, LDX, MAXPOS, MYCOL, MYROW, NP,
00160      $                   NPCOL, NPROW, NQ
00161 *     ..
00162 *     .. Local Arrays ..
00163       COMPLEX            WORK( 2 )
00164 *     ..
00165 *     .. External Subroutines ..
00166       EXTERNAL           BLACS_GRIDINFO, CCOMBAMAX1, CGAMX2D,
00167      $                   IGEBR2D, IGEBS2D, INFOG2L, PCTREECOMB,
00168      $                   PB_TOPGET
00169 *     ..
00170 *     .. External Functions ..
00171       LOGICAL            LSAME
00172       INTEGER            ICMAX1, INDXL2G, NUMROC
00173       EXTERNAL           ICMAX1, INDXL2G, NUMROC
00174 *     ..
00175 *     .. Intrinsic Functions ..
00176       INTRINSIC          ABS, CMPLX, MOD, NINT, REAL
00177 *     ..
00178 *     .. Executable Statements ..
00179 *
00180 *     Get grid parameters
00181 *
00182       ICTXT = DESCX( CTXT_ )
00183       CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )
00184 *
00185 *     Quick return if possible.
00186 *
00187       INDX = 0
00188       AMAX = ZERO
00189       IF( N.LE.0 )
00190      $   RETURN
00191 *
00192 *     Retrieve local information for vector X.
00193 *
00194       LDX = DESCX( LLD_ )
00195       CALL INFOG2L( IX, JX, DESCX, NPROW, NPCOL, MYROW, MYCOL, IIX, JJX,
00196      $              IXROW, IXCOL )
00197 *
00198       IF( INCX.EQ.1 .AND. DESCX( M_ ).EQ.1 .AND. N.EQ.1 ) THEN
00199          INDX = JX
00200          AMAX = X( IIX+(JJX-1)*LDX )
00201          RETURN
00202       END IF
00203 *
00204 *     Find the maximum value and its index
00205 *
00206       IF( INCX.EQ.DESCX( M_ ) ) THEN
00207 *
00208          IF( MYROW.EQ.IXROW ) THEN
00209 *
00210             ICOFF = MOD( JX-1, DESCX( NB_ ) )
00211             NQ = NUMROC( N+ICOFF, DESCX( NB_ ), MYCOL, IXCOL, NPCOL )
00212             IF( MYCOL.EQ.IXCOL )
00213      $         NQ = NQ-ICOFF
00214 *
00215             CALL PB_TOPGET( ICTXT, 'Broadcast', 'Rowwise', RBTOP )
00216 *
00217             IF( LSAME( RBTOP, ' ' ) ) THEN
00218 *
00219                IF( NQ.GT.0 ) THEN
00220                   LCINDX = JJX-1+ICMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX )
00221                   WORK( 1 ) = X( IIX+(LCINDX-1)*LDX )
00222                   WORK( 2 ) = CMPLX( REAL( INDXL2G( LCINDX,
     $              DESCX( NB_ ), MYCOL, DESCX( CSRC_ ), NPCOL ) ) )
00223                ELSE
00224                   WORK( 1 ) = ZERO
00225                   WORK( 2 ) = ZERO
00226                END IF
00227 *
00228                CALL PCTREECOMB( ICTXT, 'Row', 2, WORK, -1, MYCOL,
00229      $                          CCOMBAMAX1 )
00230 *
00231                AMAX = WORK( 1 )
00232                IF( AMAX.EQ.ZERO ) THEN
00233                   INDX = JX
00234                ELSE
00235                   INDX = NINT( REAL( WORK( 2 ) ) )
00236                END IF
00237 *
00238             ELSE
00239 *
00240                CALL PB_TOPGET( ICTXT, 'Combine', 'Rowwise', RCTOP )
00241 *
00242                IF( NQ.GT.0 ) THEN
00243                   LCINDX = JJX-1+ICMAX1( NQ, X( IIX+(JJX-1)*LDX ), LDX )
00244                   AMAX = X( IIX + (LCINDX-1)*LDX )
00245                ELSE
00246                   AMAX = ZERO
00247                END IF
00248 *
00249 *              Find the maximum value
00250 *
00251                CALL CGAMX2D( ICTXT, 'Rowwise', RCTOP, 1, 1, AMAX, 1,
00252      $                       IDUMM, MAXPOS, 1, -1, MYROW )
00253 *
00254                IF( AMAX.NE.ZERO ) THEN
00255 *
00256 *                 Broadcast corresponding global index
00257 *
00258                   IF( MYCOL.EQ.MAXPOS ) THEN
00259                      INDX = INDXL2G( LCINDX, DESCX( NB_ ), MYCOL,
00260      $                               DESCX( CSRC_ ), NPCOL )
00261                      CALL IGEBS2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX,
00262      $                             1 )
00263                   ELSE
00264                      CALL IGEBR2D( ICTXT, 'Rowwise', RBTOP, 1, 1, INDX,
00265      $                             1, MYROW, MAXPOS )
00266                   END IF
00267 *
00268                ELSE
00269 *
00270                   INDX = JX
00271 *
00272                END IF
00273 *
00274             END IF
00275 *
00276          END IF
00277 *
00278       ELSE
00279 *
00280          IF( MYCOL.EQ.IXCOL ) THEN
00281 *
00282             IROFF = MOD( IX-1, DESCX( MB_ ) )
00283             NP = NUMROC( N+IROFF, DESCX( MB_ ), MYROW, IXROW, NPROW )
00284             IF( MYROW.EQ.IXROW )
00285      $         NP = NP-IROFF
00286 *
00287             CALL PB_TOPGET( ICTXT, 'Broadcast', 'Columnwise', CBTOP )
00288 *
00289             IF( LSAME( CBTOP, ' ' ) ) THEN
00290 *
00291                IF( NP.GT.0 ) THEN
00292                   LCINDX = IIX-1+ICMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 )
00293                   WORK( 1 ) = X( LCINDX + (JJX-1)*LDX )
00294                   WORK( 2 ) = CMPLX( REAL( INDXL2G( LCINDX,
     $              DESCX( MB_ ), MYROW, DESCX( RSRC_ ), NPROW ) ) )
00295                ELSE
00296                   WORK( 1 ) = ZERO
00297                   WORK( 2 ) = ZERO
00298                END IF
00299 *
00300                CALL PCTREECOMB( ICTXT, 'Column', 2, WORK, -1, MYCOL,
00301      $                          CCOMBAMAX1 )
00302 *
00303                AMAX = WORK( 1 )
00304                IF( AMAX.EQ.ZERO ) THEN
00305                   INDX = IX
00306                ELSE
00307                   INDX = NINT( REAL( WORK( 2 ) ) )
00308                END IF
00309 *
00310             ELSE
00311 *
00312                CALL PB_TOPGET( ICTXT, 'Combine', 'Columnwise', CCTOP )
00313 *
00314                IF( NP.GT.0 ) THEN
00315                   LCINDX = IIX-1+ICMAX1( NP, X( IIX+(JJX-1)*LDX ), 1 )
00316                   AMAX = X( LCINDX + (JJX-1)*LDX )
00317                ELSE
00318                   AMAX = ZERO
00319                END IF
00320 *
00321 *              Find the maximum value
00322 *
00323                CALL CGAMX2D( ICTXT, 'Columnwise', CCTOP, 1, 1, AMAX, 1,
00324      $                       MAXPOS, IDUMM, 1, -1, MYCOL )
00325 *
00326                IF( AMAX.NE.ZERO ) THEN
00327 *
00328 *                 Broadcast corresponding global index
00329 *
00330                   IF( MYROW.EQ.MAXPOS ) THEN
00331                      INDX = INDXL2G( LCINDX, DESCX( MB_ ), MYROW,
00332      $                               DESCX( RSRC_ ), NPROW )
00333                      CALL IGEBS2D( ICTXT, 'Columnwise', CBTOP, 1, 1,
00334      $                             INDX, 1 )
00335                   ELSE
00336                      CALL IGEBR2D( ICTXT, 'Columnwise', CBTOP, 1, 1,
00337      $                             INDX, 1, MAXPOS, MYCOL )
00338                   END IF
00339 *
00340                ELSE
00341 *
00342                   INDX = IX
00343 *
00344                END IF
00345 *
00346             END IF
00347 *
00348          END IF
00349 *
00350       END IF
00351 *
00352       RETURN
00353 *
00354 *     End of PCMAX1
00355 *
00356       END
00357 *
00358       SUBROUTINE CCOMBAMAX1 ( V1, V2 )
00359 *
00360 *  -- ScaLAPACK auxiliary routine (version 1.7) --
00361 *     University of Tennessee, Knoxville, Oak Ridge National Laboratory,
00362 *     and University of California, Berkeley.
00363 *     May 1, 1997
00364 *
00365 *     .. Array Arguments ..
00366       COMPLEX            V1( 2 ), V2( 2 )
00367 *     ..
00368 *
00369 *  Purpose
00370 *  =======
00371 *
00372 *  CCOMBAMAX1 finds the element having maximum real part absolute
00373 *  value as well as its corresponding globl index.
00374 *
00375 *  Arguments
00376 *  =========
00377 *
00378 *  V1        (local input/local output) COMPLEX array of
00379 *            dimension 2.  The first maximum absolute value element and
00380 *            its global index. V1(1) = AMAX, V1(2) = INDX.
00381 *
00382 *  V2        (local input) COMPLEX array of dimension 2.
00383 *            The second maximum absolute value element and its global
00384 *            index. V2(1) = AMAX, V2(2) = INDX.
00385 *
00386 *  =====================================================================
00387 *
00388 *     .. Intrinsic Functions ..
00389       INTRINSIC          ABS, REAL
00390 *     ..
00391 *     .. Executable Statements ..
00392 *
00393       IF( ABS( REAL( V1( 1 ) ) ).LT.ABS( REAL( V2( 1 ) ) ) ) THEN
00394          V1( 1 ) = V2( 1 )
00395          V1( 2 ) = V2( 2 )
00396       END IF
00397 *
00398       RETURN
00399 *
00400 *     End of CCOMBAMAX1
00401 *
00402       END
00403