      SUBROUTINE pdoptee( ICTXT, NOUT, SUBPTR, SCODE, SNAME )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            ICTXT, NOUT, SCODE

*     ..

*     .. Array Arguments ..

      CHARACTER*(*)      SNAME

*     ..

*     .. Subroutine Arguments ..

      EXTERNAL           subptr

*     ..

*

*  Purpose

*  =======

*

*  PDOPTEE  tests  whether  the  PBLAS respond correctly to a bad option

*  argument.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  NOUT    (global input) INTEGER

*          On entry, NOUT specifies the unit number for the output file.

*          When NOUT is 6, output to screen,  when  NOUT is 0, output to

*          stderr. NOUT is only defined for process 0.

*

*  SUBPTR  (global input) SUBROUTINE

*          On entry,  SUBPTR  is  a  subroutine. SUBPTR must be declared

*          EXTERNAL in the calling subroutine.

*

*  SCODE   (global input) INTEGER

*          On entry, SCODE specifies the calling sequence code.

*

*  SNAME   (global input) CHARACTER*(*)

*          On entry,  SNAME  specifies  the subroutine name calling this

*          subprogram.

*

*  Calling sequence encodings

*  ==========================

*

*  code Formal argument list                                Examples

*

*  11   (n,      v1,v2)                                     _SWAP, _COPY

*  12   (n,s1,   v1   )                                     _SCAL, _SCAL

*  13   (n,s1,   v1,v2)                                     _AXPY, _DOT_

*  14   (n,s1,i1,v1   )                                     _AMAX

*  15   (n,u1,   v1   )                                     _ASUM, _NRM2

*

*  21   (     trans,     m,n,s1,m1,v1,s2,v2)                _GEMV

*  22   (uplo,             n,s1,m1,v1,s2,v2)                _SYMV, _HEMV

*  23   (uplo,trans,diag,  n,   m1,v1      )                _TRMV, _TRSV

*  24   (                m,n,s1,v1,v2,m1)                   _GER_

*  25   (uplo,             n,s1,v1,   m1)                   _SYR

*  26   (uplo,             n,u1,v1,   m1)                   _HER

*  27   (uplo,             n,s1,v1,v2,m1)                   _SYR2, _HER2

*

*  31   (          transa,transb,     m,n,k,s1,m1,m2,s2,m3) _GEMM

*  32   (side,uplo,                   m,n,  s1,m1,m2,s2,m3) _SYMM, _HEMM

*  33   (     uplo,trans,               n,k,s1,m1,   s2,m3) _SYRK

*  34   (     uplo,trans,               n,k,u1,m1,   u2,m3) _HERK

*  35   (     uplo,trans,               n,k,s1,m1,m2,s2,m3) _SYR2K

*  36   (     uplo,trans,               n,k,s1,m1,m2,u2,m3) _HER2K

*  37   (                             m,n,  s1,m1,   s2,m3) _TRAN_

*  38   (side,uplo,transa,       diag,m,n,  s1,m1,m2      ) _TRMM, _TRSM

*  39   (          trans,             m,n,  s1,m1,   s2,m3) _GEADD

*  40   (     uplo,trans,             m,n,  s1,m1,   s2,m3) _TRADD

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER             APOS

*     ..

*     .. External Subroutines ..

      EXTERNAL            pdchkopt

*     ..

*     .. Executable Statements ..

*

*     Level 2 PBLAS

*

      IF( scode.EQ.21 ) THEN

*

*        Check 1st (and only) option

*

         apos = 1

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'A', apos )

*

      ELSE IF( scode.EQ.22 .OR. scode.EQ.25 .OR. scode.EQ.26 .OR.

     $         scode.EQ.27 ) THEN

*

*        Check 1st (and only) option

*

         apos = 1

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'U', apos )

*

      ELSE IF( scode.EQ.23 ) THEN

*

*        Check 1st option

*

         apos = 1

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'U', apos )

*

*        Check 2nd option

*

         apos = 2

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*        Check 3rd option

*

         apos = 3

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'D', apos )

*

*     Level 3 PBLAS

*

      ELSE IF( scode.EQ.31 ) THEN

*

*        Check 1st option

*

         apos = 1

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*        Check 2nd option

*

         apos = 2

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'B', apos )

*

      ELSE IF( scode.EQ.32 ) THEN

*

*        Check 1st option

*

         apos = 1

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'S', apos )

*

*        Check 2nd option

*

         apos = 2

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'U', apos )

*

      ELSE IF( scode.EQ.33 .OR. scode.EQ.34 .OR. scode.EQ.35 .OR.

     $         scode.EQ.36 .OR. scode.EQ.40 ) THEN

*

*        Check 1st option

*

         apos = 1

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'U', apos )

*

*        Check 2nd option

*

         apos = 2

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'A', apos )

*

      ELSE IF( scode.EQ.38 ) THEN

*

*        Check 1st option

*

         apos = 1

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'S', apos )

*

*        Check 2nd option

*

         apos = 2

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'U', apos )

*

*        Check 3rd option

*

         apos = 3

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*        Check 4th option

*

         apos = 4

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'D', apos )

*

*

      ELSE IF( scode.EQ.39 ) THEN

*

*        Check 1st option

*

         apos = 1

         CALL pdchkopt( ictxt, nout, subptr, scode, sname, 'A', apos )

*

      END IF

*

      RETURN

*

*     End of PDOPTEE

*

      END
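
*
*  Illustrative sketch, not part of the PBLAS test suite: the helper
*  below restates the dispatch performed by PDOPTEE as a lookup. For a
*  calling sequence code SCODE it returns in OPTS the ARGNAM letters
*  that PDOPTEE passes to PDCHKOPT, in argument order, and in NOPT the
*  number of such letters. The name PDOPTLST and its interface are
*  assumptions made for this example only.
*
      SUBROUTINE PDOPTLST( SCODE, NOPT, OPTS )
*     .. Scalar Arguments ..
      INTEGER            NOPT, SCODE
      CHARACTER*4        OPTS
*     .. Intrinsic Functions ..
      INTRINSIC          INDEX
*     .. Executable Statements ..
*
*     Codes without option arguments (11-15, 24, 37) leave OPTS blank
*
      OPTS = ' '
      IF( SCODE.EQ.21 .OR. SCODE.EQ.39 ) THEN
         OPTS = 'A'
      ELSE IF( SCODE.EQ.22 .OR. SCODE.EQ.25 .OR. SCODE.EQ.26 .OR.
     $         SCODE.EQ.27 ) THEN
         OPTS = 'U'
      ELSE IF( SCODE.EQ.23 ) THEN
         OPTS = 'UAD'
      ELSE IF( SCODE.EQ.31 ) THEN
         OPTS = 'AB'
      ELSE IF( SCODE.EQ.32 ) THEN
         OPTS = 'SU'
      ELSE IF( SCODE.EQ.33 .OR. SCODE.EQ.34 .OR. SCODE.EQ.35 .OR.
     $         SCODE.EQ.36 .OR. SCODE.EQ.40 ) THEN
         OPTS = 'UA'
      ELSE IF( SCODE.EQ.38 ) THEN
         OPTS = 'SUAD'
      END IF
*
*     OPTS is blank padded on the right, so the first blank marks the
*     end of the list; no blank at all means all four slots are used
*
      NOPT = INDEX( OPTS, ' ' ) - 1
      IF( NOPT.LT.0 ) NOPT = 4
      RETURN
      END
*
*  With SCODE = 38 ( _TRMM, _TRSM ) this sketch gives OPTS = 'SUAD' and
*  NOPT = 4, matching the four PDCHKOPT calls made by PDOPTEE for that
*  code.
*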

      SUBROUTINE pdchkopt( ICTXT, NOUT, SUBPTR, SCODE, SNAME, ARGNAM,

     $                     ARGPOS )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1         ARGNAM

      INTEGER             ARGPOS, ICTXT, NOUT, SCODE

*     ..

*     .. Array Arguments ..

      CHARACTER*(*)       SNAME

*     ..

*     .. Subroutine Arguments ..

      EXTERNAL            subptr

*     ..

*

*  Purpose

*  =======

*

*  PDCHKOPT tests the option ARGNAM in any PBLAS routine.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  NOUT    (global input) INTEGER

*          On entry, NOUT specifies the unit number for the output file.

*          When NOUT is 6, output to screen,  when  NOUT is 0, output to

*          stderr. NOUT is only defined for process 0.

*

*  SUBPTR  (global input) SUBROUTINE

*          On entry,  SUBPTR  is  a  subroutine. SUBPTR must be declared

*          EXTERNAL in the calling subroutine.

*

*  SCODE   (global input) INTEGER

*          On entry, SCODE specifies the calling sequence code.

*

*  SNAME   (global input) CHARACTER*(*)

*          On entry,  SNAME  specifies  the subroutine name calling this

*          subprogram.

*

*  ARGNAM  (global input) CHARACTER*1

*          On entry,  ARGNAM  specifies  the  name  of  the option to be

*          checked. ARGNAM can be 'D', 'S', 'A', 'B', or 'U'.

*

*  ARGPOS  (global input) INTEGER

*          On entry, ARGPOS indicates the position of the option ARGNAM

*          to be tested.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER            INFOT

*     ..

*     .. External Subroutines ..

      EXTERNAL           pchkpbe, pdcallsub, pdsetpblas

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      EXTERNAL           lsame

*     ..

*     .. Common Blocks ..

      CHARACTER          DIAG, SIDE, TRANSA, TRANSB, UPLO

      COMMON             /pblasc/diag, side, transa, transb, uplo

*     ..

*     .. Executable Statements ..

*

*     Reinitialize the dummy arguments to correct values

*

      CALL pdsetpblas( ictxt )

*

      IF( lsame( argnam, 'D' ) ) THEN

*

*        Generate bad DIAG option

*

         diag = '/'

*

      ELSE IF( lsame( argnam, 'S' ) ) THEN

*

*        Generate bad SIDE option

*

         side = '/'

*

      ELSE IF( lsame( argnam, 'A' ) ) THEN

*

*        Generate bad TRANSA option

*

         transa = '/'

*

      ELSE IF( lsame( argnam, 'B' ) ) THEN

*

*        Generate bad TRANSB option

*

         transb = '/'

*

      ELSE IF( lsame( argnam, 'U' ) ) THEN

*

*        Generate bad UPLO option

*

         uplo = '/'

*

      END IF

*

*     Set INFOT to the position of the bad option argument

*

      infot = argpos

*

*     Call the PBLAS routine

*

      CALL pdcallsub( subptr, scode )

      CALL pchkpbe( ictxt, nout, sname, infot )

*

      RETURN

*

*     End of PDCHKOPT

*

      END

      SUBROUTINE pddimee( ICTXT, NOUT, SUBPTR, SCODE, SNAME )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            ICTXT, NOUT, SCODE

*     ..

*     .. Array Arguments ..

      CHARACTER*(*)      SNAME

*     ..

*     .. Subroutine Arguments ..

      EXTERNAL           subptr

*     ..

*

*  Purpose

*  =======

*

*  PDDIMEE  tests whether the PBLAS respond correctly to a bad dimension

*  argument.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  NOUT    (global input) INTEGER

*          On entry, NOUT specifies the unit number for the output file.

*          When NOUT is 6, output to screen,  when  NOUT is 0, output to

*          stderr. NOUT is only defined for process 0.

*

*  SUBPTR  (global input) SUBROUTINE

*          On entry,  SUBPTR  is  a  subroutine. SUBPTR must be declared

*          EXTERNAL in the calling subroutine.

*

*  SCODE   (global input) INTEGER

*          On entry, SCODE specifies the calling sequence code.

*

*  SNAME   (global input) CHARACTER*(*)

*          On entry,  SNAME  specifies  the subroutine name calling this

*          subprogram.

*

*  Calling sequence encodings

*  ==========================

*

*  code Formal argument list                                Examples

*

*  11   (n,      v1,v2)                                     _SWAP, _COPY

*  12   (n,s1,   v1   )                                     _SCAL, _SCAL

*  13   (n,s1,   v1,v2)                                     _AXPY, _DOT_

*  14   (n,s1,i1,v1   )                                     _AMAX

*  15   (n,u1,   v1   )                                     _ASUM, _NRM2

*

*  21   (     trans,     m,n,s1,m1,v1,s2,v2)                _GEMV

*  22   (uplo,             n,s1,m1,v1,s2,v2)                _SYMV, _HEMV

*  23   (uplo,trans,diag,  n,   m1,v1      )                _TRMV, _TRSV

*  24   (                m,n,s1,v1,v2,m1)                   _GER_

*  25   (uplo,             n,s1,v1,   m1)                   _SYR

*  26   (uplo,             n,u1,v1,   m1)                   _HER

*  27   (uplo,             n,s1,v1,v2,m1)                   _SYR2, _HER2

*

*  31   (          transa,transb,     m,n,k,s1,m1,m2,s2,m3) _GEMM

*  32   (side,uplo,                   m,n,  s1,m1,m2,s2,m3) _SYMM, _HEMM

*  33   (     uplo,trans,               n,k,s1,m1,   s2,m3) _SYRK

*  34   (     uplo,trans,               n,k,u1,m1,   u2,m3) _HERK

*  35   (     uplo,trans,               n,k,s1,m1,m2,s2,m3) _SYR2K

*  36   (     uplo,trans,               n,k,s1,m1,m2,u2,m3) _HER2K

*  37   (                             m,n,  s1,m1,   s2,m3) _TRAN_

*  38   (side,uplo,transa,       diag,m,n,  s1,m1,m2      ) _TRMM, _TRSM

*  39   (          trans,             m,n,  s1,m1,   s2,m3) _GEADD

*  40   (     uplo,trans,             m,n,  s1,m1,   s2,m3) _TRADD

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER             APOS

*     ..

*     .. External Subroutines ..

      EXTERNAL            pdchkdim

*     ..

*     .. Executable Statements ..

*

*     Level 1 PBLAS

*

      IF( scode.EQ.11 .OR. scode.EQ.12 .OR. scode.EQ.13 .OR.

     $    scode.EQ.14 .OR. scode.EQ.15 ) THEN

*

*        Check 1st (and only) dimension

*

         apos = 1

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

*     Level 2 PBLAS

*

      ELSE IF( scode.EQ.21 ) THEN

*

*        Check 1st dimension

*

         apos = 2

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'M', apos )

*

*        Check 2nd dimension

*

         apos = 3

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

      ELSE IF( scode.EQ.22 .OR. scode.EQ.25 .OR. scode.EQ.26 .OR.

     $         scode.EQ.27 ) THEN

*

*        Check 1st (and only) dimension

*

         apos = 2

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

      ELSE IF( scode.EQ.23 ) THEN

*

*        Check 1st (and only) dimension

*

         apos = 4

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

      ELSE IF( scode.EQ.24 ) THEN

*

*        Check 1st dimension

*

         apos = 1

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'M', apos )

*

*        Check 2nd dimension

*

         apos = 2

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

*     Level 3 PBLAS

*

      ELSE IF( scode.EQ.31 ) THEN

*

*        Check 1st dimension

*

         apos = 3

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'M', apos )

*

*        Check 2nd dimension

*

         apos = 4

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

*        Check 3rd dimension

*

         apos = 5

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'K', apos )

*

      ELSE IF( scode.EQ.32 ) THEN

*

*        Check 1st dimension

*

         apos = 3

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'M', apos )

*

*        Check 2nd dimension

*

         apos = 4

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

      ELSE IF( scode.EQ.33 .OR. scode.EQ.34 .OR. scode.EQ.35 .OR.

     $         scode.EQ.36 ) THEN

*

*        Check 1st dimension

*

         apos = 3

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

*        Check 2nd dimension

*

         apos = 4

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'K', apos )

*

      ELSE IF( scode.EQ.37 ) THEN

*

*        Check 1st dimension

*

         apos = 1

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'M', apos )

*

*        Check 2nd dimension

*

         apos = 2

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

      ELSE IF( scode.EQ.38 ) THEN

*

*        Check 1st dimension

*

         apos = 5

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'M', apos )

*

*        Check 2nd dimension

*

         apos = 6

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

      ELSE IF( scode.EQ.39 ) THEN

*

*        Check 1st dimension

*

         apos = 2

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'M', apos )

*

*        Check 2nd dimension

*

         apos = 3

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

      ELSE IF( scode.EQ.40 ) THEN

*

*        Check 1st dimension

*

         apos = 3

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'M', apos )

*

*        Check 2nd dimension

*

         apos = 4

         CALL pdchkdim( ictxt, nout, subptr, scode, sname, 'N', apos )

*

      END IF

*

      RETURN

*

*     End of PDDIMEE

*

      END
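
*
*  Illustrative sketch, not part of the PBLAS test suite: the function
*  below restates the argument positions used by PDDIMEE as a lookup.
*  Given a calling sequence code SCODE and a dimension name DIMNAM
*  ( 'M', 'N' or 'K', upper case ), it returns the APOS value that
*  PDDIMEE passes to PDCHKDIM, or 0 when that dimension is not tested
*  for the code. The name PDDIMPOS and its interface are assumptions
*  made for this example only.
*
      INTEGER FUNCTION PDDIMPOS( SCODE, DIMNAM )
*     .. Scalar Arguments ..
      INTEGER            SCODE
      CHARACTER*1        DIMNAM
*     .. Local Scalars ..
      CHARACTER*6        ARGS
*     .. Intrinsic Functions ..
      INTRINSIC          INDEX
*     .. Executable Statements ..
*
*     ARGS( I:I ) holds the dimension letter found at argument
*     position I, or '.' when that argument is not a dimension.
*     Unknown codes leave ARGS blank, so the result is 0.
*
      ARGS = ' '
      IF( SCODE.GE.11 .AND. SCODE.LE.15 ) THEN
         ARGS = 'N'
      ELSE IF( SCODE.EQ.21 .OR. SCODE.EQ.39 ) THEN
         ARGS = '.MN'
      ELSE IF( SCODE.EQ.22 .OR. SCODE.EQ.25 .OR. SCODE.EQ.26 .OR.
     $         SCODE.EQ.27 ) THEN
         ARGS = '.N'
      ELSE IF( SCODE.EQ.23 ) THEN
         ARGS = '...N'
      ELSE IF( SCODE.EQ.24 .OR. SCODE.EQ.37 ) THEN
         ARGS = 'MN'
      ELSE IF( SCODE.EQ.31 ) THEN
         ARGS = '..MNK'
      ELSE IF( SCODE.EQ.32 .OR. SCODE.EQ.40 ) THEN
         ARGS = '..MN'
      ELSE IF( SCODE.GE.33 .AND. SCODE.LE.36 ) THEN
         ARGS = '..NK'
      ELSE IF( SCODE.EQ.38 ) THEN
         ARGS = '....MN'
      END IF
*
*     INDEX returns 0 when DIMNAM does not occur in ARGS
*
      PDDIMPOS = INDEX( ARGS, DIMNAM )
      RETURN
      END
*
*  With SCODE = 31 ( _GEMM ) this sketch gives PDDIMPOS = 5 for 'K',
*  the same position PDDIMEE uses when it checks the third dimension.
*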

      SUBROUTINE pdchkdim( ICTXT, NOUT, SUBPTR, SCODE, SNAME, ARGNAM,

     $                     ARGPOS )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1         ARGNAM

      INTEGER             ARGPOS, ICTXT, NOUT, SCODE

*     ..

*     .. Array Arguments ..

      CHARACTER*(*)       SNAME

*     ..

*     .. Subroutine Arguments ..

      EXTERNAL            subptr

*     ..

*

*  Purpose

*  =======

*

*  PDCHKDIM tests the dimension ARGNAM in any PBLAS routine.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  NOUT    (global input) INTEGER

*          On entry, NOUT specifies the unit number for the output file.

*          When NOUT is 6, output goes to the screen; when NOUT is 0, it

*          goes to stderr. NOUT is only defined for process 0.

*

*  SUBPTR  (global input) SUBROUTINE

*          On entry,  SUBPTR  is  a  subroutine. SUBPTR must be declared

*          EXTERNAL in the calling subroutine.

*

*  SCODE   (global input) INTEGER

*          On entry, SCODE specifies the calling sequence code.

*

*  SNAME   (global input) CHARACTER*(*)

*          On entry,  SNAME  specifies  the subroutine name calling this

*          subprogram.

*

*  ARGNAM  (global input) CHARACTER*(*)

*          On entry,  ARGNAM  specifies  the name of the dimension to be

*          checked. ARGNAM can either be 'M', 'N' or 'K'.

*

*  ARGPOS  (global input) INTEGER

*          On entry,  ARGPOS  indicates the position of the dimension

*          ARGNAM to be tested.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER            INFOT

*     ..

*     .. External Subroutines ..

      EXTERNAL           pchkpbe, pdcallsub, pdsetpblas

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      EXTERNAL           LSAME

*     ..

*     .. Common Blocks ..

      INTEGER            KDIM, MDIM, NDIM

      COMMON             /PBLASN/KDIM, MDIM, NDIM

*     ..

*     .. Executable Statements ..

*

*     Reinitialize the dummy arguments to correct values

*

      CALL pdsetpblas( ictxt )

*

      IF( lsame( argnam, 'M' ) ) THEN

*

*        Generate bad MDIM

*

         mdim = -1

*

      ELSE IF( lsame( argnam, 'N' ) ) THEN

*

*        Generate bad NDIM

*

         ndim = -1

*

      ELSE

*

*        Generate bad KDIM

*

         kdim = -1

*

      END IF

*

*     Set INFOT to the position of the bad dimension argument

*

      infot = argpos

*

*     Call the PBLAS routine

*

      CALL pdcallsub( subptr, scode )

      CALL pchkpbe( ictxt, nout, sname, infot )

*

      RETURN

*

*     End of PDCHKDIM

*

      END

      SUBROUTINE pdvecee( ICTXT, NOUT, SUBPTR, SCODE, SNAME )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER             ICTXT, NOUT, SCODE

*     ..

*     .. Array Arguments ..

      CHARACTER*7         SNAME

*     ..

*     .. Subroutine Arguments ..

      EXTERNAL            subptr

*     ..

*

*  Purpose

*  =======

*

*  PDVECEE  tests  whether  the  PBLAS respond correctly to a bad vector

*  argument.  Each  vector <vec> is described by: <vec>, I<vec>, J<vec>,

*  DESC<vec>,  INC<vec>.   Out   of  all  these,  only  I<vec>,  J<vec>,

*  DESC<vec>, and INC<vec> can be tested.

*

*  Notes

*  =====

*

*  A description  vector  is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  NOUT    (global input) INTEGER

*          On entry, NOUT specifies the unit number for the output file.

*          When NOUT is 6, output goes to the screen; when NOUT is 0, it

*          goes to stderr. NOUT is only defined for process 0.

*

*  SUBPTR  (global input) SUBROUTINE

*          On entry,  SUBPTR  is  a  subroutine. SUBPTR must be declared

*          EXTERNAL in the calling subroutine.

*

*  SCODE   (global input) INTEGER

*          On entry, SCODE specifies the calling sequence code.

*

*  SNAME   (global input) CHARACTER*(*)

*          On entry,  SNAME  specifies  the subroutine name calling this

*          subprogram.

*

*  Calling sequence encodings

*  ==========================

*

*  code Formal argument list                                Examples

*

*  11   (n,      v1,v2)                                     _SWAP, _COPY

*  12   (n,s1,   v1   )                                     _SCAL, _SCAL

*  13   (n,s1,   v1,v2)                                     _AXPY, _DOT_

*  14   (n,s1,i1,v1   )                                     _AMAX

*  15   (n,u1,   v1   )                                     _ASUM, _NRM2

*

*  21   (     trans,     m,n,s1,m1,v1,s2,v2)                _GEMV

*  22   (uplo,             n,s1,m1,v1,s2,v2)                _SYMV, _HEMV

*  23   (uplo,trans,diag,  n,   m1,v1      )                _TRMV, _TRSV

*  24   (                m,n,s1,v1,v2,m1)                   _GER_

*  25   (uplo,             n,s1,v1,   m1)                   _SYR

*  26   (uplo,             n,u1,v1,   m1)                   _HER

*  27   (uplo,             n,s1,v1,v2,m1)                   _SYR2, _HER2

*

*  31   (          transa,transb,     m,n,k,s1,m1,m2,s2,m3) _GEMM

*  32   (side,uplo,                   m,n,  s1,m1,m2,s2,m3) _SYMM, _HEMM

*  33   (     uplo,trans,               n,k,s1,m1,   s2,m3) _SYRK

*  34   (     uplo,trans,               n,k,u1,m1,   u2,m3) _HERK

*  35   (     uplo,trans,               n,k,s1,m1,m2,s2,m3) _SYR2K

*  36   (     uplo,trans,               n,k,s1,m1,m2,u2,m3) _HER2K

*  37   (                             m,n,  s1,m1,   s2,m3) _TRAN_

*  38   (side,uplo,transa,       diag,m,n,  s1,m1,m2      ) _TRMM, _TRSM

*  39   (          trans,             m,n,  s1,m1,   s2,m3) _GEADD

*  40   (     uplo,trans,             m,n,  s1,m1,   s2,m3) _TRADD
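
*

*  Worked example (illustrative; it assumes the standard PBLAS argument

*  order, with each vector v<i> expanding to <vec>, I<vec>, J<vec>,

*  DESC<vec>, INC<vec> and each matrix m<i> to <mat>, I<mat>, J<mat>,

*  DESC<mat>).  Code 21, PDGEMV, then reads

*

*     PDGEMV( TRANS, M, N, ALPHA, A, IA, JA, DESCA, X, IX, JX, DESCX,

*             INCX, BETA, Y, IY, JY, DESCY, INCY )

*

*  so X is argument 9 and Y is argument 15.  These are the APOS values

*  passed to PDCHKMAT for SCODE = 21 in the code of this routine.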

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER             APOS

*     ..

*     .. External Subroutines ..

      EXTERNAL            pdchkmat

*     ..

*     .. Executable Statements ..

*

*     Level 1 PBLAS

*

      IF( scode.EQ.11 ) THEN

*

*        Check 1st vector

*

         apos = 2

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'X', apos )

*

*        Check 2nd vector

*

         apos = 7

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'Y', apos )

*

      ELSE IF( scode.EQ.12 .OR. scode.EQ.15 ) THEN

*

*        Check 1st (and only) vector

*

         apos = 3

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'X', apos )

*

      ELSE IF( scode.EQ.13 ) THEN

*

*        Check 1st vector

*

         apos = 3

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'X', apos )

*

*        Check 2nd vector

*

         apos = 8

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'Y', apos )

*

      ELSE IF( scode.EQ.14 ) THEN

*

*        Check 1st (and only) vector

*

         apos = 4

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'X', apos )

*

*     Level 2 PBLAS

*

      ELSE IF( scode.EQ.21 ) THEN

*

*        Check 1st vector

*

         apos = 9

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'X', apos )

*

*        Check 2nd vector

*

         apos = 15

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'Y', apos )

*

      ELSE IF( scode.EQ.22 ) THEN

*

*        Check 1st vector

*

         apos = 8

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'X', apos )

*

*        Check 2nd vector

*

         apos = 14

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'Y', apos )

*

      ELSE IF( scode.EQ.23 ) THEN

*

*        Check 1st (and only) vector

*

         apos = 9

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'X', apos )

*

      ELSE IF( scode.EQ.24 .OR. scode.EQ.27 ) THEN

*

*        Check 1st vector

*

         apos = 4

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'X', apos )

*

*        Check 2nd vector

*

         apos = 9

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'Y', apos )

*

      ELSE IF( scode.EQ.25 .OR. scode.EQ.26 ) THEN

*

*        Check 1st (and only) vector

*

         apos = 4

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'X', apos )

*

      END IF

*

      RETURN

*

*     End of PDVECEE

*

      END

      SUBROUTINE pdmatee( ICTXT, NOUT, SUBPTR, SCODE, SNAME )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER             ICTXT, NOUT, SCODE

*     ..

*     .. Array Arguments ..

      CHARACTER*7         SNAME

*     ..

*     .. Subroutine Arguments ..

      EXTERNAL            subptr

*     ..

*

*  Purpose

*  =======

*

*  PDMATEE  tests  whether  the  PBLAS respond correctly to a bad matrix

*  argument.  Each  matrix <mat> is described by: <mat>, I<mat>, J<mat>,

*  and DESC<mat>.  Out  of  all these, only I<mat>, J<mat> and DESC<mat>

*  can be tested.

*

*  Notes

*  =====

*

*  A description  vector  is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  NOUT    (global input) INTEGER

*          On entry, NOUT specifies the unit number for the output file.

*          When NOUT is 6, output goes to the screen; when NOUT is 0, it

*          goes to stderr. NOUT is only defined for process 0.

*

*  SUBPTR  (global input) SUBROUTINE

*          On entry,  SUBPTR  is  a  subroutine. SUBPTR must be declared

*          EXTERNAL in the calling subroutine.

*

*  SCODE   (global input) INTEGER

*          On entry, SCODE specifies the calling sequence code.

*

*  SNAME   (global input) CHARACTER*(*)

*          On entry,  SNAME  specifies  the subroutine name calling this

*          subprogram.

*

*  Calling sequence encodings

*  ==========================

*

*  code Formal argument list                                Examples

*

*  11   (n,      v1,v2)                                     _SWAP, _COPY

*  12   (n,s1,   v1   )                                     _SCAL, _SCAL

*  13   (n,s1,   v1,v2)                                     _AXPY, _DOT_

*  14   (n,s1,i1,v1   )                                     _AMAX

*  15   (n,u1,   v1   )                                     _ASUM, _NRM2

*

*  21   (     trans,     m,n,s1,m1,v1,s2,v2)                _GEMV

*  22   (uplo,             n,s1,m1,v1,s2,v2)                _SYMV, _HEMV

*  23   (uplo,trans,diag,  n,   m1,v1      )                _TRMV, _TRSV

*  24   (                m,n,s1,v1,v2,m1)                   _GER_

*  25   (uplo,             n,s1,v1,   m1)                   _SYR

*  26   (uplo,             n,u1,v1,   m1)                   _HER

*  27   (uplo,             n,s1,v1,v2,m1)                   _SYR2, _HER2

*

*  31   (          transa,transb,     m,n,k,s1,m1,m2,s2,m3) _GEMM

*  32   (side,uplo,                   m,n,  s1,m1,m2,s2,m3) _SYMM, _HEMM

*  33   (     uplo,trans,               n,k,s1,m1,   s2,m3) _SYRK

*  34   (     uplo,trans,               n,k,u1,m1,   u2,m3) _HERK

*  35   (     uplo,trans,               n,k,s1,m1,m2,s2,m3) _SYR2K

*  36   (     uplo,trans,               n,k,s1,m1,m2,u2,m3) _HER2K

*  37   (                             m,n,  s1,m1,   s2,m3) _TRAN_

*  38   (side,uplo,transa,       diag,m,n,  s1,m1,m2      ) _TRMM, _TRSM

*  39   (          trans,             m,n,  s1,m1,   s2,m3) _GEADD

*  40   (     uplo,trans,             m,n,  s1,m1,   s2,m3) _TRADD
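
*

*  Worked example (illustrative; it assumes the standard PBLAS argument

*  order, with each matrix m<i> expanding to <mat>, I<mat>, J<mat>,

*  DESC<mat>).  Code 31, PDGEMM, then reads

*

*     PDGEMM( TRANSA, TRANSB, M, N, K, ALPHA, A, IA, JA, DESCA,

*             B, IB, JB, DESCB, BETA, C, IC, JC, DESCC )

*

*  so A is argument 7, B is argument 11 and C is argument 16.  These

*  are the APOS values passed to PDCHKMAT for SCODE = 31 in the code of

*  this routine.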

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER             APOS

*     ..

*     .. External Subroutines ..

      EXTERNAL            pdchkmat

*     ..

*     .. Executable Statements ..

*

*     Level 2 PBLAS

*

      IF( scode.EQ.21 .OR. scode.EQ.23 ) THEN

*

*        Check 1st (and only) matrix

*

         apos = 5

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

      ELSE IF( scode.EQ.22 ) THEN

*

*        Check 1st (and only) matrix

*

         apos = 4

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

      ELSE IF( scode.EQ.24 .OR. scode.EQ.27 ) THEN

*

*        Check 1st (and only) matrix

*

         apos = 14

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

      ELSE IF( scode.EQ.25 .OR. scode.EQ.26 ) THEN

*

*        Check 1st (and only) matrix

*

         apos = 9

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*     Level 3 PBLAS

*

      ELSE IF( scode.EQ.31 ) THEN

*

*        Check 1st matrix

*

         apos = 7

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*        Check 2nd matrix

*

         apos = 11

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'B', apos )

*

*        Check 3rd matrix

*

         apos = 16

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'C', apos )

*

      ELSE IF( scode.EQ.32 .OR. scode.EQ.35 .OR. scode.EQ.36 ) THEN

*

*        Check 1st matrix

*

         apos = 6

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*        Check 2nd matrix

*

         apos = 10

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'B', apos )

*

*        Check 3rd matrix

*

         apos = 15

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'C', apos )

*

      ELSE IF( scode.EQ.33 .OR. scode.EQ.34 ) THEN

*

*        Check 1st matrix

*

         apos = 6

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*        Check 2nd matrix

*

         apos = 11

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'C', apos )

*

      ELSE IF( scode.EQ.37 ) THEN

*

*        Check 1st matrix

*

         apos = 4

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*        Check 2nd matrix

*

         apos = 9

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'C', apos )

*

      ELSE IF( scode.EQ.38 ) THEN

*

*        Check 1st matrix

*

         apos = 8

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*        Check 2nd matrix

*

         apos = 12

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'B', apos )

*

      ELSE IF( scode.EQ.39 ) THEN

*

*        Check 1st matrix

*

         apos = 5

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*        Check 2nd matrix

*

         apos = 10

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'C', apos )

*

      ELSE IF( scode.EQ.40 ) THEN

*

*        Check 1st matrix

*

         apos = 6

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'A', apos )

*

*        Check 2nd matrix

*

         apos = 11

         CALL pdchkmat( ictxt, nout, subptr, scode, sname, 'C', apos )

*

      END IF

*

      RETURN

*

*     End of PDMATEE

*

      END

      SUBROUTINE pdsetpblas( ICTXT )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            ICTXT

*     ..

*

*  Purpose

*  =======

*

*  PDSETPBLAS initializes *all* the dummy arguments to correct values.

*

*  Notes

*  =====

*

*  A description  vector  is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )
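
*

*  Worked example (illustrative; it assumes a 2 x 2 process grid and

*  the default descriptor values set by this routine, that is M_A = 2,

*  IMB_A = MB_A = 1 and RSRC_A = 0).  Global row 1 is then owned by

*  process row 0 and global row 2 by process row 1, so

*

*     Lr( 1, 2 ) = PB_NUMROC( 2, 1, 1, 1, 0, 0, 2 ) = 1  for MYROW = 0

*     Lr( 1, 2 ) = PB_NUMROC( 2, 1, 1, 1, 1, 0, 2 ) = 1  for MYROW = 1

*

*  On a 1 x 1 grid ( NPROW = 1 ) both rows stay on the single process

*  and Lr( 1, 2 ) = 2.  In either case Lr( 1, M_A ) <= 2, so a leading

*  dimension of 2 satisfies the LLD_A condition stated above.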

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   rsrc_

      parameter( block_cyclic_2d_inb = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

      DOUBLE PRECISION   ONE

      PARAMETER          ( ONE = 1.0d+0 )

*     ..

*     .. External Subroutines ..

      EXTERNAL           pb_descset2

*     ..

*     .. Common Blocks ..

      CHARACTER*1        DIAG, SIDE, TRANSA, TRANSB, UPLO

      INTEGER            IA, IB, IC, INCX, INCY, ISCLR, IX, IY, JA, JB,

     $                   jc, jx, jy, kdim, mdim, ndim

      DOUBLE PRECISION   USCLR, SCLR

      INTEGER            DESCA( DLEN_ ), DESCB( DLEN_ ), DESCC( DLEN_ ),

     $                   descx( dlen_ ), descy( dlen_ )

      DOUBLE PRECISION   A( 2, 2 ), B( 2, 2 ), C( 2, 2 ), X( 2 ), Y( 2 )

      COMMON             /PBLASC/DIAG, SIDE, TRANSA, TRANSB, UPLO

      COMMON             /pblasd/desca, descb, descc, descx, descy

      COMMON             /pblasi/ia, ib, ic, incx, incy, isclr, ix, iy,

     $                   ja, jb, jc, jx, jy

      COMMON             /pblasm/a, b, c

      COMMON             /pblasn/kdim, mdim, ndim

      COMMON             /pblass/sclr, usclr

      COMMON             /pblasv/x, y

*     ..

*     .. Executable Statements ..

*

*     Set default values for options

*

      diag   = 'N'

      side   = 'L'

      transa = 'N'

      transb = 'N'

      uplo   = 'U'

*

*     Set default values for scalars

*

      kdim   = 1

      mdim   = 1

      ndim   = 1

      isclr  = 1

      sclr   = one

      usclr  = one

*

*     Set default values for distributed matrix A

*

      a( 1, 1 ) = one

      a( 2, 1 ) = one

      a( 1, 2 ) = one

      a( 2, 2 ) = one

      ia = 1

      ja = 1

      CALL pb_descset2( desca, 2, 2, 1, 1, 1, 1, 0, 0, ictxt, 2 )
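
*

*     Illustrative note, assuming PB_DESCSET2 takes its arguments in the

*     order ( DESC, M, N, IMB, INB, MB, NB, RSRC, CSRC, CTXT, LLD ): the

*     call above then describes a 2 x 2 matrix with

*

*        M_A    = 2,  N_A    = 2,  IMB_A  = 1,  INB_A  = 1,

*        MB_A   = 1,  NB_A   = 1,  RSRC_A = 0,  CSRC_A = 0,

*        CTXT_A = ICTXT,  LLD_A = 2

*

*     so every block holds a single entry, and LLD_A equals the leading

*     dimension of the local array A( 2, 2 ).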

*

*     Set default values for distributed matrix B

*

      b( 1, 1 ) = one

      b( 2, 1 ) = one

      b( 1, 2 ) = one

      b( 2, 2 ) = one

      ib = 1

      jb = 1

      CALL pb_descset2( descb, 2, 2, 1, 1, 1, 1, 0, 0, ictxt, 2 )

*

*     Set default values for distributed matrix C

*

      c( 1, 1 ) = one

      c( 2, 1 ) = one

      c( 1, 2 ) = one

      c( 2, 2 ) = one

      ic = 1

      jc = 1

      CALL pb_descset2( descc, 2, 2, 1, 1, 1, 1, 0, 0, ictxt, 2 )

*

*     Set default values for distributed vector X

*

      x( 1 ) = one

      x( 2 ) = one

      ix = 1

      jx = 1

      CALL pb_descset2( descx, 2, 1, 1, 1, 1, 1, 0, 0, ictxt, 2 )

      incx = 1

*

*     Set default values for distributed vector Y

*

      y( 1 ) = one

      y( 2 ) = one

      iy = 1

      jy = 1

      CALL pb_descset2( descy, 2, 1, 1, 1, 1, 1, 0, 0, ictxt, 2 )

      incy = 1

*

      RETURN

*

*     End of PDSETPBLAS

*

      END

      SUBROUTINE pdchkmat( ICTXT, NOUT, SUBPTR, SCODE, SNAME, ARGNAM,

     $                     ARGPOS )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1         ARGNAM

      INTEGER             ARGPOS, ICTXT, NOUT, SCODE

*     ..

*     .. Array Arguments ..

      CHARACTER*(*)       SNAME

*     ..

*     .. Subroutine Arguments ..

      EXTERNAL            subptr

*     ..

*

*  Purpose

*  =======

*

*  PDCHKMAT tests the matrix (or vector) ARGNAM in any PBLAS routine.

*

*  Notes

*  =====

*

*  A description  vector  is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )
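*

*  Example (an illustrative sketch, not part of the original notes;  it

*  assumes the standard block-cyclic mapping and that PB_DESCSET2 takes

*  its arguments in the order DESC, M, N, IMB, INB, MB, NB, RSRC, CSRC,

*  CTXT, LLD).  For the 2 x 2 default matrices set by PDSETPBLAS,  with

*  IMB_A = MB_A = 1 and RSRC_A = 0, the call

*

*     LR = PB_NUMROC( 2, 1, 1, 1, MYROW, 0, NPROW )

*

*  would give

*

*     NPROW = 1:  Lr( 1, 2 ) = 2 on MYROW = 0

*     NPROW = 2:  Lr( 1, 2 ) = 1 on MYROW = 0 and on MYROW = 1

*     NPROW = 3:  Lr( 1, 2 ) = 1 on MYROW = 0, 1 and 0 on MYROW = 2

*

*  so the default  LLD_A = 2  satisfies  LLD_A >= MAX( 1, Lr( 1, M_A ) )

*  on any process grid.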

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  NOUT    (global input) INTEGER

*          On entry, NOUT specifies the unit number for the output file.

*          When NOUT is 6, output goes to the screen;  when NOUT is 0,

*          output goes to stderr. NOUT is only defined for process 0.

*

*  SUBPTR  (global input) SUBROUTINE

*          On entry,  SUBPTR  is  a  subroutine. SUBPTR must be declared

*          EXTERNAL in the calling subroutine.

*

*  SCODE   (global input) INTEGER

*          On entry, SCODE specifies the calling sequence code.

*

*  SNAME   (global input) CHARACTER*(*)

*          On entry,  SNAME  specifies  the subroutine name calling this

*          subprogram.

*

*  ARGNAM  (global input) CHARACTER*1

*          On entry,  ARGNAM  specifies the name of the matrix or vector

*          to be checked.  ARGNAM can either be 'A', 'B' or 'C' when one

*          wants to check a matrix, and 'X' or 'Y' for a vector. Any

*          other value is treated as 'Y'.

*

*  ARGPOS  (global input) INTEGER

*          On entry, ARGPOS indicates the position of the first argument

*          of the matrix (or vector) ARGNAM.
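*

*          Example (an illustrative sketch, not part of the original

*          notes).  In calling sequence 31 listed in PDCALLSUB,  the

*          matrix A is the 7th argument, so ARGPOS = 7. With

*          DESCMULT = 100, the codes passed to PCHKPBE are then

*

*             bad IA         : INFOT = ARGPOS + 1               = 8

*             bad JA         : INFOT = ARGPOS + 2               = 9

*             bad DESCA( I ) : INFOT = ( ARGPOS+3 )*DESCMULT + I

*                                    = 1000 + I

*

*          e.g. INFOT = 1009 for DESCA( RSRC_ ), since RSRC_ = 9. For

*          a vector, a bad INCX or INCY gives INFOT = ARGPOS + 4.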

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      parameter( block_cyclic_2d_inb = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

      INTEGER             DESCMULT

      PARAMETER           ( DESCMULT = 100 )

*     ..

*     .. Local Scalars ..

      INTEGER             I, INFOT, NPROW, NPCOL, MYROW, MYCOL

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, pchkpbe, pdcallsub, pdsetpblas

*     ..

*     .. External Functions ..

      LOGICAL             LSAME

      EXTERNAL            LSAME

*     ..

*     .. Common Blocks ..

      INTEGER            IA, IB, IC, INCX, INCY, ISCLR, IX, IY, JA, JB,

     $                   JC, JX, JY

      INTEGER            DESCA( DLEN_ ), DESCB( DLEN_ ), DESCC( DLEN_ ),

     $                   descx( dlen_ ), descy( dlen_ )

      COMMON             /pblasd/desca, descb, descc, descx, descy

      COMMON             /pblasi/ia, ib, ic, incx, incy, isclr, ix, iy,

     $                   ja, jb, jc, jx, jy

*     ..

*     .. Executable Statements ..

*

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      IF( lsame( argnam, 'A' ) ) THEN

*

*        Check IA. Set all other OK, bad IA

*

         CALL pdsetpblas( ictxt )

         ia    = -1

         infot = argpos + 1

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

*        Check JA. Set all other OK, bad JA

*

         CALL pdsetpblas( ictxt )

         ja    = -1

         infot = argpos + 2

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

*        Check DESCA. Set all other OK, bad DESCA

*

         DO 10 i = 1, dlen_

*

*           Set I'th entry of DESCA to incorrect value, rest ok.

*

            CALL pdsetpblas( ictxt )

            desca( i ) =  -2

            infot = ( ( argpos + 3 ) * descmult ) + i

            CALL pdcallsub( subptr, scode )

            CALL pchkpbe( ictxt, nout, sname, infot )

*

*           Extra tests for RSRCA, CSRCA, LDA

*

            IF( ( i.EQ.rsrc_ ) .OR. ( i.EQ.csrc_ ) .OR.

     $          ( i.EQ.lld_ ) ) THEN

*

               CALL pdsetpblas( ictxt )

*

*              Test RSRCA >= NPROW

*

               IF( i.EQ.rsrc_ )

     $            desca( i ) =  nprow

*

*              Test CSRCA >= NPCOL

*

               IF( i.EQ.csrc_ )

     $            desca( i ) =  npcol

*

*              Test LDA >= MAX(1, PB_NUMROC(...)). Set to 1 as mat 2x2.

*

               IF( i.EQ.lld_ ) THEN

                  IF( myrow.EQ.0 .AND.mycol.EQ.0 ) THEN

                     desca( i ) = 1

                  ELSE

                     desca( i ) = 0

                  END IF

               END IF

*

               infot = ( ( argpos + 3 ) * descmult ) + i

               CALL pdcallsub( subptr, scode )

               CALL pchkpbe( ictxt, nout, sname, infot )

*

            END IF

*

   10    CONTINUE

*

      ELSE IF( lsame( argnam, 'B' ) ) THEN

*

*        Check IB. Set all other OK, bad IB

*

         CALL pdsetpblas( ictxt )

         ib    = -1

         infot = argpos + 1

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

*        Check JB. Set all other OK, bad JB

*

         CALL pdsetpblas( ictxt )

         jb    = -1

         infot = argpos + 2

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

*        Check DESCB. Set all other OK, bad DESCB

*

         DO 20 i = 1, dlen_

*

*           Set I'th entry of DESCB to incorrect value, rest ok.

*

            CALL pdsetpblas( ictxt )

            descb( i ) =  -2

            infot = ( ( argpos + 3 ) * descmult ) + i

            CALL pdcallsub( subptr, scode )

            CALL pchkpbe( ictxt, nout, sname, infot )

*

*           Extra tests for RSRCB, CSRCB, LDB

*

            IF( ( i.EQ.rsrc_ ) .OR. ( i.EQ.csrc_ ) .OR.

     $          ( i.EQ.lld_ ) ) THEN

*

               CALL pdsetpblas( ictxt )

*

*              Test RSRCB >= NPROW

*

               IF( i.EQ.rsrc_ )

     $            descb( i ) =  nprow

*

*              Test CSRCB >= NPCOL

*

               IF( i.EQ.csrc_ )

     $            descb( i ) =  npcol

*

*              Test LDB >= MAX(1, PB_NUMROC(...)). Set to 1 as mat 2x2.

*

               IF( i.EQ.lld_ ) THEN

                  IF( myrow.EQ.0 .AND.mycol.EQ.0 ) THEN

                     descb( i ) = 1

                  ELSE

                     descb( i ) = 0

                  END IF

               END IF

*

               infot = ( ( argpos + 3 ) * descmult ) + i

               CALL pdcallsub( subptr, scode )

               CALL pchkpbe( ictxt, nout, sname, infot )

*

            END IF

*

   20    CONTINUE

*

      ELSE IF( lsame( argnam, 'C' ) ) THEN

*

*        Check IC. Set all other OK, bad IC

*

         CALL pdsetpblas( ictxt )

         ic    = -1

         infot = argpos + 1

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

*        Check JC. Set all other OK, bad JC

*

         CALL pdsetpblas( ictxt )

         jc    = -1

         infot = argpos + 2

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

*        Check DESCC. Set all other OK, bad DESCC

*

         DO 30 i = 1, dlen_

*

*           Set I'th entry of DESCC to incorrect value, rest ok.

*

            CALL pdsetpblas( ictxt )

            descc( i ) =  -2

            infot = ( ( argpos + 3 ) * descmult ) + i

            CALL pdcallsub( subptr, scode )

            CALL pchkpbe( ictxt, nout, sname, infot )

*

*           Extra tests for RSRCC, CSRCC, LDC

*

            IF( ( i.EQ.rsrc_ ) .OR. ( i.EQ.csrc_ ) .OR.

     $          ( i.EQ.lld_ ) ) THEN

*

               CALL pdsetpblas( ictxt )

*

*              Test RSRCC >= NPROW

*

               IF( i.EQ.rsrc_ )

     $            descc( i ) =  nprow

*

*              Test CSRCC >= NPCOL

*

               IF( i.EQ.csrc_ )

     $            descc( i ) =  npcol

*

*              Test LDC >= MAX(1, PB_NUMROC(...)). Set to 1 as mat 2x2.

*

               IF( i.EQ.lld_ ) THEN

                  IF( myrow.EQ.0 .AND.mycol.EQ.0 ) THEN

                     descc( i ) = 1

                  ELSE

                     descc( i ) = 0

                  END IF

               END IF

*

               infot = ( ( argpos + 3 ) * descmult ) + i

               CALL pdcallsub( subptr, scode )

               CALL pchkpbe( ictxt, nout, sname, infot )

*

            END IF

*

   30    CONTINUE

*

      ELSE IF( lsame( argnam, 'X' ) ) THEN

*

*        Check IX. Set all other OK, bad IX

*

         CALL pdsetpblas( ictxt )

         ix    = -1

         infot = argpos + 1

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

*        Check JX. Set all other OK, bad JX

*

         CALL pdsetpblas( ictxt )

         jx    = -1

         infot = argpos + 2

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

*        Check DESCX. Set all other OK, bad DESCX

*

         DO 40 i = 1, dlen_

*

*           Set I'th entry of DESCX to incorrect value, rest ok.

*

            CALL pdsetpblas( ictxt )

            descx( i ) =  -2

            infot = ( ( argpos + 3 ) * descmult ) + i

            CALL pdcallsub( subptr, scode )

            CALL pchkpbe( ictxt, nout, sname, infot )

*

*           Extra tests for RSRCX, CSRCX, LDX

*

            IF( ( i.EQ.rsrc_ ) .OR. ( i.EQ.csrc_ ) .OR.

     $          ( i.EQ.lld_ ) ) THEN

*

               CALL pdsetpblas( ictxt )

*

*              Test RSRCX >= NPROW

*

               IF( i.EQ.rsrc_ )

     $            descx( i ) =  nprow

*

*              Test CSRCX >= NPCOL

*

               IF( i.EQ.csrc_ )

     $            descx( i ) =  npcol

*

*              Test LDX >= MAX(1, PB_NUMROC(...)). Set to 1 as vec 2x1.

*

               IF( i.EQ.lld_ ) THEN

                  IF( myrow.EQ.0 .AND.mycol.EQ.0 ) THEN

                     descx( i ) = 1

                  ELSE

                     descx( i ) = 0

                  END IF

               END IF

*

               infot = ( ( argpos + 3 ) * descmult ) + i

               CALL pdcallsub( subptr, scode )

               CALL pchkpbe( ictxt, nout, sname, infot )

*

            END IF

*

   40    CONTINUE

*

*        Check INCX. Set all other OK, bad INCX

*

         CALL pdsetpblas( ictxt )

         incx  =  -1

         infot = argpos + 4

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

      ELSE

*

*        Check IY. Set all other OK, bad IY

*

         CALL pdsetpblas( ictxt )

         iy    = -1

         infot = argpos + 1

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

*        Check JY. Set all other OK, bad JY

*

         CALL pdsetpblas( ictxt )

         jy    = -1

         infot = argpos + 2

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

*        Check DESCY. Set all other OK, bad DESCY

*

         DO 50 i = 1, dlen_

*

*           Set I'th entry of DESCY to incorrect value, rest ok.

*

            CALL pdsetpblas( ictxt )

            descy( i ) =  -2

            infot = ( ( argpos + 3 ) * descmult ) + i

            CALL pdcallsub( subptr, scode )

            CALL pchkpbe( ictxt, nout, sname, infot )

*

*           Extra tests for RSRCY, CSRCY, LDY

*

            IF( ( i.EQ.rsrc_ ) .OR. ( i.EQ.csrc_ ) .OR.

     $          ( i.EQ.lld_ ) ) THEN

*

               CALL pdsetpblas( ictxt )

*

*              Test RSRCY >= NPROW

*

               IF( i.EQ.rsrc_ )

     $            descy( i ) = nprow

*

*              Test CSRCY >= NPCOL

*

               IF( i.EQ.csrc_ )

     $            descy( i ) = npcol

*

*              Test LDY >= MAX(1, PB_NUMROC(...)). Set to 1 as vec 2x1.

*

               IF( i.EQ.lld_ ) THEN

                  IF( myrow.EQ.0 .AND.mycol.EQ.0 ) THEN

                     descy( i ) = 1

                  ELSE

                     descy( i ) = 0

                  END IF

               END IF

*

               infot = ( ( argpos + 3 ) * descmult ) + i

               CALL pdcallsub( subptr, scode )

               CALL pchkpbe( ictxt, nout, sname, infot )

*

            END IF

*

   50    CONTINUE

*

*        Check INCY. Set all other OK, bad INCY

*

         CALL pdsetpblas( ictxt )

         incy =  -1

         infot = argpos + 4

         CALL pdcallsub( subptr, scode )

         CALL pchkpbe( ictxt, nout, sname, infot )

*

      END IF

*

      RETURN

*

*     End of PDCHKMAT

*

      END

      SUBROUTINE pdcallsub( SUBPTR, SCODE )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER             SCODE

*     ..

*     .. Subroutine Arguments ..

      EXTERNAL            subptr

*     ..

*

*  Purpose

*  =======

*

*  PDCALLSUB calls the subroutine SUBPTR with the calling sequence iden-

*  tified by SCODE.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  SUBPTR  (global input) SUBROUTINE

*          On entry,  SUBPTR  is  a  subroutine. SUBPTR must be declared

*          EXTERNAL in the calling subroutine.

*

*  SCODE   (global input) INTEGER

*          On entry, SCODE specifies the calling sequence code.

*

*  Calling sequence encodings

*  ==========================

*

*  code Formal argument list                                Examples

*

*  11   (n,      v1,v2)                                     _SWAP, _COPY

*  12   (n,s1,   v1   )                                     _SCAL

*  13   (n,s1,   v1,v2)                                     _AXPY, _DOT_

*  14   (n,s1,i1,v1   )                                     _AMAX

*  15   (n,u1,   v1   )                                     _ASUM, _NRM2

*

*  21   (     trans,     m,n,s1,m1,v1,s2,v2)                _GEMV

*  22   (uplo,             n,s1,m1,v1,s2,v2)                _SYMV, _HEMV

*  23   (uplo,trans,diag,  n,   m1,v1      )                _TRMV, _TRSV

*  24   (                m,n,s1,v1,v2,m1)                   _GER_

*  25   (uplo,             n,s1,v1,   m1)                   _SYR

*  26   (uplo,             n,u1,v1,   m1)                   _HER

*  27   (uplo,             n,s1,v1,v2,m1)                   _SYR2, _HER2

*

*  31   (          transa,transb,     m,n,k,s1,m1,m2,s2,m3) _GEMM

*  32   (side,uplo,                   m,n,  s1,m1,m2,s2,m3) _SYMM, _HEMM

*  33   (     uplo,trans,               n,k,s1,m1,   s2,m3) _SYRK

*  34   (     uplo,trans,               n,k,u1,m1,   u2,m3) _HERK

*  35   (     uplo,trans,               n,k,s1,m1,m2,s2,m3) _SYR2K

*  36   (     uplo,trans,               n,k,s1,m1,m2,u2,m3) _HER2K

*  37   (                             m,n,  s1,m1,   s2,m3) _TRAN_

*  38   (side,uplo,transa,       diag,m,n,  s1,m1,m2      ) _TRMM, _TRSM

*  39   (          trans,             m,n,  s1,m1,   s2,m3) _GEADD

*  40   (     uplo,trans,             m,n,  s1,m1,   s2,m3) _TRADD
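*

*  Reading the table  (an illustrative note,  not part of the original

*  comments, derived from the calls in the executable statements of this

*  routine):  n, m, k  stand for NDIM, MDIM, KDIM;  s1, s2  for SCLR;

*  u1, u2  for USCLR;  i1  for ISCLR.  The vectors  v1, v2  expand to

*  ( X, IX, JX, DESCX, INCX )  and  ( Y, IY, JY, DESCY, INCY ),  and the

*  matrices  m1, m2, m3  to  ( A, IA, JA, DESCA ),  ( B, IB, JB, DESCB )

*  and  ( C, IC, JC, DESCC ).  For instance, assuming SUBPTR is the PBLAS

*  routine PDGEMM,

*

*     EXTERNAL           PDGEMM

*     CALL PDCALLSUB( PDGEMM, 31 )

*

*  is equivalent to

*

*     CALL PDGEMM( TRANSA, TRANSB, MDIM, NDIM, KDIM, SCLR, A, IA, JA,

*    $             DESCA, B, IB, JB, DESCB, SCLR, C, IC, JC, DESCC )

*

*  with all actual arguments taken from the common blocks of this

*  routine.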

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      parameter( block_cyclic_2d_inb = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

*     ..

*     .. Common Blocks ..

      CHARACTER*1        DIAG, SIDE, TRANSA, TRANSB, UPLO

      INTEGER            IA, IB, IC, INCX, INCY, ISCLR, IX, IY, JA, JB,

     $                   JC, JX, JY, KDIM, MDIM, NDIM

      DOUBLE PRECISION   USCLR, SCLR

      INTEGER            DESCA( DLEN_ ), DESCB( DLEN_ ), DESCC( DLEN_ ),

     $                   DESCX( DLEN_ ), DESCY( DLEN_ )

      DOUBLE PRECISION   A( 2, 2 ), B( 2, 2 ), C( 2, 2 ), X( 2 ), Y( 2 )

      COMMON             /pblasc/diag, side, transa, transb, uplo

      COMMON             /pblasd/desca, descb, descc, descx, descy

      COMMON             /pblasi/ia, ib, ic, incx, incy, isclr, ix, iy,

     $                   ja, jb, jc, jx, jy

      COMMON             /pblasm/a, b, c

      COMMON             /pblasn/kdim, mdim, ndim

      COMMON             /pblass/sclr, usclr

      COMMON             /pblasv/x, y

*     ..

*     .. Executable Statements ..

*

*     Level 1 PBLAS

*

      IF( scode.EQ.11 ) THEN

*

         CALL subptr( ndim, x, ix, jx, descx, incx, y, iy, jy, descy,

     $                incy )

*

      ELSE IF( scode.EQ.12 ) THEN

*

         CALL subptr( ndim, sclr, x, ix, jx, descx, incx )

*

      ELSE IF( scode.EQ.13 ) THEN

*

         CALL subptr( ndim, sclr, x, ix, jx, descx, incx, y, iy, jy,

     $                descy, incy )

*

      ELSE IF( scode.EQ.14 ) THEN

*

         CALL subptr( ndim, sclr, isclr, x, ix, jx, descx, incx )

*

      ELSE IF( scode.EQ.15 ) THEN

*

         CALL subptr( ndim, usclr, x, ix, jx, descx, incx )

*

*     Level 2 PBLAS

*

      ELSE IF( scode.EQ.21 ) THEN

*

         CALL subptr( transa, mdim, ndim, sclr, a, ia, ja, desca, x, ix,

     $                jx, descx, incx, sclr, y, iy, jy, descy, incy )

*

      ELSE IF( scode.EQ.22 ) THEN

*

         CALL subptr( uplo, ndim, sclr, a, ia, ja, desca, x, ix, jx,

     $                descx, incx, sclr, y, iy, jy, descy, incy )

*

      ELSE IF( scode.EQ.23 ) THEN

*

         CALL subptr( uplo, transa, diag, ndim, a, ia, ja, desca, x, ix,

     $                jx, descx, incx )

*

      ELSE IF( scode.EQ.24 ) THEN

*

         CALL subptr( mdim, ndim, sclr, x, ix, jx, descx, incx, y, iy,

     $                jy, descy, incy, a, ia, ja, desca )

*

      ELSE IF( scode.EQ.25 ) THEN

*

         CALL subptr( uplo, ndim, sclr, x, ix, jx, descx, incx, a, ia,

     $                ja, desca )

*

      ELSE IF( scode.EQ.26 ) THEN

*

         CALL subptr( uplo, ndim, usclr, x, ix, jx, descx, incx, a, ia,

     $                ja, desca )

*

      ELSE IF( scode.EQ.27 ) THEN

*

         CALL subptr( uplo, ndim, sclr, x, ix, jx, descx, incx, y, iy,

     $                jy, descy, incy, a, ia, ja, desca )

*

*     Level 3 PBLAS

*

      ELSE IF( scode.EQ.31 ) THEN

*

         CALL subptr( transa, transb, mdim, ndim, kdim, sclr, a, ia, ja,

     $                desca, b, ib, jb, descb, sclr, c, ic, jc, descc )

*

      ELSE IF( scode.EQ.32 ) THEN

*

         CALL subptr( side, uplo, mdim, ndim, sclr, a, ia, ja, desca, b,

     $                ib, jb, descb, sclr, c, ic, jc, descc )

*

      ELSE IF( scode.EQ.33 ) THEN

*

         CALL subptr( uplo, transa, ndim, kdim, sclr, a, ia, ja, desca,

     $                sclr, c, ic, jc, descc )

*

      ELSE IF( scode.EQ.34 ) THEN

*

         CALL subptr( uplo, transa, ndim, kdim, usclr, a, ia, ja, desca,

     $                usclr, c, ic, jc, descc )

*

      ELSE IF( scode.EQ.35 ) THEN

*

         CALL subptr( uplo, transa, ndim, kdim, sclr, a, ia, ja, desca,

     $                b, ib, jb, descb, sclr, c, ic, jc, descc )

*

      ELSE IF( scode.EQ.36 ) THEN

*

         CALL subptr( uplo, transa, ndim, kdim, sclr, a, ia, ja, desca,

     $                b, ib, jb, descb, usclr, c, ic, jc, descc )

*

      ELSE IF( scode.EQ.37 ) THEN

*

         CALL subptr( mdim, ndim, sclr, a, ia, ja, desca, sclr, c, ic,

     $                jc, descc )

*

      ELSE IF( scode.EQ.38 ) THEN

*

         CALL subptr( side, uplo, transa, diag, mdim, ndim, sclr, a, ia,

     $                ja, desca, b, ib, jb, descb )

*

      ELSE IF( scode.EQ.39 ) THEN

*

         CALL subptr( transa, mdim, ndim, sclr, a, ia, ja, desca, sclr,

     $                c, ic, jc, descc )

*

      ELSE IF( scode.EQ.40 ) THEN

*

         CALL subptr( uplo, transa, mdim, ndim, sclr, a, ia, ja, desca,

     $                sclr, c, ic, jc, descc )

*

      END IF

*

      RETURN

*

*     End of PDCALLSUB

*

      END

      SUBROUTINE pderrset( ERR, ERRMAX, XTRUE, X )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      DOUBLE PRECISION   ERR, ERRMAX, X, XTRUE

*     ..

*

*  Purpose

*  =======

*

*  PDERRSET  computes the absolute difference  ERR = |XTRUE - X|  and up-

*  dates the running maximum error ERRMAX to MAX( ERRMAX, ERR ).

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ERR     (local output) DOUBLE PRECISION

*          On exit, ERR specifies the absolute difference |XTRUE - X|.

*

*  ERRMAX  (local input/local output) DOUBLE PRECISION

*          On entry,  ERRMAX  specifies  a previously computed error. On

*          exit ERRMAX is the accumulated error MAX( ERRMAX, ERR ).

*

*  XTRUE   (local input) DOUBLE PRECISION

*          On entry, XTRUE specifies the true value.

*

*  X       (local input) DOUBLE PRECISION

*          On entry, X specifies the value to be compared to XTRUE.
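*
*  For example (illustrative values only), with XTRUE = 1.0D+0,
*  X = 1.5D+0 and ERRMAX = 0.25D+0 on entry, |XTRUE - X| = 0.5D+0, so
*  the routine returns ERR = 0.5D+0 and
*  ERRMAX = MAX( 0.25D+0, 0.5D+0 ) = 0.5D+0.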

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. External Functions ..

      DOUBLE PRECISION   PDDIFF

      EXTERNAL           PDDIFF

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max

*     ..

*     .. Executable Statements ..

*

      err = abs( pddiff( xtrue, x ) )

*

      errmax = max( errmax, err )

*

      RETURN

*

*     End of PDERRSET

*

      END

      SUBROUTINE pdchkvin( ERRMAX, N, X, PX, IX, JX, DESCX, INCX,

     $                     INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            INCX, INFO, IX, JX, N

      DOUBLE PRECISION   ERRMAX

*     ..

*     .. Array Arguments ..

      INTEGER            DESCX( * )

      DOUBLE PRECISION   PX( * ), X( * )

*     ..

*

*  Purpose

*  =======

*

*  PDCHKVIN checks that the submatrix sub( PX ) remained unchanged. The
*  local array entries are compared element by element, and their
*  difference is tested against 0.0 as well as the machine epsilon.
*  This difference should be exactly zero, but because some of the data
*  may fluctuate, a nonzero difference no larger than the machine
*  epsilon (INFO > 0) is flagged differently from a larger one
*  (INFO < 0). The largest error is also returned.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically
*  distributed matrix. This vector stores the information required to
*  establish the mapping between a matrix entry and its corresponding
*  process and memory location.
*
*  In the following comments, the character _ should be read as
*  "of the distributed matrix". Let A be a generic term for any 2D
*  block-cyclically distributed matrix. Its description vector is
*  DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global
*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of
*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )
*  would receive if these K rows were distributed over NPROW processes.
*  If K is the number of columns of a matrix A starting at the global
*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of
*  columns that the process of column coordinate MYCOL
*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were
*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ERRMAX  (global output) DOUBLE PRECISION

*          On exit,  ERRMAX  specifies the largest absolute element-wise

*          difference between sub( X ) and sub( PX ).

*

*  N       (global input) INTEGER

*          On entry,  N  specifies  the  length of the subvector operand

*          sub( X ). N must be at least zero.

*

*  X       (local input) DOUBLE PRECISION array

*          On entry, X is an array of  dimension  (DESCX( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PX.
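*          As an illustration (values chosen for this example), with
*          DESCX( M_ ) = 4 the global entry ( 2, 3 ) is stored in
*          X( 2 + ( 3 - 1 ) * 4 ) = X( 10 ), following the column-major
*          indexing X( I + ( J - 1 ) * DESCX( M_ ) ) used below.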

*

*  PX      (local input) DOUBLE PRECISION array

*          On entry, PX is an array of dimension (DESCX( LLD_ ),*). This

*          array contains the local entries of the matrix PX.

*

*  IX      (global input) INTEGER

*          On entry, IX  specifies X's global row index, which points to

*          the beginning of the submatrix sub( X ).

*

*  JX      (global input) INTEGER

*          On entry, JX  specifies X's global column index, which points

*          to the beginning of the submatrix sub( X ).

*

*  DESCX   (global and local input) INTEGER array

*          On entry, DESCX  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix X.

*

*  INCX    (global input) INTEGER

*          On entry,  INCX   specifies  the  global  increment  for  the

*          elements of  X.  Only two values of  INCX   are  supported in

*          this version, namely 1 and M_X. INCX  must not be zero.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO = 0, no error has been found.
*          If INFO > 0, the maximum absolute error is in (0,eps].
*          If INFO < 0, the maximum absolute error is in (eps,+oo).

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

      DOUBLE PRECISION   ZERO

      PARAMETER          ( ZERO = 0.0d+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, ROWREP

      INTEGER            I, IB, ICTXT, ICURCOL, ICURROW, IIX, IN, IXCOL,

     $                   IXROW, J, JB, JJX, JN, KK, LDPX, LDX, LL,

     $                   MYCOL, MYROW, NPCOL, NPROW

      DOUBLE PRECISION   ERR, EPS

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, dgamx2d, pb_infog2l, pderrset

*     ..

*     .. External Functions ..

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           pdlamch

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max, min, mod

*     ..

*     .. Executable Statements ..

*

      info = 0

      errmax = zero

*

*     Quick return if possible

*

      IF( n.LE.0 )

     $   RETURN

*

      ictxt = descx( ctxt_ )

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*     Get the relative machine precision

      eps = pdlamch( ictxt, 'eps' )

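*     Compute the starting local indices ( IIX, JJX ) corresponding to
*     ( IX, JX ) and the coordinates ( IXROW, IXCOL ) of the owning
*     process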

      CALL pb_infog2l( ix, jx, descx, nprow, npcol, myrow, mycol, iix,

     $                 jjx, ixrow, ixcol )

*

      ldx    = descx( m_ )

      ldpx   = descx( lld_ )

      rowrep = ( ixrow.EQ.-1 )

      colrep = ( ixcol.EQ.-1 )

*

      IF( n.EQ.1 ) THEN

*

         IF( ( myrow.EQ.ixrow .OR. rowrep ) .AND.

     $       ( mycol.EQ.ixcol .OR. colrep ) )

     $      CALL pderrset( err, errmax, x( ix+(jx-1)*ldx ),

     $                     px( iix+(jjx-1)*ldpx ) )

*

      ELSE IF( incx.EQ.descx( m_ ) ) THEN

*

*        sub( X ) is a row vector

*

         jb = descx( inb_ ) - jx + 1

         IF( jb.LE.0 )

     $      jb = ( ( -jb ) / descx( nb_ ) + 1 ) * descx( nb_ ) + jb

         jb = min( jb, n )

         jn = jx + jb - 1
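*
*        Illustration (values chosen for this example, not taken from
*        the tests): with INB_X = 2, NB_X = 3, JX = 4 and N = 5, the
*        first assignment gives JB = 2 - 4 + 1 = -1. Since JB <= 0, it
*        becomes ( 1 / 3 + 1 ) * 3 - 1 = 2 (integer division), then
*        JB = MIN( 2, 5 ) = 2 and JN = 4 + 2 - 1 = 5. Column 4 lies in
*        the column block 3:5, so 2 columns remain in that block.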

*

         IF( myrow.EQ.ixrow .OR. rowrep ) THEN

*

            icurcol = ixcol

            IF( mycol.EQ.icurcol .OR. colrep ) THEN

               DO 10 j = jx, jn

                  CALL pderrset( err, errmax, x( ix+(j-1)*ldx ),

     $                           px( iix+(jjx-1)*ldpx ) )

                  jjx = jjx + 1

   10          CONTINUE

            END IF

            icurcol = mod( icurcol+1, npcol )

*

            DO 30 j = jn+1, jx+n-1, descx( nb_ )

               jb = min( jx+n-j, descx( nb_ ) )

*

               IF( mycol.EQ.icurcol .OR. colrep ) THEN

*

                  DO 20 kk = 0, jb-1

                     CALL pderrset( err, errmax, x( ix+(j+kk-1)*ldx ),

     $                              px( iix+(jjx+kk-1)*ldpx ) )

   20             CONTINUE

*

                  jjx = jjx + jb

*

               END IF

*

               icurcol = mod( icurcol+1, npcol )

*

   30       CONTINUE

*

         END IF

*

      ELSE

*

*        sub( X ) is a column vector

*

         ib = descx( imb_ ) - ix + 1

         IF( ib.LE.0 )

     $      ib = ( ( -ib ) / descx( mb_ ) + 1 ) * descx( mb_ ) + ib

         ib = min( ib, n )

         in = ix + ib - 1

*

         IF( mycol.EQ.ixcol .OR. colrep ) THEN

*

            icurrow = ixrow

            IF( myrow.EQ.icurrow .OR. rowrep ) THEN

               DO 40 i = ix, in

                  CALL pderrset( err, errmax, x( i+(jx-1)*ldx ),

     $                           px( iix+(jjx-1)*ldpx ) )

                  iix = iix + 1

   40          CONTINUE

            END IF

            icurrow = mod( icurrow+1, nprow )

*

            DO 60 i = in+1, ix+n-1, descx( mb_ )

               ib = min( ix+n-i, descx( mb_ ) )

*

               IF( myrow.EQ.icurrow .OR. rowrep ) THEN

*

                  DO 50 kk = 0, ib-1

                     CALL pderrset( err, errmax, x( i+kk+(jx-1)*ldx ),

     $                              px( iix+kk+(jjx-1)*ldpx ) )

   50             CONTINUE

*

                  iix = iix + ib

*

               END IF

*

               icurrow = mod( icurrow+1, nprow )

*

   60       CONTINUE

*

         END IF

*

      END IF

*     Take the maximum of ERRMAX over all processes of the grid

      CALL dgamx2d( ictxt, 'All', ' ', 1, 1, errmax, 1, kk, ll, -1,

     $              -1, -1 )

*     Set INFO according to the size of the global maximum error

      IF( errmax.GT.zero .AND. errmax.LE.eps ) THEN

         info = 1

      ELSE IF( errmax.GT.eps ) THEN

         info = -1

      END IF

*

      RETURN

*

*     End of PDCHKVIN

*

      END

      SUBROUTINE pdchkvout( N, X, PX, IX, JX, DESCX, INCX, INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            INCX, INFO, IX, JX, N

*     ..

*     .. Array Arguments ..

      INTEGER            DESCX( * )

      DOUBLE PRECISION   PX( * ), X( * )

*     ..

*

*  Purpose

*  =======

*

*  PDCHKVOUT checks that the matrix PX \ sub( PX ), i.e. the entries of
*  PX outside sub( PX ), remained unchanged. The local array entries are
*  compared element by element, and their difference is tested against
*  0.0 as well as the machine epsilon. This difference should be exactly
*  zero, but because some of the data may have moved, a nonzero
*  difference no larger than the machine epsilon (INFO > 0) is flagged
*  differently from a larger one (INFO < 0). The largest error
*  determines the value of INFO.

*
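*  For example (illustrative values only), if sub( X ) is a row vector
*  with IX = 2, JX = 3 and N = 4, the entries ( 2, 3 ) to ( 2, 6 ) of PX
*  are skipped and every other entry of PX is compared with X.
*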

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically
*  distributed matrix. This vector stores the information required to
*  establish the mapping between a matrix entry and its corresponding
*  process and memory location.
*
*  In the following comments, the character _ should be read as
*  "of the distributed matrix". Let A be a generic term for any 2D
*  block-cyclically distributed matrix. Its description vector is
*  DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global
*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of
*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )
*  would receive if these K rows were distributed over NPROW processes.
*  If K is the number of columns of a matrix A starting at the global
*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of
*  columns that the process of column coordinate MYCOL
*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were
*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  N       (global input) INTEGER

*          On entry,  N  specifies  the  length of the subvector operand

*          sub( X ). N must be at least zero.

*

*  X       (local input) DOUBLE PRECISION array

*          On entry, X is an array of  dimension  (DESCX( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PX.

*

*  PX      (local input) DOUBLE PRECISION array

*          On entry, PX is an array of dimension (DESCX( LLD_ ),*). This

*          array contains the local entries of the matrix PX.

*

*  IX      (global input) INTEGER

*          On entry, IX  specifies X's global row index, which points to

*          the beginning of the submatrix sub( X ).

*

*  JX      (global input) INTEGER

*          On entry, JX  specifies X's global column index, which points

*          to the beginning of the submatrix sub( X ).

*

*  DESCX   (global and local input) INTEGER array

*          On entry, DESCX  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix X.

*

*  INCX    (global input) INTEGER

*          On entry,  INCX   specifies  the  global  increment  for  the

*          elements of  X.  Only two values of  INCX   are  supported in

*          this version, namely 1 and M_X. INCX  must not be zero.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO = 0, no error has been found.
*          If INFO > 0, the maximum absolute error is in (0,eps].
*          If INFO < 0, the maximum absolute error is in (eps,+oo).

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

      DOUBLE PRECISION   ZERO

      PARAMETER          ( ZERO = 0.0d+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, ROWREP

      INTEGER            I, IB, ICTXT, ICURCOL, ICURROW, II, IMBX, INBX,

     $                   J, JB, JJ, KK, LDPX, LDX, LL, MBX, MPALL,

     $                   MYCOL, MYCOLDIST, MYROW, MYROWDIST, NBX, NPCOL,

     $                   nprow, nqall

      DOUBLE PRECISION   EPS, ERR, ERRMAX

*     ..

*     .. External Subroutines ..

      EXTERNAL           BLACS_GRIDINFO, DGAMX2D, PDERRSET

*     ..

*     .. External Functions ..

      INTEGER            PB_NUMROC

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           PDLAMCH, PB_NUMROC

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max, min, mod

*     ..

*     .. Executable Statements ..

*

      info = 0

      errmax = zero

*

*     Quick return if possible

*

      IF( ( descx( m_ ).LE.0 ).OR.( descx( n_ ).LE.0 ) )

     $   RETURN

*

*     Start the operations

*

      ictxt = descx( ctxt_ )

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      eps = pdlamch( ictxt, 'eps' )

*     Number of local rows and columns of the entire matrix X owned by
*     this process

      mpall   = pb_numroc( descx( m_ ), 1, descx( imb_ ), descx( mb_ ),

     $                     myrow, descx( rsrc_ ), nprow )

      nqall   = pb_numroc( descx( n_ ), 1, descx( inb_ ), descx( nb_ ),

     $                     mycol, descx( csrc_ ), npcol )

*

      mbx     = descx( mb_ )

      nbx     = descx( nb_ )

      ldx     = descx( m_ )

      ldpx    = descx( lld_ )

      icurrow = descx( rsrc_ )

      icurcol = descx( csrc_ )

      rowrep  = ( icurrow.EQ.-1 )

      colrep  = ( icurcol.EQ.-1 )

      IF( myrow.EQ.icurrow .OR. rowrep ) THEN

         imbx = descx( imb_ )

      ELSE

         imbx = mbx

      END IF

      IF( mycol.EQ.icurcol .OR. colrep ) THEN

         inbx = descx( inb_ )

      ELSE

         inbx = nbx

      END IF

      IF( rowrep ) THEN

         myrowdist = 0

      ELSE

         myrowdist = mod( myrow - icurrow + nprow, nprow )

      END IF

      IF( colrep ) THEN

         mycoldist = 0

      ELSE

         mycoldist = mod( mycol - icurcol + npcol, npcol )

      END IF

      ii = 1

      jj = 1

*

      IF( incx.EQ.descx( m_ ) ) THEN

*

*        sub( X ) is a row vector

*

         IF( myrow.EQ.icurrow .OR. rowrep ) THEN

*

            i = 1

            IF( mycoldist.EQ.0 ) THEN

               j = 1

            ELSE

               j = descx( inb_ ) + ( mycoldist - 1 ) * nbx + 1

            END IF
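*
*           Illustration (values chosen for this example): with
*           INB_X = 2, NB_X = 3, NPCOL = 3, CSRC_X = 1 and MYCOL = 0,
*           MYCOLDIST = MOD( 0 - 1 + 3, 3 ) = 2 and
*           J = 2 + ( 2 - 1 ) * 3 + 1 = 6. The column blocks 1:2, 3:5
*           and 6:8 go to process columns 1, 2 and 0, so column 6 is
*           indeed the first global column stored on process column 0.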

            jb = min( max( 0, descx( n_ ) - j + 1 ), inbx )

            ib = min( descx( m_ ), descx( imb_ ) )

*

            DO 20 kk = 0, jb-1

               DO 10 ll = 0, ib-1

                  IF( i+ll.NE.ix .OR. j+kk.LT.jx .OR. j+kk.GT.jx+n-1 )

     $               CALL pderrset( err, errmax,

     $                              x( i+ll+(j+kk-1)*ldx ),

     $                              px( ii+ll+(jj+kk-1)*ldpx ) )

   10          CONTINUE

   20       CONTINUE

            IF( colrep ) THEN

               j = j + inbx

            ELSE

               j = j + inbx + ( npcol - 1 ) * nbx

            END IF
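*
*           Illustration (values chosen for this example): with
*           INB_X = 2, NB_X = 3, NPCOL = 3, CSRC_X = 1 and MYCOL = 1,
*           J = 1 and INBX = 2 before this update, so J becomes
*           1 + 2 + ( 3 - 1 ) * 3 = 9. This is the start of the column
*           block 9:11, the next block owned by process column 1 after
*           the block 1:2.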

*

            DO 50 jj = inbx+1, nqall, nbx

               jb = min( nqall-jj+1, nbx )

*

               DO 40 kk = 0, jb-1

                  DO 30 ll = 0, ib-1

                     IF( i+ll.NE.ix .OR. j+kk.LT.jx .OR.

     $                   j+kk.GT.jx+n-1 )

     $                  CALL pderrset( err, errmax,

     $                                 x( i+ll+(j+kk-1)*ldx ),

     $                                 px( ii+ll+(jj+kk-1)*ldpx ) )

   30             CONTINUE

   40          CONTINUE

*

               IF( colrep ) THEN

                  j = j + nbx

               ELSE

                  j = j + npcol * nbx

               END IF

*

   50       CONTINUE

*

            ii = ii + ib

*

         END IF

*

         icurrow = mod( icurrow + 1, nprow )

*

         DO 110 i = descx( imb_ ) + 1, descx( m_ ), mbx

            ib = min( descx( m_ ) - i + 1, mbx )

*

            IF( myrow.EQ.icurrow .OR. rowrep ) THEN

*

               IF( mycoldist.EQ.0 ) THEN

                  j = 1

               ELSE

                  j = descx( inb_ ) + ( mycoldist - 1 ) * nbx + 1

               END IF

*

               jj = 1

               jb = min( max( 0, descx( n_ ) - j + 1 ), inbx )

               DO 70 kk = 0, jb-1

                  DO 60 ll = 0, ib-1

                     IF( i+ll.NE.ix .OR. j+kk.LT.jx .OR.

     $                   j+kk.GT.jx+n-1 )

     $                  CALL pderrset( err, errmax,

     $                                 x( i+ll+(j+kk-1)*ldx ),

     $                                 px( ii+ll+(jj+kk-1)*ldpx ) )

   60             CONTINUE

   70          CONTINUE

               IF( colrep ) THEN

                  j = j + inbx

               ELSE

                  j = j + inbx + ( npcol - 1 ) * nbx

               END IF

*

               DO 100 jj = inbx+1, nqall, nbx

                  jb = min( nqall-jj+1, nbx )

*

                  DO 90 kk = 0, jb-1

                     DO 80 ll = 0, ib-1

                        IF( i+ll.NE.ix .OR. j+kk.LT.jx .OR.

     $                      j+kk.GT.jx+n-1 )

     $                     CALL pderrset( err, errmax,

     $                                    x( i+ll+(j+kk-1)*ldx ),

     $                                    px( ii+ll+(jj+kk-1)*ldpx ) )

   80                CONTINUE

   90             CONTINUE

*

                  IF( colrep ) THEN

                     j = j + nbx

                  ELSE

                     j = j + npcol * nbx

                  END IF

*

  100          CONTINUE

*

               ii = ii + ib

*

            END IF

*

            icurrow = mod( icurrow + 1, nprow )

*

  110    CONTINUE

*

      ELSE

*

*        sub( X ) is a column vector

*

         IF( mycol.EQ.icurcol .OR. colrep ) THEN

*

            j = 1

            IF( myrowdist.EQ.0 ) THEN

               i = 1

            ELSE

               i = descx( imb_ ) + ( myrowdist - 1 ) * mbx + 1

            END IF

            ib = min( max( 0, descx( m_ ) - i + 1 ), imbx )

            jb = min( descx( n_ ), descx( inb_ ) )

*

            DO 130 kk = 0, jb-1

               DO 120 ll = 0, ib-1

                  IF( j+kk.NE.jx .OR. i+ll.LT.ix .OR. i+ll.GT.ix+n-1 )

     $               CALL pderrset( err, errmax,

     $                              x( i+ll+(j+kk-1)*ldx ),

     $                              px( ii+ll+(jj+kk-1)*ldpx ) )

  120          CONTINUE

  130       CONTINUE

            IF( rowrep ) THEN

               i = i + imbx

            ELSE

               i = i + imbx + ( nprow - 1 ) * mbx

            END IF

*

            DO 160 ii = imbx+1, mpall, mbx

               ib = min( mpall-ii+1, mbx )

*

               DO 150 kk = 0, jb-1

                  DO 140 ll = 0, ib-1

                     IF( j+kk.NE.jx .OR. i+ll.LT.ix .OR.

     $                   i+ll.GT.ix+n-1 )

     $                  CALL pderrset( err, errmax,

     $                                 x( i+ll+(j+kk-1)*ldx ),

     $                                 px( ii+ll+(jj+kk-1)*ldpx ) )

  140             CONTINUE

  150          CONTINUE

*

               IF( rowrep ) THEN

                  i = i + mbx

               ELSE

                  i = i + nprow * mbx

               END IF

*

  160       CONTINUE

*

            jj = jj + jb

*

         END IF

*

         icurcol = mod( icurcol + 1, npcol )

*

         DO 220 j = descx( inb_ ) + 1, descx( n_ ), nbx

            jb = min( descx( n_ ) - j + 1, nbx )

*

            IF( mycol.EQ.icurcol .OR. colrep ) THEN

*

               IF( myrowdist.EQ.0 ) THEN

                  i = 1

               ELSE

                  i = descx( imb_ ) + ( myrowdist - 1 ) * mbx + 1

               END IF

*

               ii = 1

               ib = min( max( 0, descx( m_ ) - i + 1 ), imbx )

               DO 180 kk = 0, jb-1

                  DO 170 ll = 0, ib-1

                     IF( j+kk.NE.jx .OR. i+ll.LT.ix .OR.

     $                   i+ll.GT.ix+n-1 )

     $                  CALL pderrset( err, errmax,

     $                                 x( i+ll+(j+kk-1)*ldx ),

     $                                 px( ii+ll+(jj+kk-1)*ldpx ) )

  170             CONTINUE

  180          CONTINUE

               IF( rowrep ) THEN

                  i = i + imbx

               ELSE

                  i = i + imbx + ( nprow - 1 ) * mbx

               END IF

*

               DO 210 ii = imbx+1, mpall, mbx

                  ib = min( mpall-ii+1, mbx )

*

                  DO 200 kk = 0, jb-1

                     DO 190 ll = 0, ib-1

                        IF( j+kk.NE.jx .OR. i+ll.LT.ix .OR.

     $                      i+ll.GT.ix+n-1 )

     $                     CALL pderrset( err, errmax,

     $                                    x( i+ll+(j+kk-1)*ldx ),

     $                                    px( ii+ll+(jj+kk-1)*ldpx ) )

  190                CONTINUE

  200             CONTINUE

*

                  IF( rowrep ) THEN

                     i = i + mbx

                  ELSE

                     i = i + nprow * mbx

                  END IF

*

  210          CONTINUE

*

               jj = jj + jb

*

            END IF

*

            icurcol = mod( icurcol + 1, npcol )

*

  220    CONTINUE

*

      END IF

*     Take the maximum of ERRMAX over all processes of the grid

      CALL dgamx2d( ictxt, 'All', ' ', 1, 1, errmax, 1, kk, ll, -1,

     $              -1, -1 )

*     Set INFO according to the size of the global maximum error

      IF( errmax.GT.zero .AND. errmax.LE.eps ) THEN

         info = 1

      ELSE IF( errmax.GT.eps ) THEN

         info = -1

      END IF

*

      RETURN

*

*     End of PDCHKVOUT

*

      END

      SUBROUTINE pdchkmin( ERRMAX, M, N, A, PA, IA, JA, DESCA, INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            IA, INFO, JA, M, N

      DOUBLE PRECISION   ERRMAX

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * )

      DOUBLE PRECISION   PA( * ), A( * )

*     ..

*

*  Purpose

*  =======

*

*  PDCHKMIN checks that the submatrix sub( PA ) remained unchanged. The
*  local array entries are compared element by element, and their
*  difference is tested against 0.0 as well as the machine epsilon.
*  This difference should be exactly zero, but because some of the data
*  may fluctuate, a nonzero difference within the machine epsilon is
*  flagged differently from a larger one. The largest error is also
*  returned.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically
*  distributed matrix. This vector stores the information required to
*  establish the mapping between a matrix entry and its corresponding
*  process and memory location.
*
*  In the following comments, the character _ should be read as
*  "of the distributed matrix". Let A be a generic term for any 2D
*  block-cyclically distributed matrix. Its description vector is
*  DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )
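*

*  As an illustration (values chosen arbitrarily, not taken from any

*  test case), let M_A = 10, IMB_A = 2, MB_A = 3, RSRC_A = 0 and

*  NPROW = 2. The rows are then dealt out in blocks as follows:

*     rows  1: 2  ->  process row 0   (first block, IMB_A rows)

*     rows  3: 5  ->  process row 1

*     rows  6: 8  ->  process row 0

*     rows  9:10  ->  process row 1   (last, partial block)

*  so that Lr( 1, 10 ) = 2 + 3 = 5 for MYROW = 0 and

*          Lr( 1, 10 ) = 3 + 2 = 5 for MYROW = 1.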

*

*  Arguments

*  =========

*

*  ERRMAX  (global output) DOUBLE PRECISION

*          On exit,  ERRMAX  specifies the largest absolute element-wise

*          difference between sub( A ) and sub( PA ).

*

*  M       (global input) INTEGER

*          On entry,  M  specifies  the  number of rows of the submatrix

*          operand sub( A ). M must be at least zero.

*

*  N       (global input) INTEGER

*          On entry, N  specifies the number of columns of the submatrix

*          operand sub( A ). N must be at least zero.

*

*  A       (local input) DOUBLE PRECISION array

*          On entry, A is an array of  dimension  (DESCA( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PA.

*

*  PA      (local input) DOUBLE PRECISION array

*          On entry, PA is an array of dimension (DESCA( LLD_ ),*). This

*          array contains the local entries of the matrix PA.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO = 0, no error has been found.

*          If INFO > 0, the maximum absolute error found is in (0,eps].

*          If INFO < 0, the maximum absolute error found is in (eps,+oo).

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

      DOUBLE PRECISION   ZERO

      PARAMETER          ( ZERO = 0.0d+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, ROWREP

      INTEGER            H, I, IACOL, IAROW, IB, ICTXT, ICURCOL,

     $                   ICURROW, II, IIA, IN, J, JB, JJ, JJA, JN, K,

     $                   KK, LDA, LDPA, LL, MYCOL, MYROW, NPCOL, NPROW

      DOUBLE PRECISION   ERR, EPS

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, dgamx2d, pb_infog2l, pderrset

*     ..

*     .. External Functions ..

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           pdlamch

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max, min, mod

*     ..

*     .. Executable Statements ..

*

      info   = 0

      errmax = zero

*

*     Quick return if possible

*

      IF( ( m.EQ.0 ).OR.( n.EQ.0 ) )

     $   RETURN

*

*     Start the operations

*

      ictxt = desca( ctxt_ )

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      eps = pdlamch( ictxt, 'eps' )

*

      CALL pb_infog2l( ia, ja, desca, nprow, npcol, myrow, mycol, iia,

     $                 jja, iarow, iacol )

*

      ii      = iia

      jj      = jja

      lda     = desca( m_ )

      ldpa    = desca( lld_ )

      icurrow = iarow

      icurcol = iacol

      rowrep  = ( iarow.EQ.-1 )

      colrep  = ( iacol.EQ.-1 )

*

*     Handle the first block of columns separately

*

      jb = desca( inb_ ) - ja  + 1

      IF( jb.LE.0 )

     $   jb = ( ( -jb ) / desca( nb_ ) + 1 ) * desca( nb_ ) + jb

      jb = min( jb, n )

      jn = ja + jb - 1
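*

*     Illustration only (values chosen arbitrarily): with

*     DESCA( INB_ ) = 2, DESCA( NB_ ) = 3 and JA = 4, the column

*     blocks are 1:2, 3:5, 6:8, ...  Then JB = 2 - 4 + 1 = -1 <= 0,

*     so JB = ( ( 1 / 3 ) + 1 ) * 3 + ( -1 ) = 2 (integer division),

*     i.e. columns 4:5 complete the block containing JA, and

*     JN = 4 + 2 - 1 = 5 ( assuming N >= 2 ).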

*

      IF( mycol.EQ.icurcol .OR. colrep ) THEN

*

         DO 40 h = 0, jb-1

            ib = desca( imb_ ) - ia  + 1

            IF( ib.LE.0 )

     $         ib = ( ( -ib ) / desca( mb_ ) + 1 ) * desca( mb_ ) + ib

            ib = min( ib, m )

            in = ia + ib - 1

            IF( myrow.EQ.icurrow .OR. rowrep ) THEN

               DO 10 k = 0, ib-1

                  CALL pderrset( err, errmax, a( ia+k+(ja+h-1)*lda ),

     $                           pa( ii+k+(jj+h-1)*ldpa ) )

   10          CONTINUE

               ii = ii + ib

            END IF

            icurrow = mod( icurrow+1, nprow )

*

*           Loop over remaining blocks of rows

*

            DO 30 i = in+1, ia+m-1, desca( mb_ )

               ib = min( desca( mb_ ), ia+m-i )

               IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                  DO 20 k = 0, ib-1

                     CALL pderrset( err, errmax, a( i+k+(ja+h-1)*lda ),

     $                              pa( ii+k+(jj+h-1)*ldpa ) )

   20             CONTINUE

                  ii = ii + ib

               END IF

               icurrow = mod( icurrow+1, nprow )

   30       CONTINUE

*

            ii = iia

            icurrow = iarow

   40    CONTINUE

*

         jj = jj + jb

*

      END IF

*

      icurcol = mod( icurcol+1, npcol )

*

*     Loop over remaining column blocks

*

      DO 90 j = jn+1, ja+n-1, desca( nb_ )

         jb = min(  desca( nb_ ), ja+n-j )

         IF( mycol.EQ.icurcol .OR. colrep ) THEN

            DO 80 h = 0, jb-1

               ib = desca( imb_ ) - ia  + 1

               IF( ib.LE.0 )

     $            ib = ( ( -ib ) / desca( mb_ ) + 1 )*desca( mb_ ) + ib

               ib = min( ib, m )

               in = ia + ib - 1

               IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                  DO 50 k = 0, ib-1

                     CALL pderrset( err, errmax, a( ia+k+(j+h-1)*lda ),

     $                              pa( ii+k+(jj+h-1)*ldpa ) )

   50             CONTINUE

                  ii = ii + ib

               END IF

               icurrow = mod( icurrow+1, nprow )

*

*              Loop over remaining blocks of rows

*

               DO 70 i = in+1, ia+m-1, desca( mb_ )

                  ib = min( desca( mb_ ), ia+m-i )

                  IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                     DO 60 k = 0, ib-1

                        CALL pderrset( err, errmax,

     $                                 a( i+k+(j+h-1)*lda ),

     $                                 pa( ii+k+(jj+h-1)*ldpa ) )

   60                CONTINUE

                     ii = ii + ib

                  END IF

                  icurrow = mod( icurrow+1, nprow )

   70          CONTINUE

*

               ii = iia

               icurrow = iarow

   80       CONTINUE

*

            jj = jj + jb

         END IF

*

         icurcol = mod( icurcol+1, npcol )

*

   90 CONTINUE

*

      CALL dgamx2d( ictxt, 'All', ' ', 1, 1, errmax, 1, kk, ll, -1,

     $              -1, -1 )

*

      IF( errmax.GT.zero .AND. errmax.LE.eps ) THEN

         info = 1

      ELSE IF( errmax.GT.eps ) THEN

         info = -1

      END IF

*

      RETURN

*

*     End of PDCHKMIN

*

      END

      SUBROUTINE pdchkmout( M, N, A, PA, IA, JA, DESCA, INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            IA, INFO, JA, M, N

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * )

      DOUBLE PRECISION   A( * ), PA( * )

*     ..

*

*  Purpose

*  =======

*

*  PDCHKMOUT  checks  that the matrix PA \ sub( PA ) remained unchanged.

*  The  local array  entries  are compared element by element, and their

*  difference is tested against 0.0 as well as the machine epsilon. This

*  difference should be exactly zero, but because some of the data may

*  have been moved, a nonzero difference no larger than the machine

*  epsilon is flagged differently from a larger one. The largest error

*  is reported.

*

*  Notes

*  =====

*

*  A description  vector  is associated with each 2D block-cyclicly dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block cyclicly distributed matrix.  Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  M       (global input) INTEGER

*          On entry,  M  specifies  the  number of rows of the submatrix

*          sub( PA ). M must be at least zero.

*

*  N       (global input) INTEGER

*          On entry, N specifies the  number of columns of the submatrix

*          sub( PA ). N must be at least zero.

*

*  A       (local input) DOUBLE PRECISION array

*          On entry, A is an array of  dimension  (DESCA( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PA.

*

*  PA      (local input) DOUBLE PRECISION array

*          On entry, PA is an array of dimension (DESCA( LLD_ ),*). This

*          array contains the local entries of the matrix PA.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO = 0, no error has been found.

*          If INFO > 0, the maximum absolute error found is in (0,eps].

*          If INFO < 0, the maximum absolute error found is in (eps,+oo).

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

      DOUBLE PRECISION   ZERO

      PARAMETER          ( ZERO = 0.0d+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, ROWREP

      INTEGER            I, IB, ICTXT, ICURCOL, II, IMBA, J, JB, JJ, KK,

     $                   LDA, LDPA, LL, MPALL, MYCOL, MYROW, MYROWDIST,

     $                   NPCOL, NPROW

      DOUBLE PRECISION   EPS, ERR, ERRMAX

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, dgamx2d, pderrset

*     ..

*     .. External Functions ..

      INTEGER            PB_NUMROC

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           PDLAMCH, PB_NUMROC

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          max, min, mod

*     ..

*     .. Executable Statements ..

*

      info = 0

      errmax = zero

*

*     Quick return if possible

*

      IF( ( desca( m_ ).LE.0 ).OR.( desca( n_ ).LE.0 ) )

     $   RETURN

*

*     Start the operations

*

      ictxt = desca( ctxt_ )

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      eps = pdlamch( ictxt, 'eps' )

*

      mpall = pb_numroc( desca( m_ ), 1, desca( imb_ ), desca( mb_ ),

     $                   myrow, desca( rsrc_ ), nprow )

*

      lda    = desca( m_ )

      ldpa   = desca( lld_ )

*

      ii = 1

      jj = 1

      rowrep  = ( desca( rsrc_ ).EQ.-1 )

      colrep  = ( desca( csrc_ ).EQ.-1 )

      icurcol = desca( csrc_ )

      IF( myrow.EQ.desca( rsrc_ ) .OR. rowrep ) THEN

         imba = desca( imb_ )

      ELSE

         imba = desca( mb_ )

      END IF

      IF( rowrep ) THEN

         myrowdist = 0

      ELSE

         myrowdist = mod( myrow - desca( rsrc_ ) + nprow, nprow )

      END IF

*

      IF( mycol.EQ.icurcol .OR. colrep ) THEN

*

         j = 1

         IF( myrowdist.EQ.0 ) THEN

            i = 1

         ELSE

            i = desca( imb_ ) + ( myrowdist - 1 ) * desca( mb_ ) + 1

         END IF
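*

*        Illustration only (values chosen arbitrarily): with

*        DESCA( IMB_ ) = 2 and DESCA( MB_ ) = 3, the row blocks are

*        1:2, 3:5, 6:8, ...  The process row at distance

*        MYROWDIST = 1 from the source row starts at global row

*        I = 2 + 0 * 3 + 1 = 3, and the one at distance 2 starts at

*        I = 2 + 1 * 3 + 1 = 6.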

         ib = min( max( 0, desca( m_ ) - i + 1 ), imba )

         jb = min( desca( n_ ), desca( inb_ ) )

*

         DO 20 kk = 0, jb-1

            DO 10 ll = 0, ib-1

               IF( i+ll.LT.ia .OR. i+ll.GT.ia+m-1 .OR.

     $             j+kk.LT.ja .OR. j+kk.GT.ja+n-1 )

     $            CALL pderrset( err, errmax, a( i+ll+(j+kk-1)*lda ),

     $                           pa( ii+ll+(jj+kk-1)*ldpa ) )

   10       CONTINUE

   20    CONTINUE

         IF( rowrep ) THEN

            i = i + imba

         ELSE

            i = i + imba + ( nprow - 1 ) * desca( mb_ )

         END IF
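*

*        Illustration only (values chosen arbitrarily): with

*        NPROW = 2, DESCA( IMB_ ) = 2 and DESCA( MB_ ) = 3, the source

*        process row ( IMBA = 2 ) moves from I = 1 to

*        I = 1 + 2 + 1 * 3 = 6, and the other process row ( IMBA = 3 )

*        moves from I = 3 to I = 3 + 3 + 1 * 3 = 9.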

*

         DO 50 ii = imba + 1, mpall, desca( mb_ )

            ib = min( mpall-ii+1, desca( mb_ ) )

*

            DO 40 kk = 0, jb-1

               DO 30 ll = 0, ib-1

                  IF( i+ll.LT.ia .OR. i+ll.GT.ia+m-1 .OR.

     $                j+kk.LT.ja .OR. j+kk.GT.ja+n-1 )

     $               CALL pderrset( err, errmax,

     $                              a( i+ll+(j+kk-1)*lda ),

     $                              pa( ii+ll+(jj+kk-1)*ldpa ) )

   30          CONTINUE

   40       CONTINUE

*

            IF( rowrep ) THEN

               i = i + desca( mb_ )

            ELSE

               i = i + nprow * desca( mb_ )

            END IF

*

   50    CONTINUE

*

         jj = jj + jb

*

      END IF

*

      icurcol = mod( icurcol + 1, npcol )

*

      DO 110 j = desca( inb_ ) + 1, desca( n_ ), desca( nb_ )

         jb = min( desca( n_ ) - j + 1, desca( nb_ ) )

*

         IF( mycol.EQ.icurcol .OR. colrep ) THEN

*

            IF( myrowdist.EQ.0 ) THEN

               i = 1

            ELSE

               i = desca( imb_ ) + ( myrowdist - 1 ) * desca( mb_ ) + 1

            END IF

*

            ii = 1

            ib = min( max( 0, desca( m_ ) - i + 1 ), imba )

            DO 70 kk = 0, jb-1

               DO 60 ll = 0, ib-1

                  IF( i+ll.LT.ia .OR. i+ll.GT.ia+m-1 .OR.

     $                j+kk.LT.ja .OR. j+kk.GT.ja+n-1 )

     $               CALL pderrset( err, errmax,

     $                              a( i+ll+(j+kk-1)*lda ),

     $                              pa( ii+ll+(jj+kk-1)*ldpa ) )

   60          CONTINUE

   70       CONTINUE

            IF( rowrep ) THEN

               i = i + imba

            ELSE

               i = i + imba + ( nprow - 1 ) * desca( mb_ )

            END IF

*

            DO 100 ii = imba+1, mpall, desca( mb_ )

               ib = min( mpall-ii+1, desca( mb_ ) )

*

               DO 90 kk = 0, jb-1

                  DO 80 ll = 0, ib-1

                     IF( i+ll.LT.ia .OR. i+ll.GT.ia+m-1 .OR.

     $                   j+kk.LT.ja .OR. j+kk.GT.ja+n-1 )

     $                  CALL pderrset( err, errmax,

     $                                 a( i+ll+(j+kk-1)*lda ),

     $                                 pa( ii+ll+(jj+kk-1)*ldpa ) )

   80             CONTINUE

   90          CONTINUE

*

               IF( rowrep ) THEN

                  i = i + desca( mb_ )

               ELSE

                  i = i + nprow * desca( mb_ )

               END IF

*

  100       CONTINUE

*

            jj = jj + jb

*

         END IF

*

         icurcol = mod( icurcol + 1, npcol )

*

  110 CONTINUE

*

      CALL dgamx2d( ictxt, 'All', ' ', 1, 1, errmax, 1, kk, ll, -1,

     $              -1, -1 )

*

      IF( errmax.GT.zero .AND. errmax.LE.eps ) THEN

         info = 1

      ELSE IF( errmax.GT.eps ) THEN

         info = -1

      END IF

*

      RETURN

*

*     End of PDCHKMOUT

*

      END

      SUBROUTINE pdmprnt( ICTXT, NOUT, M, N, A, LDA, IRPRNT, ICPRNT,

     $                    CMATNM )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            ICPRNT, ICTXT, IRPRNT, LDA, M, N, NOUT

*     ..

*     .. Array Arguments ..

      CHARACTER*(*)      CMATNM

      DOUBLE PRECISION   A( LDA, * )

*     ..

*

*  Purpose

*  =======

*

*  PDMPRNT prints an array A of size m by n to the output unit NOUT. Only

*  the process of coordinates ( IRPRNT, ICPRNT ) is printing.

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  NOUT    (global input) INTEGER

*          On entry, NOUT specifies the unit number for the output file.

*          When NOUT is 6, output to screen,  when  NOUT is 0, output to

*          stderr. NOUT is only defined for process 0.

*

*  M       (global input) INTEGER

*          On entry, M  specifies the number of rows of the matrix A.  M

*          must be at least zero.

*

*  N       (global input) INTEGER

*          On entry, N  specifies the number of columns of the matrix A.

*          N must be at least zero.

*

*  A       (local input) DOUBLE PRECISION array

*          On entry,  A  is an array of dimension (LDA,N). The leading m

*          by n part of this array is printed.

*

*  LDA     (local input) INTEGER

*          On entry, LDA  specifies the leading dimension of  the  local

*          array A to be printed. LDA must be at least MAX( 1, M ).

*

*  IRPRNT  (global input) INTEGER

*          On entry, IRPRNT  specifies the process row coordinate of the

*          printing process.

*

*  ICPRNT  (global input) INTEGER

*          On entry,  ICPRNT  specifies the process column coordinate of

*          the printing process.

*

*  CMATNM  (global input) CHARACTER*(*)

*          On entry, CMATNM specifies the identifier of the matrix to be

*          printed.
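*

*  A call might look as follows (a sketch only; the array name, its

*  sizes and the printing process are hypothetical):

*     CALL PDMPRNT( ICTXT, NOUT, 3, 2, A, 3, 0, 0, 'A' )

*  Process ( 0, 0 ) would then write one line per entry of the leading

*  3 by 2 part of A, column by column, each line giving the name, the

*  row index, the column index and the value.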

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER            I, J, MYCOL, MYROW, NPCOL, NPROW

*     ..

*     .. External Subroutines ..

      EXTERNAL           BLACS_GRIDINFO

*     ..

*     .. Executable Statements ..

*

*     Quick return if possible

*

      IF( ( m.LE.0 ).OR.( n.LE.0 ) )

     $   RETURN

*

*     Get grid parameters

*

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      IF( myrow.EQ.irprnt .AND. mycol.EQ.icprnt ) THEN

*

         WRITE( nout, fmt = * )

         DO 20 j = 1, n

*

            DO 10 i = 1, m

*

               WRITE( nout, fmt = 9999 ) cmatnm, i, j, a( i, j )

*

   10       CONTINUE

*

   20    CONTINUE

*

      END IF

*

 9999 FORMAT( 1x, a, '(', i6, ',', i6, ')=', d30.18 )

*

      RETURN

*

*     End of PDMPRNT

*

      END

      SUBROUTINE pdvprnt( ICTXT, NOUT, N, X, INCX, IRPRNT, ICPRNT,

     $                    CVECNM )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            ICPRNT, ICTXT, INCX, IRPRNT, N, NOUT

*     ..

*     .. Array Arguments ..

      CHARACTER*(*)      CVECNM

      DOUBLE PRECISION   X( * )

*     ..

*

*  Purpose

*  =======

*

*  PDVPRNT prints a vector x of length n to the output unit NOUT.  Only

*  the process of coordinates ( IRPRNT, ICPRNT ) is printing.

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  NOUT    (global input) INTEGER

*          On entry, NOUT specifies the unit number for the output file.

*          When NOUT is 6, output to screen,  when  NOUT is 0, output to

*          stderr. NOUT is only defined for process 0.

*

*  N       (global input) INTEGER

*          On entry, N  specifies the length of the vector X.  N must be

*          at least zero.

*

*  X       (global input) DOUBLE PRECISION array

*          On   entry,   X   is   an   array   of   dimension  at  least

*          ( 1 + ( n - 1 )*abs( INCX ) ).  Before  entry,  the incremen-

*          ted array X must contain the vector x.
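*          For instance (values chosen arbitrarily), N = 4 and INCX = 2

*          require an array of at least 1 + 3*2 = 7 entries, the vector

*          x being stored in X( 1 ), X( 3 ), X( 5 ) and X( 7 ).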

*

*  INCX    (global input) INTEGER.

*          On entry, INCX specifies the increment for the elements of X.

*          INCX must not be zero.

*

*  IRPRNT  (global input) INTEGER

*          On entry, IRPRNT  specifies the process row coordinate of the

*          printing process.

*

*  ICPRNT  (global input) INTEGER

*          On entry,  ICPRNT  specifies the process column coordinate of

*          the printing process.

*

*  CVECNM  (global input) CHARACTER*(*)

*          On entry, CVECNM specifies the identifier of the vector to be

*          printed.
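*

*  A call might look as follows (a sketch only; the array name, its

*  length and the printing process are hypothetical):

*     CALL PDVPRNT( ICTXT, NOUT, 3, X, 2, 0, 0, 'X' )

*  Process ( 0, 0 ) would then write three lines, one for each of

*  X( 1 ), X( 3 ) and X( 5 ), each labelled with the name and the

*  array index.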

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER            I, MYCOL, MYROW, NPCOL, NPROW

*     ..

*     .. External Subroutines ..

      EXTERNAL           BLACS_GRIDINFO

*     ..

*     .. Executable Statements ..

*

*     Quick return if possible

*

      IF( n.LE.0 )

     $   RETURN

*

*     Get grid parameters

*

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      IF( myrow.EQ.irprnt .AND. mycol.EQ.icprnt ) THEN

*

         WRITE( nout, fmt = * )

         DO 10 i = 1, 1 + ( n-1 )*incx, incx

*

            WRITE( nout, fmt = 9999 ) cvecnm, i, x( i )

*

   10    CONTINUE

*

      END IF

*

 9999 FORMAT( 1x, a, '(', i6, ')=', d30.18 )

*

      RETURN

*

*     End of PDVPRNT

*

      END

      SUBROUTINE pdmvch( ICTXT, TRANS, M, N, ALPHA, A, IA, JA, DESCA,

     $                   X, IX, JX, DESCX, INCX, BETA, Y, PY, IY, JY,

     $                   DESCY, INCY, G, ERR, INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        TRANS

      INTEGER            IA, ICTXT, INCX, INCY, INFO, IX, IY, JA, JX,

     $                   JY, M, N

      DOUBLE PRECISION   ALPHA, BETA, ERR

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * ), DESCX( * ), DESCY( * )

      DOUBLE PRECISION   A( * ), G( * ), PY( * ), X( * ), Y( * )

*     ..

*

*  Purpose

*  =======

*

*  PDMVCH checks the results of the computational tests.

*

*  Notes

*  =====

*

*  A description  vector  is associated with each 2D block-cyclicly dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block cyclicly distributed matrix.  Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )
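*
*  As an illustrative example (the values are chosen here for the sake
*  of illustration and are not taken from the tester), let K = 9,
*  IA = 1, IMB_A = 2, MB_A = 3, RSRC_A = 0 and NPROW = 2.  Rows 1:2
*  form the first block and belong to process row 0, rows 3:5 belong
*  to process row 1, rows 6:8 to process row 0 and row 9 to process
*  row 1, so that
*     Lr( 1, 9 ) = 2 + 3 = 5  when MYROW = 0,
*     Lr( 1, 9 ) = 3 + 1 = 4  when MYROW = 1.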

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  TRANS   (global input) CHARACTER*1

*          On entry,  TRANS  specifies which matrix-vector product is to

*          be computed as follows:

*             If TRANS = 'T' or 'C',

*                sub( Y ) = BETA  * sub( Y ) +

*                           ALPHA * sub( A )' * sub( X ),

*             otherwise

*                sub( Y ) = BETA  * sub( Y ) +

*                           ALPHA * sub( A )  * sub( X ).

*

*  M       (global input) INTEGER

*          On entry,  M  specifies  the  number of rows of the submatrix

*          operand sub( A ). M must be at least zero.

*

*  N       (global input) INTEGER

*          On entry,  N  specifies  the  number of columns of the subma-

*          trix operand sub( A ). N must be at least zero.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  A       (local input) DOUBLE PRECISION array

*          On entry, A is an array of  dimension  (DESCA( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PA.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  X       (local input) DOUBLE PRECISION array

*          On entry, X is an array of  dimension  (DESCX( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PX.

*

*  IX      (global input) INTEGER

*          On entry, IX  specifies X's global row index, which points to

*          the beginning of the submatrix sub( X ).

*

*  JX      (global input) INTEGER

*          On entry, JX  specifies X's global column index, which points

*          to the beginning of the submatrix sub( X ).

*

*  DESCX   (global and local input) INTEGER array

*          On entry, DESCX  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix X.

*

*  INCX    (global input) INTEGER

*          On entry,  INCX   specifies  the  global  increment  for  the

*          elements of  X.  Only two values of  INCX   are  supported in

*          this version, namely 1 and M_X. INCX  must not be zero.

*

*  BETA    (global input) DOUBLE PRECISION

*          On entry, BETA specifies the scalar beta.

*

*  Y       (local input/local output) DOUBLE PRECISION array

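*          On entry, Y is an array of  dimension  (DESCY( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PY.

*          On exit, the entries of sub( Y ) are overwritten with the ex-

*          pected result computed from A, X and Y.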

*

*  PY      (local input) DOUBLE PRECISION array

*          On entry, PY is an array of dimension (DESCY( LLD_ ),*). This

*          array contains the local entries of the matrix PY.

*

*  IY      (global input) INTEGER

*          On entry, IY  specifies Y's global row index, which points to

*          the beginning of the submatrix sub( Y ).

*

*  JY      (global input) INTEGER

*          On entry, JY  specifies Y's global column index, which points

*          to the beginning of the submatrix sub( Y ).

*

*  DESCY   (global and local input) INTEGER array

*          On entry, DESCY  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix Y.

*

*  INCY    (global input) INTEGER

*          On entry,  INCY   specifies  the  global  increment  for  the

*          elements of  Y.  Only two values of  INCY   are  supported in

*          this version, namely 1 and M_Y. INCY  must not be zero.

*

*  G       (workspace) DOUBLE PRECISION array

*          On entry, G is an array of dimension at least MAX( M, N ).  G

*          is used to compute the gauges.

*

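*  ERR     (global output) DOUBLE PRECISION

*          On exit, ERR specifies the largest error ratio found, i.e.,

*          the largest value of  ABS( PY - Y ) / ( EPS * G )  over the

*          entries of sub( Y ); the division by G is skipped when G is

*          zero.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO <> 0, the result is less than half accurate,

*          i.e., ERR * SQRT( EPS ) >= 1 on at least one process.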

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, DLEN_ = 11,

     $                   DTYPE_ = 1, CTXT_ = 2, M_ = 3, N_ = 4,

     $                   IMB_ = 5, INB_ = 6, MB_ = 7, NB_ = 8,

     $                   RSRC_ = 9, CSRC_ = 10, LLD_ = 11 )

      DOUBLE PRECISION   ZERO, ONE

      PARAMETER          ( ZERO = 0.0D+0, ONE = 1.0D+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, ROWREP, TRAN

      INTEGER            I, IB, ICURCOL, ICURROW, IIY, IN, IOFFA, IOFFX,

     $                   IOFFY, IYCOL, IYROW, J, JB, JJY, JN, KK, LDA,

     $                   LDPY, LDX, LDY, ML, MYCOL, MYROW, NL, NPCOL,

     $                   nprow

      DOUBLE PRECISION   EPS, ERRI, GTMP, TBETA, YTMP

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, dgamx2d, igsum2d, pb_infog2l

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           lsame, pdlamch

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max, min, mod, sqrt

*     ..

*     .. Executable Statements ..

*

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      eps = pdlamch( ictxt, 'eps' )

*

      IF( m.EQ.0 .OR. n.EQ.0 ) THEN

         tbeta = one

      ELSE

         tbeta = beta

      END IF

*

      tran = lsame( trans, 'T' ).OR.lsame( trans, 'C' )

      IF( tran ) THEN

         ml = n

         nl = m

      ELSE

         ml = m

         nl = n

      END IF

*

      lda = max( 1, desca( m_ ) )

      ldx = max( 1, descx( m_ ) )

      ldy = max( 1, descy( m_ ) )

*

*     Compute expected result in Y using data in A, X and Y.

*     Compute gauges in G. This part of the computation is performed

*     by every process in the grid.

*

      ioffy = iy + ( jy - 1 ) * ldy

      DO 30 i = 1, ml

         ytmp = zero

         gtmp = zero

         ioffx = ix + ( jx - 1 ) * ldx

         IF( tran )THEN

            ioffa = ia + ( ja + i - 2 ) * lda

            DO 10 j = 1, nl

               ytmp = ytmp + a( ioffa ) * x( ioffx )

               gtmp = gtmp + abs( a( ioffa ) * x( ioffx ) )

               ioffa = ioffa + 1

               ioffx = ioffx + incx

   10       CONTINUE

         ELSE

            ioffa = ia + i - 1 + ( ja - 1 ) * lda

            DO 20 j = 1, nl

               ytmp = ytmp + a( ioffa ) * x( ioffx )

               gtmp = gtmp + abs( a( ioffa ) * x( ioffx ) )

               ioffa = ioffa + lda

               ioffx = ioffx + incx

   20       CONTINUE

         END IF

         g( i ) = abs( alpha ) * gtmp + abs( tbeta * y( ioffy ) )

         y( ioffy ) = alpha * ytmp + tbeta * y( ioffy )

         ioffy = ioffy + incy

   30 CONTINUE

*

*     Compute the error ratio for this result.

*

      err  = zero

      info = 0

      ldpy = descy( lld_ )

      ioffy = iy + ( jy - 1 ) * ldy

      CALL pb_infog2l( iy, jy, descy, nprow, npcol, myrow, mycol, iiy,

     $                 jjy, iyrow, iycol )

      icurrow = iyrow

      icurcol = iycol

      rowrep  = ( iyrow.EQ.-1 )

      colrep  = ( iycol.EQ.-1 )

*

      IF( incy.EQ.descy( m_ ) ) THEN

*

*        sub( Y ) is a row vector

*

         jb = descy( inb_ ) - jy + 1

         IF( jb.LE.0 )

     $      jb = ( ( -jb ) / descy( nb_ ) + 1 ) * descy( nb_ ) + jb

         jb = min( jb, ml )

         jn = jy + jb - 1

*

         DO 50 j = jy, jn

*

            IF( ( myrow.EQ.icurrow .OR. rowrep ) .AND.

     $          ( mycol.EQ.icurcol .OR. colrep ) ) THEN

               erri = abs( py( iiy+(jjy-1)*ldpy ) - y( ioffy ) ) / eps

               IF( g( j-jy+1 ).NE.zero )

     $            erri = erri / g( j-jy+1 )

               err = max( err, erri )

               IF( err*sqrt( eps ).GE.one )

     $            info = 1

               jjy = jjy + 1

            END IF

*

            ioffy = ioffy + incy

*

   50    CONTINUE

*

         icurcol = mod( icurcol+1, npcol )

*

         DO 70 j = jn+1, jy+ml-1, descy( nb_ )

            jb = min( jy+ml-j, descy( nb_ ) )

*

            DO 60 kk = 0, jb-1

*

               IF( ( myrow.EQ.icurrow .OR. rowrep ) .AND.

     $             ( mycol.EQ.icurcol .OR. colrep ) ) THEN

                  erri = abs( py( iiy+(jjy-1)*ldpy ) - y( ioffy ) )/eps

                  IF( g( j+kk-jy+1 ).NE.zero )

     $               erri = erri / g( j+kk-jy+1 )

                  err = max( err, erri )

                  IF( err*sqrt( eps ).GE.one )

     $               info = 1

                  jjy = jjy + 1

               END IF

*

               ioffy = ioffy + incy

*

   60       CONTINUE

*

            icurcol = mod( icurcol+1, npcol )

*

   70    CONTINUE

*

      ELSE

*

*        sub( Y ) is a column vector

*

         ib = descy( imb_ ) - iy + 1

         IF( ib.LE.0 )

     $      ib = ( ( -ib ) / descy( mb_ ) + 1 ) * descy( mb_ ) + ib

         ib = min( ib, ml )

         in = iy + ib - 1

*

         DO 80 i = iy, in

*

            IF( ( myrow.EQ.icurrow .OR. rowrep ) .AND.

     $          ( mycol.EQ.icurcol .OR. colrep ) ) THEN

               erri = abs( py( iiy+(jjy-1)*ldpy ) - y( ioffy ) ) / eps

               IF( g( i-iy+1 ).NE.zero )

     $            erri = erri / g( i-iy+1 )

               err = max( err, erri )

               IF( err*sqrt( eps ).GE.one )

     $            info = 1

               iiy = iiy + 1

            END IF

*

            ioffy = ioffy + incy

*

   80    CONTINUE

*

         icurrow = mod( icurrow+1, nprow )

*

         DO 100 i = in+1, iy+ml-1, descy( mb_ )

            ib = min( iy+ml-i, descy( mb_ ) )

*

            DO 90 kk = 0, ib-1

*

               IF( ( myrow.EQ.icurrow .OR. rowrep ) .AND.

     $             ( mycol.EQ.icurcol .OR. colrep ) ) THEN

                  erri = abs( py( iiy+(jjy-1)*ldpy ) - y( ioffy ) )/eps

                  IF( g( i+kk-iy+1 ).NE.zero )

     $               erri = erri / g( i+kk-iy+1 )

                  err = max( err, erri )

                  IF( err*sqrt( eps ).GE.one )

     $               info = 1

                  iiy = iiy + 1

               END IF

*

               ioffy = ioffy + incy

*

   90       CONTINUE

*

            icurrow = mod( icurrow+1, nprow )

*

  100    CONTINUE

*

      END IF

*

*     If INFO = 0, all results are at least half accurate.

*

      CALL igsum2d( ictxt, 'All', ' ', 1, 1, info, 1, -1, mycol )

      CALL dgamx2d( ictxt, 'All', ' ', 1, 1, err, 1, i, j, -1, -1,

     $              mycol )

*

      RETURN

*

*     End of PDMVCH

*

      END
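*
*  Illustrative sketch only: EXFBLK is a hypothetical helper that is
*  not part of the PBLAS tester and is not called anywhere.  Assuming
*  a first block of INB entries followed by blocks of NB entries, it
*  returns how many of the N entries starting at global index I lie
*  in the block that contains I.  This mirrors the inline computation
*  of IB and JB in PDMVCH above.  For example, with INB = 2, NB = 3
*  and N >= 2, I = 4 lies in the block 3:5, so 2 entries ( 4:5 ) are
*  left in that block.
*
      INTEGER FUNCTION EXFBLK( I, N, INB, NB )
*     .. Scalar Arguments ..
      INTEGER            I, INB, N, NB
*     .. Local Scalars ..
      INTEGER            IB
*     .. Intrinsic Functions ..
      INTRINSIC          MIN
*     .. Executable Statements ..
*
*     Entries left in the first block; non-positive if I lies beyond it
*
      IB = INB - I + 1
*
*     Otherwise count the entries left in the NB-sized block holding I
*
      IF( IB.LE.0 )
     $   IB = ( ( -IB ) / NB + 1 ) * NB + IB
*
*     Never report more than the N entries requested
*
      EXFBLK = MIN( IB, N )
*
      RETURN
      END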

      SUBROUTINE pdvmch( ICTXT, UPLO, M, N, ALPHA, X, IX, JX, DESCX,

     $                   INCX, Y, IY, JY, DESCY, INCY, A, PA, IA, JA,

     $                   DESCA, G, ERR, INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        UPLO

      INTEGER            IA, ICTXT, INCX, INCY, INFO, IX, IY, JA, JX,

     $                   JY, M, N

      DOUBLE PRECISION   ALPHA, ERR

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * ), DESCX( * ), DESCY( * )

      DOUBLE PRECISION   A( * ), G( * ), PA( * ), X( * ), Y( * )

*     ..

*

*  Purpose

*  =======

*

*  PDVMCH checks the results of the computational tests for the rank-one

*  update  sub( A ) := ALPHA * sub( X ) * sub( Y )' + sub( A ).

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  UPLO    (global input) CHARACTER*1

*          On entry, UPLO specifies which part of the submatrix sub( A )

*          is to be referenced as follows:

*             If UPLO = 'L', only the lower triangular part,

*             If UPLO = 'U', only the upper triangular part,

*             else the entire matrix is to be referenced.

*

*  M       (global input) INTEGER

*          On entry,  M  specifies  the  number of rows of the submatrix

*          operand sub( A ). M must be at least zero.

*

*  N       (global input) INTEGER

*          On entry,  N  specifies  the  number of columns of the subma-

*          trix operand sub( A ). N must be at least zero.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  X       (local input) DOUBLE PRECISION array

*          On entry, X is an array of  dimension  (DESCX( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PX.

*

*  IX      (global input) INTEGER

*          On entry, IX  specifies X's global row index, which points to

*          the beginning of the submatrix sub( X ).

*

*  JX      (global input) INTEGER

*          On entry, JX  specifies X's global column index, which points

*          to the beginning of the submatrix sub( X ).

*

*  DESCX   (global and local input) INTEGER array

*          On entry, DESCX  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix X.

*

*  INCX    (global input) INTEGER

*          On entry,  INCX   specifies  the  global  increment  for  the

*          elements of  X.  Only two values of  INCX   are  supported in

*          this version, namely 1 and M_X. INCX  must not be zero.

*

*  Y       (local input) DOUBLE PRECISION array

*          On entry, Y is an array of  dimension  (DESCY( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PY.

*

*  IY      (global input) INTEGER

*          On entry, IY  specifies Y's global row index, which points to

*          the beginning of the submatrix sub( Y ).

*

*  JY      (global input) INTEGER

*          On entry, JY  specifies Y's global column index, which points

*          to the beginning of the submatrix sub( Y ).

*

*  DESCY   (global and local input) INTEGER array

*          On entry, DESCY  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix Y.

*

*  INCY    (global input) INTEGER

*          On entry,  INCY   specifies  the  global  increment  for  the

*          elements of  Y.  Only two values of  INCY   are  supported in

*          this version, namely 1 and M_Y. INCY  must not be zero.

*

*  A       (local input/local output) DOUBLE PRECISION array

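*          On entry, A is an array of  dimension  (DESCA( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PA.

*          On exit, the referenced entries of sub( A )  are  overwritten

*          with the expected result computed from A, X and Y.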

*

*  PA      (local input) DOUBLE PRECISION array

*          On entry, PA is an array of dimension (DESCA( LLD_ ),*). This

*          array contains the local entries of the matrix PA.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  G       (workspace) DOUBLE PRECISION array

*          On entry, G is an array of dimension at least MAX( M, N ).  G

*          is used to compute the gauges.

*

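*  ERR     (global output) DOUBLE PRECISION

*          On exit, ERR specifies the largest error ratio, i.e., the

*          largest value of ABS( PA - A ) / ( EPS * G ), found in the

*          last column of sub( A ) that was checked; the division by G

*          is skipped when G is zero.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO <> 0, the result is less than half accurate,

*          i.e., ERR * SQRT( EPS ) >= 1 on at least one process. The

*          check stops at the first column for which this happens.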

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, DLEN_ = 11,

     $                   DTYPE_ = 1, CTXT_ = 2, M_ = 3, N_ = 4,

     $                   IMB_ = 5, INB_ = 6, MB_ = 7, NB_ = 8,

     $                   RSRC_ = 9, CSRC_ = 10, LLD_ = 11 )

      DOUBLE PRECISION   ZERO, ONE

      PARAMETER          ( ZERO = 0.0D+0, ONE = 1.0D+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, LOWER, ROWREP, UPPER

      INTEGER            I, IACOL, IAROW, IB, IBEG, ICURROW, IEND, IIA,

     $                   in, ioffa, ioffx, ioffy, j, jja, kk, lda, ldpa,

     $                   ldx, ldy, mycol, myrow, npcol, nprow

      DOUBLE PRECISION   ATMP, EPS, ERRI, GTMP

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, dgamx2d, igsum2d, pb_infog2l

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           LSAME, PDLAMCH

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max, min, mod, sqrt

*     ..

*     .. Executable Statements ..

*

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      eps = pdlamch( ictxt, 'eps' )

*

      upper = lsame( uplo, 'U' )

      lower = lsame( uplo, 'L' )

*

      lda = max( 1, desca( m_ ) )

      ldx = max( 1, descx( m_ ) )

      ldy = max( 1, descy( m_ ) )

*

*     Compute expected result in A using data in A, X and Y.

*     Compute gauges in G. This part of the computation is performed

*     by every process in the grid.

*

      DO 70 j = 1, n

*

         ioffy = iy + ( jy - 1 ) * ldy + ( j - 1 ) * incy

*

         IF( lower ) THEN

            ibeg = j

            iend = m

            DO 10 i = 1, j-1

               g( i ) = zero

   10       CONTINUE

         ELSE IF( upper ) THEN

            ibeg = 1

            iend = j

            DO 20 i = j+1, m

               g( i ) = zero

   20       CONTINUE

         ELSE

            ibeg = 1

            iend = m

         END IF

*

         DO 30 i = ibeg, iend

*

            ioffx = ix + ( jx - 1 ) * ldx + ( i - 1 ) * incx

            ioffa = ia + i - 1 + ( ja + j - 2 ) * lda

            atmp = x( ioffx ) * y( ioffy )

            gtmp = abs( x( ioffx ) * y( ioffy ) )

            g( i ) = abs( alpha ) * gtmp + abs( a( ioffa ) )

            a( ioffa ) = alpha * atmp + a( ioffa )

*

   30    CONTINUE

*

*        Compute the error ratio for this result.

*

         info = 0

         err  = zero

         ldpa = desca( lld_ )

         ioffa = ia + ( ja + j - 2 ) * lda

         CALL pb_infog2l( ia, ja+j-1, desca, nprow, npcol, myrow, mycol,

     $                    iia, jja, iarow, iacol )

         rowrep = ( iarow.EQ.-1 )

         colrep = ( iacol.EQ.-1 )

*

         IF( mycol.EQ.iacol .OR. colrep ) THEN

*

            icurrow = iarow

            ib = desca( imb_ ) - ia + 1

            IF( ib.LE.0 )

     $         ib = ( ( -ib ) / desca( mb_ ) + 1 ) * desca( mb_ ) + ib

            ib = min( ib, m )

            in = ia + ib - 1

*

            DO 40 i = ia, in

*

               IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                  erri = abs( pa( iia+(jja-1)*ldpa ) - a( ioffa ) )/eps

                  IF( g( i-ia+1 ).NE.zero )

     $               erri = erri / g( i-ia+1 )

                  err = max( err, erri )

                  IF( err*sqrt( eps ).GE.one )

     $               info = 1

                  iia = iia + 1

               END IF

*

               ioffa = ioffa + 1

*

   40       CONTINUE

*

            icurrow = mod( icurrow+1, nprow )

*

            DO 60 i = in+1, ia+m-1, desca( mb_ )

               ib = min( ia+m-i, desca( mb_ ) )

*

               DO 50 kk = 0, ib-1

*

                  IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                     erri = abs( pa( iia+(jja-1)*ldpa )-a( ioffa ) )/eps

                     IF( g( i+kk-ia+1 ).NE.zero )

     $                  erri = erri / g( i+kk-ia+1 )

                     err = max( err, erri )

                     IF( err*sqrt( eps ).GE.one )

     $                  info = 1

                     iia = iia + 1

                  END IF

*

                  ioffa = ioffa + 1

*

   50          CONTINUE

*

               icurrow = mod( icurrow+1, nprow )

*

   60       CONTINUE

*

         END IF

*

*        If INFO = 0, all results are at least half accurate.

*

         CALL igsum2d( ictxt, 'All', ' ', 1, 1, info, 1, -1, mycol )

         CALL dgamx2d( ictxt, 'All', ' ', 1, 1, err, 1, i, j, -1, -1,

     $                 mycol )

         IF( info.NE.0 )

     $      GO TO 80

*

   70 CONTINUE

*

   80 CONTINUE

*

      RETURN

*

*     End of PDVMCH

*

      END
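*
*  Illustrative sketch only: EXERRI is a hypothetical helper that is
*  not part of the PBLAS tester and is not called anywhere.  It shows
*  the error ratio that PDMVCH and PDVMCH compute inline for a single
*  entry:  the difference between the distributed value PVAL and the
*  expected value VAL is divided by EPS, and further by the gauge
*  GAUGE when the gauge is nonzero.  A caller following the routines
*  above would set INFO = 1 when EXERRI( ... ) * SQRT( EPS ) >= ONE.
*
      DOUBLE PRECISION FUNCTION EXERRI( PVAL, VAL, GAUGE, EPS )
*     .. Scalar Arguments ..
      DOUBLE PRECISION   EPS, GAUGE, PVAL, VAL
*     .. Parameters ..
      DOUBLE PRECISION   ZERO
      PARAMETER          ( ZERO = 0.0D+0 )
*     .. Intrinsic Functions ..
      INTRINSIC          ABS
*     .. Executable Statements ..
*
*     Absolute difference measured in units of machine epsilon
*
      EXERRI = ABS( PVAL - VAL ) / EPS
*
*     Scale by the gauge unless it is exactly zero
*
      IF( GAUGE.NE.ZERO )
     $   EXERRI = EXERRI / GAUGE
*
      RETURN
      END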

      SUBROUTINE pdvmch2( ICTXT, UPLO, M, N, ALPHA, X, IX, JX, DESCX,

     $                    INCX, Y, IY, JY, DESCY, INCY, A, PA, IA,

     $                    JA, DESCA, G, ERR, INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        UPLO

      INTEGER            IA, ICTXT, INCX, INCY, INFO, IX, IY, JA, JX,

     $                   jy, m, n

      DOUBLE PRECISION   ALPHA, ERR

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * ), DESCX( * ), DESCY( * )

      DOUBLE PRECISION   A( * ), G( * ), PA( * ), X( * ), Y( * )

*     ..

*

*  Purpose

*  =======

*

*  PDVMCH2 checks the results of the computational tests.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  UPLO    (global input) CHARACTER*1

*          On entry, UPLO specifies which part of the submatrix sub( A )

*          is to be referenced as follows:

*             If UPLO = 'L', only the lower triangular part,

*             If UPLO = 'U', only the upper triangular part,

*             else the entire matrix is to be referenced.

*

*  M       (global input) INTEGER

*          On entry,  M  specifies  the  number of rows of the submatrix

*          operand sub( A ). M must be at least zero.

*

*  N       (global input) INTEGER

*          On entry,  N  specifies  the  number of columns of the subma-

*          trix operand sub( A ). N must be at least zero.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  X       (local input) DOUBLE PRECISION array

*          On entry, X is an array of  dimension  (DESCX( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PX.

*

*  IX      (global input) INTEGER

*          On entry, IX  specifies X's global row index, which points to

*          the beginning of the submatrix sub( X ).

*

*  JX      (global input) INTEGER

*          On entry, JX  specifies X's global column index, which points

*          to the beginning of the submatrix sub( X ).

*

*  DESCX   (global and local input) INTEGER array

*          On entry, DESCX  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix X.

*

*  INCX    (global input) INTEGER

*          On entry,  INCX   specifies  the  global  increment  for  the

*          elements of  X.  Only two values of  INCX   are  supported in

*          this version, namely 1 and M_X. INCX  must not be zero.

*

*  Y       (local input) DOUBLE PRECISION array

*          On entry, Y is an array of  dimension  (DESCY( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PY.

*

*  IY      (global input) INTEGER

*          On entry, IY  specifies Y's global row index, which points to

*          the beginning of the submatrix sub( Y ).

*

*  JY      (global input) INTEGER

*          On entry, JY  specifies Y's global column index, which points

*          to the beginning of the submatrix sub( Y ).

*

*  DESCY   (global and local input) INTEGER array

*          On entry, DESCY  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix Y.

*

*  INCY    (global input) INTEGER

*          On entry,  INCY   specifies  the  global  increment  for  the

*          elements of  Y.  Only two values of  INCY   are  supported in

*          this version, namely 1 and M_Y. INCY  must not be zero.

*

*  A       (local input/local output) DOUBLE PRECISION array

*          On entry, A is an array of  dimension  (DESCA( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PA.

*

*  PA      (local input) DOUBLE PRECISION array

*          On entry, PA is an array of dimension (DESCA( LLD_ ),*). This

*          array contains the local entries of the matrix PA.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  G       (workspace) DOUBLE PRECISION array

*          On entry, G is an array of dimension at least MAX( M, N ).  G

*          is used to compute the gauges.

*

*  ERR     (global output) DOUBLE PRECISION

*          On exit, ERR specifies the largest error in absolute value.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO <> 0, the result is less than half accurate.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, DLEN_ = 11,

     $                   DTYPE_ = 1, CTXT_ = 2, M_ = 3, N_ = 4,

     $                   IMB_ = 5, INB_ = 6, MB_ = 7, NB_ = 8,

     $                   RSRC_ = 9, CSRC_ = 10, LLD_ = 11 )

      DOUBLE PRECISION   ZERO, ONE

      PARAMETER          ( ZERO = 0.0D+0, ONE = 1.0D+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, LOWER, ROWREP, UPPER

      INTEGER            I, IACOL, IAROW, IB, IBEG, ICURROW, IEND, IIA,

     $                   IN, IOFFA, IOFFXI, IOFFXJ, IOFFYI, IOFFYJ, J,

     $                   JJA, KK, LDA, LDPA, LDX, LDY, MYCOL, MYROW,

     $                   NPCOL, NPROW

      DOUBLE PRECISION   EPS, ERRI, GTMP, ATMP

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, dgamx2d, igsum2d, pb_infog2l

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           LSAME, PDLAMCH

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max, min, mod, sqrt

*     ..

*     .. Executable Statements ..

*

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      eps = pdlamch( ictxt, 'eps' )

*

      upper = lsame( uplo, 'U' )

      lower = lsame( uplo, 'L' )

*

      lda = max( 1, desca( m_ ) )

      ldx = max( 1, descx( m_ ) )

      ldy = max( 1, descy( m_ ) )

*

*     Compute expected result in A using data in A, X and Y.

*     Compute gauges in G. This part of the computation is performed

*     by every process in the grid.

*

      DO 70 j = 1, n

*

         ioffxj = ix + ( jx - 1 ) * ldx + ( j - 1 ) * incx

         ioffyj = iy + ( jy - 1 ) * ldy + ( j - 1 ) * incy

*

         IF( lower ) THEN

            ibeg = j

            iend = m

            DO 10 i = 1, j-1

               g( i ) = zero

   10       CONTINUE

         ELSE IF( upper ) THEN

            ibeg = 1

            iend = j

            DO 20 i = j+1, m

               g( i ) = zero

   20       CONTINUE

         ELSE

            ibeg = 1

            iend = m

         END IF

*
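*        In the loop below, with x(i) and y(i) denoting the I-th
*        entries of sub( X ) and sub( Y ) (notation assumed here),
*           A( I, J ) := ALPHA*( x(i)*y(j) + y(i)*x(j) ) + A( I, J ),
*           G( I )    := ABS( ALPHA )*( ABS( x(i)*y(j) ) +
*                        ABS( y(i)*x(j) ) ) + ABS( A( I, J ) ),
*        where G( I ) uses the value of A( I, J ) before the update.
*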

         DO 30 i = ibeg, iend

            ioffa = ia + i - 1 + ( ja + j - 2 ) * lda

            ioffxi = ix + ( jx - 1 ) * ldx + ( i - 1 ) * incx

            ioffyi = iy + ( jy - 1 ) * ldy + ( i - 1 ) * incy

            atmp = x( ioffxi ) * y( ioffyj )

            atmp = atmp + y( ioffyi ) * x( ioffxj )

            gtmp = abs( x( ioffxi ) * y( ioffyj ) )

            gtmp = gtmp + abs( y( ioffyi ) * x( ioffxj ) )

            g( i ) = abs( alpha ) * gtmp + abs( a( ioffa ) )

            a( ioffa ) = alpha*atmp + a( ioffa )

*

   30    CONTINUE

*

*        Compute the error ratio for this result.

*
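*        Sketch of the test as read from the code below (notation is
*        illustrative): for each locally owned entry of column J,
*           ERRI = ABS( PA( I, J ) - A( I, J ) ) / ( EPS * G( I ) ),
*        where the division by G( I ) is skipped when G( I ) = ZERO.
*        INFO is set to 1 once ERR * SQRT( EPS ) >= ONE. For example,
*        assuming EPS = 1.0D-16, G( I ) = 2.0D+0 and a difference of
*        4.0D-16, ERRI = 2.0D+0, far below the 1.0D+8 needed to fail.
*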

         info = 0

         err  = zero

         ldpa = desca( lld_ )

         ioffa = ia + ( ja + j - 2 ) * lda

         CALL pb_infog2l( ia, ja+j-1, desca, nprow, npcol, myrow, mycol,

     $                    iia, jja, iarow, iacol )

         rowrep = ( iarow.EQ.-1 )

         colrep = ( iacol.EQ.-1 )

*

         IF( mycol.EQ.iacol .OR. colrep ) THEN

*

            icurrow = iarow

            ib = desca( imb_ ) - ia + 1

            IF( ib.LE.0 )

     $         ib = ( ( -ib ) / desca( mb_ ) + 1 ) * desca( mb_ ) + ib

            ib = min( ib, m )

            in = ia + ib - 1

*

            DO 40 i = ia, in

*

               IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                  erri = abs( pa( iia+(jja-1)*ldpa ) - a( ioffa ) )/eps

                  IF( g( i-ia+1 ).NE.zero )

     $               erri = erri / g( i-ia+1 )

                  err = max( err, erri )

                  IF( err*sqrt( eps ).GE.one )

     $               info = 1

                  iia = iia + 1

               END IF

*

               ioffa = ioffa + 1

*

   40       CONTINUE

*

            icurrow = mod( icurrow+1, nprow )

*

            DO 60 i = in+1, ia+m-1, desca( mb_ )

               ib = min( ia+m-i, desca( mb_ ) )

*

               DO 50 kk = 0, ib-1

*

                  IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                     erri = abs( pa( iia+(jja-1)*ldpa )-a( ioffa ) )/eps

                     IF( g( i+kk-ia+1 ).NE.zero )

     $                  erri = erri / g( i+kk-ia+1 )

                     err = max( err, erri )

                     IF( err*sqrt( eps ).GE.one )

     $                  info = 1

                     iia = iia + 1

                  END IF

*

                  ioffa = ioffa + 1

*

   50          CONTINUE

*

               icurrow = mod( icurrow+1, nprow )

*

   60       CONTINUE

*

         END IF

*

*        If INFO = 0, all results are at least half accurate.

*

         CALL igsum2d( ictxt, 'All', ' ', 1, 1, info, 1, -1, mycol )

         CALL dgamx2d( ictxt, 'All', ' ', 1, 1, err, 1, i, j, -1, -1,

     $                 mycol )

         IF( info.NE.0 )

     $      GO TO 80

*

   70 CONTINUE

*

   80 CONTINUE

*

      RETURN

*

*     End of PDVMCH2

*

      END

      SUBROUTINE PDMMCH( ICTXT, TRANSA, TRANSB, M, N, K, ALPHA, A, IA,

     $                   JA, DESCA, B, IB, JB, DESCB, BETA, C, PC, IC,

     $                   JC, DESCC, CT, G, ERR, INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        TRANSA, TRANSB

      INTEGER            IA, IB, IC, ICTXT, INFO, JA, JB, JC, K, M, N

      DOUBLE PRECISION   ALPHA, BETA, ERR

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * ), DESCB( * ), DESCC( * )

      DOUBLE PRECISION   A( * ), B( * ), C( * ), CT( * ), G( * ),

     $                   PC( * )

*     ..

*

*  Purpose

*  =======

*

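*  PDMMCH checks the results of the computational tests: it recomputes

*  sub( C ) := alpha*op( sub( A ) )*op( sub( B ) ) + beta*sub( C ) from

*  the local copies A, B and C and compares the result with PC.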

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process MYCOL ( 0 <= MYCOL < NPCOL ) would receive

*  if these K columns were distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  TRANSA  (global input) CHARACTER*1

*          On entry, TRANSA specifies if the matrix  operand  A is to be

*          transposed.

*

*  TRANSB  (global input) CHARACTER*1

*          On entry, TRANSB specifies if the matrix  operand  B is to be

*          transposed.

*

*  M       (global input) INTEGER

*          On entry, M specifies the number of rows of C.

*

*  N       (global input) INTEGER

*          On entry, N specifies the number of columns of C.

*

*  K       (global input) INTEGER

*          On entry, K specifies the number of columns (resp. rows) of A

*          when  TRANSA = 'N'  (resp. TRANSA <> 'N')  in PxGEMM, PxSYRK,

*          PxSYR2K, PxHERK and PxHER2K.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  A       (local input) DOUBLE PRECISION array

*          On entry, A is an array of  dimension  (DESCA( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PA.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  B       (local input) DOUBLE PRECISION array

*          On entry, B is an array of  dimension  (DESCB( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PB.

*

*  IB      (global input) INTEGER

*          On entry, IB  specifies B's global row index, which points to

*          the beginning of the submatrix sub( B ).

*

*  JB      (global input) INTEGER

*          On entry, JB  specifies B's global column index, which points

*          to the beginning of the submatrix sub( B ).

*

*  DESCB   (global and local input) INTEGER array

*          On entry, DESCB  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix B.

*

*  BETA    (global input) DOUBLE PRECISION

*          On entry, BETA specifies the scalar beta.

*

*  C       (local input/local output) DOUBLE PRECISION array

*          On entry, C is an array of  dimension  (DESCC( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PC.

*

*  PC      (local input) DOUBLE PRECISION array

*          On entry, PC is an array of dimension (DESCC( LLD_ ),*). This

*          array contains the local pieces of the matrix PC.

*

*  IC      (global input) INTEGER

*          On entry, IC  specifies C's global row index, which points to

*          the beginning of the submatrix sub( C ).

*

*  JC      (global input) INTEGER

*          On entry, JC  specifies C's global column index, which points

*          to the beginning of the submatrix sub( C ).

*

*  DESCC   (global and local input) INTEGER array

*          On entry, DESCC  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix C.

*

*  CT      (workspace) DOUBLE PRECISION array

*          On entry, CT is an array of dimension at least MAX(M,N,K). CT

*          holds a copy of the current column of C.

*

*  G       (workspace) DOUBLE PRECISION array

*          On entry, G  is  an array of dimension at least MAX(M,N,K). G

*          is used to compute the gauges.

*

*  ERR     (global output) DOUBLE PRECISION

*          On exit, ERR specifies the largest error in absolute value.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO <> 0, the result is less than half accurate.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, DLEN_ = 11,

     $                   DTYPE_ = 1, CTXT_ = 2, M_ = 3, N_ = 4,

     $                   IMB_ = 5, INB_ = 6, MB_ = 7, NB_ = 8,

     $                   RSRC_ = 9, CSRC_ = 10, LLD_ = 11 )

      DOUBLE PRECISION   ZERO, ONE

      PARAMETER          ( ZERO = 0.0D+0, ONE = 1.0D+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, ROWREP, TRANA, TRANB

      INTEGER            I, IBB, ICCOL, ICROW, ICURROW, IIC, IN, IOFFA,

     $                   IOFFB, IOFFC, J, JJC, KK, LDA, LDB, LDC, LDPC,

     $                   MYCOL, MYROW, NPCOL, NPROW

      DOUBLE PRECISION   EPS, ERRI

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, dgamx2d, igsum2d, pb_infog2l

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           LSAME, PDLAMCH

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max, min, mod, sqrt

*     ..

*     .. Executable Statements ..

*

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      eps = pdlamch( ictxt, 'eps' )

*

      trana = lsame( transa, 'T' ).OR.lsame( transa, 'C' )

      tranb = lsame( transb, 'T' ).OR.lsame( transb, 'C' )

*

      lda = max( 1, desca( m_ ) )

      ldb = max( 1, descb( m_ ) )

      ldc = max( 1, descc( m_ ) )

*

*     Compute expected result in C using data in A, B and C.

*     Compute gauges in G. This part of the computation is performed

*     by every process in the grid.

*

      DO 240 j = 1, n

*
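*        The local copies are addressed as column-major vectors: entry
*        ( R, S ) of a matrix with leading dimension LD sits at offset
*        R + ( S - 1 ) * LD. As an illustration with assumed values
*        IC = 2, JC = 3, J = 1 and LDC = 10, IOFFC below evaluates to
*        2 + ( 3 + 1 - 2 ) * 10 = 22, the position of C( 2, 3 ).
*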

         ioffc = ic + ( jc + j - 2 ) * ldc

         DO 10 i = 1, m

            ct( i ) = zero

            g( i )  = zero

   10    CONTINUE

*

         IF( .NOT.trana .AND. .NOT.tranb ) THEN

            DO 30 kk = 1, k

               ioffb = ib + kk - 1 + ( jb + j - 2 ) * ldb

               DO 20 i = 1, m

                  ioffa = ia + i - 1 + ( ja + kk - 2 ) * lda

                  ct( i ) = ct( i ) + a( ioffa ) * b( ioffb )

                  g( i ) = g( i ) + abs( a( ioffa ) ) *

     $                     abs( b( ioffb ) )

   20          CONTINUE

   30       CONTINUE

         ELSE IF( trana .AND. .NOT.tranb ) THEN

            DO 50 kk = 1, k

               ioffb = ib + kk - 1 + ( jb + j - 2 ) * ldb

               DO 40 i = 1, m

                  ioffa = ia + kk - 1 + ( ja + i - 2 ) * lda

                  ct( i ) = ct( i ) + a( ioffa ) * b( ioffb )

                  g( i ) = g( i ) + abs( a( ioffa ) ) *

     $                     abs( b( ioffb ) )

   40          CONTINUE

   50       CONTINUE

         ELSE IF( .NOT.trana .AND. tranb ) THEN

            DO 70 kk = 1, k

               ioffb = ib + j - 1 + ( jb + kk - 2 ) * ldb

               DO 60 i = 1, m

                  ioffa = ia + i - 1 + ( ja + kk - 2 ) * lda

                  ct( i ) = ct( i ) + a( ioffa ) * b( ioffb )

                  g( i ) = g( i ) + abs( a( ioffa ) ) *

     $                     abs( b( ioffb ) )

   60          CONTINUE

   70       CONTINUE

         ELSE IF( trana .AND. tranb ) THEN

            DO 90 kk = 1, k

               ioffb = ib + j - 1 + ( jb + kk - 2 ) * ldb

               DO 80 i = 1, m

                  ioffa = ia + kk - 1 + ( ja + i - 2 ) * lda

                  ct( i ) = ct( i ) + a( ioffa ) * b( ioffb )

                  g( i ) = g( i ) + abs( a( ioffa ) ) *

     $                     abs( b( ioffb ) )

   80          CONTINUE

   90       CONTINUE

         END IF

*
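*        At this point, writing a(i,l) and b(l,j) for the entries of
*        op( A ) and op( B ), CT and G hold (sketch of the sums above)
*           CT( I ) = SUM_l a(i,l) * b(l,j),
*           G( I )  = SUM_l ABS( a(i,l) ) * ABS( b(l,j) ).
*        The loop below turns them into the reference value
*           ALPHA * CT( I ) + BETA * C( I, J )
*        and the gauge, an upper bound on its magnitude,
*           ABS( ALPHA ) * G( I ) + ABS( BETA ) * ABS( C( I, J ) ).
*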

         DO 200 i = 1, m

            ct( i ) = alpha*ct( i ) + beta * c( ioffc )

            g( i ) = abs( alpha )*g( i ) + abs( beta )*abs( c( ioffc ) )

            c( ioffc ) = ct( i )

            ioffc      = ioffc + 1

  200    CONTINUE

*

*        Compute the error ratio for this result.

*

         err  = zero

         info = 0

         ldpc = descc( lld_ )

         ioffc = ic + ( jc + j - 2 ) * ldc

         CALL pb_infog2l( ic, jc+j-1, descc, nprow, npcol, myrow, mycol,

     $                    iic, jjc, icrow, iccol )

         icurrow = icrow

         rowrep  = ( icrow.EQ.-1 )

         colrep  = ( iccol.EQ.-1 )

*

         IF( mycol.EQ.iccol .OR. colrep ) THEN

*
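*           IBB is the number of rows left in the block that contains
*           global row IC. A worked case with assumed values IMB_C = 2,
*           MB_C = 4 and IC = 5: IBB = 2 - 5 + 1 = -2, which is adjusted
*           to ( 2 / 4 + 1 ) * 4 - 2 = 2 (integer division), i.e. rows
*           5 and 6 finish the block spanning rows 3 to 6.
*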

            ibb = descc( imb_ ) - ic + 1

            IF( ibb.LE.0 )

     $         ibb = ( ( -ibb ) / descc( mb_ ) + 1 )*descc( mb_ ) + ibb

            ibb = min( ibb, m )

            in = ic + ibb - 1

*

            DO 210 i = ic, in

*

               IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                  erri = abs( pc( iic+(jjc-1)*ldpc ) -

     $                        c( ioffc ) ) / eps

                  IF( g( i-ic+1 ).NE.zero )

     $               erri = erri / g( i-ic+1 )

                  err = max( err, erri )

                  IF( err*sqrt( eps ).GE.one )

     $               info = 1

                  iic = iic + 1

               END IF

*

               ioffc = ioffc + 1

*

  210       CONTINUE

*

            icurrow = mod( icurrow+1, nprow )

*

            DO 230 i = in+1, ic+m-1, descc( mb_ )

               ibb = min( ic+m-i, descc( mb_ ) )

*

               DO 220 kk = 0, ibb-1

*

                  IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                     erri = abs( pc( iic+(jjc-1)*ldpc ) -

     $                           c( ioffc ) )/eps

                     IF( g( i+kk-ic+1 ).NE.zero )

     $                  erri = erri / g( i+kk-ic+1 )

                     err = max( err, erri )

                     IF( err*sqrt( eps ).GE.one )

     $                  info = 1

                     iic = iic + 1

                  END IF

*

                  ioffc = ioffc + 1

*

  220          CONTINUE

*

               icurrow = mod( icurrow+1, nprow )

*

  230       CONTINUE

*

         END IF

*

*        If INFO = 0, all results are at least half accurate.

*

         CALL igsum2d( ictxt, 'All', ' ', 1, 1, info, 1, -1, mycol )

         CALL dgamx2d( ictxt, 'All', ' ', 1, 1, err, 1, i, j, -1, -1,

     $                 mycol )

         IF( info.NE.0 )

     $      GO TO 250

*

  240 CONTINUE

*

  250 CONTINUE

*

      RETURN

*

*     End of PDMMCH

*

      END

      SUBROUTINE PDMMCH1( ICTXT, UPLO, TRANS, N, K, ALPHA, A, IA, JA,

     $                    DESCA, BETA, C, PC, IC, JC, DESCC, CT, G,

     $                    ERR, INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        TRANS, UPLO

      INTEGER            IA, IC, ICTXT, INFO, JA, JC, K, N

      DOUBLE PRECISION   ALPHA, BETA, ERR

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * ), DESCC( * )

      DOUBLE PRECISION   A( * ), C( * ), CT( * ), G( * ), PC( * )

*     ..

*

*  Purpose

*  =======

*

*  PDMMCH1 checks the results of the computational tests.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process MYCOL ( 0 <= MYCOL < NPCOL ) would receive

*  if these K columns were distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  UPLO    (global input) CHARACTER*1

*          On entry,  UPLO  specifies which part of C should contain the

*          result.

*

*  TRANS   (global input) CHARACTER*1

*          On entry,  TRANS  specifies  whether  the  matrix A has to be

*          transposed or not before computing the matrix-matrix product.

*

*  N       (global input) INTEGER

*          On entry, N specifies the order of the submatrix operand C. N

*          must be at least zero.

*

*  K       (global input) INTEGER

*          On entry, K specifies the number of columns (resp. rows) of A

*          when  TRANS = 'N'  (resp. TRANS <> 'N').  K  must be at least

*          zero.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  A       (local input) DOUBLE PRECISION array

*          On entry, A is an array of  dimension  (DESCA( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PA.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  BETA    (global input) DOUBLE PRECISION

*          On entry, BETA specifies the scalar beta.

*

*  C       (local input/local output) DOUBLE PRECISION array

*          On entry, C is an array of  dimension  (DESCC( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PC.

*

*  PC      (local input) DOUBLE PRECISION array

*          On entry, PC is an array of dimension (DESCC( LLD_ ),*). This

*          array contains the local pieces of the matrix PC.

*

*  IC      (global input) INTEGER

*          On entry, IC  specifies C's global row index, which points to

*          the beginning of the submatrix sub( C ).

*

*  JC      (global input) INTEGER

*          On entry, JC  specifies C's global column index, which points

*          to the beginning of the submatrix sub( C ).

*

*  DESCC   (global and local input) INTEGER array

*          On entry, DESCC  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix C.

*

*  CT      (workspace) DOUBLE PRECISION array

*          On entry, CT is an array of dimension at least MAX(N,K). CT

*          holds a copy of the current column of C.

*

*  G       (workspace) DOUBLE PRECISION array

*          On entry, G  is  an array of dimension at least MAX(N,K). G

*          is used to compute the gauges.

*

*  ERR     (global output) DOUBLE PRECISION

*          On exit, ERR specifies the largest error in absolute value.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO <> 0, the result is less than half accurate.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, DLEN_ = 11,

     $                   DTYPE_ = 1, CTXT_ = 2, M_ = 3, N_ = 4,

     $                   IMB_ = 5, INB_ = 6, MB_ = 7, NB_ = 8,

     $                   RSRC_ = 9, CSRC_ = 10, LLD_ = 11 )

      DOUBLE PRECISION   ZERO, ONE

      PARAMETER          ( ZERO = 0.0D+0, ONE = 1.0D+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, NOTRAN, ROWREP, TRAN, UPPER

      INTEGER            I, IBB, IBEG, ICCOL, ICROW, ICURROW, IEND, IIC,

     $                   IN, IOFFAK, IOFFAN, IOFFC, J, JJC, KK, LDA,

     $                   LDC, LDPC, MYCOL, MYROW, NPCOL, NPROW

      DOUBLE PRECISION   EPS, ERRI

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, dgamx2d, igsum2d, pb_infog2l

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           lsame, pdlamch

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max, min, mod, sqrt

*     ..

*     .. Executable Statements ..

*

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      eps = pdlamch( ictxt, 'eps' )

*

      upper  = lsame( uplo,  'U' )

      notran = lsame( trans, 'N' )

      tran   = lsame( trans, 'T' )

*

      lda = max( 1, desca( m_ ) )

      ldc = max( 1, descc( m_ ) )

*

*     Compute expected result in C using data in A and C.

*     Compute gauges in G. This part of the computation is performed

*     by every process in the grid.

*

      DO 140 j = 1, n

*

         IF( upper ) THEN

            ibeg = 1

            iend = j

         ELSE

            ibeg = j

            iend = n

         END IF

*

         DO 10 i = 1, n

            ct( i ) = zero

            g( i )  = zero

   10    CONTINUE

*

         IF( notran ) THEN

            DO 30 kk = 1, k

               ioffak = ia + j - 1 + ( ja + kk - 2 ) * lda

               DO 20 i = ibeg, iend

                  ioffan = ia + i - 1 + ( ja + kk - 2 ) * lda

                  ct( i ) = ct( i ) + a( ioffak ) * a( ioffan )

                  g( i ) = g( i ) + abs( a( ioffak ) ) *

     $                     abs( a( ioffan ) )

   20          CONTINUE

   30       CONTINUE

         ELSE IF( tran ) THEN

            DO 50 kk = 1, k

               ioffak = ia + kk - 1 + ( ja + j - 2 ) * lda

               DO 40 i = ibeg, iend

                  ioffan = ia + kk - 1 + ( ja + i - 2 ) * lda

                  ct( i ) = ct( i ) + a( ioffak ) * a( ioffan )

                  g( i ) = g( i ) + abs( a( ioffak ) ) *

     $                     abs( a( ioffan ) )

   40          CONTINUE

   50       CONTINUE

         END IF

*

         ioffc = ic + ibeg - 1 + ( jc + j - 2 ) * ldc

*

         DO 100 i = ibeg, iend

            ct( i ) = alpha*ct( i ) + beta * c( ioffc )

            g( i ) = abs( alpha )*g( i ) + abs( beta )*abs( c( ioffc ) )

            c( ioffc ) = ct( i )

            ioffc = ioffc + 1

  100    CONTINUE

*

*        Compute the error ratio for this result.
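*
*        Read off the statements below, with i the row index within
*        sub( C ): for each entry of column J owned by this process,
*        the loops evaluate
*
*           ERRI = ABS( PC( i, J ) - C( i, J ) ) / ( EPS * G( i ) )
*
*        (the division by G( i ) is skipped when G( i ) is zero) and
*        set INFO = 1 as soon as ERR * SQRT( EPS ) >= ONE, where ERR
*        is the running maximum of ERRI. As a rough worked example,
*        assuming EPS is of the order of 1.0D-16 (the exact value is
*        whatever PDLAMCH returns on the machine at hand), SQRT( EPS )
*        is about 1.0D-8, so INFO is raised once the scaled error
*        reaches about 1.0D+8, that is, once roughly half of the
*        significant digits are lost.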

*

         err  = zero

         info = 0

         ldpc = descc( lld_ )

         ioffc = ic + ( jc + j - 2 ) * ldc

         CALL pb_infog2l( ic, jc+j-1, descc, nprow, npcol, myrow, mycol,

     $                    iic, jjc, icrow, iccol )

         icurrow = icrow

         rowrep  = ( icrow.EQ.-1 )

         colrep  = ( iccol.EQ.-1 )

*

         IF( mycol.EQ.iccol .OR. colrep ) THEN

*
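*           IBB is the number of rows left in the block that contains
*           global row IC. A worked example, assuming the hypothetical
*           values IMB_C = 2 and MB_C = 3 (row blocks 1:2, 3:5, 6:8,
*           and so on), with integer division throughout:
*
*              IC = 1:  IBB = 2 - 1 + 1 = 2               (rows 1:2)
*              IC = 4:  IBB = 2 - 4 + 1 = -1 <= 0, so
*                       IBB = ( 1 / 3 + 1 ) * 3 - 1 = 2   (rows 4:5)
*              IC = 6:  IBB = 2 - 6 + 1 = -3 <= 0, so
*                       IBB = ( 3 / 3 + 1 ) * 3 - 3 = 3   (rows 6:8)
*
*           IBB is then capped at N, the number of rows of sub( C ).
*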

            ibb = descc( imb_ ) - ic + 1

            IF( ibb.LE.0 )

     $         ibb = ( ( -ibb ) / descc( mb_ ) + 1 )*descc( mb_ ) + ibb

            ibb = min( ibb, n )

            in = ic + ibb - 1

*

            DO 110 i = ic, in

*

               IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                  erri = abs( pc( iic+(jjc-1)*ldpc ) -

     $                        c( ioffc ) ) / eps

                  IF( g( i-ic+1 ).NE.zero )

     $               erri = erri / g( i-ic+1 )

                  err = max( err, erri )

                  IF( err*sqrt( eps ).GE.one )

     $               info = 1

                  iic = iic + 1

               END IF

*

               ioffc = ioffc + 1

*

  110       CONTINUE

*

            icurrow = mod( icurrow+1, nprow )

*

            DO 130 i = in+1, ic+n-1, descc( mb_ )

               ibb = min( ic+n-i, descc( mb_ ) )

*

               DO 120 kk = 0, ibb-1

*

                  IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                     erri = abs( pc( iic+(jjc-1)*ldpc ) -

     $                           c( ioffc ) )/eps

                     IF( g( i+kk-ic+1 ).NE.zero )

     $                  erri = erri / g( i+kk-ic+1 )

                     err = max( err, erri )

                     IF( err*sqrt( eps ).GE.one )

     $                  info = 1

                     iic = iic + 1

                  END IF

*

                  ioffc = ioffc + 1

*

  120          CONTINUE

*

               icurrow = mod( icurrow+1, nprow )

*

  130       CONTINUE

*

         END IF

*

*        If INFO = 0, all results are at least half accurate.

*

         CALL igsum2d( ictxt, 'All', ' ', 1, 1, info, 1, -1, mycol )

         CALL dgamx2d( ictxt, 'All', ' ', 1, 1, err, 1, i, j, -1, -1,

     $                 mycol )

         IF( info.NE.0 )

     $      GO TO 150

*

  140 CONTINUE

*

  150 CONTINUE

*

      RETURN

*

*     End of PDMMCH1

*

      END

      SUBROUTINE pdmmch2( ICTXT, UPLO, TRANS, N, K, ALPHA, A, IA, JA,

     $                    DESCA, B, IB, JB, DESCB, BETA, C, PC, IC,

     $                    JC, DESCC, CT, G, ERR, INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        TRANS, UPLO

      INTEGER            IA, IB, IC, ICTXT, INFO, JA, JB, JC, K, N

      DOUBLE PRECISION   ALPHA, BETA, ERR

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * ), DESCB( * ), DESCC( * )

      DOUBLE PRECISION   A( * ), B( * ), C( * ), CT( * ), G( * ),

     $                   PC( * )

*     ..

*

*  Purpose

*  =======

*

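*  PDMMCH2 checks the results of the computational tests of a symmetric

*  rank-2k update. The expected result

*

*     C := alpha*A*B' + alpha*B*A' + beta*C   if TRANS = 'N',

*     C := alpha*A'*B + alpha*B'*A + beta*C   if TRANS = 'T',

*

*  is computed serially on the part of C selected by UPLO and compared

*  with the distributed result PC.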

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL ( 0 <= MYCOL <

*  NPCOL ) would receive if these K columns were distributed over NPCOL

*  processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )
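*
*  A worked example of the row count, assuming the hypothetical values
*  K = 9, IA = 1, IMB_A = 2, MB_A = 3, RSRC_A = 0 and NPROW = 2. The row
*  blocks are dealt to the process rows in round-robin order starting
*  from RSRC_A:
*
*     rows 1:2 -> process row 0      rows 3:5 -> process row 1
*     rows 6:8 -> process row 0      row  9   -> process row 1
*
*  so that Lr( 1, 9 ) = 2 + 3 = 5 when MYROW = 0 and Lr( 1, 9 ) =
*  3 + 1 = 4 when MYROW = 1. The column count Lc() follows the same
*  pattern with INB_A, NB_A, CSRC_A and NPCOL.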

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  UPLO    (global input) CHARACTER*1

*          On entry,  UPLO  specifies which part of C should contain the

*          result.

*

*  TRANS   (global input) CHARACTER*1

*          On entry,  TRANS  specifies whether the matrices A and B have

*          to  be  transposed  or not before computing the matrix-matrix

*          product.

*

*  N       (global input) INTEGER

*          On entry, N specifies the order of the submatrix operand C. N

*          must be at least zero.

*

*  K       (global input) INTEGER

*          On entry, K specifies the number of columns (resp. rows) of A

*          and B when  TRANS = 'N' (resp. TRANS <> 'N').  K  must  be at

*          least zero.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  A       (local input) DOUBLE PRECISION array

*          On entry, A is an array of  dimension  (DESCA( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PA.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  B       (local input) DOUBLE PRECISION array

*          On entry, B is an array of  dimension  (DESCB( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PB.

*

*  IB      (global input) INTEGER

*          On entry, IB  specifies B's global row index, which points to

*          the beginning of the submatrix sub( B ).

*

*  JB      (global input) INTEGER

*          On entry, JB  specifies B's global column index, which points

*          to the beginning of the submatrix sub( B ).

*

*  DESCB   (global and local input) INTEGER array

*          On entry, DESCB  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix B.

*

*  BETA    (global input) DOUBLE PRECISION

*          On entry, BETA specifies the scalar beta.

*

*  C       (local input/local output) DOUBLE PRECISION array

*          On entry, C is an array of  dimension  (DESCC( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PC.

*

*  PC      (local input) DOUBLE PRECISION array

*          On entry, PC is an array of dimension (DESCC( LLD_ ),*). This

*          array contains the local pieces of the matrix PC.

*

*  IC      (global input) INTEGER

*          On entry, IC  specifies C's global row index, which points to

*          the beginning of the submatrix sub( C ).

*

*  JC      (global input) INTEGER

*          On entry, JC  specifies C's global column index, which points

*          to the beginning of the submatrix sub( C ).

*

*  DESCC   (global and local input) INTEGER array

*          On entry, DESCC  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix C.

*

*  CT      (workspace) DOUBLE PRECISION array

*          On entry, CT is an array of dimension at least MAX(M,N,K). CT

*          holds a copy of the current column of C.

*

*  G       (workspace) DOUBLE PRECISION array

*          On entry, G  is  an array of dimension at least MAX(M,N,K). G

*          is used to compute the gauges.

*

*  ERR     (global output) DOUBLE PRECISION

*          On exit, ERR specifies the largest error in absolute value.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO <> 0, the result is less than half accurate.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, DLEN_ = 11,

     $                   DTYPE_ = 1, CTXT_ = 2, M_ = 3, N_ = 4,

     $                   IMB_ = 5, INB_ = 6, MB_ = 7, NB_ = 8,

     $                   RSRC_ = 9, CSRC_ = 10, LLD_ = 11 )

      DOUBLE PRECISION   ZERO, ONE

      PARAMETER          ( ZERO = 0.0D+0, ONE = 1.0D+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, NOTRAN, ROWREP, TRAN, UPPER

      INTEGER            I, IBB, IBEG, ICCOL, ICROW, ICURROW, IEND, IIC,

     $                   IN, IOFFAK, IOFFAN, IOFFBK, IOFFBN, IOFFC, J,

     $                   JJC, KK, LDA, LDB, LDC, LDPC, MYCOL, MYROW,

     $                   NPCOL, NPROW

      DOUBLE PRECISION   EPS, ERRI

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, dgamx2d, igsum2d, pb_infog2l

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           LSAME, PDLAMCH

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max, min, mod, sqrt

*     ..

*     .. Executable Statements ..

*

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      eps = pdlamch( ictxt, 'eps' )

*

      upper = lsame( uplo, 'U' )

      notran = lsame( trans, 'N' )

      tran = lsame( trans, 'T' )

*

      lda = max( 1, desca( m_ ) )

      ldb = max( 1, descb( m_ ) )

      ldc = max( 1, descc( m_ ) )

*

*     Compute expected result in C using data in A, B and C.

*     Compute gauges in G. This part of the computation is performed

*     by every process in the grid.
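*
*     Written out for TRANS = 'N', with i restricted to the UPLO part
*     of column J of sub( C ), the loops below accumulate
*
*        CT( i ) = BETA * C( i, J ) + ALPHA * SUM over KK = 1..K of
*                  ( A( i, KK ) * B( J, KK ) + B( i, KK ) * A( J, KK ) )
*        G( i )  = ABS( BETA ) * ABS( C( i, J ) ) +
*                  ABS( ALPHA ) * SUM over KK = 1..K of
*                  ( ABS( A( i, KK ) ) * ABS( B( J, KK ) ) +
*                    ABS( B( i, KK ) ) * ABS( A( J, KK ) ) )
*
*     where indices are relative to sub( A ), sub( B ) and sub( C ).
*     For TRANS = 'T' the row and column indices of A and B swap.
*     A small worked case with the hypothetical values K = 1,
*     ALPHA = 2, BETA = 1, A( i, 1 ) = 1, B( J, 1 ) = -3,
*     B( i, 1 ) = 2, A( J, 1 ) = 1 and C( i, J ) = 5:
*
*        CT( i ) = 5 + 2 * ( 1 * (-3) + 2 * 1 ) = 3
*        G( i )  = 5 + 2 * ( 1 *   3  + 2 * 1 ) = 15
*
*     G( i ) sums the magnitudes of the terms, so it stays large even
*     when CT( i ) suffers cancellation.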

*

      DO 140 j = 1, n

*

         IF( upper ) THEN

            ibeg = 1

            iend = j

         ELSE

            ibeg = j

            iend = n

         END IF

*

         DO 10 i = 1, n

            ct( i ) = zero

            g( i )  = zero

   10    CONTINUE

*

         IF( notran ) THEN

            DO 30 kk = 1, k

               ioffak = ia + j - 1 + ( ja + kk - 2 ) * lda

               ioffbk = ib + j - 1 + ( jb + kk - 2 ) * ldb

               DO 20 i = ibeg, iend

                  ioffan = ia + i - 1 + ( ja + kk - 2 ) * lda

                  ioffbn = ib + i - 1 + ( jb + kk - 2 ) * ldb

                  ct( i ) = ct( i ) + alpha * (

     $                      a( ioffan ) * b( ioffbk ) +

     $                      b( ioffbn ) * a( ioffak ) )

                  g( i ) = g( i ) + abs( alpha ) * (

     $                     abs( a( ioffan ) ) * abs( b( ioffbk ) ) +

     $                     abs( b( ioffbn ) ) * abs( a( ioffak ) ) )

   20          CONTINUE

   30       CONTINUE

         ELSE IF( tran ) THEN

            DO 50 kk = 1, k

               ioffak = ia + kk - 1 + ( ja + j - 2 ) * lda

               ioffbk = ib + kk - 1 + ( jb + j - 2 ) * ldb

               DO 40 i = ibeg, iend

                  ioffan = ia + kk - 1 + ( ja + i - 2 ) * lda

                  ioffbn = ib + kk - 1 + ( jb + i - 2 ) * ldb

                  ct( i ) = ct( i ) + alpha * (

     $                      a( ioffan ) * b( ioffbk ) +

     $                      b( ioffbn ) * a( ioffak ) )

                  g( i ) = g( i ) + abs( alpha ) * (

     $                     abs( a( ioffan ) ) * abs( b( ioffbk ) ) +

     $                     abs( b( ioffbn ) ) * abs( a( ioffak ) ) )

   40          CONTINUE

   50       CONTINUE

         END IF

*

         ioffc = ic + ibeg - 1 + ( jc + j - 2 ) * ldc

*

         DO 100 i = ibeg, iend

            ct( i ) = ct( i ) + beta * c( ioffc )

            g( i ) = g( i ) + abs( beta )*abs( c( ioffc ) )

            c( ioffc ) = ct( i )

            ioffc = ioffc + 1

  100    CONTINUE

*

*        Compute the error ratio for this result.

*

         err  = zero

         info = 0

         ldpc = descc( lld_ )

         ioffc = ic + ( jc + j - 2 ) * ldc

         CALL pb_infog2l( ic, jc+j-1, descc, nprow, npcol, myrow, mycol,

     $                    iic, jjc, icrow, iccol )

         icurrow = icrow

         rowrep  = ( icrow.EQ.-1 )

         colrep  = ( iccol.EQ.-1 )

*

         IF( mycol.EQ.iccol .OR. colrep ) THEN

*

            ibb = descc( imb_ ) - ic + 1

            IF( ibb.LE.0 )

     $         ibb = ( ( -ibb ) / descc( mb_ ) + 1 )*descc( mb_ ) + ibb

            ibb = min( ibb, n )

            in = ic + ibb - 1

*

            DO 110 i = ic, in

*

               IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                  erri = abs( pc( iic+(jjc-1)*ldpc ) -

     $                        c( ioffc ) ) / eps

                  IF( g( i-ic+1 ).NE.zero )

     $               erri = erri / g( i-ic+1 )

                  err = max( err, erri )

                  IF( err*sqrt( eps ).GE.one )

     $               info = 1

                  iic = iic + 1

               END IF

*

               ioffc = ioffc + 1

*

  110       CONTINUE

*

            icurrow = mod( icurrow+1, nprow )

*

            DO 130 i = in+1, ic+n-1, descc( mb_ )

               ibb = min( ic+n-i, descc( mb_ ) )

*

               DO 120 kk = 0, ibb-1

*

                  IF( myrow.EQ.icurrow .OR. rowrep ) THEN

                     erri = abs( pc( iic+(jjc-1)*ldpc ) -

     $                           c( ioffc ) )/eps

                     IF( g( i+kk-ic+1 ).NE.zero )

     $                  erri = erri / g( i+kk-ic+1 )

                     err = max( err, erri )

                     IF( err*sqrt( eps ).GE.one )

     $                  info = 1

                     iic = iic + 1

                  END IF

*

                  ioffc = ioffc + 1

*

  120          CONTINUE

*

               icurrow = mod( icurrow+1, nprow )

*

  130       CONTINUE

*

         END IF

*

*        If INFO = 0, all results are at least half accurate.

*

         CALL igsum2d( ictxt, 'All', ' ', 1, 1, info, 1, -1, mycol )

         CALL dgamx2d( ictxt, 'All', ' ', 1, 1, err, 1, i, j, -1, -1,

     $                 mycol )

         IF( info.NE.0 )

     $      GO TO 150

*

  140 CONTINUE

*

  150 CONTINUE

*

      RETURN

*

*     End of PDMMCH2

*

      END

      SUBROUTINE pdmmch3( UPLO, TRANS, M, N, ALPHA, A, IA, JA, DESCA,

     $                    BETA, C, PC, IC, JC, DESCC, ERR, INFO )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        TRANS, UPLO

      INTEGER            IA, IC, INFO, JA, JC, M, N

      DOUBLE PRECISION   ALPHA, BETA, ERR

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * ), DESCC( * )

      DOUBLE PRECISION   A( * ), C( * ), PC( * )

*     ..

*

*  Purpose

*  =======

*

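*  PDMMCH3 checks the results of the computational tests of the

*  matrix-matrix addition

*

*     C := beta*C + alpha*op( A ),

*

*  where op( A ) is A if TRANS = 'N' and the transpose of A otherwise.

*  Only the part of C selected by UPLO is expected to change.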

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL ( 0 <= MYCOL <

*  NPCOL ) would receive if these K columns were distributed over NPCOL

*  processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  UPLO    (global input) CHARACTER*1

*          On entry,  UPLO  specifies which part of C should contain the

*          result.

*

*  TRANS   (global input) CHARACTER*1

*          On entry,  TRANS  specifies  whether  the  matrix A has to be

*          transposed  or not  before computing the  matrix-matrix addi-

*          tion.

*

*  M       (global input) INTEGER

*          On entry, M specifies the number of rows of C.

*

*  N       (global input) INTEGER

*          On entry, N specifies the number of columns of C.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  A       (local input) DOUBLE PRECISION array

*          On entry, A is an array of  dimension  (DESCA( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PA.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  BETA    (global input) DOUBLE PRECISION

*          On entry, BETA specifies the scalar beta.

*

*  C       (local input/local output) DOUBLE PRECISION array

*          On entry, C is an array of  dimension  (DESCC( M_ ),*).  This

*          array contains a local copy of the initial entire matrix PC.

*

*  PC      (local input) DOUBLE PRECISION array

*          On entry, PC is an array of dimension (DESCC( LLD_ ),*). This

*          array contains the local pieces of the matrix PC.

*

*  IC      (global input) INTEGER

*          On entry, IC  specifies C's global row index, which points to

*          the beginning of the submatrix sub( C ).

*

*  JC      (global input) INTEGER

*          On entry, JC  specifies C's global column index, which points

*          to the beginning of the submatrix sub( C ).

*

*  DESCC   (global and local input) INTEGER array

*          On entry, DESCC  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix C.

*

*  ERR     (global output) DOUBLE PRECISION

*          On exit, ERR specifies the largest error in absolute value.

*

*  INFO    (global output) INTEGER

*          On exit, if INFO <> 0, the result is less than half accurate.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, DLEN_ = 11,

     $                   DTYPE_ = 1, CTXT_ = 2, M_ = 3, N_ = 4,

     $                   IMB_ = 5, INB_ = 6, MB_ = 7, NB_ = 8,

     $                   RSRC_ = 9, CSRC_ = 10, LLD_ = 11 )

      DOUBLE PRECISION   ZERO

      PARAMETER          ( ZERO = 0.0D+0 )

*     ..

*     .. Local Scalars ..

      LOGICAL            COLREP, LOWER, NOTRAN, ROWREP, UPPER

      INTEGER            I, ICCOL, ICROW, ICTXT, IIC, IOFFA, IOFFC, J,

     $                   JJC, LDA, LDC, LDPC, MYCOL, MYROW, NPCOL,

     $                   NPROW

      DOUBLE PRECISION   ERR0, ERRI, PREC

*     ..

*     .. External Subroutines ..

      EXTERNAL           BLACS_GRIDINFO, DGAMX2D, IGSUM2D, PB_INFOG2L,

     $                   PDERRAXPBY

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      DOUBLE PRECISION   PDLAMCH

      EXTERNAL           LSAME, PDLAMCH

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max

*     ..

*     .. Executable Statements ..

*

      ictxt = descc( ctxt_ )

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      prec   = pdlamch( ictxt, 'eps' )

*

      upper  = lsame( uplo,  'U' )

      lower  = lsame( uplo,  'L' )

      notran = lsame( trans, 'N' )

*

*     Compute expected result in C using data in A and C. This part of

*     the computation is performed by every process in the grid.
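*
*     For every entry ( I, J ) of sub( C ) inside the part selected by
*     UPLO, PDERRAXPBY overwrites C( I, J ) with the expected value and
*     returns the acceptable bound ERRI; INFO is set to 1 when
*     ABS( PC( I, J ) - C( I, J ) ) > ERRI. Outside the selected part
*     ERRI = ZERO and C is left alone, so PC must match the original C
*     exactly there. Read off the offsets IOFFA below, the entry of A
*     paired with C( I, J ) is
*
*        TRANS = 'N':  A( IA + I - IC, JA + J - JC )
*        otherwise:    A( IA + J - JC, JA + I - IC )
*
*     For instance, with the hypothetical values IA = JA = IC = JC = 1,
*     the entry C( 3, 2 ) is checked against A( 3, 2 ) when
*     TRANS = 'N' and against A( 2, 3 ) otherwise.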

*

      info   = 0

      err    = zero

*

      lda    = max( 1, desca( m_   ) )

      ldc    = max( 1, descc( m_   ) )

      ldpc   = max( 1, descc( lld_ ) )

      rowrep = ( descc( rsrc_ ).EQ.-1 )

      colrep = ( descc( csrc_ ).EQ.-1 )

*

      IF( notran ) THEN

*

         DO 20 j = jc, jc + n - 1

*

            ioffc = ic + ( j  - 1          ) * ldc

            ioffa = ia + ( ja - 1 + j - jc ) * lda

*

            DO 10 i = ic, ic + m - 1

*

               IF( upper ) THEN

                  IF( ( j - jc ).GE.( i - ic ) ) THEN

                     CALL pderraxpby( erri, alpha, a( ioffa ), beta,

     $                                c( ioffc ), prec )

                  ELSE

                     erri = zero

                  END IF

               ELSE IF( lower ) THEN

                  IF( ( j - jc ).LE.( i - ic ) ) THEN

                     CALL pderraxpby( erri, alpha, a( ioffa ), beta,

     $                                c( ioffc ), prec )

                  ELSE

                     erri = zero

                  END IF

               ELSE

                  CALL pderraxpby( erri, alpha, a( ioffa ), beta,

     $                             c( ioffc ), prec )

               END IF

*

               CALL pb_infog2l( i, j, descc, nprow, npcol, myrow, mycol,

     $                          iic, jjc, icrow, iccol )

               IF( ( myrow.EQ.icrow .OR. rowrep ) .AND.

     $             ( mycol.EQ.iccol .OR. colrep ) ) THEN

                  err0 = abs( pc( iic+(jjc-1)*ldpc )-c( ioffc ) )

                  IF( err0.GT.erri )

     $               info = 1

                  err = max( err, err0 )

               END IF

*

               ioffa = ioffa + 1

               ioffc = ioffc + 1

*

   10       CONTINUE

*

   20    CONTINUE

*

      ELSE

*

         DO 40 j = jc, jc + n - 1

*

            ioffc = ic +              ( j  - 1 ) * ldc

            ioffa = ia + ( j - jc ) + ( ja - 1 ) * lda

*

            DO 30 i = ic, ic + m - 1

*

               IF( upper ) THEN

                  IF( ( j - jc ).GE.( i - ic ) ) THEN

                     CALL pderraxpby( erri, alpha, a( ioffa ), beta,

     $                               c( ioffc ), prec )

                  ELSE

                     erri = zero

                  END IF

               ELSE IF( lower ) THEN

                  IF( ( j - jc ).LE.( i - ic ) ) THEN

                     CALL pderraxpby( erri, alpha, a( ioffa ), beta,

     $                               c( ioffc ), prec )

                  ELSE

                     erri = zero

                  END IF

               ELSE

                  CALL pderraxpby( erri, alpha, a( ioffa ), beta,

     $                            c( ioffc ), prec )

               END IF

*

               CALL pb_infog2l( i, j, descc, nprow, npcol, myrow, mycol,

     $                          iic, jjc, icrow, iccol )

               IF( ( myrow.EQ.icrow .OR. rowrep ) .AND.

     $             ( mycol.EQ.iccol .OR. colrep ) ) THEN

                  err0 = abs( pc( iic+(jjc-1)*ldpc )-c( ioffc ) )

                  IF( err0.GT.erri )

     $               info = 1

                  err = max( err, err0 )

               END IF

*

               ioffc = ioffc + 1

               ioffa = ioffa + lda

*

   30       CONTINUE

*

   40    CONTINUE

*

      END IF

*

*     If INFO = 0, all results are at least half accurate.

*

      CALL igsum2d( ictxt, 'All', ' ', 1, 1, info, 1, -1, mycol )

      CALL dgamx2d( ictxt, 'All', ' ', 1, 1, err, 1, i, j, -1, -1,

     $              mycol )

*

      RETURN

*

*     End of PDMMCH3

*

      END

      SUBROUTINE pderraxpby( ERRBND, ALPHA, X, BETA, Y, PREC )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      DOUBLE PRECISION   ALPHA, BETA, ERRBND, PREC, X, Y

*     ..

*

*  Purpose

*  =======

*

*  PDERRAXPBY  serially  computes  y := beta*y + alpha * x and returns a

*  scaled relative acceptable error bound on the result.
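
*

*  Worked example (values chosen for illustration, not taken from the

*  test suite):  with ALPHA = 2, X = 3, BETA = -1 and Y = 4 on entry,

*  alpha*x = 6 is accumulated in the positive sum and beta*y = -4 in

*  the negative sum, each scaled by ( 1 + 2*PREC ).  On exit

*     Y      = ( -1 )*4 + 2*3 = 2,

*     ERRBND = 8*PREC * MAX( 6, 4 ) * ( 1 + 2*PREC )

*            = 48*PREC * ( 1 + 2*PREC ).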

*

*  Arguments

*  =========

*

*  ERRBND  (global output) DOUBLE PRECISION

*          On exit, ERRBND  specifies the scaled relative acceptable er-

*          ror bound.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  X       (global input) DOUBLE PRECISION

*          On entry, X  specifies the scalar x to be scaled.

*

*  BETA    (global input) DOUBLE PRECISION

*          On entry, BETA specifies the scalar beta.

*

*  Y       (global input/global output) DOUBLE PRECISION

*          On entry,  Y  specifies  the scalar y to be added. On exit, Y

*          contains the resulting scalar y.

*

*  PREC    (global input) DOUBLE PRECISION

*          On entry, PREC specifies the machine precision.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      DOUBLE PRECISION   ONE, TWO, ZERO

      PARAMETER          ( ONE = 1.0d+0, two = 2.0d+0,

     $                   zero = 0.0d+0 )

*     ..

*     .. Local Scalars ..

      DOUBLE PRECISION   ADDBND, FACT, SUMPOS, SUMNEG, TMP

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          max

*     ..

*     .. Executable Statements ..

*

      SUMPOS = zero

      sumneg = zero

      fact = one + two * prec

      addbnd = two * two * two * prec

*

      tmp = alpha * x

      IF( tmp.GE.zero ) THEN

         sumpos = sumpos + tmp * fact

      ELSE

         sumneg = sumneg - tmp * fact

      END IF

*

      tmp = beta * y

      IF( tmp.GE.zero ) THEN

         sumpos = sumpos + tmp * fact

      ELSE

         sumneg = sumneg - tmp * fact

      END IF

*

      y = ( beta * y ) + ( alpha * x )

*

      errbnd = addbnd * max( sumpos, sumneg )

*

      RETURN

*

*     End of PDERRAXPBY

*

      END

      DOUBLE PRECISION   FUNCTION pdlamch( ICTXT, CMACH )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        cmach

      INTEGER            ictxt

*     ..

*

*  Purpose

*  =======

*

*  PDLAMCH determines double precision machine parameters.

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  CMACH   (global input) CHARACTER*1

*          On entry, CMACH specifies the value to be returned by PDLAMCH

*          as follows:

*             = 'E' or 'e',   PDLAMCH := eps,

*             = 'S' or 's',   PDLAMCH := sfmin,

*             = 'B' or 'b',   PDLAMCH := base,

*             = 'P' or 'p',   PDLAMCH := eps*base,

*             = 'N' or 'n',   PDLAMCH := t,

*             = 'R' or 'r',   PDLAMCH := rnd,

*             = 'M' or 'm',   PDLAMCH := emin,

*             = 'U' or 'u',   PDLAMCH := rmin,

*             = 'L' or 'l',   PDLAMCH := emax,

*             = 'O' or 'o',   PDLAMCH := rmax,

*

*          where

*

*          eps   = relative machine precision,

*          sfmin = safe minimum, such that 1/sfmin does not overflow,

*          base  = base of the machine,

*          prec  = eps*base,

*          t     = number of (base) digits in the mantissa,

*          rnd   = 1.0 when rounding occurs in addition, 0.0 otherwise,

*          emin  = minimum exponent before (gradual) underflow,

*          rmin  = underflow threshold - base**(emin-1),

*          emax  = largest exponent before overflow,

*          rmax  = overflow threshold  - (base**emax)*(1-eps).
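
*

*          Illustration (hypothetical values):  every  process  first

*          calls DLAMCH( CMACH ) locally.  For CMACH = 'E', 'S', 'M' or

*          'U' the local values are combined with DGAMX2D (maximum in

*          absolute value), so if two processes report eps values e1

*          and e2 with 0 < e1 < e2, PDLAMCH returns e2 on both.  For

*          CMACH = 'L' or 'O' the combine is DGAMN2D (minimum in

*          absolute value).  For any other CMACH no combine is done and

*          each process returns its own local DLAMCH value.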

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      CHARACTER*1        top

      INTEGER            idumm

      DOUBLE PRECISION   temp

*     ..

*     .. External Subroutines ..

      EXTERNAL           dgamn2d, dgamx2d, pb_topget

*     ..

*     .. External Functions ..

      LOGICAL            lsame

      DOUBLE PRECISION   dlamch

      EXTERNAL           dlamch, lsame

*     ..

*     .. Executable Statements ..

*

      temp = dlamch( cmach )

      idumm = 0

*

      IF( lsame( cmach, 'E' ).OR.lsame( cmach, 'S' ).OR.

     $    lsame( cmach, 'M' ).OR.lsame( cmach, 'U' ) ) THEN

         CALL pb_topget( ictxt, 'Combine', 'All', top )

         CALL dgamx2d( ictxt, 'All', top, 1, 1, temp, 1, idumm,

     $                 idumm, -1, -1, idumm )

      ELSE IF( lsame( cmach, 'L' ).OR.lsame( cmach, 'O' ) ) THEN

         CALL pb_topget( ictxt, 'Combine', 'All', top )

         CALL dgamn2d( ictxt, 'All', top, 1, 1, temp, 1, idumm,

     $                 idumm, -1, -1, idumm )

      END IF

*

      pdlamch = temp

*

      RETURN

*

*     End of PDLAMCH

*

      END

      SUBROUTINE pdlaset( UPLO, M, N, ALPHA, BETA, A, IA, JA, DESCA )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        UPLO

      INTEGER            IA, JA, M, N

      DOUBLE PRECISION   ALPHA, BETA

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * )

      DOUBLE PRECISION   A( * )

*     ..

*

*  Purpose

*  =======

*

*  PDLASET  initializes an m by n submatrix A(IA:IA+M-1,JA:JA+N-1) deno-

*  ted  by  sub( A )  to beta on the diagonal and alpha on the offdiago-

*  nals.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the mapping between a matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process MYCOL ( 0 <= MYCOL < NPCOL ) would receive

*  if these K columns were distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )
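
*

*  Worked example (values chosen for illustration):  let K = 10, IA = 1,

*  IMB_A = MB_A = 2, RSRC_A = 0 and NPROW = 2.  The ten rows form five

*  blocks of two rows that are dealt out cyclically, so process row 0

*  holds blocks 1, 3 and 5 and process row 1 holds blocks 2 and 4:

*     Lr( 1, 10 ) = 6  for MYROW = 0,

*     Lr( 1, 10 ) = 4  for MYROW = 1.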

*

*  Arguments

*  =========

*

*  UPLO    (global input) CHARACTER*1

*          On entry, UPLO specifies the part  of  the submatrix sub( A )

*          to be set:

*             = 'L' or 'l':   Lower triangular part is set; the strictly

*                      upper triangular part of sub( A ) is not changed;

*             = 'U' or 'u':   Upper triangular part is set; the strictly

*                      lower triangular part of sub( A ) is not changed;

*             Otherwise:  All of the matrix sub( A ) is set.

*

*  M       (global input) INTEGER

*          On entry,  M  specifies the number of rows of  the  submatrix

*          sub( A ). M  must be at least zero.

*

*  N       (global input) INTEGER

*          On entry, N  specifies the number of columns of the submatrix

*          sub( A ). N must be at least zero.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry,  ALPHA  specifies the scalar alpha, i.e., the cons-

*          tant to which the offdiagonal elements are to be set.

*

*  BETA    (global input) DOUBLE PRECISION

*          On entry, BETA  specifies the scalar beta, i.e., the constant

*          to which the diagonal elements are to be set.

*

*  A       (local input/local output) DOUBLE PRECISION array

*          On entry, A is an array of dimension (LLD_A, Ka), where Ka is

*          at least Lc( 1, JA+N-1 ).  Before  entry, this array contains

*          the local entries of the matrix  A  to be  set.  On exit, the

*          leading m by n submatrix sub( A ) is set as follows:

*

*          if UPLO = 'U',  A(IA+i-1,JA+j-1) = ALPHA, 1<=i<=j-1, 1<=j<=N,

*          if UPLO = 'L',  A(IA+i-1,JA+j-1) = ALPHA, j+1<=i<=M, 1<=j<=N,

*          otherwise,      A(IA+i-1,JA+j-1) = ALPHA, 1<=i<=M,   1<=j<=N,

*                                                      and i.NE.j,

*          and, for all UPLO,  A(IA+i-1,JA+i-1) = BETA,  1<=i<=min(M,N).
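
*

*          Illustration (values chosen for this note):  with M = N = 3,

*          a = ALPHA, b = BETA and x an entry left unchanged, sub( A )

*          on exit is

*

*             UPLO = 'U'     UPLO = 'L'     otherwise

*              b  a  a        b  x  x        b  a  a

*              x  b  a        a  b  x        a  b  a

*              x  x  b        a  a  b        a  a  b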

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

*     ..

*     .. Local Scalars ..

      LOGICAL            GODOWN, GOLEFT, ISCOLREP, ISROWREP, LOWER,

     $                   UPPER

      INTEGER            IACOL, IAROW, ICTXT, IIA, IIMAX, ILOW, IMB1,

     $                   IMBLOC, INB1, INBLOC, IOFFA, IOFFD, IUPP, JJA,

     $                   JJMAX, JOFFA, JOFFD, LCMT, LCMT00, LDA, LMBLOC,

     $                   LNBLOC, LOW, M1, MB, MBLKD, MBLKS, MBLOC, MP,

     $                   MRCOL, MRROW, MYCOL, MYROW, N1, NB, NBLKD,

     $                   NBLKS, NBLOC, NPCOL, NPROW, NQ, PMB, QNB, TMP1,

     $                   UPP

*     ..

*     .. Local Arrays ..

      INTEGER            DESCA2( DLEN_ )

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, pb_ainfog2l, pb_binfo,

     $                   pb_desctrans, pb_dlaset

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      EXTERNAL           lsame

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          min

*     ..

*     .. Executable Statements ..

*

      IF( m.EQ.0 .OR. n.EQ.0 )

     $   RETURN

*

*     Convert descriptor

*

      CALL pb_desctrans( desca, desca2 )

*

*     Get grid parameters

*

      ictxt = desca2( ctxt_ )

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      CALL pb_ainfog2l( m, n, ia, ja, desca2, nprow, npcol, myrow,

     $                  mycol, imb1, inb1, mp, nq, iia, jja, iarow,

     $                  iacol, mrrow, mrcol )

*

      IF( mp.LE.0 .OR. nq.LE.0 )

     $   RETURN

*

      isrowrep = ( desca2( rsrc_ ).LT.0 )

      iscolrep = ( desca2( csrc_ ).LT.0 )

      lda      = desca2( lld_ )

*

      upper = .NOT.( lsame( uplo, 'L' ) )

      lower = .NOT.( lsame( uplo, 'U' ) )

*

      IF( ( ( lower.AND.upper ).AND.( alpha.EQ.beta ) ).OR.

     $    (   isrowrep        .AND.  iscolrep        ) ) THEN

         IF( ( mp.GT.0 ).AND.( nq.GT.0 ) )

     $      CALL pb_dlaset( uplo, mp, nq, 0, alpha, beta,

     $                      a( iia + ( jja - 1 ) * lda ), lda )

         RETURN

      END IF

*

*     Initialize LCMT00, MBLKS, NBLKS, IMBLOC, INBLOC, LMBLOC, LNBLOC,

*     ILOW, LOW, IUPP, and UPP.

*

      mb = desca2( mb_ )

      nb = desca2( nb_ )

      CALL pb_binfo( 0, mp, nq, imb1, inb1, mb, nb, mrrow, mrcol,

     $               lcmt00, mblks, nblks, imbloc, inbloc, lmbloc,

     $               lnbloc, ilow, low, iupp, upp )

*

      ioffa = iia - 1

      joffa = jja - 1

      iimax = ioffa + mp

      jjmax = joffa + nq

*

      IF( isrowrep ) THEN

         pmb = mb

      ELSE

         pmb = nprow * mb

      END IF

      IF( iscolrep ) THEN

         qnb = nb

      ELSE

         qnb = npcol * nb

      END IF

*

      m1 = mp

      n1 = nq

*

*     Handle the first block of rows or columns separately, and update

*     LCMT00, MBLKS and NBLKS.

*

      godown = ( lcmt00.GT.iupp )

      goleft = ( lcmt00.LT.ilow )

*

      IF( .NOT.godown .AND. .NOT.goleft ) THEN

*

*        LCMT00 >= ILOW && LCMT00 <= IUPP

*

         goleft = ( ( lcmt00 - ( iupp - upp + pmb ) ).LT.ilow )

         godown = .NOT.goleft

*

         CALL pb_dlaset( uplo, imbloc, inbloc, lcmt00, alpha, beta,

     $                   a( iia+joffa*lda ), lda )

         IF( godown ) THEN

            IF( upper .AND. nq.GT.inbloc )

     $         CALL pb_dlaset( 'All', imbloc, nq-inbloc, 0, alpha,

     $                         alpha, a( iia+(joffa+inbloc)*lda ), lda )

            iia = iia + imbloc

            m1  = m1 - imbloc

         ELSE

            IF( lower .AND. mp.GT.imbloc )

     $         CALL pb_dlaset( 'All', mp-imbloc, inbloc, 0, alpha,

     $                         alpha, a( iia+imbloc+joffa*lda ), lda )

            jja = jja + inbloc

            n1  = n1 - inbloc

         END IF

*

      END IF

*

      IF( godown ) THEN

*

         lcmt00 = lcmt00 - ( iupp - upp + pmb )

         mblks  = mblks - 1

         ioffa  = ioffa + imbloc

*

   10    CONTINUE

         IF( mblks.GT.0 .AND. lcmt00.GT.upp ) THEN

            lcmt00 = lcmt00 - pmb

            mblks  = mblks - 1

            ioffa  = ioffa + mb

            GO TO 10

         END IF

*

         tmp1 = min( ioffa, iimax ) - iia + 1

         IF( upper .AND. tmp1.GT.0 ) THEN

            CALL pb_dlaset( 'All', tmp1, n1, 0, alpha, alpha,

     $                      a( iia+joffa*lda ), lda )

            iia = iia + tmp1

            m1  = m1 - tmp1

         END IF

*

         IF( mblks.LE.0 )

     $      RETURN

*

         lcmt  = lcmt00

         mblkd = mblks

         ioffd = ioffa

*

         mbloc = mb

   20    CONTINUE

         IF( mblkd.GT.0 .AND. lcmt.GE.ilow ) THEN

            IF( mblkd.EQ.1 )

     $         mbloc = lmbloc

            CALL pb_dlaset( uplo, mbloc, inbloc, lcmt, alpha, beta,

     $                      a( ioffd+1+joffa*lda ), lda )

            lcmt00 = lcmt

            lcmt   = lcmt - pmb

            mblks  = mblkd

            mblkd  = mblkd - 1

            ioffa  = ioffd

            ioffd  = ioffd + mbloc

            GO TO 20

         END IF

*

         tmp1 = m1 - ioffd + iia - 1

         IF( lower .AND. tmp1.GT.0 )

     $      CALL pb_dlaset( 'ALL', tmp1, inbloc, 0, alpha, alpha,

     $                      a( ioffd+1+joffa*lda ), lda )

*

         tmp1   = ioffa - iia + 1

         m1     = m1 - tmp1

         n1     = n1 - inbloc

         lcmt00 = lcmt00 + low - ilow + qnb

         nblks  = nblks - 1

         joffa  = joffa + inbloc

*

         IF( upper .AND. tmp1.GT.0 .AND. n1.GT.0 )

     $      CALL pb_dlaset( 'ALL', tmp1, n1, 0, alpha, alpha,

     $                      a( iia+joffa*lda ), lda )

*

         iia = ioffa + 1

         jja = joffa + 1

*

      ELSE IF( goleft ) THEN

*

         lcmt00 = lcmt00 + low - ilow + qnb

         nblks  = nblks - 1

         joffa  = joffa + inbloc

*

   30    CONTINUE

         IF( nblks.GT.0 .AND. lcmt00.LT.low ) THEN

            lcmt00 = lcmt00 + qnb

            nblks  = nblks - 1

            joffa  = joffa + nb

            GO TO 30

         END IF

*

         tmp1 = min( joffa, jjmax ) - jja + 1

         IF( lower .AND. tmp1.GT.0 ) THEN

            CALL pb_dlaset( 'All', m1, tmp1, 0, alpha, alpha,

     $                      a( iia+(jja-1)*lda ), lda )

            jja = jja + tmp1

            n1  = n1 - tmp1

         END IF

*

         IF( nblks.LE.0 )

     $      RETURN

*

         lcmt  = lcmt00

         nblkd = nblks

         joffd = joffa

*

         nbloc = nb

   40    CONTINUE

         IF( nblkd.GT.0 .AND. lcmt.LE.iupp ) THEN

            IF( nblkd.EQ.1 )

     $         nbloc = lnbloc

            CALL pb_dlaset( uplo, imbloc, nbloc, lcmt, alpha, beta,

     $                      a( iia+joffd*lda ), lda )

            lcmt00 = lcmt

            lcmt   = lcmt + qnb

            nblks  = nblkd

            nblkd  = nblkd - 1

            joffa  = joffd

            joffd  = joffd + nbloc

            GO TO 40

         END IF

*

         tmp1 = n1 - joffd + jja - 1

         IF( upper .AND. tmp1.GT.0 )

     $      CALL pb_dlaset( 'All', imbloc, tmp1, 0, alpha, alpha,

     $                      a( iia+joffd*lda ), lda )

*

         tmp1   = joffa - jja + 1

         m1     = m1 - imbloc

         n1     = n1 - tmp1

         lcmt00 = lcmt00 - ( iupp - upp + pmb )

         mblks  = mblks - 1

         ioffa  = ioffa + imbloc

*

         IF( lower .AND. m1.GT.0 .AND. tmp1.GT.0 )

     $      CALL pb_dlaset( 'All', m1, tmp1, 0, alpha, alpha,

     $                      a( ioffa+1+(jja-1)*lda ), lda )

*

         iia = ioffa + 1

         jja = joffa + 1

*

      END IF

*

      nbloc = nb

   50 CONTINUE

      IF( nblks.GT.0 ) THEN

         IF( nblks.EQ.1 )

     $      nbloc = lnbloc

   60    CONTINUE

         IF( mblks.GT.0 .AND. lcmt00.GT.upp ) THEN

            lcmt00 = lcmt00 - pmb

            mblks  = mblks - 1

            ioffa  = ioffa + mb

            GO TO 60

         END IF

*

         tmp1 = min( ioffa, iimax ) - iia + 1

         IF( upper .AND. tmp1.GT.0 ) THEN

            CALL pb_dlaset( 'All', tmp1, n1, 0, alpha, alpha,

     $                      a( iia+joffa*lda ), lda )

            iia = iia + tmp1

            m1  = m1 - tmp1

         END IF

*

         IF( mblks.LE.0 )

     $      RETURN

*

         lcmt  = lcmt00

         mblkd = mblks

         ioffd = ioffa

*

         mbloc = mb

   70    CONTINUE

         IF( mblkd.GT.0 .AND. lcmt.GE.low ) THEN

            IF( mblkd.EQ.1 )

     $         mbloc = lmbloc

            CALL pb_dlaset( uplo, mbloc, nbloc, lcmt, alpha, beta,

     $                      a( ioffd+1+joffa*lda ), lda )

            lcmt00 = lcmt

            lcmt   = lcmt - pmb

            mblks  = mblkd

            mblkd  = mblkd - 1

            ioffa  = ioffd

            ioffd  = ioffd + mbloc

            GO TO 70

         END IF

*

         tmp1 = m1 - ioffd + iia - 1

         IF( lower .AND. tmp1.GT.0 )

     $      CALL pb_dlaset( 'All', tmp1, nbloc, 0, alpha, alpha,

     $                      a( ioffd+1+joffa*lda ), lda )

*

         tmp1   = min( ioffa, iimax )  - iia + 1

         m1     = m1 - tmp1

         n1     = n1 - nbloc

         lcmt00 = lcmt00 + qnb

         nblks  = nblks - 1

         joffa  = joffa + nbloc

*

         IF( upper .AND. tmp1.GT.0 .AND. n1.GT.0 )

     $      CALL pb_dlaset( 'All', tmp1, n1, 0, alpha, alpha,

     $                      a( iia+joffa*lda ), lda )

*

         iia = ioffa + 1

         jja = joffa + 1

*

         GO TO 50

*

      END IF

*

      RETURN

*

*     End of PDLASET

*

      END

      SUBROUTINE pdlascal( TYPE, M, N, ALPHA, A, IA, JA, DESCA )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        TYPE

      INTEGER            IA, JA, M, N

      DOUBLE PRECISION   ALPHA

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * )

      DOUBLE PRECISION   A( * )

*     ..

*

*  Purpose

*  =======

*

*  PDLASCAL  scales the  m by n submatrix A(IA:IA+M-1,JA:JA+N-1) denoted

*  by sub( A ) by the scalar alpha. TYPE  specifies if sub( A ) is full,

*  upper triangular, lower triangular or upper Hessenberg.

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the mapping between a matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process MYCOL ( 0 <= MYCOL < NPCOL ) would receive

*  if these K columns were distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  TYPE    (global input) CHARACTER*1

*          On entry,  TYPE  specifies the type of the input submatrix as

*          follows:

*             = 'L' or 'l':  sub( A ) is a lower triangular matrix,

*             = 'U' or 'u':  sub( A ) is an upper triangular matrix,

*             = 'H' or 'h':  sub( A ) is an upper Hessenberg matrix,

*             otherwise sub( A ) is a  full matrix.
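
*

*          Illustration (values chosen for this note):  with M = N = 3,

*          s marks an entry scaled by ALPHA and x an entry left as is.

*          The 'H' case assumes that the offset IOFFD = 1 set in the

*          code below selects the first subdiagonal.

*

*             TYPE = 'L'     TYPE = 'U'     TYPE = 'H'     otherwise

*              s  x  x        s  s  s        s  s  s        s  s  s

*              s  s  x        x  s  s        s  s  s        s  s  s

*              s  s  s        x  x  s        x  s  s        s  s  s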

*

*  M       (global input) INTEGER

*          On entry,  M  specifies the number of rows of  the  submatrix

*          sub( A ). M  must be at least zero.

*

*  N       (global input) INTEGER

*          On entry, N  specifies the number of columns of the submatrix

*          sub( A ). N  must be at least zero.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  A       (local input/local output) DOUBLE PRECISION array

*          On entry, A is an array of dimension (LLD_A, Ka), where Ka is

*          at least Lc( 1, JA+N-1 ).  Before  entry, this array contains

*          the local entries of the matrix  A.

*          On exit, the local entries of this array corresponding to the

*          entries  of  the  submatrix  sub( A )  are  overwritten  by

*          the local entries of the m by n scaled submatrix.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

*     ..

*     .. Local Scalars ..

      CHARACTER*1        UPLO

      LOGICAL            GODOWN, GOLEFT, LOWER, UPPER

      INTEGER            IACOL, IAROW, ICTXT, IIA, IIMAX, ILOW, IMB1,

     $                   IMBLOC, INB1, INBLOC, IOFFA, IOFFD, ITYPE,

     $                   IUPP, JJA, JJMAX, JOFFA, JOFFD, LCMT, LCMT00,

     $                   LDA, LMBLOC, LNBLOC, LOW, M1, MB, MBLKD, MBLKS,

     $                   MBLOC, MP, MRCOL, MRROW, MYCOL, MYROW, N1, NB,

     $                   NBLKD, NBLKS, NBLOC, NPCOL, NPROW, NQ, PMB,

     $                   QNB, TMP1, UPP

*     ..

*     .. Local Arrays ..

      INTEGER            DESCA2( DLEN_ )

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, pb_ainfog2l, pb_binfo,

     $                   pb_desctrans, pb_dlascal, pb_infog2l

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      INTEGER            PB_NUMROC

      EXTERNAL           lsame, pb_numroc

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          min

*     ..

*     .. Executable Statements ..

*

*     Convert descriptor

*

      CALL pb_desctrans( desca, desca2 )

*

*     Get grid parameters

*

      ictxt = desca2( ctxt_ )

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

*     Quick return if possible

*

      IF( m.EQ.0 .OR. n.EQ.0 )

     $   RETURN

*

      IF( lsame( TYPE, 'L' ) ) THEN

         itype = 1

         uplo  = TYPE

         upper = .false.

         lower = .true.

         ioffd = 0

      ELSE IF( lsame( TYPE, 'U' ) ) THEN

         itype = 2

         uplo  = TYPE

         upper = .true.

         lower = .false.

         ioffd = 0

      ELSE IF( lsame( TYPE, 'H' ) ) then

         itype = 3

         uplo  = 'U'

         upper = .true.

         lower = .false.

         ioffd = 1

      ELSE

         itype = 0

         uplo  = 'A'

         upper = .true.

         lower = .true.

         ioffd = 0

      END IF

*

*     Compute local indexes

*

      IF( itype.EQ.0 ) THEN

*

*        Full matrix

*

         CALL pb_infog2l( ia, ja, desca2, nprow, npcol, myrow, mycol,

     $                    iia, jja, iarow, iacol )

         mp = pb_numroc( m, ia, desca2( imb_ ), desca2( mb_ ), myrow,

     $                   desca2( rsrc_ ), nprow )

         nq = pb_numroc( n, ja, desca2( inb_ ), desca2( nb_ ), mycol,

     $                   desca2( csrc_ ), npcol )

*

         IF( mp.LE.0 .OR. nq.LE.0 )

     $      RETURN

*

         lda   = desca2( lld_ )

         ioffa = iia + ( jja - 1 ) * lda

*

         CALL pb_dlascal( 'All', mp, nq, 0, alpha, a( ioffa ), lda )

*

      ELSE

*

*        Trapezoidal matrix

*

         CALL pb_ainfog2l( m, n, ia, ja, desca2, nprow, npcol, myrow,

     $                     mycol, imb1, inb1, mp, nq, iia, jja, iarow,

     $                     iacol, mrrow, mrcol )

*

         IF( mp.LE.0 .OR. nq.LE.0 )

     $      RETURN

*

*        Initialize LCMT00, MBLKS, NBLKS, IMBLOC, INBLOC, LMBLOC,

*        LNBLOC, ILOW, LOW, IUPP, and UPP.

*

         mb  = desca2( mb_ )

         nb  = desca2( nb_ )

         lda = desca2( lld_ )

*

         CALL pb_binfo( ioffd, mp, nq, imb1, inb1, mb, nb, mrrow,

     $                  mrcol, lcmt00, mblks, nblks, imbloc, inbloc,

     $                  lmbloc, lnbloc, ilow, low, iupp, upp )

*

         m1    = mp

         n1    = nq

         ioffa = iia - 1

         joffa = jja - 1

         iimax = ioffa + mp

         jjmax = joffa + nq

*

         IF( desca2( rsrc_ ).LT.0 ) THEN

            pmb = mb

         ELSE

            pmb = nprow * mb

         END IF

         IF( desca2( csrc_ ).LT.0 ) THEN

            qnb = nb

         ELSE

            qnb = npcol * nb

         END IF

*

*        Handle the first block of rows or columns separately, and

*        update LCMT00, MBLKS and NBLKS.

*

         godown = ( lcmt00.GT.iupp )

         goleft = ( lcmt00.LT.ilow )

*

         IF( .NOT.godown .AND. .NOT.goleft ) THEN

*

*           LCMT00 >= ILOW && LCMT00 <= IUPP

*

            goleft = ( ( lcmt00 - ( iupp - upp + pmb ) ).LT.ilow )

            godown = .NOT.goleft

*

            CALL pb_dlascal( uplo, imbloc, inbloc, lcmt00, alpha,

     $                       a( iia+joffa*lda ), lda )

            IF( godown ) THEN

               IF( upper .AND. nq.GT.inbloc )

     $            CALL pb_dlascal( 'All', imbloc, nq-inbloc, 0, alpha,

     $                             a( iia+(joffa+inbloc)*lda ), lda )

               iia = iia + imbloc

               m1  = m1 - imbloc

            ELSE

               IF( lower .AND. mp.GT.imbloc )

     $            CALL pb_dlascal( 'All', mp-imbloc, inbloc, 0, alpha,

     $                             a( iia+imbloc+joffa*lda ), lda )

               jja = jja + inbloc

               n1  = n1 - inbloc

            END IF

*

         END IF

*

         IF( godown ) THEN

*

            lcmt00 = lcmt00 - ( iupp - upp + pmb )

            mblks  = mblks - 1

            ioffa  = ioffa + imbloc

*

   10       CONTINUE

            IF( mblks.GT.0 .AND. lcmt00.GT.upp ) THEN

               lcmt00 = lcmt00 - pmb

               mblks  = mblks - 1

               ioffa  = ioffa + mb

               GO TO 10

            END IF

*

            tmp1 = min( ioffa, iimax ) - iia + 1

            IF( upper .AND. tmp1.GT.0 ) THEN

               CALL pb_dlascal( 'All', tmp1, n1, 0, alpha,

     $                          a( iia+joffa*lda ), lda )

               iia = iia + tmp1

               m1  = m1 - tmp1

            END IF

*

            IF( mblks.LE.0 )

     $         RETURN

*

            lcmt  = lcmt00

            mblkd = mblks

            ioffd = ioffa

*

            mbloc = mb

   20       CONTINUE

            IF( mblkd.GT.0 .AND. lcmt.GE.ilow ) THEN

               IF( mblkd.EQ.1 )

     $            mbloc = lmbloc

               CALL pb_dlascal( uplo, mbloc, inbloc, lcmt, alpha,

     $                          a( ioffd+1+joffa*lda ), lda )

               lcmt00 = lcmt

               lcmt   = lcmt - pmb

               mblks  = mblkd

               mblkd  = mblkd - 1

               ioffa  = ioffd

               ioffd  = ioffd + mbloc

               GO TO 20

            END IF

*

            tmp1 = m1 - ioffd + iia - 1

            IF( lower .AND. tmp1.GT.0 )

     $         CALL pb_dlascal( 'All', tmp1, inbloc, 0, alpha,

     $                          a( ioffd+1+joffa*lda ), lda )

*

            tmp1   = ioffa - iia + 1

            m1     = m1 - tmp1

            n1     = n1 - inbloc

            lcmt00 = lcmt00 + low - ilow + qnb

            nblks  = nblks - 1

            joffa  = joffa + inbloc

*

            IF( upper .AND. tmp1.GT.0 .AND. n1.GT.0 )

     $         CALL pb_dlascal( 'All', tmp1, n1, 0, alpha,

     $                          a( iia+joffa*lda ), lda )

*

            iia = ioffa + 1

            jja = joffa + 1

*

         ELSE IF( goleft ) THEN

*

            lcmt00 = lcmt00 + low - ilow + qnb

            nblks  = nblks - 1

            joffa  = joffa + inbloc

*

   30       CONTINUE

            IF( nblks.GT.0 .AND. lcmt00.LT.low ) THEN

               lcmt00 = lcmt00 + qnb

               nblks  = nblks - 1

               joffa  = joffa + nb

               GO TO 30

            END IF

*

            tmp1 = min( joffa, jjmax ) - jja + 1

            IF( lower .AND. tmp1.GT.0 ) THEN

               CALL pb_dlascal( 'All', m1, tmp1, 0, alpha,

     $                          a( iia+(jja-1)*lda ), lda )

               jja = jja + tmp1

               n1  = n1 - tmp1

            END IF

*

            IF( nblks.LE.0 )

     $         RETURN

*

            lcmt  = lcmt00

            nblkd = nblks

            joffd = joffa

*

            nbloc = nb

   40       CONTINUE

            IF( nblkd.GT.0 .AND. lcmt.LE.iupp ) THEN

               IF( nblkd.EQ.1 )

     $            nbloc = lnbloc

               CALL pb_dlascal( uplo, imbloc, nbloc, lcmt, alpha,

     $                          a( iia+joffd*lda ), lda )

               lcmt00 = lcmt

               lcmt   = lcmt + qnb

               nblks  = nblkd

               nblkd  = nblkd - 1

               joffa  = joffd

               joffd  = joffd + nbloc

               GO TO 40

            END IF

*

            tmp1 = n1 - joffd + jja - 1

            IF( upper .AND. tmp1.GT.0 )

     $         CALL pb_dlascal( 'All', imbloc, tmp1, 0, alpha,

     $                          a( iia+joffd*lda ), lda )

*

            tmp1   = joffa - jja + 1

            m1     = m1 - imbloc

            n1     = n1 - tmp1

            lcmt00 = lcmt00 - ( iupp - upp + pmb )

            mblks  = mblks - 1

            ioffa  = ioffa + imbloc

*

            IF( lower .AND. m1.GT.0 .AND. tmp1.GT.0 )

     $         CALL pb_dlascal( 'All', m1, tmp1, 0, alpha,

     $                          a( ioffa+1+(jja-1)*lda ), lda )

*

            iia = ioffa + 1

            jja = joffa + 1

*

         END IF

*

         nbloc = nb

   50    CONTINUE

         IF( nblks.GT.0 ) THEN

            IF( nblks.EQ.1 )

     $         nbloc = lnbloc

   60       CONTINUE

            IF( mblks.GT.0 .AND. lcmt00.GT.upp ) THEN

               lcmt00 = lcmt00 - pmb

               mblks  = mblks - 1

               ioffa  = ioffa + mb

               GO TO 60

            END IF

*

            tmp1 = min( ioffa, iimax ) - iia + 1

            IF( upper .AND. tmp1.GT.0 ) THEN

               CALL pb_dlascal( 'All', tmp1, n1, 0, alpha,

     $                          a( iia+joffa*lda ), lda )

               iia = iia + tmp1

               m1  = m1 - tmp1

            END IF

*

            IF( mblks.LE.0 )

     $         RETURN

*

            lcmt  = lcmt00

            mblkd = mblks

            ioffd = ioffa

*

            mbloc = mb

   70       CONTINUE

            IF( mblkd.GT.0 .AND. lcmt.GE.low ) THEN

               IF( mblkd.EQ.1 )

     $            mbloc = lmbloc

               CALL pb_dlascal( uplo, mbloc, nbloc, lcmt, alpha,

     $                          a( ioffd+1+joffa*lda ), lda )

               lcmt00 = lcmt

               lcmt   = lcmt - pmb

               mblks  = mblkd

               mblkd  = mblkd - 1

               ioffa  = ioffd

               ioffd  = ioffd + mbloc

               GO TO 70

            END IF

*

            tmp1 = m1 - ioffd + iia - 1

            IF( lower .AND. tmp1.GT.0 )

     $         CALL pb_dlascal( 'All', tmp1, nbloc, 0, alpha,

     $                          a( ioffd+1+joffa*lda ), lda )

*

            tmp1   = min( ioffa, iimax )  - iia + 1

            m1     = m1 - tmp1

            n1     = n1 - nbloc

            lcmt00 = lcmt00 + qnb

            nblks  = nblks - 1

            joffa  = joffa + nbloc

*

            IF( upper .AND. tmp1.GT.0 .AND. n1.GT.0 )

     $         CALL pb_dlascal( 'All', tmp1, n1, 0, alpha,

     $                          a( iia+joffa*lda ), lda )

*

            iia = ioffa + 1

            jja = joffa + 1

*

            GO TO 50

*

         END IF

*

      END IF

*

      RETURN

*

*     End of PDLASCAL

*

      END
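*
*  Illustrative sketch, not part of the PBLAS test suite. PDLASCAL
*  addresses the local array A linearly: the entry at local row IIA
*  and local column JJA lives at A( IIA + ( JJA - 1 ) * LDA ), which
*  is ordinary column-major storage. The function name EXLIDX below
*  is hypothetical and only restates that arithmetic.
*
      INTEGER FUNCTION EXLIDX( IIA, JJA, LDA )
*
*     .. Scalar Arguments ..
      INTEGER            IIA, JJA, LDA
*     ..
*     .. Executable Statements ..
*
*     Linear index of local entry ( IIA, JJA ), leading dimension LDA
*
      EXLIDX = IIA + ( JJA - 1 ) * LDA
*
      RETURN
*
*     End of EXLIDX (hypothetical helper)
*
      END
*
*  For example, with LDA = 4 the local entry ( 2, 3 ) is stored at
*  linear index 2 + ( 3 - 1 ) * 4 = 10.
*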

      SUBROUTINE pdlagen( INPLACE, AFORM, DIAG, OFFA, M, N, IA, JA,

     $                    DESCA, IASEED, A, LDA )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      LOGICAL            inplace

      CHARACTER*1        aform, diag

      INTEGER            ia, iaseed, ja, lda, m, n, offa

*     ..

*     .. Array Arguments ..

      INTEGER            desca( * )

      DOUBLE PRECISION   A( LDA, * )

*     ..

*

*  Purpose

*  =======

*

*  PDLAGEN  generates  (or regenerates)  a  submatrix  sub( A ) denoting

*  A(IA:IA+M-1,JA:JA+N-1).

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the mapping between a matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process MYCOL ( 0 <= MYCOL < NPCOL ) would receive

*  if these K columns were distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  INPLACE (global input) LOGICAL

*          On entry, INPLACE specifies if the matrix should be generated

*          in place or not. If INPLACE is .TRUE., the local random array

*          to be generated  will start in memory at the local memory lo-

*          cation A( 1, 1 ),  otherwise it will start at the local posi-

*          tion induced by IA and JA.

*

*  AFORM   (global input) CHARACTER*1

*          On entry, AFORM specifies the type of submatrix to be genera-

*          ted as follows:

*             AFORM = 'S', sub( A ) is a symmetric matrix,

*             AFORM = 'H', sub( A ) is a Hermitian matrix,

*             AFORM = 'T', sub( A ) is overwritten  with  the  transpose

*                          of what would normally be generated,

*             AFORM = 'C', sub( A ) is overwritten  with  the  conjugate

*                          transpose  of  what would normally be genera-

*                          ted.

*             AFORM = 'N', a random submatrix is generated.

*

*  DIAG    (global input) CHARACTER*1

*          On entry, DIAG specifies if the generated submatrix is diago-

*          nally dominant or not as follows:

*             DIAG = 'D' : sub( A ) is diagonally dominant,

*             DIAG = 'N' : sub( A ) is not diagonally dominant.

*

*  OFFA    (global input) INTEGER

*          On entry, OFFA  specifies  the  offdiagonal of the underlying

*          matrix A(1:DESCA(M_),1:DESCA(N_)) of interest when the subma-

*          trix is symmetric, Hermitian or diagonally dominant. OFFA = 0

*          specifies the main diagonal,  OFFA > 0  specifies a subdiago-

*          nal,  and OFFA < 0 specifies a superdiagonal (see further de-

*          tails).

*

*  M       (global input) INTEGER

*          On entry, M specifies the global number of matrix rows of the

*          submatrix sub( A ) to be generated. M must be at least zero.

*

*  N       (global input) INTEGER

*          On entry,  N specifies the global number of matrix columns of

*          the  submatrix  sub( A )  to be generated. N must be at least

*          zero.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  IASEED  (global input) INTEGER

*          On entry, IASEED  specifies  the  seed number to generate the

*          matrix A. IASEED must be at least zero.

*

*  A       (local output) DOUBLE PRECISION array

*          On entry, A is an array of dimension (LLD_A, Ka), where Ka is

*          at least Lc( 1, JA+N-1 ).  On  exit, this array  contains the

*          local entries of the randomly generated submatrix sub( A ).

*

*  LDA     (local input) INTEGER

*          On entry,  LDA  specifies  the local leading dimension of the

*          array A. When INPLACE is .FALSE., LDA is usually DESCA(LLD_).

*          This restriction is however not enforced, and this subroutine

*          requires only that LDA >= MAX( 1, Mp ) where

*

*          Mp = PB_NUMROC( M, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW ).

*

*          PB_NUMROC  is  a ScaLAPACK tool function; MYROW, MYCOL, NPROW

*          and NPCOL  can  be determined by calling the BLACS subroutine

*          BLACS_GRIDINFO.

*

*  Further Details

*  ===============

*

*  OFFA  is  tied  to  the matrix described by  DESCA, as opposed to the

*  piece that is currently (re)generated.  This is global information,

*  independent of the distribution parameters.  Below are examples of

*  the meaning of OFFA for a global 7 by 5 matrix:

*

*  ---------------------------------------------------------------------

*  OFFA   |  0 -1 -2 -3 -4         0 -1 -2 -3 -4          0 -1 -2 -3 -4

*  -------|-------------------------------------------------------------

*         |     | OFFA=-1          |   OFFA=0                 OFFA=2

*         |     V                  V

*  0      |  .  d  .  .  .      -> d  .  .  .  .          .  .  .  .  .

*  1      |  .  .  d  .  .         .  d  .  .  .          .  .  .  .  .

*  2      |  .  .  .  d  .         .  .  d  .  .       -> d  .  .  .  .

*  3      |  .  .  .  .  d         .  .  .  d  .          .  d  .  .  .

*  4      |  .  .  .  .  .         .  .  .  .  d          .  .  d  .  .

*  5      |  .  .  .  .  .         .  .  .  .  .          .  .  .  d  .

*  6      |  .  .  .  .  .         .  .  .  .  .          .  .  .  .  d

*  ---------------------------------------------------------------------
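*
*  As a worked check (illustrative only, assuming IA = JA = 1, M = 7
*  and N = 5, so that sub( A ) is the whole matrix pictured above),
*  the executable statements of this routine compute
*  IOFFDA = JA + OFFA - IA and pass PDLADOM the diagonal length
*
*     IOFFDA >= 0 :  MIN( MAX( 0, M-IOFFDA ), N )
*     IOFFDA <  0 :  MIN( M, MAX( 0, N+IOFFDA ) )
*
*  For the three panels of the table this gives
*
*     offset -1 :  MIN( 7, MAX( 0, 5-1 ) ) = 4 entries
*     offset  0 :  MIN( MAX( 0, 7-0 ), 5 ) = 5 entries
*     offset  2 :  MIN( MAX( 0, 7-2 ), 5 ) = 5 entries
*
*  These counts match the number of d's drawn in each panel.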

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

      INTEGER            JMP_1, JMP_COL, JMP_IMBV, JMP_INBV, JMP_LEN,

     $                   JMP_MB, JMP_NB, JMP_NPIMBLOC, JMP_NPMB,

     $                   JMP_NQINBLOC, JMP_NQNB, JMP_ROW

      PARAMETER          ( JMP_1 = 1, jmp_row = 2, jmp_col = 3,

     $                   jmp_mb = 4, jmp_imbv = 5, jmp_npmb = 6,

     $                   jmp_npimbloc = 7, jmp_nb = 8, jmp_inbv = 9,

     $                   jmp_nqnb = 10, jmp_nqinbloc = 11,

     $                   jmp_len = 11 )

*     ..

*     .. Local Scalars ..

      LOGICAL            DIAGDO, SYMM, HERM, NOTRAN

      INTEGER            CSRC, I, IACOL, IAROW, ICTXT, IIA, ILOCBLK,

     $                   ILOCOFF, ILOW, IMB, IMB1, IMBLOC, IMBVIR, INB,

     $                   inb1, inbloc, inbvir, info, ioffda, itmp, iupp,

     $                   ivir, jja, jlocblk, jlocoff, jvir, lcmt00,

     $                   lmbloc, lnbloc, low, maxmn, mb, mblks, mp,

     $                   mrcol, mrrow, mycdist, mycol, myrdist, myrow,

     $                   nb, nblks, npcol, nprow, nq, nvir, rsrc, upp

      DOUBLE PRECISION   ALPHA

*     ..

*     .. Local Arrays ..

      INTEGER            DESCA2( DLEN_ ), IMULADD( 4, JMP_LEN ),

     $                   IRAN( 2 ), JMP( JMP_LEN ), MULADD0( 4 )

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, pb_ainfog2l, pb_binfo,

     $                   pb_chkmat, pb_desctrans, pb_dlagen, pb_initjmp,

     $                   pb_initmuladd, pb_jump, pb_jumpit, pb_locinfo,

     $                   pb_setlocran, pb_setran, pdladom, pxerbla

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      EXTERNAL           LSAME

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          DBLE, MAX, MIN

*     ..

*     .. Data Statements ..

      DATA               ( muladd0( i ), i = 1, 4 ) / 20077, 16838,

     $                   12345, 0 /

*     ..

*     .. Executable Statements ..

*

*     Convert descriptor

*

      CALL pb_desctrans( desca, desca2 )

*

*     Get grid parameters

*

      ictxt = desca2( ctxt_ )

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

*     Test the input parameters

*

      info = 0

      IF( nprow.EQ.-1 ) THEN

         info = -( 1000 + ctxt_ )

      ELSE

         symm   = lsame( aform, 'S' )

         herm   = lsame( aform, 'H' )

         notran = lsame( aform, 'N' )

         diagdo = lsame( diag, 'D' )

         IF( .NOT.( symm.OR.herm.OR.notran ) .AND.

     $       .NOT.( lsame( aform, 'T' )    ) .AND.

     $       .NOT.( lsame( aform, 'C' )    ) ) THEN

            info = -2

         ELSE IF( ( .NOT.diagdo ) .AND.

     $            ( .NOT.lsame( diag, 'N' ) ) ) THEN

            info = -3

         END IF

         CALL pb_chkmat( ictxt, m, 5, n, 6, ia, ja, desca2, 10, info )

      END IF

*

      IF( info.NE.0 ) THEN

         CALL pxerbla( ictxt, 'PDLAGEN', -info )

         RETURN

      END IF

*

*     Quick return if possible

*

      IF( ( m.LE.0 ).OR.( n.LE.0 ) )

     $   RETURN

*

*     Start the operations

*

      mb   = desca2( mb_   )

      nb   = desca2( nb_   )

      imb  = desca2( imb_  )

      inb  = desca2( inb_  )

      rsrc = desca2( rsrc_ )

      csrc = desca2( csrc_ )

*

*     Figure out local information about the distributed matrix operand

*

      CALL pb_ainfog2l( m, n, ia, ja, desca2, nprow, npcol, myrow,

     $                  mycol, imb1, inb1, mp, nq, iia, jja, iarow,

     $                  iacol, mrrow, mrcol )

*

*     Decide where the entries shall be stored in memory

*

      IF( inplace ) THEN

         iia = 1

         jja = 1

      END IF

*

*     Initialize LCMT00, MBLKS, NBLKS, IMBLOC, INBLOC, LMBLOC, LNBLOC,

*     ILOW, LOW, IUPP, and UPP.

*

      ioffda = ja + offa - ia

      CALL pb_binfo( ioffda, mp, nq, imb1, inb1, mb, nb, mrrow,

     $               mrcol, lcmt00, mblks, nblks, imbloc, inbloc,

     $               lmbloc, lnbloc, ilow, low, iupp, upp )

*

*     Initialize ILOCBLK, ILOCOFF, MYRDIST, JLOCBLK, JLOCOFF, MYCDIST

*     These values correspond to the square virtual underlying matrix

*     of size MAX( M_ + MAX( 0, -OFFA ), N_ + MAX( 0, OFFA ) ) used

*     to set up the random sequence. For practical purposes, the size

*     of this virtual matrix is upper bounded by M_ + N_ - 1.

*

      itmp   = max( 0, -offa )

      ivir   = ia  + itmp

      imbvir = imb + itmp

      nvir   = desca2( m_ ) + itmp

*

      CALL pb_locinfo( ivir, imbvir, mb, myrow, rsrc, nprow, ilocblk,

     $                 ilocoff, myrdist )

*

      itmp   = max( 0, offa )

      jvir   = ja  + itmp

      inbvir = inb + itmp

      nvir   = max( max( nvir, desca2( n_ ) + itmp ),

     $              desca2( m_ ) + desca2( n_ ) - 1 )

*

      CALL pb_locinfo( jvir, inbvir, nb, mycol, csrc, npcol, jlocblk,

     $                 jlocoff, mycdist )

*

      IF( symm .OR. herm .OR. notran ) THEN

*

         CALL pb_initjmp( .true., nvir, imbvir, inbvir, imbloc, inbloc,

     $                    mb, nb, rsrc, csrc, nprow, npcol, 1, jmp )

*

*        Compute constants to jump JMP( * ) numbers in the sequence

*

         CALL pb_initmuladd( muladd0, jmp, imuladd )

*

*        Compute and set the random value corresponding to A( IA, JA )

*

         CALL pb_setlocran( iaseed, ilocblk, jlocblk, ilocoff, jlocoff,

     $                      myrdist, mycdist, nprow, npcol, jmp,

     $                      imuladd, iran )

*

         CALL pb_dlagen( 'Lower', aform, a( iia, jja ), lda, lcmt00,

     $                   iran, mblks, imbloc, mb, lmbloc, nblks, inbloc,

     $                   nb, lnbloc, jmp, imuladd )

*

      END IF

*

      IF( symm .OR. herm .OR. ( .NOT. notran ) ) THEN

*

         CALL pb_initjmp( .false., nvir, imbvir, inbvir, imbloc, inbloc,

     $                    mb, nb, rsrc, csrc, nprow, npcol, 1, jmp )

*

*        Compute constants to jump JMP( * ) numbers in the sequence

*

         CALL pb_initmuladd( muladd0, jmp, imuladd )

*

*        Compute and set the random value corresponding to A( IA, JA )

*

         CALL pb_setlocran( iaseed, ilocblk, jlocblk, ilocoff, jlocoff,

     $                      myrdist, mycdist, nprow, npcol, jmp,

     $                      imuladd, iran )

*

         CALL pb_dlagen( 'Upper', aform, a( iia, jja ), lda, lcmt00,

     $                   iran, mblks, imbloc, mb, lmbloc, nblks, inbloc,

     $                   nb, lnbloc, jmp, imuladd )

*

      END IF

*

      IF( diagdo ) THEN

*

         maxmn = max( desca2( m_ ), desca2( n_ ) )

         alpha = dble( maxmn )

*

         IF( ioffda.GE.0 ) THEN

            CALL pdladom( inplace, min( max( 0, m-ioffda ), n ), alpha,

     $                    a, min( ia+ioffda, ia+m-1 ), ja, desca )

         ELSE

            CALL pdladom( inplace, min( m, max( 0, n+ioffda ) ), alpha,

     $                    a, ia, min( ja-ioffda, ja+n-1 ), desca )

         END IF

*

      END IF

*

      RETURN

*

*     End of PDLAGEN

*

      END
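*
*  Illustrative sketch, not part of the PBLAS test suite. The DATA
*  statement in PDLAGEN sets MULADD0 to 20077, 16838, 12345, 0. This
*  file does not say how those four integers are used. Assuming (an
*  interpretation only) that they are the low and high 16-bit halves
*  of a multiplier followed by those of an increment, the routine
*  below recombines them. The name EXLCG is hypothetical.
*
      SUBROUTINE EXLCG( MULT, IADD )
*
*     .. Scalar Arguments ..
      INTEGER            IADD, MULT
*     ..
*     .. Parameters ..
      INTEGER            IPOW16
      PARAMETER          ( IPOW16 = 2**16 )
*     ..
*     .. Executable Statements ..
*
*     MULT = 20077 + 16838 * 65536 = 1103515245
*
      MULT = 20077 + 16838 * IPOW16
*
*     IADD = 12345 + 0 * 65536 = 12345
*
      IADD = 12345 + 0 * IPOW16
*
      RETURN
*
*     End of EXLCG (hypothetical helper)
*
      END
*
*  Both results fit in a default 32-bit INTEGER, so the sums above do
*  not overflow.
*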

      SUBROUTINE pdladom( INPLACE, N, ALPHA, A, IA, JA, DESCA )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      LOGICAL            INPLACE

      INTEGER            IA, JA, N

      DOUBLE PRECISION   ALPHA

*     ..

*     .. Array Arguments ..

      INTEGER            DESCA( * )

      DOUBLE PRECISION   A( * )

*     ..

*

*  Purpose

*  =======

*

*  PDLADOM  adds alpha to the diagonal entries  of  an  n by n submatrix

*  sub( A ) denoting A( IA:IA+N-1, JA:JA+N-1 ).

*

*  Notes

*  =====

*

*  A description vector is associated with each 2D block-cyclically

*  distributed matrix.  This vector stores the information required to

*  establish the mapping between a matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block-cyclically distributed matrix. Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process MYCOL ( 0 <= MYCOL < NPCOL ) would receive

*  if these K columns were distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  INPLACE (global input) LOGICAL

*          On entry, INPLACE specifies whether sub( A ) is stored in

*          place.  If INPLACE is .TRUE., the local array is taken to

*          start in memory at the local memory location A( 1, 1 ),

*          otherwise it starts at the local position induced by IA

*          and JA.

*

*  N       (global input) INTEGER

*          On entry,  N  specifies  the  global  order  of the submatrix

*          sub( A ) to be modified. N must be at least zero.

*

*  ALPHA   (global input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  A       (local input/local output) DOUBLE PRECISION array

*          On entry, A is an array of dimension (LLD_A, Ka), where Ka is

*          at least Lc( 1, JA+N-1 ).  Before  entry, this array contains

*          the local entries of the matrix A. On exit, the local entries

*          of this array corresponding to the main diagonal of  sub( A )

*          have been updated.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

*     ..

*     .. Local Scalars ..

      LOGICAL            GODOWN, GOLEFT

      INTEGER            I, IACOL, IAROW, ICTXT, IIA, IJOFFA, ILOW,

     $                   IMB1, IMBLOC, INB1, INBLOC, IOFFA, IOFFD, IUPP,

     $                   JJA, JOFFA, JOFFD, LCMT, LCMT00, LDA, LDAP1,

     $                   LMBLOC, LNBLOC, LOW, MB, MBLKD, MBLKS, MBLOC,

     $                   MRCOL, MRROW, MYCOL, MYROW, NB, NBLKD, NBLKS,

     $                   NBLOC, NP, NPCOL, NPROW, NQ, PMB, QNB, UPP

      DOUBLE PRECISION   ATMP

*     ..

*     .. Local Arrays ..

      INTEGER            DESCA2( DLEN_ )

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_gridinfo, pb_ainfog2l, pb_binfo,

     $                   pb_desctrans

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          abs, max, min

*     ..

*     .. Executable Statements ..

*

*     Convert descriptor

*

      CALL pb_desctrans( desca, desca2 )

*

*     Get grid parameters

*

      ictxt = desca2( ctxt_ )

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

*

      IF( n.EQ.0 )

     $   RETURN

*

      CALL pb_ainfog2l( n, n, ia, ja, desca2, nprow, npcol, myrow,

     $                  mycol, imb1, inb1, np, nq, iia, jja, iarow,

     $                  iacol, mrrow, mrcol )

*

*     Decide where the entries shall be stored in memory

*

      IF( inplace ) THEN

         iia = 1

         jja = 1

      END IF

*

*     Initialize LCMT00, MBLKS, NBLKS, IMBLOC, INBLOC, LMBLOC, LNBLOC,

*     ILOW, LOW, IUPP, and UPP.

*

      mb = desca2( mb_ )

      nb = desca2( nb_ )

*

      CALL pb_binfo( 0, np, nq, imb1, inb1, mb, nb, mrrow, mrcol,

     $               lcmt00, mblks, nblks, imbloc, inbloc, lmbloc,

     $               lnbloc, ilow, low, iupp, upp )

*

      ioffa  = iia - 1

      joffa  = jja - 1

      lda    = desca2( lld_ )

      ldap1  = lda + 1

*

      IF( desca2( rsrc_ ).LT.0 ) THEN

         pmb = mb

      ELSE

         pmb = nprow * mb

      END IF

      IF( desca2( csrc_ ).LT.0 ) THEN

         qnb = nb

      ELSE

         qnb = npcol * nb

      END IF

*

*     Handle the first block of rows or columns separately, and update

*     LCMT00, MBLKS and NBLKS.

*

      godown = ( lcmt00.GT.iupp )

      goleft = ( lcmt00.LT.ilow )

*

      IF( .NOT.godown .AND. .NOT.goleft ) THEN

*

*        LCMT00 >= ILOW && LCMT00 <= IUPP

*
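
*        Illustration (assumed values, not taken from a test case):

*        stepping the linear index by LDAP1 = LDA + 1 in column-major

*        storage moves one row down and one column right, i.e. along

*        a diagonal.  With LDA = 4, IOFFA = JOFFA = 0 and LCMT00 = 0,

*        IJOFFA = 0 + 0 + ( 0 - 1 ) * 4 = -4, so the first iterations

*        of the loop below visit

*           I = 1:  A( -4 +  5 ) = A(  1 )   i.e. local entry (1,1)

*           I = 2:  A( -4 + 10 ) = A(  6 )   i.e. local entry (2,2)

*           I = 3:  A( -4 + 15 ) = A( 11 )   i.e. local entry (3,3)

*        and each visited entry x is replaced by ABS( x ) + ALPHA.

*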

         IF( lcmt00.GE.0 ) THEN

            ijoffa = ioffa+lcmt00 + ( joffa - 1 ) * lda

            DO 10 i = 1, min( inbloc, max( 0, imbloc - lcmt00 ) )

               atmp = a( ijoffa + i*ldap1 )

               a( ijoffa + i*ldap1 ) = abs( atmp ) + alpha

   10       CONTINUE

         ELSE

            ijoffa = ioffa + ( joffa - lcmt00 - 1 ) * lda

            DO 20 i = 1, min( imbloc, max( 0, inbloc + lcmt00 ) )

               atmp = a( ijoffa + i*ldap1 )

               a( ijoffa + i*ldap1 ) = abs( atmp ) + alpha

   20       CONTINUE

         END IF

         goleft = ( ( lcmt00 - ( iupp - upp + pmb ) ).LT.ilow )

         godown = .NOT.goleft

*

      END IF

*

      IF( godown ) THEN

*

         lcmt00 = lcmt00 - ( iupp - upp + pmb )

         mblks  = mblks - 1

         ioffa  = ioffa + imbloc

*

   30    CONTINUE

         IF( mblks.GT.0 .AND. lcmt00.GT.upp ) THEN

            lcmt00 = lcmt00 - pmb

            mblks  = mblks - 1

            ioffa  = ioffa + mb

            GO TO 30

         END IF

*

         lcmt  = lcmt00

         mblkd = mblks

         ioffd = ioffa

*

         mbloc = mb

   40    CONTINUE

         IF( mblkd.GT.0 .AND. lcmt.GE.ilow ) THEN

            IF( mblkd.EQ.1 )

     $         mbloc = lmbloc

            IF( lcmt.GE.0 ) THEN

               ijoffa = ioffd + lcmt + ( joffa - 1 ) * lda

               DO 50 i = 1, min( inbloc, max( 0, mbloc - lcmt ) )

                  atmp = a( ijoffa + i*ldap1 )

                  a( ijoffa + i*ldap1 ) = abs( atmp ) + alpha

   50          CONTINUE

            ELSE

               ijoffa = ioffd + ( joffa - lcmt - 1 ) * lda

               DO 60 i = 1, min( mbloc, max( 0, inbloc + lcmt ) )

                  atmp = a( ijoffa + i*ldap1 )

                  a( ijoffa + i*ldap1 ) = abs( atmp ) + alpha

   60          CONTINUE

            END IF

            lcmt00 = lcmt

            lcmt   = lcmt - pmb

            mblks  = mblkd

            mblkd  = mblkd - 1

            ioffa  = ioffd

            ioffd  = ioffd + mbloc

            GO TO 40

         END IF

*

         lcmt00 = lcmt00 + low - ilow + qnb

         nblks  = nblks - 1

         joffa  = joffa + inbloc

*

      ELSE IF( goleft ) THEN

*

         lcmt00 = lcmt00 + low - ilow + qnb

         nblks  = nblks - 1

         joffa  = joffa + inbloc

*

   70    CONTINUE

         IF( nblks.GT.0 .AND. lcmt00.LT.low ) THEN

            lcmt00 = lcmt00 + qnb

            nblks  = nblks - 1

            joffa  = joffa + nb

            GO TO 70

         END IF

*

         lcmt  = lcmt00

         nblkd = nblks

         joffd = joffa

*

         nbloc = nb

   80    CONTINUE

         IF( nblkd.GT.0 .AND. lcmt.LE.iupp ) THEN

            IF( nblkd.EQ.1 )

     $         nbloc = lnbloc

            IF( lcmt.GE.0 ) THEN

               ijoffa = ioffa + lcmt + ( joffd - 1 ) * lda

               DO 90 i = 1, min( nbloc, max( 0, imbloc - lcmt ) )

                  atmp = a( ijoffa + i*ldap1 )

                  a( ijoffa + i*ldap1 ) = abs( atmp ) + alpha

   90          CONTINUE

            ELSE

               ijoffa = ioffa + ( joffd - lcmt - 1 ) * lda

               DO 100 i = 1, min( imbloc, max( 0, nbloc + lcmt ) )

                  atmp = a( ijoffa + i*ldap1 )

                  a( ijoffa + i*ldap1 ) = abs( atmp ) + alpha

  100          CONTINUE

            END IF

            lcmt00 = lcmt

            lcmt   = lcmt + qnb

            nblks  = nblkd

            nblkd  = nblkd - 1

            joffa  = joffd

            joffd  = joffd + nbloc

            GO TO 80

         END IF

*

         lcmt00 = lcmt00 - ( iupp - upp + pmb )

         mblks  = mblks - 1

         ioffa  = ioffa + imbloc

*

      END IF

*

      nbloc = nb

  110 CONTINUE

      IF( nblks.GT.0 ) THEN

         IF( nblks.EQ.1 )

     $      nbloc = lnbloc

  120    CONTINUE

         IF( mblks.GT.0 .AND. lcmt00.GT.upp ) THEN

            lcmt00 = lcmt00 - pmb

            mblks  = mblks - 1

            ioffa  = ioffa + mb

            GO TO 120

         END IF

*

         lcmt  = lcmt00

         mblkd = mblks

         ioffd = ioffa

*

         mbloc = mb

  130    CONTINUE

         IF( mblkd.GT.0 .AND. lcmt.GE.low ) THEN

            IF( mblkd.EQ.1 )

     $         mbloc = lmbloc

            IF( lcmt.GE.0 ) THEN

               ijoffa = ioffd + lcmt + ( joffa - 1 ) * lda

               DO 140 i = 1, min( nbloc, max( 0, mbloc - lcmt ) )

                  atmp = a( ijoffa + i*ldap1 )

                  a( ijoffa + i*ldap1 ) = abs( atmp ) + alpha

  140          CONTINUE

            ELSE

               ijoffa = ioffd + ( joffa - lcmt - 1 ) * lda

               DO 150 i = 1, min( mbloc, max( 0, nbloc + lcmt ) )

                  atmp = a( ijoffa + i*ldap1 )

                  a( ijoffa + i*ldap1 ) = abs( atmp ) + alpha

  150          CONTINUE

            END IF

            lcmt00 = lcmt

            lcmt   = lcmt - pmb

            mblks  = mblkd

            mblkd  = mblkd - 1

            ioffa  = ioffd

            ioffd  = ioffd + mbloc

            GO TO 130

         END IF

*

         lcmt00 = lcmt00 + qnb

         nblks  = nblks - 1

         joffa  = joffa + nbloc

         GO TO 110

*

      END IF

*

      RETURN

*

*     End of PDLADOM

*

      END

      SUBROUTINE pb_pdlaprnt( M, N, A, IA, JA, DESCA, IRPRNT, ICPRNT,

     $                        CMATNM, NOUT, WORK )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            IA, ICPRNT, IRPRNT, JA, M, N, NOUT

*     ..

*     .. Array Arguments ..

      CHARACTER*(*)      CMATNM

      INTEGER            DESCA( * )

      DOUBLE PRECISION   A( * ), WORK( * )

*     ..

*

*  Purpose

*  =======

*

*  PB_PDLAPRNT  prints to the standard output a submatrix sub( A ) deno-

*  ting A(IA:IA+M-1,JA:JA+N-1). The local pieces are sent and printed by

*  the process of coordinates (IRPRNT, ICPRNT).

*

*  Notes

*  =====

*

*  A description  vector  is associated with each 2D block-cyclicly dis-

*  tributed matrix.  This  vector  stores  the  information  required to

*  establish the  mapping  between a  matrix entry and its corresponding

*  process and memory location.

*

*  In  the  following  comments,   the character _  should  be  read  as

*  "of  the  distributed  matrix".  Let  A  be a generic term for any 2D

*  block cyclicly distributed matrix.  Its description vector is DESCA:

*

*  NOTATION         STORED IN       EXPLANATION

*  ---------------- --------------- ------------------------------------

*  DTYPE_A (global) DESCA( DTYPE_ ) The descriptor type.

*  CTXT_A  (global) DESCA( CTXT_  ) The BLACS context handle, indicating

*                                   the NPROW x NPCOL BLACS process grid

*                                   A  is distributed over.  The context

*                                   itself  is  global,  but  the handle

*                                   (the integer value) may vary.

*  M_A     (global) DESCA( M_     ) The  number of rows in the distribu-

*                                   ted matrix A, M_A >= 0.

*  N_A     (global) DESCA( N_     ) The number of columns in the distri-

*                                   buted matrix A, N_A >= 0.

*  IMB_A   (global) DESCA( IMB_   ) The number of rows of the upper left

*                                   block of the matrix A, IMB_A > 0.

*  INB_A   (global) DESCA( INB_   ) The  number  of columns of the upper

*                                   left   block   of   the   matrix  A,

*                                   INB_A > 0.

*  MB_A    (global) DESCA( MB_    ) The blocking factor used to  distri-

*                                   bute the last  M_A-IMB_A rows of  A,

*                                   MB_A > 0.

*  NB_A    (global) DESCA( NB_    ) The blocking factor used to  distri-

*                                   bute the last  N_A-INB_A  columns of

*                                   A, NB_A > 0.

*  RSRC_A  (global) DESCA( RSRC_  ) The process row over which the first

*                                   row of the matrix  A is distributed,

*                                   NPROW > RSRC_A >= 0.

*  CSRC_A  (global) DESCA( CSRC_  ) The  process  column  over which the

*                                   first  column of  A  is distributed.

*                                   NPCOL > CSRC_A >= 0.

*  LLD_A   (local)  DESCA( LLD_   ) The  leading  dimension of the local

*                                   array  storing  the  local blocks of

*                                   the distributed matrix A,

*                                   IF( Lc( 1, N_A ) > 0 )

*                                      LLD_A >= MAX( 1, Lr( 1, M_A ) )

*                                   ELSE

*                                      LLD_A >= 1.

*

*  Let K be the number of rows of a matrix A starting at the global

*  index IA, i.e., A( IA:IA+K-1, : ). Lr( IA, K ) denotes the number of

*  rows that the process of row coordinate MYROW ( 0 <= MYROW < NPROW )

*  would receive if these K rows were distributed over NPROW processes.

*  If K is the number of columns of a matrix A starting at the global

*  index JA, i.e., A( :, JA:JA+K-1 ), Lc( JA, K ) denotes the number of

*  columns that the process of column coordinate MYCOL

*  ( 0 <= MYCOL < NPCOL ) would receive if these K columns were

*  distributed over NPCOL processes.

*

*  The values of Lr() and Lc() may be determined via a call to the func-

*  tion PB_NUMROC:

*  Lr( IA, K ) = PB_NUMROC( K, IA, IMB_A, MB_A, MYROW, RSRC_A, NPROW )

*  Lc( JA, K ) = PB_NUMROC( K, JA, INB_A, NB_A, MYCOL, CSRC_A, NPCOL )

*

*  Arguments

*  =========

*

*  M       (global input) INTEGER

*          On entry,  M  specifies the number of rows of  the  submatrix

*          sub( A ). M  must be at least zero.

*

*  N       (global input) INTEGER

*          On entry, N  specifies the number of columns of the submatrix

*          sub( A ). N must be at least zero.

*

*  A       (local input) DOUBLE PRECISION array

*          On entry, A is an array of dimension (LLD_A, Ka), where Ka is

*          at least Lc( 1, JA+N-1 ).  Before  entry, this array contains

*          the local entries of the matrix A.

*

*  IA      (global input) INTEGER

*          On entry, IA  specifies A's global row index, which points to

*          the beginning of the submatrix sub( A ).

*

*  JA      (global input) INTEGER

*          On entry, JA  specifies A's global column index, which points

*          to the beginning of the submatrix sub( A ).

*

*  DESCA   (global and local input) INTEGER array

*          On entry, DESCA  is an integer array of dimension DLEN_. This

*          is the array descriptor for the matrix A.

*

*  IRPRNT  (global input) INTEGER

*          On entry, IRPRNT specifies the row index of the printing pro-

*          cess.

*

*  ICPRNT  (global input) INTEGER

*          On entry, ICPRNT specifies the  column  index of the printing

*          process.

*

*  CMATNM  (global input) CHARACTER*(*)

*          On entry, CMATNM is the name of the matrix to be printed.

*

*  NOUT    (global input) INTEGER

*          On entry, NOUT specifies the output unit number. When NOUT is

*          equal to 6, the submatrix is printed on the screen.

*

*  WORK    (local workspace) DOUBLE PRECISION array

*          On entry, WORK is a work array of dimension at least equal to

*          MAX( IMB_A, MB_A ).

*
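
*  A minimal calling sketch, given as an illustration only (the matrix

*  name 'A', the printing process (0,0) and the output unit 6 are

*  assumed values; WORK is assumed to hold MAX( IMB_A, MB_A ) entries):

*

*     CALL PB_PDLAPRNT( M, N, A, IA, JA, DESCA, 0, 0, 'A', 6, WORK )

*

*  Every process of the grid should make this call, since the routine

*  exchanges data with the printing process and synchronizes on the

*  BLACS context. Each entry is written on its own line as the matrix

*  name, the global indices ( I, J ) and the value.

*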

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

*     ..

*     .. Local Scalars ..

      INTEGER            MYCOL, MYROW, NPCOL, NPROW, PCOL, PROW

*     ..

*     .. Local Arrays ..

      INTEGER            DESCA2( DLEN_ )

*     ..

*     .. External Subroutines ..

      EXTERNAL           BLACS_GRIDINFO, PB_DESCTRANS, PB_PDLAPRN2

*     ..

*     .. Executable Statements ..

*

*     Quick return if possible

*

      IF( ( m.LE.0 ).OR.( n.LE.0 ) )

     $   RETURN

*

*     Convert descriptor

*

      CALL pb_desctrans( desca, desca2 )

*

      CALL blacs_gridinfo( desca2( ctxt_ ), nprow, npcol, myrow, mycol )

*

      IF( desca2( rsrc_ ).GE.0 ) THEN

         IF( desca2( csrc_ ).GE.0 ) THEN

            CALL pb_pdlaprn2( m, n, a, ia, ja, desca2, irprnt, icprnt,

     $                        cmatnm, nout, desca2( rsrc_ ),

     $                        desca2( csrc_ ), work )

         ELSE

            DO 10 pcol = 0, npcol - 1

               IF( ( myrow.EQ.irprnt ).AND.( mycol.EQ.icprnt ) )

     $            WRITE( nout, * ) 'Column-replicated array -- ' ,

     $                             'copy in process column: ', pcol

               CALL pb_pdlaprn2( m, n, a, ia, ja, desca2, irprnt,

     $                           icprnt, cmatnm, nout, desca2( rsrc_ ),

     $                           pcol, work )

   10       CONTINUE

         END IF

      ELSE

         IF( desca2( csrc_ ).GE.0 ) THEN

            DO 20 prow = 0, nprow - 1

               IF( ( myrow.EQ.irprnt ).AND.( mycol.EQ.icprnt ) )

     $            WRITE( nout, * ) 'Row-replicated array -- ' ,

     $                             'copy in process row: ', prow

               CALL pb_pdlaprn2( m, n, a, ia, ja, desca2, irprnt,

     $                           icprnt, cmatnm, nout, prow,

     $                           desca2( csrc_ ), work )

   20       CONTINUE

         ELSE

            DO 40 prow = 0, nprow - 1

               DO 30 pcol = 0, npcol - 1

                  IF( ( myrow.EQ.irprnt ).AND.( mycol.EQ.icprnt ) )

     $               WRITE( nout, * ) 'Replicated array -- ' ,

     $                      'copy in process (', prow, ',', pcol, ')'

                  CALL pb_pdlaprn2( m, n, a, ia, ja, desca2, irprnt,

     $                              icprnt, cmatnm, nout, prow, pcol,

     $                              work )

   30          CONTINUE

   40       CONTINUE

         END IF

      END IF

*

      RETURN

*

*     End of PB_PDLAPRNT

*

      END

      SUBROUTINE pb_pdlaprn2( M, N, A, IA, JA, DESCA, IRPRNT, ICPRNT,

     $                        CMATNM, NOUT, PROW, PCOL, WORK )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            IA, ICPRNT, IRPRNT, JA, M, N, NOUT, PCOL, PROW

*     ..

*     .. Array Arguments ..

      CHARACTER*(*)      CMATNM

      INTEGER            DESCA( * )

      DOUBLE PRECISION   A( * ), WORK( * )

*     ..

*

*     .. Parameters ..

      INTEGER            BLOCK_CYCLIC_2D_INB, CSRC_, CTXT_, DLEN_,

     $                   DTYPE_, IMB_, INB_, LLD_, MB_, M_, NB_, N_,

     $                   RSRC_

      PARAMETER          ( BLOCK_CYCLIC_2D_INB = 2, dlen_ = 11,

     $                   dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4,

     $                   imb_ = 5, inb_ = 6, mb_ = 7, nb_ = 8,

     $                   rsrc_ = 9, csrc_ = 10, lld_ = 11 )

*     ..

*     .. Local Scalars ..

      LOGICAL            AISCOLREP, AISROWREP

      INTEGER            H, I, IACOL, IAROW, IB, ICTXT, ICURCOL,

     $                   ICURROW, II, IIA, IN, J, JB, JJ, JJA, JN, K,

     $                   LDA, LDW, MYCOL, MYROW, NPCOL, NPROW

*     ..

*     .. External Subroutines ..

      EXTERNAL           blacs_barrier, blacs_gridinfo, dgerv2d,

     $                   dgesd2d, pb_infog2l

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          max, min, mod

*     ..

*     .. Executable Statements ..

*

*     Get grid parameters

*

      ictxt = desca( ctxt_ )

      CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )

      CALL pb_infog2l( ia, ja, desca, nprow, npcol, myrow, mycol,

     $                 iia, jja, iarow, iacol )

      ii = iia

      jj = jja

      IF( desca( rsrc_ ).LT.0 ) THEN

         aisrowrep = .true.

         iarow     = prow

         icurrow   = prow

      ELSE

         aisrowrep = .false.

         icurrow   = iarow

      END IF

      IF( desca( csrc_ ).LT.0 ) THEN

         aiscolrep = .true.

         iacol     = pcol

         icurcol   = pcol

      ELSE

         aiscolrep = .false.

         icurcol   = iacol

      END IF

      lda = desca( lld_ )

      ldw = max( desca( imb_ ), desca( mb_ ) )

*

*     Handle the first block of columns separately

*
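
*     Illustration (assumed values): with INB_A = 4, NB_A = 3 and

*     JA = 6, the column blocks are 1:4, 5:7, 8:10, ... and JA lies in

*     the block 5:7.  Then JB = 4 - 6 + 1 = -1 <= 0, so

*     JB = ( 1 / 3 + 1 ) * 3 + ( -1 ) = 2  (integer division), i.e.

*     the two columns 6:7, and JN = JA + JB - 1 = 7, provided N >= 2.

*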

      jb = desca( inb_ ) - ja + 1

      IF( jb.LE.0 )

     $   jb = ( (-jb) / desca( nb_ ) + 1 ) * desca( nb_ ) + jb

      jb = min( jb, n )

      jn = ja+jb-1

      DO 60 h = 0, jb-1

         ib = desca( imb_ ) - ia + 1

         IF( ib.LE.0 )

     $      ib = ( (-ib) / desca( mb_ ) + 1 ) * desca( mb_ ) + ib

         ib = min( ib, m )

         in = ia+ib-1

         IF( icurrow.EQ.irprnt .AND. icurcol.EQ.icprnt ) THEN

            IF( myrow.EQ.irprnt .AND. mycol.EQ.icprnt ) THEN

               DO 10 k = 0, ib-1

                  WRITE( nout, fmt = 9999 )

     $                   cmatnm, ia+k, ja+h, a( ii+k+(jj+h-1)*lda )

   10          CONTINUE

            END IF

         ELSE

            IF( myrow.EQ.icurrow .AND. mycol.EQ.icurcol ) THEN

               CALL dgesd2d( ictxt, ib, 1, a( ii+(jj+h-1)*lda ), lda,

     $                       irprnt, icprnt )

            ELSE IF( myrow.EQ.irprnt .AND. mycol.EQ.icprnt ) THEN

               CALL dgerv2d( ictxt, ib, 1, work, ldw, icurrow, icurcol )

               DO 20 k = 1, ib

                  WRITE( nout, fmt = 9999 )

     $                   cmatnm, ia+k-1, ja+h, work( k )

   20          CONTINUE

            END IF

         END IF

         IF( myrow.EQ.icurrow )

     $      ii = ii + ib

         IF( .NOT.aisrowrep )

     $      icurrow = mod( icurrow+1, nprow )

         CALL blacs_barrier( ictxt, 'All' )

*

*        Loop over remaining blocks of rows

*

         DO 50 i = in+1, ia+m-1, desca( mb_ )

            ib = min( desca( mb_ ), ia+m-i )

            IF( icurrow.EQ.irprnt .AND. icurcol.EQ.icprnt ) THEN

               IF( myrow.EQ.irprnt .AND. mycol.EQ.icprnt ) THEN

                  DO 30 k = 0, ib-1

                     WRITE( nout, fmt = 9999 )

     $                      cmatnm, i+k, ja+h, a( ii+k+(jj+h-1)*lda )

   30             CONTINUE

               END IF

            ELSE

               IF( myrow.EQ.icurrow .AND. mycol.EQ.icurcol ) THEN

                  CALL dgesd2d( ictxt, ib, 1, a( ii+(jj+h-1)*lda ),

     $                          lda, irprnt, icprnt )

               ELSE IF( myrow.EQ.irprnt .AND. mycol.EQ.icprnt ) THEN

                  CALL dgerv2d( ictxt, ib, 1, work, ldw, icurrow,

     $                          icurcol )

                  DO 40 k = 1, ib

                     WRITE( nout, fmt = 9999 )

     $                      cmatnm, i+k-1, ja+h, work( k )

   40             CONTINUE

               END IF

            END IF

            IF( myrow.EQ.icurrow )

     $         ii = ii + ib

            IF( .NOT.aisrowrep )

     $         icurrow = mod( icurrow+1, nprow )

            CALL blacs_barrier( ictxt, 'All' )

   50    CONTINUE

*

         ii = iia

         icurrow = iarow

   60 CONTINUE

*

      IF( mycol.EQ.icurcol )

     $   jj = jj + jb

      IF( .NOT.aiscolrep )

     $   icurcol = mod( icurcol+1, npcol )

      CALL blacs_barrier( ictxt, 'All' )

*

*     Loop over remaining column blocks

*

      DO 130 j = jn+1, ja+n-1, desca( nb_ )

         jb = min(  desca( nb_ ), ja+n-j )

         DO 120 h = 0, jb-1

            ib = desca( imb_ )-ia+1

            IF( ib.LE.0 )

     $         ib = ( (-ib) / desca( mb_ ) + 1 ) * desca( mb_ ) + ib

            ib = min( ib, m )

            in = ia+ib-1

            IF( icurrow.EQ.irprnt .AND. icurcol.EQ.icprnt ) THEN

               IF( myrow.EQ.irprnt .AND. mycol.EQ.icprnt ) THEN

                  DO 70 k = 0, ib-1

                     WRITE( nout, fmt = 9999 )

     $                      cmatnm, ia+k, j+h, a( ii+k+(jj+h-1)*lda )

   70             CONTINUE

               END IF

            ELSE

               IF( myrow.EQ.icurrow .AND. mycol.EQ.icurcol ) THEN

                  CALL dgesd2d( ictxt, ib, 1, a( ii+(jj+h-1)*lda ),

     $                          lda, irprnt, icprnt )

               ELSE IF( myrow.EQ.irprnt .AND. mycol.EQ.icprnt ) THEN

                  CALL dgerv2d( ictxt, ib, 1, work, ldw, icurrow,

     $                          icurcol )

                  DO 80 k = 1, ib

                     WRITE( nout, fmt = 9999 )

     $                      cmatnm, ia+k-1, j+h, work( k )

   80             CONTINUE

               END IF

            END IF

            IF( myrow.EQ.icurrow )

     $         ii = ii + ib

            IF( .NOT.aisrowrep )

     $         icurrow = mod( icurrow+1, nprow )

            CALL blacs_barrier( ictxt, 'All' )

*

*           Loop over remaining blocks of rows

*

            DO 110 i = in+1, ia+m-1, desca( mb_ )

               ib = min( desca( mb_ ), ia+m-i )

               IF( icurrow.EQ.irprnt .AND. icurcol.EQ.icprnt ) THEN

                  IF( myrow.EQ.irprnt .AND. mycol.EQ.icprnt ) THEN

                     DO 90 k = 0, ib-1

                        WRITE( nout, fmt = 9999 )

     $                         cmatnm, i+k, j+h, a( ii+k+(jj+h-1)*lda )

   90                CONTINUE

                  END IF

               ELSE

                  IF( myrow.EQ.icurrow .AND. mycol.EQ.icurcol ) THEN

                     CALL dgesd2d( ictxt, ib, 1, a( ii+(jj+h-1)*lda ),

     $                             lda, irprnt, icprnt )

                  ELSE IF( myrow.EQ.irprnt .AND. mycol.EQ.icprnt ) THEN

                     CALL dgerv2d( ictxt, ib, 1, work, ldw, icurrow,

     $                             icurcol )

                     DO 100 k = 1, ib

                        WRITE( nout, fmt = 9999 )

     $                         cmatnm, i+k-1, j+h, work( k )

  100                CONTINUE

                  END IF

               END IF

               IF( myrow.EQ.icurrow )

     $            ii = ii + ib

               IF( .NOT.aisrowrep )

     $            icurrow = mod( icurrow+1, nprow )

               CALL blacs_barrier( ictxt, 'All' )

  110       CONTINUE

*

            ii = iia

            icurrow = iarow

  120    CONTINUE

*

         IF( mycol.EQ.icurcol )

     $      jj = jj + jb

         IF( .NOT.aiscolrep )

     $      icurcol = mod( icurcol+1, npcol )

         CALL blacs_barrier( ictxt, 'All' )

*

  130 CONTINUE

*

 9999 FORMAT( 1x, a, '(', i6, ',', i6, ')=', d30.18 )

*

      RETURN

*

*     End of PB_PDLAPRN2

*

      END

      SUBROUTINE pb_dfillpad( ICTXT, M, N, A, LDA, IPRE, IPOST, CHKVAL )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            ICTXT, IPOST, IPRE, LDA, M, N

      DOUBLE PRECISION   CHKVAL

*     ..

*     .. Array Arguments ..

      DOUBLE PRECISION   A( * )

*     ..

*

*  Purpose

*  =======

*

*  PB_DFILLPAD surrounds a two-dimensional local array with guard zones

*  initialized to the value CHKVAL. The user may later call the routine

*  PB_DCHEKPAD to discover whether a guard zone has been violated. There

*  are three guard zones. The first is a buffer of size IPRE that comes

*  before the start of the array. The second is a buffer of size IPOST

*  that comes after the end of the array to be padded. Finally, there is

*  a guard zone inside every column of the array to be padded, in the

*  elements A( M+1:LDA, J ).

*
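
*  As an illustration (values assumed for this example), let IPRE = 2,

*  M = 2, N = 2, LDA = 3 and IPOST = 2, so that A must hold at least

*  IPRE + LDA*N + IPOST = 10 entries.  Writing c for CHKVAL and x for

*  an entry of the M-by-N array, A looks as follows after the call:

*

*     index :   1  2 |  3  4  5 |  6  7  8 |  9 10

*     value :   c  c |  x  x  c |  x  x  c |  c  c

*               pre  |  col. 1  |  col. 2  |  post

*

*  The entries 5 and 8 form the LDA-M gap of columns 1 and 2.

*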

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  M       (local input) INTEGER

*          On entry, M  specifies the number of rows in the local  array

*          A.  M must be at least zero.

*

*  N       (local input) INTEGER

*          On entry, N  specifies the number of columns in the local ar-

*          ray A. N must be at least zero.

*

*  A       (local input/local output) DOUBLE PRECISION array

*          On entry,  A  is an array of dimension (LDA,N). On exit, this

*          array is the padded array.

*

*  LDA     (local input) INTEGER

*          On entry,  LDA  specifies  the leading dimension of the local

*          array to be padded. LDA must be at least MAX( 1, M ).

*

*  IPRE    (local input) INTEGER

*          On entry, IPRE specifies the size of  the  guard zone  to put

*          before the start of the padded array.

*

*  IPOST   (local input) INTEGER

*          On entry, IPOST specifies the size of the  guard zone  to put

*          after the end of the padded array.

*

*  CHKVAL  (local input) DOUBLE PRECISION

*          On entry, CHKVAL specifies the value to pad the array with.

*

*  -- Written on April 1, 1998 by

*     R. Clint Whaley, University of Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER            I, J, K

*     ..

*     .. Executable Statements ..

*

*     Put check buffer in front of A

*

      IF( ipre.GT.0 ) THEN

         DO 10 i = 1, ipre

            a( i ) = chkval

   10    CONTINUE

      ELSE

         WRITE( *, fmt = '(A)' )

     $          'WARNING no pre-guardzone in PB_DFILLPAD'

      END IF

*

*     Put check buffer in back of A

*

      IF( ipost.GT.0 ) THEN

         j = ipre+lda*n+1

         DO 20 i = j, j+ipost-1

            a( i ) = chkval

   20    CONTINUE

      ELSE

         WRITE( *, fmt = '(A)' )

     $          'WARNING no post-guardzone in PB_DFILLPAD'

      END IF

*

*     Put check buffer in all (LDA-M) gaps

*

      IF( lda.GT.m ) THEN

         k = ipre + m + 1

         DO 40 j = 1, n

            DO 30 i = k, k + ( lda - m ) - 1

               a( i ) = chkval

   30       CONTINUE

            k = k + lda

   40    CONTINUE

      END IF

*

      RETURN

*

*     End of PB_DFILLPAD

*

      END

      SUBROUTINE pb_dchekpad( ICTXT, MESS, M, N, A, LDA, IPRE, IPOST,

     $                        CHKVAL )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            ICTXT, IPOST, IPRE, LDA, M, N

      DOUBLE PRECISION   CHKVAL

*     ..

*     .. Array Arguments ..

      CHARACTER*(*)      MESS

      DOUBLE PRECISION   A( * )

*     ..

*

*  Purpose

*  =======

*

*  PB_DCHEKPAD checks that the padding around a local array has not been

*  overwritten since the call to PB_DFILLPAD.  Three types of errors are

*  reported:

*

*  1) Overwrite in pre-guardzone.  This indicates a memory overwrite has

*  occurred in the  first  IPRE  elements which form a buffer before the

*  beginning of A. Therefore, the error message:

*     'Overwrite in  pre-guardzone: loc(  5) =         18.00000'

*  tells that the 5th element of the IPRE long buffer has been overwrit-

*  ten with the value 18, where it should still have the value CHKVAL.

*

*  2) Overwrite in post-guardzone. This indicates a memory overwrite has

*  occurred in the last IPOST elements which form a buffer after the end

*  of A. Error locations are counted from the end of A.  Therefore,

*     'Overwrite in post-guardzone: loc( 19) =         24.00000'

*  tells  that the  19th element after the end of A was overwritten with

*  the value 24, where it should still have the value of CHKVAL.

*

*  3) Overwrite in lda-m gap.  This indicates that entries in rows M+1

*  through LDA of some column were overwritten.  So,

*     'Overwrite in lda-m gap: A( 12,  3) =         22.00000'

*  tells  that the element at the 12th row and 3rd column of A was over-

*  written with the value of 22, where it should still have the value of

*  CHKVAL.
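
*

*  As a worked illustration (the sizes are chosen for this example and

*  are not taken from the test driver), let M = 2, N = 2, LDA = 3,

*  IPRE = 2 and IPOST = 2.  Following the index arithmetic used below,

*  the padded buffer holds IPRE + LDA*N + IPOST = 10 entries:

*

*     A( 1: 2)   pre-guardzone

*     A( 3: 4)   column 1 of the local array

*     A( 5   )   lda-m gap of column 1

*     A( 6: 7)   column 2 of the local array

*     A( 8   )   lda-m gap of column 2

*     A( 9:10)   post-guardzone, starting at IPRE + LDA*N + 1 = 9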

*

*  Arguments

*  =========

*

*  ICTXT   (local input) INTEGER

*          On entry,  ICTXT  specifies the BLACS context handle, indica-

*          ting the global  context of the operation. The context itself

*          is global, but the value of ICTXT is local.

*

*  MESS    (local input) CHARACTER*(*)

*          On entry, MESS is a string containing a user-defined message.

*

*  M       (local input) INTEGER

*          On entry, M  specifies the number of rows in the local  array

*          A.  M must be at least zero.

*

*  N       (local input) INTEGER

*          On entry, N  specifies the number of columns in the local ar-

*          ray A. N must be at least zero.

*

*  A       (local input) DOUBLE PRECISION array

*          On entry,  A  is an array of dimension (LDA,N).

*

*  LDA     (local input) INTEGER

*          On entry,  LDA  specifies  the leading dimension of the local

*          array to be padded. LDA must be at least MAX( 1, M ).

*

*  IPRE    (local input) INTEGER

*          On entry, IPRE specifies the size of  the  guard zone  to put

*          before the start of the padded array.

*

*  IPOST   (local input) INTEGER

*          On entry, IPOST specifies the size of the  guard zone  to put

*          after the end of the padded array.

*

*  CHKVAL  (local input) DOUBLE PRECISION

*          On entry, CHKVAL specifies the value the padding was filled
*          with, i.e. the value every guard entry is expected to hold.

*

*

*  -- Written on April 1, 1998 by

*     R. Clint Whaley, University of Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      CHARACTER*1        TOP

      INTEGER            I, IAM, IDUMM, INFO, J, K, MYCOL, MYROW, NPCOL,

     $                   NPROW

*     ..

*     .. External Subroutines ..

      EXTERNAL           BLACS_GRIDINFO, IGAMX2D, PB_TOPGET

*     ..

*     .. Executable Statements ..

*

*     Get grid parameters

*

      CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL )

      IAM  = myrow*npcol + mycol

      info = -1

*

*     Check buffer in front of A

*

      IF( ipre.GT.0 ) THEN

         DO 10 i = 1, ipre

            IF( a( i ).NE.chkval ) THEN

               WRITE( *, fmt = 9998 ) myrow, mycol, mess, ' pre', i,

     $                                a( i )

               info = iam

            END IF

   10    CONTINUE

      ELSE

         WRITE( *, fmt = * ) 'WARNING no pre-guardzone in PB_DCHEKPAD'

      END IF

*

*     Check buffer after A

*

      IF( ipost.GT.0 ) THEN

         j = ipre+lda*n+1

         DO 20 i = j, j+ipost-1

            IF( a( i ).NE.chkval ) THEN

               WRITE( *, fmt = 9998 ) myrow, mycol, mess, 'post',

     $                                i-j+1, a( i )

               info = iam

            END IF

   20    CONTINUE

      ELSE

         WRITE( *, fmt = * )

     $          'WARNING no post-guardzone buffer in PB_DCHEKPAD'

      END IF

*

*     Check all (LDA-M) gaps

*

      IF( lda.GT.m ) THEN

         k = ipre + m + 1

         DO 40 j = 1, n

            DO 30 i = k, k + (lda-m) - 1

               IF( a( i ).NE.chkval ) THEN

                  WRITE( *, fmt = 9997 ) myrow, mycol, mess,

     $               i-ipre-lda*(j-1), j, a( i )

                  info = iam

               END IF

   30       CONTINUE

            k = k + lda

   40    CONTINUE

      END IF

*

      CALL pb_topget( ictxt, 'Combine', 'All', top )

      CALL igamx2d( ictxt, 'All', top, 1, 1, info, 1, idumm, idumm, -1,

     $              0, 0 )

      IF( iam.EQ.0 .AND. info.GE.0 ) THEN

         WRITE( *, fmt = 9999 ) info / npcol, mod( info, npcol ), mess

      END IF

*

 9999 FORMAT( '{', i5, ',', i5, '}:  Memory overwrite in ', a )

 9998 FORMAT( '{', i5, ',', i5, '}:  ', a, ' memory overwrite in ',

     $        a4, '-guardzone: loc(', i3, ') = ', g20.7 )

 9997 FORMAT( '{', i5, ',', i5, '}: ', a, ' memory overwrite in ',

     $        'lda-m gap: loc(', i3, ',', i3, ') = ', g20.7 )

*

      RETURN

*

*     End of PB_DCHEKPAD

*

      END

      SUBROUTINE pb_dlaset( UPLO, M, N, IOFFD, ALPHA, BETA, A, LDA )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        UPLO

      INTEGER            IOFFD, LDA, M, N

      DOUBLE PRECISION   ALPHA, BETA

*     ..

*     .. Array Arguments ..

      DOUBLE PRECISION   A( LDA, * )

*     ..

*

*  Purpose

*  =======

*

*  PB_DLASET initializes a two-dimensional array A to beta on the diago-

*  nal specified by IOFFD and alpha on the offdiagonals.

*

*  Arguments

*  =========

*

*  UPLO    (global input) CHARACTER*1

*          On entry,  UPLO  specifies  which trapezoidal part of the ar-

*          ray A is to be set as follows:

*             = 'L' or 'l':   Lower triangular part is set; the strictly

*                             upper triangular part of A is not changed,

*             = 'U' or 'u':   Upper triangular part is set; the strictly

*                             lower triangular part of A is not changed,

*             = 'D' or 'd':   Only the diagonal of A is set,

*             Otherwise:      All of the array A is set.

*

*  M       (input) INTEGER

*          On entry,  M  specifies the number of rows of the array A.  M

*          must be at least zero.

*

*  N       (input) INTEGER

*          On entry,  N  specifies the number of columns of the array A.

*          N must be at least zero.

*

*  IOFFD   (input) INTEGER

*          On entry, IOFFD specifies the position of the offdiagonal de-

*          limiting the upper and lower trapezoidal part of A as follows

*          (see the notes below):

*

*             IOFFD = 0  specifies the main diagonal A( i, i ),

*                        with i = 1 ... MIN( M, N ),

*             IOFFD > 0  specifies the subdiagonal   A( i+IOFFD, i ),

*                        with i = 1 ... MIN( M-IOFFD, N ),

*             IOFFD < 0  specifies the superdiagonal A( i, i-IOFFD ),

*                        with i = 1 ... MIN( M, N+IOFFD ).

*

*  ALPHA   (input) DOUBLE PRECISION

*          On entry,  ALPHA specifies the value to which the offdiagonal

*          array elements are set.

*

*  BETA    (input) DOUBLE PRECISION

*          On entry, BETA  specifies the value to which the diagonal ar-

*          ray elements are set.

*

*  A       (input/output) DOUBLE PRECISION array

*          On entry, A is an array of dimension  (LDA,N).  Before  entry

*          with UPLO = 'U' or 'u', the leading m by n part of the  array

*          A  must  contain  the upper trapezoidal part of the matrix as

*          specified by IOFFD to be set, and  the  strictly lower trape-

*          zoidal  part of A is not referenced; When  UPLO = 'L' or 'l',

*          the leading m by n part of  the  array  A  must  contain  the

*          lower trapezoidal part of the matrix as specified by IOFFD to

*          be set,  and  the  strictly  upper  trapezoidal part of  A is

*          not referenced.

*

*  LDA     (input) INTEGER

*          On entry, LDA specifies the leading dimension of the array A.

*          LDA must be at least max( 1, M ).

*

*  Notes

*  =====

*                           N                                    N

*             ----------------------------                  -----------

*            |       d                    |                |           |

*          M |         d        'U'       |                |      'U'  |

*            |  'L'     'D'               |                |d          |

*            |             d              |              M |  d        |

*             ----------------------------                 |   'D'     |

*                                                          |      d    |

*               IOFFD < 0                                  | 'L'    d  |

*                                                          |          d|

*                  N                                       |           |

*             -----------                                   -----------

*            |    d   'U'|

*            |      d    |                                   IOFFD > 0

*          M |       'D' |

*            |          d|                              N

*            |  'L'      |                 ----------------------------

*            |           |                |          'U'               |

*            |           |                |d                           |

*            |           |                | 'D'                        |

*            |           |                |    d                       |

*            |           |                |'L'   d                     |

*             -----------                  ----------------------------
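
*

*  As a worked illustration (sizes chosen for this example only), take

*  M = 3, N = 4, IOFFD = -1 and UPLO = 'L'.  The delimiting diagonal is

*  A( i, i+1 ) for i = 1 ... MIN( M, N+IOFFD ) = 3, and on exit A holds

*  (b = BETA, a = ALPHA, x = entry left unchanged):

*

*             a  b  x  x

*             a  a  b  x

*             a  a  a  b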

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER            I, J, JTMP, MN

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      EXTERNAL           LSAME

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          MAX, MIN

*     ..

*     .. Executable Statements ..

*

*     Quick return if possible

*

      IF( M.LE.0 .OR. N.LE.0 )

     $   RETURN

*

*     Start the operations

*

      IF( LSAME( UPLO, 'L' ) ) THEN

*

*        Set the diagonal to BETA and the strictly lower triangular

*        part of the array to ALPHA.

*

         mn = max( 0, -ioffd )

         DO 20 j = 1, min( mn, n )

            DO 10 i = 1, m

               a( i, j ) = alpha

   10       CONTINUE

   20    CONTINUE

         DO 40 j = mn + 1, min( m - ioffd, n )

            jtmp = j + ioffd

            a( jtmp, j ) = beta

            DO 30 i = jtmp + 1, m

               a( i, j ) = alpha

   30       CONTINUE

   40    CONTINUE

*

      ELSE IF( lsame( uplo, 'U' ) ) THEN

*

*        Set the diagonal to BETA and the strictly upper triangular

*        part of the array to ALPHA.

*

         mn = min( m - ioffd, n )

         DO 60 j = max( 0, -ioffd ) + 1, mn

            jtmp = j + ioffd

            DO 50 i = 1, jtmp - 1

               a( i, j ) = alpha

   50       CONTINUE

            a( jtmp, j ) = beta

   60    CONTINUE

         DO 80 j = max( 0, mn ) + 1, n

            DO 70 i = 1, m

               a( i, j ) = alpha

   70       CONTINUE

   80    CONTINUE

*

      ELSE IF( lsame( uplo, 'D' ) ) THEN

*

*        Set the array to BETA on the diagonal.

*

         DO 90 j = max( 0, -ioffd ) + 1, min( m - ioffd, n )

            a( j + ioffd, j ) = beta

   90    CONTINUE

*

      ELSE

*

*        Set the array to BETA on the diagonal and ALPHA on the

*        offdiagonal.

*

         DO 110 j = 1, n

            DO 100 i = 1, m

               a( i, j ) = alpha

  100       CONTINUE

  110    CONTINUE

         IF( alpha.NE.beta .AND. ioffd.LT.m .AND. ioffd.GT.-n ) THEN

            DO 120 j = max( 0, -ioffd ) + 1, min( m - ioffd, n )

               a( j + ioffd, j ) = beta

  120       CONTINUE

         END IF

*

      END IF

*

      RETURN

*

*     End of PB_DLASET

*

      END

      SUBROUTINE pb_dlascal( UPLO, M, N, IOFFD, ALPHA, A, LDA )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        UPLO

      INTEGER            IOFFD, LDA, M, N

      DOUBLE PRECISION   ALPHA

*     ..

*     .. Array Arguments ..

      DOUBLE PRECISION   A( LDA, * )

*     ..

*

*  Purpose

*  =======

*

*  PB_DLASCAL scales a two-dimensional array A by the scalar alpha.

*

*  Arguments

*  =========

*

*  UPLO    (input) CHARACTER*1

*          On entry,  UPLO  specifies  which trapezoidal part of the ar-

*          ray A is to be scaled as follows:

*             = 'L' or 'l':          the lower trapezoid of A is scaled,

*             = 'U' or 'u':          the upper trapezoid of A is scaled,

*             = 'D' or 'd':       diagonal specified by IOFFD is scaled,

*             Otherwise:                   all of the array A is scaled.

*

*  M       (input) INTEGER

*          On entry,  M  specifies the number of rows of the array A.  M

*          must be at least zero.

*

*  N       (input) INTEGER

*          On entry,  N  specifies the number of columns of the array A.

*          N must be at least zero.

*

*  IOFFD   (input) INTEGER

*          On entry, IOFFD specifies the position of the offdiagonal de-

*          limiting the upper and lower trapezoidal part of A as follows

*          (see the notes below):

*

*             IOFFD = 0  specifies the main diagonal A( i, i ),

*                        with i = 1 ... MIN( M, N ),

*             IOFFD > 0  specifies the subdiagonal   A( i+IOFFD, i ),

*                        with i = 1 ... MIN( M-IOFFD, N ),

*             IOFFD < 0  specifies the superdiagonal A( i, i-IOFFD ),

*                        with i = 1 ... MIN( M, N+IOFFD ).

*

*  ALPHA   (input) DOUBLE PRECISION

*          On entry, ALPHA specifies the scalar alpha.

*

*  A       (input/output) DOUBLE PRECISION array

*          On entry, A is an array of dimension  (LDA,N).  Before  entry

*          with  UPLO = 'U' or 'u', the leading m by n part of the array

*          A must contain the upper trapezoidal  part  of the matrix  as

*          specified by  IOFFD to be scaled, and the strictly lower tra-

*          pezoidal part of A is not referenced; When UPLO = 'L' or 'l',

*          the leading m by n part of the array A must contain the lower

*          trapezoidal  part  of  the matrix as specified by IOFFD to be

*          scaled,  and  the strictly upper trapezoidal part of A is not

*          referenced. On exit, the entries of the  trapezoid part of  A

*          determined by UPLO and IOFFD are scaled.

*

*  LDA     (input) INTEGER

*          On entry, LDA specifies the leading dimension of the array A.

*          LDA must be at least max( 1, M ).

*

*  Notes

*  =====

*                           N                                    N

*             ----------------------------                  -----------

*            |       d                    |                |           |

*          M |         d        'U'       |                |      'U'  |

*            |  'L'     'D'               |                |d          |

*            |             d              |              M |  d        |

*             ----------------------------                 |   'D'     |

*                                                          |      d    |

*              IOFFD < 0                                   | 'L'    d  |

*                                                          |          d|

*                  N                                       |           |

*             -----------                                   -----------

*            |    d   'U'|

*            |      d    |                                   IOFFD > 0

*          M |       'D' |

*            |          d|                              N

*            |  'L'      |                 ----------------------------

*            |           |                |          'U'               |

*            |           |                |d                           |

*            |           |                | 'D'                        |

*            |           |                |    d                       |

*            |           |                |'L'   d                     |

*             -----------                  ----------------------------
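
*

*  As a worked illustration (sizes chosen for this example only), take

*  M = 3, N = 3, IOFFD = 1 and UPLO = 'U'.  The delimiting diagonal is

*  A( i+1, i ) for i = 1 ... MIN( M-IOFFD, N ) = 2, and the entries

*  multiplied by ALPHA are (s = scaled, x = entry left unchanged):

*

*             s  s  s

*             s  s  s

*             x  s  s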

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Local Scalars ..

      INTEGER            I, J, JTMP, MN

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      EXTERNAL           LSAME

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          MAX, MIN

*     ..

*     .. Executable Statements ..

*

*     Quick return if possible

*

      IF( M.LE.0 .OR. N.LE.0 )

     $   RETURN

*

*     Start the operations

*

      IF( LSAME( UPLO, 'L' ) ) THEN

*

*        Scales the lower triangular part of the array by ALPHA.

*

         MN = max( 0, -ioffd )

         DO 20 j = 1, min( mn, n )

            DO 10 i = 1, m

               a( i, j ) = alpha * a( i, j )

   10       CONTINUE

   20    CONTINUE

         DO 40 j = mn + 1, min( m - ioffd, n )

            DO 30 i = j + ioffd, m

               a( i, j ) = alpha * a( i, j )

   30       CONTINUE

   40    CONTINUE

*

      ELSE IF( lsame( uplo, 'U' ) ) THEN

*

*        Scales the upper triangular part of the array by ALPHA.

*

         mn = min( m - ioffd, n )

         DO 60 j = max( 0, -ioffd ) + 1, mn

            DO 50 i = 1, j + ioffd

               a( i, j ) = alpha * a( i, j )

   50       CONTINUE

   60    CONTINUE

         DO 80 j = max( 0, mn ) + 1, n

            DO 70 i = 1, m

               a( i, j ) = alpha * a( i, j )

   70       CONTINUE

   80    CONTINUE

*

      ELSE IF( lsame( uplo, 'D' ) ) THEN

*

*        Scales the diagonal entries by ALPHA.

*

         DO 90 j = max( 0, -ioffd ) + 1, min( m - ioffd, n )

            jtmp = j + ioffd

            a( jtmp, j ) = alpha * a( jtmp, j )

   90    CONTINUE

*

      ELSE

*

*        Scales the entire array by ALPHA.

*

         DO 110 j = 1, n

            DO 100 i = 1, m

               a( i, j ) = alpha * a( i, j )

  100       CONTINUE

  110    CONTINUE

*

      END IF

*

      RETURN

*

*     End of PB_DLASCAL

*

      END

      SUBROUTINE pb_dlagen( UPLO, AFORM, A, LDA, LCMT00, IRAN, MBLKS,

     $                      IMBLOC, MB, LMBLOC, NBLKS, INBLOC, NB,

     $                      LNBLOC, JMP, IMULADD )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      CHARACTER*1        UPLO, AFORM

      INTEGER            IMBLOC, INBLOC, LCMT00, LDA, LMBLOC, LNBLOC,

     $                   mb, mblks, nb, nblks

*     ..

*     .. Array Arguments ..

      INTEGER            IMULADD( 4, * ), IRAN( * ), JMP( * )

      DOUBLE PRECISION   A( LDA, * )

*     ..

*

*  Purpose

*  =======

*

*  PB_DLAGEN locally initializes an array A.

*

*  Arguments

*  =========

*

*  UPLO    (global input) CHARACTER*1

*          On entry, UPLO  specifies whether the lower (UPLO='L') trape-

*          zoidal part or the upper (UPLO='U') trapezoidal part is to be

*          generated  when  the  matrix  to be generated is symmetric or

*          Hermitian. For  all  the  other values of AFORM, the value of

*          this input argument is ignored.

*

*  AFORM   (global input) CHARACTER*1

*          On entry, AFORM specifies the type of submatrix to be genera-

*          ted as follows:

*             AFORM = 'S', sub( A ) is a symmetric matrix,

*             AFORM = 'H', sub( A ) is a Hermitian matrix,

*             AFORM = 'T', sub( A ) is overwritten  with  the  transpose

*                          of what would normally be generated,

*             AFORM = 'C', sub( A ) is overwritten  with  the  conjugate

*                          transpose  of  what would normally be genera-

*                          ted.

*             AFORM = 'N', a random submatrix is generated.

*

*  A       (local output) DOUBLE PRECISION array

*          On entry,  A  is  an array of dimension (LLD_A, *).  On exit,

*          this array contains the local entries of the randomly genera-

*          ted submatrix sub( A ).

*

*  LDA     (local input) INTEGER

*          On entry,  LDA  specifies  the local leading dimension of the

*          array A. LDA must be at least one.

*

*  LCMT00  (global input) INTEGER

*          On entry, LCMT00 is the LCM value specifying the off-diagonal

*          of the underlying matrix of interest. LCMT00=0 specifies  the

*          main diagonal, LCMT00 > 0 specifies a subdiagonal, LCMT00 < 0

*          specifies a superdiagonal.

*

*  IRAN    (local input) INTEGER array

*          On entry, IRAN  is an array of dimension 2 containing respec-

*          tively the 16-lower and 16-higher bits of the encoding of the

*          entry of  the  random  sequence  corresponding locally to the

*          first local array entry to generate. Usually,  this  array is

*          computed by PB_SETLOCRAN.

*

*  MBLKS   (local input) INTEGER

*          On entry, MBLKS specifies the local number of blocks of rows.

*          MBLKS is at least zero.

*

*  IMBLOC  (local input) INTEGER

*          On entry, IMBLOC specifies  the  number of rows (size) of the

*          local uppermost blocks. IMBLOC is at least zero.

*

*  MB      (global input) INTEGER

*          On entry, MB  specifies the blocking factor used to partition

*          the rows of the matrix.  MB  must be at least one.

*

*  LMBLOC  (local input) INTEGER

*          On entry, LMBLOC specifies the number of  rows  (size) of the

*          local lowest blocks. LMBLOC is at least zero.

*

*  NBLKS   (local input) INTEGER

*          On entry,  NBLKS  specifies the local number of blocks of co-

*          lumns. NBLKS is at least zero.

*

*  INBLOC  (local input) INTEGER

*          On entry,  INBLOC  specifies the number of columns (size)  of

*          the local leftmost blocks. INBLOC is at least zero.

*

*  NB      (global input) INTEGER

*          On entry, NB  specifies the blocking factor used to partition

*          the columns of the matrix.  NB  must be at least one.

*

*  LNBLOC  (local input) INTEGER

*          On entry,  LNBLOC  specifies  the number of columns (size) of

*          the local rightmost blocks. LNBLOC is at least zero.

*

*  JMP     (local input) INTEGER array

*          On entry, JMP is an array of dimension JMP_LEN containing the

*          different jump values used by the random matrix generator.

*

*  IMULADD (local input) INTEGER array

*          On entry, IMULADD is an array of dimension (4, JMP_LEN).  The

*          jth  column  of this array contains the encoded initial cons-

*          tants a_j and c_j to  jump  from X( n ) to  X( n + JMP( j ) )

*          (= a_j * X( n ) + c_j) in the random sequence. IMULADD(1:2,j)

*          contains respectively the 16-lower and 16-higher bits of  the

*          constant a_j, and IMULADD(3:4,j)  contains  the 16-lower  and

*          16-higher bits of the constant c_j.
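
*

*          For illustration only, assume the one-step recurrence has the

*          form X( n+1 ) = a * X( n ) + c (the values of a and c are not

*          stated here).  Two steps then compose as

*             X( n+2 ) = a * ( a * X( n ) + c ) + c

*                      = a**2 * X( n ) + ( a + 1 ) * c,

*          so a jump of length 2 would use a_j = a**2 and

*          c_j = ( a + 1 ) * c, reduced modulo the generator's modulus.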

*

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      INTEGER            JMP_1, JMP_COL, JMP_IMBV, JMP_INBV, JMP_LEN,

     $                   JMP_MB, JMP_NB, JMP_NPIMBLOC, JMP_NPMB,

     $                   JMP_NQINBLOC, JMP_NQNB, JMP_ROW

      PARAMETER          ( JMP_1 = 1, jmp_row = 2, jmp_col = 3,

     $                   jmp_mb = 4, jmp_imbv = 5, jmp_npmb = 6,

     $                   jmp_npimbloc = 7, jmp_nb = 8, jmp_inbv = 9,

     $                   jmp_nqnb = 10, jmp_nqinbloc = 11,

     $                   jmp_len = 11 )

*     ..

*     .. Local Scalars ..

      INTEGER            I, IB, IBLK, II, IK, ITMP, JB, JBLK, JJ, JK,

     $                   JTMP, LCMTC, LCMTR, LOW, MNB, UPP

      DOUBLE PRECISION   DUMMY

*     ..

*     .. Local Arrays ..

      INTEGER            IB0( 2 ), IB1( 2 ), IB2( 2 ), IB3( 2 )

*     ..

*     .. External Subroutines ..

      EXTERNAL           PB_JUMPIT

*     ..

*     .. External Functions ..

      LOGICAL            LSAME

      DOUBLE PRECISION   PB_DRAND

      EXTERNAL           LSAME, PB_DRAND

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          max, min

*     ..

*     .. Executable Statements ..

*

      DO 10 i = 1, 2

         ib1( i ) = iran( i )

         ib2( i ) = iran( i )

         ib3( i ) = iran( i )

   10 CONTINUE

*

      IF( lsame( aform, 'N' ) ) THEN

*

*        Generate random matrix

*

         jj = 1

*

         DO 50 jblk = 1, nblks

*

            IF( jblk.EQ.1 ) THEN

               jb = inbloc

            ELSE IF( jblk.EQ.nblks ) THEN

               jb = lnbloc

            ELSE

               jb = nb

            END IF

*

            DO 40 jk = jj, jj + jb - 1

*

               ii = 1

*

               DO 30 iblk = 1, mblks

*

                  IF( iblk.EQ.1 ) THEN

                     ib = imbloc

                  ELSE IF( iblk.EQ.mblks ) THEN

                     ib = lmbloc

                  ELSE

                     ib = mb

                  END IF

*

*                 Blocks are IB by JB
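
*

*                 For example (illustrative sizes only), MBLKS = 3 with

*                 IMBLOC = 2, MB = 4 and LMBLOC = 1 gives row blocks

*                 covering local rows 1-2, 3-6 and 7.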

*

                  DO 20 ik = ii, ii + ib - 1

                     a( ik, jk ) = pb_drand( 0 )

   20             CONTINUE

*

                  ii = ii + ib

*

                  IF( iblk.EQ.1 ) THEN

*

*                    Jump IMBLOC + ( NPROW - 1 ) * MB rows

*

                     CALL pb_jumpit( imuladd( 1, jmp_npimbloc ), ib1,

     $                               ib0 )

*

                  ELSE

*

*                    Jump NPROW * MB rows

*

                     CALL pb_jumpit( imuladd( 1, jmp_npmb ), ib1, ib0 )

*

                  END IF

*

                  ib1( 1 ) = ib0( 1 )

                  ib1( 2 ) = ib0( 2 )

*

   30          CONTINUE

*

*              Jump one column

*

               CALL pb_jumpit( imuladd( 1, jmp_col ), ib2, ib0 )

*

               ib1( 1 ) = ib0( 1 )

               ib1( 2 ) = ib0( 2 )

               ib2( 1 ) = ib0( 1 )

               ib2( 2 ) = ib0( 2 )

*

   40       CONTINUE

*

            jj = jj + jb

*

            IF( jblk.EQ.1 ) THEN

*

*              Jump INBLOC + ( NPCOL - 1 ) * NB columns

*

               CALL pb_jumpit( imuladd( 1, jmp_nqinbloc ), ib3, ib0 )

*

            ELSE

*

*              Jump NPCOL * NB columns

*

               CALL pb_jumpit( imuladd( 1, jmp_nqnb ), ib3, ib0 )

*

            END IF

*

            ib1( 1 ) = ib0( 1 )

            ib1( 2 ) = ib0( 2 )

            ib2( 1 ) = ib0( 1 )

            ib2( 2 ) = ib0( 2 )

            ib3( 1 ) = ib0( 1 )

            ib3( 2 ) = ib0( 2 )

*

   50    CONTINUE

*

      ELSE IF( lsame( aform, 'T' ) .OR. lsame( aform, 'C' ) ) THEN

*

*        Generate the transpose of the matrix that would be normally

*        generated.

*

         ii = 1

*

         DO 90 iblk = 1, mblks

*

            IF( iblk.EQ.1 ) THEN

               ib = imbloc

            ELSE IF( iblk.EQ.mblks ) THEN

               ib = lmbloc

            ELSE

               ib = mb

            END IF

*

            DO 80 ik = ii, ii + ib - 1

*

               jj = 1

*

               DO 70 jblk = 1, nblks

*

                  IF( jblk.EQ.1 ) THEN

                     jb = inbloc

                  ELSE IF( jblk.EQ.nblks ) THEN

                     jb = lnbloc

                  ELSE

                     jb = nb

                  END IF

*

*                 Blocks are IB by JB

*

                  DO 60 jk = jj, jj + jb - 1

                     a( ik, jk ) = pb_drand( 0 )

   60             CONTINUE

*

                  jj = jj + jb

*

                  IF( jblk.EQ.1 ) THEN

*

*                    Jump INBLOC + ( NPCOL - 1 ) * NB columns

*

                     CALL pb_jumpit( imuladd( 1, jmp_nqinbloc ), ib1,

     $                               ib0 )

*

                  ELSE

*

*                    Jump NPCOL * NB columns

*

                     CALL pb_jumpit( imuladd( 1, jmp_nqnb ), ib1, ib0 )

*

                  END IF

*

                  ib1( 1 ) = ib0( 1 )

                  ib1( 2 ) = ib0( 2 )

*

   70          CONTINUE

*

*              Jump one row

*

               CALL pb_jumpit( imuladd( 1, jmp_row ), ib2, ib0 )

*

               ib1( 1 ) = ib0( 1 )

               ib1( 2 ) = ib0( 2 )

               ib2( 1 ) = ib0( 1 )

               ib2( 2 ) = ib0( 2 )

*

   80       CONTINUE

*

            ii = ii + ib

*

            IF( iblk.EQ.1 ) THEN

*

*              Jump IMBLOC + ( NPROW - 1 ) * MB rows

*

               CALL pb_jumpit( imuladd( 1, jmp_npimbloc ), ib3, ib0 )

*

            ELSE

*

*              Jump NPROW * MB rows

*

               CALL pb_jumpit( imuladd( 1, jmp_npmb ), ib3, ib0 )

*

            END IF

*

            ib1( 1 ) = ib0( 1 )

            ib1( 2 ) = ib0( 2 )

            ib2( 1 ) = ib0( 1 )

            ib2( 2 ) = ib0( 2 )

            ib3( 1 ) = ib0( 1 )

            ib3( 2 ) = ib0( 2 )

*

   90    CONTINUE

*

      ELSE IF( ( lsame( aform, 'S' ) ).OR.( lsame( aform, 'H' ) ) ) THEN

*

*        Generate a symmetric matrix

*

         IF( lsame( uplo, 'L' ) ) THEN

*

*           generate lower trapezoidal part

*

            jj = 1

            lcmtc = lcmt00

*

            DO 170 jblk = 1, nblks

*

               IF( jblk.EQ.1 ) THEN

                  jb  = inbloc

                  low = 1 - inbloc

               ELSE IF( jblk.EQ.nblks ) THEN

                  jb = lnbloc

                  low = 1 - nb

               ELSE

                  jb  = nb

                  low = 1 - nb

               END IF

*

               DO 160 jk = jj, jj + jb - 1

*

                  ii = 1

                  lcmtr = lcmtc

*

                  DO 150 iblk = 1, mblks

*

                     IF( iblk.EQ.1 ) THEN

                        ib  = imbloc

                        upp = imbloc - 1

                     ELSE IF( iblk.EQ.mblks ) THEN

                        ib  = lmbloc

                        upp = mb - 1

                     ELSE

                        ib  = mb

                        upp = mb - 1

                     END IF

*

*                    Blocks are IB by JB

*

                     IF( lcmtr.GT.upp ) THEN

*

                        DO 100 ik = ii, ii + ib - 1

                           dummy = pb_drand( 0 )

  100                   CONTINUE

*

                     ELSE IF( lcmtr.GE.low ) THEN

*

                        jtmp = jk - jj + 1

                        mnb  = max( 0, -lcmtr )

*

                        IF( jtmp.LE.min( mnb, jb ) ) THEN

*

                           DO 110 ik = ii, ii + ib - 1

                              a( ik, jk ) = pb_drand( 0 )

  110                      CONTINUE

*

                        ELSE IF( ( jtmp.GE.( mnb + 1 )         ) .AND.

     $                           ( jtmp.LE.min( ib-lcmtr, jb ) ) ) THEN

*

                           itmp = ii + jtmp + lcmtr - 1

*

                           DO 120 ik = ii, itmp - 1

                              dummy = pb_drand( 0 )

  120                      CONTINUE

*

                           DO 130 ik = itmp, ii + ib - 1

                              a( ik, jk ) = pb_drand( 0 )

  130                      CONTINUE

*

                        END IF

*

                     ELSE

*

                        DO 140 ik = ii, ii + ib - 1

                           a( ik, jk ) = pb_drand( 0 )

  140                   CONTINUE

*

                     END IF

*

                     ii = ii + ib

*

                     IF( iblk.EQ.1 ) THEN

*

*                       Jump IMBLOC + ( NPROW - 1 ) * MB rows

*

                        lcmtr = lcmtr - jmp( jmp_npimbloc )

                        CALL pb_jumpit( imuladd( 1, jmp_npimbloc ), ib1,

     $                                  ib0 )

*

                     ELSE

*

*                       Jump NPROW * MB rows

*

                        lcmtr = lcmtr - jmp( jmp_npmb )

                        CALL pb_jumpit( imuladd( 1, jmp_npmb ), ib1,

     $                                  ib0 )

*

                     END IF

*

                     ib1( 1 ) = ib0( 1 )

                     ib1( 2 ) = ib0( 2 )

*

  150             CONTINUE

*

*                 Jump one column

*

                  CALL pb_jumpit( imuladd( 1, jmp_col ), ib2, ib0 )

*

                  ib1( 1 ) = ib0( 1 )

                  ib1( 2 ) = ib0( 2 )

                  ib2( 1 ) = ib0( 1 )

                  ib2( 2 ) = ib0( 2 )

*

  160          CONTINUE

*

               jj = jj + jb

*

               IF( jblk.EQ.1 ) THEN

*

*                 Jump INBLOC + ( NPCOL - 1 ) * NB columns

*

                  lcmtc = lcmtc + jmp( jmp_nqinbloc )

                  CALL pb_jumpit( imuladd( 1, jmp_nqinbloc ), ib3, ib0 )

*

               ELSE

*

*                 Jump NPCOL * NB columns

*

                  lcmtc = lcmtc + jmp( jmp_nqnb )

                  CALL pb_jumpit( imuladd( 1, jmp_nqnb ), ib3, ib0 )

*

               END IF

*

               ib1( 1 ) = ib0( 1 )

               ib1( 2 ) = ib0( 2 )

               ib2( 1 ) = ib0( 1 )

               ib2( 2 ) = ib0( 2 )

               ib3( 1 ) = ib0( 1 )

               ib3( 2 ) = ib0( 2 )

*

  170       CONTINUE

*

         ELSE

*

*           generate upper trapezoidal part

*

            ii = 1

            lcmtr = lcmt00

*

            DO 250 iblk = 1, mblks

*

               IF( iblk.EQ.1 ) THEN

                  ib  = imbloc

                  upp = imbloc - 1

               ELSE IF( iblk.EQ.mblks ) THEN

                  ib  = lmbloc

                  upp = mb - 1

               ELSE

                  ib  = mb

                  upp = mb - 1

               END IF

*

               DO 240 ik = ii, ii + ib - 1

*

                  jj = 1

                  lcmtc = lcmtr

*

                  DO 230 jblk = 1, nblks

*

                     IF( jblk.EQ.1 ) THEN

                        jb  = inbloc

                        low = 1 - inbloc

                     ELSE IF( jblk.EQ.nblks ) THEN

                        jb  = lnbloc

                        low = 1 - nb

                     ELSE

                        jb  = nb

                        low = 1 - nb

                     END IF

*

*                    Blocks are IB by JB

*

                     IF( lcmtc.LT.low ) THEN

*

*                       These entries are not stored: advance the

*                       random sequence by JB numbers and discard them

*

                        DO 180 jk = jj, jj + jb - 1

                           dummy = pb_drand( 0 )

  180                   CONTINUE

*

                     ELSE IF( lcmtc.LE.upp ) THEN

*

                        itmp = ik - ii + 1

                        mnb  = max( 0, lcmtc )

*

                        IF( itmp.LE.min( mnb, ib ) ) THEN

*

                           DO 190 jk = jj, jj + jb - 1

                              a( ik, jk ) = pb_drand( 0 )

  190                      CONTINUE

*

                        ELSE IF( ( itmp.GE.( mnb + 1 )         ) .AND.

     $                           ( itmp.LE.min( jb+lcmtc, ib ) ) ) THEN

*

                           jtmp = jj + itmp - lcmtc - 1

*

                           DO 200 jk = jj, jtmp - 1

                              dummy = pb_drand( 0 )

  200                      CONTINUE

*

                           DO 210 jk = jtmp, jj + jb - 1

                              a( ik, jk ) = pb_drand( 0 )

  210                      CONTINUE

*

                        END IF

*

                     ELSE

*

                        DO 220 jk = jj, jj + jb - 1

                           a( ik, jk ) = pb_drand( 0 )

  220                   CONTINUE

*

                     END IF

*

                     jj = jj + jb

*

                     IF( jblk.EQ.1 ) THEN

*

*                       Jump INBLOC + ( NPCOL - 1 ) * NB columns

*

                        lcmtc = lcmtc + jmp( jmp_nqinbloc )

                        CALL pb_jumpit( imuladd( 1, jmp_nqinbloc ), ib1,

     $                                  ib0 )

*

                     ELSE

*

*                       Jump NPCOL * NB columns

*

                        lcmtc = lcmtc + jmp( jmp_nqnb )

                        CALL pb_jumpit( imuladd( 1, jmp_nqnb ), ib1,

     $                                  ib0 )

*

                     END IF

*

                     ib1( 1 ) = ib0( 1 )

                     ib1( 2 ) = ib0( 2 )

*

  230             CONTINUE

*

*                 Jump one row

*

                  CALL pb_jumpit( imuladd( 1, jmp_row ), ib2, ib0 )

*

                  ib1( 1 ) = ib0( 1 )

                  ib1( 2 ) = ib0( 2 )

                  ib2( 1 ) = ib0( 1 )

                  ib2( 2 ) = ib0( 2 )

*

  240          CONTINUE

*

               ii = ii + ib

*

               IF( iblk.EQ.1 ) THEN

*

*                 Jump IMBLOC + ( NPROW - 1 ) * MB rows

*

                  lcmtr = lcmtr - jmp( jmp_npimbloc )

                  CALL pb_jumpit( imuladd( 1, jmp_npimbloc ), ib3, ib0 )

*

               ELSE

*

*                 Jump NPROW * MB rows

*

                  lcmtr = lcmtr - jmp( jmp_npmb )

                  CALL pb_jumpit( imuladd( 1, jmp_npmb ), ib3, ib0 )

*

               END IF

*

               ib1( 1 ) = ib0( 1 )

               ib1( 2 ) = ib0( 2 )

               ib2( 1 ) = ib0( 1 )

               ib2( 2 ) = ib0( 2 )

               ib3( 1 ) = ib0( 1 )

               ib3( 2 ) = ib0( 2 )

*

  250       CONTINUE

*

         END IF

*

      END IF

*

      RETURN

*

*     End of PB_DLAGEN

*

      END

      DOUBLE PRECISION   FUNCTION pb_drand( IDUMM )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            idumm

*     ..

*

*  Purpose

*  =======

*

*  PB_DRAND generates the next number in the random sequence. The value

*  X returned by PB_DRAN is mapped to ONE - TWO * X, so that for  X  in

*  [ 0.0, 1.0 ) the result lies in the interval ( -1.0, 1.0 ].

*

*  Arguments

*  =========

*

*  IDUMM   (local input) INTEGER

*          This argument is ignored, but necessary to a FORTRAN 77 func-

*          tion.

*

*  Further Details

*  ===============

*

*  PB_DRAND obtains  X( n )  from PB_DRAN,  which works  as follows.  On

*  entry, the array IRAND stored in the common block  RANCOM  contains

*  the information (2 integers)  required to generate the next number in

*  the sequence X( n ). This number is computed as

*

*     X( n ) = ( 2^16 * IRAND( 2 ) + IRAND( 1 ) ) / d,

*

*  where the constant d is 2^31 = 2147483648. The array IRAND is then

*  updated for the generation of the next number X( n+1 ) in the random

*  sequence as  X( n+1 ) = a * X( n ) + c.  The constants a and c must

*  have been stored beforehand in the array IACS as 2 pairs of integers.

*  The initial setup of IRAND and IACS is performed by the routine

*  PB_SETRAN.

*
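
*  As an illustration of the rescaling  ONE - TWO * X  applied below to

*  the value X returned by PB_DRAN (sample values chosen for clarity,

*  not taken from an actual run):

*

*     X = 0.0   gives  PB_DRAND =  1.0

*     X = 0.25  gives  PB_DRAND =  0.5

*     X = 0.5   gives  PB_DRAND =  0.0

*     X = 0.75  gives  PB_DRAND = -0.5

*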

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      DOUBLE PRECISION   one, two

      PARAMETER          ( one = 1.0d+0, two = 2.0d+0 )

*     ..

*     .. External Functions ..

      DOUBLE PRECISION   pb_dran

      EXTERNAL           pb_dran

*     ..

*     .. Executable Statements ..

*

      pb_drand = one - two * pb_dran( idumm )

*

      RETURN

*

*     End of PB_DRAND

*

      END

      DOUBLE PRECISION   FUNCTION pb_dran( IDUMM )

*

*  -- PBLAS test routine (version 2.0) --

*     University of Tennessee, Knoxville, Oak Ridge National Laboratory,

*     and University of California, Berkeley.

*     April 1, 1998

*

*     .. Scalar Arguments ..

      INTEGER            idumm

*     ..

*

*  Purpose

*  =======

*

*  PB_DRAN generates the next number in the random sequence.

*

*  Arguments

*  =========

*

*  IDUMM   (local input) INTEGER

*          This argument is ignored, but necessary to a FORTRAN 77 func-

*          tion.

*

*  Further Details

*  ===============

*

*  On entry, the array IRAND stored in the common block  RANCOM contains

*  the information (2 integers)  required to generate the next number in

*  the sequence X( n ). This number is computed as

*

*     X( n ) = ( 2^16 * IRAND( 2 ) + IRAND( 1 ) ) / d,

*

*  where the constant d is 2^31 = 2147483648 (the parameter DIVFAC). The

*  array IRAND is then updated for the generation of the next number

*  X( n+1 ) in the random sequence as  X( n+1 ) = a * X( n ) + c.  The

*  constants a and c must have been stored beforehand in the array IACS

*  as 2 pairs of integers. The initial setup of IRAND and IACS is

*  performed by the routine PB_SETRAN.

*
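
*  As a worked illustration of the formula above (the IRAND values are

*  assumed for the example and are not those set by PB_SETRAN):  with

*  IRAND( 1 ) = 0,  IRAND( 2 ) = 16384 = 2^14  and the divisor  2^31

*  given by the parameter DIVFAC below,

*

*     X( n ) = ( 2^16 * 2^14 + 0 ) / 2^31 = 2^30 / 2^31 = 0.5.

*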

*  -- Written on April 1, 1998 by

*     Antoine Petitet, University  of  Tennessee, Knoxville 37996, USA.

*

*  =====================================================================

*

*     .. Parameters ..

      DOUBLE PRECISION   divfac, pow16

      PARAMETER          ( divfac = 2.147483648d+9,

     $                   pow16 = 6.5536d+4 )

*     ..

*     .. Local Arrays ..

      INTEGER            j( 2 )

*     ..

*     .. External Subroutines ..

      EXTERNAL           pb_ladd, pb_lmul

*     ..

*     .. Intrinsic Functions ..

      INTRINSIC          dble

*     ..

*     .. Common Blocks ..

      INTEGER            iacs( 4 ), irand( 2 )

      COMMON             /rancom/ irand, iacs

*     ..

*     .. Save Statements ..

      SAVE               /rancom/

*     ..

*     .. Executable Statements ..

*

      pb_dran = ( dble( irand( 1 ) ) + pow16 * dble( irand( 2 ) ) ) /

     $            divfac

*

      CALL pb_lmul( irand, iacs, j )

      CALL pb_ladd( j, iacs( 3 ), irand )

*

      RETURN

*

*     End of PB_DRAN

*

      END