ScaLAPACK 2.1  2.1
ScaLAPACK: Scalable Linear Algebra PACKage
pdlaqge.f
Go to the documentation of this file.
1  SUBROUTINE pdlaqge( M, N, A, IA, JA, DESCA, R, C, ROWCND, COLCND,
2  $ AMAX, EQUED )
3 *
4 * -- ScaLAPACK auxiliary routine (version 1.7) --
5 * University of Tennessee, Knoxville, Oak Ridge National Laboratory,
6 * and University of California, Berkeley.
7 * May 1, 1997
8 *
9 * .. Scalar Arguments ..
10  CHARACTER EQUED
11  INTEGER IA, JA, M, N
12  DOUBLE PRECISION AMAX, COLCND, ROWCND
13 * ..
14 * .. Array Arguments ..
15  INTEGER DESCA( * )
16  DOUBLE PRECISION A( * ), C( * ), R( * )
17 * ..
18 *
19 * Purpose
20 * =======
21 *
22 * PDLAQGE equilibrates a general M-by-N distributed matrix
23 * sub( A ) = A(IA:IA+M-1,JA:JA+N-1) using the row and scaling
24 * factors in the vectors R and C.
25 *
26 * Notes
27 * =====
28 *
29 * Each global data object is described by an associated description
30 * vector. This vector stores the information required to establish
31 * the mapping between an object element and its corresponding process
32 * and memory location.
33 *
34 * Let A be a generic term for any 2D block cyclicly distributed array.
35 * Such a global array has an associated description vector DESCA.
36 * In the following comments, the character _ should be read as
37 * "of the global array".
38 *
39 * NOTATION STORED IN EXPLANATION
40 * --------------- -------------- --------------------------------------
41 * DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case,
42 * DTYPE_A = 1.
43 * CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
44 * the BLACS process grid A is distribu-
45 * ted over. The context itself is glo-
46 * bal, but the handle (the integer
47 * value) may vary.
48 * M_A (global) DESCA( M_ ) The number of rows in the global
49 * array A.
50 * N_A (global) DESCA( N_ ) The number of columns in the global
51 * array A.
52 * MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
53 * the rows of the array.
54 * NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
55 * the columns of the array.
56 * RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
57 * row of the array A is distributed.
58 * CSRC_A (global) DESCA( CSRC_ ) The process column over which the
59 * first column of the array A is
60 * distributed.
61 * LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
62 * array. LLD_A >= MAX(1,LOCr(M_A)).
63 *
64 * Let K be the number of rows or columns of a distributed matrix,
65 * and assume that its process grid has dimension p x q.
66 * LOCr( K ) denotes the number of elements of K that a process
67 * would receive if K were distributed over the p processes of its
68 * process column.
69 * Similarly, LOCc( K ) denotes the number of elements of K that a
70 * process would receive if K were distributed over the q processes of
71 * its process row.
72 * The values of LOCr() and LOCc() may be determined via a call to the
73 * ScaLAPACK tool function, NUMROC:
74 * LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
75 * LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
76 * An upper bound for these quantities may be computed by:
77 * LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
78 * LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
79 *
80 * Arguments
81 * =========
82 *
83 * M (global input) INTEGER
84 * The number of rows to be operated on i.e the number of rows
85 * of the distributed submatrix sub( A ). M >= 0.
86 *
87 * N (global input) INTEGER
88 * The number of columns to be operated on i.e the number of
89 * columns of the distributed submatrix sub( A ). N >= 0.
90 *
91 * A (local input/local output) DOUBLE PRECISION pointer into the
92 * local memory to an array of dimension (LLD_A,LOCc(JA+N-1))
93 * containing on entry the M-by-N matrix sub( A ). On exit,
94 * the equilibrated distributed matrix. See EQUED for the
95 * form of the equilibrated distributed submatrix.
96 *
97 * IA (global input) INTEGER
98 * The row index in the global array A indicating the first
99 * row of sub( A ).
100 *
101 * JA (global input) INTEGER
102 * The column index in the global array A indicating the
103 * first column of sub( A ).
104 *
105 * DESCA (global and local input) INTEGER array of dimension DLEN_.
106 * The array descriptor for the distributed matrix A.
107 *
108 * R (local input) DOUBLE PRECISION array, dimension LOCr(M_A)
109 * The row scale factors for sub( A ). R is aligned with the
110 * distributed matrix A, and replicated across every process
111 * column. R is tied to the distributed matrix A.
112 *
113 * C (local input) DOUBLE PRECISION array, dimension LOCc(N_A)
114 * The column scale factors of sub( A ). C is aligned with the
115 * distributed matrix A, and replicated down every process
116 * row. C is tied to the distributed matrix A.
117 *
118 * ROWCND (global input) DOUBLE PRECISION
119 * The global ratio of the smallest R(i) to the largest R(i),
120 * IA <= i <= IA+M-1.
121 *
122 * COLCND (global input) DOUBLE PRECISION
123 * The global ratio of the smallest C(i) to the largest C(i),
124 * JA <= j <= JA+N-1.
125 *
126 * AMAX (global input) DOUBLE PRECISION
127 * Absolute value of largest distributed submatrix entry.
128 *
129 * EQUED (global output) CHARACTER
130 * Specifies the form of equilibration that was done.
131 * = 'N': No equilibration
132 * = 'R': Row equilibration, i.e., sub( A ) has been pre-
133 * multiplied by diag(R(IA:IA+M-1)),
134 * = 'C': Column equilibration, i.e., sub( A ) has been post-
135 * multiplied by diag(C(JA:JA+N-1)),
136 * = 'B': Both row and column equilibration, i.e., sub( A )
137 * has been replaced by
138 * diag(R(IA:IA+M-1)) * sub( A ) * diag(C(JA:JA+N-1)).
139 *
140 * Internal Parameters
141 * ===================
142 *
143 * THRESH is a threshold value used to decide if row or column scaling
144 * should be done based on the ratio of the row or column scaling
145 * factors. If ROWCND < THRESH, row scaling is done, and if
146 * COLCND < THRESH, column scaling is done.
147 *
148 * LARGE and SMALL are threshold values used to decide if row scaling
149 * should be done based on the absolute size of the largest matrix
150 * element. If AMAX > LARGE or AMAX < SMALL, row scaling is done.
151 *
152 * =====================================================================
153 *
154 * .. Parameters ..
155  INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_,
156  $ lld_, mb_, m_, nb_, n_, rsrc_
157  parameter( block_cyclic_2d = 1, dlen_ = 9, dtype_ = 1,
158  $ ctxt_ = 2, m_ = 3, n_ = 4, mb_ = 5, nb_ = 6,
159  $ rsrc_ = 7, csrc_ = 8, lld_ = 9 )
160  DOUBLE PRECISION ONE, THRESH
161  parameter( one = 1.0d+0, thresh = 0.1d+0 )
162 * ..
163 * .. Local Scalars ..
164  INTEGER I, IACOL, IAROW, ICOFF, ICTXT, IIA, IOFFA,
165  $ iroff, j, jja, lda, mp, mycol, myrow, npcol,
166  $ nprow, nq
167  DOUBLE PRECISION CJ, LARGE, SMALL
168 * ..
169 * .. External Subroutines ..
170  EXTERNAL blacs_gridinfo, infog2l
171 * ..
172 * .. External Functions ..
173  INTEGER NUMROC
174  DOUBLE PRECISION PDLAMCH
175  EXTERNAL numroc, pdlamch
176 * ..
177 * .. Intrinsic Functions ..
178  INTRINSIC mod
179 * ..
180 * .. Executable Statements ..
181 *
182 * Quick return if possible
183 *
184  IF( m.LE.0 .OR. n.LE.0 ) THEN
185  equed = 'N'
186  RETURN
187  END IF
188 *
189 * Get grid parameters and compute local indexes
190 *
191  ictxt = desca( ctxt_ )
192  CALL blacs_gridinfo( ictxt, nprow, npcol, myrow, mycol )
193  CALL infog2l( ia, ja, desca, nprow, npcol, myrow, mycol, iia, jja,
194  $ iarow, iacol )
195  iroff = mod( ia-1, desca( mb_ ) )
196  icoff = mod( ja-1, desca( nb_ ) )
197  mp = numroc( m+iroff, desca( mb_ ), myrow, iarow, nprow )
198  nq = numroc( n+icoff, desca( nb_ ), mycol, iacol, npcol )
199  IF( myrow.EQ.iarow )
200  $ mp = mp - iroff
201  IF( mycol.EQ.iacol )
202  $ nq = nq - icoff
203  lda = desca( lld_ )
204 *
205 * Initialize LARGE and SMALL.
206 *
207  small = pdlamch( ictxt, 'Safe minimum' ) /
208  $ pdlamch( ictxt, 'Precision' )
209  large = one / small
210 *
211  IF( rowcnd.GE.thresh .AND. amax.GE.small .AND. amax.LE.large )
212  $ THEN
213 *
214 * No row scaling
215 *
216  IF( colcnd.GE.thresh ) THEN
217 *
218 * No column scaling
219 *
220  equed = 'N'
221 *
222  ELSE
223 *
224 * Column scaling
225 *
226  ioffa = (jja-1)*lda
227  DO 20 j = jja, jja+nq-1
228  cj = c( j )
229  DO 10 i = iia, iia+mp-1
230  a( ioffa + i ) = cj*a( ioffa + i )
231  10 CONTINUE
232  ioffa = ioffa + lda
233  20 CONTINUE
234  equed = 'C'
235  END IF
236 *
237  ELSE IF( colcnd.GE.thresh ) THEN
238 *
239 * Row scaling, no column scaling
240 *
241  ioffa = (jja-1)*lda
242  DO 40 j = jja, jja+nq-1
243  DO 30 i = iia, iia+mp-1
244  a( ioffa + i ) = r( i )*a( ioffa + i )
245  30 CONTINUE
246  ioffa = ioffa + lda
247  40 CONTINUE
248  equed = 'R'
249 *
250  ELSE
251 *
252 * Row and column scaling
253 *
254  ioffa = (jja-1)*lda
255  DO 60 j = jja, jja+nq-1
256  cj = c( j )
257  DO 50 i = iia, iia+mp-1
258  a( ioffa + i ) = cj*r( i )*a( ioffa + i )
259  50 CONTINUE
260  ioffa = ioffa + lda
261  60 CONTINUE
262  equed = 'B'
263 *
264  END IF
265 *
266  RETURN
267 *
268 * End of PDLAQGE
269 *
270  END
infog2l
subroutine infog2l(GRINDX, GCINDX, DESC, NPROW, NPCOL, MYROW, MYCOL, LRINDX, LCINDX, RSRC, CSRC)
Definition: infog2l.f:3
pdlaqge
subroutine pdlaqge(M, N, A, IA, JA, DESCA, R, C, ROWCND, COLCND, AMAX, EQUED)
Definition: pdlaqge.f:3