LAPACK  3.10.1 LAPACK: Linear Algebra PACKage
sgesvj.f
Go to the documentation of this file.
1 *> \brief \b SGESVJ
2 *
3 * =========== DOCUMENTATION ===========
4 *
5 * Online html documentation available at
6 * http://www.netlib.org/lapack/explore-html/
7 *
8 *> \htmlonly
10 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/sgesvj.f">
11 *> [TGZ]</a>
12 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/sgesvj.f">
13 *> [ZIP]</a>
14 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/sgesvj.f">
15 *> [TXT]</a>
16 *> \endhtmlonly
17 *
18 * Definition:
19 * ===========
20 *
21 * SUBROUTINE SGESVJ( JOBA, JOBU, JOBV, M, N, A, LDA, SVA, MV, V,
22 * LDV, WORK, LWORK, INFO )
23 *
24 * .. Scalar Arguments ..
25 * INTEGER INFO, LDA, LDV, LWORK, M, MV, N
26 * CHARACTER*1 JOBA, JOBU, JOBV
27 * ..
28 * .. Array Arguments ..
29 * REAL A( LDA, * ), SVA( N ), V( LDV, * ),
30 * \$ WORK( LWORK )
31 * ..
32 *
33 *
34 *> \par Purpose:
35 * =============
36 *>
37 *> \verbatim
38 *>
39 *> SGESVJ computes the singular value decomposition (SVD) of a real
40 *> M-by-N matrix A, where M >= N. The SVD of A is written as
41 *> [++] [xx] [x0] [xx]
42 *> A = U * SIGMA * V^t, [++] = [xx] * [ox] * [xx]
43 *> [++] [xx]
44 *> where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
45 *> matrix, and V is an N-by-N orthogonal matrix. The diagonal elements
46 *> of SIGMA are the singular values of A. The columns of U and V are the
47 *> left and the right singular vectors of A, respectively.
48 *> SGESVJ can sometimes compute tiny singular values and their singular vectors much
49 *> more accurately than other SVD routines, see below under Further Details.
50 *> \endverbatim
51 *
52 * Arguments:
53 * ==========
54 *
55 *> \param[in] JOBA
56 *> \verbatim
57 *> JOBA is CHARACTER*1
58 *> Specifies the structure of A.
59 *> = 'L': The input matrix A is lower triangular;
60 *> = 'U': The input matrix A is upper triangular;
61 *> = 'G': The input matrix A is general M-by-N matrix, M >= N.
62 *> \endverbatim
63 *>
64 *> \param[in] JOBU
65 *> \verbatim
66 *> JOBU is CHARACTER*1
67 *> Specifies whether to compute the left singular vectors
68 *> (columns of U):
69 *> = 'U': The left singular vectors corresponding to the nonzero
70 *> singular values are computed and returned in the leading
71 *> columns of A. See more details in the description of A.
72 *> The default numerical orthogonality threshold is set to
73 *> approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=SLAMCH('E').
74 *> = 'C': Analogous to JOBU='U', except that user can control the
75 *> level of numerical orthogonality of the computed left
76 *> singular vectors. TOL can be set to TOL = CTOL*EPS, where
77 *> CTOL is given on input in the array WORK.
78 *> No CTOL smaller than ONE is allowed. CTOL greater
79 *> than 1 / EPS is meaningless. The option 'C'
80 *> can be used if M*EPS is satisfactory orthogonality
81 *> of the computed left singular vectors, so CTOL=M could
82 *> save few sweeps of Jacobi rotations.
83 *> See the descriptions of A and WORK(1).
84 *> = 'N': The matrix U is not computed. However, see the
85 *> description of A.
86 *> \endverbatim
87 *>
88 *> \param[in] JOBV
89 *> \verbatim
90 *> JOBV is CHARACTER*1
91 *> Specifies whether to compute the right singular vectors, that
92 *> is, the matrix V:
93 *> = 'V': the matrix V is computed and returned in the array V
94 *> = 'A': the Jacobi rotations are applied to the MV-by-N
95 *> array V. In other words, the right singular vector
96 *> matrix V is not computed explicitly; instead it is
97 *> applied to an MV-by-N matrix initially stored in the
98 *> first MV rows of V.
99 *> = 'N': the matrix V is not computed and the array V is not
100 *> referenced
101 *> \endverbatim
102 *>
103 *> \param[in] M
104 *> \verbatim
105 *> M is INTEGER
106 *> The number of rows of the input matrix A. 1/SLAMCH('E') > M >= 0.
107 *> \endverbatim
108 *>
109 *> \param[in] N
110 *> \verbatim
111 *> N is INTEGER
112 *> The number of columns of the input matrix A.
113 *> M >= N >= 0.
114 *> \endverbatim
115 *>
116 *> \param[in,out] A
117 *> \verbatim
118 *> A is REAL array, dimension (LDA,N)
119 *> On entry, the M-by-N matrix A.
120 *> On exit,
121 *> If JOBU = 'U' .OR. JOBU = 'C':
122 *> If INFO = 0:
123 *> RANKA orthonormal columns of U are returned in the
124 *> leading RANKA columns of the array A. Here RANKA <= N
125 *> is the number of computed singular values of A that are
126 *> above the underflow threshold SLAMCH('S'). The singular
127 *> vectors corresponding to underflowed or zero singular
128 *> values are not computed. The value of RANKA is returned
129 *> in the array WORK as RANKA=NINT(WORK(2)). Also see the
130 *> descriptions of SVA and WORK. The computed columns of U
131 *> are mutually numerically orthogonal up to approximately
132 *> TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU = 'C'),
133 *> see the description of JOBU.
134 *> If INFO > 0,
135 *> the procedure SGESVJ did not converge in the given number
136 *> of iterations (sweeps). In that case, the computed
137 *> columns of U may not be orthogonal up to TOL. The output
138 *> U (stored in A), SIGMA (given by the computed singular
139 *> values in SVA(1:N)) and V is still a decomposition of the
140 *> input matrix A in the sense that the residual
141 *> ||A-SCALE*U*SIGMA*V^T||_2 / ||A||_2 is small.
142 *> If JOBU = 'N':
143 *> If INFO = 0:
144 *> Note that the left singular vectors are 'for free' in the
145 *> one-sided Jacobi SVD algorithm. However, if only the
146 *> singular values are needed, the level of numerical
147 *> orthogonality of U is not an issue and iterations are
148 *> stopped when the columns of the iterated matrix are
149 *> numerically orthogonal up to approximately M*EPS. Thus,
150 *> on exit, A contains the columns of U scaled with the
151 *> corresponding singular values.
152 *> If INFO > 0:
153 *> the procedure SGESVJ did not converge in the given number
154 *> of iterations (sweeps).
155 *> \endverbatim
156 *>
157 *> \param[in] LDA
158 *> \verbatim
159 *> LDA is INTEGER
160 *> The leading dimension of the array A. LDA >= max(1,M).
161 *> \endverbatim
162 *>
163 *> \param[out] SVA
164 *> \verbatim
165 *> SVA is REAL array, dimension (N)
166 *> On exit,
167 *> If INFO = 0 :
168 *> depending on the value SCALE = WORK(1), we have:
169 *> If SCALE = ONE:
170 *> SVA(1:N) contains the computed singular values of A.
171 *> During the computation SVA contains the Euclidean column
172 *> norms of the iterated matrices in the array A.
173 *> If SCALE .NE. ONE:
174 *> The singular values of A are SCALE*SVA(1:N), and this
175 *> factored representation is due to the fact that some of the
176 *> singular values of A might underflow or overflow.
177 *>
178 *> If INFO > 0 :
179 *> the procedure SGESVJ did not converge in the given number of
180 *> iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.
181 *> \endverbatim
182 *>
183 *> \param[in] MV
184 *> \verbatim
185 *> MV is INTEGER
186 *> If JOBV = 'A', then the product of Jacobi rotations in SGESVJ
187 *> is applied to the first MV rows of V. See the description of JOBV.
188 *> \endverbatim
189 *>
190 *> \param[in,out] V
191 *> \verbatim
192 *> V is REAL array, dimension (LDV,N)
193 *> If JOBV = 'V', then V contains on exit the N-by-N matrix of
194 *> the right singular vectors;
195 *> If JOBV = 'A', then V contains the product of the computed right
196 *> singular vector matrix and the initial matrix in
197 *> the array V.
198 *> If JOBV = 'N', then V is not referenced.
199 *> \endverbatim
200 *>
201 *> \param[in] LDV
202 *> \verbatim
203 *> LDV is INTEGER
204 *> The leading dimension of the array V, LDV >= 1.
205 *> If JOBV = 'V', then LDV >= max(1,N).
206 *> If JOBV = 'A', then LDV >= max(1,MV) .
207 *> \endverbatim
208 *>
209 *> \param[in,out] WORK
210 *> \verbatim
211 *> WORK is REAL array, dimension (LWORK)
212 *> On entry,
213 *> If JOBU = 'C' :
214 *> WORK(1) = CTOL, where CTOL defines the threshold for convergence.
215 *> The process stops if all columns of A are mutually
216 *> orthogonal up to CTOL*EPS, EPS=SLAMCH('E').
217 *> It is required that CTOL >= ONE, i.e. it is not
218 *> allowed to force the routine to obtain orthogonality
219 *> below EPSILON.
220 *> On exit,
221 *> WORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N)
222 *> are the computed singular vcalues of A.
223 *> (See description of SVA().)
224 *> WORK(2) = NINT(WORK(2)) is the number of the computed nonzero
225 *> singular values.
226 *> WORK(3) = NINT(WORK(3)) is the number of the computed singular
227 *> values that are larger than the underflow threshold.
228 *> WORK(4) = NINT(WORK(4)) is the number of sweeps of Jacobi
229 *> rotations needed for numerical convergence.
230 *> WORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep.
231 *> This is useful information in cases when SGESVJ did
232 *> not converge, as it can be used to estimate whether
233 *> the output is still useful and for post festum analysis.
234 *> WORK(6) = the largest absolute value over all sines of the
235 *> Jacobi rotation angles in the last sweep. It can be
236 *> useful for a post festum analysis.
237 *> \endverbatim
238 *>
239 *> \param[in] LWORK
240 *> \verbatim
241 *> LWORK is INTEGER
242 *> length of WORK, WORK >= MAX(6,M+N)
243 *> \endverbatim
244 *>
245 *> \param[out] INFO
246 *> \verbatim
247 *> INFO is INTEGER
248 *> = 0: successful exit.
249 *> < 0: if INFO = -i, then the i-th argument had an illegal value
250 *> > 0: SGESVJ did not converge in the maximal allowed number (30)
251 *> of sweeps. The output may still be useful. See the
252 *> description of WORK.
253 *> \endverbatim
254 *
255 * Authors:
256 * ========
257 *
258 *> \author Univ. of Tennessee
259 *> \author Univ. of California Berkeley
260 *> \author Univ. of Colorado Denver
261 *> \author NAG Ltd.
262 *
263 *> \ingroup realGEcomputational
264 *
265 *> \par Further Details:
266 * =====================
267 *>
268 *> The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane
269 *> rotations. The rotations are implemented as fast scaled rotations of
270 *> Anda and Park [1]. In the case of underflow of the Jacobi angle, a
271 *> modified Jacobi transformation of Drmac [4] is used. Pivot strategy uses
272 *> column interchanges of de Rijk [2]. The relative accuracy of the computed
273 *> singular values and the accuracy of the computed singular vectors (in
274 *> angle metric) is as guaranteed by the theory of Demmel and Veselic [3].
275 *> The condition number that determines the accuracy in the full rank case
276 *> is essentially min_{D=diag} kappa(A*D), where kappa(.) is the
277 *> spectral condition number. The best performance of this Jacobi SVD
278 *> procedure is achieved if used in an accelerated version of Drmac and
279 *> Veselic [5,6], and it is the kernel routine in the SIGMA library [7].
280 *> Some tuning parameters (marked with [TP]) are available for the
281 *> implementer. \n
282 *> The computational range for the nonzero singular values is the machine
283 *> number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even
284 *> denormalized singular values can be computed with the corresponding
285 *> gradual loss of accurate digits.
286 *>
287 *> \par Contributors:
288 * ==================
289 *>
290 *> Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany)
291 *>
292 *> \par References:
293 * ================
294 *>
295 *> [1] A. A. Anda and H. Park: Fast plane rotations with dynamic scaling. \n
296 *> SIAM J. matrix Anal. Appl., Vol. 15 (1994), pp. 162-174. \n\n
297 *> [2] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the
298 *> singular value decomposition on a vector computer. \n
299 *> SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371. \n\n
300 *> [3] J. Demmel and K. Veselic: Jacobi method is more accurate than QR. \n
301 *> [4] Z. Drmac: Implementation of Jacobi rotations for accurate singular
302 *> value computation in floating point arithmetic. \n
303 *> SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222. \n\n
304 *> [5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I. \n
305 *> SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342. \n
306 *> LAPACK Working note 169. \n\n
307 *> [6] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II. \n
308 *> SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362. \n
309 *> LAPACK Working note 170. \n\n
310 *> [7] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
311 *> QSVD, (H,K)-SVD computations.\n
312 *> Department of Mathematics, University of Zagreb, 2008.
313 *>
314 *> \par Bugs, Examples and Comments:
315 * =================================
316 *>
317 *> Please report all bugs and send interesting test examples and comments to
318 *> drmac@math.hr. Thank you.
319 *
320 * =====================================================================
321  SUBROUTINE sgesvj( JOBA, JOBU, JOBV, M, N, A, LDA, SVA, MV, V,
322  \$ LDV, WORK, LWORK, INFO )
323 *
324 * -- LAPACK computational routine --
325 * -- LAPACK is a software package provided by Univ. of Tennessee, --
326 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
327 *
328 * .. Scalar Arguments ..
329  INTEGER INFO, LDA, LDV, LWORK, M, MV, N
330  CHARACTER*1 JOBA, JOBU, JOBV
331 * ..
332 * .. Array Arguments ..
333  REAL A( LDA, * ), SVA( N ), V( LDV, * ),
334  \$ work( lwork )
335 * ..
336 *
337 * =====================================================================
338 *
339 * .. Local Parameters ..
340  REAL ZERO, HALF, ONE
341  parameter( zero = 0.0e0, half = 0.5e0, one = 1.0e0)
342  INTEGER NSWEEP
343  parameter( nsweep = 30 )
344 * ..
345 * .. Local Scalars ..
346  REAL AAPP, AAPP0, AAPQ, AAQQ, APOAQ, AQOAP, BIG,
347  \$ bigtheta, cs, ctol, epsln, large, mxaapq,
348  \$ mxsinj, rootbig, rooteps, rootsfmin, roottol,
349  \$ skl, sfmin, small, sn, t, temp1, theta,
350  \$ thsign, tol
351  INTEGER BLSKIP, EMPTSW, i, ibr, IERR, igl, IJBLSK, ir1,
352  \$ iswrot, jbc, jgl, kbl, lkahead, mvl, n2, n34,
353  \$ n4, nbl, notrot, p, pskipped, q, rowskip,
354  \$ swband
355  LOGICAL APPLV, GOSCALE, LOWER, LSVEC, NOSCALE, ROTOK,
356  \$ rsvec, uctol, upper
357 * ..
358 * .. Local Arrays ..
359  REAL FASTR( 5 )
360 * ..
361 * .. Intrinsic Functions ..
362  INTRINSIC abs, max, min, float, sign, sqrt
363 * ..
364 * .. External Functions ..
365 * ..
366 * from BLAS
367  REAL SDOT, SNRM2
368  EXTERNAL sdot, snrm2
369  INTEGER ISAMAX
370  EXTERNAL isamax
371 * from LAPACK
372  REAL SLAMCH
373  EXTERNAL slamch
374  LOGICAL LSAME
375  EXTERNAL lsame
376 * ..
377 * .. External Subroutines ..
378 * ..
379 * from BLAS
380  EXTERNAL saxpy, scopy, srotm, sscal, sswap
381 * from LAPACK
382  EXTERNAL slascl, slaset, slassq, xerbla
383 *
384  EXTERNAL sgsvj0, sgsvj1
385 * ..
386 * .. Executable Statements ..
387 *
388 * Test the input arguments
389 *
390  lsvec = lsame( jobu, 'U' )
391  uctol = lsame( jobu, 'C' )
392  rsvec = lsame( jobv, 'V' )
393  applv = lsame( jobv, 'A' )
394  upper = lsame( joba, 'U' )
395  lower = lsame( joba, 'L' )
396 *
397  IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
398  info = -1
399  ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
400  info = -2
401  ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
402  info = -3
403  ELSE IF( m.LT.0 ) THEN
404  info = -4
405  ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
406  info = -5
407  ELSE IF( lda.LT.m ) THEN
408  info = -7
409  ELSE IF( mv.LT.0 ) THEN
410  info = -9
411  ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
412  \$ ( applv .AND. ( ldv.LT.mv ) ) ) THEN
413  info = -11
414  ELSE IF( uctol .AND. ( work( 1 ).LE.one ) ) THEN
415  info = -12
416  ELSE IF( lwork.LT.max( m+n, 6 ) ) THEN
417  info = -13
418  ELSE
419  info = 0
420  END IF
421 *
422 * #:(
423  IF( info.NE.0 ) THEN
424  CALL xerbla( 'SGESVJ', -info )
425  RETURN
426  END IF
427 *
428 * #:) Quick return for void matrix
429 *
430  IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
431 *
432 * Set numerical parameters
433 * The stopping criterion for Jacobi rotations is
434 *
435 * max_{i<>j}|A(:,i)^T * A(:,j)|/(||A(:,i)||*||A(:,j)||) < CTOL*EPS
436 *
437 * where EPS is the round-off and CTOL is defined as follows:
438 *
439  IF( uctol ) THEN
440 * ... user controlled
441  ctol = work( 1 )
442  ELSE
443 * ... default
444  IF( lsvec .OR. rsvec .OR. applv ) THEN
445  ctol = sqrt( float( m ) )
446  ELSE
447  ctol = float( m )
448  END IF
449  END IF
450 * ... and the machine dependent parameters are
451 *[!] (Make sure that SLAMCH() works properly on the target machine.)
452 *
453  epsln = slamch( 'Epsilon' )
454  rooteps = sqrt( epsln )
455  sfmin = slamch( 'SafeMinimum' )
456  rootsfmin = sqrt( sfmin )
457  small = sfmin / epsln
458  big = slamch( 'Overflow' )
459 * BIG = ONE / SFMIN
460  rootbig = one / rootsfmin
461  large = big / sqrt( float( m*n ) )
462  bigtheta = one / rooteps
463 *
464  tol = ctol*epsln
465  roottol = sqrt( tol )
466 *
467  IF( float( m )*epsln.GE.one ) THEN
468  info = -4
469  CALL xerbla( 'SGESVJ', -info )
470  RETURN
471  END IF
472 *
473 * Initialize the right singular vector matrix.
474 *
475  IF( rsvec ) THEN
476  mvl = n
477  CALL slaset( 'A', mvl, n, zero, one, v, ldv )
478  ELSE IF( applv ) THEN
479  mvl = mv
480  END IF
481  rsvec = rsvec .OR. applv
482 *
483 * Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
484 *(!) If necessary, scale A to protect the largest singular value
485 * from overflow. It is possible that saving the largest singular
486 * value destroys the information about the small ones.
487 * This initial scaling is almost minimal in the sense that the
488 * goal is to make sure that no column norm overflows, and that
489 * SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
490 * in A are detected, the procedure returns with INFO=-6.
491 *
492  skl = one / sqrt( float( m )*float( n ) )
493  noscale = .true.
494  goscale = .true.
495 *
496  IF( lower ) THEN
497 * the input matrix is M-by-N lower triangular (trapezoidal)
498  DO 1874 p = 1, n
499  aapp = zero
500  aaqq = one
501  CALL slassq( m-p+1, a( p, p ), 1, aapp, aaqq )
502  IF( aapp.GT.big ) THEN
503  info = -6
504  CALL xerbla( 'SGESVJ', -info )
505  RETURN
506  END IF
507  aaqq = sqrt( aaqq )
508  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
509  sva( p ) = aapp*aaqq
510  ELSE
511  noscale = .false.
512  sva( p ) = aapp*( aaqq*skl )
513  IF( goscale ) THEN
514  goscale = .false.
515  DO 1873 q = 1, p - 1
516  sva( q ) = sva( q )*skl
517  1873 CONTINUE
518  END IF
519  END IF
520  1874 CONTINUE
521  ELSE IF( upper ) THEN
522 * the input matrix is M-by-N upper triangular (trapezoidal)
523  DO 2874 p = 1, n
524  aapp = zero
525  aaqq = one
526  CALL slassq( p, a( 1, p ), 1, aapp, aaqq )
527  IF( aapp.GT.big ) THEN
528  info = -6
529  CALL xerbla( 'SGESVJ', -info )
530  RETURN
531  END IF
532  aaqq = sqrt( aaqq )
533  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
534  sva( p ) = aapp*aaqq
535  ELSE
536  noscale = .false.
537  sva( p ) = aapp*( aaqq*skl )
538  IF( goscale ) THEN
539  goscale = .false.
540  DO 2873 q = 1, p - 1
541  sva( q ) = sva( q )*skl
542  2873 CONTINUE
543  END IF
544  END IF
545  2874 CONTINUE
546  ELSE
547 * the input matrix is M-by-N general dense
548  DO 3874 p = 1, n
549  aapp = zero
550  aaqq = one
551  CALL slassq( m, a( 1, p ), 1, aapp, aaqq )
552  IF( aapp.GT.big ) THEN
553  info = -6
554  CALL xerbla( 'SGESVJ', -info )
555  RETURN
556  END IF
557  aaqq = sqrt( aaqq )
558  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
559  sva( p ) = aapp*aaqq
560  ELSE
561  noscale = .false.
562  sva( p ) = aapp*( aaqq*skl )
563  IF( goscale ) THEN
564  goscale = .false.
565  DO 3873 q = 1, p - 1
566  sva( q ) = sva( q )*skl
567  3873 CONTINUE
568  END IF
569  END IF
570  3874 CONTINUE
571  END IF
572 *
573  IF( noscale )skl = one
574 *
575 * Move the smaller part of the spectrum from the underflow threshold
576 *(!) Start by determining the position of the nonzero entries of the
577 * array SVA() relative to ( SFMIN, BIG ).
578 *
579  aapp = zero
580  aaqq = big
581  DO 4781 p = 1, n
582  IF( sva( p ).NE.zero )aaqq = min( aaqq, sva( p ) )
583  aapp = max( aapp, sva( p ) )
584  4781 CONTINUE
585 *
586 * #:) Quick return for zero matrix
587 *
588  IF( aapp.EQ.zero ) THEN
589  IF( lsvec )CALL slaset( 'G', m, n, zero, one, a, lda )
590  work( 1 ) = one
591  work( 2 ) = zero
592  work( 3 ) = zero
593  work( 4 ) = zero
594  work( 5 ) = zero
595  work( 6 ) = zero
596  RETURN
597  END IF
598 *
599 * #:) Quick return for one-column matrix
600 *
601  IF( n.EQ.1 ) THEN
602  IF( lsvec )CALL slascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
603  \$ a( 1, 1 ), lda, ierr )
604  work( 1 ) = one / skl
605  IF( sva( 1 ).GE.sfmin ) THEN
606  work( 2 ) = one
607  ELSE
608  work( 2 ) = zero
609  END IF
610  work( 3 ) = zero
611  work( 4 ) = zero
612  work( 5 ) = zero
613  work( 6 ) = zero
614  RETURN
615  END IF
616 *
617 * Protect small singular values from underflow, and try to
618 * avoid underflows/overflows in computing Jacobi rotations.
619 *
620  sn = sqrt( sfmin / epsln )
621  temp1 = sqrt( big / float( n ) )
622  IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
623  \$ ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
624  temp1 = min( big, temp1 / aapp )
625 * AAQQ = AAQQ*TEMP1
626 * AAPP = AAPP*TEMP1
627  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
628  temp1 = min( sn / aaqq, big / ( aapp*sqrt( float( n ) ) ) )
629 * AAQQ = AAQQ*TEMP1
630 * AAPP = AAPP*TEMP1
631  ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
632  temp1 = max( sn / aaqq, temp1 / aapp )
633 * AAQQ = AAQQ*TEMP1
634 * AAPP = AAPP*TEMP1
635  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
636  temp1 = min( sn / aaqq, big / ( sqrt( float( n ) )*aapp ) )
637 * AAQQ = AAQQ*TEMP1
638 * AAPP = AAPP*TEMP1
639  ELSE
640  temp1 = one
641  END IF
642 *
643 * Scale, if necessary
644 *
645  IF( temp1.NE.one ) THEN
646  CALL slascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
647  END IF
648  skl = temp1*skl
649  IF( skl.NE.one ) THEN
650  CALL slascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
651  skl = one / skl
652  END IF
653 *
654 * Row-cyclic Jacobi SVD algorithm with column pivoting
655 *
656  emptsw = ( n*( n-1 ) ) / 2
657  notrot = 0
658  fastr( 1 ) = zero
659 *
660 * A is represented in factored form A = A * diag(WORK), where diag(WORK)
661 * is initialized to identity. WORK is updated during fast scaled
662 * rotations.
663 *
664  DO 1868 q = 1, n
665  work( q ) = one
666  1868 CONTINUE
667 *
668 *
669  swband = 3
670 *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
671 * if SGESVJ is used as a computational routine in the preconditioned
672 * Jacobi SVD algorithm SGESVJ. For sweeps i=1:SWBAND the procedure
673 * works on pivots inside a band-like region around the diagonal.
674 * The boundaries are determined dynamically, based on the number of
675 * pivots above a threshold.
676 *
677  kbl = min( 8, n )
678 *[TP] KBL is a tuning parameter that defines the tile size in the
679 * tiling of the p-q loops of pivot pairs. In general, an optimal
680 * value of KBL depends on the matrix dimensions and on the
681 * parameters of the computer's memory.
682 *
683  nbl = n / kbl
684  IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
685 *
686  blskip = kbl**2
687 *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
688 *
689  rowskip = min( 5, kbl )
690 *[TP] ROWSKIP is a tuning parameter.
691 *
693 *[TP] LKAHEAD is a tuning parameter.
694 *
695 * Quasi block transformations, using the lower (upper) triangular
696 * structure of the input matrix. The quasi-block-cycling usually
697 * invokes cubic convergence. Big part of this cycle is done inside
698 * canonical subspaces of dimensions less than M.
699 *
700  IF( ( lower .OR. upper ) .AND. ( n.GT.max( 64, 4*kbl ) ) ) THEN
701 *[TP] The number of partition levels and the actual partition are
702 * tuning parameters.
703  n4 = n / 4
704  n2 = n / 2
705  n34 = 3*n4
706  IF( applv ) THEN
707  q = 0
708  ELSE
709  q = 1
710  END IF
711 *
712  IF( lower ) THEN
713 *
714 * This works very well on lower triangular matrices, in particular
715 * in the framework of the preconditioned Jacobi SVD (xGEJSV).
716 * The idea is simple:
717 * [+ 0 0 0] Note that Jacobi transformations of [0 0]
718 * [+ + 0 0] [0 0]
719 * [+ + x 0] actually work on [x 0] [x 0]
720 * [+ + x x] [x x]. [x x]
721 *
722  CALL sgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
723  \$ work( n34+1 ), sva( n34+1 ), mvl,
724  \$ v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
725  \$ 2, work( n+1 ), lwork-n, ierr )
726 *
727  CALL sgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
728  \$ work( n2+1 ), sva( n2+1 ), mvl,
729  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
730  \$ work( n+1 ), lwork-n, ierr )
731 *
732  CALL sgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
733  \$ work( n2+1 ), sva( n2+1 ), mvl,
734  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
735  \$ work( n+1 ), lwork-n, ierr )
736 *
737  CALL sgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
738  \$ work( n4+1 ), sva( n4+1 ), mvl,
739  \$ v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
740  \$ work( n+1 ), lwork-n, ierr )
741 *
742  CALL sgsvj0( jobv, m, n4, a, lda, work, sva, mvl, v, ldv,
743  \$ epsln, sfmin, tol, 1, work( n+1 ), lwork-n,
744  \$ ierr )
745 *
746  CALL sgsvj1( jobv, m, n2, n4, a, lda, work, sva, mvl, v,
747  \$ ldv, epsln, sfmin, tol, 1, work( n+1 ),
748  \$ lwork-n, ierr )
749 *
750 *
751  ELSE IF( upper ) THEN
752 *
753 *
754  CALL sgsvj0( jobv, n4, n4, a, lda, work, sva, mvl, v, ldv,
755  \$ epsln, sfmin, tol, 2, work( n+1 ), lwork-n,
756  \$ ierr )
757 *
758  CALL sgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, work( n4+1 ),
759  \$ sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
760  \$ epsln, sfmin, tol, 1, work( n+1 ), lwork-n,
761  \$ ierr )
762 *
763  CALL sgsvj1( jobv, n2, n2, n4, a, lda, work, sva, mvl, v,
764  \$ ldv, epsln, sfmin, tol, 1, work( n+1 ),
765  \$ lwork-n, ierr )
766 *
767  CALL sgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
768  \$ work( n2+1 ), sva( n2+1 ), mvl,
769  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
770  \$ work( n+1 ), lwork-n, ierr )
771
772  END IF
773 *
774  END IF
775 *
776 * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
777 *
778  DO 1993 i = 1, nsweep
779 *
780 * .. go go go ...
781 *
782  mxaapq = zero
783  mxsinj = zero
784  iswrot = 0
785 *
786  notrot = 0
787  pskipped = 0
788 *
789 * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
790 * 1 <= p < q <= N. This is the first step toward a blocked implementation
791 * of the rotations. New implementation, based on block transformations,
792 * is under development.
793 *
794  DO 2000 ibr = 1, nbl
795 *
796  igl = ( ibr-1 )*kbl + 1
797 *
798  DO 1002 ir1 = 0, min( lkahead, nbl-ibr )
799 *
800  igl = igl + ir1*kbl
801 *
802  DO 2001 p = igl, min( igl+kbl-1, n-1 )
803 *
804 * .. de Rijk's pivoting
805 *
806  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
807  IF( p.NE.q ) THEN
808  CALL sswap( m, a( 1, p ), 1, a( 1, q ), 1 )
809  IF( rsvec )CALL sswap( mvl, v( 1, p ), 1,
810  \$ v( 1, q ), 1 )
811  temp1 = sva( p )
812  sva( p ) = sva( q )
813  sva( q ) = temp1
814  temp1 = work( p )
815  work( p ) = work( q )
816  work( q ) = temp1
817  END IF
818 *
819  IF( ir1.EQ.0 ) THEN
820 *
821 * Column norms are periodically updated by explicit
822 * norm computation.
823 * Caveat:
824 * Unfortunately, some BLAS implementations compute SNRM2(M,A(1,p),1)
825 * as SQRT(SDOT(M,A(1,p),1,A(1,p),1)), which may cause the result to
826 * overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
827 * underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
828 * Hence, SNRM2 cannot be trusted, not even in the case when
829 * the true norm is far from the under(over)flow boundaries.
830 * If properly implemented SNRM2 is available, the IF-THEN-ELSE
831 * below should read "AAPP = SNRM2( M, A(1,p), 1 ) * WORK(p)".
832 *
833  IF( ( sva( p ).LT.rootbig ) .AND.
834  \$ ( sva( p ).GT.rootsfmin ) ) THEN
835  sva( p ) = snrm2( m, a( 1, p ), 1 )*work( p )
836  ELSE
837  temp1 = zero
838  aapp = one
839  CALL slassq( m, a( 1, p ), 1, temp1, aapp )
840  sva( p ) = temp1*sqrt( aapp )*work( p )
841  END IF
842  aapp = sva( p )
843  ELSE
844  aapp = sva( p )
845  END IF
846 *
847  IF( aapp.GT.zero ) THEN
848 *
849  pskipped = 0
850 *
851  DO 2002 q = p + 1, min( igl+kbl-1, n )
852 *
853  aaqq = sva( q )
854 *
855  IF( aaqq.GT.zero ) THEN
856 *
857  aapp0 = aapp
858  IF( aaqq.GE.one ) THEN
859  rotok = ( small*aapp ).LE.aaqq
860  IF( aapp.LT.( big / aaqq ) ) THEN
861  aapq = ( sdot( m, a( 1, p ), 1, a( 1,
862  \$ q ), 1 )*work( p )*work( q ) /
863  \$ aaqq ) / aapp
864  ELSE
865  CALL scopy( m, a( 1, p ), 1,
866  \$ work( n+1 ), 1 )
867  CALL slascl( 'G', 0, 0, aapp,
868  \$ work( p ), m, 1,
869  \$ work( n+1 ), lda, ierr )
870  aapq = sdot( m, work( n+1 ), 1,
871  \$ a( 1, q ), 1 )*work( q ) / aaqq
872  END IF
873  ELSE
874  rotok = aapp.LE.( aaqq / small )
875  IF( aapp.GT.( small / aaqq ) ) THEN
876  aapq = ( sdot( m, a( 1, p ), 1, a( 1,
877  \$ q ), 1 )*work( p )*work( q ) /
878  \$ aaqq ) / aapp
879  ELSE
880  CALL scopy( m, a( 1, q ), 1,
881  \$ work( n+1 ), 1 )
882  CALL slascl( 'G', 0, 0, aaqq,
883  \$ work( q ), m, 1,
884  \$ work( n+1 ), lda, ierr )
885  aapq = sdot( m, work( n+1 ), 1,
886  \$ a( 1, p ), 1 )*work( p ) / aapp
887  END IF
888  END IF
889 *
890  mxaapq = max( mxaapq, abs( aapq ) )
891 *
892 * TO rotate or NOT to rotate, THAT is the question ...
893 *
894  IF( abs( aapq ).GT.tol ) THEN
895 *
896 * .. rotate
897 *[RTD] ROTATED = ROTATED + ONE
898 *
899  IF( ir1.EQ.0 ) THEN
900  notrot = 0
901  pskipped = 0
902  iswrot = iswrot + 1
903  END IF
904 *
905  IF( rotok ) THEN
906 *
907  aqoap = aaqq / aapp
908  apoaq = aapp / aaqq
909  theta = -half*abs( aqoap-apoaq ) / aapq
910 *
911  IF( abs( theta ).GT.bigtheta ) THEN
912 *
913  t = half / theta
914  fastr( 3 ) = t*work( p ) / work( q )
915  fastr( 4 ) = -t*work( q ) /
916  \$ work( p )
917  CALL srotm( m, a( 1, p ), 1,
918  \$ a( 1, q ), 1, fastr )
919  IF( rsvec )CALL srotm( mvl,
920  \$ v( 1, p ), 1,
921  \$ v( 1, q ), 1,
922  \$ fastr )
923  sva( q ) = aaqq*sqrt( max( zero,
924  \$ one+t*apoaq*aapq ) )
925  aapp = aapp*sqrt( max( zero,
926  \$ one-t*aqoap*aapq ) )
927  mxsinj = max( mxsinj, abs( t ) )
928 *
929  ELSE
930 *
931 * .. choose correct signum for THETA and rotate
932 *
933  thsign = -sign( one, aapq )
934  t = one / ( theta+thsign*
935  \$ sqrt( one+theta*theta ) )
936  cs = sqrt( one / ( one+t*t ) )
937  sn = t*cs
938 *
939  mxsinj = max( mxsinj, abs( sn ) )
940  sva( q ) = aaqq*sqrt( max( zero,
941  \$ one+t*apoaq*aapq ) )
942  aapp = aapp*sqrt( max( zero,
943  \$ one-t*aqoap*aapq ) )
944 *
945  apoaq = work( p ) / work( q )
946  aqoap = work( q ) / work( p )
947  IF( work( p ).GE.one ) THEN
948  IF( work( q ).GE.one ) THEN
949  fastr( 3 ) = t*apoaq
950  fastr( 4 ) = -t*aqoap
951  work( p ) = work( p )*cs
952  work( q ) = work( q )*cs
953  CALL srotm( m, a( 1, p ), 1,
954  \$ a( 1, q ), 1,
955  \$ fastr )
956  IF( rsvec )CALL srotm( mvl,
957  \$ v( 1, p ), 1, v( 1, q ),
958  \$ 1, fastr )
959  ELSE
960  CALL saxpy( m, -t*aqoap,
961  \$ a( 1, q ), 1,
962  \$ a( 1, p ), 1 )
963  CALL saxpy( m, cs*sn*apoaq,
964  \$ a( 1, p ), 1,
965  \$ a( 1, q ), 1 )
966  work( p ) = work( p )*cs
967  work( q ) = work( q ) / cs
968  IF( rsvec ) THEN
969  CALL saxpy( mvl, -t*aqoap,
970  \$ v( 1, q ), 1,
971  \$ v( 1, p ), 1 )
972  CALL saxpy( mvl,
973  \$ cs*sn*apoaq,
974  \$ v( 1, p ), 1,
975  \$ v( 1, q ), 1 )
976  END IF
977  END IF
978  ELSE
979  IF( work( q ).GE.one ) THEN
980  CALL saxpy( m, t*apoaq,
981  \$ a( 1, p ), 1,
982  \$ a( 1, q ), 1 )
983  CALL saxpy( m, -cs*sn*aqoap,
984  \$ a( 1, q ), 1,
985  \$ a( 1, p ), 1 )
986  work( p ) = work( p ) / cs
987  work( q ) = work( q )*cs
988  IF( rsvec ) THEN
989  CALL saxpy( mvl, t*apoaq,
990  \$ v( 1, p ), 1,
991  \$ v( 1, q ), 1 )
992  CALL saxpy( mvl,
993  \$ -cs*sn*aqoap,
994  \$ v( 1, q ), 1,
995  \$ v( 1, p ), 1 )
996  END IF
997  ELSE
998  IF( work( p ).GE.work( q ) )
999  \$ THEN
1000  CALL saxpy( m, -t*aqoap,
1001  \$ a( 1, q ), 1,
1002  \$ a( 1, p ), 1 )
1003  CALL saxpy( m, cs*sn*apoaq,
1004  \$ a( 1, p ), 1,
1005  \$ a( 1, q ), 1 )
1006  work( p ) = work( p )*cs
1007  work( q ) = work( q ) / cs
1008  IF( rsvec ) THEN
1009  CALL saxpy( mvl,
1010  \$ -t*aqoap,
1011  \$ v( 1, q ), 1,
1012  \$ v( 1, p ), 1 )
1013  CALL saxpy( mvl,
1014  \$ cs*sn*apoaq,
1015  \$ v( 1, p ), 1,
1016  \$ v( 1, q ), 1 )
1017  END IF
1018  ELSE
1019  CALL saxpy( m, t*apoaq,
1020  \$ a( 1, p ), 1,
1021  \$ a( 1, q ), 1 )
1022  CALL saxpy( m,
1023  \$ -cs*sn*aqoap,
1024  \$ a( 1, q ), 1,
1025  \$ a( 1, p ), 1 )
1026  work( p ) = work( p ) / cs
1027  work( q ) = work( q )*cs
1028  IF( rsvec ) THEN
1029  CALL saxpy( mvl,
1030  \$ t*apoaq, v( 1, p ),
1031  \$ 1, v( 1, q ), 1 )
1032  CALL saxpy( mvl,
1033  \$ -cs*sn*aqoap,
1034  \$ v( 1, q ), 1,
1035  \$ v( 1, p ), 1 )
1036  END IF
1037  END IF
1038  END IF
1039  END IF
1040  END IF
1041 *
1042  ELSE
1043 * .. have to use modified Gram-Schmidt like transformation
1044  CALL scopy( m, a( 1, p ), 1,
1045  \$ work( n+1 ), 1 )
1046  CALL slascl( 'G', 0, 0, aapp, one, m,
1047  \$ 1, work( n+1 ), lda,
1048  \$ ierr )
1049  CALL slascl( 'G', 0, 0, aaqq, one, m,
1050  \$ 1, a( 1, q ), lda, ierr )
1051  temp1 = -aapq*work( p ) / work( q )
1052  CALL saxpy( m, temp1, work( n+1 ), 1,
1053  \$ a( 1, q ), 1 )
1054  CALL slascl( 'G', 0, 0, one, aaqq, m,
1055  \$ 1, a( 1, q ), lda, ierr )
1056  sva( q ) = aaqq*sqrt( max( zero,
1057  \$ one-aapq*aapq ) )
1058  mxsinj = max( mxsinj, sfmin )
1059  END IF
1060 * END IF ROTOK THEN ... ELSE
1061 *
1062 * In the case of cancellation in updating SVA(q), SVA(p)
1063 * recompute SVA(q), SVA(p).
1064 *
1065  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1066  \$ THEN
1067  IF( ( aaqq.LT.rootbig ) .AND.
1068  \$ ( aaqq.GT.rootsfmin ) ) THEN
1069  sva( q ) = snrm2( m, a( 1, q ), 1 )*
1070  \$ work( q )
1071  ELSE
1072  t = zero
1073  aaqq = one
1074  CALL slassq( m, a( 1, q ), 1, t,
1075  \$ aaqq )
1076  sva( q ) = t*sqrt( aaqq )*work( q )
1077  END IF
1078  END IF
1079  IF( ( aapp / aapp0 ).LE.rooteps ) THEN
1080  IF( ( aapp.LT.rootbig ) .AND.
1081  \$ ( aapp.GT.rootsfmin ) ) THEN
1082  aapp = snrm2( m, a( 1, p ), 1 )*
1083  \$ work( p )
1084  ELSE
1085  t = zero
1086  aapp = one
1087  CALL slassq( m, a( 1, p ), 1, t,
1088  \$ aapp )
1089  aapp = t*sqrt( aapp )*work( p )
1090  END IF
1091  sva( p ) = aapp
1092  END IF
1093 *
1094  ELSE
1095 * A(:,p) and A(:,q) already numerically orthogonal
1096  IF( ir1.EQ.0 )notrot = notrot + 1
1097 *[RTD] SKIPPED = SKIPPED + 1
1098  pskipped = pskipped + 1
1099  END IF
1100  ELSE
1101 * A(:,q) is zero column
1102  IF( ir1.EQ.0 )notrot = notrot + 1
1103  pskipped = pskipped + 1
1104  END IF
1105 *
1106  IF( ( i.LE.swband ) .AND.
1107  \$ ( pskipped.GT.rowskip ) ) THEN
1108  IF( ir1.EQ.0 )aapp = -aapp
1109  notrot = 0
1110  GO TO 2103
1111  END IF
1112 *
1113  2002 CONTINUE
1114 * END q-LOOP
1115 *
1116  2103 CONTINUE
1117 * bailed out of q-loop
1118 *
1119  sva( p ) = aapp
1120 *
1121  ELSE
1122  sva( p ) = aapp
1123  IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
1124  \$ notrot = notrot + min( igl+kbl-1, n ) - p
1125  END IF
1126 *
1127  2001 CONTINUE
1128 * end of the p-loop
1129 * end of doing the block ( ibr, ibr )
1130  1002 CONTINUE
1131 * end of ir1-loop
1132 *
1133 * ... go to the off diagonal blocks
1134 *
1135  igl = ( ibr-1 )*kbl + 1
1136 *
1137  DO 2010 jbc = ibr + 1, nbl
1138 *
1139  jgl = ( jbc-1 )*kbl + 1
1140 *
1141 * doing the block at ( ibr, jbc )
1142 *
1143  ijblsk = 0
1144  DO 2100 p = igl, min( igl+kbl-1, n )
1145 *
1146  aapp = sva( p )
1147  IF( aapp.GT.zero ) THEN
1148 *
1149  pskipped = 0
1150 *
1151  DO 2200 q = jgl, min( jgl+kbl-1, n )
1152 *
1153  aaqq = sva( q )
1154  IF( aaqq.GT.zero ) THEN
1155  aapp0 = aapp
1156 *
1157 * .. M x 2 Jacobi SVD ..
1158 *
1159 * Safe Gram matrix computation
1160 *
1161  IF( aaqq.GE.one ) THEN
1162  IF( aapp.GE.aaqq ) THEN
1163  rotok = ( small*aapp ).LE.aaqq
1164  ELSE
1165  rotok = ( small*aaqq ).LE.aapp
1166  END IF
1167  IF( aapp.LT.( big / aaqq ) ) THEN
1168  aapq = ( sdot( m, a( 1, p ), 1, a( 1,
1169  \$ q ), 1 )*work( p )*work( q ) /
1170  \$ aaqq ) / aapp
1171  ELSE
1172  CALL scopy( m, a( 1, p ), 1,
1173  \$ work( n+1 ), 1 )
1174  CALL slascl( 'G', 0, 0, aapp,
1175  \$ work( p ), m, 1,
1176  \$ work( n+1 ), lda, ierr )
1177  aapq = sdot( m, work( n+1 ), 1,
1178  \$ a( 1, q ), 1 )*work( q ) / aaqq
1179  END IF
1180  ELSE
1181  IF( aapp.GE.aaqq ) THEN
1182  rotok = aapp.LE.( aaqq / small )
1183  ELSE
1184  rotok = aaqq.LE.( aapp / small )
1185  END IF
1186  IF( aapp.GT.( small / aaqq ) ) THEN
1187  aapq = ( sdot( m, a( 1, p ), 1, a( 1,
1188  \$ q ), 1 )*work( p )*work( q ) /
1189  \$ aaqq ) / aapp
1190  ELSE
1191  CALL scopy( m, a( 1, q ), 1,
1192  \$ work( n+1 ), 1 )
1193  CALL slascl( 'G', 0, 0, aaqq,
1194  \$ work( q ), m, 1,
1195  \$ work( n+1 ), lda, ierr )
1196  aapq = sdot( m, work( n+1 ), 1,
1197  \$ a( 1, p ), 1 )*work( p ) / aapp
1198  END IF
1199  END IF
1200 *
1201  mxaapq = max( mxaapq, abs( aapq ) )
1202 *
1203 * TO rotate or NOT to rotate, THAT is the question ...
1204 *
1205  IF( abs( aapq ).GT.tol ) THEN
1206  notrot = 0
1207 *[RTD] ROTATED = ROTATED + 1
1208  pskipped = 0
1209  iswrot = iswrot + 1
1210 *
1211  IF( rotok ) THEN
1212 *
1213  aqoap = aaqq / aapp
1214  apoaq = aapp / aaqq
1215  theta = -half*abs( aqoap-apoaq ) / aapq
1216  IF( aaqq.GT.aapp0 )theta = -theta
1217 *
1218  IF( abs( theta ).GT.bigtheta ) THEN
1219  t = half / theta
1220  fastr( 3 ) = t*work( p ) / work( q )
1221  fastr( 4 ) = -t*work( q ) /
1222  \$ work( p )
1223  CALL srotm( m, a( 1, p ), 1,
1224  \$ a( 1, q ), 1, fastr )
1225  IF( rsvec )CALL srotm( mvl,
1226  \$ v( 1, p ), 1,
1227  \$ v( 1, q ), 1,
1228  \$ fastr )
1229  sva( q ) = aaqq*sqrt( max( zero,
1230  \$ one+t*apoaq*aapq ) )
1231  aapp = aapp*sqrt( max( zero,
1232  \$ one-t*aqoap*aapq ) )
1233  mxsinj = max( mxsinj, abs( t ) )
1234  ELSE
1235 *
1236 * .. choose correct signum for THETA and rotate
1237 *
1238  thsign = -sign( one, aapq )
1239  IF( aaqq.GT.aapp0 )thsign = -thsign
1240  t = one / ( theta+thsign*
1241  \$ sqrt( one+theta*theta ) )
1242  cs = sqrt( one / ( one+t*t ) )
1243  sn = t*cs
1244  mxsinj = max( mxsinj, abs( sn ) )
1245  sva( q ) = aaqq*sqrt( max( zero,
1246  \$ one+t*apoaq*aapq ) )
1247  aapp = aapp*sqrt( max( zero,
1248  \$ one-t*aqoap*aapq ) )
1249 *
1250  apoaq = work( p ) / work( q )
1251  aqoap = work( q ) / work( p )
1252  IF( work( p ).GE.one ) THEN
1253 *
1254  IF( work( q ).GE.one ) THEN
1255  fastr( 3 ) = t*apoaq
1256  fastr( 4 ) = -t*aqoap
1257  work( p ) = work( p )*cs
1258  work( q ) = work( q )*cs
1259  CALL srotm( m, a( 1, p ), 1,
1260  \$ a( 1, q ), 1,
1261  \$ fastr )
1262  IF( rsvec )CALL srotm( mvl,
1263  \$ v( 1, p ), 1, v( 1, q ),
1264  \$ 1, fastr )
1265  ELSE
1266  CALL saxpy( m, -t*aqoap,
1267  \$ a( 1, q ), 1,
1268  \$ a( 1, p ), 1 )
1269  CALL saxpy( m, cs*sn*apoaq,
1270  \$ a( 1, p ), 1,
1271  \$ a( 1, q ), 1 )
1272  IF( rsvec ) THEN
1273  CALL saxpy( mvl, -t*aqoap,
1274  \$ v( 1, q ), 1,
1275  \$ v( 1, p ), 1 )
1276  CALL saxpy( mvl,
1277  \$ cs*sn*apoaq,
1278  \$ v( 1, p ), 1,
1279  \$ v( 1, q ), 1 )
1280  END IF
1281  work( p ) = work( p )*cs
1282  work( q ) = work( q ) / cs
1283  END IF
1284  ELSE
1285  IF( work( q ).GE.one ) THEN
1286  CALL saxpy( m, t*apoaq,
1287  \$ a( 1, p ), 1,
1288  \$ a( 1, q ), 1 )
1289  CALL saxpy( m, -cs*sn*aqoap,
1290  \$ a( 1, q ), 1,
1291  \$ a( 1, p ), 1 )
1292  IF( rsvec ) THEN
1293  CALL saxpy( mvl, t*apoaq,
1294  \$ v( 1, p ), 1,
1295  \$ v( 1, q ), 1 )
1296  CALL saxpy( mvl,
1297  \$ -cs*sn*aqoap,
1298  \$ v( 1, q ), 1,
1299  \$ v( 1, p ), 1 )
1300  END IF
1301  work( p ) = work( p ) / cs
1302  work( q ) = work( q )*cs
1303  ELSE
1304  IF( work( p ).GE.work( q ) )
1305  \$ THEN
1306  CALL saxpy( m, -t*aqoap,
1307  \$ a( 1, q ), 1,
1308  \$ a( 1, p ), 1 )
1309  CALL saxpy( m, cs*sn*apoaq,
1310  \$ a( 1, p ), 1,
1311  \$ a( 1, q ), 1 )
1312  work( p ) = work( p )*cs
1313  work( q ) = work( q ) / cs
1314  IF( rsvec ) THEN
1315  CALL saxpy( mvl,
1316  \$ -t*aqoap,
1317  \$ v( 1, q ), 1,
1318  \$ v( 1, p ), 1 )
1319  CALL saxpy( mvl,
1320  \$ cs*sn*apoaq,
1321  \$ v( 1, p ), 1,
1322  \$ v( 1, q ), 1 )
1323  END IF
1324  ELSE
1325  CALL saxpy( m, t*apoaq,
1326  \$ a( 1, p ), 1,
1327  \$ a( 1, q ), 1 )
1328  CALL saxpy( m,
1329  \$ -cs*sn*aqoap,
1330  \$ a( 1, q ), 1,
1331  \$ a( 1, p ), 1 )
1332  work( p ) = work( p ) / cs
1333  work( q ) = work( q )*cs
1334  IF( rsvec ) THEN
1335  CALL saxpy( mvl,
1336  \$ t*apoaq, v( 1, p ),
1337  \$ 1, v( 1, q ), 1 )
1338  CALL saxpy( mvl,
1339  \$ -cs*sn*aqoap,
1340  \$ v( 1, q ), 1,
1341  \$ v( 1, p ), 1 )
1342  END IF
1343  END IF
1344  END IF
1345  END IF
1346  END IF
1347 *
1348  ELSE
1349  IF( aapp.GT.aaqq ) THEN
1350  CALL scopy( m, a( 1, p ), 1,
1351  \$ work( n+1 ), 1 )
1352  CALL slascl( 'G', 0, 0, aapp, one,
1353  \$ m, 1, work( n+1 ), lda,
1354  \$ ierr )
1355  CALL slascl( 'G', 0, 0, aaqq, one,
1356  \$ m, 1, a( 1, q ), lda,
1357  \$ ierr )
1358  temp1 = -aapq*work( p ) / work( q )
1359  CALL saxpy( m, temp1, work( n+1 ),
1360  \$ 1, a( 1, q ), 1 )
1361  CALL slascl( 'G', 0, 0, one, aaqq,
1362  \$ m, 1, a( 1, q ), lda,
1363  \$ ierr )
1364  sva( q ) = aaqq*sqrt( max( zero,
1365  \$ one-aapq*aapq ) )
1366  mxsinj = max( mxsinj, sfmin )
1367  ELSE
1368  CALL scopy( m, a( 1, q ), 1,
1369  \$ work( n+1 ), 1 )
1370  CALL slascl( 'G', 0, 0, aaqq, one,
1371  \$ m, 1, work( n+1 ), lda,
1372  \$ ierr )
1373  CALL slascl( 'G', 0, 0, aapp, one,
1374  \$ m, 1, a( 1, p ), lda,
1375  \$ ierr )
1376  temp1 = -aapq*work( q ) / work( p )
1377  CALL saxpy( m, temp1, work( n+1 ),
1378  \$ 1, a( 1, p ), 1 )
1379  CALL slascl( 'G', 0, 0, one, aapp,
1380  \$ m, 1, a( 1, p ), lda,
1381  \$ ierr )
1382  sva( p ) = aapp*sqrt( max( zero,
1383  \$ one-aapq*aapq ) )
1384  mxsinj = max( mxsinj, sfmin )
1385  END IF
1386  END IF
1387 * END IF ROTOK THEN ... ELSE
1388 *
1389 * In the case of cancellation in updating SVA(q)
1390 * .. recompute SVA(q)
1391  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1392  \$ THEN
1393  IF( ( aaqq.LT.rootbig ) .AND.
1394  \$ ( aaqq.GT.rootsfmin ) ) THEN
1395  sva( q ) = snrm2( m, a( 1, q ), 1 )*
1396  \$ work( q )
1397  ELSE
1398  t = zero
1399  aaqq = one
1400  CALL slassq( m, a( 1, q ), 1, t,
1401  \$ aaqq )
1402  sva( q ) = t*sqrt( aaqq )*work( q )
1403  END IF
1404  END IF
1405  IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
1406  IF( ( aapp.LT.rootbig ) .AND.
1407  \$ ( aapp.GT.rootsfmin ) ) THEN
1408  aapp = snrm2( m, a( 1, p ), 1 )*
1409  \$ work( p )
1410  ELSE
1411  t = zero
1412  aapp = one
1413  CALL slassq( m, a( 1, p ), 1, t,
1414  \$ aapp )
1415  aapp = t*sqrt( aapp )*work( p )
1416  END IF
1417  sva( p ) = aapp
1418  END IF
1419 * end of OK rotation
1420  ELSE
1421  notrot = notrot + 1
1422 *[RTD] SKIPPED = SKIPPED + 1
1423  pskipped = pskipped + 1
1424  ijblsk = ijblsk + 1
1425  END IF
1426  ELSE
1427  notrot = notrot + 1
1428  pskipped = pskipped + 1
1429  ijblsk = ijblsk + 1
1430  END IF
1431 *
1432  IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
1433  \$ THEN
1434  sva( p ) = aapp
1435  notrot = 0
1436  GO TO 2011
1437  END IF
1438  IF( ( i.LE.swband ) .AND.
1439  \$ ( pskipped.GT.rowskip ) ) THEN
1440  aapp = -aapp
1441  notrot = 0
1442  GO TO 2203
1443  END IF
1444 *
1445  2200 CONTINUE
1446 * end of the q-loop
1447  2203 CONTINUE
1448 *
1449  sva( p ) = aapp
1450 *
1451  ELSE
1452 *
1453  IF( aapp.EQ.zero )notrot = notrot +
1454  \$ min( jgl+kbl-1, n ) - jgl + 1
1455  IF( aapp.LT.zero )notrot = 0
1456 *
1457  END IF
1458 *
1459  2100 CONTINUE
1460 * end of the p-loop
1461  2010 CONTINUE
1462 * end of the jbc-loop
1463  2011 CONTINUE
1464 *2011 bailed out of the jbc-loop
1465  DO 2012 p = igl, min( igl+kbl-1, n )
1466  sva( p ) = abs( sva( p ) )
1467  2012 CONTINUE
1468 ***
1469  2000 CONTINUE
1470 *2000 :: end of the ibr-loop
1471 *
1472 * .. update SVA(N)
1473  IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
1474  \$ THEN
1475  sva( n ) = snrm2( m, a( 1, n ), 1 )*work( n )
1476  ELSE
1477  t = zero
1478  aapp = one
1479  CALL slassq( m, a( 1, n ), 1, t, aapp )
1480  sva( n ) = t*sqrt( aapp )*work( n )
1481  END IF
1482 *
1484 *
1485  IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
1486  \$ ( iswrot.LE.n ) ) )swband = i
1487 *
1488  IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( float( n ) )*
1489  \$ tol ) .AND. ( float( n )*mxaapq*mxsinj.LT.tol ) ) THEN
1490  GO TO 1994
1491  END IF
1492 *
1493  IF( notrot.GE.emptsw )GO TO 1994
1494 *
1495  1993 CONTINUE
1496 * end i=1:NSWEEP loop
1497 *
1498 * #:( Reaching this point means that the procedure has not converged.
1499  info = nsweep - 1
1500  GO TO 1995
1501 *
1502  1994 CONTINUE
1503 * #:) Reaching this point means numerical convergence after the i-th
1504 * sweep.
1505 *
1506  info = 0
1507 * #:) INFO = 0 confirms successful iterations.
1508  1995 CONTINUE
1509 *
1510 * Sort the singular values and find how many are above
1511 * the underflow threshold.
1512 *
1513  n2 = 0
1514  n4 = 0
1515  DO 5991 p = 1, n - 1
1516  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
1517  IF( p.NE.q ) THEN
1518  temp1 = sva( p )
1519  sva( p ) = sva( q )
1520  sva( q ) = temp1
1521  temp1 = work( p )
1522  work( p ) = work( q )
1523  work( q ) = temp1
1524  CALL sswap( m, a( 1, p ), 1, a( 1, q ), 1 )
1525  IF( rsvec )CALL sswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
1526  END IF
1527  IF( sva( p ).NE.zero ) THEN
1528  n4 = n4 + 1
1529  IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
1530  END IF
1531  5991 CONTINUE
1532  IF( sva( n ).NE.zero ) THEN
1533  n4 = n4 + 1
1534  IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
1535  END IF
1536 *
1537 * Normalize the left singular vectors.
1538 *
1539  IF( lsvec .OR. uctol ) THEN
1540  DO 1998 p = 1, n2
1541  CALL sscal( m, work( p ) / sva( p ), a( 1, p ), 1 )
1542  1998 CONTINUE
1543  END IF
1544 *
1545 * Scale the product of Jacobi rotations (assemble the fast rotations).
1546 *
1547  IF( rsvec ) THEN
1548  IF( applv ) THEN
1549  DO 2398 p = 1, n
1550  CALL sscal( mvl, work( p ), v( 1, p ), 1 )
1551  2398 CONTINUE
1552  ELSE
1553  DO 2399 p = 1, n
1554  temp1 = one / snrm2( mvl, v( 1, p ), 1 )
1555  CALL sscal( mvl, temp1, v( 1, p ), 1 )
1556  2399 CONTINUE
1557  END IF
1558  END IF
1559 *
1560 * Undo scaling, if necessary (and possible).
1561  IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
1562  \$ .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
1563  \$ ( sfmin / skl ) ) ) ) THEN
1564  DO 2400 p = 1, n
1565  sva( p ) = skl*sva( p )
1566  2400 CONTINUE
1567  skl = one
1568  END IF
1569 *
1570  work( 1 ) = skl
1571 * The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1572 * then some of the singular values may overflow or underflow and
1573 * the spectrum is given in this factored representation.
1574 *
1575  work( 2 ) = float( n4 )
1576 * N4 is the number of computed nonzero singular values of A.
1577 *
1578  work( 3 ) = float( n2 )
1579 * N2 is the number of singular values of A greater than SFMIN.
1580 * If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1581 * that may carry some information.
1582 *
1583  work( 4 ) = float( i )
1584 * i is the index of the last sweep before declaring convergence.
1585 *
1586  work( 5 ) = mxaapq
1587 * MXAAPQ is the largest absolute value of scaled pivots in the
1588 * last sweep
1589 *
1590  work( 6 ) = mxsinj
1591 * MXSINJ is the largest absolute value of the sines of Jacobi angles
1592 * in the last sweep
1593 *
1594  RETURN
1595 * ..
1596 * .. END OF SGESVJ
1597 * ..
1598  END
subroutine slassq(n, x, incx, scl, sumsq)
SLASSQ updates a sum of squares represented in scaled form.
Definition: slassq.f90:137
subroutine slascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
SLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: slascl.f:143
subroutine slaset(UPLO, M, N, ALPHA, BETA, A, LDA)
SLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values.
Definition: slaset.f:110
subroutine xerbla(SRNAME, INFO)
XERBLA
Definition: xerbla.f:60
subroutine sgesvj(JOBA, JOBU, JOBV, M, N, A, LDA, SVA, MV, V, LDV, WORK, LWORK, INFO)
SGESVJ
Definition: sgesvj.f:323
subroutine sgsvj1(JOBV, M, N, N1, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
SGSVJ1 pre-processor for the routine sgesvj, applies Jacobi rotations targeting only particular pivot...
Definition: sgsvj1.f:236
subroutine sgsvj0(JOBV, M, N, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
SGSVJ0 pre-processor for the routine sgesvj.
Definition: sgsvj0.f:218
subroutine sswap(N, SX, INCX, SY, INCY)
SSWAP
Definition: sswap.f:82
subroutine scopy(N, SX, INCX, SY, INCY)
SCOPY
Definition: scopy.f:82
subroutine sscal(N, SA, SX, INCX)
SSCAL
Definition: sscal.f:79
subroutine srotm(N, SX, INCX, SY, INCY, SPARAM)
SROTM
Definition: srotm.f:97
subroutine saxpy(N, SA, SX, INCX, SY, INCY)
SAXPY
Definition: saxpy.f:89