LAPACK  3.8.0
LAPACK: Linear Algebra PACKage

◆ cgesvj()

subroutine cgesvj ( character*1  JOBA,
character*1  JOBU,
character*1  JOBV,
integer  M,
integer  N,
complex, dimension( lda, * )  A,
integer  LDA,
real, dimension( n )  SVA,
integer  MV,
complex, dimension( ldv, * )  V,
integer  LDV,
complex, dimension( lwork )  CWORK,
integer  LWORK,
real, dimension( lrwork )  RWORK,
integer  LRWORK,
integer  INFO 
)

CGESVJ

Download CGESVJ + dependencies [TGZ] [ZIP] [TXT]

Purpose:
 CGESVJ computes the singular value decomposition (SVD) of a complex
 M-by-N matrix A, where M >= N. The SVD of A is written as
                                    [++]   [xx]   [x0]   [xx]
              A = U * SIGMA * V^*,  [++] = [xx] * [ox] * [xx]
                                    [++]   [xx]
 where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
 matrix, and V is an N-by-N unitary matrix. The diagonal elements
 of SIGMA are the singular values of A. The columns of U and V are the
 left and the right singular vectors of A, respectively.
Parameters
[in]JOBA
          JOBA is CHARACTER*1
          Specifies the structure of A.
          = 'L': The input matrix A is lower triangular;
          = 'U': The input matrix A is upper triangular;
          = 'G': The input matrix A is general M-by-N matrix, M >= N.
[in]JOBU
          JOBU is CHARACTER*1
          Specifies whether to compute the left singular vectors
          (columns of U):
          = 'U' or 'F': The left singular vectors corresponding to the nonzero
                 singular values are computed and returned in the leading
                 columns of A. See more details in the description of A.
                 The default numerical orthogonality threshold is set to
                 approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=SLAMCH('E').
          = 'C': Analogous to JOBU='U', except that user can control the
                 level of numerical orthogonality of the computed left
                 singular vectors. TOL can be set to TOL = CTOL*EPS, where
                 CTOL is given on input in the array WORK.
                 No CTOL smaller than ONE is allowed. CTOL greater
                 than 1 / EPS is meaningless. The option 'C'
                 can be used if M*EPS is satisfactory orthogonality
                 of the computed left singular vectors, so CTOL=M could
                 save few sweeps of Jacobi rotations.
                 See the descriptions of A and WORK(1).
          = 'N': The matrix U is not computed. However, see the
                 description of A.
[in]JOBV
          JOBV is CHARACTER*1
          Specifies whether to compute the right singular vectors, that
          is, the matrix V:
          = 'V' or 'J': the matrix V is computed and returned in the array V
          = 'A' : the Jacobi rotations are applied to the MV-by-N
                  array V. In other words, the right singular vector
                  matrix V is not computed explicitly; instead it is
                  applied to an MV-by-N matrix initially stored in the
                  first MV rows of V.
          = 'N' : the matrix V is not computed and the array V is not
                  referenced
[in]M
          M is INTEGER
          The number of rows of the input matrix A. 1/SLAMCH('E') > M >= 0.
[in]N
          N is INTEGER
          The number of columns of the input matrix A.
          M >= N >= 0.
[in,out]A
          A is COMPLEX array, dimension (LDA,N)
          On entry, the M-by-N matrix A.
          On exit,
          If JOBU .EQ. 'U' .OR. JOBU .EQ. 'C':
                 If INFO .EQ. 0 :
                 RANKA orthonormal columns of U are returned in the
                 leading RANKA columns of the array A. Here RANKA <= N
                 is the number of computed singular values of A that are
                 above the underflow threshold SLAMCH('S'). The singular
                 vectors corresponding to underflowed or zero singular
                 values are not computed. The value of RANKA is returned
                 in the array RWORK as RANKA=NINT(RWORK(2)). Also see the
                 descriptions of SVA and RWORK. The computed columns of U
                 are mutually numerically orthogonal up to approximately
                 TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU.EQ.'C'),
                 see the description of JOBU.
                 If INFO .GT. 0,
                 the procedure CGESVJ did not converge in the given number
                 of iterations (sweeps). In that case, the computed
                 columns of U may not be orthogonal up to TOL. The output
                 U (stored in A), SIGMA (given by the computed singular
                 values in SVA(1:N)) and V is still a decomposition of the
                 input matrix A in the sense that the residual
                 || A - SCALE * U * SIGMA * V^* ||_2 / ||A||_2 is small.
          If JOBU .EQ. 'N':
                 If INFO .EQ. 0 :
                 Note that the left singular vectors are 'for free' in the
                 one-sided Jacobi SVD algorithm. However, if only the
                 singular values are needed, the level of numerical
                 orthogonality of U is not an issue and iterations are
                 stopped when the columns of the iterated matrix are
                 numerically orthogonal up to approximately M*EPS. Thus,
                 on exit, A contains the columns of U scaled with the
                 corresponding singular values.
                 If INFO .GT. 0 :
                 the procedure CGESVJ did not converge in the given number
                 of iterations (sweeps).
[in]LDA
          LDA is INTEGER
          The leading dimension of the array A.  LDA >= max(1,M).
[out]SVA
          SVA is REAL array, dimension (N)
          On exit,
          If INFO .EQ. 0 :
          depending on the value SCALE = RWORK(1), we have:
                 If SCALE .EQ. ONE:
                 SVA(1:N) contains the computed singular values of A.
                 During the computation SVA contains the Euclidean column
                 norms of the iterated matrices in the array A.
                 If SCALE .NE. ONE:
                 The singular values of A are SCALE*SVA(1:N), and this
                 factored representation is due to the fact that some of the
                 singular values of A might underflow or overflow.

          If INFO .GT. 0 :
          the procedure CGESVJ did not converge in the given number of
          iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.
[in]MV
          MV is INTEGER
          If JOBV .EQ. 'A', then the product of Jacobi rotations in CGESVJ
          is applied to the first MV rows of V. See the description of JOBV.
[in,out]V
          V is COMPLEX array, dimension (LDV,N)
          If JOBV = 'V', then V contains on exit the N-by-N matrix of
                         the right singular vectors;
          If JOBV = 'A', then V contains the product of the computed right
                         singular vector matrix and the initial matrix in
                         the array V.
          If JOBV = 'N', then V is not referenced.
[in]LDV
          LDV is INTEGER
          The leading dimension of the array V, LDV .GE. 1.
          If JOBV .EQ. 'V', then LDV .GE. max(1,N).
          If JOBV .EQ. 'A', then LDV .GE. max(1,MV) .
[in,out]CWORK
          CWORK is COMPLEX array, dimension (max(1,LWORK))
          Used as workspace.
          If on entry LWORK .EQ. -1, then a workspace query is assumed and
          no computation is done; CWORK(1) is set to the minial (and optimal)
          length of CWORK.
[in]LWORK
          LWORK is INTEGER.
          Length of CWORK, LWORK >= M+N.
[in,out]RWORK
          RWORK is REAL array, dimension (max(6,LRWORK))
          On entry,
          If JOBU .EQ. 'C' :
          RWORK(1) = CTOL, where CTOL defines the threshold for convergence.
                    The process stops if all columns of A are mutually
                    orthogonal up to CTOL*EPS, EPS=SLAMCH('E').
                    It is required that CTOL >= ONE, i.e. it is not
                    allowed to force the routine to obtain orthogonality
                    below EPSILON.
          On exit,
          RWORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N)
                    are the computed singular values of A.
                    (See description of SVA().)
          RWORK(2) = NINT(RWORK(2)) is the number of the computed nonzero
                    singular values.
          RWORK(3) = NINT(RWORK(3)) is the number of the computed singular
                    values that are larger than the underflow threshold.
          RWORK(4) = NINT(RWORK(4)) is the number of sweeps of Jacobi
                    rotations needed for numerical convergence.
          RWORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep.
                    This is useful information in cases when CGESVJ did
                    not converge, as it can be used to estimate whether
                    the output is stil useful and for post festum analysis.
          RWORK(6) = the largest absolute value over all sines of the
                    Jacobi rotation angles in the last sweep. It can be
                    useful for a post festum analysis.
         If on entry LRWORK .EQ. -1, then a workspace query is assumed and
         no computation is done; RWORK(1) is set to the minial (and optimal)
         length of RWORK.
[in]LRWORK
         LRWORK is INTEGER
         Length of RWORK, LRWORK >= MAX(6,N).
[out]INFO
          INFO is INTEGER
          = 0 : successful exit.
          < 0 : if INFO = -i, then the i-th argument had an illegal value
          > 0 : CGESVJ did not converge in the maximal allowed number
                (NSWEEP=30) of sweeps. The output may still be useful.
                See the description of RWORK.
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Date
June 2016
Further Details:
 The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane
 rotations. In the case of underflow of the tangent of the Jacobi angle, a
 modified Jacobi transformation of Drmac [3] is used. Pivot strategy uses
 column interchanges of de Rijk [1]. The relative accuracy of the computed
 singular values and the accuracy of the computed singular vectors (in
 angle metric) is as guaranteed by the theory of Demmel and Veselic [2].
 The condition number that determines the accuracy in the full rank case
 is essentially min_{D=diag} kappa(A*D), where kappa(.) is the
 spectral condition number. The best performance of this Jacobi SVD
 procedure is achieved if used in an  accelerated version of Drmac and
 Veselic [4,5], and it is the kernel routine in the SIGMA library [6].
 Some tunning parameters (marked with [TP]) are available for the
 implementer.
 The computational range for the nonzero singular values is the  machine
 number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even
 denormalized singular values can be computed with the corresponding
 gradual loss of accurate digits.
Contributor:
  ============

  Zlatko Drmac (Zagreb, Croatia)
References:
 [1] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the
    singular value decomposition on a vector computer.
    SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371.
 [2] J. Demmel and K. Veselic: Jacobi method is more accurate than QR.
 [3] Z. Drmac: Implementation of Jacobi rotations for accurate singular
    value computation in floating point arithmetic.
    SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222.
 [4] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
    SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342.
    LAPACK Working note 169.
 [5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
    SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362.
    LAPACK Working note 170.
 [6] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
    QSVD, (H,K)-SVD computations.
    Department of Mathematics, University of Zagreb, 2008, 2015.
Bugs, examples and comments:
  ===========================
  Please report all bugs and send interesting test examples and comments to
  drmac@math.hr. Thank you.

Definition at line 353 of file cgesvj.f.

353 *
354 * -- LAPACK computational routine (version 3.8.0) --
355 * -- LAPACK is a software package provided by Univ. of Tennessee, --
356 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
357 * June 2016
358 *
359  IMPLICIT NONE
360 * .. Scalar Arguments ..
361  INTEGER info, lda, ldv, lwork, lrwork, m, mv, n
362  CHARACTER*1 joba, jobu, jobv
363 * ..
364 * .. Array Arguments ..
365  COMPLEX a( lda, * ), v( ldv, * ), cwork( lwork )
366  REAL rwork( lrwork ), sva( n )
367 * ..
368 *
369 * =====================================================================
370 *
371 * .. Local Parameters ..
372  REAL zero, half, one
373  parameter( zero = 0.0e0, half = 0.5e0, one = 1.0e0)
374  COMPLEX czero, cone
375  parameter( czero = (0.0e0, 0.0e0), cone = (1.0e0, 0.0e0) )
376  INTEGER nsweep
377  parameter( nsweep = 30 )
378 * ..
379 * .. Local Scalars ..
380  COMPLEX aapq, ompq
381  REAL aapp, aapp0, aapq1, aaqq, apoaq, aqoap, big,
382  $ bigtheta, cs, ctol, epsln, mxaapq,
383  $ mxsinj, rootbig, rooteps, rootsfmin, roottol,
384  $ skl, sfmin, small, sn, t, temp1, theta, thsign, tol
385  INTEGER blskip, emptsw, i, ibr, ierr, igl, ijblsk, ir1,
386  $ iswrot, jbc, jgl, kbl, lkahead, mvl, n2, n34,
387  $ n4, nbl, notrot, p, pskipped, q, rowskip, swband
388  LOGICAL applv, goscale, lower, lquery, lsvec, noscale, rotok,
389  $ rsvec, uctol, upper
390 * ..
391 * ..
392 * .. Intrinsic Functions ..
393  INTRINSIC abs, max, min, conjg, REAL, sign, sqrt
394 * ..
395 * .. External Functions ..
396 * ..
397 * from BLAS
398  REAL scnrm2
399  COMPLEX cdotc
400  EXTERNAL cdotc, scnrm2
401  INTEGER isamax
402  EXTERNAL isamax
403 * from LAPACK
404  REAL slamch
405  EXTERNAL slamch
406  LOGICAL lsame
407  EXTERNAL lsame
408 * ..
409 * .. External Subroutines ..
410 * ..
411 * from BLAS
412  EXTERNAL ccopy, crot, csscal, cswap, caxpy
413 * from LAPACK
414  EXTERNAL clascl, claset, classq, slascl, xerbla
415  EXTERNAL cgsvj0, cgsvj1
416 * ..
417 * .. Executable Statements ..
418 *
419 * Test the input arguments
420 *
421  lsvec = lsame( jobu, 'U' ) .OR. lsame( jobu, 'F' )
422  uctol = lsame( jobu, 'C' )
423  rsvec = lsame( jobv, 'V' ) .OR. lsame( jobv, 'J' )
424  applv = lsame( jobv, 'A' )
425  upper = lsame( joba, 'U' )
426  lower = lsame( joba, 'L' )
427 *
428  lquery = ( lwork .EQ. -1 ) .OR. ( lrwork .EQ. -1 )
429  IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
430  info = -1
431  ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
432  info = -2
433  ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
434  info = -3
435  ELSE IF( m.LT.0 ) THEN
436  info = -4
437  ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
438  info = -5
439  ELSE IF( lda.LT.m ) THEN
440  info = -7
441  ELSE IF( mv.LT.0 ) THEN
442  info = -9
443  ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
444  $ ( applv .AND. ( ldv.LT.mv ) ) ) THEN
445  info = -11
446  ELSE IF( uctol .AND. ( rwork( 1 ).LE.one ) ) THEN
447  info = -12
448  ELSE IF( lwork.LT.( m+n ) .AND. ( .NOT.lquery ) ) THEN
449  info = -13
450  ELSE IF( lrwork.LT.max( n, 6 ) .AND. ( .NOT.lquery ) ) THEN
451  info = -15
452  ELSE
453  info = 0
454  END IF
455 *
456 * #:(
457  IF( info.NE.0 ) THEN
458  CALL xerbla( 'CGESVJ', -info )
459  RETURN
460  ELSE IF ( lquery ) THEN
461  cwork(1) = m + n
462  rwork(1) = max( n, 6 )
463  RETURN
464  END IF
465 *
466 * #:) Quick return for void matrix
467 *
468  IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
469 *
470 * Set numerical parameters
471 * The stopping criterion for Jacobi rotations is
472 *
473 * max_{i<>j}|A(:,i)^* * A(:,j)| / (||A(:,i)||*||A(:,j)||) < CTOL*EPS
474 *
475 * where EPS is the round-off and CTOL is defined as follows:
476 *
477  IF( uctol ) THEN
478 * ... user controlled
479  ctol = rwork( 1 )
480  ELSE
481 * ... default
482  IF( lsvec .OR. rsvec .OR. applv ) THEN
483  ctol = sqrt( REAL( M ) )
484  ELSE
485  ctol = REAL( m )
486  END IF
487  END IF
488 * ... and the machine dependent parameters are
489 *[!] (Make sure that SLAMCH() works properly on the target machine.)
490 *
491  epsln = slamch( 'Epsilon' )
492  rooteps = sqrt( epsln )
493  sfmin = slamch( 'SafeMinimum' )
494  rootsfmin = sqrt( sfmin )
495  small = sfmin / epsln
496 * BIG = SLAMCH( 'Overflow' )
497  big = one / sfmin
498  rootbig = one / rootsfmin
499 * LARGE = BIG / SQRT( REAL( M*N ) )
500  bigtheta = one / rooteps
501 *
502  tol = ctol*epsln
503  roottol = sqrt( tol )
504 *
505  IF( REAL( m )*epsln.GE.one ) then
506  info = -4
507  CALL xerbla( 'CGESVJ', -info )
508  RETURN
509  END IF
510 *
511 * Initialize the right singular vector matrix.
512 *
513  IF( rsvec ) THEN
514  mvl = n
515  CALL claset( 'A', mvl, n, czero, cone, v, ldv )
516  ELSE IF( applv ) THEN
517  mvl = mv
518  END IF
519  rsvec = rsvec .OR. applv
520 *
521 * Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
522 *(!) If necessary, scale A to protect the largest singular value
523 * from overflow. It is possible that saving the largest singular
524 * value destroys the information about the small ones.
525 * This initial scaling is almost minimal in the sense that the
526 * goal is to make sure that no column norm overflows, and that
527 * SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
528 * in A are detected, the procedure returns with INFO=-6.
529 *
530  skl = one / sqrt( REAL( m )*REAL( N ) )
531  noscale = .true.
532  goscale = .true.
533 *
534  IF( lower ) THEN
535 * the input matrix is M-by-N lower triangular (trapezoidal)
536  DO 1874 p = 1, n
537  aapp = zero
538  aaqq = one
539  CALL classq( m-p+1, a( p, p ), 1, aapp, aaqq )
540  IF( aapp.GT.big ) THEN
541  info = -6
542  CALL xerbla( 'CGESVJ', -info )
543  RETURN
544  END IF
545  aaqq = sqrt( aaqq )
546  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
547  sva( p ) = aapp*aaqq
548  ELSE
549  noscale = .false.
550  sva( p ) = aapp*( aaqq*skl )
551  IF( goscale ) THEN
552  goscale = .false.
553  DO 1873 q = 1, p - 1
554  sva( q ) = sva( q )*skl
555  1873 CONTINUE
556  END IF
557  END IF
558  1874 CONTINUE
559  ELSE IF( upper ) THEN
560 * the input matrix is M-by-N upper triangular (trapezoidal)
561  DO 2874 p = 1, n
562  aapp = zero
563  aaqq = one
564  CALL classq( p, a( 1, p ), 1, aapp, aaqq )
565  IF( aapp.GT.big ) THEN
566  info = -6
567  CALL xerbla( 'CGESVJ', -info )
568  RETURN
569  END IF
570  aaqq = sqrt( aaqq )
571  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
572  sva( p ) = aapp*aaqq
573  ELSE
574  noscale = .false.
575  sva( p ) = aapp*( aaqq*skl )
576  IF( goscale ) THEN
577  goscale = .false.
578  DO 2873 q = 1, p - 1
579  sva( q ) = sva( q )*skl
580  2873 CONTINUE
581  END IF
582  END IF
583  2874 CONTINUE
584  ELSE
585 * the input matrix is M-by-N general dense
586  DO 3874 p = 1, n
587  aapp = zero
588  aaqq = one
589  CALL classq( m, a( 1, p ), 1, aapp, aaqq )
590  IF( aapp.GT.big ) THEN
591  info = -6
592  CALL xerbla( 'CGESVJ', -info )
593  RETURN
594  END IF
595  aaqq = sqrt( aaqq )
596  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
597  sva( p ) = aapp*aaqq
598  ELSE
599  noscale = .false.
600  sva( p ) = aapp*( aaqq*skl )
601  IF( goscale ) THEN
602  goscale = .false.
603  DO 3873 q = 1, p - 1
604  sva( q ) = sva( q )*skl
605  3873 CONTINUE
606  END IF
607  END IF
608  3874 CONTINUE
609  END IF
610 *
611  IF( noscale )skl = one
612 *
613 * Move the smaller part of the spectrum from the underflow threshold
614 *(!) Start by determining the position of the nonzero entries of the
615 * array SVA() relative to ( SFMIN, BIG ).
616 *
617  aapp = zero
618  aaqq = big
619  DO 4781 p = 1, n
620  IF( sva( p ).NE.zero )aaqq = min( aaqq, sva( p ) )
621  aapp = max( aapp, sva( p ) )
622  4781 CONTINUE
623 *
624 * #:) Quick return for zero matrix
625 *
626  IF( aapp.EQ.zero ) THEN
627  IF( lsvec )CALL claset( 'G', m, n, czero, cone, a, lda )
628  rwork( 1 ) = one
629  rwork( 2 ) = zero
630  rwork( 3 ) = zero
631  rwork( 4 ) = zero
632  rwork( 5 ) = zero
633  rwork( 6 ) = zero
634  RETURN
635  END IF
636 *
637 * #:) Quick return for one-column matrix
638 *
639  IF( n.EQ.1 ) THEN
640  IF( lsvec )CALL clascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
641  $ a( 1, 1 ), lda, ierr )
642  rwork( 1 ) = one / skl
643  IF( sva( 1 ).GE.sfmin ) THEN
644  rwork( 2 ) = one
645  ELSE
646  rwork( 2 ) = zero
647  END IF
648  rwork( 3 ) = zero
649  rwork( 4 ) = zero
650  rwork( 5 ) = zero
651  rwork( 6 ) = zero
652  RETURN
653  END IF
654 *
655 * Protect small singular values from underflow, and try to
656 * avoid underflows/overflows in computing Jacobi rotations.
657 *
658  sn = sqrt( sfmin / epsln )
659  temp1 = sqrt( big / REAL( N ) )
660  IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
661  $ ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
662  temp1 = min( big, temp1 / aapp )
663 * AAQQ = AAQQ*TEMP1
664 * AAPP = AAPP*TEMP1
665  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
666  temp1 = min( sn / aaqq, big / ( aapp*sqrt( REAL( N ) ) ) )
667 * AAQQ = AAQQ*TEMP1
668 * AAPP = AAPP*TEMP1
669  ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
670  temp1 = max( sn / aaqq, temp1 / aapp )
671 * AAQQ = AAQQ*TEMP1
672 * AAPP = AAPP*TEMP1
673  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
674  temp1 = min( sn / aaqq, big / ( sqrt( REAL( N ) )*aapp ) )
675 * AAQQ = AAQQ*TEMP1
676 * AAPP = AAPP*TEMP1
677  ELSE
678  temp1 = one
679  END IF
680 *
681 * Scale, if necessary
682 *
683  IF( temp1.NE.one ) THEN
684  CALL slascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
685  END IF
686  skl = temp1*skl
687  IF( skl.NE.one ) THEN
688  CALL clascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
689  skl = one / skl
690  END IF
691 *
692 * Row-cyclic Jacobi SVD algorithm with column pivoting
693 *
694  emptsw = ( n*( n-1 ) ) / 2
695  notrot = 0
696 
697  DO 1868 q = 1, n
698  cwork( q ) = cone
699  1868 CONTINUE
700 *
701 *
702 *
703  swband = 3
704 *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
705 * if CGESVJ is used as a computational routine in the preconditioned
706 * Jacobi SVD algorithm CGEJSV. For sweeps i=1:SWBAND the procedure
707 * works on pivots inside a band-like region around the diagonal.
708 * The boundaries are determined dynamically, based on the number of
709 * pivots above a threshold.
710 *
711  kbl = min( 8, n )
712 *[TP] KBL is a tuning parameter that defines the tile size in the
713 * tiling of the p-q loops of pivot pairs. In general, an optimal
714 * value of KBL depends on the matrix dimensions and on the
715 * parameters of the computer's memory.
716 *
717  nbl = n / kbl
718  IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
719 *
720  blskip = kbl**2
721 *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
722 *
723  rowskip = min( 5, kbl )
724 *[TP] ROWSKIP is a tuning parameter.
725 *
726  lkahead = 1
727 *[TP] LKAHEAD is a tuning parameter.
728 *
729 * Quasi block transformations, using the lower (upper) triangular
730 * structure of the input matrix. The quasi-block-cycling usually
731 * invokes cubic convergence. Big part of this cycle is done inside
732 * canonical subspaces of dimensions less than M.
733 *
734  IF( ( lower .OR. upper ) .AND. ( n.GT.max( 64, 4*kbl ) ) ) THEN
735 *[TP] The number of partition levels and the actual partition are
736 * tuning parameters.
737  n4 = n / 4
738  n2 = n / 2
739  n34 = 3*n4
740  IF( applv ) THEN
741  q = 0
742  ELSE
743  q = 1
744  END IF
745 *
746  IF( lower ) THEN
747 *
748 * This works very well on lower triangular matrices, in particular
749 * in the framework of the preconditioned Jacobi SVD (xGEJSV).
750 * The idea is simple:
751 * [+ 0 0 0] Note that Jacobi transformations of [0 0]
752 * [+ + 0 0] [0 0]
753 * [+ + x 0] actually work on [x 0] [x 0]
754 * [+ + x x] [x x]. [x x]
755 *
756  CALL cgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
757  $ cwork( n34+1 ), sva( n34+1 ), mvl,
758  $ v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
759  $ 2, cwork( n+1 ), lwork-n, ierr )
760 
761  CALL cgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
762  $ cwork( n2+1 ), sva( n2+1 ), mvl,
763  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
764  $ cwork( n+1 ), lwork-n, ierr )
765 
766  CALL cgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
767  $ cwork( n2+1 ), sva( n2+1 ), mvl,
768  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
769  $ cwork( n+1 ), lwork-n, ierr )
770 *
771  CALL cgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
772  $ cwork( n4+1 ), sva( n4+1 ), mvl,
773  $ v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
774  $ cwork( n+1 ), lwork-n, ierr )
775 *
776  CALL cgsvj0( jobv, m, n4, a, lda, cwork, sva, mvl, v, ldv,
777  $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
778  $ ierr )
779 *
780  CALL cgsvj1( jobv, m, n2, n4, a, lda, cwork, sva, mvl, v,
781  $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
782  $ lwork-n, ierr )
783 *
784 *
785  ELSE IF( upper ) THEN
786 *
787 *
788  CALL cgsvj0( jobv, n4, n4, a, lda, cwork, sva, mvl, v, ldv,
789  $ epsln, sfmin, tol, 2, cwork( n+1 ), lwork-n,
790  $ ierr )
791 *
792  CALL cgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, cwork( n4+1 ),
793  $ sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
794  $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
795  $ ierr )
796 *
797  CALL cgsvj1( jobv, n2, n2, n4, a, lda, cwork, sva, mvl, v,
798  $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
799  $ lwork-n, ierr )
800 *
801  CALL cgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
802  $ cwork( n2+1 ), sva( n2+1 ), mvl,
803  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
804  $ cwork( n+1 ), lwork-n, ierr )
805 
806  END IF
807 *
808  END IF
809 *
810 * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
811 *
812  DO 1993 i = 1, nsweep
813 *
814 * .. go go go ...
815 *
816  mxaapq = zero
817  mxsinj = zero
818  iswrot = 0
819 *
820  notrot = 0
821  pskipped = 0
822 *
823 * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
824 * 1 <= p < q <= N. This is the first step toward a blocked implementation
825 * of the rotations. New implementation, based on block transformations,
826 * is under development.
827 *
828  DO 2000 ibr = 1, nbl
829 *
830  igl = ( ibr-1 )*kbl + 1
831 *
832  DO 1002 ir1 = 0, min( lkahead, nbl-ibr )
833 *
834  igl = igl + ir1*kbl
835 *
836  DO 2001 p = igl, min( igl+kbl-1, n-1 )
837 *
838 * .. de Rijk's pivoting
839 *
840  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
841  IF( p.NE.q ) THEN
842  CALL cswap( m, a( 1, p ), 1, a( 1, q ), 1 )
843  IF( rsvec )CALL cswap( mvl, v( 1, p ), 1,
844  $ v( 1, q ), 1 )
845  temp1 = sva( p )
846  sva( p ) = sva( q )
847  sva( q ) = temp1
848  aapq = cwork(p)
849  cwork(p) = cwork(q)
850  cwork(q) = aapq
851  END IF
852 *
853  IF( ir1.EQ.0 ) THEN
854 *
855 * Column norms are periodically updated by explicit
856 * norm computation.
857 *[!] Caveat:
858 * Unfortunately, some BLAS implementations compute SCNRM2(M,A(1,p),1)
859 * as SQRT(S=CDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to
860 * overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
861 * underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
862 * Hence, SCNRM2 cannot be trusted, not even in the case when
863 * the true norm is far from the under(over)flow boundaries.
864 * If properly implemented SCNRM2 is available, the IF-THEN-ELSE-END IF
865 * below should be replaced with "AAPP = SCNRM2( M, A(1,p), 1 )".
866 *
867  IF( ( sva( p ).LT.rootbig ) .AND.
868  $ ( sva( p ).GT.rootsfmin ) ) THEN
869  sva( p ) = scnrm2( m, a( 1, p ), 1 )
870  ELSE
871  temp1 = zero
872  aapp = one
873  CALL classq( m, a( 1, p ), 1, temp1, aapp )
874  sva( p ) = temp1*sqrt( aapp )
875  END IF
876  aapp = sva( p )
877  ELSE
878  aapp = sva( p )
879  END IF
880 *
881  IF( aapp.GT.zero ) THEN
882 *
883  pskipped = 0
884 *
885  DO 2002 q = p + 1, min( igl+kbl-1, n )
886 *
887  aaqq = sva( q )
888 *
889  IF( aaqq.GT.zero ) THEN
890 *
891  aapp0 = aapp
892  IF( aaqq.GE.one ) THEN
893  rotok = ( small*aapp ).LE.aaqq
894  IF( aapp.LT.( big / aaqq ) ) THEN
895  aapq = ( cdotc( m, a( 1, p ), 1,
896  $ a( 1, q ), 1 ) / aaqq ) / aapp
897  ELSE
898  CALL ccopy( m, a( 1, p ), 1,
899  $ cwork(n+1), 1 )
900  CALL clascl( 'G', 0, 0, aapp, one,
901  $ m, 1, cwork(n+1), lda, ierr )
902  aapq = cdotc( m, cwork(n+1), 1,
903  $ a( 1, q ), 1 ) / aaqq
904  END IF
905  ELSE
906  rotok = aapp.LE.( aaqq / small )
907  IF( aapp.GT.( small / aaqq ) ) THEN
908  aapq = ( cdotc( m, a( 1, p ), 1,
909  $ a( 1, q ), 1 ) / aapp ) / aaqq
910  ELSE
911  CALL ccopy( m, a( 1, q ), 1,
912  $ cwork(n+1), 1 )
913  CALL clascl( 'G', 0, 0, aaqq,
914  $ one, m, 1,
915  $ cwork(n+1), lda, ierr )
916  aapq = cdotc( m, a(1, p ), 1,
917  $ cwork(n+1), 1 ) / aapp
918  END IF
919  END IF
920 *
921 * AAPQ = AAPQ * CONJG( CWORK(p) ) * CWORK(q)
922  aapq1 = -abs(aapq)
923  mxaapq = max( mxaapq, -aapq1 )
924 *
925 * TO rotate or NOT to rotate, THAT is the question ...
926 *
927  IF( abs( aapq1 ).GT.tol ) THEN
928  ompq = aapq / abs(aapq)
929 *
930 * .. rotate
931 *[RTD] ROTATED = ROTATED + ONE
932 *
933  IF( ir1.EQ.0 ) THEN
934  notrot = 0
935  pskipped = 0
936  iswrot = iswrot + 1
937  END IF
938 *
939  IF( rotok ) THEN
940 *
941  aqoap = aaqq / aapp
942  apoaq = aapp / aaqq
943  theta = -half*abs( aqoap-apoaq )/aapq1
944 *
945  IF( abs( theta ).GT.bigtheta ) THEN
946 *
947  t = half / theta
948  cs = one
949 
950  CALL crot( m, a(1,p), 1, a(1,q), 1,
951  $ cs, conjg(ompq)*t )
952  IF ( rsvec ) THEN
953  CALL crot( mvl, v(1,p), 1,
954  $ v(1,q), 1, cs, conjg(ompq)*t )
955  END IF
956 
957  sva( q ) = aaqq*sqrt( max( zero,
958  $ one+t*apoaq*aapq1 ) )
959  aapp = aapp*sqrt( max( zero,
960  $ one-t*aqoap*aapq1 ) )
961  mxsinj = max( mxsinj, abs( t ) )
962 *
963  ELSE
964 *
965 * .. choose correct signum for THETA and rotate
966 *
967  thsign = -sign( one, aapq1 )
968  t = one / ( theta+thsign*
969  $ sqrt( one+theta*theta ) )
970  cs = sqrt( one / ( one+t*t ) )
971  sn = t*cs
972 *
973  mxsinj = max( mxsinj, abs( sn ) )
974  sva( q ) = aaqq*sqrt( max( zero,
975  $ one+t*apoaq*aapq1 ) )
976  aapp = aapp*sqrt( max( zero,
977  $ one-t*aqoap*aapq1 ) )
978 *
979  CALL crot( m, a(1,p), 1, a(1,q), 1,
980  $ cs, conjg(ompq)*sn )
981  IF ( rsvec ) THEN
982  CALL crot( mvl, v(1,p), 1,
983  $ v(1,q), 1, cs, conjg(ompq)*sn )
984  END IF
985  END IF
986  cwork(p) = -cwork(q) * ompq
987 *
988  ELSE
989 * .. have to use modified Gram-Schmidt like transformation
990  CALL ccopy( m, a( 1, p ), 1,
991  $ cwork(n+1), 1 )
992  CALL clascl( 'G', 0, 0, aapp, one, m,
993  $ 1, cwork(n+1), lda,
994  $ ierr )
995  CALL clascl( 'G', 0, 0, aaqq, one, m,
996  $ 1, a( 1, q ), lda, ierr )
997  CALL caxpy( m, -aapq, cwork(n+1), 1,
998  $ a( 1, q ), 1 )
999  CALL clascl( 'G', 0, 0, one, aaqq, m,
1000  $ 1, a( 1, q ), lda, ierr )
1001  sva( q ) = aaqq*sqrt( max( zero,
1002  $ one-aapq1*aapq1 ) )
1003  mxsinj = max( mxsinj, sfmin )
1004  END IF
1005 * END IF ROTOK THEN ... ELSE
1006 *
1007 * In the case of cancellation in updating SVA(q), SVA(p)
1008 * recompute SVA(q), SVA(p).
1009 *
1010  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1011  $ THEN
1012  IF( ( aaqq.LT.rootbig ) .AND.
1013  $ ( aaqq.GT.rootsfmin ) ) THEN
1014  sva( q ) = scnrm2( m, a( 1, q ), 1 )
1015  ELSE
1016  t = zero
1017  aaqq = one
1018  CALL classq( m, a( 1, q ), 1, t,
1019  $ aaqq )
1020  sva( q ) = t*sqrt( aaqq )
1021  END IF
1022  END IF
1023  IF( ( aapp / aapp0 ).LE.rooteps ) THEN
1024  IF( ( aapp.LT.rootbig ) .AND.
1025  $ ( aapp.GT.rootsfmin ) ) THEN
1026  aapp = scnrm2( m, a( 1, p ), 1 )
1027  ELSE
1028  t = zero
1029  aapp = one
1030  CALL classq( m, a( 1, p ), 1, t,
1031  $ aapp )
1032  aapp = t*sqrt( aapp )
1033  END IF
1034  sva( p ) = aapp
1035  END IF
1036 *
1037  ELSE
1038 * A(:,p) and A(:,q) already numerically orthogonal
1039  IF( ir1.EQ.0 )notrot = notrot + 1
1040 *[RTD] SKIPPED = SKIPPED + 1
1041  pskipped = pskipped + 1
1042  END IF
1043  ELSE
1044 * A(:,q) is zero column
1045  IF( ir1.EQ.0 )notrot = notrot + 1
1046  pskipped = pskipped + 1
1047  END IF
1048 *
1049  IF( ( i.LE.swband ) .AND.
1050  $ ( pskipped.GT.rowskip ) ) THEN
1051  IF( ir1.EQ.0 )aapp = -aapp
1052  notrot = 0
1053  GO TO 2103
1054  END IF
1055 *
1056  2002 CONTINUE
1057 * END q-LOOP
1058 *
1059  2103 CONTINUE
1060 * bailed out of q-loop
1061 *
1062  sva( p ) = aapp
1063 *
1064  ELSE
1065  sva( p ) = aapp
1066  IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
1067  $ notrot = notrot + min( igl+kbl-1, n ) - p
1068  END IF
1069 *
1070  2001 CONTINUE
1071 * end of the p-loop
1072 * end of doing the block ( ibr, ibr )
1073  1002 CONTINUE
1074 * end of ir1-loop
1075 *
1076 * ... go to the off diagonal blocks
1077 *
1078  igl = ( ibr-1 )*kbl + 1
1079 *
1080  DO 2010 jbc = ibr + 1, nbl
1081 *
1082  jgl = ( jbc-1 )*kbl + 1
1083 *
1084 * doing the block at ( ibr, jbc )
1085 *
1086  ijblsk = 0
1087  DO 2100 p = igl, min( igl+kbl-1, n )
1088 *
1089  aapp = sva( p )
1090  IF( aapp.GT.zero ) THEN
1091 *
1092  pskipped = 0
1093 *
1094  DO 2200 q = jgl, min( jgl+kbl-1, n )
1095 *
1096  aaqq = sva( q )
1097  IF( aaqq.GT.zero ) THEN
1098  aapp0 = aapp
1099 *
1100 * .. M x 2 Jacobi SVD ..
1101 *
1102 * Safe Gram matrix computation
1103 *
1104  IF( aaqq.GE.one ) THEN
1105  IF( aapp.GE.aaqq ) THEN
1106  rotok = ( small*aapp ).LE.aaqq
1107  ELSE
1108  rotok = ( small*aaqq ).LE.aapp
1109  END IF
1110  IF( aapp.LT.( big / aaqq ) ) THEN
1111  aapq = ( cdotc( m, a( 1, p ), 1,
1112  $ a( 1, q ), 1 ) / aaqq ) / aapp
1113  ELSE
1114  CALL ccopy( m, a( 1, p ), 1,
1115  $ cwork(n+1), 1 )
1116  CALL clascl( 'G', 0, 0, aapp,
1117  $ one, m, 1,
1118  $ cwork(n+1), lda, ierr )
1119  aapq = cdotc( m, cwork(n+1), 1,
1120  $ a( 1, q ), 1 ) / aaqq
1121  END IF
1122  ELSE
1123  IF( aapp.GE.aaqq ) THEN
1124  rotok = aapp.LE.( aaqq / small )
1125  ELSE
1126  rotok = aaqq.LE.( aapp / small )
1127  END IF
1128  IF( aapp.GT.( small / aaqq ) ) THEN
1129  aapq = ( cdotc( m, a( 1, p ), 1,
1130  $ a( 1, q ), 1 ) / max(aaqq,aapp) )
1131  $ / min(aaqq,aapp)
1132  ELSE
1133  CALL ccopy( m, a( 1, q ), 1,
1134  $ cwork(n+1), 1 )
1135  CALL clascl( 'G', 0, 0, aaqq,
1136  $ one, m, 1,
1137  $ cwork(n+1), lda, ierr )
1138  aapq = cdotc( m, a( 1, p ), 1,
1139  $ cwork(n+1), 1 ) / aapp
1140  END IF
1141  END IF
1142 *
1143 * AAPQ = AAPQ * CONJG(CWORK(p))*CWORK(q)
1144  aapq1 = -abs(aapq)
1145  mxaapq = max( mxaapq, -aapq1 )
1146 *
1147 * TO rotate or NOT to rotate, THAT is the question ...
1148 *
1149  IF( abs( aapq1 ).GT.tol ) THEN
1150  ompq = aapq / abs(aapq)
1151  notrot = 0
1152 *[RTD] ROTATED = ROTATED + 1
1153  pskipped = 0
1154  iswrot = iswrot + 1
1155 *
1156  IF( rotok ) THEN
1157 *
1158  aqoap = aaqq / aapp
1159  apoaq = aapp / aaqq
1160  theta = -half*abs( aqoap-apoaq )/ aapq1
1161  IF( aaqq.GT.aapp0 )theta = -theta
1162 *
1163  IF( abs( theta ).GT.bigtheta ) THEN
1164  t = half / theta
1165  cs = one
1166  CALL crot( m, a(1,p), 1, a(1,q), 1,
1167  $ cs, conjg(ompq)*t )
1168  IF( rsvec ) THEN
1169  CALL crot( mvl, v(1,p), 1,
1170  $ v(1,q), 1, cs, conjg(ompq)*t )
1171  END IF
1172  sva( q ) = aaqq*sqrt( max( zero,
1173  $ one+t*apoaq*aapq1 ) )
1174  aapp = aapp*sqrt( max( zero,
1175  $ one-t*aqoap*aapq1 ) )
1176  mxsinj = max( mxsinj, abs( t ) )
1177  ELSE
1178 *
1179 * .. choose correct signum for THETA and rotate
1180 *
1181  thsign = -sign( one, aapq1 )
1182  IF( aaqq.GT.aapp0 )thsign = -thsign
1183  t = one / ( theta+thsign*
1184  $ sqrt( one+theta*theta ) )
1185  cs = sqrt( one / ( one+t*t ) )
1186  sn = t*cs
1187  mxsinj = max( mxsinj, abs( sn ) )
1188  sva( q ) = aaqq*sqrt( max( zero,
1189  $ one+t*apoaq*aapq1 ) )
1190  aapp = aapp*sqrt( max( zero,
1191  $ one-t*aqoap*aapq1 ) )
1192 *
1193  CALL crot( m, a(1,p), 1, a(1,q), 1,
1194  $ cs, conjg(ompq)*sn )
1195  IF( rsvec ) THEN
1196  CALL crot( mvl, v(1,p), 1,
1197  $ v(1,q), 1, cs, conjg(ompq)*sn )
1198  END IF
1199  END IF
1200  cwork(p) = -cwork(q) * ompq
1201 *
1202  ELSE
1203 * .. have to use modified Gram-Schmidt like transformation
1204  IF( aapp.GT.aaqq ) THEN
1205  CALL ccopy( m, a( 1, p ), 1,
1206  $ cwork(n+1), 1 )
1207  CALL clascl( 'G', 0, 0, aapp, one,
1208  $ m, 1, cwork(n+1),lda,
1209  $ ierr )
1210  CALL clascl( 'G', 0, 0, aaqq, one,
1211  $ m, 1, a( 1, q ), lda,
1212  $ ierr )
1213  CALL caxpy( m, -aapq, cwork(n+1),
1214  $ 1, a( 1, q ), 1 )
1215  CALL clascl( 'G', 0, 0, one, aaqq,
1216  $ m, 1, a( 1, q ), lda,
1217  $ ierr )
1218  sva( q ) = aaqq*sqrt( max( zero,
1219  $ one-aapq1*aapq1 ) )
1220  mxsinj = max( mxsinj, sfmin )
1221  ELSE
1222  CALL ccopy( m, a( 1, q ), 1,
1223  $ cwork(n+1), 1 )
1224  CALL clascl( 'G', 0, 0, aaqq, one,
1225  $ m, 1, cwork(n+1),lda,
1226  $ ierr )
1227  CALL clascl( 'G', 0, 0, aapp, one,
1228  $ m, 1, a( 1, p ), lda,
1229  $ ierr )
1230  CALL caxpy( m, -conjg(aapq),
1231  $ cwork(n+1), 1, a( 1, p ), 1 )
1232  CALL clascl( 'G', 0, 0, one, aapp,
1233  $ m, 1, a( 1, p ), lda,
1234  $ ierr )
1235  sva( p ) = aapp*sqrt( max( zero,
1236  $ one-aapq1*aapq1 ) )
1237  mxsinj = max( mxsinj, sfmin )
1238  END IF
1239  END IF
1240 * END IF ROTOK THEN ... ELSE
1241 *
1242 * In the case of cancellation in updating SVA(q), SVA(p)
1243 * .. recompute SVA(q), SVA(p)
1244  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1245  $ THEN
1246  IF( ( aaqq.LT.rootbig ) .AND.
1247  $ ( aaqq.GT.rootsfmin ) ) THEN
1248  sva( q ) = scnrm2( m, a( 1, q ), 1)
1249  ELSE
1250  t = zero
1251  aaqq = one
1252  CALL classq( m, a( 1, q ), 1, t,
1253  $ aaqq )
1254  sva( q ) = t*sqrt( aaqq )
1255  END IF
1256  END IF
1257  IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
1258  IF( ( aapp.LT.rootbig ) .AND.
1259  $ ( aapp.GT.rootsfmin ) ) THEN
1260  aapp = scnrm2( m, a( 1, p ), 1 )
1261  ELSE
1262  t = zero
1263  aapp = one
1264  CALL classq( m, a( 1, p ), 1, t,
1265  $ aapp )
1266  aapp = t*sqrt( aapp )
1267  END IF
1268  sva( p ) = aapp
1269  END IF
1270 * end of OK rotation
1271  ELSE
1272  notrot = notrot + 1
1273 *[RTD] SKIPPED = SKIPPED + 1
1274  pskipped = pskipped + 1
1275  ijblsk = ijblsk + 1
1276  END IF
1277  ELSE
1278  notrot = notrot + 1
1279  pskipped = pskipped + 1
1280  ijblsk = ijblsk + 1
1281  END IF
1282 *
1283  IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
1284  $ THEN
1285  sva( p ) = aapp
1286  notrot = 0
1287  GO TO 2011
1288  END IF
1289  IF( ( i.LE.swband ) .AND.
1290  $ ( pskipped.GT.rowskip ) ) THEN
1291  aapp = -aapp
1292  notrot = 0
1293  GO TO 2203
1294  END IF
1295 *
1296  2200 CONTINUE
1297 * end of the q-loop
1298  2203 CONTINUE
1299 *
1300  sva( p ) = aapp
1301 *
1302  ELSE
1303 *
1304  IF( aapp.EQ.zero )notrot = notrot +
1305  $ min( jgl+kbl-1, n ) - jgl + 1
1306  IF( aapp.LT.zero )notrot = 0
1307 *
1308  END IF
1309 *
1310  2100 CONTINUE
1311 * end of the p-loop
1312  2010 CONTINUE
1313 * end of the jbc-loop
1314  2011 CONTINUE
1315 *2011 bailed out of the jbc-loop
1316  DO 2012 p = igl, min( igl+kbl-1, n )
1317  sva( p ) = abs( sva( p ) )
1318  2012 CONTINUE
1319 ***
1320  2000 CONTINUE
1321 *2000 :: end of the ibr-loop
1322 *
1323 * .. update SVA(N)
1324  IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
1325  $ THEN
1326  sva( n ) = scnrm2( m, a( 1, n ), 1 )
1327  ELSE
1328  t = zero
1329  aapp = one
1330  CALL classq( m, a( 1, n ), 1, t, aapp )
1331  sva( n ) = t*sqrt( aapp )
1332  END IF
1333 *
1334 * Additional steering devices
1335 *
1336  IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
1337  $ ( iswrot.LE.n ) ) )swband = i
1338 *
1339  IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( REAL( N ) )*
1340  $ tol ) .AND. ( REAL( n )*mxaapq*mxsinj.LT.tol ) ) then
1341  GO TO 1994
1342  END IF
1343 *
1344  IF( notrot.GE.emptsw )GO TO 1994
1345 *
1346  1993 CONTINUE
1347 * end i=1:NSWEEP loop
1348 *
1349 * #:( Reaching this point means that the procedure has not converged.
1350  info = nsweep - 1
1351  GO TO 1995
1352 *
1353  1994 CONTINUE
1354 * #:) Reaching this point means numerical convergence after the i-th
1355 * sweep.
1356 *
1357  info = 0
1358 * #:) INFO = 0 confirms successful iterations.
1359  1995 CONTINUE
1360 *
1361 * Sort the singular values and find how many are above
1362 * the underflow threshold.
1363 *
1364  n2 = 0
1365  n4 = 0
1366  DO 5991 p = 1, n - 1
1367  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
1368  IF( p.NE.q ) THEN
1369  temp1 = sva( p )
1370  sva( p ) = sva( q )
1371  sva( q ) = temp1
1372  CALL cswap( m, a( 1, p ), 1, a( 1, q ), 1 )
1373  IF( rsvec )CALL cswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
1374  END IF
1375  IF( sva( p ).NE.zero ) THEN
1376  n4 = n4 + 1
1377  IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
1378  END IF
1379  5991 CONTINUE
1380  IF( sva( n ).NE.zero ) THEN
1381  n4 = n4 + 1
1382  IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
1383  END IF
1384 *
1385 * Normalize the left singular vectors.
1386 *
1387  IF( lsvec .OR. uctol ) THEN
1388  DO 1998 p = 1, n4
1389 * CALL CSSCAL( M, ONE / SVA( p ), A( 1, p ), 1 )
1390  CALL clascl( 'G',0,0, sva(p), one, m, 1, a(1,p), m, ierr )
1391  1998 CONTINUE
1392  END IF
1393 *
1394 * Scale the product of Jacobi rotations.
1395 *
1396  IF( rsvec ) THEN
1397  DO 2399 p = 1, n
1398  temp1 = one / scnrm2( mvl, v( 1, p ), 1 )
1399  CALL csscal( mvl, temp1, v( 1, p ), 1 )
1400  2399 CONTINUE
1401  END IF
1402 *
1403 * Undo scaling, if necessary (and possible).
1404  IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
1405  $ .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
1406  $ ( sfmin / skl ) ) ) ) THEN
1407  DO 2400 p = 1, n
1408  sva( p ) = skl*sva( p )
1409  2400 CONTINUE
1410  skl = one
1411  END IF
1412 *
1413  rwork( 1 ) = skl
1414 * The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1415 * then some of the singular values may overflow or underflow and
1416 * the spectrum is given in this factored representation.
1417 *
1418  rwork( 2 ) = REAL( n4 )
1419 * N4 is the number of computed nonzero singular values of A.
1420 *
1421  rwork( 3 ) = REAL( n2 )
1422 * N2 is the number of singular values of A greater than SFMIN.
1423 * If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1424 * that may carry some information.
1425 *
1426  rwork( 4 ) = REAL( i )
1427 * i is the index of the last sweep before declaring convergence.
1428 *
1429  rwork( 5 ) = mxaapq
1430 * MXAAPQ is the largest absolute value of scaled pivots in the
1431 * last sweep
1432 *
1433  rwork( 6 ) = mxsinj
1434 * MXSINJ is the largest absolute value of the sines of Jacobi angles
1435 * in the last sweep
1436 *
1437  RETURN
1438 * ..
1439 * .. END OF CGESVJ
1440 * ..
real function scnrm2(N, X, INCX)
SCNRM2
Definition: scnrm2.f:77
subroutine cgsvj0(JOBV, M, N, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
CGSVJ0 pre-processor for the routine cgesvj.
Definition: cgsvj0.f:220
subroutine classq(N, X, INCX, SCALE, SUMSQ)
CLASSQ updates a sum of squares represented in scaled form.
Definition: classq.f:108
subroutine crot(N, CX, INCX, CY, INCY, C, S)
CROT applies a plane rotation with real cosine and complex sine to a pair of complex vectors...
Definition: crot.f:105
subroutine claset(UPLO, M, N, ALPHA, BETA, A, LDA)
CLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values...
Definition: claset.f:108
subroutine clascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
CLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: clascl.f:145
integer function isamax(N, SX, INCX)
ISAMAX
Definition: isamax.f:73
complex function cdotc(N, CX, INCX, CY, INCY)
CDOTC
Definition: cdotc.f:85
subroutine xerbla(SRNAME, INFO)
XERBLA
Definition: xerbla.f:62
subroutine slascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
SLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: slascl.f:145
logical function lsame(CA, CB)
LSAME
Definition: lsame.f:55
real function slamch(CMACH)
SLAMCH
Definition: slamch.f:69
subroutine ccopy(N, CX, INCX, CY, INCY)
CCOPY
Definition: ccopy.f:83
subroutine cswap(N, CX, INCX, CY, INCY)
CSWAP
Definition: cswap.f:83
subroutine cgsvj1(JOBV, M, N, N1, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
CGSVJ1 pre-processor for the routine cgesvj, applies Jacobi rotations targeting only particular pivot...
Definition: cgsvj1.f:238
subroutine caxpy(N, CA, CX, INCX, CY, INCY)
CAXPY
Definition: caxpy.f:90
subroutine csscal(N, SA, CX, INCX)
CSSCAL
Definition: csscal.f:80
Here is the call graph for this function:
Here is the caller graph for this function: