LAPACK  3.6.1
LAPACK: Linear Algebra PACKage
subroutine zgesvj ( character*1  JOBA,
character*1  JOBU,
character*1  JOBV,
integer  M,
integer  N,
complex*16, dimension( lda, * )  A,
integer  LDA,
double precision, dimension( n )  SVA,
integer  MV,
complex*16, dimension( ldv, * )  V,
integer  LDV,
complex*16, dimension( lwork )  CWORK,
integer  LWORK,
double precision, dimension( lrwork )  RWORK,
integer  LRWORK,
integer  INFO 
)

ZGESVJ

Download ZGESVJ + dependencies [TGZ] [ZIP] [TXT]

Purpose:
 ZGESVJ computes the singular value decomposition (SVD) of a complex
 M-by-N matrix A, where M >= N. The SVD of A is written as
                                    [++]   [xx]   [x0]   [xx]
              A = U * SIGMA * V^*,  [++] = [xx] * [ox] * [xx]
                                    [++]   [xx]
 where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
 matrix, and V is an N-by-N unitary matrix. The diagonal elements
 of SIGMA are the singular values of A. The columns of U and V are the
 left and the right singular vectors of A, respectively.
Parameters
[in]JOBA
          JOBA is CHARACTER* 1
          Specifies the structure of A.
          = 'L': The input matrix A is lower triangular;
          = 'U': The input matrix A is upper triangular;
          = 'G': The input matrix A is general M-by-N matrix, M >= N.
[in]JOBU
          JOBU is CHARACTER*1
          Specifies whether to compute the left singular vectors
          (columns of U):
          = 'U': The left singular vectors corresponding to the nonzero
                 singular values are computed and returned in the leading
                 columns of A. See more details in the description of A.
                 The default numerical orthogonality threshold is set to
                 approximately TOL=CTOL*EPS, CTOL=DSQRT(M), EPS=DLAMCH('E').
          = 'C': Analogous to JOBU='U', except that user can control the
                 level of numerical orthogonality of the computed left
                 singular vectors. TOL can be set to TOL = CTOL*EPS, where
                 CTOL is given on input in the array WORK.
                 No CTOL smaller than ONE is allowed. CTOL greater
                 than 1 / EPS is meaningless. The option 'C'
                 can be used if M*EPS is satisfactory orthogonality
                 of the computed left singular vectors, so CTOL=M could
                 save few sweeps of Jacobi rotations.
                 See the descriptions of A and WORK(1).
          = 'N': The matrix U is not computed. However, see the
                 description of A.
[in]JOBV
          JOBV is CHARACTER*1
          Specifies whether to compute the right singular vectors, that
          is, the matrix V:
          = 'V' : the matrix V is computed and returned in the array V
          = 'A' : the Jacobi rotations are applied to the MV-by-N
                  array V. In other words, the right singular vector
                  matrix V is not computed explicitly, instead it is
                  applied to an MV-by-N matrix initially stored in the
                  first MV rows of V.
          = 'N' : the matrix V is not computed and the array V is not
                  referenced
[in]M
          M is INTEGER
          The number of rows of the input matrix A. 1/DLAMCH('E') > M >= 0.  
[in]N
          N is INTEGER
          The number of columns of the input matrix A.
          M >= N >= 0.
[in,out]A
          A is COMPLEX*16 array, dimension (LDA,N)
          On entry, the M-by-N matrix A.
          On exit,
          If JOBU .EQ. 'U' .OR. JOBU .EQ. 'C':
                 If INFO .EQ. 0 :
                 RANKA orthonormal columns of U are returned in the
                 leading RANKA columns of the array A. Here RANKA <= N
                 is the number of computed singular values of A that are
                 above the underflow threshold DLAMCH('S'). The singular
                 vectors corresponding to underflowed or zero singular
                 values are not computed. The value of RANKA is returned
                 in the array RWORK as RANKA=NINT(RWORK(2)). Also see the
                 descriptions of SVA and RWORK. The computed columns of U
                 are mutually numerically orthogonal up to approximately
                 TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU.EQ.'C'),
                 see the description of JOBU.
                 If INFO .GT. 0,
                 the procedure ZGESVJ did not converge in the given number
                 of iterations (sweeps). In that case, the computed
                 columns of U may not be orthogonal up to TOL. The output
                 U (stored in A), SIGMA (given by the computed singular
                 values in SVA(1:N)) and V is still a decomposition of the
                 input matrix A in the sense that the residual
                 || A - SCALE * U * SIGMA * V^* ||_2 / ||A||_2 is small.
          If JOBU .EQ. 'N':
                 If INFO .EQ. 0 :
                 Note that the left singular vectors are 'for free' in the
                 one-sided Jacobi SVD algorithm. However, if only the
                 singular values are needed, the level of numerical
                 orthogonality of U is not an issue and iterations are
                 stopped when the columns of the iterated matrix are
                 numerically orthogonal up to approximately M*EPS. Thus,
                 on exit, A contains the columns of U scaled with the
                 corresponding singular values.
                 If INFO .GT. 0 :
                 the procedure ZGESVJ did not converge in the given number
                 of iterations (sweeps).
[in]LDA
          LDA is INTEGER
          The leading dimension of the array A.  LDA >= max(1,M).
[out]SVA
          SVA is DOUBLE PRECISION array, dimension (N)
          On exit,
          If INFO .EQ. 0 :
          depending on the value SCALE = RWORK(1), we have:
                 If SCALE .EQ. ONE:
                 SVA(1:N) contains the computed singular values of A.
                 During the computation SVA contains the Euclidean column
                 norms of the iterated matrices in the array A.
                 If SCALE .NE. ONE:
                 The singular values of A are SCALE*SVA(1:N), and this
                 factored representation is due to the fact that some of the
                 singular values of A might underflow or overflow.

          If INFO .GT. 0 :
          the procedure ZGESVJ did not converge in the given number of
          iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.
[in]MV
          MV is INTEGER
          If JOBV .EQ. 'A', then the product of Jacobi rotations in ZGESVJ
          is applied to the first MV rows of V. See the description of JOBV.
[in,out]V
          V is COMPLEX*16 array, dimension (LDV,N)
          If JOBV = 'V', then V contains on exit the N-by-N matrix of
                         the right singular vectors;
          If JOBV = 'A', then V contains the product of the computed right
                         singular vector matrix and the initial matrix in
                         the array V.
          If JOBV = 'N', then V is not referenced.
[in]LDV
          LDV is INTEGER
          The leading dimension of the array V, LDV .GE. 1.
          If JOBV .EQ. 'V', then LDV .GE. max(1,N).
          If JOBV .EQ. 'A', then LDV .GE. max(1,MV) .
[in,out]CWORK
          CWORK is COMPLEX*16 array, dimension M+N.
          Used as work space.
[in]LWORK
          LWORK is INTEGER.
          Length of CWORK, LWORK >= M+N.
[in,out]RWORK
          RWORK is DOUBLE PRECISION array, dimension max(6,M+N).
          On entry,
          If JOBU .EQ. 'C' :
          RWORK(1) = CTOL, where CTOL defines the threshold for convergence.
                    The process stops if all columns of A are mutually
                    orthogonal up to CTOL*EPS, EPS=DLAMCH('E').
                    It is required that CTOL >= ONE, i.e. it is not
                    allowed to force the routine to obtain orthogonality
                    below EPSILON.
          On exit,
          RWORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N)
                    are the computed singular values of A.
                    (See description of SVA().)
          RWORK(2) = NINT(RWORK(2)) is the number of the computed nonzero
                    singular values.
          RWORK(3) = NINT(RWORK(3)) is the number of the computed singular
                    values that are larger than the underflow threshold.
          RWORK(4) = NINT(RWORK(4)) is the number of sweeps of Jacobi
                    rotations needed for numerical convergence.
          RWORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep.
                    This is useful information in cases when ZGESVJ did
                    not converge, as it can be used to estimate whether
                    the output is stil useful and for post festum analysis.
          RWORK(6) = the largest absolute value over all sines of the
                    Jacobi rotation angles in the last sweep. It can be
                    useful for a post festum analysis.
[in]LRWORK
          LRWORK is INTEGER
         Length of RWORK, LRWORK >= MAX(6,N).
[out]INFO
          INFO is INTEGER
          = 0 : successful exit.
          < 0 : if INFO = -i, then the i-th argument had an illegal value
          > 0 : ZGESVJ did not converge in the maximal allowed number 
                (NSWEEP=30) of sweeps. The output may still be useful. 
                See the description of RWORK.
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Date
June 2016
Further Details:
 The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane
 rotations. In the case of underflow of the tangent of the Jacobi angle, a
 modified Jacobi transformation of Drmac [3] is used. Pivot strategy uses
 column interchanges of de Rijk [1]. The relative accuracy of the computed
 singular values and the accuracy of the computed singular vectors (in
 angle metric) is as guaranteed by the theory of Demmel and Veselic [2].
 The condition number that determines the accuracy in the full rank case
 is essentially min_{D=diag} kappa(A*D), where kappa(.) is the
 spectral condition number. The best performance of this Jacobi SVD
 procedure is achieved if used in an  accelerated version of Drmac and
 Veselic [4,5], and it is the kernel routine in the SIGMA library [6].
 Some tunning parameters (marked with [TP]) are available for the
 implementer. 
 The computational range for the nonzero singular values is the  machine
 number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even
 denormalized singular values can be computed with the corresponding
 gradual loss of accurate digits.
Contributors:
  ============

  Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany)
References:
[1] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the singular value decomposition on a vector computer. SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371. [2] J. Demmel and K. Veselic: Jacobi method is more accurate than QR. [3] Z. Drmac: Implementation of Jacobi rotations for accurate singular value computation in floating point arithmetic. SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222. [4] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I. SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342. LAPACK Working note 169. [5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II. SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362. LAPACK Working note 170. [6] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV, QSVD, (H,K)-SVD computations. Department of Mathematics, University of Zagreb, 2008, 2015.
Bugs, examples and comments:
  ===========================
  Please report all bugs and send interesting test examples and comments to
  drmac@math.hr. Thank you.

Definition at line 344 of file zgesvj.f.

344 *
345 * -- LAPACK computational routine (version 3.6.1) --
346 * -- LAPACK is a software package provided by Univ. of Tennessee, --
347 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
348 * June 2016
349 *
350  IMPLICIT NONE
351 * .. Scalar Arguments ..
352  INTEGER info, lda, ldv, lwork, lrwork, m, mv, n
353  CHARACTER*1 joba, jobu, jobv
354 * ..
355 * .. Array Arguments ..
356  COMPLEX*16 a( lda, * ), v( ldv, * ), cwork( lwork )
357  DOUBLE PRECISION rwork( lrwork ), sva( n )
358 * ..
359 *
360 * =====================================================================
361 *
362 * .. Local Parameters ..
363  DOUBLE PRECISION zero, half, one
364  parameter( zero = 0.0d0, half = 0.5d0, one = 1.0d0)
365  COMPLEX*16 czero, cone
366  parameter( czero = (0.0d0, 0.0d0), cone = (1.0d0, 0.0d0) )
367  INTEGER nsweep
368  parameter( nsweep = 30 )
369 * ..
370 * .. Local Scalars ..
371  COMPLEX*16 aapq, ompq
372  DOUBLE PRECISION aapp, aapp0, aapq1, aaqq, apoaq, aqoap, big,
373  $ bigtheta, cs, ctol, epsln, large, mxaapq,
374  $ mxsinj, rootbig, rooteps, rootsfmin, roottol,
375  $ skl, sfmin, small, sn, t, temp1, theta, thsign, tol
376  INTEGER blskip, emptsw, i, ibr, ierr, igl, ijblsk, ir1,
377  $ iswrot, jbc, jgl, kbl, lkahead, mvl, n2, n34,
378  $ n4, nbl, notrot, p, pskipped, q, rowskip, swband
379  LOGICAL applv, goscale, lower, lsvec, noscale, rotok,
380  $ rsvec, uctol, upper
381 * ..
382 * ..
383 * .. Intrinsic Functions ..
384  INTRINSIC abs, dmax1, dmin1, dconjg, dble, min0, max0,
385  $ dsign, dsqrt
386 * ..
387 * .. External Functions ..
388 * ..
389 * from BLAS
390  DOUBLE PRECISION dznrm2
391  COMPLEX*16 zdotc
392  EXTERNAL zdotc, dznrm2
393  INTEGER idamax
394  EXTERNAL idamax
395 * from LAPACK
396  DOUBLE PRECISION dlamch
397  EXTERNAL dlamch
398  LOGICAL lsame
399  EXTERNAL lsame
400 * ..
401 * .. External Subroutines ..
402 * ..
403 * from BLAS
404  EXTERNAL zcopy, zrot, zdscal, zswap
405 * from LAPACK
406  EXTERNAL dlascl, zlascl, zlaset, zlassq, xerbla
407  EXTERNAL zgsvj0, zgsvj1
408 * ..
409 * .. Executable Statements ..
410 *
411 * Test the input arguments
412 *
413  lsvec = lsame( jobu, 'U' )
414  uctol = lsame( jobu, 'C' )
415  rsvec = lsame( jobv, 'V' )
416  applv = lsame( jobv, 'A' )
417  upper = lsame( joba, 'U' )
418  lower = lsame( joba, 'L' )
419 *
420  IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
421  info = -1
422  ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
423  info = -2
424  ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
425  info = -3
426  ELSE IF( m.LT.0 ) THEN
427  info = -4
428  ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
429  info = -5
430  ELSE IF( lda.LT.m ) THEN
431  info = -7
432  ELSE IF( mv.LT.0 ) THEN
433  info = -9
434  ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
435  $ ( applv .AND. ( ldv.LT.mv ) ) ) THEN
436  info = -11
437  ELSE IF( uctol .AND. ( rwork( 1 ).LE.one ) ) THEN
438  info = -12
439  ELSE IF( lwork.LT.( m+n ) ) THEN
440  info = -13
441  ELSE IF( lrwork.LT.max0( n, 6 ) ) THEN
442  info = -15
443  ELSE
444  info = 0
445  END IF
446 *
447 * #:(
448  IF( info.NE.0 ) THEN
449  CALL xerbla( 'ZGESVJ', -info )
450  RETURN
451  END IF
452 *
453 * #:) Quick return for void matrix
454 *
455  IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
456 *
457 * Set numerical parameters
458 * The stopping criterion for Jacobi rotations is
459 *
460 * max_{i<>j}|A(:,i)^* * A(:,j)| / (||A(:,i)||*||A(:,j)||) < CTOL*EPS
461 *
462 * where EPS is the round-off and CTOL is defined as follows:
463 *
464  IF( uctol ) THEN
465 * ... user controlled
466  ctol = rwork( 1 )
467  ELSE
468 * ... default
469  IF( lsvec .OR. rsvec .OR. applv ) THEN
470  ctol = dsqrt( dble( m ) )
471  ELSE
472  ctol = dble( m )
473  END IF
474  END IF
475 * ... and the machine dependent parameters are
476 *[!] (Make sure that DLAMCH() works properly on the target machine.)
477 *
478  epsln = dlamch( 'Epsilon' )
479  rooteps = dsqrt( epsln )
480  sfmin = dlamch( 'SafeMinimum' )
481  rootsfmin = dsqrt( sfmin )
482  small = sfmin / epsln
483  big = dlamch( 'Overflow' )
484 * BIG = ONE / SFMIN
485  rootbig = one / rootsfmin
486  large = big / dsqrt( dble( m*n ) )
487  bigtheta = one / rooteps
488 *
489  tol = ctol*epsln
490  roottol = dsqrt( tol )
491 *
492  IF( dble( m )*epsln.GE.one ) THEN
493  info = -4
494  CALL xerbla( 'ZGESVJ', -info )
495  RETURN
496  END IF
497 *
498 * Initialize the right singular vector matrix.
499 *
500  IF( rsvec ) THEN
501  mvl = n
502  CALL zlaset( 'A', mvl, n, czero, cone, v, ldv )
503  ELSE IF( applv ) THEN
504  mvl = mv
505  END IF
506  rsvec = rsvec .OR. applv
507 *
508 * Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
509 *(!) If necessary, scale A to protect the largest singular value
510 * from overflow. It is possible that saving the largest singular
511 * value destroys the information about the small ones.
512 * This initial scaling is almost minimal in the sense that the
513 * goal is to make sure that no column norm overflows, and that
514 * SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
515 * in A are detected, the procedure returns with INFO=-6.
516 *
517  skl = one / dsqrt( dble( m )*dble( n ) )
518  noscale = .true.
519  goscale = .true.
520 *
521  IF( lower ) THEN
522 * the input matrix is M-by-N lower triangular (trapezoidal)
523  DO 1874 p = 1, n
524  aapp = zero
525  aaqq = one
526  CALL zlassq( m-p+1, a( p, p ), 1, aapp, aaqq )
527  IF( aapp.GT.big ) THEN
528  info = -6
529  CALL xerbla( 'ZGESVJ', -info )
530  RETURN
531  END IF
532  aaqq = dsqrt( aaqq )
533  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
534  sva( p ) = aapp*aaqq
535  ELSE
536  noscale = .false.
537  sva( p ) = aapp*( aaqq*skl )
538  IF( goscale ) THEN
539  goscale = .false.
540  DO 1873 q = 1, p - 1
541  sva( q ) = sva( q )*skl
542  1873 CONTINUE
543  END IF
544  END IF
545  1874 CONTINUE
546  ELSE IF( upper ) THEN
547 * the input matrix is M-by-N upper triangular (trapezoidal)
548  DO 2874 p = 1, n
549  aapp = zero
550  aaqq = one
551  CALL zlassq( p, a( 1, p ), 1, aapp, aaqq )
552  IF( aapp.GT.big ) THEN
553  info = -6
554  CALL xerbla( 'ZGESVJ', -info )
555  RETURN
556  END IF
557  aaqq = dsqrt( aaqq )
558  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
559  sva( p ) = aapp*aaqq
560  ELSE
561  noscale = .false.
562  sva( p ) = aapp*( aaqq*skl )
563  IF( goscale ) THEN
564  goscale = .false.
565  DO 2873 q = 1, p - 1
566  sva( q ) = sva( q )*skl
567  2873 CONTINUE
568  END IF
569  END IF
570  2874 CONTINUE
571  ELSE
572 * the input matrix is M-by-N general dense
573  DO 3874 p = 1, n
574  aapp = zero
575  aaqq = one
576  CALL zlassq( m, a( 1, p ), 1, aapp, aaqq )
577  IF( aapp.GT.big ) THEN
578  info = -6
579  CALL xerbla( 'ZGESVJ', -info )
580  RETURN
581  END IF
582  aaqq = dsqrt( aaqq )
583  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
584  sva( p ) = aapp*aaqq
585  ELSE
586  noscale = .false.
587  sva( p ) = aapp*( aaqq*skl )
588  IF( goscale ) THEN
589  goscale = .false.
590  DO 3873 q = 1, p - 1
591  sva( q ) = sva( q )*skl
592  3873 CONTINUE
593  END IF
594  END IF
595  3874 CONTINUE
596  END IF
597 *
598  IF( noscale )skl = one
599 *
600 * Move the smaller part of the spectrum from the underflow threshold
601 *(!) Start by determining the position of the nonzero entries of the
602 * array SVA() relative to ( SFMIN, BIG ).
603 *
604  aapp = zero
605  aaqq = big
606  DO 4781 p = 1, n
607  IF( sva( p ).NE.zero )aaqq = dmin1( aaqq, sva( p ) )
608  aapp = dmax1( aapp, sva( p ) )
609  4781 CONTINUE
610 *
611 * #:) Quick return for zero matrix
612 *
613  IF( aapp.EQ.zero ) THEN
614  IF( lsvec )CALL zlaset( 'G', m, n, czero, cone, a, lda )
615  rwork( 1 ) = one
616  rwork( 2 ) = zero
617  rwork( 3 ) = zero
618  rwork( 4 ) = zero
619  rwork( 5 ) = zero
620  rwork( 6 ) = zero
621  RETURN
622  END IF
623 *
624 * #:) Quick return for one-column matrix
625 *
626  IF( n.EQ.1 ) THEN
627  IF( lsvec )CALL zlascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
628  $ a( 1, 1 ), lda, ierr )
629  rwork( 1 ) = one / skl
630  IF( sva( 1 ).GE.sfmin ) THEN
631  rwork( 2 ) = one
632  ELSE
633  rwork( 2 ) = zero
634  END IF
635  rwork( 3 ) = zero
636  rwork( 4 ) = zero
637  rwork( 5 ) = zero
638  rwork( 6 ) = zero
639  RETURN
640  END IF
641 *
642 * Protect small singular values from underflow, and try to
643 * avoid underflows/overflows in computing Jacobi rotations.
644 *
645  sn = dsqrt( sfmin / epsln )
646  temp1 = dsqrt( big / dble( n ) )
647  IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
648  $ ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
649  temp1 = dmin1( big, temp1 / aapp )
650 * AAQQ = AAQQ*TEMP1
651 * AAPP = AAPP*TEMP1
652  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
653  temp1 = dmin1( sn / aaqq, big / (aapp*dsqrt( dble(n)) ) )
654 * AAQQ = AAQQ*TEMP1
655 * AAPP = AAPP*TEMP1
656  ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
657  temp1 = dmax1( sn / aaqq, temp1 / aapp )
658 * AAQQ = AAQQ*TEMP1
659 * AAPP = AAPP*TEMP1
660  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
661  temp1 = dmin1( sn / aaqq, big / ( dsqrt( dble( n ) )*aapp ) )
662 * AAQQ = AAQQ*TEMP1
663 * AAPP = AAPP*TEMP1
664  ELSE
665  temp1 = one
666  END IF
667 *
668 * Scale, if necessary
669 *
670  IF( temp1.NE.one ) THEN
671  CALL dlascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
672  END IF
673  skl = temp1*skl
674  IF( skl.NE.one ) THEN
675  CALL zlascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
676  skl = one / skl
677  END IF
678 *
679 * Row-cyclic Jacobi SVD algorithm with column pivoting
680 *
681  emptsw = ( n*( n-1 ) ) / 2
682  notrot = 0
683 
684  DO 1868 q = 1, n
685  cwork( q ) = cone
686  1868 CONTINUE
687 *
688 *
689 *
690  swband = 3
691 *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
692 * if ZGESVJ is used as a computational routine in the preconditioned
693 * Jacobi SVD algorithm ZGEJSV. For sweeps i=1:SWBAND the procedure
694 * works on pivots inside a band-like region around the diagonal.
695 * The boundaries are determined dynamically, based on the number of
696 * pivots above a threshold.
697 *
698  kbl = min0( 8, n )
699 *[TP] KBL is a tuning parameter that defines the tile size in the
700 * tiling of the p-q loops of pivot pairs. In general, an optimal
701 * value of KBL depends on the matrix dimensions and on the
702 * parameters of the computer's memory.
703 *
704  nbl = n / kbl
705  IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
706 *
707  blskip = kbl**2
708 *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
709 *
710  rowskip = min0( 5, kbl )
711 *[TP] ROWSKIP is a tuning parameter.
712 *
713  lkahead = 1
714 *[TP] LKAHEAD is a tuning parameter.
715 *
716 * Quasi block transformations, using the lower (upper) triangular
717 * structure of the input matrix. The quasi-block-cycling usually
718 * invokes cubic convergence. Big part of this cycle is done inside
719 * canonical subspaces of dimensions less than M.
720 *
721  IF( ( lower .OR. upper ) .AND. ( n.GT.max0( 64, 4*kbl ) ) ) THEN
722 *[TP] The number of partition levels and the actual partition are
723 * tuning parameters.
724  n4 = n / 4
725  n2 = n / 2
726  n34 = 3*n4
727  IF( applv ) THEN
728  q = 0
729  ELSE
730  q = 1
731  END IF
732 *
733  IF( lower ) THEN
734 *
735 * This works very well on lower triangular matrices, in particular
736 * in the framework of the preconditioned Jacobi SVD (xGEJSV).
737 * The idea is simple:
738 * [+ 0 0 0] Note that Jacobi transformations of [0 0]
739 * [+ + 0 0] [0 0]
740 * [+ + x 0] actually work on [x 0] [x 0]
741 * [+ + x x] [x x]. [x x]
742 *
743  CALL zgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
744  $ cwork( n34+1 ), sva( n34+1 ), mvl,
745  $ v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
746  $ 2, cwork( n+1 ), lwork-n, ierr )
747 
748  CALL zgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
749  $ cwork( n2+1 ), sva( n2+1 ), mvl,
750  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
751  $ cwork( n+1 ), lwork-n, ierr )
752 
753  CALL zgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
754  $ cwork( n2+1 ), sva( n2+1 ), mvl,
755  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
756  $ cwork( n+1 ), lwork-n, ierr )
757 
758  CALL zgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
759  $ cwork( n4+1 ), sva( n4+1 ), mvl,
760  $ v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
761  $ cwork( n+1 ), lwork-n, ierr )
762 *
763  CALL zgsvj0( jobv, m, n4, a, lda, cwork, sva, mvl, v, ldv,
764  $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
765  $ ierr )
766 *
767  CALL zgsvj1( jobv, m, n2, n4, a, lda, cwork, sva, mvl, v,
768  $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
769  $ lwork-n, ierr )
770 *
771 *
772  ELSE IF( upper ) THEN
773 *
774 *
775  CALL zgsvj0( jobv, n4, n4, a, lda, cwork, sva, mvl, v, ldv,
776  $ epsln, sfmin, tol, 2, cwork( n+1 ), lwork-n,
777  $ ierr )
778 *
779  CALL zgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, cwork( n4+1 ),
780  $ sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
781  $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
782  $ ierr )
783 *
784  CALL zgsvj1( jobv, n2, n2, n4, a, lda, cwork, sva, mvl, v,
785  $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
786  $ lwork-n, ierr )
787 *
788  CALL zgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
789  $ cwork( n2+1 ), sva( n2+1 ), mvl,
790  $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
791  $ cwork( n+1 ), lwork-n, ierr )
792 
793  END IF
794 *
795  END IF
796 *
797 * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
798 *
799  DO 1993 i = 1, nsweep
800 *
801 * .. go go go ...
802 *
803  mxaapq = zero
804  mxsinj = zero
805  iswrot = 0
806 *
807  notrot = 0
808  pskipped = 0
809 *
810 * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
811 * 1 <= p < q <= N. This is the first step toward a blocked implementation
812 * of the rotations. New implementation, based on block transformations,
813 * is under development.
814 *
815  DO 2000 ibr = 1, nbl
816 *
817  igl = ( ibr-1 )*kbl + 1
818 *
819  DO 1002 ir1 = 0, min0( lkahead, nbl-ibr )
820 *
821  igl = igl + ir1*kbl
822 *
823  DO 2001 p = igl, min0( igl+kbl-1, n-1 )
824 *
825 * .. de Rijk's pivoting
826 *
827  q = idamax( n-p+1, sva( p ), 1 ) + p - 1
828  IF( p.NE.q ) THEN
829  CALL zswap( m, a( 1, p ), 1, a( 1, q ), 1 )
830  IF( rsvec )CALL zswap( mvl, v( 1, p ), 1,
831  $ v( 1, q ), 1 )
832  temp1 = sva( p )
833  sva( p ) = sva( q )
834  sva( q ) = temp1
835  aapq = cwork(p)
836  cwork(p) = cwork(q)
837  cwork(q) = aapq
838  END IF
839 *
840  IF( ir1.EQ.0 ) THEN
841 *
842 * Column norms are periodically updated by explicit
843 * norm computation.
844 *[!] Caveat:
845 * Unfortunately, some BLAS implementations compute DZNRM2(M,A(1,p),1)
846 * as SQRT(S=CDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to
847 * overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
848 * underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
849 * Hence, DZNRM2 cannot be trusted, not even in the case when
850 * the true norm is far from the under(over)flow boundaries.
851 * If properly implemented SCNRM2 is available, the IF-THEN-ELSE-END IF
852 * below should be replaced with "AAPP = DZNRM2( M, A(1,p), 1 )".
853 *
854  IF( ( sva( p ).LT.rootbig ) .AND.
855  $ ( sva( p ).GT.rootsfmin ) ) THEN
856  sva( p ) = dznrm2( m, a( 1, p ), 1 )
857  ELSE
858  temp1 = zero
859  aapp = one
860  CALL zlassq( m, a( 1, p ), 1, temp1, aapp )
861  sva( p ) = temp1*dsqrt( aapp )
862  END IF
863  aapp = sva( p )
864  ELSE
865  aapp = sva( p )
866  END IF
867 *
868  IF( aapp.GT.zero ) THEN
869 *
870  pskipped = 0
871 *
872  DO 2002 q = p + 1, min0( igl+kbl-1, n )
873 *
874  aaqq = sva( q )
875 *
876  IF( aaqq.GT.zero ) THEN
877 *
878  aapp0 = aapp
879  IF( aaqq.GE.one ) THEN
880  rotok = ( small*aapp ).LE.aaqq
881  IF( aapp.LT.( big / aaqq ) ) THEN
882  aapq = ( zdotc( m, a( 1, p ), 1,
883  $ a( 1, q ), 1 ) / aaqq ) / aapp
884  ELSE
885  CALL zcopy( m, a( 1, p ), 1,
886  $ cwork(n+1), 1 )
887  CALL zlascl( 'G', 0, 0, aapp, one,
888  $ m, 1, cwork(n+1), lda, ierr )
889  aapq = zdotc( m, cwork(n+1), 1,
890  $ a( 1, q ), 1 ) / aaqq
891  END IF
892  ELSE
893  rotok = aapp.LE.( aaqq / small )
894  IF( aapp.GT.( small / aaqq ) ) THEN
895  aapq = ( zdotc( m, a( 1, p ), 1,
896  $ a( 1, q ), 1 ) / aaqq ) / aapp
897  ELSE
898  CALL zcopy( m, a( 1, q ), 1,
899  $ cwork(n+1), 1 )
900  CALL zlascl( 'G', 0, 0, aaqq,
901  $ one, m, 1,
902  $ cwork(n+1), lda, ierr )
903  aapq = zdotc( m, a(1, p ), 1,
904  $ cwork(n+1), 1 ) / aapp
905  END IF
906  END IF
907 *
908 * AAPQ = AAPQ * DCONJG( CWORK(p) ) * CWORK(q)
909  aapq1 = -abs(aapq)
910  mxaapq = dmax1( mxaapq, -aapq1 )
911 *
912 * TO rotate or NOT to rotate, THAT is the question ...
913 *
914  IF( abs( aapq1 ).GT.tol ) THEN
915 *
916 * .. rotate
917 *[RTD] ROTATED = ROTATED + ONE
918 *
919  IF( ir1.EQ.0 ) THEN
920  notrot = 0
921  pskipped = 0
922  iswrot = iswrot + 1
923  END IF
924 *
925  IF( rotok ) THEN
926 *
927  ompq = aapq / abs(aapq)
928  aqoap = aaqq / aapp
929  apoaq = aapp / aaqq
930  theta = -half*abs( aqoap-apoaq )/aapq1
931 *
932  IF( abs( theta ).GT.bigtheta ) THEN
933 *
934  t = half / theta
935  cs = one
936 
937  CALL zrot( m, a(1,p), 1, a(1,q), 1,
938  $ cs, dconjg(ompq)*t )
939  IF ( rsvec ) THEN
940  CALL zrot( mvl, v(1,p), 1,
941  $ v(1,q), 1, cs, dconjg(ompq)*t )
942  END IF
943 
944  sva( q ) = aaqq*dsqrt( dmax1( zero,
945  $ one+t*apoaq*aapq1 ) )
946  aapp = aapp*dsqrt( dmax1( zero,
947  $ one-t*aqoap*aapq1 ) )
948  mxsinj = dmax1( mxsinj, abs( t ) )
949 *
950  ELSE
951 *
952 * .. choose correct signum for THETA and rotate
953 *
954  thsign = -dsign( one, aapq1 )
955  t = one / ( theta+thsign*
956  $ dsqrt( one+theta*theta ) )
957  cs = dsqrt( one / ( one+t*t ) )
958  sn = t*cs
959 *
960  mxsinj = dmax1( mxsinj, abs( sn ) )
961  sva( q ) = aaqq*dsqrt( dmax1( zero,
962  $ one+t*apoaq*aapq1 ) )
963  aapp = aapp*dsqrt( dmax1( zero,
964  $ one-t*aqoap*aapq1 ) )
965 *
966  CALL zrot( m, a(1,p), 1, a(1,q), 1,
967  $ cs, dconjg(ompq)*sn )
968  IF ( rsvec ) THEN
969  CALL zrot( mvl, v(1,p), 1,
970  $ v(1,q), 1, cs, dconjg(ompq)*sn )
971  END IF
972  END IF
973  cwork(p) = -cwork(q) * ompq
974 *
975  ELSE
976 * .. have to use modified Gram-Schmidt like transformation
977  CALL zcopy( m, a( 1, p ), 1,
978  $ cwork(n+1), 1 )
979  CALL zlascl( 'G', 0, 0, aapp, one, m,
980  $ 1, cwork(n+1), lda,
981  $ ierr )
982  CALL zlascl( 'G', 0, 0, aaqq, one, m,
983  $ 1, a( 1, q ), lda, ierr )
984  CALL zaxpy( m, -aapq, cwork(n+1), 1,
985  $ a( 1, q ), 1 )
986  CALL zlascl( 'G', 0, 0, one, aaqq, m,
987  $ 1, a( 1, q ), lda, ierr )
988  sva( q ) = aaqq*dsqrt( dmax1( zero,
989  $ one-aapq1*aapq1 ) )
990  mxsinj = dmax1( mxsinj, sfmin )
991  END IF
992 * END IF ROTOK THEN ... ELSE
993 *
994 * In the case of cancellation in updating SVA(q), SVA(p)
995 * recompute SVA(q), SVA(p).
996 *
997  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
998  $ THEN
999  IF( ( aaqq.LT.rootbig ) .AND.
1000  $ ( aaqq.GT.rootsfmin ) ) THEN
1001  sva( q ) = dznrm2( m, a( 1, q ), 1 )
1002  ELSE
1003  t = zero
1004  aaqq = one
1005  CALL zlassq( m, a( 1, q ), 1, t,
1006  $ aaqq )
1007  sva( q ) = t*dsqrt( aaqq )
1008  END IF
1009  END IF
1010  IF( ( aapp / aapp0 ).LE.rooteps ) THEN
1011  IF( ( aapp.LT.rootbig ) .AND.
1012  $ ( aapp.GT.rootsfmin ) ) THEN
1013  aapp = dznrm2( m, a( 1, p ), 1 )
1014  ELSE
1015  t = zero
1016  aapp = one
1017  CALL zlassq( m, a( 1, p ), 1, t,
1018  $ aapp )
1019  aapp = t*dsqrt( aapp )
1020  END IF
1021  sva( p ) = aapp
1022  END IF
1023 *
1024  ELSE
1025 * A(:,p) and A(:,q) already numerically orthogonal
1026  IF( ir1.EQ.0 )notrot = notrot + 1
1027 *[RTD] SKIPPED = SKIPPED + 1
1028  pskipped = pskipped + 1
1029  END IF
1030  ELSE
1031 * A(:,q) is zero column
1032  IF( ir1.EQ.0 )notrot = notrot + 1
1033  pskipped = pskipped + 1
1034  END IF
1035 *
1036  IF( ( i.LE.swband ) .AND.
1037  $ ( pskipped.GT.rowskip ) ) THEN
1038  IF( ir1.EQ.0 )aapp = -aapp
1039  notrot = 0
1040  GO TO 2103
1041  END IF
1042 *
1043  2002 CONTINUE
1044 * END q-LOOP
1045 *
1046  2103 CONTINUE
1047 * bailed out of q-loop
1048 *
1049  sva( p ) = aapp
1050 *
1051  ELSE
1052  sva( p ) = aapp
1053  IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
1054  $ notrot = notrot + min0( igl+kbl-1, n ) - p
1055  END IF
1056 *
1057  2001 CONTINUE
1058 * end of the p-loop
1059 * end of doing the block ( ibr, ibr )
1060  1002 CONTINUE
1061 * end of ir1-loop
1062 *
1063 * ... go to the off diagonal blocks
1064 *
1065  igl = ( ibr-1 )*kbl + 1
1066 *
1067  DO 2010 jbc = ibr + 1, nbl
1068 *
1069  jgl = ( jbc-1 )*kbl + 1
1070 *
1071 * doing the block at ( ibr, jbc )
1072 *
1073  ijblsk = 0
1074  DO 2100 p = igl, min0( igl+kbl-1, n )
1075 *
1076  aapp = sva( p )
1077  IF( aapp.GT.zero ) THEN
1078 *
1079  pskipped = 0
1080 *
1081  DO 2200 q = jgl, min0( jgl+kbl-1, n )
1082 *
1083  aaqq = sva( q )
1084  IF( aaqq.GT.zero ) THEN
1085  aapp0 = aapp
1086 *
1087 * .. M x 2 Jacobi SVD ..
1088 *
1089 * Safe Gram matrix computation
1090 *
1091  IF( aaqq.GE.one ) THEN
1092  IF( aapp.GE.aaqq ) THEN
1093  rotok = ( small*aapp ).LE.aaqq
1094  ELSE
1095  rotok = ( small*aaqq ).LE.aapp
1096  END IF
1097  IF( aapp.LT.( big / aaqq ) ) THEN
1098  aapq = ( zdotc( m, a( 1, p ), 1,
1099  $ a( 1, q ), 1 ) / aaqq ) / aapp
1100  ELSE
1101  CALL zcopy( m, a( 1, p ), 1,
1102  $ cwork(n+1), 1 )
1103  CALL zlascl( 'G', 0, 0, aapp,
1104  $ one, m, 1,
1105  $ cwork(n+1), lda, ierr )
1106  aapq = zdotc( m, cwork(n+1), 1,
1107  $ a( 1, q ), 1 ) / aaqq
1108  END IF
1109  ELSE
1110  IF( aapp.GE.aaqq ) THEN
1111  rotok = aapp.LE.( aaqq / small )
1112  ELSE
1113  rotok = aaqq.LE.( aapp / small )
1114  END IF
1115  IF( aapp.GT.( small / aaqq ) ) THEN
1116  aapq = ( zdotc( m, a( 1, p ), 1,
1117  $ a( 1, q ), 1 ) / aaqq ) / aapp
1118  ELSE
1119  CALL zcopy( m, a( 1, q ), 1,
1120  $ cwork(n+1), 1 )
1121  CALL zlascl( 'G', 0, 0, aaqq,
1122  $ one, m, 1,
1123  $ cwork(n+1), lda, ierr )
1124  aapq = zdotc( m, a( 1, p ), 1,
1125  $ cwork(n+1), 1 ) / aapp
1126  END IF
1127  END IF
1128 *
1129 * AAPQ = AAPQ * DCONJG(CWORK(p))*CWORK(q)
1130  aapq1 = -abs(aapq)
1131  mxaapq = dmax1( mxaapq, -aapq1 )
1132 *
1133 * TO rotate or NOT to rotate, THAT is the question ...
1134 *
1135  IF( abs( aapq1 ).GT.tol ) THEN
1136  notrot = 0
1137 *[RTD] ROTATED = ROTATED + 1
1138  pskipped = 0
1139  iswrot = iswrot + 1
1140 *
1141  IF( rotok ) THEN
1142 *
1143  ompq = aapq / abs(aapq)
1144  aqoap = aaqq / aapp
1145  apoaq = aapp / aaqq
1146  theta = -half*abs( aqoap-apoaq )/ aapq1
1147  IF( aaqq.GT.aapp0 )theta = -theta
1148 *
1149  IF( abs( theta ).GT.bigtheta ) THEN
1150  t = half / theta
1151  cs = one
1152  CALL zrot( m, a(1,p), 1, a(1,q), 1,
1153  $ cs, dconjg(ompq)*t )
1154  IF( rsvec ) THEN
1155  CALL zrot( mvl, v(1,p), 1,
1156  $ v(1,q), 1, cs, dconjg(ompq)*t )
1157  END IF
1158  sva( q ) = aaqq*dsqrt( dmax1( zero,
1159  $ one+t*apoaq*aapq1 ) )
1160  aapp = aapp*dsqrt( dmax1( zero,
1161  $ one-t*aqoap*aapq1 ) )
1162  mxsinj = dmax1( mxsinj, abs( t ) )
1163  ELSE
1164 *
1165 * .. choose correct signum for THETA and rotate
1166 *
1167  thsign = -dsign( one, aapq1 )
1168  IF( aaqq.GT.aapp0 )thsign = -thsign
1169  t = one / ( theta+thsign*
1170  $ dsqrt( one+theta*theta ) )
1171  cs = dsqrt( one / ( one+t*t ) )
1172  sn = t*cs
1173  mxsinj = dmax1( mxsinj, abs( sn ) )
1174  sva( q ) = aaqq*dsqrt( dmax1( zero,
1175  $ one+t*apoaq*aapq1 ) )
1176  aapp = aapp*dsqrt( dmax1( zero,
1177  $ one-t*aqoap*aapq1 ) )
1178 *
1179  CALL zrot( m, a(1,p), 1, a(1,q), 1,
1180  $ cs, dconjg(ompq)*sn )
1181  IF( rsvec ) THEN
1182  CALL zrot( mvl, v(1,p), 1,
1183  $ v(1,q), 1, cs, dconjg(ompq)*sn )
1184  END IF
1185  END IF
1186  cwork(p) = -cwork(q) * ompq
1187 *
1188  ELSE
1189 * .. have to use modified Gram-Schmidt like transformation
1190  IF( aapp.GT.aaqq ) THEN
1191  CALL zcopy( m, a( 1, p ), 1,
1192  $ cwork(n+1), 1 )
1193  CALL zlascl( 'G', 0, 0, aapp, one,
1194  $ m, 1, cwork(n+1),lda,
1195  $ ierr )
1196  CALL zlascl( 'G', 0, 0, aaqq, one,
1197  $ m, 1, a( 1, q ), lda,
1198  $ ierr )
1199  CALL zaxpy( m, -aapq, cwork(n+1),
1200  $ 1, a( 1, q ), 1 )
1201  CALL zlascl( 'G', 0, 0, one, aaqq,
1202  $ m, 1, a( 1, q ), lda,
1203  $ ierr )
1204  sva( q ) = aaqq*dsqrt( dmax1( zero,
1205  $ one-aapq1*aapq1 ) )
1206  mxsinj = dmax1( mxsinj, sfmin )
1207  ELSE
1208  CALL zcopy( m, a( 1, q ), 1,
1209  $ cwork(n+1), 1 )
1210  CALL zlascl( 'G', 0, 0, aaqq, one,
1211  $ m, 1, cwork(n+1),lda,
1212  $ ierr )
1213  CALL zlascl( 'G', 0, 0, aapp, one,
1214  $ m, 1, a( 1, p ), lda,
1215  $ ierr )
1216  CALL zaxpy( m, -dconjg(aapq),
1217  $ cwork(n+1), 1, a( 1, p ), 1 )
1218  CALL zlascl( 'G', 0, 0, one, aapp,
1219  $ m, 1, a( 1, p ), lda,
1220  $ ierr )
1221  sva( p ) = aapp*dsqrt( dmax1( zero,
1222  $ one-aapq1*aapq1 ) )
1223  mxsinj = dmax1( mxsinj, sfmin )
1224  END IF
1225  END IF
1226 * END IF ROTOK THEN ... ELSE
1227 *
1228 * In the case of cancellation in updating SVA(q), SVA(p)
1229 * .. recompute SVA(q), SVA(p)
1230  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1231  $ THEN
1232  IF( ( aaqq.LT.rootbig ) .AND.
1233  $ ( aaqq.GT.rootsfmin ) ) THEN
1234  sva( q ) = dznrm2( m, a( 1, q ), 1)
1235  ELSE
1236  t = zero
1237  aaqq = one
1238  CALL zlassq( m, a( 1, q ), 1, t,
1239  $ aaqq )
1240  sva( q ) = t*dsqrt( aaqq )
1241  END IF
1242  END IF
1243  IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
1244  IF( ( aapp.LT.rootbig ) .AND.
1245  $ ( aapp.GT.rootsfmin ) ) THEN
1246  aapp = dznrm2( m, a( 1, p ), 1 )
1247  ELSE
1248  t = zero
1249  aapp = one
1250  CALL zlassq( m, a( 1, p ), 1, t,
1251  $ aapp )
1252  aapp = t*dsqrt( aapp )
1253  END IF
1254  sva( p ) = aapp
1255  END IF
1256 * end of OK rotation
1257  ELSE
1258  notrot = notrot + 1
1259 *[RTD] SKIPPED = SKIPPED + 1
1260  pskipped = pskipped + 1
1261  ijblsk = ijblsk + 1
1262  END IF
1263  ELSE
1264  notrot = notrot + 1
1265  pskipped = pskipped + 1
1266  ijblsk = ijblsk + 1
1267  END IF
1268 *
1269  IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
1270  $ THEN
1271  sva( p ) = aapp
1272  notrot = 0
1273  GO TO 2011
1274  END IF
1275  IF( ( i.LE.swband ) .AND.
1276  $ ( pskipped.GT.rowskip ) ) THEN
1277  aapp = -aapp
1278  notrot = 0
1279  GO TO 2203
1280  END IF
1281 *
1282  2200 CONTINUE
1283 * end of the q-loop
1284  2203 CONTINUE
1285 *
1286  sva( p ) = aapp
1287 *
1288  ELSE
1289 *
1290  IF( aapp.EQ.zero )notrot = notrot +
1291  $ min0( jgl+kbl-1, n ) - jgl + 1
1292  IF( aapp.LT.zero )notrot = 0
1293 *
1294  END IF
1295 *
1296  2100 CONTINUE
1297 * end of the p-loop
1298  2010 CONTINUE
1299 * end of the jbc-loop
1300  2011 CONTINUE
1301 *2011 bailed out of the jbc-loop
1302  DO 2012 p = igl, min0( igl+kbl-1, n )
1303  sva( p ) = abs( sva( p ) )
1304  2012 CONTINUE
1305 ***
1306  2000 CONTINUE
1307 *2000 :: end of the ibr-loop
1308 *
1309 * .. update SVA(N)
1310  IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
1311  $ THEN
1312  sva( n ) = dznrm2( m, a( 1, n ), 1 )
1313  ELSE
1314  t = zero
1315  aapp = one
1316  CALL zlassq( m, a( 1, n ), 1, t, aapp )
1317  sva( n ) = t*dsqrt( aapp )
1318  END IF
1319 *
1320 * Additional steering devices
1321 *
1322  IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
1323  $ ( iswrot.LE.n ) ) )swband = i
1324 *
1325  IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.dsqrt( dble( n ) )*
1326  $ tol ) .AND. ( dble( n )*mxaapq*mxsinj.LT.tol ) ) THEN
1327  GO TO 1994
1328  END IF
1329 *
1330  IF( notrot.GE.emptsw )GO TO 1994
1331 *
1332  1993 CONTINUE
1333 * end i=1:NSWEEP loop
1334 *
1335 * #:( Reaching this point means that the procedure has not converged.
1336  info = nsweep - 1
1337  GO TO 1995
1338 *
1339  1994 CONTINUE
1340 * #:) Reaching this point means numerical convergence after the i-th
1341 * sweep.
1342 *
1343  info = 0
1344 * #:) INFO = 0 confirms successful iterations.
1345  1995 CONTINUE
1346 *
1347 * Sort the singular values and find how many are above
1348 * the underflow threshold.
1349 *
1350  n2 = 0
1351  n4 = 0
1352  DO 5991 p = 1, n - 1
1353  q = idamax( n-p+1, sva( p ), 1 ) + p - 1
1354  IF( p.NE.q ) THEN
1355  temp1 = sva( p )
1356  sva( p ) = sva( q )
1357  sva( q ) = temp1
1358  CALL zswap( m, a( 1, p ), 1, a( 1, q ), 1 )
1359  IF( rsvec )CALL zswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
1360  END IF
1361  IF( sva( p ).NE.zero ) THEN
1362  n4 = n4 + 1
1363  IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
1364  END IF
1365  5991 CONTINUE
1366  IF( sva( n ).NE.zero ) THEN
1367  n4 = n4 + 1
1368  IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
1369  END IF
1370 *
1371 * Normalize the left singular vectors.
1372 *
1373  IF( lsvec .OR. uctol ) THEN
1374  DO 1998 p = 1, n2
1375  CALL zdscal( m, one / sva( p ), a( 1, p ), 1 )
1376  1998 CONTINUE
1377  END IF
1378 *
1379 * Scale the product of Jacobi rotations.
1380 *
1381  IF( rsvec ) THEN
1382  DO 2399 p = 1, n
1383  temp1 = one / dznrm2( mvl, v( 1, p ), 1 )
1384  CALL zdscal( mvl, temp1, v( 1, p ), 1 )
1385  2399 CONTINUE
1386  END IF
1387 *
1388 * Undo scaling, if necessary (and possible).
1389  IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
1390  $ .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
1391  $ ( sfmin / skl ) ) ) ) THEN
1392  DO 2400 p = 1, n
1393  sva( p ) = skl*sva( p )
1394  2400 CONTINUE
1395  skl = one
1396  END IF
1397 *
1398  rwork( 1 ) = skl
1399 * The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1400 * then some of the singular values may overflow or underflow and
1401 * the spectrum is given in this factored representation.
1402 *
1403  rwork( 2 ) = dble( n4 )
1404 * N4 is the number of computed nonzero singular values of A.
1405 *
1406  rwork( 3 ) = dble( n2 )
1407 * N2 is the number of singular values of A greater than SFMIN.
1408 * If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1409 * that may carry some information.
1410 *
1411  rwork( 4 ) = dble( i )
1412 * i is the index of the last sweep before declaring convergence.
1413 *
1414  rwork( 5 ) = mxaapq
1415 * MXAAPQ is the largest absolute value of scaled pivots in the
1416 * last sweep
1417 *
1418  rwork( 6 ) = mxsinj
1419 * MXSINJ is the largest absolute value of the sines of Jacobi angles
1420 * in the last sweep
1421 *
1422  RETURN
1423 * ..
1424 * .. END OF ZGESVJ
1425 * ..
integer function idamax(N, DX, INCX)
IDAMAX
Definition: idamax.f:53
subroutine zcopy(N, ZX, INCX, ZY, INCY)
ZCOPY
Definition: zcopy.f:52
double precision function dlamch(CMACH)
DLAMCH
Definition: dlamch.f:65
subroutine zswap(N, ZX, INCX, ZY, INCY)
ZSWAP
Definition: zswap.f:52
subroutine dlascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
DLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: dlascl.f:145
subroutine zgsvj1(JOBV, M, N, N1, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
ZGSVJ1 pre-processor for the routine zgesvj, applies Jacobi rotations targeting only particular pivot...
Definition: zgsvj1.f:238
subroutine zlaset(UPLO, M, N, ALPHA, BETA, A, LDA)
ZLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values...
Definition: zlaset.f:108
subroutine xerbla(SRNAME, INFO)
XERBLA
Definition: xerbla.f:62
complex *16 function zdotc(N, ZX, INCX, ZY, INCY)
ZDOTC
Definition: zdotc.f:54
subroutine zlassq(N, X, INCX, SCALE, SUMSQ)
ZLASSQ updates a sum of squares represented in scaled form.
Definition: zlassq.f:108
double precision function dznrm2(N, X, INCX)
DZNRM2
Definition: dznrm2.f:56
subroutine zgsvj0(JOBV, M, N, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
ZGSVJ0 pre-processor for the routine zgesvj.
Definition: zgsvj0.f:219
subroutine zdscal(N, DA, ZX, INCX)
ZDSCAL
Definition: zdscal.f:54
subroutine zrot(N, CX, INCX, CY, INCY, C, S)
ZROT applies a plane rotation with real cosine and complex sine to a pair of complex vectors...
Definition: zrot.f:105
subroutine zlascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
ZLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: zlascl.f:145
logical function lsame(CA, CB)
LSAME
Definition: lsame.f:55
subroutine zaxpy(N, ZA, ZX, INCX, ZY, INCY)
ZAXPY
Definition: zaxpy.f:53

Here is the call graph for this function:

Here is the caller graph for this function: