LAPACK  3.10.1 LAPACK: Linear Algebra PACKage

◆ zgesvj()

 subroutine zgesvj ( character*1 JOBA, character*1 JOBU, character*1 JOBV, integer M, integer N, complex*16, dimension( lda, * ) A, integer LDA, double precision, dimension( n ) SVA, integer MV, complex*16, dimension( ldv, * ) V, integer LDV, complex*16, dimension( lwork ) CWORK, integer LWORK, double precision, dimension( lrwork ) RWORK, integer LRWORK, integer INFO )

ZGESVJ

Purpose:
``` ZGESVJ computes the singular value decomposition (SVD) of a complex
M-by-N matrix A, where M >= N. The SVD of A is written as
[++]   [xx]   [x0]   [xx]
A = U * SIGMA * V^*,  [++] = [xx] * [ox] * [xx]
[++]   [xx]
where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
matrix, and V is an N-by-N unitary matrix. The diagonal elements
of SIGMA are the singular values of A. The columns of U and V are the
left and the right singular vectors of A, respectively.```
Parameters
 [in] JOBA ``` JOBA is CHARACTER*1 Specifies the structure of A. = 'L': The input matrix A is lower triangular; = 'U': The input matrix A is upper triangular; = 'G': The input matrix A is general M-by-N matrix, M >= N.``` [in] JOBU ``` JOBU is CHARACTER*1 Specifies whether to compute the left singular vectors (columns of U): = 'U' or 'F': The left singular vectors corresponding to the nonzero singular values are computed and returned in the leading columns of A. See more details in the description of A. The default numerical orthogonality threshold is set to approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=DLAMCH('E'). = 'C': Analogous to JOBU='U', except that user can control the level of numerical orthogonality of the computed left singular vectors. TOL can be set to TOL = CTOL*EPS, where CTOL is given on input in the array WORK. No CTOL smaller than ONE is allowed. CTOL greater than 1 / EPS is meaningless. The option 'C' can be used if M*EPS is satisfactory orthogonality of the computed left singular vectors, so CTOL=M could save few sweeps of Jacobi rotations. See the descriptions of A and WORK(1). = 'N': The matrix U is not computed. However, see the description of A.``` [in] JOBV ``` JOBV is CHARACTER*1 Specifies whether to compute the right singular vectors, that is, the matrix V: = 'V' or 'J': the matrix V is computed and returned in the array V = 'A': the Jacobi rotations are applied to the MV-by-N array V. In other words, the right singular vector matrix V is not computed explicitly; instead it is applied to an MV-by-N matrix initially stored in the first MV rows of V. = 'N': the matrix V is not computed and the array V is not referenced``` [in] M ``` M is INTEGER The number of rows of the input matrix A. 1/DLAMCH('E') > M >= 0.``` [in] N ``` N is INTEGER The number of columns of the input matrix A. M >= N >= 0.``` [in,out] A ``` A is COMPLEX*16 array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, If JOBU = 'U' .OR. JOBU = 'C': If INFO = 0 : RANKA orthonormal columns of U are returned in the leading RANKA columns of the array A. Here RANKA <= N is the number of computed singular values of A that are above the underflow threshold DLAMCH('S'). The singular vectors corresponding to underflowed or zero singular values are not computed. The value of RANKA is returned in the array RWORK as RANKA=NINT(RWORK(2)). Also see the descriptions of SVA and RWORK. The computed columns of U are mutually numerically orthogonal up to approximately TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU = 'C'), see the description of JOBU. If INFO > 0, the procedure ZGESVJ did not converge in the given number of iterations (sweeps). In that case, the computed columns of U may not be orthogonal up to TOL. The output U (stored in A), SIGMA (given by the computed singular values in SVA(1:N)) and V is still a decomposition of the input matrix A in the sense that the residual || A - SCALE * U * SIGMA * V^* ||_2 / ||A||_2 is small. If JOBU = 'N': If INFO = 0 : Note that the left singular vectors are 'for free' in the one-sided Jacobi SVD algorithm. However, if only the singular values are needed, the level of numerical orthogonality of U is not an issue and iterations are stopped when the columns of the iterated matrix are numerically orthogonal up to approximately M*EPS. Thus, on exit, A contains the columns of U scaled with the corresponding singular values. If INFO > 0: the procedure ZGESVJ did not converge in the given number of iterations (sweeps).``` [in] LDA ``` LDA is INTEGER The leading dimension of the array A. LDA >= max(1,M).``` [out] SVA ``` SVA is DOUBLE PRECISION array, dimension (N) On exit, If INFO = 0 : depending on the value SCALE = RWORK(1), we have: If SCALE = ONE: SVA(1:N) contains the computed singular values of A. During the computation SVA contains the Euclidean column norms of the iterated matrices in the array A. If SCALE .NE. ONE: The singular values of A are SCALE*SVA(1:N), and this factored representation is due to the fact that some of the singular values of A might underflow or overflow. If INFO > 0: the procedure ZGESVJ did not converge in the given number of iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.``` [in] MV ``` MV is INTEGER If JOBV = 'A', then the product of Jacobi rotations in ZGESVJ is applied to the first MV rows of V. See the description of JOBV.``` [in,out] V ``` V is COMPLEX*16 array, dimension (LDV,N) If JOBV = 'V', then V contains on exit the N-by-N matrix of the right singular vectors; If JOBV = 'A', then V contains the product of the computed right singular vector matrix and the initial matrix in the array V. If JOBV = 'N', then V is not referenced.``` [in] LDV ``` LDV is INTEGER The leading dimension of the array V, LDV >= 1. If JOBV = 'V', then LDV >= max(1,N). If JOBV = 'A', then LDV >= max(1,MV) .``` [in,out] CWORK ``` CWORK is COMPLEX*16 array, dimension (max(1,LWORK)) Used as workspace. If on entry LWORK = -1, then a workspace query is assumed and no computation is done; CWORK(1) is set to the minial (and optimal) length of CWORK.``` [in] LWORK ``` LWORK is INTEGER. Length of CWORK, LWORK >= M+N.``` [in,out] RWORK ``` RWORK is DOUBLE PRECISION array, dimension (max(6,LRWORK)) On entry, If JOBU = 'C' : RWORK(1) = CTOL, where CTOL defines the threshold for convergence. The process stops if all columns of A are mutually orthogonal up to CTOL*EPS, EPS=DLAMCH('E'). It is required that CTOL >= ONE, i.e. it is not allowed to force the routine to obtain orthogonality below EPSILON. On exit, RWORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N) are the computed singular values of A. (See description of SVA().) RWORK(2) = NINT(RWORK(2)) is the number of the computed nonzero singular values. RWORK(3) = NINT(RWORK(3)) is the number of the computed singular values that are larger than the underflow threshold. RWORK(4) = NINT(RWORK(4)) is the number of sweeps of Jacobi rotations needed for numerical convergence. RWORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep. This is useful information in cases when ZGESVJ did not converge, as it can be used to estimate whether the output is still useful and for post festum analysis. RWORK(6) = the largest absolute value over all sines of the Jacobi rotation angles in the last sweep. It can be useful for a post festum analysis. If on entry LRWORK = -1, then a workspace query is assumed and no computation is done; RWORK(1) is set to the minial (and optimal) length of RWORK.``` [in] LRWORK ``` LRWORK is INTEGER Length of RWORK, LRWORK >= MAX(6,N).``` [out] INFO ``` INFO is INTEGER = 0: successful exit. < 0: if INFO = -i, then the i-th argument had an illegal value > 0: ZGESVJ did not converge in the maximal allowed number (NSWEEP=30) of sweeps. The output may still be useful. See the description of RWORK.```
Further Details:
``` The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane
rotations. In the case of underflow of the tangent of the Jacobi angle, a
modified Jacobi transformation of Drmac [3] is used. Pivot strategy uses
column interchanges of de Rijk [1]. The relative accuracy of the computed
singular values and the accuracy of the computed singular vectors (in
angle metric) is as guaranteed by the theory of Demmel and Veselic [2].
The condition number that determines the accuracy in the full rank case
is essentially min_{D=diag} kappa(A*D), where kappa(.) is the
spectral condition number. The best performance of this Jacobi SVD
procedure is achieved if used in an  accelerated version of Drmac and
Veselic [4,5], and it is the kernel routine in the SIGMA library [6].
Some tuning parameters (marked with [TP]) are available for the
implementer.
The computational range for the nonzero singular values is the  machine
number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even
denormalized singular values can be computed with the corresponding
Contributor:
```  ============

Zlatko Drmac (Zagreb, Croatia)```
References:
``` [1] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the
singular value decomposition on a vector computer.
SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371.
[2] J. Demmel and K. Veselic: Jacobi method is more accurate than QR.
[3] Z. Drmac: Implementation of Jacobi rotations for accurate singular
value computation in floating point arithmetic.
SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222.
[4] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342.
LAPACK Working note 169.
[5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362.
LAPACK Working note 170.
[6] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
QSVD, (H,K)-SVD computations.
Department of Mathematics, University of Zagreb, 2008, 2015.```
```  ===========================
Please report all bugs and send interesting test examples and comments to
drmac@math.hr. Thank you.```

Definition at line 349 of file zgesvj.f.

351 *
352 * -- LAPACK computational routine --
353 * -- LAPACK is a software package provided by Univ. of Tennessee, --
354 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
355 *
356  IMPLICIT NONE
357 * .. Scalar Arguments ..
358  INTEGER INFO, LDA, LDV, LWORK, LRWORK, M, MV, N
359  CHARACTER*1 JOBA, JOBU, JOBV
360 * ..
361 * .. Array Arguments ..
362  COMPLEX*16 A( LDA, * ), V( LDV, * ), CWORK( LWORK )
363  DOUBLE PRECISION RWORK( LRWORK ), SVA( N )
364 * ..
365 *
366 * =====================================================================
367 *
368 * .. Local Parameters ..
369  DOUBLE PRECISION ZERO, HALF, ONE
370  parameter( zero = 0.0d0, half = 0.5d0, one = 1.0d0)
371  COMPLEX*16 CZERO, CONE
372  parameter( czero = (0.0d0, 0.0d0), cone = (1.0d0, 0.0d0) )
373  INTEGER NSWEEP
374  parameter( nsweep = 30 )
375 * ..
376 * .. Local Scalars ..
377  COMPLEX*16 AAPQ, OMPQ
378  DOUBLE PRECISION AAPP, AAPP0, AAPQ1, AAQQ, APOAQ, AQOAP, BIG,
379  \$ BIGTHETA, CS, CTOL, EPSLN, MXAAPQ,
380  \$ MXSINJ, ROOTBIG, ROOTEPS, ROOTSFMIN, ROOTTOL,
381  \$ SKL, SFMIN, SMALL, SN, T, TEMP1, THETA, THSIGN, TOL
382  INTEGER BLSKIP, EMPTSW, i, ibr, IERR, igl, IJBLSK, ir1,
383  \$ ISWROT, jbc, jgl, KBL, LKAHEAD, MVL, N2, N34,
384  \$ N4, NBL, NOTROT, p, PSKIPPED, q, ROWSKIP, SWBAND
385  LOGICAL APPLV, GOSCALE, LOWER, LQUERY, LSVEC, NOSCALE, ROTOK,
386  \$ RSVEC, UCTOL, UPPER
387 * ..
388 * ..
389 * .. Intrinsic Functions ..
390  INTRINSIC abs, max, min, conjg, dble, sign, sqrt
391 * ..
392 * .. External Functions ..
393 * ..
394 * from BLAS
395  DOUBLE PRECISION DZNRM2
396  COMPLEX*16 ZDOTC
397  EXTERNAL zdotc, dznrm2
398  INTEGER IDAMAX
399  EXTERNAL idamax
400 * from LAPACK
401  DOUBLE PRECISION DLAMCH
402  EXTERNAL dlamch
403  LOGICAL LSAME
404  EXTERNAL lsame
405 * ..
406 * .. External Subroutines ..
407 * ..
408 * from BLAS
409  EXTERNAL zcopy, zrot, zdscal, zswap, zaxpy
410 * from LAPACK
411  EXTERNAL dlascl, zlascl, zlaset, zlassq, xerbla
412  EXTERNAL zgsvj0, zgsvj1
413 * ..
414 * .. Executable Statements ..
415 *
416 * Test the input arguments
417 *
418  lsvec = lsame( jobu, 'U' ) .OR. lsame( jobu, 'F' )
419  uctol = lsame( jobu, 'C' )
420  rsvec = lsame( jobv, 'V' ) .OR. lsame( jobv, 'J' )
421  applv = lsame( jobv, 'A' )
422  upper = lsame( joba, 'U' )
423  lower = lsame( joba, 'L' )
424 *
425  lquery = ( lwork .EQ. -1 ) .OR. ( lrwork .EQ. -1 )
426  IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
427  info = -1
428  ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
429  info = -2
430  ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
431  info = -3
432  ELSE IF( m.LT.0 ) THEN
433  info = -4
434  ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
435  info = -5
436  ELSE IF( lda.LT.m ) THEN
437  info = -7
438  ELSE IF( mv.LT.0 ) THEN
439  info = -9
440  ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
441  \$ ( applv .AND. ( ldv.LT.mv ) ) ) THEN
442  info = -11
443  ELSE IF( uctol .AND. ( rwork( 1 ).LE.one ) ) THEN
444  info = -12
445  ELSE IF( ( lwork.LT.( m+n ) ) .AND. ( .NOT.lquery ) ) THEN
446  info = -13
447  ELSE IF( ( lrwork.LT.max( n, 6 ) ) .AND. ( .NOT.lquery ) ) THEN
448  info = -15
449  ELSE
450  info = 0
451  END IF
452 *
453 * #:(
454  IF( info.NE.0 ) THEN
455  CALL xerbla( 'ZGESVJ', -info )
456  RETURN
457  ELSE IF ( lquery ) THEN
458  cwork(1) = m + n
459  rwork(1) = max( n, 6 )
460  RETURN
461  END IF
462 *
463 * #:) Quick return for void matrix
464 *
465  IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
466 *
467 * Set numerical parameters
468 * The stopping criterion for Jacobi rotations is
469 *
470 * max_{i<>j}|A(:,i)^* * A(:,j)| / (||A(:,i)||*||A(:,j)||) < CTOL*EPS
471 *
472 * where EPS is the round-off and CTOL is defined as follows:
473 *
474  IF( uctol ) THEN
475 * ... user controlled
476  ctol = rwork( 1 )
477  ELSE
478 * ... default
479  IF( lsvec .OR. rsvec .OR. applv ) THEN
480  ctol = sqrt( dble( m ) )
481  ELSE
482  ctol = dble( m )
483  END IF
484  END IF
485 * ... and the machine dependent parameters are
486 *[!] (Make sure that SLAMCH() works properly on the target machine.)
487 *
488  epsln = dlamch( 'Epsilon' )
489  rooteps = sqrt( epsln )
490  sfmin = dlamch( 'SafeMinimum' )
491  rootsfmin = sqrt( sfmin )
492  small = sfmin / epsln
493  big = dlamch( 'Overflow' )
494 * BIG = ONE / SFMIN
495  rootbig = one / rootsfmin
496 * LARGE = BIG / SQRT( DBLE( M*N ) )
497  bigtheta = one / rooteps
498 *
499  tol = ctol*epsln
500  roottol = sqrt( tol )
501 *
502  IF( dble( m )*epsln.GE.one ) THEN
503  info = -4
504  CALL xerbla( 'ZGESVJ', -info )
505  RETURN
506  END IF
507 *
508 * Initialize the right singular vector matrix.
509 *
510  IF( rsvec ) THEN
511  mvl = n
512  CALL zlaset( 'A', mvl, n, czero, cone, v, ldv )
513  ELSE IF( applv ) THEN
514  mvl = mv
515  END IF
516  rsvec = rsvec .OR. applv
517 *
518 * Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
519 *(!) If necessary, scale A to protect the largest singular value
520 * from overflow. It is possible that saving the largest singular
521 * value destroys the information about the small ones.
522 * This initial scaling is almost minimal in the sense that the
523 * goal is to make sure that no column norm overflows, and that
524 * SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
525 * in A are detected, the procedure returns with INFO=-6.
526 *
527  skl = one / sqrt( dble( m )*dble( n ) )
528  noscale = .true.
529  goscale = .true.
530 *
531  IF( lower ) THEN
532 * the input matrix is M-by-N lower triangular (trapezoidal)
533  DO 1874 p = 1, n
534  aapp = zero
535  aaqq = one
536  CALL zlassq( m-p+1, a( p, p ), 1, aapp, aaqq )
537  IF( aapp.GT.big ) THEN
538  info = -6
539  CALL xerbla( 'ZGESVJ', -info )
540  RETURN
541  END IF
542  aaqq = sqrt( aaqq )
543  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
544  sva( p ) = aapp*aaqq
545  ELSE
546  noscale = .false.
547  sva( p ) = aapp*( aaqq*skl )
548  IF( goscale ) THEN
549  goscale = .false.
550  DO 1873 q = 1, p - 1
551  sva( q ) = sva( q )*skl
552  1873 CONTINUE
553  END IF
554  END IF
555  1874 CONTINUE
556  ELSE IF( upper ) THEN
557 * the input matrix is M-by-N upper triangular (trapezoidal)
558  DO 2874 p = 1, n
559  aapp = zero
560  aaqq = one
561  CALL zlassq( p, a( 1, p ), 1, aapp, aaqq )
562  IF( aapp.GT.big ) THEN
563  info = -6
564  CALL xerbla( 'ZGESVJ', -info )
565  RETURN
566  END IF
567  aaqq = sqrt( aaqq )
568  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
569  sva( p ) = aapp*aaqq
570  ELSE
571  noscale = .false.
572  sva( p ) = aapp*( aaqq*skl )
573  IF( goscale ) THEN
574  goscale = .false.
575  DO 2873 q = 1, p - 1
576  sva( q ) = sva( q )*skl
577  2873 CONTINUE
578  END IF
579  END IF
580  2874 CONTINUE
581  ELSE
582 * the input matrix is M-by-N general dense
583  DO 3874 p = 1, n
584  aapp = zero
585  aaqq = one
586  CALL zlassq( m, a( 1, p ), 1, aapp, aaqq )
587  IF( aapp.GT.big ) THEN
588  info = -6
589  CALL xerbla( 'ZGESVJ', -info )
590  RETURN
591  END IF
592  aaqq = sqrt( aaqq )
593  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
594  sva( p ) = aapp*aaqq
595  ELSE
596  noscale = .false.
597  sva( p ) = aapp*( aaqq*skl )
598  IF( goscale ) THEN
599  goscale = .false.
600  DO 3873 q = 1, p - 1
601  sva( q ) = sva( q )*skl
602  3873 CONTINUE
603  END IF
604  END IF
605  3874 CONTINUE
606  END IF
607 *
608  IF( noscale )skl = one
609 *
610 * Move the smaller part of the spectrum from the underflow threshold
611 *(!) Start by determining the position of the nonzero entries of the
612 * array SVA() relative to ( SFMIN, BIG ).
613 *
614  aapp = zero
615  aaqq = big
616  DO 4781 p = 1, n
617  IF( sva( p ).NE.zero )aaqq = min( aaqq, sva( p ) )
618  aapp = max( aapp, sva( p ) )
619  4781 CONTINUE
620 *
621 * #:) Quick return for zero matrix
622 *
623  IF( aapp.EQ.zero ) THEN
624  IF( lsvec )CALL zlaset( 'G', m, n, czero, cone, a, lda )
625  rwork( 1 ) = one
626  rwork( 2 ) = zero
627  rwork( 3 ) = zero
628  rwork( 4 ) = zero
629  rwork( 5 ) = zero
630  rwork( 6 ) = zero
631  RETURN
632  END IF
633 *
634 * #:) Quick return for one-column matrix
635 *
636  IF( n.EQ.1 ) THEN
637  IF( lsvec )CALL zlascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
638  \$ a( 1, 1 ), lda, ierr )
639  rwork( 1 ) = one / skl
640  IF( sva( 1 ).GE.sfmin ) THEN
641  rwork( 2 ) = one
642  ELSE
643  rwork( 2 ) = zero
644  END IF
645  rwork( 3 ) = zero
646  rwork( 4 ) = zero
647  rwork( 5 ) = zero
648  rwork( 6 ) = zero
649  RETURN
650  END IF
651 *
652 * Protect small singular values from underflow, and try to
653 * avoid underflows/overflows in computing Jacobi rotations.
654 *
655  sn = sqrt( sfmin / epsln )
656  temp1 = sqrt( big / dble( n ) )
657  IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
658  \$ ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
659  temp1 = min( big, temp1 / aapp )
660 * AAQQ = AAQQ*TEMP1
661 * AAPP = AAPP*TEMP1
662  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
663  temp1 = min( sn / aaqq, big / (aapp*sqrt( dble(n)) ) )
664 * AAQQ = AAQQ*TEMP1
665 * AAPP = AAPP*TEMP1
666  ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
667  temp1 = max( sn / aaqq, temp1 / aapp )
668 * AAQQ = AAQQ*TEMP1
669 * AAPP = AAPP*TEMP1
670  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
671  temp1 = min( sn / aaqq, big / ( sqrt( dble( n ) )*aapp ) )
672 * AAQQ = AAQQ*TEMP1
673 * AAPP = AAPP*TEMP1
674  ELSE
675  temp1 = one
676  END IF
677 *
678 * Scale, if necessary
679 *
680  IF( temp1.NE.one ) THEN
681  CALL dlascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
682  END IF
683  skl = temp1*skl
684  IF( skl.NE.one ) THEN
685  CALL zlascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
686  skl = one / skl
687  END IF
688 *
689 * Row-cyclic Jacobi SVD algorithm with column pivoting
690 *
691  emptsw = ( n*( n-1 ) ) / 2
692  notrot = 0
693
694  DO 1868 q = 1, n
695  cwork( q ) = cone
696  1868 CONTINUE
697 *
698 *
699 *
700  swband = 3
701 *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
702 * if ZGESVJ is used as a computational routine in the preconditioned
703 * Jacobi SVD algorithm ZGEJSV. For sweeps i=1:SWBAND the procedure
704 * works on pivots inside a band-like region around the diagonal.
705 * The boundaries are determined dynamically, based on the number of
706 * pivots above a threshold.
707 *
708  kbl = min( 8, n )
709 *[TP] KBL is a tuning parameter that defines the tile size in the
710 * tiling of the p-q loops of pivot pairs. In general, an optimal
711 * value of KBL depends on the matrix dimensions and on the
712 * parameters of the computer's memory.
713 *
714  nbl = n / kbl
715  IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
716 *
717  blskip = kbl**2
718 *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
719 *
720  rowskip = min( 5, kbl )
721 *[TP] ROWSKIP is a tuning parameter.
722 *
724 *[TP] LKAHEAD is a tuning parameter.
725 *
726 * Quasi block transformations, using the lower (upper) triangular
727 * structure of the input matrix. The quasi-block-cycling usually
728 * invokes cubic convergence. Big part of this cycle is done inside
729 * canonical subspaces of dimensions less than M.
730 *
731  IF( ( lower .OR. upper ) .AND. ( n.GT.max( 64, 4*kbl ) ) ) THEN
732 *[TP] The number of partition levels and the actual partition are
733 * tuning parameters.
734  n4 = n / 4
735  n2 = n / 2
736  n34 = 3*n4
737  IF( applv ) THEN
738  q = 0
739  ELSE
740  q = 1
741  END IF
742 *
743  IF( lower ) THEN
744 *
745 * This works very well on lower triangular matrices, in particular
746 * in the framework of the preconditioned Jacobi SVD (xGEJSV).
747 * The idea is simple:
748 * [+ 0 0 0] Note that Jacobi transformations of [0 0]
749 * [+ + 0 0] [0 0]
750 * [+ + x 0] actually work on [x 0] [x 0]
751 * [+ + x x] [x x]. [x x]
752 *
753  CALL zgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
754  \$ cwork( n34+1 ), sva( n34+1 ), mvl,
755  \$ v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
756  \$ 2, cwork( n+1 ), lwork-n, ierr )
757
758  CALL zgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
759  \$ cwork( n2+1 ), sva( n2+1 ), mvl,
760  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
761  \$ cwork( n+1 ), lwork-n, ierr )
762
763  CALL zgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
764  \$ cwork( n2+1 ), sva( n2+1 ), mvl,
765  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
766  \$ cwork( n+1 ), lwork-n, ierr )
767
768  CALL zgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
769  \$ cwork( n4+1 ), sva( n4+1 ), mvl,
770  \$ v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
771  \$ cwork( n+1 ), lwork-n, ierr )
772 *
773  CALL zgsvj0( jobv, m, n4, a, lda, cwork, sva, mvl, v, ldv,
774  \$ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
775  \$ ierr )
776 *
777  CALL zgsvj1( jobv, m, n2, n4, a, lda, cwork, sva, mvl, v,
778  \$ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
779  \$ lwork-n, ierr )
780 *
781 *
782  ELSE IF( upper ) THEN
783 *
784 *
785  CALL zgsvj0( jobv, n4, n4, a, lda, cwork, sva, mvl, v, ldv,
786  \$ epsln, sfmin, tol, 2, cwork( n+1 ), lwork-n,
787  \$ ierr )
788 *
789  CALL zgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, cwork( n4+1 ),
790  \$ sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
791  \$ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
792  \$ ierr )
793 *
794  CALL zgsvj1( jobv, n2, n2, n4, a, lda, cwork, sva, mvl, v,
795  \$ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
796  \$ lwork-n, ierr )
797 *
798  CALL zgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
799  \$ cwork( n2+1 ), sva( n2+1 ), mvl,
800  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
801  \$ cwork( n+1 ), lwork-n, ierr )
802
803  END IF
804 *
805  END IF
806 *
807 * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
808 *
809  DO 1993 i = 1, nsweep
810 *
811 * .. go go go ...
812 *
813  mxaapq = zero
814  mxsinj = zero
815  iswrot = 0
816 *
817  notrot = 0
818  pskipped = 0
819 *
820 * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
821 * 1 <= p < q <= N. This is the first step toward a blocked implementation
822 * of the rotations. New implementation, based on block transformations,
823 * is under development.
824 *
825  DO 2000 ibr = 1, nbl
826 *
827  igl = ( ibr-1 )*kbl + 1
828 *
829  DO 1002 ir1 = 0, min( lkahead, nbl-ibr )
830 *
831  igl = igl + ir1*kbl
832 *
833  DO 2001 p = igl, min( igl+kbl-1, n-1 )
834 *
835 * .. de Rijk's pivoting
836 *
837  q = idamax( n-p+1, sva( p ), 1 ) + p - 1
838  IF( p.NE.q ) THEN
839  CALL zswap( m, a( 1, p ), 1, a( 1, q ), 1 )
840  IF( rsvec )CALL zswap( mvl, v( 1, p ), 1,
841  \$ v( 1, q ), 1 )
842  temp1 = sva( p )
843  sva( p ) = sva( q )
844  sva( q ) = temp1
845  aapq = cwork(p)
846  cwork(p) = cwork(q)
847  cwork(q) = aapq
848  END IF
849 *
850  IF( ir1.EQ.0 ) THEN
851 *
852 * Column norms are periodically updated by explicit
853 * norm computation.
854 *[!] Caveat:
855 * Unfortunately, some BLAS implementations compute DZNRM2(M,A(1,p),1)
856 * as SQRT(S=CDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to
857 * overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
858 * underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
859 * Hence, DZNRM2 cannot be trusted, not even in the case when
860 * the true norm is far from the under(over)flow boundaries.
861 * If properly implemented SCNRM2 is available, the IF-THEN-ELSE-END IF
862 * below should be replaced with "AAPP = DZNRM2( M, A(1,p), 1 )".
863 *
864  IF( ( sva( p ).LT.rootbig ) .AND.
865  \$ ( sva( p ).GT.rootsfmin ) ) THEN
866  sva( p ) = dznrm2( m, a( 1, p ), 1 )
867  ELSE
868  temp1 = zero
869  aapp = one
870  CALL zlassq( m, a( 1, p ), 1, temp1, aapp )
871  sva( p ) = temp1*sqrt( aapp )
872  END IF
873  aapp = sva( p )
874  ELSE
875  aapp = sva( p )
876  END IF
877 *
878  IF( aapp.GT.zero ) THEN
879 *
880  pskipped = 0
881 *
882  DO 2002 q = p + 1, min( igl+kbl-1, n )
883 *
884  aaqq = sva( q )
885 *
886  IF( aaqq.GT.zero ) THEN
887 *
888  aapp0 = aapp
889  IF( aaqq.GE.one ) THEN
890  rotok = ( small*aapp ).LE.aaqq
891  IF( aapp.LT.( big / aaqq ) ) THEN
892  aapq = ( zdotc( m, a( 1, p ), 1,
893  \$ a( 1, q ), 1 ) / aaqq ) / aapp
894  ELSE
895  CALL zcopy( m, a( 1, p ), 1,
896  \$ cwork(n+1), 1 )
897  CALL zlascl( 'G', 0, 0, aapp, one,
898  \$ m, 1, cwork(n+1), lda, ierr )
899  aapq = zdotc( m, cwork(n+1), 1,
900  \$ a( 1, q ), 1 ) / aaqq
901  END IF
902  ELSE
903  rotok = aapp.LE.( aaqq / small )
904  IF( aapp.GT.( small / aaqq ) ) THEN
905  aapq = ( zdotc( m, a( 1, p ), 1,
906  \$ a( 1, q ), 1 ) / aapp ) / aaqq
907  ELSE
908  CALL zcopy( m, a( 1, q ), 1,
909  \$ cwork(n+1), 1 )
910  CALL zlascl( 'G', 0, 0, aaqq,
911  \$ one, m, 1,
912  \$ cwork(n+1), lda, ierr )
913  aapq = zdotc( m, a(1, p ), 1,
914  \$ cwork(n+1), 1 ) / aapp
915  END IF
916  END IF
917 *
918
919 * AAPQ = AAPQ * CONJG( CWORK(p) ) * CWORK(q)
920  aapq1 = -abs(aapq)
921  mxaapq = max( mxaapq, -aapq1 )
922 *
923 * TO rotate or NOT to rotate, THAT is the question ...
924 *
925  IF( abs( aapq1 ).GT.tol ) THEN
926  ompq = aapq / abs(aapq)
927 *
928 * .. rotate
929 *[RTD] ROTATED = ROTATED + ONE
930 *
931  IF( ir1.EQ.0 ) THEN
932  notrot = 0
933  pskipped = 0
934  iswrot = iswrot + 1
935  END IF
936 *
937  IF( rotok ) THEN
938 *
939  aqoap = aaqq / aapp
940  apoaq = aapp / aaqq
941  theta = -half*abs( aqoap-apoaq )/aapq1
942 *
943  IF( abs( theta ).GT.bigtheta ) THEN
944 *
945  t = half / theta
946  cs = one
947
948  CALL zrot( m, a(1,p), 1, a(1,q), 1,
949  \$ cs, conjg(ompq)*t )
950  IF ( rsvec ) THEN
951  CALL zrot( mvl, v(1,p), 1,
952  \$ v(1,q), 1, cs, conjg(ompq)*t )
953  END IF
954
955  sva( q ) = aaqq*sqrt( max( zero,
956  \$ one+t*apoaq*aapq1 ) )
957  aapp = aapp*sqrt( max( zero,
958  \$ one-t*aqoap*aapq1 ) )
959  mxsinj = max( mxsinj, abs( t ) )
960 *
961  ELSE
962 *
963 * .. choose correct signum for THETA and rotate
964 *
965  thsign = -sign( one, aapq1 )
966  t = one / ( theta+thsign*
967  \$ sqrt( one+theta*theta ) )
968  cs = sqrt( one / ( one+t*t ) )
969  sn = t*cs
970 *
971  mxsinj = max( mxsinj, abs( sn ) )
972  sva( q ) = aaqq*sqrt( max( zero,
973  \$ one+t*apoaq*aapq1 ) )
974  aapp = aapp*sqrt( max( zero,
975  \$ one-t*aqoap*aapq1 ) )
976 *
977  CALL zrot( m, a(1,p), 1, a(1,q), 1,
978  \$ cs, conjg(ompq)*sn )
979  IF ( rsvec ) THEN
980  CALL zrot( mvl, v(1,p), 1,
981  \$ v(1,q), 1, cs, conjg(ompq)*sn )
982  END IF
983  END IF
984  cwork(p) = -cwork(q) * ompq
985 *
986  ELSE
987 * .. have to use modified Gram-Schmidt like transformation
988  CALL zcopy( m, a( 1, p ), 1,
989  \$ cwork(n+1), 1 )
990  CALL zlascl( 'G', 0, 0, aapp, one, m,
991  \$ 1, cwork(n+1), lda,
992  \$ ierr )
993  CALL zlascl( 'G', 0, 0, aaqq, one, m,
994  \$ 1, a( 1, q ), lda, ierr )
995  CALL zaxpy( m, -aapq, cwork(n+1), 1,
996  \$ a( 1, q ), 1 )
997  CALL zlascl( 'G', 0, 0, one, aaqq, m,
998  \$ 1, a( 1, q ), lda, ierr )
999  sva( q ) = aaqq*sqrt( max( zero,
1000  \$ one-aapq1*aapq1 ) )
1001  mxsinj = max( mxsinj, sfmin )
1002  END IF
1003 * END IF ROTOK THEN ... ELSE
1004 *
1005 * In the case of cancellation in updating SVA(q), SVA(p)
1006 * recompute SVA(q), SVA(p).
1007 *
1008  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1009  \$ THEN
1010  IF( ( aaqq.LT.rootbig ) .AND.
1011  \$ ( aaqq.GT.rootsfmin ) ) THEN
1012  sva( q ) = dznrm2( m, a( 1, q ), 1 )
1013  ELSE
1014  t = zero
1015  aaqq = one
1016  CALL zlassq( m, a( 1, q ), 1, t,
1017  \$ aaqq )
1018  sva( q ) = t*sqrt( aaqq )
1019  END IF
1020  END IF
1021  IF( ( aapp / aapp0 ).LE.rooteps ) THEN
1022  IF( ( aapp.LT.rootbig ) .AND.
1023  \$ ( aapp.GT.rootsfmin ) ) THEN
1024  aapp = dznrm2( m, a( 1, p ), 1 )
1025  ELSE
1026  t = zero
1027  aapp = one
1028  CALL zlassq( m, a( 1, p ), 1, t,
1029  \$ aapp )
1030  aapp = t*sqrt( aapp )
1031  END IF
1032  sva( p ) = aapp
1033  END IF
1034 *
1035  ELSE
1036 * A(:,p) and A(:,q) already numerically orthogonal
1037  IF( ir1.EQ.0 )notrot = notrot + 1
1038 *[RTD] SKIPPED = SKIPPED + 1
1039  pskipped = pskipped + 1
1040  END IF
1041  ELSE
1042 * A(:,q) is zero column
1043  IF( ir1.EQ.0 )notrot = notrot + 1
1044  pskipped = pskipped + 1
1045  END IF
1046 *
1047  IF( ( i.LE.swband ) .AND.
1048  \$ ( pskipped.GT.rowskip ) ) THEN
1049  IF( ir1.EQ.0 )aapp = -aapp
1050  notrot = 0
1051  GO TO 2103
1052  END IF
1053 *
1054  2002 CONTINUE
1055 * END q-LOOP
1056 *
1057  2103 CONTINUE
1058 * bailed out of q-loop
1059 *
1060  sva( p ) = aapp
1061 *
1062  ELSE
1063  sva( p ) = aapp
1064  IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
1065  \$ notrot = notrot + min( igl+kbl-1, n ) - p
1066  END IF
1067 *
1068  2001 CONTINUE
1069 * end of the p-loop
1070 * end of doing the block ( ibr, ibr )
1071  1002 CONTINUE
1072 * end of ir1-loop
1073 *
1074 * ... go to the off diagonal blocks
1075 *
1076  igl = ( ibr-1 )*kbl + 1
1077 *
1078  DO 2010 jbc = ibr + 1, nbl
1079 *
1080  jgl = ( jbc-1 )*kbl + 1
1081 *
1082 * doing the block at ( ibr, jbc )
1083 *
1084  ijblsk = 0
1085  DO 2100 p = igl, min( igl+kbl-1, n )
1086 *
1087  aapp = sva( p )
1088  IF( aapp.GT.zero ) THEN
1089 *
1090  pskipped = 0
1091 *
1092  DO 2200 q = jgl, min( jgl+kbl-1, n )
1093 *
1094  aaqq = sva( q )
1095  IF( aaqq.GT.zero ) THEN
1096  aapp0 = aapp
1097 *
1098 * .. M x 2 Jacobi SVD ..
1099 *
1100 * Safe Gram matrix computation
1101 *
1102  IF( aaqq.GE.one ) THEN
1103  IF( aapp.GE.aaqq ) THEN
1104  rotok = ( small*aapp ).LE.aaqq
1105  ELSE
1106  rotok = ( small*aaqq ).LE.aapp
1107  END IF
1108  IF( aapp.LT.( big / aaqq ) ) THEN
1109  aapq = ( zdotc( m, a( 1, p ), 1,
1110  \$ a( 1, q ), 1 ) / aaqq ) / aapp
1111  ELSE
1112  CALL zcopy( m, a( 1, p ), 1,
1113  \$ cwork(n+1), 1 )
1114  CALL zlascl( 'G', 0, 0, aapp,
1115  \$ one, m, 1,
1116  \$ cwork(n+1), lda, ierr )
1117  aapq = zdotc( m, cwork(n+1), 1,
1118  \$ a( 1, q ), 1 ) / aaqq
1119  END IF
1120  ELSE
1121  IF( aapp.GE.aaqq ) THEN
1122  rotok = aapp.LE.( aaqq / small )
1123  ELSE
1124  rotok = aaqq.LE.( aapp / small )
1125  END IF
1126  IF( aapp.GT.( small / aaqq ) ) THEN
1127  aapq = ( zdotc( m, a( 1, p ), 1,
1128  \$ a( 1, q ), 1 ) / max(aaqq,aapp) )
1129  \$ / min(aaqq,aapp)
1130  ELSE
1131  CALL zcopy( m, a( 1, q ), 1,
1132  \$ cwork(n+1), 1 )
1133  CALL zlascl( 'G', 0, 0, aaqq,
1134  \$ one, m, 1,
1135  \$ cwork(n+1), lda, ierr )
1136  aapq = zdotc( m, a( 1, p ), 1,
1137  \$ cwork(n+1), 1 ) / aapp
1138  END IF
1139  END IF
1140 *
1141
1142 * AAPQ = AAPQ * CONJG(CWORK(p))*CWORK(q)
1143  aapq1 = -abs(aapq)
1144  mxaapq = max( mxaapq, -aapq1 )
1145 *
1146 * TO rotate or NOT to rotate, THAT is the question ...
1147 *
1148  IF( abs( aapq1 ).GT.tol ) THEN
1149  ompq = aapq / abs(aapq)
1150  notrot = 0
1151 *[RTD] ROTATED = ROTATED + 1
1152  pskipped = 0
1153  iswrot = iswrot + 1
1154 *
1155  IF( rotok ) THEN
1156 *
1157  aqoap = aaqq / aapp
1158  apoaq = aapp / aaqq
1159  theta = -half*abs( aqoap-apoaq )/ aapq1
1160  IF( aaqq.GT.aapp0 )theta = -theta
1161 *
1162  IF( abs( theta ).GT.bigtheta ) THEN
1163  t = half / theta
1164  cs = one
1165  CALL zrot( m, a(1,p), 1, a(1,q), 1,
1166  \$ cs, conjg(ompq)*t )
1167  IF( rsvec ) THEN
1168  CALL zrot( mvl, v(1,p), 1,
1169  \$ v(1,q), 1, cs, conjg(ompq)*t )
1170  END IF
1171  sva( q ) = aaqq*sqrt( max( zero,
1172  \$ one+t*apoaq*aapq1 ) )
1173  aapp = aapp*sqrt( max( zero,
1174  \$ one-t*aqoap*aapq1 ) )
1175  mxsinj = max( mxsinj, abs( t ) )
1176  ELSE
1177 *
1178 * .. choose correct signum for THETA and rotate
1179 *
1180  thsign = -sign( one, aapq1 )
1181  IF( aaqq.GT.aapp0 )thsign = -thsign
1182  t = one / ( theta+thsign*
1183  \$ sqrt( one+theta*theta ) )
1184  cs = sqrt( one / ( one+t*t ) )
1185  sn = t*cs
1186  mxsinj = max( mxsinj, abs( sn ) )
1187  sva( q ) = aaqq*sqrt( max( zero,
1188  \$ one+t*apoaq*aapq1 ) )
1189  aapp = aapp*sqrt( max( zero,
1190  \$ one-t*aqoap*aapq1 ) )
1191 *
1192  CALL zrot( m, a(1,p), 1, a(1,q), 1,
1193  \$ cs, conjg(ompq)*sn )
1194  IF( rsvec ) THEN
1195  CALL zrot( mvl, v(1,p), 1,
1196  \$ v(1,q), 1, cs, conjg(ompq)*sn )
1197  END IF
1198  END IF
1199  cwork(p) = -cwork(q) * ompq
1200 *
1201  ELSE
1202 * .. have to use modified Gram-Schmidt like transformation
1203  IF( aapp.GT.aaqq ) THEN
1204  CALL zcopy( m, a( 1, p ), 1,
1205  \$ cwork(n+1), 1 )
1206  CALL zlascl( 'G', 0, 0, aapp, one,
1207  \$ m, 1, cwork(n+1),lda,
1208  \$ ierr )
1209  CALL zlascl( 'G', 0, 0, aaqq, one,
1210  \$ m, 1, a( 1, q ), lda,
1211  \$ ierr )
1212  CALL zaxpy( m, -aapq, cwork(n+1),
1213  \$ 1, a( 1, q ), 1 )
1214  CALL zlascl( 'G', 0, 0, one, aaqq,
1215  \$ m, 1, a( 1, q ), lda,
1216  \$ ierr )
1217  sva( q ) = aaqq*sqrt( max( zero,
1218  \$ one-aapq1*aapq1 ) )
1219  mxsinj = max( mxsinj, sfmin )
1220  ELSE
1221  CALL zcopy( m, a( 1, q ), 1,
1222  \$ cwork(n+1), 1 )
1223  CALL zlascl( 'G', 0, 0, aaqq, one,
1224  \$ m, 1, cwork(n+1),lda,
1225  \$ ierr )
1226  CALL zlascl( 'G', 0, 0, aapp, one,
1227  \$ m, 1, a( 1, p ), lda,
1228  \$ ierr )
1229  CALL zaxpy( m, -conjg(aapq),
1230  \$ cwork(n+1), 1, a( 1, p ), 1 )
1231  CALL zlascl( 'G', 0, 0, one, aapp,
1232  \$ m, 1, a( 1, p ), lda,
1233  \$ ierr )
1234  sva( p ) = aapp*sqrt( max( zero,
1235  \$ one-aapq1*aapq1 ) )
1236  mxsinj = max( mxsinj, sfmin )
1237  END IF
1238  END IF
1239 * END IF ROTOK THEN ... ELSE
1240 *
1241 * In the case of cancellation in updating SVA(q), SVA(p)
1242 * .. recompute SVA(q), SVA(p)
1243  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1244  \$ THEN
1245  IF( ( aaqq.LT.rootbig ) .AND.
1246  \$ ( aaqq.GT.rootsfmin ) ) THEN
1247  sva( q ) = dznrm2( m, a( 1, q ), 1)
1248  ELSE
1249  t = zero
1250  aaqq = one
1251  CALL zlassq( m, a( 1, q ), 1, t,
1252  \$ aaqq )
1253  sva( q ) = t*sqrt( aaqq )
1254  END IF
1255  END IF
1256  IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
1257  IF( ( aapp.LT.rootbig ) .AND.
1258  \$ ( aapp.GT.rootsfmin ) ) THEN
1259  aapp = dznrm2( m, a( 1, p ), 1 )
1260  ELSE
1261  t = zero
1262  aapp = one
1263  CALL zlassq( m, a( 1, p ), 1, t,
1264  \$ aapp )
1265  aapp = t*sqrt( aapp )
1266  END IF
1267  sva( p ) = aapp
1268  END IF
1269 * end of OK rotation
1270  ELSE
1271  notrot = notrot + 1
1272 *[RTD] SKIPPED = SKIPPED + 1
1273  pskipped = pskipped + 1
1274  ijblsk = ijblsk + 1
1275  END IF
1276  ELSE
1277  notrot = notrot + 1
1278  pskipped = pskipped + 1
1279  ijblsk = ijblsk + 1
1280  END IF
1281 *
1282  IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
1283  \$ THEN
1284  sva( p ) = aapp
1285  notrot = 0
1286  GO TO 2011
1287  END IF
1288  IF( ( i.LE.swband ) .AND.
1289  \$ ( pskipped.GT.rowskip ) ) THEN
1290  aapp = -aapp
1291  notrot = 0
1292  GO TO 2203
1293  END IF
1294 *
1295  2200 CONTINUE
1296 * end of the q-loop
1297  2203 CONTINUE
1298 *
1299  sva( p ) = aapp
1300 *
1301  ELSE
1302 *
1303  IF( aapp.EQ.zero )notrot = notrot +
1304  \$ min( jgl+kbl-1, n ) - jgl + 1
1305  IF( aapp.LT.zero )notrot = 0
1306 *
1307  END IF
1308 *
1309  2100 CONTINUE
1310 * end of the p-loop
1311  2010 CONTINUE
1312 * end of the jbc-loop
1313  2011 CONTINUE
1314 *2011 bailed out of the jbc-loop
1315  DO 2012 p = igl, min( igl+kbl-1, n )
1316  sva( p ) = abs( sva( p ) )
1317  2012 CONTINUE
1318 ***
1319  2000 CONTINUE
1320 *2000 :: end of the ibr-loop
1321 *
1322 * .. update SVA(N)
1323  IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
1324  \$ THEN
1325  sva( n ) = dznrm2( m, a( 1, n ), 1 )
1326  ELSE
1327  t = zero
1328  aapp = one
1329  CALL zlassq( m, a( 1, n ), 1, t, aapp )
1330  sva( n ) = t*sqrt( aapp )
1331  END IF
1332 *
1334 *
1335  IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
1336  \$ ( iswrot.LE.n ) ) )swband = i
1337 *
1338  IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( dble( n ) )*
1339  \$ tol ) .AND. ( dble( n )*mxaapq*mxsinj.LT.tol ) ) THEN
1340  GO TO 1994
1341  END IF
1342 *
1343  IF( notrot.GE.emptsw )GO TO 1994
1344 *
1345  1993 CONTINUE
1346 * end i=1:NSWEEP loop
1347 *
1348 * #:( Reaching this point means that the procedure has not converged.
1349  info = nsweep - 1
1350  GO TO 1995
1351 *
1352  1994 CONTINUE
1353 * #:) Reaching this point means numerical convergence after the i-th
1354 * sweep.
1355 *
1356  info = 0
1357 * #:) INFO = 0 confirms successful iterations.
1358  1995 CONTINUE
1359 *
1360 * Sort the singular values and find how many are above
1361 * the underflow threshold.
1362 *
1363  n2 = 0
1364  n4 = 0
1365  DO 5991 p = 1, n - 1
1366  q = idamax( n-p+1, sva( p ), 1 ) + p - 1
1367  IF( p.NE.q ) THEN
1368  temp1 = sva( p )
1369  sva( p ) = sva( q )
1370  sva( q ) = temp1
1371  CALL zswap( m, a( 1, p ), 1, a( 1, q ), 1 )
1372  IF( rsvec )CALL zswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
1373  END IF
1374  IF( sva( p ).NE.zero ) THEN
1375  n4 = n4 + 1
1376  IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
1377  END IF
1378  5991 CONTINUE
1379  IF( sva( n ).NE.zero ) THEN
1380  n4 = n4 + 1
1381  IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
1382  END IF
1383 *
1384 * Normalize the left singular vectors.
1385 *
1386  IF( lsvec .OR. uctol ) THEN
1387  DO 1998 p = 1, n4
1388 * CALL ZDSCAL( M, ONE / SVA( p ), A( 1, p ), 1 )
1389  CALL zlascl( 'G',0,0, sva(p), one, m, 1, a(1,p), m, ierr )
1390  1998 CONTINUE
1391  END IF
1392 *
1393 * Scale the product of Jacobi rotations.
1394 *
1395  IF( rsvec ) THEN
1396  DO 2399 p = 1, n
1397  temp1 = one / dznrm2( mvl, v( 1, p ), 1 )
1398  CALL zdscal( mvl, temp1, v( 1, p ), 1 )
1399  2399 CONTINUE
1400  END IF
1401 *
1402 * Undo scaling, if necessary (and possible).
1403  IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
1404  \$ .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
1405  \$ ( sfmin / skl ) ) ) ) THEN
1406  DO 2400 p = 1, n
1407  sva( p ) = skl*sva( p )
1408  2400 CONTINUE
1409  skl = one
1410  END IF
1411 *
1412  rwork( 1 ) = skl
1413 * The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1414 * then some of the singular values may overflow or underflow and
1415 * the spectrum is given in this factored representation.
1416 *
1417  rwork( 2 ) = dble( n4 )
1418 * N4 is the number of computed nonzero singular values of A.
1419 *
1420  rwork( 3 ) = dble( n2 )
1421 * N2 is the number of singular values of A greater than SFMIN.
1422 * If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1423 * that may carry some information.
1424 *
1425  rwork( 4 ) = dble( i )
1426 * i is the index of the last sweep before declaring convergence.
1427 *
1428  rwork( 5 ) = mxaapq
1429 * MXAAPQ is the largest absolute value of scaled pivots in the
1430 * last sweep
1431 *
1432  rwork( 6 ) = mxsinj
1433 * MXSINJ is the largest absolute value of the sines of Jacobi angles
1434 * in the last sweep
1435 *
1436  RETURN
1437 * ..
1438 * .. END OF ZGESVJ
1439 * ..
double precision function dlamch(CMACH)
DLAMCH
Definition: dlamch.f:69
subroutine zlassq(n, x, incx, scl, sumsq)
ZLASSQ updates a sum of squares represented in scaled form.
Definition: zlassq.f90:137
subroutine dlascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
DLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: dlascl.f:143
integer function idamax(N, DX, INCX)
IDAMAX
Definition: idamax.f:71
subroutine xerbla(SRNAME, INFO)
XERBLA
Definition: xerbla.f:60
logical function lsame(CA, CB)
LSAME
Definition: lsame.f:53
subroutine zswap(N, ZX, INCX, ZY, INCY)
ZSWAP
Definition: zswap.f:81
subroutine zdscal(N, DA, ZX, INCX)
ZDSCAL
Definition: zdscal.f:78
complex *16 function zdotc(N, ZX, INCX, ZY, INCY)
ZDOTC
Definition: zdotc.f:83
subroutine zaxpy(N, ZA, ZX, INCX, ZY, INCY)
ZAXPY
Definition: zaxpy.f:88
subroutine zcopy(N, ZX, INCX, ZY, INCY)
ZCOPY
Definition: zcopy.f:81
subroutine zlascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
ZLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: zlascl.f:143
subroutine zrot(N, CX, INCX, CY, INCY, C, S)
ZROT applies a plane rotation with real cosine and complex sine to a pair of complex vectors.
Definition: zrot.f:103
subroutine zlaset(UPLO, M, N, ALPHA, BETA, A, LDA)
ZLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values.
Definition: zlaset.f:106
subroutine zgsvj0(JOBV, M, N, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
ZGSVJ0 pre-processor for the routine zgesvj.
Definition: zgsvj0.f:218
subroutine zgsvj1(JOBV, M, N, N1, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
ZGSVJ1 pre-processor for the routine zgesvj, applies Jacobi rotations targeting only particular pivot...
Definition: zgsvj1.f:236
real(wp) function dznrm2(n, x, incx)
DZNRM2
Definition: dznrm2.f90:90
Here is the call graph for this function:
Here is the caller graph for this function: