 LAPACK  3.10.0 LAPACK: Linear Algebra PACKage

◆ cgesvj()

 subroutine cgesvj ( character*1 JOBA, character*1 JOBU, character*1 JOBV, integer M, integer N, complex, dimension( lda, * ) A, integer LDA, real, dimension( n ) SVA, integer MV, complex, dimension( ldv, * ) V, integer LDV, complex, dimension( lwork ) CWORK, integer LWORK, real, dimension( lrwork ) RWORK, integer LRWORK, integer INFO )

CGESVJ

Purpose:
CGESVJ computes the singular value decomposition (SVD) of a complex
M-by-N matrix A, where M >= N. The SVD of A is written as
[++]   [xx]   [x0]   [xx]
A = U * SIGMA * V^*,  [++] = [xx] * [ox] * [xx]
[++]   [xx]
where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
matrix, and V is an N-by-N unitary matrix. The diagonal elements
of SIGMA are the singular values of A. The columns of U and V are the
left and the right singular vectors of A, respectively.
Parameters
 [in] JOBA JOBA is CHARACTER*1 Specifies the structure of A. = 'L': The input matrix A is lower triangular; = 'U': The input matrix A is upper triangular; = 'G': The input matrix A is general M-by-N matrix, M >= N. [in] JOBU JOBU is CHARACTER*1 Specifies whether to compute the left singular vectors (columns of U): = 'U' or 'F': The left singular vectors corresponding to the nonzero singular values are computed and returned in the leading columns of A. See more details in the description of A. The default numerical orthogonality threshold is set to approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=SLAMCH('E'). = 'C': Analogous to JOBU='U', except that user can control the level of numerical orthogonality of the computed left singular vectors. TOL can be set to TOL = CTOL*EPS, where CTOL is given on input in the array WORK. No CTOL smaller than ONE is allowed. CTOL greater than 1 / EPS is meaningless. The option 'C' can be used if M*EPS is satisfactory orthogonality of the computed left singular vectors, so CTOL=M could save few sweeps of Jacobi rotations. See the descriptions of A and WORK(1). = 'N': The matrix U is not computed. However, see the description of A. [in] JOBV JOBV is CHARACTER*1 Specifies whether to compute the right singular vectors, that is, the matrix V: = 'V' or 'J': the matrix V is computed and returned in the array V = 'A': the Jacobi rotations are applied to the MV-by-N array V. In other words, the right singular vector matrix V is not computed explicitly; instead it is applied to an MV-by-N matrix initially stored in the first MV rows of V. = 'N': the matrix V is not computed and the array V is not referenced [in] M M is INTEGER The number of rows of the input matrix A. 1/SLAMCH('E') > M >= 0. [in] N N is INTEGER The number of columns of the input matrix A. M >= N >= 0. [in,out] A A is COMPLEX array, dimension (LDA,N) On entry, the M-by-N matrix A. On exit, If JOBU = 'U' .OR. JOBU = 'C': If INFO = 0 : RANKA orthonormal columns of U are returned in the leading RANKA columns of the array A. Here RANKA <= N is the number of computed singular values of A that are above the underflow threshold SLAMCH('S'). The singular vectors corresponding to underflowed or zero singular values are not computed. The value of RANKA is returned in the array RWORK as RANKA=NINT(RWORK(2)). Also see the descriptions of SVA and RWORK. The computed columns of U are mutually numerically orthogonal up to approximately TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU = 'C'), see the description of JOBU. If INFO > 0, the procedure CGESVJ did not converge in the given number of iterations (sweeps). In that case, the computed columns of U may not be orthogonal up to TOL. The output U (stored in A), SIGMA (given by the computed singular values in SVA(1:N)) and V is still a decomposition of the input matrix A in the sense that the residual || A - SCALE * U * SIGMA * V^* ||_2 / ||A||_2 is small. If JOBU = 'N': If INFO = 0 : Note that the left singular vectors are 'for free' in the one-sided Jacobi SVD algorithm. However, if only the singular values are needed, the level of numerical orthogonality of U is not an issue and iterations are stopped when the columns of the iterated matrix are numerically orthogonal up to approximately M*EPS. Thus, on exit, A contains the columns of U scaled with the corresponding singular values. If INFO > 0 : the procedure CGESVJ did not converge in the given number of iterations (sweeps). [in] LDA LDA is INTEGER The leading dimension of the array A. LDA >= max(1,M). [out] SVA SVA is REAL array, dimension (N) On exit, If INFO = 0 : depending on the value SCALE = RWORK(1), we have: If SCALE = ONE: SVA(1:N) contains the computed singular values of A. During the computation SVA contains the Euclidean column norms of the iterated matrices in the array A. If SCALE .NE. ONE: The singular values of A are SCALE*SVA(1:N), and this factored representation is due to the fact that some of the singular values of A might underflow or overflow. If INFO > 0 : the procedure CGESVJ did not converge in the given number of iterations (sweeps) and SCALE*SVA(1:N) may not be accurate. [in] MV MV is INTEGER If JOBV = 'A', then the product of Jacobi rotations in CGESVJ is applied to the first MV rows of V. See the description of JOBV. [in,out] V V is COMPLEX array, dimension (LDV,N) If JOBV = 'V', then V contains on exit the N-by-N matrix of the right singular vectors; If JOBV = 'A', then V contains the product of the computed right singular vector matrix and the initial matrix in the array V. If JOBV = 'N', then V is not referenced. [in] LDV LDV is INTEGER The leading dimension of the array V, LDV >= 1. If JOBV = 'V', then LDV >= max(1,N). If JOBV = 'A', then LDV >= max(1,MV) . [in,out] CWORK CWORK is COMPLEX array, dimension (max(1,LWORK)) Used as workspace. If on entry LWORK = -1, then a workspace query is assumed and no computation is done; CWORK(1) is set to the minial (and optimal) length of CWORK. [in] LWORK LWORK is INTEGER. Length of CWORK, LWORK >= M+N. [in,out] RWORK RWORK is REAL array, dimension (max(6,LRWORK)) On entry, If JOBU = 'C' : RWORK(1) = CTOL, where CTOL defines the threshold for convergence. The process stops if all columns of A are mutually orthogonal up to CTOL*EPS, EPS=SLAMCH('E'). It is required that CTOL >= ONE, i.e. it is not allowed to force the routine to obtain orthogonality below EPSILON. On exit, RWORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N) are the computed singular values of A. (See description of SVA().) RWORK(2) = NINT(RWORK(2)) is the number of the computed nonzero singular values. RWORK(3) = NINT(RWORK(3)) is the number of the computed singular values that are larger than the underflow threshold. RWORK(4) = NINT(RWORK(4)) is the number of sweeps of Jacobi rotations needed for numerical convergence. RWORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep. This is useful information in cases when CGESVJ did not converge, as it can be used to estimate whether the output is still useful and for post festum analysis. RWORK(6) = the largest absolute value over all sines of the Jacobi rotation angles in the last sweep. It can be useful for a post festum analysis. If on entry LRWORK = -1, then a workspace query is assumed and no computation is done; RWORK(1) is set to the minial (and optimal) length of RWORK. [in] LRWORK LRWORK is INTEGER Length of RWORK, LRWORK >= MAX(6,N). [out] INFO INFO is INTEGER = 0: successful exit. < 0: if INFO = -i, then the i-th argument had an illegal value > 0: CGESVJ did not converge in the maximal allowed number (NSWEEP=30) of sweeps. The output may still be useful. See the description of RWORK.
Further Details:
The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane
rotations. In the case of underflow of the tangent of the Jacobi angle, a
modified Jacobi transformation of Drmac  is used. Pivot strategy uses
column interchanges of de Rijk . The relative accuracy of the computed
singular values and the accuracy of the computed singular vectors (in
angle metric) is as guaranteed by the theory of Demmel and Veselic .
The condition number that determines the accuracy in the full rank case
is essentially min_{D=diag} kappa(A*D), where kappa(.) is the
spectral condition number. The best performance of this Jacobi SVD
procedure is achieved if used in an  accelerated version of Drmac and
Veselic [4,5], and it is the kernel routine in the SIGMA library .
Some tuning parameters (marked with [TP]) are available for the
implementer.
The computational range for the nonzero singular values is the  machine
number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even
denormalized singular values can be computed with the corresponding
Contributor:
============

Zlatko Drmac (Zagreb, Croatia)
References:
 P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the
singular value decomposition on a vector computer.
SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371.
 J. Demmel and K. Veselic: Jacobi method is more accurate than QR.
 Z. Drmac: Implementation of Jacobi rotations for accurate singular
value computation in floating point arithmetic.
SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222.
 Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342.
LAPACK Working note 169.
 Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362.
LAPACK Working note 170.
 Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
QSVD, (H,K)-SVD computations.
Department of Mathematics, University of Zagreb, 2008, 2015.
===========================
Please report all bugs and send interesting test examples and comments to
drmac@math.hr. Thank you.

Definition at line 349 of file cgesvj.f.

351 *
352 * -- LAPACK computational routine --
353 * -- LAPACK is a software package provided by Univ. of Tennessee, --
354 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
355 *
356  IMPLICIT NONE
357 * .. Scalar Arguments ..
358  INTEGER INFO, LDA, LDV, LWORK, LRWORK, M, MV, N
359  CHARACTER*1 JOBA, JOBU, JOBV
360 * ..
361 * .. Array Arguments ..
362  COMPLEX A( LDA, * ), V( LDV, * ), CWORK( LWORK )
363  REAL RWORK( LRWORK ), SVA( N )
364 * ..
365 *
366 * =====================================================================
367 *
368 * .. Local Parameters ..
369  REAL ZERO, HALF, ONE
370  parameter( zero = 0.0e0, half = 0.5e0, one = 1.0e0)
371  COMPLEX CZERO, CONE
372  parameter( czero = (0.0e0, 0.0e0), cone = (1.0e0, 0.0e0) )
373  INTEGER NSWEEP
374  parameter( nsweep = 30 )
375 * ..
376 * .. Local Scalars ..
377  COMPLEX AAPQ, OMPQ
378  REAL AAPP, AAPP0, AAPQ1, AAQQ, APOAQ, AQOAP, BIG,
379  \$ BIGTHETA, CS, CTOL, EPSLN, MXAAPQ,
380  \$ MXSINJ, ROOTBIG, ROOTEPS, ROOTSFMIN, ROOTTOL,
381  \$ SKL, SFMIN, SMALL, SN, T, TEMP1, THETA, THSIGN, TOL
382  INTEGER BLSKIP, EMPTSW, i, ibr, IERR, igl, IJBLSK, ir1,
383  \$ ISWROT, jbc, jgl, KBL, LKAHEAD, MVL, N2, N34,
384  \$ N4, NBL, NOTROT, p, PSKIPPED, q, ROWSKIP, SWBAND
385  LOGICAL APPLV, GOSCALE, LOWER, LQUERY, LSVEC, NOSCALE, ROTOK,
386  \$ RSVEC, UCTOL, UPPER
387 * ..
388 * ..
389 * .. Intrinsic Functions ..
390  INTRINSIC abs, max, min, conjg, real, sign, sqrt
391 * ..
392 * .. External Functions ..
393 * ..
394 * from BLAS
395  REAL SCNRM2
396  COMPLEX CDOTC
397  EXTERNAL cdotc, scnrm2
398  INTEGER ISAMAX
399  EXTERNAL isamax
400 * from LAPACK
401  REAL SLAMCH
402  EXTERNAL slamch
403  LOGICAL LSAME
404  EXTERNAL lsame
405 * ..
406 * .. External Subroutines ..
407 * ..
408 * from BLAS
409  EXTERNAL ccopy, crot, csscal, cswap, caxpy
410 * from LAPACK
411  EXTERNAL clascl, claset, classq, slascl, xerbla
412  EXTERNAL cgsvj0, cgsvj1
413 * ..
414 * .. Executable Statements ..
415 *
416 * Test the input arguments
417 *
418  lsvec = lsame( jobu, 'U' ) .OR. lsame( jobu, 'F' )
419  uctol = lsame( jobu, 'C' )
420  rsvec = lsame( jobv, 'V' ) .OR. lsame( jobv, 'J' )
421  applv = lsame( jobv, 'A' )
422  upper = lsame( joba, 'U' )
423  lower = lsame( joba, 'L' )
424 *
425  lquery = ( lwork .EQ. -1 ) .OR. ( lrwork .EQ. -1 )
426  IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
427  info = -1
428  ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
429  info = -2
430  ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
431  info = -3
432  ELSE IF( m.LT.0 ) THEN
433  info = -4
434  ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
435  info = -5
436  ELSE IF( lda.LT.m ) THEN
437  info = -7
438  ELSE IF( mv.LT.0 ) THEN
439  info = -9
440  ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
441  \$ ( applv .AND. ( ldv.LT.mv ) ) ) THEN
442  info = -11
443  ELSE IF( uctol .AND. ( rwork( 1 ).LE.one ) ) THEN
444  info = -12
445  ELSE IF( lwork.LT.( m+n ) .AND. ( .NOT.lquery ) ) THEN
446  info = -13
447  ELSE IF( lrwork.LT.max( n, 6 ) .AND. ( .NOT.lquery ) ) THEN
448  info = -15
449  ELSE
450  info = 0
451  END IF
452 *
453 * #:(
454  IF( info.NE.0 ) THEN
455  CALL xerbla( 'CGESVJ', -info )
456  RETURN
457  ELSE IF ( lquery ) THEN
458  cwork(1) = m + n
459  rwork(1) = max( n, 6 )
460  RETURN
461  END IF
462 *
463 * #:) Quick return for void matrix
464 *
465  IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
466 *
467 * Set numerical parameters
468 * The stopping criterion for Jacobi rotations is
469 *
470 * max_{i<>j}|A(:,i)^* * A(:,j)| / (||A(:,i)||*||A(:,j)||) < CTOL*EPS
471 *
472 * where EPS is the round-off and CTOL is defined as follows:
473 *
474  IF( uctol ) THEN
475 * ... user controlled
476  ctol = rwork( 1 )
477  ELSE
478 * ... default
479  IF( lsvec .OR. rsvec .OR. applv ) THEN
480  ctol = sqrt( real( m ) )
481  ELSE
482  ctol = real( m )
483  END IF
484  END IF
485 * ... and the machine dependent parameters are
486 *[!] (Make sure that SLAMCH() works properly on the target machine.)
487 *
488  epsln = slamch( 'Epsilon' )
489  rooteps = sqrt( epsln )
490  sfmin = slamch( 'SafeMinimum' )
491  rootsfmin = sqrt( sfmin )
492  small = sfmin / epsln
493 * BIG = SLAMCH( 'Overflow' )
494  big = one / sfmin
495  rootbig = one / rootsfmin
496 * LARGE = BIG / SQRT( REAL( M*N ) )
497  bigtheta = one / rooteps
498 *
499  tol = ctol*epsln
500  roottol = sqrt( tol )
501 *
502  IF( real( m )*epsln.GE.one ) THEN
503  info = -4
504  CALL xerbla( 'CGESVJ', -info )
505  RETURN
506  END IF
507 *
508 * Initialize the right singular vector matrix.
509 *
510  IF( rsvec ) THEN
511  mvl = n
512  CALL claset( 'A', mvl, n, czero, cone, v, ldv )
513  ELSE IF( applv ) THEN
514  mvl = mv
515  END IF
516  rsvec = rsvec .OR. applv
517 *
518 * Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
519 *(!) If necessary, scale A to protect the largest singular value
520 * from overflow. It is possible that saving the largest singular
521 * value destroys the information about the small ones.
522 * This initial scaling is almost minimal in the sense that the
523 * goal is to make sure that no column norm overflows, and that
524 * SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
525 * in A are detected, the procedure returns with INFO=-6.
526 *
527  skl = one / sqrt( real( m )*real( n ) )
528  noscale = .true.
529  goscale = .true.
530 *
531  IF( lower ) THEN
532 * the input matrix is M-by-N lower triangular (trapezoidal)
533  DO 1874 p = 1, n
534  aapp = zero
535  aaqq = one
536  CALL classq( m-p+1, a( p, p ), 1, aapp, aaqq )
537  IF( aapp.GT.big ) THEN
538  info = -6
539  CALL xerbla( 'CGESVJ', -info )
540  RETURN
541  END IF
542  aaqq = sqrt( aaqq )
543  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
544  sva( p ) = aapp*aaqq
545  ELSE
546  noscale = .false.
547  sva( p ) = aapp*( aaqq*skl )
548  IF( goscale ) THEN
549  goscale = .false.
550  DO 1873 q = 1, p - 1
551  sva( q ) = sva( q )*skl
552  1873 CONTINUE
553  END IF
554  END IF
555  1874 CONTINUE
556  ELSE IF( upper ) THEN
557 * the input matrix is M-by-N upper triangular (trapezoidal)
558  DO 2874 p = 1, n
559  aapp = zero
560  aaqq = one
561  CALL classq( p, a( 1, p ), 1, aapp, aaqq )
562  IF( aapp.GT.big ) THEN
563  info = -6
564  CALL xerbla( 'CGESVJ', -info )
565  RETURN
566  END IF
567  aaqq = sqrt( aaqq )
568  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
569  sva( p ) = aapp*aaqq
570  ELSE
571  noscale = .false.
572  sva( p ) = aapp*( aaqq*skl )
573  IF( goscale ) THEN
574  goscale = .false.
575  DO 2873 q = 1, p - 1
576  sva( q ) = sva( q )*skl
577  2873 CONTINUE
578  END IF
579  END IF
580  2874 CONTINUE
581  ELSE
582 * the input matrix is M-by-N general dense
583  DO 3874 p = 1, n
584  aapp = zero
585  aaqq = one
586  CALL classq( m, a( 1, p ), 1, aapp, aaqq )
587  IF( aapp.GT.big ) THEN
588  info = -6
589  CALL xerbla( 'CGESVJ', -info )
590  RETURN
591  END IF
592  aaqq = sqrt( aaqq )
593  IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
594  sva( p ) = aapp*aaqq
595  ELSE
596  noscale = .false.
597  sva( p ) = aapp*( aaqq*skl )
598  IF( goscale ) THEN
599  goscale = .false.
600  DO 3873 q = 1, p - 1
601  sva( q ) = sva( q )*skl
602  3873 CONTINUE
603  END IF
604  END IF
605  3874 CONTINUE
606  END IF
607 *
608  IF( noscale )skl = one
609 *
610 * Move the smaller part of the spectrum from the underflow threshold
611 *(!) Start by determining the position of the nonzero entries of the
612 * array SVA() relative to ( SFMIN, BIG ).
613 *
614  aapp = zero
615  aaqq = big
616  DO 4781 p = 1, n
617  IF( sva( p ).NE.zero )aaqq = min( aaqq, sva( p ) )
618  aapp = max( aapp, sva( p ) )
619  4781 CONTINUE
620 *
621 * #:) Quick return for zero matrix
622 *
623  IF( aapp.EQ.zero ) THEN
624  IF( lsvec )CALL claset( 'G', m, n, czero, cone, a, lda )
625  rwork( 1 ) = one
626  rwork( 2 ) = zero
627  rwork( 3 ) = zero
628  rwork( 4 ) = zero
629  rwork( 5 ) = zero
630  rwork( 6 ) = zero
631  RETURN
632  END IF
633 *
634 * #:) Quick return for one-column matrix
635 *
636  IF( n.EQ.1 ) THEN
637  IF( lsvec )CALL clascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
638  \$ a( 1, 1 ), lda, ierr )
639  rwork( 1 ) = one / skl
640  IF( sva( 1 ).GE.sfmin ) THEN
641  rwork( 2 ) = one
642  ELSE
643  rwork( 2 ) = zero
644  END IF
645  rwork( 3 ) = zero
646  rwork( 4 ) = zero
647  rwork( 5 ) = zero
648  rwork( 6 ) = zero
649  RETURN
650  END IF
651 *
652 * Protect small singular values from underflow, and try to
653 * avoid underflows/overflows in computing Jacobi rotations.
654 *
655  sn = sqrt( sfmin / epsln )
656  temp1 = sqrt( big / real( n ) )
657  IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
658  \$ ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
659  temp1 = min( big, temp1 / aapp )
660 * AAQQ = AAQQ*TEMP1
661 * AAPP = AAPP*TEMP1
662  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
663  temp1 = min( sn / aaqq, big / ( aapp*sqrt( real( n ) ) ) )
664 * AAQQ = AAQQ*TEMP1
665 * AAPP = AAPP*TEMP1
666  ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
667  temp1 = max( sn / aaqq, temp1 / aapp )
668 * AAQQ = AAQQ*TEMP1
669 * AAPP = AAPP*TEMP1
670  ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
671  temp1 = min( sn / aaqq, big / ( sqrt( real( n ) )*aapp ) )
672 * AAQQ = AAQQ*TEMP1
673 * AAPP = AAPP*TEMP1
674  ELSE
675  temp1 = one
676  END IF
677 *
678 * Scale, if necessary
679 *
680  IF( temp1.NE.one ) THEN
681  CALL slascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
682  END IF
683  skl = temp1*skl
684  IF( skl.NE.one ) THEN
685  CALL clascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
686  skl = one / skl
687  END IF
688 *
689 * Row-cyclic Jacobi SVD algorithm with column pivoting
690 *
691  emptsw = ( n*( n-1 ) ) / 2
692  notrot = 0
693
694  DO 1868 q = 1, n
695  cwork( q ) = cone
696  1868 CONTINUE
697 *
698 *
699 *
700  swband = 3
701 *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
702 * if CGESVJ is used as a computational routine in the preconditioned
703 * Jacobi SVD algorithm CGEJSV. For sweeps i=1:SWBAND the procedure
704 * works on pivots inside a band-like region around the diagonal.
705 * The boundaries are determined dynamically, based on the number of
706 * pivots above a threshold.
707 *
708  kbl = min( 8, n )
709 *[TP] KBL is a tuning parameter that defines the tile size in the
710 * tiling of the p-q loops of pivot pairs. In general, an optimal
711 * value of KBL depends on the matrix dimensions and on the
712 * parameters of the computer's memory.
713 *
714  nbl = n / kbl
715  IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
716 *
717  blskip = kbl**2
718 *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
719 *
720  rowskip = min( 5, kbl )
721 *[TP] ROWSKIP is a tuning parameter.
722 *
724 *[TP] LKAHEAD is a tuning parameter.
725 *
726 * Quasi block transformations, using the lower (upper) triangular
727 * structure of the input matrix. The quasi-block-cycling usually
728 * invokes cubic convergence. Big part of this cycle is done inside
729 * canonical subspaces of dimensions less than M.
730 *
731  IF( ( lower .OR. upper ) .AND. ( n.GT.max( 64, 4*kbl ) ) ) THEN
732 *[TP] The number of partition levels and the actual partition are
733 * tuning parameters.
734  n4 = n / 4
735  n2 = n / 2
736  n34 = 3*n4
737  IF( applv ) THEN
738  q = 0
739  ELSE
740  q = 1
741  END IF
742 *
743  IF( lower ) THEN
744 *
745 * This works very well on lower triangular matrices, in particular
746 * in the framework of the preconditioned Jacobi SVD (xGEJSV).
747 * The idea is simple:
748 * [+ 0 0 0] Note that Jacobi transformations of [0 0]
749 * [+ + 0 0] [0 0]
750 * [+ + x 0] actually work on [x 0] [x 0]
751 * [+ + x x] [x x]. [x x]
752 *
753  CALL cgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
754  \$ cwork( n34+1 ), sva( n34+1 ), mvl,
755  \$ v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
756  \$ 2, cwork( n+1 ), lwork-n, ierr )
757
758  CALL cgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
759  \$ cwork( n2+1 ), sva( n2+1 ), mvl,
760  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
761  \$ cwork( n+1 ), lwork-n, ierr )
762
763  CALL cgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
764  \$ cwork( n2+1 ), sva( n2+1 ), mvl,
765  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
766  \$ cwork( n+1 ), lwork-n, ierr )
767 *
768  CALL cgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
769  \$ cwork( n4+1 ), sva( n4+1 ), mvl,
770  \$ v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
771  \$ cwork( n+1 ), lwork-n, ierr )
772 *
773  CALL cgsvj0( jobv, m, n4, a, lda, cwork, sva, mvl, v, ldv,
774  \$ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
775  \$ ierr )
776 *
777  CALL cgsvj1( jobv, m, n2, n4, a, lda, cwork, sva, mvl, v,
778  \$ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
779  \$ lwork-n, ierr )
780 *
781 *
782  ELSE IF( upper ) THEN
783 *
784 *
785  CALL cgsvj0( jobv, n4, n4, a, lda, cwork, sva, mvl, v, ldv,
786  \$ epsln, sfmin, tol, 2, cwork( n+1 ), lwork-n,
787  \$ ierr )
788 *
789  CALL cgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, cwork( n4+1 ),
790  \$ sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
791  \$ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
792  \$ ierr )
793 *
794  CALL cgsvj1( jobv, n2, n2, n4, a, lda, cwork, sva, mvl, v,
795  \$ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
796  \$ lwork-n, ierr )
797 *
798  CALL cgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
799  \$ cwork( n2+1 ), sva( n2+1 ), mvl,
800  \$ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
801  \$ cwork( n+1 ), lwork-n, ierr )
802
803  END IF
804 *
805  END IF
806 *
807 * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
808 *
809  DO 1993 i = 1, nsweep
810 *
811 * .. go go go ...
812 *
813  mxaapq = zero
814  mxsinj = zero
815  iswrot = 0
816 *
817  notrot = 0
818  pskipped = 0
819 *
820 * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
821 * 1 <= p < q <= N. This is the first step toward a blocked implementation
822 * of the rotations. New implementation, based on block transformations,
823 * is under development.
824 *
825  DO 2000 ibr = 1, nbl
826 *
827  igl = ( ibr-1 )*kbl + 1
828 *
829  DO 1002 ir1 = 0, min( lkahead, nbl-ibr )
830 *
831  igl = igl + ir1*kbl
832 *
833  DO 2001 p = igl, min( igl+kbl-1, n-1 )
834 *
835 * .. de Rijk's pivoting
836 *
837  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
838  IF( p.NE.q ) THEN
839  CALL cswap( m, a( 1, p ), 1, a( 1, q ), 1 )
840  IF( rsvec )CALL cswap( mvl, v( 1, p ), 1,
841  \$ v( 1, q ), 1 )
842  temp1 = sva( p )
843  sva( p ) = sva( q )
844  sva( q ) = temp1
845  aapq = cwork(p)
846  cwork(p) = cwork(q)
847  cwork(q) = aapq
848  END IF
849 *
850  IF( ir1.EQ.0 ) THEN
851 *
852 * Column norms are periodically updated by explicit
853 * norm computation.
854 *[!] Caveat:
855 * Unfortunately, some BLAS implementations compute SCNRM2(M,A(1,p),1)
856 * as SQRT(S=CDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to
857 * overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
858 * underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
859 * Hence, SCNRM2 cannot be trusted, not even in the case when
860 * the true norm is far from the under(over)flow boundaries.
861 * If properly implemented SCNRM2 is available, the IF-THEN-ELSE-END IF
862 * below should be replaced with "AAPP = SCNRM2( M, A(1,p), 1 )".
863 *
864  IF( ( sva( p ).LT.rootbig ) .AND.
865  \$ ( sva( p ).GT.rootsfmin ) ) THEN
866  sva( p ) = scnrm2( m, a( 1, p ), 1 )
867  ELSE
868  temp1 = zero
869  aapp = one
870  CALL classq( m, a( 1, p ), 1, temp1, aapp )
871  sva( p ) = temp1*sqrt( aapp )
872  END IF
873  aapp = sva( p )
874  ELSE
875  aapp = sva( p )
876  END IF
877 *
878  IF( aapp.GT.zero ) THEN
879 *
880  pskipped = 0
881 *
882  DO 2002 q = p + 1, min( igl+kbl-1, n )
883 *
884  aaqq = sva( q )
885 *
886  IF( aaqq.GT.zero ) THEN
887 *
888  aapp0 = aapp
889  IF( aaqq.GE.one ) THEN
890  rotok = ( small*aapp ).LE.aaqq
891  IF( aapp.LT.( big / aaqq ) ) THEN
892  aapq = ( cdotc( m, a( 1, p ), 1,
893  \$ a( 1, q ), 1 ) / aaqq ) / aapp
894  ELSE
895  CALL ccopy( m, a( 1, p ), 1,
896  \$ cwork(n+1), 1 )
897  CALL clascl( 'G', 0, 0, aapp, one,
898  \$ m, 1, cwork(n+1), lda, ierr )
899  aapq = cdotc( m, cwork(n+1), 1,
900  \$ a( 1, q ), 1 ) / aaqq
901  END IF
902  ELSE
903  rotok = aapp.LE.( aaqq / small )
904  IF( aapp.GT.( small / aaqq ) ) THEN
905  aapq = ( cdotc( m, a( 1, p ), 1,
906  \$ a( 1, q ), 1 ) / aapp ) / aaqq
907  ELSE
908  CALL ccopy( m, a( 1, q ), 1,
909  \$ cwork(n+1), 1 )
910  CALL clascl( 'G', 0, 0, aaqq,
911  \$ one, m, 1,
912  \$ cwork(n+1), lda, ierr )
913  aapq = cdotc( m, a(1, p ), 1,
914  \$ cwork(n+1), 1 ) / aapp
915  END IF
916  END IF
917 *
918 * AAPQ = AAPQ * CONJG( CWORK(p) ) * CWORK(q)
919  aapq1 = -abs(aapq)
920  mxaapq = max( mxaapq, -aapq1 )
921 *
922 * TO rotate or NOT to rotate, THAT is the question ...
923 *
924  IF( abs( aapq1 ).GT.tol ) THEN
925  ompq = aapq / abs(aapq)
926 *
927 * .. rotate
928 *[RTD] ROTATED = ROTATED + ONE
929 *
930  IF( ir1.EQ.0 ) THEN
931  notrot = 0
932  pskipped = 0
933  iswrot = iswrot + 1
934  END IF
935 *
936  IF( rotok ) THEN
937 *
938  aqoap = aaqq / aapp
939  apoaq = aapp / aaqq
940  theta = -half*abs( aqoap-apoaq )/aapq1
941 *
942  IF( abs( theta ).GT.bigtheta ) THEN
943 *
944  t = half / theta
945  cs = one
946
947  CALL crot( m, a(1,p), 1, a(1,q), 1,
948  \$ cs, conjg(ompq)*t )
949  IF ( rsvec ) THEN
950  CALL crot( mvl, v(1,p), 1,
951  \$ v(1,q), 1, cs, conjg(ompq)*t )
952  END IF
953
954  sva( q ) = aaqq*sqrt( max( zero,
955  \$ one+t*apoaq*aapq1 ) )
956  aapp = aapp*sqrt( max( zero,
957  \$ one-t*aqoap*aapq1 ) )
958  mxsinj = max( mxsinj, abs( t ) )
959 *
960  ELSE
961 *
962 * .. choose correct signum for THETA and rotate
963 *
964  thsign = -sign( one, aapq1 )
965  t = one / ( theta+thsign*
966  \$ sqrt( one+theta*theta ) )
967  cs = sqrt( one / ( one+t*t ) )
968  sn = t*cs
969 *
970  mxsinj = max( mxsinj, abs( sn ) )
971  sva( q ) = aaqq*sqrt( max( zero,
972  \$ one+t*apoaq*aapq1 ) )
973  aapp = aapp*sqrt( max( zero,
974  \$ one-t*aqoap*aapq1 ) )
975 *
976  CALL crot( m, a(1,p), 1, a(1,q), 1,
977  \$ cs, conjg(ompq)*sn )
978  IF ( rsvec ) THEN
979  CALL crot( mvl, v(1,p), 1,
980  \$ v(1,q), 1, cs, conjg(ompq)*sn )
981  END IF
982  END IF
983  cwork(p) = -cwork(q) * ompq
984 *
985  ELSE
986 * .. have to use modified Gram-Schmidt like transformation
987  CALL ccopy( m, a( 1, p ), 1,
988  \$ cwork(n+1), 1 )
989  CALL clascl( 'G', 0, 0, aapp, one, m,
990  \$ 1, cwork(n+1), lda,
991  \$ ierr )
992  CALL clascl( 'G', 0, 0, aaqq, one, m,
993  \$ 1, a( 1, q ), lda, ierr )
994  CALL caxpy( m, -aapq, cwork(n+1), 1,
995  \$ a( 1, q ), 1 )
996  CALL clascl( 'G', 0, 0, one, aaqq, m,
997  \$ 1, a( 1, q ), lda, ierr )
998  sva( q ) = aaqq*sqrt( max( zero,
999  \$ one-aapq1*aapq1 ) )
1000  mxsinj = max( mxsinj, sfmin )
1001  END IF
1002 * END IF ROTOK THEN ... ELSE
1003 *
1004 * In the case of cancellation in updating SVA(q), SVA(p)
1005 * recompute SVA(q), SVA(p).
1006 *
1007  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1008  \$ THEN
1009  IF( ( aaqq.LT.rootbig ) .AND.
1010  \$ ( aaqq.GT.rootsfmin ) ) THEN
1011  sva( q ) = scnrm2( m, a( 1, q ), 1 )
1012  ELSE
1013  t = zero
1014  aaqq = one
1015  CALL classq( m, a( 1, q ), 1, t,
1016  \$ aaqq )
1017  sva( q ) = t*sqrt( aaqq )
1018  END IF
1019  END IF
1020  IF( ( aapp / aapp0 ).LE.rooteps ) THEN
1021  IF( ( aapp.LT.rootbig ) .AND.
1022  \$ ( aapp.GT.rootsfmin ) ) THEN
1023  aapp = scnrm2( m, a( 1, p ), 1 )
1024  ELSE
1025  t = zero
1026  aapp = one
1027  CALL classq( m, a( 1, p ), 1, t,
1028  \$ aapp )
1029  aapp = t*sqrt( aapp )
1030  END IF
1031  sva( p ) = aapp
1032  END IF
1033 *
1034  ELSE
1035 * A(:,p) and A(:,q) already numerically orthogonal
1036  IF( ir1.EQ.0 )notrot = notrot + 1
1037 *[RTD] SKIPPED = SKIPPED + 1
1038  pskipped = pskipped + 1
1039  END IF
1040  ELSE
1041 * A(:,q) is zero column
1042  IF( ir1.EQ.0 )notrot = notrot + 1
1043  pskipped = pskipped + 1
1044  END IF
1045 *
1046  IF( ( i.LE.swband ) .AND.
1047  \$ ( pskipped.GT.rowskip ) ) THEN
1048  IF( ir1.EQ.0 )aapp = -aapp
1049  notrot = 0
1050  GO TO 2103
1051  END IF
1052 *
1053  2002 CONTINUE
1054 * END q-LOOP
1055 *
1056  2103 CONTINUE
1057 * bailed out of q-loop
1058 *
1059  sva( p ) = aapp
1060 *
1061  ELSE
1062  sva( p ) = aapp
1063  IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
1064  \$ notrot = notrot + min( igl+kbl-1, n ) - p
1065  END IF
1066 *
1067  2001 CONTINUE
1068 * end of the p-loop
1069 * end of doing the block ( ibr, ibr )
1070  1002 CONTINUE
1071 * end of ir1-loop
1072 *
1073 * ... go to the off diagonal blocks
1074 *
1075  igl = ( ibr-1 )*kbl + 1
1076 *
1077  DO 2010 jbc = ibr + 1, nbl
1078 *
1079  jgl = ( jbc-1 )*kbl + 1
1080 *
1081 * doing the block at ( ibr, jbc )
1082 *
1083  ijblsk = 0
1084  DO 2100 p = igl, min( igl+kbl-1, n )
1085 *
1086  aapp = sva( p )
1087  IF( aapp.GT.zero ) THEN
1088 *
1089  pskipped = 0
1090 *
1091  DO 2200 q = jgl, min( jgl+kbl-1, n )
1092 *
1093  aaqq = sva( q )
1094  IF( aaqq.GT.zero ) THEN
1095  aapp0 = aapp
1096 *
1097 * .. M x 2 Jacobi SVD ..
1098 *
1099 * Safe Gram matrix computation
1100 *
1101  IF( aaqq.GE.one ) THEN
1102  IF( aapp.GE.aaqq ) THEN
1103  rotok = ( small*aapp ).LE.aaqq
1104  ELSE
1105  rotok = ( small*aaqq ).LE.aapp
1106  END IF
1107  IF( aapp.LT.( big / aaqq ) ) THEN
1108  aapq = ( cdotc( m, a( 1, p ), 1,
1109  \$ a( 1, q ), 1 ) / aaqq ) / aapp
1110  ELSE
1111  CALL ccopy( m, a( 1, p ), 1,
1112  \$ cwork(n+1), 1 )
1113  CALL clascl( 'G', 0, 0, aapp,
1114  \$ one, m, 1,
1115  \$ cwork(n+1), lda, ierr )
1116  aapq = cdotc( m, cwork(n+1), 1,
1117  \$ a( 1, q ), 1 ) / aaqq
1118  END IF
1119  ELSE
1120  IF( aapp.GE.aaqq ) THEN
1121  rotok = aapp.LE.( aaqq / small )
1122  ELSE
1123  rotok = aaqq.LE.( aapp / small )
1124  END IF
1125  IF( aapp.GT.( small / aaqq ) ) THEN
1126  aapq = ( cdotc( m, a( 1, p ), 1,
1127  \$ a( 1, q ), 1 ) / max(aaqq,aapp) )
1128  \$ / min(aaqq,aapp)
1129  ELSE
1130  CALL ccopy( m, a( 1, q ), 1,
1131  \$ cwork(n+1), 1 )
1132  CALL clascl( 'G', 0, 0, aaqq,
1133  \$ one, m, 1,
1134  \$ cwork(n+1), lda, ierr )
1135  aapq = cdotc( m, a( 1, p ), 1,
1136  \$ cwork(n+1), 1 ) / aapp
1137  END IF
1138  END IF
1139 *
1140 * AAPQ = AAPQ * CONJG(CWORK(p))*CWORK(q)
1141  aapq1 = -abs(aapq)
1142  mxaapq = max( mxaapq, -aapq1 )
1143 *
1144 * TO rotate or NOT to rotate, THAT is the question ...
1145 *
1146  IF( abs( aapq1 ).GT.tol ) THEN
1147  ompq = aapq / abs(aapq)
1148  notrot = 0
1149 *[RTD] ROTATED = ROTATED + 1
1150  pskipped = 0
1151  iswrot = iswrot + 1
1152 *
1153  IF( rotok ) THEN
1154 *
1155  aqoap = aaqq / aapp
1156  apoaq = aapp / aaqq
1157  theta = -half*abs( aqoap-apoaq )/ aapq1
1158  IF( aaqq.GT.aapp0 )theta = -theta
1159 *
1160  IF( abs( theta ).GT.bigtheta ) THEN
1161  t = half / theta
1162  cs = one
1163  CALL crot( m, a(1,p), 1, a(1,q), 1,
1164  \$ cs, conjg(ompq)*t )
1165  IF( rsvec ) THEN
1166  CALL crot( mvl, v(1,p), 1,
1167  \$ v(1,q), 1, cs, conjg(ompq)*t )
1168  END IF
1169  sva( q ) = aaqq*sqrt( max( zero,
1170  \$ one+t*apoaq*aapq1 ) )
1171  aapp = aapp*sqrt( max( zero,
1172  \$ one-t*aqoap*aapq1 ) )
1173  mxsinj = max( mxsinj, abs( t ) )
1174  ELSE
1175 *
1176 * .. choose correct signum for THETA and rotate
1177 *
1178  thsign = -sign( one, aapq1 )
1179  IF( aaqq.GT.aapp0 )thsign = -thsign
1180  t = one / ( theta+thsign*
1181  \$ sqrt( one+theta*theta ) )
1182  cs = sqrt( one / ( one+t*t ) )
1183  sn = t*cs
1184  mxsinj = max( mxsinj, abs( sn ) )
1185  sva( q ) = aaqq*sqrt( max( zero,
1186  \$ one+t*apoaq*aapq1 ) )
1187  aapp = aapp*sqrt( max( zero,
1188  \$ one-t*aqoap*aapq1 ) )
1189 *
1190  CALL crot( m, a(1,p), 1, a(1,q), 1,
1191  \$ cs, conjg(ompq)*sn )
1192  IF( rsvec ) THEN
1193  CALL crot( mvl, v(1,p), 1,
1194  \$ v(1,q), 1, cs, conjg(ompq)*sn )
1195  END IF
1196  END IF
1197  cwork(p) = -cwork(q) * ompq
1198 *
1199  ELSE
1200 * .. have to use modified Gram-Schmidt like transformation
1201  IF( aapp.GT.aaqq ) THEN
1202  CALL ccopy( m, a( 1, p ), 1,
1203  \$ cwork(n+1), 1 )
1204  CALL clascl( 'G', 0, 0, aapp, one,
1205  \$ m, 1, cwork(n+1),lda,
1206  \$ ierr )
1207  CALL clascl( 'G', 0, 0, aaqq, one,
1208  \$ m, 1, a( 1, q ), lda,
1209  \$ ierr )
1210  CALL caxpy( m, -aapq, cwork(n+1),
1211  \$ 1, a( 1, q ), 1 )
1212  CALL clascl( 'G', 0, 0, one, aaqq,
1213  \$ m, 1, a( 1, q ), lda,
1214  \$ ierr )
1215  sva( q ) = aaqq*sqrt( max( zero,
1216  \$ one-aapq1*aapq1 ) )
1217  mxsinj = max( mxsinj, sfmin )
1218  ELSE
1219  CALL ccopy( m, a( 1, q ), 1,
1220  \$ cwork(n+1), 1 )
1221  CALL clascl( 'G', 0, 0, aaqq, one,
1222  \$ m, 1, cwork(n+1),lda,
1223  \$ ierr )
1224  CALL clascl( 'G', 0, 0, aapp, one,
1225  \$ m, 1, a( 1, p ), lda,
1226  \$ ierr )
1227  CALL caxpy( m, -conjg(aapq),
1228  \$ cwork(n+1), 1, a( 1, p ), 1 )
1229  CALL clascl( 'G', 0, 0, one, aapp,
1230  \$ m, 1, a( 1, p ), lda,
1231  \$ ierr )
1232  sva( p ) = aapp*sqrt( max( zero,
1233  \$ one-aapq1*aapq1 ) )
1234  mxsinj = max( mxsinj, sfmin )
1235  END IF
1236  END IF
1237 * END IF ROTOK THEN ... ELSE
1238 *
1239 * In the case of cancellation in updating SVA(q), SVA(p)
1240 * .. recompute SVA(q), SVA(p)
1241  IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1242  \$ THEN
1243  IF( ( aaqq.LT.rootbig ) .AND.
1244  \$ ( aaqq.GT.rootsfmin ) ) THEN
1245  sva( q ) = scnrm2( m, a( 1, q ), 1)
1246  ELSE
1247  t = zero
1248  aaqq = one
1249  CALL classq( m, a( 1, q ), 1, t,
1250  \$ aaqq )
1251  sva( q ) = t*sqrt( aaqq )
1252  END IF
1253  END IF
1254  IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
1255  IF( ( aapp.LT.rootbig ) .AND.
1256  \$ ( aapp.GT.rootsfmin ) ) THEN
1257  aapp = scnrm2( m, a( 1, p ), 1 )
1258  ELSE
1259  t = zero
1260  aapp = one
1261  CALL classq( m, a( 1, p ), 1, t,
1262  \$ aapp )
1263  aapp = t*sqrt( aapp )
1264  END IF
1265  sva( p ) = aapp
1266  END IF
1267 * end of OK rotation
1268  ELSE
1269  notrot = notrot + 1
1270 *[RTD] SKIPPED = SKIPPED + 1
1271  pskipped = pskipped + 1
1272  ijblsk = ijblsk + 1
1273  END IF
1274  ELSE
1275  notrot = notrot + 1
1276  pskipped = pskipped + 1
1277  ijblsk = ijblsk + 1
1278  END IF
1279 *
1280  IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
1281  \$ THEN
1282  sva( p ) = aapp
1283  notrot = 0
1284  GO TO 2011
1285  END IF
1286  IF( ( i.LE.swband ) .AND.
1287  \$ ( pskipped.GT.rowskip ) ) THEN
1288  aapp = -aapp
1289  notrot = 0
1290  GO TO 2203
1291  END IF
1292 *
1293  2200 CONTINUE
1294 * end of the q-loop
1295  2203 CONTINUE
1296 *
1297  sva( p ) = aapp
1298 *
1299  ELSE
1300 *
1301  IF( aapp.EQ.zero )notrot = notrot +
1302  \$ min( jgl+kbl-1, n ) - jgl + 1
1303  IF( aapp.LT.zero )notrot = 0
1304 *
1305  END IF
1306 *
1307  2100 CONTINUE
1308 * end of the p-loop
1309  2010 CONTINUE
1310 * end of the jbc-loop
1311  2011 CONTINUE
1312 *2011 bailed out of the jbc-loop
1313  DO 2012 p = igl, min( igl+kbl-1, n )
1314  sva( p ) = abs( sva( p ) )
1315  2012 CONTINUE
1316 ***
1317  2000 CONTINUE
1318 *2000 :: end of the ibr-loop
1319 *
1320 * .. update SVA(N)
1321  IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
1322  \$ THEN
1323  sva( n ) = scnrm2( m, a( 1, n ), 1 )
1324  ELSE
1325  t = zero
1326  aapp = one
1327  CALL classq( m, a( 1, n ), 1, t, aapp )
1328  sva( n ) = t*sqrt( aapp )
1329  END IF
1330 *
1332 *
1333  IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
1334  \$ ( iswrot.LE.n ) ) )swband = i
1335 *
1336  IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( real( n ) )*
1337  \$ tol ) .AND. ( real( n )*mxaapq*mxsinj.LT.tol ) ) THEN
1338  GO TO 1994
1339  END IF
1340 *
1341  IF( notrot.GE.emptsw )GO TO 1994
1342 *
1343  1993 CONTINUE
1344 * end i=1:NSWEEP loop
1345 *
1346 * #:( Reaching this point means that the procedure has not converged.
1347  info = nsweep - 1
1348  GO TO 1995
1349 *
1350  1994 CONTINUE
1351 * #:) Reaching this point means numerical convergence after the i-th
1352 * sweep.
1353 *
1354  info = 0
1355 * #:) INFO = 0 confirms successful iterations.
1356  1995 CONTINUE
1357 *
1358 * Sort the singular values and find how many are above
1359 * the underflow threshold.
1360 *
1361  n2 = 0
1362  n4 = 0
1363  DO 5991 p = 1, n - 1
1364  q = isamax( n-p+1, sva( p ), 1 ) + p - 1
1365  IF( p.NE.q ) THEN
1366  temp1 = sva( p )
1367  sva( p ) = sva( q )
1368  sva( q ) = temp1
1369  CALL cswap( m, a( 1, p ), 1, a( 1, q ), 1 )
1370  IF( rsvec )CALL cswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
1371  END IF
1372  IF( sva( p ).NE.zero ) THEN
1373  n4 = n4 + 1
1374  IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
1375  END IF
1376  5991 CONTINUE
1377  IF( sva( n ).NE.zero ) THEN
1378  n4 = n4 + 1
1379  IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
1380  END IF
1381 *
1382 * Normalize the left singular vectors.
1383 *
1384  IF( lsvec .OR. uctol ) THEN
1385  DO 1998 p = 1, n4
1386 * CALL CSSCAL( M, ONE / SVA( p ), A( 1, p ), 1 )
1387  CALL clascl( 'G',0,0, sva(p), one, m, 1, a(1,p), m, ierr )
1388  1998 CONTINUE
1389  END IF
1390 *
1391 * Scale the product of Jacobi rotations.
1392 *
1393  IF( rsvec ) THEN
1394  DO 2399 p = 1, n
1395  temp1 = one / scnrm2( mvl, v( 1, p ), 1 )
1396  CALL csscal( mvl, temp1, v( 1, p ), 1 )
1397  2399 CONTINUE
1398  END IF
1399 *
1400 * Undo scaling, if necessary (and possible).
1401  IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
1402  \$ .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
1403  \$ ( sfmin / skl ) ) ) ) THEN
1404  DO 2400 p = 1, n
1405  sva( p ) = skl*sva( p )
1406  2400 CONTINUE
1407  skl = one
1408  END IF
1409 *
1410  rwork( 1 ) = skl
1411 * The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1412 * then some of the singular values may overflow or underflow and
1413 * the spectrum is given in this factored representation.
1414 *
1415  rwork( 2 ) = real( n4 )
1416 * N4 is the number of computed nonzero singular values of A.
1417 *
1418  rwork( 3 ) = real( n2 )
1419 * N2 is the number of singular values of A greater than SFMIN.
1420 * If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1421 * that may carry some information.
1422 *
1423  rwork( 4 ) = real( i )
1424 * i is the index of the last sweep before declaring convergence.
1425 *
1426  rwork( 5 ) = mxaapq
1427 * MXAAPQ is the largest absolute value of scaled pivots in the
1428 * last sweep
1429 *
1430  rwork( 6 ) = mxsinj
1431 * MXSINJ is the largest absolute value of the sines of Jacobi angles
1432 * in the last sweep
1433 *
1434  RETURN
1435 * ..
1436 * .. END OF CGESVJ
1437 * ..
subroutine slascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
SLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: slascl.f:143
subroutine classq(n, x, incx, scl, sumsq)
CLASSQ updates a sum of squares represented in scaled form.
Definition: classq.f90:126
integer function isamax(N, SX, INCX)
ISAMAX
Definition: isamax.f:71
subroutine xerbla(SRNAME, INFO)
XERBLA
Definition: xerbla.f:60
logical function lsame(CA, CB)
LSAME
Definition: lsame.f:53
complex function cdotc(N, CX, INCX, CY, INCY)
CDOTC
Definition: cdotc.f:83
subroutine ccopy(N, CX, INCX, CY, INCY)
CCOPY
Definition: ccopy.f:81
subroutine csscal(N, SA, CX, INCX)
CSSCAL
Definition: csscal.f:78
subroutine caxpy(N, CA, CX, INCX, CY, INCY)
CAXPY
Definition: caxpy.f:88
subroutine cswap(N, CX, INCX, CY, INCY)
CSWAP
Definition: cswap.f:81
subroutine claset(UPLO, M, N, ALPHA, BETA, A, LDA)
CLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values.
Definition: claset.f:106
subroutine crot(N, CX, INCX, CY, INCY, C, S)
CROT applies a plane rotation with real cosine and complex sine to a pair of complex vectors.
Definition: crot.f:103
subroutine clascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
CLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: clascl.f:143
subroutine cgsvj0(JOBV, M, N, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
CGSVJ0 pre-processor for the routine cgesvj.
Definition: cgsvj0.f:218
subroutine cgsvj1(JOBV, M, N, N1, A, LDA, D, SVA, MV, V, LDV, EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO)
CGSVJ1 pre-processor for the routine cgesvj, applies Jacobi rotations targeting only particular pivot...
Definition: cgsvj1.f:236
real(wp) function scnrm2(n, x, incx)
SCNRM2
Definition: scnrm2.f90:90
real function slamch(CMACH)
SLAMCH
Definition: slamch.f:68
Here is the call graph for this function:
Here is the caller graph for this function: