LAPACK 3.12.0
LAPACK: Linear Algebra PACKage
Loading...
Searching...
No Matches

◆ zgesvj()

subroutine zgesvj ( character*1  joba,
character*1  jobu,
character*1  jobv,
integer  m,
integer  n,
complex*16, dimension( lda, * )  a,
integer  lda,
double precision, dimension( n )  sva,
integer  mv,
complex*16, dimension( ldv, * )  v,
integer  ldv,
complex*16, dimension( lwork )  cwork,
integer  lwork,
double precision, dimension( lrwork )  rwork,
integer  lrwork,
integer  info 
)

ZGESVJ

Download ZGESVJ + dependencies [TGZ] [ZIP] [TXT]

Purpose:
 ZGESVJ computes the singular value decomposition (SVD) of a complex
 M-by-N matrix A, where M >= N. The SVD of A is written as
                                    [++]   [xx]   [x0]   [xx]
              A = U * SIGMA * V^*,  [++] = [xx] * [ox] * [xx]
                                    [++]   [xx]
 where SIGMA is an N-by-N diagonal matrix, U is an M-by-N orthonormal
 matrix, and V is an N-by-N unitary matrix. The diagonal elements
 of SIGMA are the singular values of A. The columns of U and V are the
 left and the right singular vectors of A, respectively.
Parameters
[in]JOBA
          JOBA is CHARACTER*1
          Specifies the structure of A.
          = 'L': The input matrix A is lower triangular;
          = 'U': The input matrix A is upper triangular;
          = 'G': The input matrix A is general M-by-N matrix, M >= N.
[in]JOBU
          JOBU is CHARACTER*1
          Specifies whether to compute the left singular vectors
          (columns of U):
          = 'U' or 'F': The left singular vectors corresponding to the nonzero
                 singular values are computed and returned in the leading
                 columns of A. See more details in the description of A.
                 The default numerical orthogonality threshold is set to
                 approximately TOL=CTOL*EPS, CTOL=SQRT(M), EPS=DLAMCH('E').
          = 'C': Analogous to JOBU='U', except that user can control the
                 level of numerical orthogonality of the computed left
                 singular vectors. TOL can be set to TOL = CTOL*EPS, where
                 CTOL is given on input in the array WORK.
                 No CTOL smaller than ONE is allowed. CTOL greater
                 than 1 / EPS is meaningless. The option 'C'
                 can be used if M*EPS is satisfactory orthogonality
                 of the computed left singular vectors, so CTOL=M could
                 save few sweeps of Jacobi rotations.
                 See the descriptions of A and WORK(1).
          = 'N': The matrix U is not computed. However, see the
                 description of A.
[in]JOBV
          JOBV is CHARACTER*1
          Specifies whether to compute the right singular vectors, that
          is, the matrix V:
          = 'V' or 'J': the matrix V is computed and returned in the array V
          = 'A':  the Jacobi rotations are applied to the MV-by-N
                  array V. In other words, the right singular vector
                  matrix V is not computed explicitly; instead it is
                  applied to an MV-by-N matrix initially stored in the
                  first MV rows of V.
          = 'N':  the matrix V is not computed and the array V is not
                  referenced
[in]M
          M is INTEGER
          The number of rows of the input matrix A. 1/DLAMCH('E') > M >= 0.
[in]N
          N is INTEGER
          The number of columns of the input matrix A.
          M >= N >= 0.
[in,out]A
          A is COMPLEX*16 array, dimension (LDA,N)
          On entry, the M-by-N matrix A.
          On exit,
          If JOBU = 'U' .OR. JOBU = 'C':
                 If INFO = 0 :
                 RANKA orthonormal columns of U are returned in the
                 leading RANKA columns of the array A. Here RANKA <= N
                 is the number of computed singular values of A that are
                 above the underflow threshold DLAMCH('S'). The singular
                 vectors corresponding to underflowed or zero singular
                 values are not computed. The value of RANKA is returned
                 in the array RWORK as RANKA=NINT(RWORK(2)). Also see the
                 descriptions of SVA and RWORK. The computed columns of U
                 are mutually numerically orthogonal up to approximately
                 TOL=SQRT(M)*EPS (default); or TOL=CTOL*EPS (JOBU = 'C'),
                 see the description of JOBU.
                 If INFO > 0,
                 the procedure ZGESVJ did not converge in the given number
                 of iterations (sweeps). In that case, the computed
                 columns of U may not be orthogonal up to TOL. The output
                 U (stored in A), SIGMA (given by the computed singular
                 values in SVA(1:N)) and V is still a decomposition of the
                 input matrix A in the sense that the residual
                 || A - SCALE * U * SIGMA * V^* ||_2 / ||A||_2 is small.
          If JOBU = 'N':
                 If INFO = 0 :
                 Note that the left singular vectors are 'for free' in the
                 one-sided Jacobi SVD algorithm. However, if only the
                 singular values are needed, the level of numerical
                 orthogonality of U is not an issue and iterations are
                 stopped when the columns of the iterated matrix are
                 numerically orthogonal up to approximately M*EPS. Thus,
                 on exit, A contains the columns of U scaled with the
                 corresponding singular values.
                 If INFO > 0:
                 the procedure ZGESVJ did not converge in the given number
                 of iterations (sweeps).
[in]LDA
          LDA is INTEGER
          The leading dimension of the array A.  LDA >= max(1,M).
[out]SVA
          SVA is DOUBLE PRECISION array, dimension (N)
          On exit,
          If INFO = 0 :
          depending on the value SCALE = RWORK(1), we have:
                 If SCALE = ONE:
                 SVA(1:N) contains the computed singular values of A.
                 During the computation SVA contains the Euclidean column
                 norms of the iterated matrices in the array A.
                 If SCALE .NE. ONE:
                 The singular values of A are SCALE*SVA(1:N), and this
                 factored representation is due to the fact that some of the
                 singular values of A might underflow or overflow.

          If INFO > 0:
          the procedure ZGESVJ did not converge in the given number of
          iterations (sweeps) and SCALE*SVA(1:N) may not be accurate.
[in]MV
          MV is INTEGER
          If JOBV = 'A', then the product of Jacobi rotations in ZGESVJ
          is applied to the first MV rows of V. See the description of JOBV.
[in,out]V
          V is COMPLEX*16 array, dimension (LDV,N)
          If JOBV = 'V', then V contains on exit the N-by-N matrix of
                         the right singular vectors;
          If JOBV = 'A', then V contains the product of the computed right
                         singular vector matrix and the initial matrix in
                         the array V.
          If JOBV = 'N', then V is not referenced.
[in]LDV
          LDV is INTEGER
          The leading dimension of the array V, LDV >= 1.
          If JOBV = 'V', then LDV >= max(1,N).
          If JOBV = 'A', then LDV >= max(1,MV) .
[in,out]CWORK
          CWORK is COMPLEX*16 array, dimension (max(1,LWORK))
          Used as workspace.
          If on entry LWORK = -1, then a workspace query is assumed and
          no computation is done; CWORK(1) is set to the minial (and optimal)
          length of CWORK.
[in]LWORK
          LWORK is INTEGER.
          Length of CWORK, LWORK >= M+N.
[in,out]RWORK
          RWORK is DOUBLE PRECISION array, dimension (max(6,LRWORK))
          On entry,
          If JOBU = 'C' :
          RWORK(1) = CTOL, where CTOL defines the threshold for convergence.
                    The process stops if all columns of A are mutually
                    orthogonal up to CTOL*EPS, EPS=DLAMCH('E').
                    It is required that CTOL >= ONE, i.e. it is not
                    allowed to force the routine to obtain orthogonality
                    below EPSILON.
          On exit,
          RWORK(1) = SCALE is the scaling factor such that SCALE*SVA(1:N)
                    are the computed singular values of A.
                    (See description of SVA().)
          RWORK(2) = NINT(RWORK(2)) is the number of the computed nonzero
                    singular values.
          RWORK(3) = NINT(RWORK(3)) is the number of the computed singular
                    values that are larger than the underflow threshold.
          RWORK(4) = NINT(RWORK(4)) is the number of sweeps of Jacobi
                    rotations needed for numerical convergence.
          RWORK(5) = max_{i.NE.j} |COS(A(:,i),A(:,j))| in the last sweep.
                    This is useful information in cases when ZGESVJ did
                    not converge, as it can be used to estimate whether
                    the output is still useful and for post festum analysis.
          RWORK(6) = the largest absolute value over all sines of the
                    Jacobi rotation angles in the last sweep. It can be
                    useful for a post festum analysis.
         If on entry LRWORK = -1, then a workspace query is assumed and
         no computation is done; RWORK(1) is set to the minial (and optimal)
         length of RWORK.
[in]LRWORK
         LRWORK is INTEGER
         Length of RWORK, LRWORK >= MAX(6,N).
[out]INFO
          INFO is INTEGER
          = 0:  successful exit.
          < 0:  if INFO = -i, then the i-th argument had an illegal value
          > 0:  ZGESVJ did not converge in the maximal allowed number
                (NSWEEP=30) of sweeps. The output may still be useful.
                See the description of RWORK.
Author
Univ. of Tennessee
Univ. of California Berkeley
Univ. of Colorado Denver
NAG Ltd.
Further Details:
 The orthogonal N-by-N matrix V is obtained as a product of Jacobi plane
 rotations. In the case of underflow of the tangent of the Jacobi angle, a
 modified Jacobi transformation of Drmac [3] is used. Pivot strategy uses
 column interchanges of de Rijk [1]. The relative accuracy of the computed
 singular values and the accuracy of the computed singular vectors (in
 angle metric) is as guaranteed by the theory of Demmel and Veselic [2].
 The condition number that determines the accuracy in the full rank case
 is essentially min_{D=diag} kappa(A*D), where kappa(.) is the
 spectral condition number. The best performance of this Jacobi SVD
 procedure is achieved if used in an  accelerated version of Drmac and
 Veselic [4,5], and it is the kernel routine in the SIGMA library [6].
 Some tuning parameters (marked with [TP]) are available for the
 implementer.
 The computational range for the nonzero singular values is the  machine
 number interval ( UNDERFLOW , OVERFLOW ). In extreme cases, even
 denormalized singular values can be computed with the corresponding
 gradual loss of accurate digits.
Contributor:
  ============

  Zlatko Drmac (Zagreb, Croatia)
References:
 [1] P. P. M. De Rijk: A one-sided Jacobi algorithm for computing the
    singular value decomposition on a vector computer.
    SIAM J. Sci. Stat. Comp., Vol. 10 (1998), pp. 359-371.
 [2] J. Demmel and K. Veselic: Jacobi method is more accurate than QR.
 [3] Z. Drmac: Implementation of Jacobi rotations for accurate singular
    value computation in floating point arithmetic.
    SIAM J. Sci. Comp., Vol. 18 (1997), pp. 1200-1222.
 [4] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
    SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342.
    LAPACK Working note 169.
 [5] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
    SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362.
    LAPACK Working note 170.
 [6] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
    QSVD, (H,K)-SVD computations.
    Department of Mathematics, University of Zagreb, 2008, 2015.
Bugs, examples and comments:
  ===========================
  Please report all bugs and send interesting test examples and comments to
  drmac@math.hr. Thank you.

Definition at line 349 of file zgesvj.f.

351*
352* -- LAPACK computational routine --
353* -- LAPACK is a software package provided by Univ. of Tennessee, --
354* -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
355*
356 IMPLICIT NONE
357* .. Scalar Arguments ..
358 INTEGER INFO, LDA, LDV, LWORK, LRWORK, M, MV, N
359 CHARACTER*1 JOBA, JOBU, JOBV
360* ..
361* .. Array Arguments ..
362 COMPLEX*16 A( LDA, * ), V( LDV, * ), CWORK( LWORK )
363 DOUBLE PRECISION RWORK( LRWORK ), SVA( N )
364* ..
365*
366* =====================================================================
367*
368* .. Local Parameters ..
369 DOUBLE PRECISION ZERO, HALF, ONE
370 parameter( zero = 0.0d0, half = 0.5d0, one = 1.0d0)
371 COMPLEX*16 CZERO, CONE
372 parameter( czero = (0.0d0, 0.0d0), cone = (1.0d0, 0.0d0) )
373 INTEGER NSWEEP
374 parameter( nsweep = 30 )
375* ..
376* .. Local Scalars ..
377 COMPLEX*16 AAPQ, OMPQ
378 DOUBLE PRECISION AAPP, AAPP0, AAPQ1, AAQQ, APOAQ, AQOAP, BIG,
379 $ BIGTHETA, CS, CTOL, EPSLN, MXAAPQ,
380 $ MXSINJ, ROOTBIG, ROOTEPS, ROOTSFMIN, ROOTTOL,
381 $ SKL, SFMIN, SMALL, SN, T, TEMP1, THETA, THSIGN, TOL
382 INTEGER BLSKIP, EMPTSW, i, ibr, IERR, igl, IJBLSK, ir1,
383 $ ISWROT, jbc, jgl, KBL, LKAHEAD, MVL, N2, N34,
384 $ N4, NBL, NOTROT, p, PSKIPPED, q, ROWSKIP, SWBAND
385 LOGICAL APPLV, GOSCALE, LOWER, LQUERY, LSVEC, NOSCALE, ROTOK,
386 $ RSVEC, UCTOL, UPPER
387* ..
388* ..
389* .. Intrinsic Functions ..
390 INTRINSIC abs, max, min, conjg, dble, sign, sqrt
391* ..
392* .. External Functions ..
393* ..
394* from BLAS
395 DOUBLE PRECISION DZNRM2
396 COMPLEX*16 ZDOTC
397 EXTERNAL zdotc, dznrm2
398 INTEGER IDAMAX
399 EXTERNAL idamax
400* from LAPACK
401 DOUBLE PRECISION DLAMCH
402 EXTERNAL dlamch
403 LOGICAL LSAME
404 EXTERNAL lsame
405* ..
406* .. External Subroutines ..
407* ..
408* from BLAS
409 EXTERNAL zcopy, zrot, zdscal, zswap, zaxpy
410* from LAPACK
411 EXTERNAL dlascl, zlascl, zlaset, zlassq, xerbla
412 EXTERNAL zgsvj0, zgsvj1
413* ..
414* .. Executable Statements ..
415*
416* Test the input arguments
417*
418 lsvec = lsame( jobu, 'U' ) .OR. lsame( jobu, 'F' )
419 uctol = lsame( jobu, 'C' )
420 rsvec = lsame( jobv, 'V' ) .OR. lsame( jobv, 'J' )
421 applv = lsame( jobv, 'A' )
422 upper = lsame( joba, 'U' )
423 lower = lsame( joba, 'L' )
424*
425 lquery = ( lwork .EQ. -1 ) .OR. ( lrwork .EQ. -1 )
426 IF( .NOT.( upper .OR. lower .OR. lsame( joba, 'G' ) ) ) THEN
427 info = -1
428 ELSE IF( .NOT.( lsvec .OR. uctol .OR. lsame( jobu, 'N' ) ) ) THEN
429 info = -2
430 ELSE IF( .NOT.( rsvec .OR. applv .OR. lsame( jobv, 'N' ) ) ) THEN
431 info = -3
432 ELSE IF( m.LT.0 ) THEN
433 info = -4
434 ELSE IF( ( n.LT.0 ) .OR. ( n.GT.m ) ) THEN
435 info = -5
436 ELSE IF( lda.LT.m ) THEN
437 info = -7
438 ELSE IF( mv.LT.0 ) THEN
439 info = -9
440 ELSE IF( ( rsvec .AND. ( ldv.LT.n ) ) .OR.
441 $ ( applv .AND. ( ldv.LT.mv ) ) ) THEN
442 info = -11
443 ELSE IF( uctol .AND. ( rwork( 1 ).LE.one ) ) THEN
444 info = -12
445 ELSE IF( ( lwork.LT.( m+n ) ) .AND. ( .NOT.lquery ) ) THEN
446 info = -13
447 ELSE IF( ( lrwork.LT.max( n, 6 ) ) .AND. ( .NOT.lquery ) ) THEN
448 info = -15
449 ELSE
450 info = 0
451 END IF
452*
453* #:(
454 IF( info.NE.0 ) THEN
455 CALL xerbla( 'ZGESVJ', -info )
456 RETURN
457 ELSE IF ( lquery ) THEN
458 cwork(1) = m + n
459 rwork(1) = max( n, 6 )
460 RETURN
461 END IF
462*
463* #:) Quick return for void matrix
464*
465 IF( ( m.EQ.0 ) .OR. ( n.EQ.0 ) )RETURN
466*
467* Set numerical parameters
468* The stopping criterion for Jacobi rotations is
469*
470* max_{i<>j}|A(:,i)^* * A(:,j)| / (||A(:,i)||*||A(:,j)||) < CTOL*EPS
471*
472* where EPS is the round-off and CTOL is defined as follows:
473*
474 IF( uctol ) THEN
475* ... user controlled
476 ctol = rwork( 1 )
477 ELSE
478* ... default
479 IF( lsvec .OR. rsvec .OR. applv ) THEN
480 ctol = sqrt( dble( m ) )
481 ELSE
482 ctol = dble( m )
483 END IF
484 END IF
485* ... and the machine dependent parameters are
486*[!] (Make sure that SLAMCH() works properly on the target machine.)
487*
488 epsln = dlamch( 'Epsilon' )
489 rooteps = sqrt( epsln )
490 sfmin = dlamch( 'SafeMinimum' )
491 rootsfmin = sqrt( sfmin )
492 small = sfmin / epsln
493 big = dlamch( 'Overflow' )
494* BIG = ONE / SFMIN
495 rootbig = one / rootsfmin
496* LARGE = BIG / SQRT( DBLE( M*N ) )
497 bigtheta = one / rooteps
498*
499 tol = ctol*epsln
500 roottol = sqrt( tol )
501*
502 IF( dble( m )*epsln.GE.one ) THEN
503 info = -4
504 CALL xerbla( 'ZGESVJ', -info )
505 RETURN
506 END IF
507*
508* Initialize the right singular vector matrix.
509*
510 IF( rsvec ) THEN
511 mvl = n
512 CALL zlaset( 'A', mvl, n, czero, cone, v, ldv )
513 ELSE IF( applv ) THEN
514 mvl = mv
515 END IF
516 rsvec = rsvec .OR. applv
517*
518* Initialize SVA( 1:N ) = ( ||A e_i||_2, i = 1:N )
519*(!) If necessary, scale A to protect the largest singular value
520* from overflow. It is possible that saving the largest singular
521* value destroys the information about the small ones.
522* This initial scaling is almost minimal in the sense that the
523* goal is to make sure that no column norm overflows, and that
524* SQRT(N)*max_i SVA(i) does not overflow. If INFinite entries
525* in A are detected, the procedure returns with INFO=-6.
526*
527 skl = one / sqrt( dble( m )*dble( n ) )
528 noscale = .true.
529 goscale = .true.
530*
531 IF( lower ) THEN
532* the input matrix is M-by-N lower triangular (trapezoidal)
533 DO 1874 p = 1, n
534 aapp = zero
535 aaqq = one
536 CALL zlassq( m-p+1, a( p, p ), 1, aapp, aaqq )
537 IF( aapp.GT.big ) THEN
538 info = -6
539 CALL xerbla( 'ZGESVJ', -info )
540 RETURN
541 END IF
542 aaqq = sqrt( aaqq )
543 IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
544 sva( p ) = aapp*aaqq
545 ELSE
546 noscale = .false.
547 sva( p ) = aapp*( aaqq*skl )
548 IF( goscale ) THEN
549 goscale = .false.
550 DO 1873 q = 1, p - 1
551 sva( q ) = sva( q )*skl
552 1873 CONTINUE
553 END IF
554 END IF
555 1874 CONTINUE
556 ELSE IF( upper ) THEN
557* the input matrix is M-by-N upper triangular (trapezoidal)
558 DO 2874 p = 1, n
559 aapp = zero
560 aaqq = one
561 CALL zlassq( p, a( 1, p ), 1, aapp, aaqq )
562 IF( aapp.GT.big ) THEN
563 info = -6
564 CALL xerbla( 'ZGESVJ', -info )
565 RETURN
566 END IF
567 aaqq = sqrt( aaqq )
568 IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
569 sva( p ) = aapp*aaqq
570 ELSE
571 noscale = .false.
572 sva( p ) = aapp*( aaqq*skl )
573 IF( goscale ) THEN
574 goscale = .false.
575 DO 2873 q = 1, p - 1
576 sva( q ) = sva( q )*skl
577 2873 CONTINUE
578 END IF
579 END IF
580 2874 CONTINUE
581 ELSE
582* the input matrix is M-by-N general dense
583 DO 3874 p = 1, n
584 aapp = zero
585 aaqq = one
586 CALL zlassq( m, a( 1, p ), 1, aapp, aaqq )
587 IF( aapp.GT.big ) THEN
588 info = -6
589 CALL xerbla( 'ZGESVJ', -info )
590 RETURN
591 END IF
592 aaqq = sqrt( aaqq )
593 IF( ( aapp.LT.( big / aaqq ) ) .AND. noscale ) THEN
594 sva( p ) = aapp*aaqq
595 ELSE
596 noscale = .false.
597 sva( p ) = aapp*( aaqq*skl )
598 IF( goscale ) THEN
599 goscale = .false.
600 DO 3873 q = 1, p - 1
601 sva( q ) = sva( q )*skl
602 3873 CONTINUE
603 END IF
604 END IF
605 3874 CONTINUE
606 END IF
607*
608 IF( noscale )skl = one
609*
610* Move the smaller part of the spectrum from the underflow threshold
611*(!) Start by determining the position of the nonzero entries of the
612* array SVA() relative to ( SFMIN, BIG ).
613*
614 aapp = zero
615 aaqq = big
616 DO 4781 p = 1, n
617 IF( sva( p ).NE.zero )aaqq = min( aaqq, sva( p ) )
618 aapp = max( aapp, sva( p ) )
619 4781 CONTINUE
620*
621* #:) Quick return for zero matrix
622*
623 IF( aapp.EQ.zero ) THEN
624 IF( lsvec )CALL zlaset( 'G', m, n, czero, cone, a, lda )
625 rwork( 1 ) = one
626 rwork( 2 ) = zero
627 rwork( 3 ) = zero
628 rwork( 4 ) = zero
629 rwork( 5 ) = zero
630 rwork( 6 ) = zero
631 RETURN
632 END IF
633*
634* #:) Quick return for one-column matrix
635*
636 IF( n.EQ.1 ) THEN
637 IF( lsvec )CALL zlascl( 'G', 0, 0, sva( 1 ), skl, m, 1,
638 $ a( 1, 1 ), lda, ierr )
639 rwork( 1 ) = one / skl
640 IF( sva( 1 ).GE.sfmin ) THEN
641 rwork( 2 ) = one
642 ELSE
643 rwork( 2 ) = zero
644 END IF
645 rwork( 3 ) = zero
646 rwork( 4 ) = zero
647 rwork( 5 ) = zero
648 rwork( 6 ) = zero
649 RETURN
650 END IF
651*
652* Protect small singular values from underflow, and try to
653* avoid underflows/overflows in computing Jacobi rotations.
654*
655 sn = sqrt( sfmin / epsln )
656 temp1 = sqrt( big / dble( n ) )
657 IF( ( aapp.LE.sn ) .OR. ( aaqq.GE.temp1 ) .OR.
658 $ ( ( sn.LE.aaqq ) .AND. ( aapp.LE.temp1 ) ) ) THEN
659 temp1 = min( big, temp1 / aapp )
660* AAQQ = AAQQ*TEMP1
661* AAPP = AAPP*TEMP1
662 ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.LE.temp1 ) ) THEN
663 temp1 = min( sn / aaqq, big / (aapp*sqrt( dble(n)) ) )
664* AAQQ = AAQQ*TEMP1
665* AAPP = AAPP*TEMP1
666 ELSE IF( ( aaqq.GE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
667 temp1 = max( sn / aaqq, temp1 / aapp )
668* AAQQ = AAQQ*TEMP1
669* AAPP = AAPP*TEMP1
670 ELSE IF( ( aaqq.LE.sn ) .AND. ( aapp.GE.temp1 ) ) THEN
671 temp1 = min( sn / aaqq, big / ( sqrt( dble( n ) )*aapp ) )
672* AAQQ = AAQQ*TEMP1
673* AAPP = AAPP*TEMP1
674 ELSE
675 temp1 = one
676 END IF
677*
678* Scale, if necessary
679*
680 IF( temp1.NE.one ) THEN
681 CALL dlascl( 'G', 0, 0, one, temp1, n, 1, sva, n, ierr )
682 END IF
683 skl = temp1*skl
684 IF( skl.NE.one ) THEN
685 CALL zlascl( joba, 0, 0, one, skl, m, n, a, lda, ierr )
686 skl = one / skl
687 END IF
688*
689* Row-cyclic Jacobi SVD algorithm with column pivoting
690*
691 emptsw = ( n*( n-1 ) ) / 2
692 notrot = 0
693
694 DO 1868 q = 1, n
695 cwork( q ) = cone
696 1868 CONTINUE
697*
698*
699*
700 swband = 3
701*[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
702* if ZGESVJ is used as a computational routine in the preconditioned
703* Jacobi SVD algorithm ZGEJSV. For sweeps i=1:SWBAND the procedure
704* works on pivots inside a band-like region around the diagonal.
705* The boundaries are determined dynamically, based on the number of
706* pivots above a threshold.
707*
708 kbl = min( 8, n )
709*[TP] KBL is a tuning parameter that defines the tile size in the
710* tiling of the p-q loops of pivot pairs. In general, an optimal
711* value of KBL depends on the matrix dimensions and on the
712* parameters of the computer's memory.
713*
714 nbl = n / kbl
715 IF( ( nbl*kbl ).NE.n )nbl = nbl + 1
716*
717 blskip = kbl**2
718*[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
719*
720 rowskip = min( 5, kbl )
721*[TP] ROWSKIP is a tuning parameter.
722*
723 lkahead = 1
724*[TP] LKAHEAD is a tuning parameter.
725*
726* Quasi block transformations, using the lower (upper) triangular
727* structure of the input matrix. The quasi-block-cycling usually
728* invokes cubic convergence. Big part of this cycle is done inside
729* canonical subspaces of dimensions less than M.
730*
731 IF( ( lower .OR. upper ) .AND. ( n.GT.max( 64, 4*kbl ) ) ) THEN
732*[TP] The number of partition levels and the actual partition are
733* tuning parameters.
734 n4 = n / 4
735 n2 = n / 2
736 n34 = 3*n4
737 IF( applv ) THEN
738 q = 0
739 ELSE
740 q = 1
741 END IF
742*
743 IF( lower ) THEN
744*
745* This works very well on lower triangular matrices, in particular
746* in the framework of the preconditioned Jacobi SVD (xGEJSV).
747* The idea is simple:
748* [+ 0 0 0] Note that Jacobi transformations of [0 0]
749* [+ + 0 0] [0 0]
750* [+ + x 0] actually work on [x 0] [x 0]
751* [+ + x x] [x x]. [x x]
752*
753 CALL zgsvj0( jobv, m-n34, n-n34, a( n34+1, n34+1 ), lda,
754 $ cwork( n34+1 ), sva( n34+1 ), mvl,
755 $ v( n34*q+1, n34+1 ), ldv, epsln, sfmin, tol,
756 $ 2, cwork( n+1 ), lwork-n, ierr )
757
758 CALL zgsvj0( jobv, m-n2, n34-n2, a( n2+1, n2+1 ), lda,
759 $ cwork( n2+1 ), sva( n2+1 ), mvl,
760 $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 2,
761 $ cwork( n+1 ), lwork-n, ierr )
762
763 CALL zgsvj1( jobv, m-n2, n-n2, n4, a( n2+1, n2+1 ), lda,
764 $ cwork( n2+1 ), sva( n2+1 ), mvl,
765 $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
766 $ cwork( n+1 ), lwork-n, ierr )
767
768 CALL zgsvj0( jobv, m-n4, n2-n4, a( n4+1, n4+1 ), lda,
769 $ cwork( n4+1 ), sva( n4+1 ), mvl,
770 $ v( n4*q+1, n4+1 ), ldv, epsln, sfmin, tol, 1,
771 $ cwork( n+1 ), lwork-n, ierr )
772*
773 CALL zgsvj0( jobv, m, n4, a, lda, cwork, sva, mvl, v, ldv,
774 $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
775 $ ierr )
776*
777 CALL zgsvj1( jobv, m, n2, n4, a, lda, cwork, sva, mvl, v,
778 $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
779 $ lwork-n, ierr )
780*
781*
782 ELSE IF( upper ) THEN
783*
784*
785 CALL zgsvj0( jobv, n4, n4, a, lda, cwork, sva, mvl, v, ldv,
786 $ epsln, sfmin, tol, 2, cwork( n+1 ), lwork-n,
787 $ ierr )
788*
789 CALL zgsvj0( jobv, n2, n4, a( 1, n4+1 ), lda, cwork( n4+1 ),
790 $ sva( n4+1 ), mvl, v( n4*q+1, n4+1 ), ldv,
791 $ epsln, sfmin, tol, 1, cwork( n+1 ), lwork-n,
792 $ ierr )
793*
794 CALL zgsvj1( jobv, n2, n2, n4, a, lda, cwork, sva, mvl, v,
795 $ ldv, epsln, sfmin, tol, 1, cwork( n+1 ),
796 $ lwork-n, ierr )
797*
798 CALL zgsvj0( jobv, n2+n4, n4, a( 1, n2+1 ), lda,
799 $ cwork( n2+1 ), sva( n2+1 ), mvl,
800 $ v( n2*q+1, n2+1 ), ldv, epsln, sfmin, tol, 1,
801 $ cwork( n+1 ), lwork-n, ierr )
802
803 END IF
804*
805 END IF
806*
807* .. Row-cyclic pivot strategy with de Rijk's pivoting ..
808*
809 DO 1993 i = 1, nsweep
810*
811* .. go go go ...
812*
813 mxaapq = zero
814 mxsinj = zero
815 iswrot = 0
816*
817 notrot = 0
818 pskipped = 0
819*
820* Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
821* 1 <= p < q <= N. This is the first step toward a blocked implementation
822* of the rotations. New implementation, based on block transformations,
823* is under development.
824*
825 DO 2000 ibr = 1, nbl
826*
827 igl = ( ibr-1 )*kbl + 1
828*
829 DO 1002 ir1 = 0, min( lkahead, nbl-ibr )
830*
831 igl = igl + ir1*kbl
832*
833 DO 2001 p = igl, min( igl+kbl-1, n-1 )
834*
835* .. de Rijk's pivoting
836*
837 q = idamax( n-p+1, sva( p ), 1 ) + p - 1
838 IF( p.NE.q ) THEN
839 CALL zswap( m, a( 1, p ), 1, a( 1, q ), 1 )
840 IF( rsvec )CALL zswap( mvl, v( 1, p ), 1,
841 $ v( 1, q ), 1 )
842 temp1 = sva( p )
843 sva( p ) = sva( q )
844 sva( q ) = temp1
845 aapq = cwork(p)
846 cwork(p) = cwork(q)
847 cwork(q) = aapq
848 END IF
849*
850 IF( ir1.EQ.0 ) THEN
851*
852* Column norms are periodically updated by explicit
853* norm computation.
854*[!] Caveat:
855* Unfortunately, some BLAS implementations compute DZNRM2(M,A(1,p),1)
856* as SQRT(S=CDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to
857* overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
858* underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
859* Hence, DZNRM2 cannot be trusted, not even in the case when
860* the true norm is far from the under(over)flow boundaries.
861* If properly implemented SCNRM2 is available, the IF-THEN-ELSE-END IF
862* below should be replaced with "AAPP = DZNRM2( M, A(1,p), 1 )".
863*
864 IF( ( sva( p ).LT.rootbig ) .AND.
865 $ ( sva( p ).GT.rootsfmin ) ) THEN
866 sva( p ) = dznrm2( m, a( 1, p ), 1 )
867 ELSE
868 temp1 = zero
869 aapp = one
870 CALL zlassq( m, a( 1, p ), 1, temp1, aapp )
871 sva( p ) = temp1*sqrt( aapp )
872 END IF
873 aapp = sva( p )
874 ELSE
875 aapp = sva( p )
876 END IF
877*
878 IF( aapp.GT.zero ) THEN
879*
880 pskipped = 0
881*
882 DO 2002 q = p + 1, min( igl+kbl-1, n )
883*
884 aaqq = sva( q )
885*
886 IF( aaqq.GT.zero ) THEN
887*
888 aapp0 = aapp
889 IF( aaqq.GE.one ) THEN
890 rotok = ( small*aapp ).LE.aaqq
891 IF( aapp.LT.( big / aaqq ) ) THEN
892 aapq = ( zdotc( m, a( 1, p ), 1,
893 $ a( 1, q ), 1 ) / aaqq ) / aapp
894 ELSE
895 CALL zcopy( m, a( 1, p ), 1,
896 $ cwork(n+1), 1 )
897 CALL zlascl( 'G', 0, 0, aapp, one,
898 $ m, 1, cwork(n+1), lda, ierr )
899 aapq = zdotc( m, cwork(n+1), 1,
900 $ a( 1, q ), 1 ) / aaqq
901 END IF
902 ELSE
903 rotok = aapp.LE.( aaqq / small )
904 IF( aapp.GT.( small / aaqq ) ) THEN
905 aapq = ( zdotc( m, a( 1, p ), 1,
906 $ a( 1, q ), 1 ) / aapp ) / aaqq
907 ELSE
908 CALL zcopy( m, a( 1, q ), 1,
909 $ cwork(n+1), 1 )
910 CALL zlascl( 'G', 0, 0, aaqq,
911 $ one, m, 1,
912 $ cwork(n+1), lda, ierr )
913 aapq = zdotc( m, a(1, p ), 1,
914 $ cwork(n+1), 1 ) / aapp
915 END IF
916 END IF
917*
918
919* AAPQ = AAPQ * CONJG( CWORK(p) ) * CWORK(q)
920 aapq1 = -abs(aapq)
921 mxaapq = max( mxaapq, -aapq1 )
922*
923* TO rotate or NOT to rotate, THAT is the question ...
924*
925 IF( abs( aapq1 ).GT.tol ) THEN
926 ompq = aapq / abs(aapq)
927*
928* .. rotate
929*[RTD] ROTATED = ROTATED + ONE
930*
931 IF( ir1.EQ.0 ) THEN
932 notrot = 0
933 pskipped = 0
934 iswrot = iswrot + 1
935 END IF
936*
937 IF( rotok ) THEN
938*
939 aqoap = aaqq / aapp
940 apoaq = aapp / aaqq
941 theta = -half*abs( aqoap-apoaq )/aapq1
942*
943 IF( abs( theta ).GT.bigtheta ) THEN
944*
945 t = half / theta
946 cs = one
947
948 CALL zrot( m, a(1,p), 1, a(1,q), 1,
949 $ cs, conjg(ompq)*t )
950 IF ( rsvec ) THEN
951 CALL zrot( mvl, v(1,p), 1,
952 $ v(1,q), 1, cs, conjg(ompq)*t )
953 END IF
954
955 sva( q ) = aaqq*sqrt( max( zero,
956 $ one+t*apoaq*aapq1 ) )
957 aapp = aapp*sqrt( max( zero,
958 $ one-t*aqoap*aapq1 ) )
959 mxsinj = max( mxsinj, abs( t ) )
960*
961 ELSE
962*
963* .. choose correct signum for THETA and rotate
964*
965 thsign = -sign( one, aapq1 )
966 t = one / ( theta+thsign*
967 $ sqrt( one+theta*theta ) )
968 cs = sqrt( one / ( one+t*t ) )
969 sn = t*cs
970*
971 mxsinj = max( mxsinj, abs( sn ) )
972 sva( q ) = aaqq*sqrt( max( zero,
973 $ one+t*apoaq*aapq1 ) )
974 aapp = aapp*sqrt( max( zero,
975 $ one-t*aqoap*aapq1 ) )
976*
977 CALL zrot( m, a(1,p), 1, a(1,q), 1,
978 $ cs, conjg(ompq)*sn )
979 IF ( rsvec ) THEN
980 CALL zrot( mvl, v(1,p), 1,
981 $ v(1,q), 1, cs, conjg(ompq)*sn )
982 END IF
983 END IF
984 cwork(p) = -cwork(q) * ompq
985*
986 ELSE
987* .. have to use modified Gram-Schmidt like transformation
988 CALL zcopy( m, a( 1, p ), 1,
989 $ cwork(n+1), 1 )
990 CALL zlascl( 'G', 0, 0, aapp, one, m,
991 $ 1, cwork(n+1), lda,
992 $ ierr )
993 CALL zlascl( 'G', 0, 0, aaqq, one, m,
994 $ 1, a( 1, q ), lda, ierr )
995 CALL zaxpy( m, -aapq, cwork(n+1), 1,
996 $ a( 1, q ), 1 )
997 CALL zlascl( 'G', 0, 0, one, aaqq, m,
998 $ 1, a( 1, q ), lda, ierr )
999 sva( q ) = aaqq*sqrt( max( zero,
1000 $ one-aapq1*aapq1 ) )
1001 mxsinj = max( mxsinj, sfmin )
1002 END IF
1003* END IF ROTOK THEN ... ELSE
1004*
1005* In the case of cancellation in updating SVA(q), SVA(p)
1006* recompute SVA(q), SVA(p).
1007*
1008 IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1009 $ THEN
1010 IF( ( aaqq.LT.rootbig ) .AND.
1011 $ ( aaqq.GT.rootsfmin ) ) THEN
1012 sva( q ) = dznrm2( m, a( 1, q ), 1 )
1013 ELSE
1014 t = zero
1015 aaqq = one
1016 CALL zlassq( m, a( 1, q ), 1, t,
1017 $ aaqq )
1018 sva( q ) = t*sqrt( aaqq )
1019 END IF
1020 END IF
1021 IF( ( aapp / aapp0 ).LE.rooteps ) THEN
1022 IF( ( aapp.LT.rootbig ) .AND.
1023 $ ( aapp.GT.rootsfmin ) ) THEN
1024 aapp = dznrm2( m, a( 1, p ), 1 )
1025 ELSE
1026 t = zero
1027 aapp = one
1028 CALL zlassq( m, a( 1, p ), 1, t,
1029 $ aapp )
1030 aapp = t*sqrt( aapp )
1031 END IF
1032 sva( p ) = aapp
1033 END IF
1034*
1035 ELSE
1036* A(:,p) and A(:,q) already numerically orthogonal
1037 IF( ir1.EQ.0 )notrot = notrot + 1
1038*[RTD] SKIPPED = SKIPPED + 1
1039 pskipped = pskipped + 1
1040 END IF
1041 ELSE
1042* A(:,q) is zero column
1043 IF( ir1.EQ.0 )notrot = notrot + 1
1044 pskipped = pskipped + 1
1045 END IF
1046*
1047 IF( ( i.LE.swband ) .AND.
1048 $ ( pskipped.GT.rowskip ) ) THEN
1049 IF( ir1.EQ.0 )aapp = -aapp
1050 notrot = 0
1051 GO TO 2103
1052 END IF
1053*
1054 2002 CONTINUE
1055* END q-LOOP
1056*
1057 2103 CONTINUE
1058* bailed out of q-loop
1059*
1060 sva( p ) = aapp
1061*
1062 ELSE
1063 sva( p ) = aapp
1064 IF( ( ir1.EQ.0 ) .AND. ( aapp.EQ.zero ) )
1065 $ notrot = notrot + min( igl+kbl-1, n ) - p
1066 END IF
1067*
1068 2001 CONTINUE
1069* end of the p-loop
1070* end of doing the block ( ibr, ibr )
1071 1002 CONTINUE
1072* end of ir1-loop
1073*
1074* ... go to the off diagonal blocks
1075*
1076 igl = ( ibr-1 )*kbl + 1
1077*
1078 DO 2010 jbc = ibr + 1, nbl
1079*
1080 jgl = ( jbc-1 )*kbl + 1
1081*
1082* doing the block at ( ibr, jbc )
1083*
1084 ijblsk = 0
1085 DO 2100 p = igl, min( igl+kbl-1, n )
1086*
1087 aapp = sva( p )
1088 IF( aapp.GT.zero ) THEN
1089*
1090 pskipped = 0
1091*
1092 DO 2200 q = jgl, min( jgl+kbl-1, n )
1093*
1094 aaqq = sva( q )
1095 IF( aaqq.GT.zero ) THEN
1096 aapp0 = aapp
1097*
1098* .. M x 2 Jacobi SVD ..
1099*
1100* Safe Gram matrix computation
1101*
1102 IF( aaqq.GE.one ) THEN
1103 IF( aapp.GE.aaqq ) THEN
1104 rotok = ( small*aapp ).LE.aaqq
1105 ELSE
1106 rotok = ( small*aaqq ).LE.aapp
1107 END IF
1108 IF( aapp.LT.( big / aaqq ) ) THEN
1109 aapq = ( zdotc( m, a( 1, p ), 1,
1110 $ a( 1, q ), 1 ) / aaqq ) / aapp
1111 ELSE
1112 CALL zcopy( m, a( 1, p ), 1,
1113 $ cwork(n+1), 1 )
1114 CALL zlascl( 'G', 0, 0, aapp,
1115 $ one, m, 1,
1116 $ cwork(n+1), lda, ierr )
1117 aapq = zdotc( m, cwork(n+1), 1,
1118 $ a( 1, q ), 1 ) / aaqq
1119 END IF
1120 ELSE
1121 IF( aapp.GE.aaqq ) THEN
1122 rotok = aapp.LE.( aaqq / small )
1123 ELSE
1124 rotok = aaqq.LE.( aapp / small )
1125 END IF
1126 IF( aapp.GT.( small / aaqq ) ) THEN
1127 aapq = ( zdotc( m, a( 1, p ), 1,
1128 $ a( 1, q ), 1 ) / max(aaqq,aapp) )
1129 $ / min(aaqq,aapp)
1130 ELSE
1131 CALL zcopy( m, a( 1, q ), 1,
1132 $ cwork(n+1), 1 )
1133 CALL zlascl( 'G', 0, 0, aaqq,
1134 $ one, m, 1,
1135 $ cwork(n+1), lda, ierr )
1136 aapq = zdotc( m, a( 1, p ), 1,
1137 $ cwork(n+1), 1 ) / aapp
1138 END IF
1139 END IF
1140*
1141
1142* AAPQ = AAPQ * CONJG(CWORK(p))*CWORK(q)
1143 aapq1 = -abs(aapq)
1144 mxaapq = max( mxaapq, -aapq1 )
1145*
1146* TO rotate or NOT to rotate, THAT is the question ...
1147*
1148 IF( abs( aapq1 ).GT.tol ) THEN
1149 ompq = aapq / abs(aapq)
1150 notrot = 0
1151*[RTD] ROTATED = ROTATED + 1
1152 pskipped = 0
1153 iswrot = iswrot + 1
1154*
1155 IF( rotok ) THEN
1156*
1157 aqoap = aaqq / aapp
1158 apoaq = aapp / aaqq
1159 theta = -half*abs( aqoap-apoaq )/ aapq1
1160 IF( aaqq.GT.aapp0 )theta = -theta
1161*
1162 IF( abs( theta ).GT.bigtheta ) THEN
1163 t = half / theta
1164 cs = one
1165 CALL zrot( m, a(1,p), 1, a(1,q), 1,
1166 $ cs, conjg(ompq)*t )
1167 IF( rsvec ) THEN
1168 CALL zrot( mvl, v(1,p), 1,
1169 $ v(1,q), 1, cs, conjg(ompq)*t )
1170 END IF
1171 sva( q ) = aaqq*sqrt( max( zero,
1172 $ one+t*apoaq*aapq1 ) )
1173 aapp = aapp*sqrt( max( zero,
1174 $ one-t*aqoap*aapq1 ) )
1175 mxsinj = max( mxsinj, abs( t ) )
1176 ELSE
1177*
1178* .. choose correct signum for THETA and rotate
1179*
1180 thsign = -sign( one, aapq1 )
1181 IF( aaqq.GT.aapp0 )thsign = -thsign
1182 t = one / ( theta+thsign*
1183 $ sqrt( one+theta*theta ) )
1184 cs = sqrt( one / ( one+t*t ) )
1185 sn = t*cs
1186 mxsinj = max( mxsinj, abs( sn ) )
1187 sva( q ) = aaqq*sqrt( max( zero,
1188 $ one+t*apoaq*aapq1 ) )
1189 aapp = aapp*sqrt( max( zero,
1190 $ one-t*aqoap*aapq1 ) )
1191*
1192 CALL zrot( m, a(1,p), 1, a(1,q), 1,
1193 $ cs, conjg(ompq)*sn )
1194 IF( rsvec ) THEN
1195 CALL zrot( mvl, v(1,p), 1,
1196 $ v(1,q), 1, cs, conjg(ompq)*sn )
1197 END IF
1198 END IF
1199 cwork(p) = -cwork(q) * ompq
1200*
1201 ELSE
1202* .. have to use modified Gram-Schmidt like transformation
1203 IF( aapp.GT.aaqq ) THEN
1204 CALL zcopy( m, a( 1, p ), 1,
1205 $ cwork(n+1), 1 )
1206 CALL zlascl( 'G', 0, 0, aapp, one,
1207 $ m, 1, cwork(n+1),lda,
1208 $ ierr )
1209 CALL zlascl( 'G', 0, 0, aaqq, one,
1210 $ m, 1, a( 1, q ), lda,
1211 $ ierr )
1212 CALL zaxpy( m, -aapq, cwork(n+1),
1213 $ 1, a( 1, q ), 1 )
1214 CALL zlascl( 'G', 0, 0, one, aaqq,
1215 $ m, 1, a( 1, q ), lda,
1216 $ ierr )
1217 sva( q ) = aaqq*sqrt( max( zero,
1218 $ one-aapq1*aapq1 ) )
1219 mxsinj = max( mxsinj, sfmin )
1220 ELSE
1221 CALL zcopy( m, a( 1, q ), 1,
1222 $ cwork(n+1), 1 )
1223 CALL zlascl( 'G', 0, 0, aaqq, one,
1224 $ m, 1, cwork(n+1),lda,
1225 $ ierr )
1226 CALL zlascl( 'G', 0, 0, aapp, one,
1227 $ m, 1, a( 1, p ), lda,
1228 $ ierr )
1229 CALL zaxpy( m, -conjg(aapq),
1230 $ cwork(n+1), 1, a( 1, p ), 1 )
1231 CALL zlascl( 'G', 0, 0, one, aapp,
1232 $ m, 1, a( 1, p ), lda,
1233 $ ierr )
1234 sva( p ) = aapp*sqrt( max( zero,
1235 $ one-aapq1*aapq1 ) )
1236 mxsinj = max( mxsinj, sfmin )
1237 END IF
1238 END IF
1239* END IF ROTOK THEN ... ELSE
1240*
1241* In the case of cancellation in updating SVA(q), SVA(p)
1242* .. recompute SVA(q), SVA(p)
1243 IF( ( sva( q ) / aaqq )**2.LE.rooteps )
1244 $ THEN
1245 IF( ( aaqq.LT.rootbig ) .AND.
1246 $ ( aaqq.GT.rootsfmin ) ) THEN
1247 sva( q ) = dznrm2( m, a( 1, q ), 1)
1248 ELSE
1249 t = zero
1250 aaqq = one
1251 CALL zlassq( m, a( 1, q ), 1, t,
1252 $ aaqq )
1253 sva( q ) = t*sqrt( aaqq )
1254 END IF
1255 END IF
1256 IF( ( aapp / aapp0 )**2.LE.rooteps ) THEN
1257 IF( ( aapp.LT.rootbig ) .AND.
1258 $ ( aapp.GT.rootsfmin ) ) THEN
1259 aapp = dznrm2( m, a( 1, p ), 1 )
1260 ELSE
1261 t = zero
1262 aapp = one
1263 CALL zlassq( m, a( 1, p ), 1, t,
1264 $ aapp )
1265 aapp = t*sqrt( aapp )
1266 END IF
1267 sva( p ) = aapp
1268 END IF
1269* end of OK rotation
1270 ELSE
1271 notrot = notrot + 1
1272*[RTD] SKIPPED = SKIPPED + 1
1273 pskipped = pskipped + 1
1274 ijblsk = ijblsk + 1
1275 END IF
1276 ELSE
1277 notrot = notrot + 1
1278 pskipped = pskipped + 1
1279 ijblsk = ijblsk + 1
1280 END IF
1281*
1282 IF( ( i.LE.swband ) .AND. ( ijblsk.GE.blskip ) )
1283 $ THEN
1284 sva( p ) = aapp
1285 notrot = 0
1286 GO TO 2011
1287 END IF
1288 IF( ( i.LE.swband ) .AND.
1289 $ ( pskipped.GT.rowskip ) ) THEN
1290 aapp = -aapp
1291 notrot = 0
1292 GO TO 2203
1293 END IF
1294*
1295 2200 CONTINUE
1296* end of the q-loop
1297 2203 CONTINUE
1298*
1299 sva( p ) = aapp
1300*
1301 ELSE
1302*
1303 IF( aapp.EQ.zero )notrot = notrot +
1304 $ min( jgl+kbl-1, n ) - jgl + 1
1305 IF( aapp.LT.zero )notrot = 0
1306*
1307 END IF
1308*
1309 2100 CONTINUE
1310* end of the p-loop
1311 2010 CONTINUE
1312* end of the jbc-loop
1313 2011 CONTINUE
1314*2011 bailed out of the jbc-loop
1315 DO 2012 p = igl, min( igl+kbl-1, n )
1316 sva( p ) = abs( sva( p ) )
1317 2012 CONTINUE
1318***
1319 2000 CONTINUE
1320*2000 :: end of the ibr-loop
1321*
1322* .. update SVA(N)
1323 IF( ( sva( n ).LT.rootbig ) .AND. ( sva( n ).GT.rootsfmin ) )
1324 $ THEN
1325 sva( n ) = dznrm2( m, a( 1, n ), 1 )
1326 ELSE
1327 t = zero
1328 aapp = one
1329 CALL zlassq( m, a( 1, n ), 1, t, aapp )
1330 sva( n ) = t*sqrt( aapp )
1331 END IF
1332*
1333* Additional steering devices
1334*
1335 IF( ( i.LT.swband ) .AND. ( ( mxaapq.LE.roottol ) .OR.
1336 $ ( iswrot.LE.n ) ) )swband = i
1337*
1338 IF( ( i.GT.swband+1 ) .AND. ( mxaapq.LT.sqrt( dble( n ) )*
1339 $ tol ) .AND. ( dble( n )*mxaapq*mxsinj.LT.tol ) ) THEN
1340 GO TO 1994
1341 END IF
1342*
1343 IF( notrot.GE.emptsw )GO TO 1994
1344*
1345 1993 CONTINUE
1346* end i=1:NSWEEP loop
1347*
1348* #:( Reaching this point means that the procedure has not converged.
1349 info = nsweep - 1
1350 GO TO 1995
1351*
1352 1994 CONTINUE
1353* #:) Reaching this point means numerical convergence after the i-th
1354* sweep.
1355*
1356 info = 0
1357* #:) INFO = 0 confirms successful iterations.
1358 1995 CONTINUE
1359*
1360* Sort the singular values and find how many are above
1361* the underflow threshold.
1362*
1363 n2 = 0
1364 n4 = 0
1365 DO 5991 p = 1, n - 1
1366 q = idamax( n-p+1, sva( p ), 1 ) + p - 1
1367 IF( p.NE.q ) THEN
1368 temp1 = sva( p )
1369 sva( p ) = sva( q )
1370 sva( q ) = temp1
1371 CALL zswap( m, a( 1, p ), 1, a( 1, q ), 1 )
1372 IF( rsvec )CALL zswap( mvl, v( 1, p ), 1, v( 1, q ), 1 )
1373 END IF
1374 IF( sva( p ).NE.zero ) THEN
1375 n4 = n4 + 1
1376 IF( sva( p )*skl.GT.sfmin )n2 = n2 + 1
1377 END IF
1378 5991 CONTINUE
1379 IF( sva( n ).NE.zero ) THEN
1380 n4 = n4 + 1
1381 IF( sva( n )*skl.GT.sfmin )n2 = n2 + 1
1382 END IF
1383*
1384* Normalize the left singular vectors.
1385*
1386 IF( lsvec .OR. uctol ) THEN
1387 DO 1998 p = 1, n4
1388* CALL ZDSCAL( M, ONE / SVA( p ), A( 1, p ), 1 )
1389 CALL zlascl( 'G',0,0, sva(p), one, m, 1, a(1,p), m, ierr )
1390 1998 CONTINUE
1391 END IF
1392*
1393* Scale the product of Jacobi rotations.
1394*
1395 IF( rsvec ) THEN
1396 DO 2399 p = 1, n
1397 temp1 = one / dznrm2( mvl, v( 1, p ), 1 )
1398 CALL zdscal( mvl, temp1, v( 1, p ), 1 )
1399 2399 CONTINUE
1400 END IF
1401*
1402* Undo scaling, if necessary (and possible).
1403 IF( ( ( skl.GT.one ) .AND. ( sva( 1 ).LT.( big / skl ) ) )
1404 $ .OR. ( ( skl.LT.one ) .AND. ( sva( max( n2, 1 ) ) .GT.
1405 $ ( sfmin / skl ) ) ) ) THEN
1406 DO 2400 p = 1, n
1407 sva( p ) = skl*sva( p )
1408 2400 CONTINUE
1409 skl = one
1410 END IF
1411*
1412 rwork( 1 ) = skl
1413* The singular values of A are SKL*SVA(1:N). If SKL.NE.ONE
1414* then some of the singular values may overflow or underflow and
1415* the spectrum is given in this factored representation.
1416*
1417 rwork( 2 ) = dble( n4 )
1418* N4 is the number of computed nonzero singular values of A.
1419*
1420 rwork( 3 ) = dble( n2 )
1421* N2 is the number of singular values of A greater than SFMIN.
1422* If N2<N, SVA(N2:N) contains ZEROS and/or denormalized numbers
1423* that may carry some information.
1424*
1425 rwork( 4 ) = dble( i )
1426* i is the index of the last sweep before declaring convergence.
1427*
1428 rwork( 5 ) = mxaapq
1429* MXAAPQ is the largest absolute value of scaled pivots in the
1430* last sweep
1431*
1432 rwork( 6 ) = mxsinj
1433* MXSINJ is the largest absolute value of the sines of Jacobi angles
1434* in the last sweep
1435*
1436 RETURN
1437* ..
1438* .. END OF ZGESVJ
1439* ..
subroutine xerbla(srname, info)
Definition cblat2.f:3285
subroutine zaxpy(n, za, zx, incx, zy, incy)
ZAXPY
Definition zaxpy.f:88
subroutine zcopy(n, zx, incx, zy, incy)
ZCOPY
Definition zcopy.f:81
complex *16 function zdotc(n, zx, incx, zy, incy)
ZDOTC
Definition zdotc.f:83
subroutine zgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work, lwork, info)
ZGSVJ0 pre-processor for the routine zgesvj.
Definition zgsvj0.f:218
subroutine zgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work, lwork, info)
ZGSVJ1 pre-processor for the routine zgesvj, applies Jacobi rotations targeting only particular pivot...
Definition zgsvj1.f:236
integer function idamax(n, dx, incx)
IDAMAX
Definition idamax.f:71
double precision function dlamch(cmach)
DLAMCH
Definition dlamch.f:69
subroutine zlascl(type, kl, ku, cfrom, cto, m, n, a, lda, info)
ZLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition zlascl.f:143
subroutine dlascl(type, kl, ku, cfrom, cto, m, n, a, lda, info)
DLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition dlascl.f:143
subroutine zlaset(uplo, m, n, alpha, beta, a, lda)
ZLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values.
Definition zlaset.f:106
subroutine zlassq(n, x, incx, scale, sumsq)
ZLASSQ updates a sum of squares represented in scaled form.
Definition zlassq.f90:124
logical function lsame(ca, cb)
LSAME
Definition lsame.f:48
real(wp) function dznrm2(n, x, incx)
DZNRM2
Definition dznrm2.f90:90
subroutine zrot(n, cx, incx, cy, incy, c, s)
ZROT applies a plane rotation with real cosine and complex sine to a pair of complex vectors.
Definition zrot.f:103
subroutine zdscal(n, da, zx, incx)
ZDSCAL
Definition zdscal.f:78
subroutine zswap(n, zx, incx, zy, incy)
ZSWAP
Definition zswap.f:81
Here is the call graph for this function:
Here is the caller graph for this function: