LAPACK  3.6.1 LAPACK: Linear Algebra PACKage
 subroutine sgejsv ( character*1 JOBA, character*1 JOBU, character*1 JOBV, character*1 JOBR, character*1 JOBT, character*1 JOBP, integer M, integer N, real, dimension( lda, * ) A, integer LDA, real, dimension( n ) SVA, real, dimension( ldu, * ) U, integer LDU, real, dimension( ldv, * ) V, integer LDV, real, dimension( lwork ) WORK, integer LWORK, integer, dimension( * ) IWORK, integer INFO )

SGEJSV

Download SGEJSV + dependencies [TGZ] [ZIP] [TXT]

Purpose:
``` SGEJSV computes the singular value decomposition (SVD) of a real M-by-N
matrix [A], where M >= N. The SVD of [A] is written as

[A] = [U] * [SIGMA] * [V]^t,

where [SIGMA] is an N-by-N (M-by-N) matrix which is zero except for its N
diagonal elements, [U] is an M-by-N (or M-by-M) orthonormal matrix, and
[V] is an N-by-N orthogonal matrix. The diagonal elements of [SIGMA] are
the singular values of [A]. The columns of [U] and [V] are the left and
the right singular vectors of [A], respectively. The matrices [U] and [V]
are computed and stored in the arrays U and V, respectively. The diagonal
of [SIGMA] is computed and stored in the array SVA.
SGEJSV can sometimes compute tiny singular values and their singular vectors much
more accurately than other SVD routines, see below under Further Details.```
Parameters
 [in] JOBA ``` JOBA is CHARACTER*1 Specifies the level of accuracy: = 'C': This option works well (high relative accuracy) if A = B * D, with well-conditioned B and arbitrary diagonal matrix D. The accuracy cannot be spoiled by COLUMN scaling. The accuracy of the computed output depends on the condition of B, and the procedure aims at the best theoretical accuracy. The relative error max_{i=1:N}|d sigma_i| / sigma_i is bounded by f(M,N)*epsilon* cond(B), independent of D. The input matrix is preprocessed with the QRF with column pivoting. This initial preprocessing and preconditioning by a rank revealing QR factorization is common for all values of JOBA. Additional actions are specified as follows: = 'E': Computation as with 'C' with an additional estimate of the condition number of B. It provides a realistic error bound. = 'F': If A = D1 * C * D2 with ill-conditioned diagonal scalings D1, D2, and well-conditioned matrix C, this option gives higher accuracy than the 'C' option. If the structure of the input matrix is not known, and relative accuracy is desirable, then this option is advisable. The input matrix A is preprocessed with QR factorization with FULL (row and column) pivoting. = 'G' Computation as with 'F' with an additional estimate of the condition number of B, where A=D*B. If A has heavily weighted rows, then using this condition number gives too pessimistic error bound. = 'A': Small singular values are the noise and the matrix is treated as numerically rank defficient. The error in the computed singular values is bounded by f(m,n)*epsilon*||A||. The computed SVD A = U * S * V^t restores A up to f(m,n)*epsilon*||A||. This gives the procedure the licence to discard (set to zero) all singular values below N*epsilon*||A||. = 'R': Similar as in 'A'. Rank revealing property of the initial QR factorization is used do reveal (using triangular factor) a gap sigma_{r+1} < epsilon * sigma_r in which case the numerical RANK is declared to be r. The SVD is computed with absolute error bounds, but more accurately than with 'A'.``` [in] JOBU ``` JOBU is CHARACTER*1 Specifies whether to compute the columns of U: = 'U': N columns of U are returned in the array U. = 'F': full set of M left sing. vectors is returned in the array U. = 'W': U may be used as workspace of length M*N. See the description of U. = 'N': U is not computed.``` [in] JOBV ``` JOBV is CHARACTER*1 Specifies whether to compute the matrix V: = 'V': N columns of V are returned in the array V; Jacobi rotations are not explicitly accumulated. = 'J': N columns of V are returned in the array V, but they are computed as the product of Jacobi rotations. This option is allowed only if JOBU .NE. 'N', i.e. in computing the full SVD. = 'W': V may be used as workspace of length N*N. See the description of V. = 'N': V is not computed.``` [in] JOBR ``` JOBR is CHARACTER*1 Specifies the RANGE for the singular values. Issues the licence to set to zero small positive singular values if they are outside specified range. If A .NE. 0 is scaled so that the largest singular value of c*A is around SQRT(BIG), BIG=SLAMCH('O'), then JOBR issues the licence to kill columns of A whose norm in c*A is less than SQRT(SFMIN) (for JOBR.EQ.'R'), or less than SMALL=SFMIN/EPSLN, where SFMIN=SLAMCH('S'), EPSLN=SLAMCH('E'). = 'N': Do not kill small columns of c*A. This option assumes that BLAS and QR factorizations and triangular solvers are implemented to work in that range. If the condition of A is greater than BIG, use SGESVJ. = 'R': RESTRICTED range for sigma(c*A) is [SQRT(SFMIN), SQRT(BIG)] (roughly, as described above). This option is recommended. =========================== For computing the singular values in the FULL range [SFMIN,BIG] use SGESVJ.``` [in] JOBT ``` JOBT is CHARACTER*1 If the matrix is square then the procedure may determine to use transposed A if A^t seems to be better with respect to convergence. If the matrix is not square, JOBT is ignored. This is subject to changes in the future. The decision is based on two values of entropy over the adjoint orbit of A^t * A. See the descriptions of WORK(6) and WORK(7). = 'T': transpose if entropy test indicates possibly faster convergence of Jacobi process if A^t is taken as input. If A is replaced with A^t, then the row pivoting is included automatically. = 'N': do not speculate. This option can be used to compute only the singular values, or the full SVD (U, SIGMA and V). For only one set of singular vectors (U or V), the caller should provide both U and V, as one of the matrices is used as workspace if the matrix A is transposed. The implementer can easily remove this constraint and make the code more complicated. See the descriptions of U and V.``` [in] JOBP ``` JOBP is CHARACTER*1 Issues the licence to introduce structured perturbations to drown denormalized numbers. This licence should be active if the denormals are poorly implemented, causing slow computation, especially in cases of fast convergence (!). For details see [1,2]. For the sake of simplicity, this perturbations are included only when the full SVD or only the singular values are requested. The implementer/user can easily add the perturbation for the cases of computing one set of singular vectors. = 'P': introduce perturbation = 'N': do not perturb``` [in] M ``` M is INTEGER The number of rows of the input matrix A. M >= 0.``` [in] N ``` N is INTEGER The number of columns of the input matrix A. M >= N >= 0.``` [in,out] A ``` A is REAL array, dimension (LDA,N) On entry, the M-by-N matrix A.``` [in] LDA ``` LDA is INTEGER The leading dimension of the array A. LDA >= max(1,M).``` [out] SVA ``` SVA is REAL array, dimension (N) On exit, - For WORK(1)/WORK(2) = ONE: The singular values of A. During the computation SVA contains Euclidean column norms of the iterated matrices in the array A. - For WORK(1) .NE. WORK(2): The singular values of A are (WORK(1)/WORK(2)) * SVA(1:N). This factored form is used if sigma_max(A) overflows or if small singular values have been saved from underflow by scaling the input matrix A. - If JOBR='R' then some of the singular values may be returned as exact zeros obtained by "set to zero" because they are below the numerical rank threshold or are denormalized numbers.``` [out] U ``` U is REAL array, dimension ( LDU, N ) If JOBU = 'U', then U contains on exit the M-by-N matrix of the left singular vectors. If JOBU = 'F', then U contains on exit the M-by-M matrix of the left singular vectors, including an ONB of the orthogonal complement of the Range(A). If JOBU = 'W' .AND. (JOBV.EQ.'V' .AND. JOBT.EQ.'T' .AND. M.EQ.N), then U is used as workspace if the procedure replaces A with A^t. In that case, [V] is computed in U as left singular vectors of A^t and then copied back to the V array. This 'W' option is just a reminder to the caller that in this case U is reserved as workspace of length N*N. If JOBU = 'N' U is not referenced, unless JOBT='T'.``` [in] LDU ``` LDU is INTEGER The leading dimension of the array U, LDU >= 1. IF JOBU = 'U' or 'F' or 'W', then LDU >= M.``` [out] V ``` V is REAL array, dimension ( LDV, N ) If JOBV = 'V', 'J' then V contains on exit the N-by-N matrix of the right singular vectors; If JOBV = 'W', AND (JOBU.EQ.'U' AND JOBT.EQ.'T' AND M.EQ.N), then V is used as workspace if the pprocedure replaces A with A^t. In that case, [U] is computed in V as right singular vectors of A^t and then copied back to the U array. This 'W' option is just a reminder to the caller that in this case V is reserved as workspace of length N*N. If JOBV = 'N' V is not referenced, unless JOBT='T'.``` [in] LDV ``` LDV is INTEGER The leading dimension of the array V, LDV >= 1. If JOBV = 'V' or 'J' or 'W', then LDV >= N.``` [out] WORK ``` WORK is REAL array, dimension at least LWORK. On exit, WORK(1) = SCALE = WORK(2) / WORK(1) is the scaling factor such that SCALE*SVA(1:N) are the computed singular values of A. (See the description of SVA().) WORK(2) = See the description of WORK(1). WORK(3) = SCONDA is an estimate for the condition number of column equilibrated A. (If JOBA .EQ. 'E' or 'G') SCONDA is an estimate of SQRT(||(R^t * R)^(-1)||_1). It is computed using SPOCON. It holds N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA where R is the triangular factor from the QRF of A. However, if R is truncated and the numerical rank is determined to be strictly smaller than N, SCONDA is returned as -1, thus indicating that the smallest singular values might be lost. If full SVD is needed, the following two condition numbers are useful for the analysis of the algorithm. They are provied for a developer/implementer who is familiar with the details of the method. WORK(4) = an estimate of the scaled condition number of the triangular factor in the first QR factorization. WORK(5) = an estimate of the scaled condition number of the triangular factor in the second QR factorization. The following two parameters are computed if JOBT .EQ. 'T'. They are provided for a developer/implementer who is familiar with the details of the method. WORK(6) = the entropy of A^t*A :: this is the Shannon entropy of diag(A^t*A) / Trace(A^t*A) taken as point in the probability simplex. WORK(7) = the entropy of A*A^t.``` [in] LWORK ``` LWORK is INTEGER Length of WORK to confirm proper allocation of work space. LWORK depends on the job: If only SIGMA is needed ( JOBU.EQ.'N', JOBV.EQ.'N' ) and -> .. no scaled condition estimate required (JOBE.EQ.'N'): LWORK >= max(2*M+N,4*N+1,7). This is the minimal requirement. ->> For optimal performance (blocked code) the optimal value is LWORK >= max(2*M+N,3*N+(N+1)*NB,7). Here NB is the optimal block size for DGEQP3 and DGEQRF. In general, optimal LWORK is computed as LWORK >= max(2*M+N,N+LWORK(DGEQP3),N+LWORK(DGEQRF), 7). -> .. an estimate of the scaled condition number of A is required (JOBA='E', 'G'). In this case, LWORK is the maximum of the above and N*N+4*N, i.e. LWORK >= max(2*M+N,N*N+4*N,7). ->> For optimal performance (blocked code) the optimal value is LWORK >= max(2*M+N,3*N+(N+1)*NB, N*N+4*N, 7). In general, the optimal length LWORK is computed as LWORK >= max(2*M+N,N+LWORK(DGEQP3),N+LWORK(DGEQRF), N+N*N+LWORK(DPOCON),7). If SIGMA and the right singular vectors are needed (JOBV.EQ.'V'), -> the minimal requirement is LWORK >= max(2*M+N,4*N+1,7). -> For optimal performance, LWORK >= max(2*M+N,3*N+(N+1)*NB,7), where NB is the optimal block size for DGEQP3, DGEQRF, DGELQ, DORMLQ. In general, the optimal length LWORK is computed as LWORK >= max(2*M+N,N+LWORK(DGEQP3), N+LWORK(DPOCON), N+LWORK(DGELQ), 2*N+LWORK(DGEQRF), N+LWORK(DORMLQ)). If SIGMA and the left singular vectors are needed -> the minimal requirement is LWORK >= max(2*M+N,4*N+1,7). -> For optimal performance: if JOBU.EQ.'U' :: LWORK >= max(2*M+N,3*N+(N+1)*NB,7), if JOBU.EQ.'F' :: LWORK >= max(2*M+N,3*N+(N+1)*NB,N+M*NB,7), where NB is the optimal block size for DGEQP3, DGEQRF, DORMQR. In general, the optimal length LWORK is computed as LWORK >= max(2*M+N,N+LWORK(DGEQP3),N+LWORK(DPOCON), 2*N+LWORK(DGEQRF), N+LWORK(DORMQR)). Here LWORK(DORMQR) equals N*NB (for JOBU.EQ.'U') or M*NB (for JOBU.EQ.'F'). If the full SVD is needed: (JOBU.EQ.'U' or JOBU.EQ.'F') and -> if JOBV.EQ.'V' the minimal requirement is LWORK >= max(2*M+N,6*N+2*N*N). -> if JOBV.EQ.'J' the minimal requirement is LWORK >= max(2*M+N, 4*N+N*N,2*N+N*N+6). -> For optimal performance, LWORK should be additionally larger than N+M*NB, where NB is the optimal block size for DORMQR.``` [out] IWORK ``` IWORK is INTEGER array, dimension M+3*N. On exit, IWORK(1) = the numerical rank determined after the initial QR factorization with pivoting. See the descriptions of JOBA and JOBR. IWORK(2) = the number of the computed nonzero singular values IWORK(3) = if nonzero, a warning message: If IWORK(3).EQ.1 then some of the column norms of A were denormalized floats. The requested high accuracy is not warranted by the data.``` [out] INFO ``` INFO is INTEGER < 0 : if INFO = -i, then the i-th argument had an illegal value. = 0 : successfull exit; > 0 : SGEJSV did not converge in the maximal allowed number of sweeps. The computed values may be inaccurate.```
Date
June 2016
Further Details:
```  SGEJSV implements a preconditioned Jacobi SVD algorithm. It uses SGEQP3,
SGEQRF, and SGELQF as preprocessors and preconditioners. Optionally, an
additional row pivoting can be used as a preprocessor, which in some
cases results in much higher accuracy. An example is matrix A with the
structure A = D1 * C * D2, where D1, D2 are arbitrarily ill-conditioned
diagonal matrices and C is well-conditioned matrix. In that case, complete
pivoting in the first QR factorizations provides accuracy dependent on the
condition number of C, and independent of D1, D2. Such higher accuracy is
not completely understood theoretically, but it works well in practice.
Further, if A can be written as A = B*D, with well-conditioned B and some
diagonal D, then the high accuracy is guaranteed, both theoretically and
in software, independent of D. For more details see [1], [2].
The computational range for the singular values can be the full range
( UNDERFLOW,OVERFLOW ), provided that the machine arithmetic and the BLAS
& LAPACK routines called by SGEJSV are implemented to work in that range.
If that is not the case, then the restriction for safe computation with
the singular values in the range of normalized IEEE numbers is that the
spectral condition number kappa(A)=sigma_max(A)/sigma_min(A) does not
overflow. This code (SGEJSV) is best used in this restricted range,
meaning that singular values of magnitude below ||A||_2 / SLAMCH('O') are
returned as zeros. See JOBR for details on this.
Further, this implementation is somewhat slower than the one described
in [1,2] due to replacement of some non-LAPACK components, and because
the choice of some tuning parameters in the iterative part (SGESVJ) is
left to the implementer on a particular machine.
The rank revealing QR factorization (in this code: SGEQP3) should be
implemented as in [3]. We have a new version of SGEQP3 under development
that is more robust than the current one in LAPACK, with a cleaner cut in
rank defficient cases. It will be available in the SIGMA library [4].
If M is much larger than N, it is obvious that the inital QRF with
column pivoting can be preprocessed by the QRF without pivoting. That
well known trick is not used in SGEJSV because in some cases heavy row
weighting can be treated with complete pivoting. The overhead in cases
M much larger than N is then only due to pivoting, but the benefits in
terms of accuracy have prevailed. The implementer/user can incorporate
this extra QRF step easily. The implementer can also improve data movement
(matrix transpose, matrix copy, matrix transposed copy) - this
implementation of SGEJSV uses only the simplest, naive data movement.```
Contributors:
Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany)
References:
``` [1] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342.
LAPACK Working note 169.
[2] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362.
LAPACK Working note 170.
[3] Z. Drmac and Z. Bujanovic: On the failure of rank-revealing QR
factorization software - a case study.
ACM Trans. Math. Softw. Vol. 35, No 2 (2008), pp. 1-28.
LAPACK Working note 176.
[4] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
QSVD, (H,K)-SVD computations.
Department of Mathematics, University of Zagreb, 2008.```
Bugs, examples and comments:
Please report all bugs and send interesting examples and/or comments to drmac.nosp@m.@mat.nosp@m.h.hr. Thank you.

Definition at line 478 of file sgejsv.f.

478 *
479 * -- LAPACK computational routine (version 3.6.1) --
480 * -- LAPACK is a software package provided by Univ. of Tennessee, --
481 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
482 * June 2016
483 *
484 * .. Scalar Arguments ..
485  IMPLICIT NONE
486  INTEGER info, lda, ldu, ldv, lwork, m, n
487 * ..
488 * .. Array Arguments ..
489  REAL a( lda, * ), sva( n ), u( ldu, * ), v( ldv, * ),
490  \$ work( lwork )
491  INTEGER iwork( * )
492  CHARACTER*1 joba, jobp, jobr, jobt, jobu, jobv
493 * ..
494 *
495 * ===========================================================================
496 *
497 * .. Local Parameters ..
498  REAL zero, one
499  parameter( zero = 0.0e0, one = 1.0e0 )
500 * ..
501 * .. Local Scalars ..
502  REAL aapp, aaqq, aatmax, aatmin, big, big1, cond_ok,
503  \$ condr1, condr2, entra, entrat, epsln, maxprj, scalem,
504  \$ sconda, sfmin, small, temp1, uscal1, uscal2, xsc
505  INTEGER ierr, n1, nr, numrank, p, q, warning
506  LOGICAL almort, defr, errest, goscal, jracc, kill, lsvec,
507  \$ l2aber, l2kill, l2pert, l2rank, l2tran,
508  \$ noscal, rowpiv, rsvec, transp
509 * ..
510 * .. Intrinsic Functions ..
511  INTRINSIC abs, alog, max, min, float, nint, sign, sqrt
512 * ..
513 * .. External Functions ..
514  REAL slamch, snrm2
515  INTEGER isamax
516  LOGICAL lsame
517  EXTERNAL isamax, lsame, slamch, snrm2
518 * ..
519 * .. External Subroutines ..
520  EXTERNAL scopy, sgelqf, sgeqp3, sgeqrf, slacpy, slascl,
523 *
524  EXTERNAL sgesvj
525 * ..
526 *
527 * Test the input arguments
528 *
529  lsvec = lsame( jobu, 'U' ) .OR. lsame( jobu, 'F' )
530  jracc = lsame( jobv, 'J' )
531  rsvec = lsame( jobv, 'V' ) .OR. jracc
532  rowpiv = lsame( joba, 'F' ) .OR. lsame( joba, 'G' )
533  l2rank = lsame( joba, 'R' )
534  l2aber = lsame( joba, 'A' )
535  errest = lsame( joba, 'E' ) .OR. lsame( joba, 'G' )
536  l2tran = lsame( jobt, 'T' )
537  l2kill = lsame( jobr, 'R' )
538  defr = lsame( jobr, 'N' )
539  l2pert = lsame( jobp, 'P' )
540 *
541  IF ( .NOT.(rowpiv .OR. l2rank .OR. l2aber .OR.
542  \$ errest .OR. lsame( joba, 'C' ) )) THEN
543  info = - 1
544  ELSE IF ( .NOT.( lsvec .OR. lsame( jobu, 'N' ) .OR.
545  \$ lsame( jobu, 'W' )) ) THEN
546  info = - 2
547  ELSE IF ( .NOT.( rsvec .OR. lsame( jobv, 'N' ) .OR.
548  \$ lsame( jobv, 'W' )) .OR. ( jracc .AND. (.NOT.lsvec) ) ) THEN
549  info = - 3
550  ELSE IF ( .NOT. ( l2kill .OR. defr ) ) THEN
551  info = - 4
552  ELSE IF ( .NOT. ( l2tran .OR. lsame( jobt, 'N' ) ) ) THEN
553  info = - 5
554  ELSE IF ( .NOT. ( l2pert .OR. lsame( jobp, 'N' ) ) ) THEN
555  info = - 6
556  ELSE IF ( m .LT. 0 ) THEN
557  info = - 7
558  ELSE IF ( ( n .LT. 0 ) .OR. ( n .GT. m ) ) THEN
559  info = - 8
560  ELSE IF ( lda .LT. m ) THEN
561  info = - 10
562  ELSE IF ( lsvec .AND. ( ldu .LT. m ) ) THEN
563  info = - 13
564  ELSE IF ( rsvec .AND. ( ldv .LT. n ) ) THEN
565  info = - 14
566  ELSE IF ( (.NOT.(lsvec .OR. rsvec .OR. errest).AND.
567  \$ (lwork .LT. max(7,4*n+1,2*m+n))) .OR.
568  \$ (.NOT.(lsvec .OR. rsvec) .AND. errest .AND.
569  \$ (lwork .LT. max(7,4*n+n*n,2*m+n))) .OR.
570  \$ (lsvec .AND. (.NOT.rsvec) .AND. (lwork .LT. max(7,2*m+n,4*n+1)))
571  \$ .OR.
572  \$ (rsvec .AND. (.NOT.lsvec) .AND. (lwork .LT. max(7,2*m+n,4*n+1)))
573  \$ .OR.
574  \$ (lsvec .AND. rsvec .AND. (.NOT.jracc) .AND.
575  \$ (lwork.LT.max(2*m+n,6*n+2*n*n)))
576  \$ .OR. (lsvec .AND. rsvec .AND. jracc .AND.
577  \$ lwork.LT.max(2*m+n,4*n+n*n,2*n+n*n+6)))
578  \$ THEN
579  info = - 17
580  ELSE
581 * #:)
582  info = 0
583  END IF
584 *
585  IF ( info .NE. 0 ) THEN
586 * #:(
587  CALL xerbla( 'SGEJSV', - info )
588  RETURN
589  END IF
590 *
591 * Quick return for void matrix (Y3K safe)
592 * #:)
593  IF ( ( m .EQ. 0 ) .OR. ( n .EQ. 0 ) ) THEN
594  iwork(1:3) = 0
595  work(1:7) = 0
596  RETURN
597  ENDIF
598 *
599 * Determine whether the matrix U should be M x N or M x M
600 *
601  IF ( lsvec ) THEN
602  n1 = n
603  IF ( lsame( jobu, 'F' ) ) n1 = m
604  END IF
605 *
606 * Set numerical parameters
607 *
608 *! NOTE: Make sure SLAMCH() does not fail on the target architecture.
609 *
610  epsln = slamch('Epsilon')
611  sfmin = slamch('SafeMinimum')
612  small = sfmin / epsln
613  big = slamch('O')
614 * BIG = ONE / SFMIN
615 *
616 * Initialize SVA(1:N) = diag( ||A e_i||_2 )_1^N
617 *
618 *(!) If necessary, scale SVA() to protect the largest norm from
619 * overflow. It is possible that this scaling pushes the smallest
620 * column norm left from the underflow threshold (extreme case).
621 *
622  scalem = one / sqrt(float(m)*float(n))
623  noscal = .true.
624  goscal = .true.
625  DO 1874 p = 1, n
626  aapp = zero
627  aaqq = one
628  CALL slassq( m, a(1,p), 1, aapp, aaqq )
629  IF ( aapp .GT. big ) THEN
630  info = - 9
631  CALL xerbla( 'SGEJSV', -info )
632  RETURN
633  END IF
634  aaqq = sqrt(aaqq)
635  IF ( ( aapp .LT. (big / aaqq) ) .AND. noscal ) THEN
636  sva(p) = aapp * aaqq
637  ELSE
638  noscal = .false.
639  sva(p) = aapp * ( aaqq * scalem )
640  IF ( goscal ) THEN
641  goscal = .false.
642  CALL sscal( p-1, scalem, sva, 1 )
643  END IF
644  END IF
645  1874 CONTINUE
646 *
647  IF ( noscal ) scalem = one
648 *
649  aapp = zero
650  aaqq = big
651  DO 4781 p = 1, n
652  aapp = max( aapp, sva(p) )
653  IF ( sva(p) .NE. zero ) aaqq = min( aaqq, sva(p) )
654  4781 CONTINUE
655 *
656 * Quick return for zero M x N matrix
657 * #:)
658  IF ( aapp .EQ. zero ) THEN
659  IF ( lsvec ) CALL slaset( 'G', m, n1, zero, one, u, ldu )
660  IF ( rsvec ) CALL slaset( 'G', n, n, zero, one, v, ldv )
661  work(1) = one
662  work(2) = one
663  IF ( errest ) work(3) = one
664  IF ( lsvec .AND. rsvec ) THEN
665  work(4) = one
666  work(5) = one
667  END IF
668  IF ( l2tran ) THEN
669  work(6) = zero
670  work(7) = zero
671  END IF
672  iwork(1) = 0
673  iwork(2) = 0
674  iwork(3) = 0
675  RETURN
676  END IF
677 *
678 * Issue warning if denormalized column norms detected. Override the
679 * high relative accuracy request. Issue licence to kill columns
680 * (set them to zero) whose norm is less than sigma_max / BIG (roughly).
681 * #:(
682  warning = 0
683  IF ( aaqq .LE. sfmin ) THEN
684  l2rank = .true.
685  l2kill = .true.
686  warning = 1
687  END IF
688 *
689 * Quick return for one-column matrix
690 * #:)
691  IF ( n .EQ. 1 ) THEN
692 *
693  IF ( lsvec ) THEN
694  CALL slascl( 'G',0,0,sva(1),scalem, m,1,a(1,1),lda,ierr )
695  CALL slacpy( 'A', m, 1, a, lda, u, ldu )
696 * computing all M left singular vectors of the M x 1 matrix
697  IF ( n1 .NE. n ) THEN
698  CALL sgeqrf( m, n, u,ldu, work, work(n+1),lwork-n,ierr )
699  CALL sorgqr( m,n1,1, u,ldu,work,work(n+1),lwork-n,ierr )
700  CALL scopy( m, a(1,1), 1, u(1,1), 1 )
701  END IF
702  END IF
703  IF ( rsvec ) THEN
704  v(1,1) = one
705  END IF
706  IF ( sva(1) .LT. (big*scalem) ) THEN
707  sva(1) = sva(1) / scalem
708  scalem = one
709  END IF
710  work(1) = one / scalem
711  work(2) = one
712  IF ( sva(1) .NE. zero ) THEN
713  iwork(1) = 1
714  IF ( ( sva(1) / scalem) .GE. sfmin ) THEN
715  iwork(2) = 1
716  ELSE
717  iwork(2) = 0
718  END IF
719  ELSE
720  iwork(1) = 0
721  iwork(2) = 0
722  END IF
723  iwork(3) = 0
724  IF ( errest ) work(3) = one
725  IF ( lsvec .AND. rsvec ) THEN
726  work(4) = one
727  work(5) = one
728  END IF
729  IF ( l2tran ) THEN
730  work(6) = zero
731  work(7) = zero
732  END IF
733  RETURN
734 *
735  END IF
736 *
737  transp = .false.
738  l2tran = l2tran .AND. ( m .EQ. n )
739 *
740  aatmax = -one
741  aatmin = big
742  IF ( rowpiv .OR. l2tran ) THEN
743 *
744 * Compute the row norms, needed to determine row pivoting sequence
745 * (in the case of heavily row weighted A, row pivoting is strongly
746 * advised) and to collect information needed to compare the
747 * structures of A * A^t and A^t * A (in the case L2TRAN.EQ..TRUE.).
748 *
749  IF ( l2tran ) THEN
750  DO 1950 p = 1, m
751  xsc = zero
752  temp1 = one
753  CALL slassq( n, a(p,1), lda, xsc, temp1 )
754 * SLASSQ gets both the ell_2 and the ell_infinity norm
755 * in one pass through the vector
756  work(m+n+p) = xsc * scalem
757  work(n+p) = xsc * (scalem*sqrt(temp1))
758  aatmax = max( aatmax, work(n+p) )
759  IF (work(n+p) .NE. zero) aatmin = min(aatmin,work(n+p))
760  1950 CONTINUE
761  ELSE
762  DO 1904 p = 1, m
763  work(m+n+p) = scalem*abs( a(p,isamax(n,a(p,1),lda)) )
764  aatmax = max( aatmax, work(m+n+p) )
765  aatmin = min( aatmin, work(m+n+p) )
766  1904 CONTINUE
767  END IF
768 *
769  END IF
770 *
771 * For square matrix A try to determine whether A^t would be better
772 * input for the preconditioned Jacobi SVD, with faster convergence.
773 * The decision is based on an O(N) function of the vector of column
774 * and row norms of A, based on the Shannon entropy. This should give
775 * the right choice in most cases when the difference actually matters.
776 * It may fail and pick the slower converging side.
777 *
778  entra = zero
779  entrat = zero
780  IF ( l2tran ) THEN
781 *
782  xsc = zero
783  temp1 = one
784  CALL slassq( n, sva, 1, xsc, temp1 )
785  temp1 = one / temp1
786 *
787  entra = zero
788  DO 1113 p = 1, n
789  big1 = ( ( sva(p) / xsc )**2 ) * temp1
790  IF ( big1 .NE. zero ) entra = entra + big1 * alog(big1)
791  1113 CONTINUE
792  entra = - entra / alog(float(n))
793 *
794 * Now, SVA().^2/Trace(A^t * A) is a point in the probability simplex.
795 * It is derived from the diagonal of A^t * A. Do the same with the
796 * diagonal of A * A^t, compute the entropy of the corresponding
797 * probability distribution. Note that A * A^t and A^t * A have the
798 * same trace.
799 *
800  entrat = zero
801  DO 1114 p = n+1, n+m
802  big1 = ( ( work(p) / xsc )**2 ) * temp1
803  IF ( big1 .NE. zero ) entrat = entrat + big1 * alog(big1)
804  1114 CONTINUE
805  entrat = - entrat / alog(float(m))
806 *
807 * Analyze the entropies and decide A or A^t. Smaller entropy
808 * usually means better input for the algorithm.
809 *
810  transp = ( entrat .LT. entra )
811 *
812 * If A^t is better than A, transpose A.
813 *
814  IF ( transp ) THEN
815 * In an optimal implementation, this trivial transpose
816 * should be replaced with faster transpose.
817  DO 1115 p = 1, n - 1
818  DO 1116 q = p + 1, n
819  temp1 = a(q,p)
820  a(q,p) = a(p,q)
821  a(p,q) = temp1
822  1116 CONTINUE
823  1115 CONTINUE
824  DO 1117 p = 1, n
825  work(m+n+p) = sva(p)
826  sva(p) = work(n+p)
827  1117 CONTINUE
828  temp1 = aapp
829  aapp = aatmax
830  aatmax = temp1
831  temp1 = aaqq
832  aaqq = aatmin
833  aatmin = temp1
834  kill = lsvec
835  lsvec = rsvec
836  rsvec = kill
837  IF ( lsvec ) n1 = n
838 *
839  rowpiv = .true.
840  END IF
841 *
842  END IF
843 * END IF L2TRAN
844 *
845 * Scale the matrix so that its maximal singular value remains less
846 * than SQRT(BIG) -- the matrix is scaled so that its maximal column
847 * has Euclidean norm equal to SQRT(BIG/N). The only reason to keep
848 * SQRT(BIG) instead of BIG is the fact that SGEJSV uses LAPACK and
849 * BLAS routines that, in some implementations, are not capable of
850 * working in the full interval [SFMIN,BIG] and that they may provoke
851 * overflows in the intermediate results. If the singular values spread
852 * from SFMIN to BIG, then SGESVJ will compute them. So, in that case,
853 * one should use SGESVJ instead of SGEJSV.
854 *
855  big1 = sqrt( big )
856  temp1 = sqrt( big / float(n) )
857 *
858  CALL slascl( 'G', 0, 0, aapp, temp1, n, 1, sva, n, ierr )
859  IF ( aaqq .GT. (aapp * sfmin) ) THEN
860  aaqq = ( aaqq / aapp ) * temp1
861  ELSE
862  aaqq = ( aaqq * temp1 ) / aapp
863  END IF
864  temp1 = temp1 * scalem
865  CALL slascl( 'G', 0, 0, aapp, temp1, m, n, a, lda, ierr )
866 *
867 * To undo scaling at the end of this procedure, multiply the
868 * computed singular values with USCAL2 / USCAL1.
869 *
870  uscal1 = temp1
871  uscal2 = aapp
872 *
873  IF ( l2kill ) THEN
874 * L2KILL enforces computation of nonzero singular values in
875 * the restricted range of condition number of the initial A,
876 * sigma_max(A) / sigma_min(A) approx. SQRT(BIG)/SQRT(SFMIN).
877  xsc = sqrt( sfmin )
878  ELSE
879  xsc = small
880 *
881 * Now, if the condition number of A is too big,
882 * sigma_max(A) / sigma_min(A) .GT. SQRT(BIG/N) * EPSLN / SFMIN,
883 * as a precaution measure, the full SVD is computed using SGESVJ
884 * with accumulated Jacobi rotations. This provides numerically
885 * more robust computation, at the cost of slightly increased run
886 * time. Depending on the concrete implementation of BLAS and LAPACK
887 * (i.e. how they behave in presence of extreme ill-conditioning) the
888 * implementor may decide to remove this switch.
889  IF ( ( aaqq.LT.sqrt(sfmin) ) .AND. lsvec .AND. rsvec ) THEN
890  jracc = .true.
891  END IF
892 *
893  END IF
894  IF ( aaqq .LT. xsc ) THEN
895  DO 700 p = 1, n
896  IF ( sva(p) .LT. xsc ) THEN
897  CALL slaset( 'A', m, 1, zero, zero, a(1,p), lda )
898  sva(p) = zero
899  END IF
900  700 CONTINUE
901  END IF
902 *
903 * Preconditioning using QR factorization with pivoting
904 *
905  IF ( rowpiv ) THEN
906 * Optional row permutation (Bjoerck row pivoting):
907 * A result by Cox and Higham shows that the Bjoerck's
908 * row pivoting combined with standard column pivoting
909 * has similar effect as Powell-Reid complete pivoting.
910 * The ell-infinity norms of A are made nonincreasing.
911  DO 1952 p = 1, m - 1
912  q = isamax( m-p+1, work(m+n+p), 1 ) + p - 1
913  iwork(2*n+p) = q
914  IF ( p .NE. q ) THEN
915  temp1 = work(m+n+p)
916  work(m+n+p) = work(m+n+q)
917  work(m+n+q) = temp1
918  END IF
919  1952 CONTINUE
920  CALL slaswp( n, a, lda, 1, m-1, iwork(2*n+1), 1 )
921  END IF
922 *
923 * End of the preparation phase (scaling, optional sorting and
924 * transposing, optional flushing of small columns).
925 *
926 * Preconditioning
927 *
928 * If the full SVD is needed, the right singular vectors are computed
929 * from a matrix equation, and for that we need theoretical analysis
930 * of the Businger-Golub pivoting. So we use SGEQP3 as the first RR QRF.
931 * In all other cases the first RR QRF can be chosen by other criteria
932 * (eg speed by replacing global with restricted window pivoting, such
933 * as in SGEQPX from TOMS # 782). Good results will be obtained using
934 * SGEQPX with properly (!) chosen numerical parameters.
935 * Any improvement of SGEQP3 improves overal performance of SGEJSV.
936 *
937 * A * P1 = Q1 * [ R1^t 0]^t:
938  DO 1963 p = 1, n
939 * .. all columns are free columns
940  iwork(p) = 0
941  1963 CONTINUE
942  CALL sgeqp3( m,n,a,lda, iwork,work, work(n+1),lwork-n, ierr )
943 *
944 * The upper triangular matrix R1 from the first QRF is inspected for
945 * rank deficiency and possibilities for deflation, or possible
946 * ill-conditioning. Depending on the user specified flag L2RANK,
947 * the procedure explores possibilities to reduce the numerical
948 * rank by inspecting the computed upper triangular factor. If
949 * L2RANK or L2ABER are up, then SGEJSV will compute the SVD of
950 * A + dA, where ||dA|| <= f(M,N)*EPSLN.
951 *
952  nr = 1
953  IF ( l2aber ) THEN
954 * Standard absolute error bound suffices. All sigma_i with
955 * sigma_i < N*EPSLN*||A|| are flushed to zero. This is an
956 * agressive enforcement of lower numerical rank by introducing a
957 * backward error of the order of N*EPSLN*||A||.
958  temp1 = sqrt(float(n))*epsln
959  DO 3001 p = 2, n
960  IF ( abs(a(p,p)) .GE. (temp1*abs(a(1,1))) ) THEN
961  nr = nr + 1
962  ELSE
963  GO TO 3002
964  END IF
965  3001 CONTINUE
966  3002 CONTINUE
967  ELSE IF ( l2rank ) THEN
968 * .. similarly as above, only slightly more gentle (less agressive).
969 * Sudden drop on the diagonal of R1 is used as the criterion for
970 * close-to-rank-defficient.
971  temp1 = sqrt(sfmin)
972  DO 3401 p = 2, n
973  IF ( ( abs(a(p,p)) .LT. (epsln*abs(a(p-1,p-1))) ) .OR.
974  \$ ( abs(a(p,p)) .LT. small ) .OR.
975  \$ ( l2kill .AND. (abs(a(p,p)) .LT. temp1) ) ) GO TO 3402
976  nr = nr + 1
977  3401 CONTINUE
978  3402 CONTINUE
979 *
980  ELSE
981 * The goal is high relative accuracy. However, if the matrix
982 * has high scaled condition number the relative accuracy is in
983 * general not feasible. Later on, a condition number estimator
984 * will be deployed to estimate the scaled condition number.
985 * Here we just remove the underflowed part of the triangular
986 * factor. This prevents the situation in which the code is
987 * working hard to get the accuracy not warranted by the data.
988  temp1 = sqrt(sfmin)
989  DO 3301 p = 2, n
990  IF ( ( abs(a(p,p)) .LT. small ) .OR.
991  \$ ( l2kill .AND. (abs(a(p,p)) .LT. temp1) ) ) GO TO 3302
992  nr = nr + 1
993  3301 CONTINUE
994  3302 CONTINUE
995 *
996  END IF
997 *
998  almort = .false.
999  IF ( nr .EQ. n ) THEN
1000  maxprj = one
1001  DO 3051 p = 2, n
1002  temp1 = abs(a(p,p)) / sva(iwork(p))
1003  maxprj = min( maxprj, temp1 )
1004  3051 CONTINUE
1005  IF ( maxprj**2 .GE. one - float(n)*epsln ) almort = .true.
1006  END IF
1007 *
1008 *
1009  sconda = - one
1010  condr1 = - one
1011  condr2 = - one
1012 *
1013  IF ( errest ) THEN
1014  IF ( n .EQ. nr ) THEN
1015  IF ( rsvec ) THEN
1016 * .. V is available as workspace
1017  CALL slacpy( 'U', n, n, a, lda, v, ldv )
1018  DO 3053 p = 1, n
1019  temp1 = sva(iwork(p))
1020  CALL sscal( p, one/temp1, v(1,p), 1 )
1021  3053 CONTINUE
1022  CALL spocon( 'U', n, v, ldv, one, temp1,
1023  \$ work(n+1), iwork(2*n+m+1), ierr )
1024  ELSE IF ( lsvec ) THEN
1025 * .. U is available as workspace
1026  CALL slacpy( 'U', n, n, a, lda, u, ldu )
1027  DO 3054 p = 1, n
1028  temp1 = sva(iwork(p))
1029  CALL sscal( p, one/temp1, u(1,p), 1 )
1030  3054 CONTINUE
1031  CALL spocon( 'U', n, u, ldu, one, temp1,
1032  \$ work(n+1), iwork(2*n+m+1), ierr )
1033  ELSE
1034  CALL slacpy( 'U', n, n, a, lda, work(n+1), n )
1035  DO 3052 p = 1, n
1036  temp1 = sva(iwork(p))
1037  CALL sscal( p, one/temp1, work(n+(p-1)*n+1), 1 )
1038  3052 CONTINUE
1039 * .. the columns of R are scaled to have unit Euclidean lengths.
1040  CALL spocon( 'U', n, work(n+1), n, one, temp1,
1041  \$ work(n+n*n+1), iwork(2*n+m+1), ierr )
1042  END IF
1043  sconda = one / sqrt(temp1)
1044 * SCONDA is an estimate of SQRT(||(R^t * R)^(-1)||_1).
1045 * N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA
1046  ELSE
1047  sconda = - one
1048  END IF
1049  END IF
1050 *
1051  l2pert = l2pert .AND. ( abs( a(1,1)/a(nr,nr) ) .GT. sqrt(big1) )
1052 * If there is no violent scaling, artificial perturbation is not needed.
1053 *
1054 * Phase 3:
1055 *
1056  IF ( .NOT. ( rsvec .OR. lsvec ) ) THEN
1057 *
1058 * Singular Values only
1059 *
1060 * .. transpose A(1:NR,1:N)
1061  DO 1946 p = 1, min( n-1, nr )
1062  CALL scopy( n-p, a(p,p+1), lda, a(p+1,p), 1 )
1063  1946 CONTINUE
1064 *
1065 * The following two DO-loops introduce small relative perturbation
1066 * into the strict upper triangle of the lower triangular matrix.
1067 * Small entries below the main diagonal are also changed.
1068 * This modification is useful if the computing environment does not
1069 * provide/allow FLUSH TO ZERO underflow, for it prevents many
1070 * annoying denormalized numbers in case of strongly scaled matrices.
1071 * The perturbation is structured so that it does not introduce any
1072 * new perturbation of the singular values, and it does not destroy
1073 * the job done by the preconditioner.
1074 * The licence for this perturbation is in the variable L2PERT, which
1075 * should be .FALSE. if FLUSH TO ZERO underflow is active.
1076 *
1077  IF ( .NOT. almort ) THEN
1078 *
1079  IF ( l2pert ) THEN
1080 * XSC = SQRT(SMALL)
1081  xsc = epsln / float(n)
1082  DO 4947 q = 1, nr
1083  temp1 = xsc*abs(a(q,q))
1084  DO 4949 p = 1, n
1085  IF ( ( (p.GT.q) .AND. (abs(a(p,q)).LE.temp1) )
1086  \$ .OR. ( p .LT. q ) )
1087  \$ a(p,q) = sign( temp1, a(p,q) )
1088  4949 CONTINUE
1089  4947 CONTINUE
1090  ELSE
1091  CALL slaset( 'U', nr-1,nr-1, zero,zero, a(1,2),lda )
1092  END IF
1093 *
1094 * .. second preconditioning using the QR factorization
1095 *
1096  CALL sgeqrf( n,nr, a,lda, work, work(n+1),lwork-n, ierr )
1097 *
1098 * .. and transpose upper to lower triangular
1099  DO 1948 p = 1, nr - 1
1100  CALL scopy( nr-p, a(p,p+1), lda, a(p+1,p), 1 )
1101  1948 CONTINUE
1102 *
1103  END IF
1104 *
1105 * Row-cyclic Jacobi SVD algorithm with column pivoting
1106 *
1107 * .. again some perturbation (a "background noise") is added
1108 * to drown denormals
1109  IF ( l2pert ) THEN
1110 * XSC = SQRT(SMALL)
1111  xsc = epsln / float(n)
1112  DO 1947 q = 1, nr
1113  temp1 = xsc*abs(a(q,q))
1114  DO 1949 p = 1, nr
1115  IF ( ( (p.GT.q) .AND. (abs(a(p,q)).LE.temp1) )
1116  \$ .OR. ( p .LT. q ) )
1117  \$ a(p,q) = sign( temp1, a(p,q) )
1118  1949 CONTINUE
1119  1947 CONTINUE
1120  ELSE
1121  CALL slaset( 'U', nr-1, nr-1, zero, zero, a(1,2), lda )
1122  END IF
1123 *
1124 * .. and one-sided Jacobi rotations are started on a lower
1125 * triangular matrix (plus perturbation which is ignored in
1126 * the part which destroys triangular form (confusing?!))
1127 *
1128  CALL sgesvj( 'L', 'NoU', 'NoV', nr, nr, a, lda, sva,
1129  \$ n, v, ldv, work, lwork, info )
1130 *
1131  scalem = work(1)
1132  numrank = nint(work(2))
1133 *
1134 *
1135  ELSE IF ( rsvec .AND. ( .NOT. lsvec ) ) THEN
1136 *
1137 * -> Singular Values and Right Singular Vectors <-
1138 *
1139  IF ( almort ) THEN
1140 *
1141 * .. in this case NR equals N
1142  DO 1998 p = 1, nr
1143  CALL scopy( n-p+1, a(p,p), lda, v(p,p), 1 )
1144  1998 CONTINUE
1145  CALL slaset( 'Upper', nr-1, nr-1, zero, zero, v(1,2), ldv )
1146 *
1147  CALL sgesvj( 'L','U','N', n, nr, v,ldv, sva, nr, a,lda,
1148  \$ work, lwork, info )
1149  scalem = work(1)
1150  numrank = nint(work(2))
1151
1152  ELSE
1153 *
1154 * .. two more QR factorizations ( one QRF is not enough, two require
1155 * accumulated product of Jacobi rotations, three are perfect )
1156 *
1157  CALL slaset( 'Lower', nr-1, nr-1, zero, zero, a(2,1), lda )
1158  CALL sgelqf( nr, n, a, lda, work, work(n+1), lwork-n, ierr)
1159  CALL slacpy( 'Lower', nr, nr, a, lda, v, ldv )
1160  CALL slaset( 'Upper', nr-1, nr-1, zero, zero, v(1,2), ldv )
1161  CALL sgeqrf( nr, nr, v, ldv, work(n+1), work(2*n+1),
1162  \$ lwork-2*n, ierr )
1163  DO 8998 p = 1, nr
1164  CALL scopy( nr-p+1, v(p,p), ldv, v(p,p), 1 )
1165  8998 CONTINUE
1166  CALL slaset( 'Upper', nr-1, nr-1, zero, zero, v(1,2), ldv )
1167 *
1168  CALL sgesvj( 'Lower', 'U','N', nr, nr, v,ldv, sva, nr, u,
1169  \$ ldu, work(n+1), lwork-n, info )
1170  scalem = work(n+1)
1171  numrank = nint(work(n+2))
1172  IF ( nr .LT. n ) THEN
1173  CALL slaset( 'A',n-nr, nr, zero,zero, v(nr+1,1), ldv )
1174  CALL slaset( 'A',nr, n-nr, zero,zero, v(1,nr+1), ldv )
1175  CALL slaset( 'A',n-nr,n-nr,zero,one, v(nr+1,nr+1), ldv )
1176  END IF
1177 *
1178  CALL sormlq( 'Left', 'Transpose', n, n, nr, a, lda, work,
1179  \$ v, ldv, work(n+1), lwork-n, ierr )
1180 *
1181  END IF
1182 *
1183  DO 8991 p = 1, n
1184  CALL scopy( n, v(p,1), ldv, a(iwork(p),1), lda )
1185  8991 CONTINUE
1186  CALL slacpy( 'All', n, n, a, lda, v, ldv )
1187 *
1188  IF ( transp ) THEN
1189  CALL slacpy( 'All', n, n, v, ldv, u, ldu )
1190  END IF
1191 *
1192  ELSE IF ( lsvec .AND. ( .NOT. rsvec ) ) THEN
1193 *
1194 * .. Singular Values and Left Singular Vectors ..
1195 *
1196 * .. second preconditioning step to avoid need to accumulate
1197 * Jacobi rotations in the Jacobi iterations.
1198  DO 1965 p = 1, nr
1199  CALL scopy( n-p+1, a(p,p), lda, u(p,p), 1 )
1200  1965 CONTINUE
1201  CALL slaset( 'Upper', nr-1, nr-1, zero, zero, u(1,2), ldu )
1202 *
1203  CALL sgeqrf( n, nr, u, ldu, work(n+1), work(2*n+1),
1204  \$ lwork-2*n, ierr )
1205 *
1206  DO 1967 p = 1, nr - 1
1207  CALL scopy( nr-p, u(p,p+1), ldu, u(p+1,p), 1 )
1208  1967 CONTINUE
1209  CALL slaset( 'Upper', nr-1, nr-1, zero, zero, u(1,2), ldu )
1210 *
1211  CALL sgesvj( 'Lower', 'U', 'N', nr,nr, u, ldu, sva, nr, a,
1212  \$ lda, work(n+1), lwork-n, info )
1213  scalem = work(n+1)
1214  numrank = nint(work(n+2))
1215 *
1216  IF ( nr .LT. m ) THEN
1217  CALL slaset( 'A', m-nr, nr,zero, zero, u(nr+1,1), ldu )
1218  IF ( nr .LT. n1 ) THEN
1219  CALL slaset( 'A',nr, n1-nr, zero, zero, u(1,nr+1), ldu )
1220  CALL slaset( 'A',m-nr,n1-nr,zero,one,u(nr+1,nr+1), ldu )
1221  END IF
1222  END IF
1223 *
1224  CALL sormqr( 'Left', 'No Tr', m, n1, n, a, lda, work, u,
1225  \$ ldu, work(n+1), lwork-n, ierr )
1226 *
1227  IF ( rowpiv )
1228  \$ CALL slaswp( n1, u, ldu, 1, m-1, iwork(2*n+1), -1 )
1229 *
1230  DO 1974 p = 1, n1
1231  xsc = one / snrm2( m, u(1,p), 1 )
1232  CALL sscal( m, xsc, u(1,p), 1 )
1233  1974 CONTINUE
1234 *
1235  IF ( transp ) THEN
1236  CALL slacpy( 'All', n, n, u, ldu, v, ldv )
1237  END IF
1238 *
1239  ELSE
1240 *
1241 * .. Full SVD ..
1242 *
1243  IF ( .NOT. jracc ) THEN
1244 *
1245  IF ( .NOT. almort ) THEN
1246 *
1247 * Second Preconditioning Step (QRF [with pivoting])
1248 * Note that the composition of TRANSPOSE, QRF and TRANSPOSE is
1249 * equivalent to an LQF CALL. Since in many libraries the QRF
1250 * seems to be better optimized than the LQF, we do explicit
1251 * transpose and use the QRF. This is subject to changes in an
1252 * optimized implementation of SGEJSV.
1253 *
1254  DO 1968 p = 1, nr
1255  CALL scopy( n-p+1, a(p,p), lda, v(p,p), 1 )
1256  1968 CONTINUE
1257 *
1258 * .. the following two loops perturb small entries to avoid
1259 * denormals in the second QR factorization, where they are
1260 * as good as zeros. This is done to avoid painfully slow
1261 * computation with denormals. The relative size of the perturbation
1262 * is a parameter that can be changed by the implementer.
1263 * This perturbation device will be obsolete on machines with
1264 * properly implemented arithmetic.
1265 * To switch it off, set L2PERT=.FALSE. To remove it from the
1266 * code, remove the action under L2PERT=.TRUE., leave the ELSE part.
1267 * The following two loops should be blocked and fused with the
1268 * transposed copy above.
1269 *
1270  IF ( l2pert ) THEN
1271  xsc = sqrt(small)
1272  DO 2969 q = 1, nr
1273  temp1 = xsc*abs( v(q,q) )
1274  DO 2968 p = 1, n
1275  IF ( ( p .GT. q ) .AND. ( abs(v(p,q)) .LE. temp1 )
1276  \$ .OR. ( p .LT. q ) )
1277  \$ v(p,q) = sign( temp1, v(p,q) )
1278  IF ( p .LT. q ) v(p,q) = - v(p,q)
1279  2968 CONTINUE
1280  2969 CONTINUE
1281  ELSE
1282  CALL slaset( 'U', nr-1, nr-1, zero, zero, v(1,2), ldv )
1283  END IF
1284 *
1285 * Estimate the row scaled condition number of R1
1286 * (If R1 is rectangular, N > NR, then the condition number
1287 * of the leading NR x NR submatrix is estimated.)
1288 *
1289  CALL slacpy( 'L', nr, nr, v, ldv, work(2*n+1), nr )
1290  DO 3950 p = 1, nr
1291  temp1 = snrm2(nr-p+1,work(2*n+(p-1)*nr+p),1)
1292  CALL sscal(nr-p+1,one/temp1,work(2*n+(p-1)*nr+p),1)
1293  3950 CONTINUE
1294  CALL spocon('Lower',nr,work(2*n+1),nr,one,temp1,
1295  \$ work(2*n+nr*nr+1),iwork(m+2*n+1),ierr)
1296  condr1 = one / sqrt(temp1)
1297 * .. here need a second oppinion on the condition number
1298 * .. then assume worst case scenario
1299 * R1 is OK for inverse <=> CONDR1 .LT. FLOAT(N)
1300 * more conservative <=> CONDR1 .LT. SQRT(FLOAT(N))
1301 *
1302  cond_ok = sqrt(float(nr))
1303 *[TP] COND_OK is a tuning parameter.
1304
1305  IF ( condr1 .LT. cond_ok ) THEN
1306 * .. the second QRF without pivoting. Note: in an optimized
1307 * implementation, this QRF should be implemented as the QRF
1308 * of a lower triangular matrix.
1309 * R1^t = Q2 * R2
1310  CALL sgeqrf( n, nr, v, ldv, work(n+1), work(2*n+1),
1311  \$ lwork-2*n, ierr )
1312 *
1313  IF ( l2pert ) THEN
1314  xsc = sqrt(small)/epsln
1315  DO 3959 p = 2, nr
1316  DO 3958 q = 1, p - 1
1317  temp1 = xsc * min(abs(v(p,p)),abs(v(q,q)))
1318  IF ( abs(v(q,p)) .LE. temp1 )
1319  \$ v(q,p) = sign( temp1, v(q,p) )
1320  3958 CONTINUE
1321  3959 CONTINUE
1322  END IF
1323 *
1324  IF ( nr .NE. n )
1325  \$ CALL slacpy( 'A', n, nr, v, ldv, work(2*n+1), n )
1326 * .. save ...
1327 *
1328 * .. this transposed copy should be better than naive
1329  DO 1969 p = 1, nr - 1
1330  CALL scopy( nr-p, v(p,p+1), ldv, v(p+1,p), 1 )
1331  1969 CONTINUE
1332 *
1333  condr2 = condr1
1334 *
1335  ELSE
1336 *
1337 * .. ill-conditioned case: second QRF with pivoting
1338 * Note that windowed pivoting would be equaly good
1339 * numerically, and more run-time efficient. So, in
1340 * an optimal implementation, the next call to SGEQP3
1341 * should be replaced with eg. CALL SGEQPX (ACM TOMS #782)
1342 * with properly (carefully) chosen parameters.
1343 *
1344 * R1^t * P2 = Q2 * R2
1345  DO 3003 p = 1, nr
1346  iwork(n+p) = 0
1347  3003 CONTINUE
1348  CALL sgeqp3( n, nr, v, ldv, iwork(n+1), work(n+1),
1349  \$ work(2*n+1), lwork-2*n, ierr )
1350 ** CALL SGEQRF( N, NR, V, LDV, WORK(N+1), WORK(2*N+1),
1351 ** \$ LWORK-2*N, IERR )
1352  IF ( l2pert ) THEN
1353  xsc = sqrt(small)
1354  DO 3969 p = 2, nr
1355  DO 3968 q = 1, p - 1
1356  temp1 = xsc * min(abs(v(p,p)),abs(v(q,q)))
1357  IF ( abs(v(q,p)) .LE. temp1 )
1358  \$ v(q,p) = sign( temp1, v(q,p) )
1359  3968 CONTINUE
1360  3969 CONTINUE
1361  END IF
1362 *
1363  CALL slacpy( 'A', n, nr, v, ldv, work(2*n+1), n )
1364 *
1365  IF ( l2pert ) THEN
1366  xsc = sqrt(small)
1367  DO 8970 p = 2, nr
1368  DO 8971 q = 1, p - 1
1369  temp1 = xsc * min(abs(v(p,p)),abs(v(q,q)))
1370  v(p,q) = - sign( temp1, v(q,p) )
1371  8971 CONTINUE
1372  8970 CONTINUE
1373  ELSE
1374  CALL slaset( 'L',nr-1,nr-1,zero,zero,v(2,1),ldv )
1375  END IF
1376 * Now, compute R2 = L3 * Q3, the LQ factorization.
1377  CALL sgelqf( nr, nr, v, ldv, work(2*n+n*nr+1),
1378  \$ work(2*n+n*nr+nr+1), lwork-2*n-n*nr-nr, ierr )
1379 * .. and estimate the condition number
1380  CALL slacpy( 'L',nr,nr,v,ldv,work(2*n+n*nr+nr+1),nr )
1381  DO 4950 p = 1, nr
1382  temp1 = snrm2( p, work(2*n+n*nr+nr+p), nr )
1383  CALL sscal( p, one/temp1, work(2*n+n*nr+nr+p), nr )
1384  4950 CONTINUE
1385  CALL spocon( 'L',nr,work(2*n+n*nr+nr+1),nr,one,temp1,
1386  \$ work(2*n+n*nr+nr+nr*nr+1),iwork(m+2*n+1),ierr )
1387  condr2 = one / sqrt(temp1)
1388 *
1389  IF ( condr2 .GE. cond_ok ) THEN
1390 * .. save the Householder vectors used for Q3
1391 * (this overwrittes the copy of R2, as it will not be
1392 * needed in this branch, but it does not overwritte the
1393 * Huseholder vectors of Q2.).
1394  CALL slacpy( 'U', nr, nr, v, ldv, work(2*n+1), n )
1395 * .. and the rest of the information on Q3 is in
1396 * WORK(2*N+N*NR+1:2*N+N*NR+N)
1397  END IF
1398 *
1399  END IF
1400 *
1401  IF ( l2pert ) THEN
1402  xsc = sqrt(small)
1403  DO 4968 q = 2, nr
1404  temp1 = xsc * v(q,q)
1405  DO 4969 p = 1, q - 1
1406 * V(p,q) = - SIGN( TEMP1, V(q,p) )
1407  v(p,q) = - sign( temp1, v(p,q) )
1408  4969 CONTINUE
1409  4968 CONTINUE
1410  ELSE
1411  CALL slaset( 'U', nr-1,nr-1, zero,zero, v(1,2), ldv )
1412  END IF
1413 *
1414 * Second preconditioning finished; continue with Jacobi SVD
1415 * The input matrix is lower trinagular.
1416 *
1417 * Recover the right singular vectors as solution of a well
1418 * conditioned triangular matrix equation.
1419 *
1420  IF ( condr1 .LT. cond_ok ) THEN
1421 *
1422  CALL sgesvj( 'L','U','N',nr,nr,v,ldv,sva,nr,u,
1423  \$ ldu,work(2*n+n*nr+nr+1),lwork-2*n-n*nr-nr,info )
1424  scalem = work(2*n+n*nr+nr+1)
1425  numrank = nint(work(2*n+n*nr+nr+2))
1426  DO 3970 p = 1, nr
1427  CALL scopy( nr, v(1,p), 1, u(1,p), 1 )
1428  CALL sscal( nr, sva(p), v(1,p), 1 )
1429  3970 CONTINUE
1430
1431 * .. pick the right matrix equation and solve it
1432 *
1433  IF ( nr .EQ. n ) THEN
1434 * :)) .. best case, R1 is inverted. The solution of this matrix
1435 * equation is Q2*V2 = the product of the Jacobi rotations
1436 * used in SGESVJ, premultiplied with the orthogonal matrix
1437 * from the second QR factorization.
1438  CALL strsm( 'L','U','N','N', nr,nr,one, a,lda, v,ldv )
1439  ELSE
1440 * .. R1 is well conditioned, but non-square. Transpose(R2)
1441 * is inverted to get the product of the Jacobi rotations
1442 * used in SGESVJ. The Q-factor from the second QR
1443 * factorization is then built in explicitly.
1444  CALL strsm('L','U','T','N',nr,nr,one,work(2*n+1),
1445  \$ n,v,ldv)
1446  IF ( nr .LT. n ) THEN
1447  CALL slaset('A',n-nr,nr,zero,zero,v(nr+1,1),ldv)
1448  CALL slaset('A',nr,n-nr,zero,zero,v(1,nr+1),ldv)
1449  CALL slaset('A',n-nr,n-nr,zero,one,v(nr+1,nr+1),ldv)
1450  END IF
1451  CALL sormqr('L','N',n,n,nr,work(2*n+1),n,work(n+1),
1452  \$ v,ldv,work(2*n+n*nr+nr+1),lwork-2*n-n*nr-nr,ierr)
1453  END IF
1454 *
1455  ELSE IF ( condr2 .LT. cond_ok ) THEN
1456 *
1457 * :) .. the input matrix A is very likely a relative of
1458 * the Kahan matrix :)
1459 * The matrix R2 is inverted. The solution of the matrix equation
1460 * is Q3^T*V3 = the product of the Jacobi rotations (appplied to
1461 * the lower triangular L3 from the LQ factorization of
1462 * R2=L3*Q3), pre-multiplied with the transposed Q3.
1463  CALL sgesvj( 'L', 'U', 'N', nr, nr, v, ldv, sva, nr, u,
1464  \$ ldu, work(2*n+n*nr+nr+1), lwork-2*n-n*nr-nr, info )
1465  scalem = work(2*n+n*nr+nr+1)
1466  numrank = nint(work(2*n+n*nr+nr+2))
1467  DO 3870 p = 1, nr
1468  CALL scopy( nr, v(1,p), 1, u(1,p), 1 )
1469  CALL sscal( nr, sva(p), u(1,p), 1 )
1470  3870 CONTINUE
1471  CALL strsm('L','U','N','N',nr,nr,one,work(2*n+1),n,u,ldu)
1472 * .. apply the permutation from the second QR factorization
1473  DO 873 q = 1, nr
1474  DO 872 p = 1, nr
1475  work(2*n+n*nr+nr+iwork(n+p)) = u(p,q)
1476  872 CONTINUE
1477  DO 874 p = 1, nr
1478  u(p,q) = work(2*n+n*nr+nr+p)
1479  874 CONTINUE
1480  873 CONTINUE
1481  IF ( nr .LT. n ) THEN
1482  CALL slaset( 'A',n-nr,nr,zero,zero,v(nr+1,1),ldv )
1483  CALL slaset( 'A',nr,n-nr,zero,zero,v(1,nr+1),ldv )
1484  CALL slaset( 'A',n-nr,n-nr,zero,one,v(nr+1,nr+1),ldv )
1485  END IF
1486  CALL sormqr( 'L','N',n,n,nr,work(2*n+1),n,work(n+1),
1487  \$ v,ldv,work(2*n+n*nr+nr+1),lwork-2*n-n*nr-nr,ierr )
1488  ELSE
1489 * Last line of defense.
1490 * #:( This is a rather pathological case: no scaled condition
1491 * improvement after two pivoted QR factorizations. Other
1492 * possibility is that the rank revealing QR factorization
1493 * or the condition estimator has failed, or the COND_OK
1494 * is set very close to ONE (which is unnecessary). Normally,
1495 * this branch should never be executed, but in rare cases of
1496 * failure of the RRQR or condition estimator, the last line of
1497 * defense ensures that SGEJSV completes the task.
1498 * Compute the full SVD of L3 using SGESVJ with explicit
1499 * accumulation of Jacobi rotations.
1500  CALL sgesvj( 'L', 'U', 'V', nr, nr, v, ldv, sva, nr, u,
1501  \$ ldu, work(2*n+n*nr+nr+1), lwork-2*n-n*nr-nr, info )
1502  scalem = work(2*n+n*nr+nr+1)
1503  numrank = nint(work(2*n+n*nr+nr+2))
1504  IF ( nr .LT. n ) THEN
1505  CALL slaset( 'A',n-nr,nr,zero,zero,v(nr+1,1),ldv )
1506  CALL slaset( 'A',nr,n-nr,zero,zero,v(1,nr+1),ldv )
1507  CALL slaset( 'A',n-nr,n-nr,zero,one,v(nr+1,nr+1),ldv )
1508  END IF
1509  CALL sormqr( 'L','N',n,n,nr,work(2*n+1),n,work(n+1),
1510  \$ v,ldv,work(2*n+n*nr+nr+1),lwork-2*n-n*nr-nr,ierr )
1511 *
1512  CALL sormlq( 'L', 'T', nr, nr, nr, work(2*n+1), n,
1513  \$ work(2*n+n*nr+1), u, ldu, work(2*n+n*nr+nr+1),
1514  \$ lwork-2*n-n*nr-nr, ierr )
1515  DO 773 q = 1, nr
1516  DO 772 p = 1, nr
1517  work(2*n+n*nr+nr+iwork(n+p)) = u(p,q)
1518  772 CONTINUE
1519  DO 774 p = 1, nr
1520  u(p,q) = work(2*n+n*nr+nr+p)
1521  774 CONTINUE
1522  773 CONTINUE
1523 *
1524  END IF
1525 *
1526 * Permute the rows of V using the (column) permutation from the
1527 * first QRF. Also, scale the columns to make them unit in
1528 * Euclidean norm. This applies to all cases.
1529 *
1530  temp1 = sqrt(float(n)) * epsln
1531  DO 1972 q = 1, n
1532  DO 972 p = 1, n
1533  work(2*n+n*nr+nr+iwork(p)) = v(p,q)
1534  972 CONTINUE
1535  DO 973 p = 1, n
1536  v(p,q) = work(2*n+n*nr+nr+p)
1537  973 CONTINUE
1538  xsc = one / snrm2( n, v(1,q), 1 )
1539  IF ( (xsc .LT. (one-temp1)) .OR. (xsc .GT. (one+temp1)) )
1540  \$ CALL sscal( n, xsc, v(1,q), 1 )
1541  1972 CONTINUE
1542 * At this moment, V contains the right singular vectors of A.
1543 * Next, assemble the left singular vector matrix U (M x N).
1544  IF ( nr .LT. m ) THEN
1545  CALL slaset( 'A', m-nr, nr, zero, zero, u(nr+1,1), ldu )
1546  IF ( nr .LT. n1 ) THEN
1547  CALL slaset('A',nr,n1-nr,zero,zero,u(1,nr+1),ldu)
1548  CALL slaset('A',m-nr,n1-nr,zero,one,u(nr+1,nr+1),ldu)
1549  END IF
1550  END IF
1551 *
1552 * The Q matrix from the first QRF is built into the left singular
1553 * matrix U. This applies to all cases.
1554 *
1555  CALL sormqr( 'Left', 'No_Tr', m, n1, n, a, lda, work, u,
1556  \$ ldu, work(n+1), lwork-n, ierr )
1557
1558 * The columns of U are normalized. The cost is O(M*N) flops.
1559  temp1 = sqrt(float(m)) * epsln
1560  DO 1973 p = 1, nr
1561  xsc = one / snrm2( m, u(1,p), 1 )
1562  IF ( (xsc .LT. (one-temp1)) .OR. (xsc .GT. (one+temp1)) )
1563  \$ CALL sscal( m, xsc, u(1,p), 1 )
1564  1973 CONTINUE
1565 *
1566 * If the initial QRF is computed with row pivoting, the left
1567 * singular vectors must be adjusted.
1568 *
1569  IF ( rowpiv )
1570  \$ CALL slaswp( n1, u, ldu, 1, m-1, iwork(2*n+1), -1 )
1571 *
1572  ELSE
1573 *
1574 * .. the initial matrix A has almost orthogonal columns and
1575 * the second QRF is not needed
1576 *
1577  CALL slacpy( 'Upper', n, n, a, lda, work(n+1), n )
1578  IF ( l2pert ) THEN
1579  xsc = sqrt(small)
1580  DO 5970 p = 2, n
1581  temp1 = xsc * work( n + (p-1)*n + p )
1582  DO 5971 q = 1, p - 1
1583  work(n+(q-1)*n+p)=-sign(temp1,work(n+(p-1)*n+q))
1584  5971 CONTINUE
1585  5970 CONTINUE
1586  ELSE
1587  CALL slaset( 'Lower',n-1,n-1,zero,zero,work(n+2),n )
1588  END IF
1589 *
1590  CALL sgesvj( 'Upper', 'U', 'N', n, n, work(n+1), n, sva,
1591  \$ n, u, ldu, work(n+n*n+1), lwork-n-n*n, info )
1592 *
1593  scalem = work(n+n*n+1)
1594  numrank = nint(work(n+n*n+2))
1595  DO 6970 p = 1, n
1596  CALL scopy( n, work(n+(p-1)*n+1), 1, u(1,p), 1 )
1597  CALL sscal( n, sva(p), work(n+(p-1)*n+1), 1 )
1598  6970 CONTINUE
1599 *
1600  CALL strsm( 'Left', 'Upper', 'NoTrans', 'No UD', n, n,
1601  \$ one, a, lda, work(n+1), n )
1602  DO 6972 p = 1, n
1603  CALL scopy( n, work(n+p), n, v(iwork(p),1), ldv )
1604  6972 CONTINUE
1605  temp1 = sqrt(float(n))*epsln
1606  DO 6971 p = 1, n
1607  xsc = one / snrm2( n, v(1,p), 1 )
1608  IF ( (xsc .LT. (one-temp1)) .OR. (xsc .GT. (one+temp1)) )
1609  \$ CALL sscal( n, xsc, v(1,p), 1 )
1610  6971 CONTINUE
1611 *
1612 * Assemble the left singular vector matrix U (M x N).
1613 *
1614  IF ( n .LT. m ) THEN
1615  CALL slaset( 'A', m-n, n, zero, zero, u(n+1,1), ldu )
1616  IF ( n .LT. n1 ) THEN
1617  CALL slaset( 'A',n, n1-n, zero, zero, u(1,n+1),ldu )
1618  CALL slaset( 'A',m-n,n1-n, zero, one,u(n+1,n+1),ldu )
1619  END IF
1620  END IF
1621  CALL sormqr( 'Left', 'No Tr', m, n1, n, a, lda, work, u,
1622  \$ ldu, work(n+1), lwork-n, ierr )
1623  temp1 = sqrt(float(m))*epsln
1624  DO 6973 p = 1, n1
1625  xsc = one / snrm2( m, u(1,p), 1 )
1626  IF ( (xsc .LT. (one-temp1)) .OR. (xsc .GT. (one+temp1)) )
1627  \$ CALL sscal( m, xsc, u(1,p), 1 )
1628  6973 CONTINUE
1629 *
1630  IF ( rowpiv )
1631  \$ CALL slaswp( n1, u, ldu, 1, m-1, iwork(2*n+1), -1 )
1632 *
1633  END IF
1634 *
1635 * end of the >> almost orthogonal case << in the full SVD
1636 *
1637  ELSE
1638 *
1639 * This branch deploys a preconditioned Jacobi SVD with explicitly
1640 * accumulated rotations. It is included as optional, mainly for
1641 * experimental purposes. It does perfom well, and can also be used.
1642 * In this implementation, this branch will be automatically activated
1643 * if the condition number sigma_max(A) / sigma_min(A) is predicted
1644 * to be greater than the overflow threshold. This is because the
1645 * a posteriori computation of the singular vectors assumes robust
1646 * implementation of BLAS and some LAPACK procedures, capable of working
1647 * in presence of extreme values. Since that is not always the case, ...
1648 *
1649  DO 7968 p = 1, nr
1650  CALL scopy( n-p+1, a(p,p), lda, v(p,p), 1 )
1651  7968 CONTINUE
1652 *
1653  IF ( l2pert ) THEN
1654  xsc = sqrt(small/epsln)
1655  DO 5969 q = 1, nr
1656  temp1 = xsc*abs( v(q,q) )
1657  DO 5968 p = 1, n
1658  IF ( ( p .GT. q ) .AND. ( abs(v(p,q)) .LE. temp1 )
1659  \$ .OR. ( p .LT. q ) )
1660  \$ v(p,q) = sign( temp1, v(p,q) )
1661  IF ( p .LT. q ) v(p,q) = - v(p,q)
1662  5968 CONTINUE
1663  5969 CONTINUE
1664  ELSE
1665  CALL slaset( 'U', nr-1, nr-1, zero, zero, v(1,2), ldv )
1666  END IF
1667
1668  CALL sgeqrf( n, nr, v, ldv, work(n+1), work(2*n+1),
1669  \$ lwork-2*n, ierr )
1670  CALL slacpy( 'L', n, nr, v, ldv, work(2*n+1), n )
1671 *
1672  DO 7969 p = 1, nr
1673  CALL scopy( nr-p+1, v(p,p), ldv, u(p,p), 1 )
1674  7969 CONTINUE
1675
1676  IF ( l2pert ) THEN
1677  xsc = sqrt(small/epsln)
1678  DO 9970 q = 2, nr
1679  DO 9971 p = 1, q - 1
1680  temp1 = xsc * min(abs(u(p,p)),abs(u(q,q)))
1681  u(p,q) = - sign( temp1, u(q,p) )
1682  9971 CONTINUE
1683  9970 CONTINUE
1684  ELSE
1685  CALL slaset('U', nr-1, nr-1, zero, zero, u(1,2), ldu )
1686  END IF
1687
1688  CALL sgesvj( 'L', 'U', 'V', nr, nr, u, ldu, sva,
1689  \$ n, v, ldv, work(2*n+n*nr+1), lwork-2*n-n*nr, info )
1690  scalem = work(2*n+n*nr+1)
1691  numrank = nint(work(2*n+n*nr+2))
1692
1693  IF ( nr .LT. n ) THEN
1694  CALL slaset( 'A',n-nr,nr,zero,zero,v(nr+1,1),ldv )
1695  CALL slaset( 'A',nr,n-nr,zero,zero,v(1,nr+1),ldv )
1696  CALL slaset( 'A',n-nr,n-nr,zero,one,v(nr+1,nr+1),ldv )
1697  END IF
1698
1699  CALL sormqr( 'L','N',n,n,nr,work(2*n+1),n,work(n+1),
1700  \$ v,ldv,work(2*n+n*nr+nr+1),lwork-2*n-n*nr-nr,ierr )
1701 *
1702 * Permute the rows of V using the (column) permutation from the
1703 * first QRF. Also, scale the columns to make them unit in
1704 * Euclidean norm. This applies to all cases.
1705 *
1706  temp1 = sqrt(float(n)) * epsln
1707  DO 7972 q = 1, n
1708  DO 8972 p = 1, n
1709  work(2*n+n*nr+nr+iwork(p)) = v(p,q)
1710  8972 CONTINUE
1711  DO 8973 p = 1, n
1712  v(p,q) = work(2*n+n*nr+nr+p)
1713  8973 CONTINUE
1714  xsc = one / snrm2( n, v(1,q), 1 )
1715  IF ( (xsc .LT. (one-temp1)) .OR. (xsc .GT. (one+temp1)) )
1716  \$ CALL sscal( n, xsc, v(1,q), 1 )
1717  7972 CONTINUE
1718 *
1719 * At this moment, V contains the right singular vectors of A.
1720 * Next, assemble the left singular vector matrix U (M x N).
1721 *
1722  IF ( nr .LT. m ) THEN
1723  CALL slaset( 'A', m-nr, nr, zero, zero, u(nr+1,1), ldu )
1724  IF ( nr .LT. n1 ) THEN
1725  CALL slaset( 'A',nr, n1-nr, zero, zero, u(1,nr+1),ldu )
1726  CALL slaset( 'A',m-nr,n1-nr, zero, one,u(nr+1,nr+1),ldu )
1727  END IF
1728  END IF
1729 *
1730  CALL sormqr( 'Left', 'No Tr', m, n1, n, a, lda, work, u,
1731  \$ ldu, work(n+1), lwork-n, ierr )
1732 *
1733  IF ( rowpiv )
1734  \$ CALL slaswp( n1, u, ldu, 1, m-1, iwork(2*n+1), -1 )
1735 *
1736 *
1737  END IF
1738  IF ( transp ) THEN
1739 * .. swap U and V because the procedure worked on A^t
1740  DO 6974 p = 1, n
1741  CALL sswap( n, u(1,p), 1, v(1,p), 1 )
1742  6974 CONTINUE
1743  END IF
1744 *
1745  END IF
1746 * end of the full SVD
1747 *
1748 * Undo scaling, if necessary (and possible)
1749 *
1750  IF ( uscal2 .LE. (big/sva(1))*uscal1 ) THEN
1751  CALL slascl( 'G', 0, 0, uscal1, uscal2, nr, 1, sva, n, ierr )
1752  uscal1 = one
1753  uscal2 = one
1754  END IF
1755 *
1756  IF ( nr .LT. n ) THEN
1757  DO 3004 p = nr+1, n
1758  sva(p) = zero
1759  3004 CONTINUE
1760  END IF
1761 *
1762  work(1) = uscal2 * scalem
1763  work(2) = uscal1
1764  IF ( errest ) work(3) = sconda
1765  IF ( lsvec .AND. rsvec ) THEN
1766  work(4) = condr1
1767  work(5) = condr2
1768  END IF
1769  IF ( l2tran ) THEN
1770  work(6) = entra
1771  work(7) = entrat
1772  END IF
1773 *
1774  iwork(1) = nr
1775  iwork(2) = numrank
1776  iwork(3) = warning
1777 *
1778  RETURN
1779 * ..
1780 * .. END OF SGEJSV
1781 * ..
subroutine slassq(N, X, INCX, SCALE, SUMSQ)
SLASSQ updates a sum of squares represented in scaled form.
Definition: slassq.f:105
subroutine sgesvj(JOBA, JOBU, JOBV, M, N, A, LDA, SVA, MV, V, LDV, WORK, LWORK, INFO)
SGESVJ
Definition: sgesvj.f:325
subroutine sormqr(SIDE, TRANS, M, N, K, A, LDA, TAU, C, LDC, WORK, LWORK, INFO)
SORMQR
Definition: sormqr.f:170
subroutine strsm(SIDE, UPLO, TRANSA, DIAG, M, N, ALPHA, A, LDA, B, LDB)
STRSM
Definition: strsm.f:183
integer function isamax(N, SX, INCX)
ISAMAX
Definition: isamax.f:53
subroutine slascl(TYPE, KL, KU, CFROM, CTO, M, N, A, LDA, INFO)
SLASCL multiplies a general rectangular matrix by a real scalar defined as cto/cfrom.
Definition: slascl.f:145
subroutine sgeqrf(M, N, A, LDA, TAU, WORK, LWORK, INFO)
SGEQRF
Definition: sgeqrf.f:138
subroutine xerbla(SRNAME, INFO)
XERBLA
Definition: xerbla.f:62
subroutine slacpy(UPLO, M, N, A, LDA, B, LDB)
SLACPY copies all or part of one two-dimensional array to another.
Definition: slacpy.f:105
subroutine slaswp(N, A, LDA, K1, K2, IPIV, INCX)
SLASWP performs a series of row interchanges on a general rectangular matrix.
Definition: slaswp.f:116
subroutine slaset(UPLO, M, N, ALPHA, BETA, A, LDA)
SLASET initializes the off-diagonal elements and the diagonal elements of a matrix to given values...
Definition: slaset.f:112
subroutine sormlq(SIDE, TRANS, M, N, K, A, LDA, TAU, C, LDC, WORK, LWORK, INFO)
SORMLQ
Definition: sormlq.f:170
real function snrm2(N, X, INCX)
SNRM2
Definition: snrm2.f:56
subroutine spocon(UPLO, N, A, LDA, ANORM, RCOND, WORK, IWORK, INFO)
SPOCON
Definition: spocon.f:123
subroutine sgeqp3(M, N, A, LDA, JPVT, TAU, WORK, LWORK, INFO)
SGEQP3
Definition: sgeqp3.f:153
subroutine sscal(N, SA, SX, INCX)
SSCAL
Definition: sscal.f:55
subroutine sswap(N, SX, INCX, SY, INCY)
SSWAP
Definition: sswap.f:53
real function slamch(CMACH)
SLAMCH
Definition: slamch.f:69
subroutine sorgqr(M, N, K, A, LDA, TAU, WORK, LWORK, INFO)
SORGQR
Definition: sorgqr.f:130
logical function lsame(CA, CB)
LSAME
Definition: lsame.f:55
subroutine scopy(N, SX, INCX, SY, INCY)
SCOPY
Definition: scopy.f:53
subroutine sgelqf(M, N, A, LDA, TAU, WORK, LWORK, INFO)
SGELQF
Definition: sgelqf.f:137

Here is the call graph for this function:

Here is the caller graph for this function: