sla_gerfsx_extended()

subroutine sla_gerfsx_extended	(	integer	prec_type,
		integer	trans_type,
		integer	n,
		integer	nrhs,
		real, dimension( lda, * )	a,
		integer	lda,
		real, dimension( ldaf, * )	af,
		integer	ldaf,
		integer, dimension( * )	ipiv,
		logical	colequ,
		real, dimension( * )	c,
		real, dimension( ldb, * )	b,
		integer	ldb,
		real, dimension( ldy, * )	y,
		integer	ldy,
		real, dimension( * )	berr_out,
		integer	n_norms,
		real, dimension( nrhs, * )	errs_n,
		real, dimension( nrhs, * )	errs_c,
		real, dimension( * )	res,
		real, dimension( * )	ayb,
		real, dimension( * )	dy,
		real, dimension( * )	y_tail,
		real	rcond,
		integer	ithresh,
		real	rthresh,
		real	dz_ub,
		logical	ignore_cwise,
		integer	info
	)

SLA_GERFSX_EXTENDED improves the computed solution to a system of linear equations for general matrices by performing extra-precise iterative refinement and provides error bounds and backward error estimates for the solution.


Purpose:

 SLA_GERFSX_EXTENDED improves the computed solution to a system of
 linear equations by performing extra-precise iterative refinement
 and provides error bounds and backward error estimates for the solution.
 This subroutine is called by SGERFSX to perform iterative refinement.
 In addition to the normwise error bound, the code provides the maximum
 componentwise error bound when possible. See the comments for ERRS_N
 and ERRS_C for details of the error bounds. Note that this
 subroutine is only responsible for setting the second field (err = 2)
 of ERRS_N and ERRS_C.
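
The refinement idea described above can be illustrated with a small sketch. This is not the LAPACK implementation: it assumes a 2-by-2 system, uses a hypothetical `solve2_single` (Cramer's rule rounded to single precision) in place of SGETRS, and uses Python's double-precision floats as the "extra" precision for the residual. The names `f32`, `solve2_single`, `refine`, and `relative_error` are illustrative only.

```python
import struct
from fractions import Fraction

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def solve2_single(a, r):
    """Hypothetical stand-in for SGETRS: 2x2 Cramer's rule, every operation rounded to single."""
    det = f32(f32(a[0][0] * a[1][1]) - f32(a[0][1] * a[1][0]))
    x0 = f32(f32(f32(r[0] * a[1][1]) - f32(a[0][1] * r[1])) / det)
    x1 = f32(f32(f32(a[0][0] * r[1]) - f32(r[0] * a[1][0])) / det)
    return [x0, x1]

def refine(a, b, y, ithresh=10, rthresh=0.5, eps=2.0 ** -24):
    """Refine y in place; return the number of residual computations used."""
    prev_normdx = float("inf")
    for cnt in range(1, ithresh + 1):
        # Residual b - A*y accumulated in double, i.e. extra precision
        # relative to the single-precision data.
        res = [b[i] - (a[i][0] * y[0] + a[i][1] * y[1]) for i in range(2)]
        dy = solve2_single(a, [f32(v) for v in res])
        normdx = max(abs(v) for v in dy)
        normx = max(abs(v) for v in y)
        converged = normdx <= eps * normx           # step is below working precision
        stalled = normdx > rthresh * prev_normdx    # step no longer shrinking fast enough
        if converged or stalled:
            return cnt
        y[:] = [f32(y[i] + dy[i]) for i in range(2)]
        prev_normdx = normdx
    return ithresh

def relative_error(a, b, y):
    """Normwise relative error of y against the exact rational solution."""
    fa = [[Fraction(v) for v in row] for row in a]
    fb = [Fraction(v) for v in b]
    det = fa[0][0] * fa[1][1] - fa[0][1] * fa[1][0]
    xt = [(fb[0] * fa[1][1] - fa[0][1] * fb[1]) / det,
          (fa[0][0] * fb[1] - fb[0] * fa[1][0]) / det]
    err = max(abs(Fraction(y[i]) - xt[i]) for i in range(2))
    return float(err / max(abs(v) for v in xt))

# A mildly ill-conditioned system whose entries are exact single-precision values.
A = [[1.0, f32(0.99)], [f32(0.99), f32(0.98)]]
b = [f32(1.99), f32(1.97)]
y = solve2_single(A, b)
err_before = relative_error(A, b, y)
count = refine(A, b, y)
err_after = relative_error(A, b, y)
```

Comparing `err_before` with `err_after` shows the effect of refinement on this example. The real routine additionally tracks componentwise convergence and can escalate to a doubled-precision solution (Y plus Y_TAIL), which this sketch omits.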

Parameters

[in]	PREC_TYPE	PREC_TYPE is INTEGER Specifies the intermediate precision to be used in refinement. The value is defined by ILAPREC(P), where P is a CHARACTER: P = 'S': Single; P = 'D': Double; P = 'I': Indigenous; P = 'X' or 'E': Extra.
[in]	TRANS_TYPE	TRANS_TYPE is INTEGER Specifies the transposition operation on A. The value is defined by ILATRANS(T), where T is a CHARACTER: T = 'N': No transpose; T = 'T': Transpose; T = 'C': Conjugate transpose.
[in]	N	N is INTEGER The number of linear equations, i.e., the order of the matrix A. N >= 0.
[in]	NRHS	NRHS is INTEGER The number of right-hand-sides, i.e., the number of columns of the matrix B.
[in]	A	A is REAL array, dimension (LDA,N) On entry, the N-by-N matrix A.
[in]	LDA	LDA is INTEGER The leading dimension of the array A. LDA >= max(1,N).
[in]	AF	AF is REAL array, dimension (LDAF,N) The factors L and U from the factorization A = P*L*U as computed by SGETRF.
[in]	LDAF	LDAF is INTEGER The leading dimension of the array AF. LDAF >= max(1,N).
[in]	IPIV	IPIV is INTEGER array, dimension (N) The pivot indices from the factorization A = P*L*U as computed by SGETRF; row i of the matrix was interchanged with row IPIV(i).
[in]	COLEQU	COLEQU is LOGICAL If .TRUE. then column equilibration was done to A before calling this routine. This is needed to compute the solution and error bounds correctly.
[in]	C	C is REAL array, dimension (N) The column scale factors for A. If COLEQU = .FALSE., C is not accessed. If C is input, each element of C should be a power of the radix to ensure a reliable solution and error estimates. Scaling by powers of the radix does not cause rounding errors unless the result underflows or overflows. Rounding errors during scaling lead to refining with a matrix that is not equivalent to the input matrix, producing error estimates that may not be reliable.
[in]	B	B is REAL array, dimension (LDB,NRHS) The right-hand-side matrix B.
[in]	LDB	LDB is INTEGER The leading dimension of the array B. LDB >= max(1,N).
[in,out]	Y	Y is REAL array, dimension (LDY,NRHS) On entry, the solution matrix X, as computed by SGETRS. On exit, the improved solution matrix Y.
[in]	LDY	LDY is INTEGER The leading dimension of the array Y. LDY >= max(1,N).
[out]	BERR_OUT	BERR_OUT is REAL array, dimension (NRHS) On exit, BERR_OUT(j) contains the componentwise relative backward error for right-hand-side j from the formula max(i) ( abs(RES(i)) / ( abs(op(A_s))*abs(Y) + abs(B_s) )(i) ) where abs(Z) is the componentwise absolute value of the matrix or vector Z. This is computed by SLA_LIN_BERR.
[in]	N_NORMS	N_NORMS is INTEGER Determines which error bounds to return (see ERRS_N and ERRS_C). If N_NORMS >= 1 return normwise error bounds. If N_NORMS >= 2 return componentwise error bounds.
[in,out]	ERRS_N	ERRS_N is REAL array, dimension (NRHS, N_ERR_BNDS) For each right-hand side, this array contains information about various error bounds and condition numbers corresponding to the normwise relative error, which is defined as follows: Normwise relative error in the ith solution vector: max_j abs(XTRUE(j,i) - X(j,i)) / max_j abs(X(j,i)). The array is indexed by the type of error information as described below. There currently are up to three pieces of information returned. The first index in ERRS_N(i,:) corresponds to the ith right-hand side. The second index in ERRS_N(:,err) contains the following three fields: err = 1: "Trust/don't trust" boolean. Trust the answer if the reciprocal condition number is greater than the threshold sqrt(n) * slamch('Epsilon'). err = 2: "Guaranteed" error bound. The estimated forward error, almost certainly within a factor of 10 of the true error so long as the next entry is greater than the threshold sqrt(n) * slamch('Epsilon'). This error bound should only be trusted if the previous boolean is true. err = 3: Reciprocal condition number. Estimated normwise reciprocal condition number, compared with the threshold sqrt(n) * slamch('Epsilon') to determine if the error estimate is "guaranteed". These reciprocal condition numbers are 1 / (norm(Z^{-1},inf) * norm(Z,inf)) for some appropriately scaled matrix Z. Let Z = S*A, where S scales each row by a power of the radix so all absolute row sums of Z are approximately 1. This subroutine is only responsible for setting the second field above. See LAPACK Working Note 165 for further details and extra cautions.
[in,out]	ERRS_C	ERRS_C is REAL array, dimension (NRHS, N_ERR_BNDS) For each right-hand side, this array contains information about various error bounds and condition numbers corresponding to the componentwise relative error, which is defined as follows: Componentwise relative error in the ith solution vector: max_j ( abs(XTRUE(j,i) - X(j,i)) / abs(X(j,i)) ). The array is indexed by the right-hand side i (on which the componentwise relative error depends), and the type of error information as described below. There currently are up to three pieces of information returned for each right-hand side. If componentwise accuracy is not requested (PARAMS(3) = 0.0), then ERRS_C is not accessed. If N_ERR_BNDS < 3, then at most the first (:,N_ERR_BNDS) entries are returned. The first index in ERRS_C(i,:) corresponds to the ith right-hand side. The second index in ERRS_C(:,err) contains the following three fields: err = 1: "Trust/don't trust" boolean. Trust the answer if the reciprocal condition number is greater than the threshold sqrt(n) * slamch('Epsilon'). err = 2: "Guaranteed" error bound. The estimated forward error, almost certainly within a factor of 10 of the true error so long as the next entry is greater than the threshold sqrt(n) * slamch('Epsilon'). This error bound should only be trusted if the previous boolean is true. err = 3: Reciprocal condition number. Estimated componentwise reciprocal condition number, compared with the threshold sqrt(n) * slamch('Epsilon') to determine if the error estimate is "guaranteed". These reciprocal condition numbers are 1 / (norm(Z^{-1},inf) * norm(Z,inf)) for some appropriately scaled matrix Z. Let Z = S*(A*diag(x)), where x is the solution for the current right-hand side and S scales each row of A*diag(x) by a power of the radix so all absolute row sums of Z are approximately 1. This subroutine is only responsible for setting the second field above. See LAPACK Working Note 165 for further details and extra cautions.
[in]	RES	RES is REAL array, dimension (N) Workspace to hold the intermediate residual.
[in]	AYB	AYB is REAL array, dimension (N) Workspace. This can be the same workspace passed for Y_TAIL.
[in]	DY	DY is REAL array, dimension (N) Workspace to hold the intermediate solution.
[in]	Y_TAIL	Y_TAIL is REAL array, dimension (N) Workspace to hold the trailing bits of the intermediate solution.
[in]	RCOND	RCOND is REAL Reciprocal scaled condition number. This is an estimate of the reciprocal Skeel condition number of the matrix A after equilibration (if done). If this is less than the machine precision (in particular, if it is zero), the matrix is singular to working precision. Note that the error may still be small even if this number is very small and the matrix appears ill-conditioned.
[in]	ITHRESH	ITHRESH is INTEGER The maximum number of residual computations allowed for refinement. The default is 10. For 'aggressive' refinement, set to 100 to permit convergence using approximate factorizations or factorizations other than LU. If the factorization uses a technique other than Gaussian elimination, the guarantees in ERRS_N and ERRS_C may no longer be trustworthy.
[in]	RTHRESH	RTHRESH is REAL Determines when to stop refinement if the error estimate stops decreasing. Refinement will stop when the next solution no longer satisfies norm(dx_{i+1}) < RTHRESH * norm(dx_i) where norm(Z) is the infinity norm of Z. RTHRESH satisfies 0 < RTHRESH <= 1. The default value is 0.5. For 'aggressive' refinement, set to 0.9 to permit convergence on extremely ill-conditioned matrices. See LAWN 165 for more details.
[in]	DZ_UB	DZ_UB is REAL Determines when to start considering componentwise convergence. Componentwise convergence is only considered after each component of the solution Y is stable, which we define as the relative change in each component being less than DZ_UB. The default value is 0.25, requiring the first bit to be stable. See LAWN 165 for more details.
[in]	IGNORE_CWISE	IGNORE_CWISE is LOGICAL If .TRUE. then ignore componentwise convergence. Default value is .FALSE..
[out]	INFO	INFO is INTEGER = 0: Successful exit. < 0: If INFO = -i, the ith argument to SGETRS had an illegal value.
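
The BERR_OUT formula and the trust threshold sqrt(n) * slamch('Epsilon') from the parameter descriptions can be sketched as follows. This is a simplified illustration under stated assumptions, not SLA_LIN_BERR itself: it uses dense Python lists, no transposition or scaling, skips rows whose denominator is exactly zero, and omits any safeguard added to the numerator. The function names `componentwise_backward_error` and `bound_is_trusted` are hypothetical, and the default `eps` of 2**-24 is an assumed single-precision value for slamch('Epsilon').

```python
import math

def componentwise_backward_error(a, y, b):
    """max_i abs(res_i) / (abs(A)*abs(y) + abs(b))_i for a dense n-by-n system."""
    n = len(b)
    berr = 0.0
    for i in range(n):
        # Residual component res_i = b_i - (A*y)_i.
        res_i = b[i] - sum(a[i][k] * y[k] for k in range(n))
        # Denominator component (abs(A)*abs(y) + abs(b))_i.
        ayb_i = sum(abs(a[i][k]) * abs(y[k]) for k in range(n)) + abs(b[i])
        if ayb_i != 0.0:
            berr = max(berr, abs(res_i) / ayb_i)
    return berr

def bound_is_trusted(rcond, n, eps=2.0 ** -24):
    """Trust an error bound only when rcond is not below sqrt(n) * eps."""
    return rcond >= math.sqrt(n) * eps

# Row 0 has residual 1 and denominator 2 + 3 = 5; row 1 is satisfied exactly.
berr = componentwise_backward_error([[2.0, 0.0], [0.0, 4.0]], [1.0, 1.0], [3.0, 4.0])
```

A small backward error says the computed Y exactly solves a nearby system; it does not by itself bound the forward error, which is why the reciprocal condition number is checked separately.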

Author: Univ. of Tennessee; Univ. of California Berkeley; Univ. of Colorado Denver; NAG Ltd.

Definition at line 391 of file sla_gerfsx_extended.f.

*
*  -- LAPACK computational routine --
*  -- LAPACK is a software package provided by Univ. of Tennessee,    --
*  -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
*
*     .. Scalar Arguments ..
      INTEGER            INFO, LDA, LDAF, LDB, LDY, N, NRHS, PREC_TYPE,
     $                   TRANS_TYPE, N_NORMS, ITHRESH
      LOGICAL            COLEQU, IGNORE_CWISE
      REAL               RTHRESH, DZ_UB
*     ..
*     .. Array Arguments ..
      INTEGER            IPIV( * )
      REAL               A( LDA, * ), AF( LDAF, * ), B( LDB, * ),
     $                   Y( LDY, * ), RES( * ), DY( * ), Y_TAIL( * )
      REAL               C( * ), AYB( * ), RCOND, BERR_OUT( * ),
     $                   ERRS_N( NRHS, * ),
     $                   ERRS_C( NRHS, * )
*     ..
*
*  =====================================================================
*
*     .. Local Scalars ..
      CHARACTER          TRANS
      INTEGER            CNT, I, J, X_STATE, Z_STATE, Y_PREC_STATE
      REAL               YK, DYK, YMIN, NORMY, NORMX, NORMDX, DXRAT,
     $                   DZRAT, PREVNORMDX, PREV_DZ_Z, DXRATMAX,
     $                   DZRATMAX, DX_X, DZ_Z, FINAL_DX_X, FINAL_DZ_Z,
     $                   EPS, HUGEVAL, INCR_THRESH
      LOGICAL            INCR_PREC
*     ..
*     .. Parameters ..
      INTEGER            UNSTABLE_STATE, WORKING_STATE, CONV_STATE,
     $                   NOPROG_STATE, BASE_RESIDUAL, EXTRA_RESIDUAL,
     $                   EXTRA_Y
      parameter( unstable_state = 0, working_state = 1,
     $                   conv_state = 2, noprog_state = 3 )
      parameter( base_residual = 0, extra_residual = 1,
     $                   extra_y = 2 )
      INTEGER            FINAL_NRM_ERR_I, FINAL_CMP_ERR_I, BERR_I
      INTEGER            RCOND_I, NRM_RCOND_I, NRM_ERR_I, CMP_RCOND_I
      INTEGER            CMP_ERR_I, PIV_GROWTH_I
      parameter( final_nrm_err_i = 1, final_cmp_err_i = 2,
     $                   berr_i = 3 )
      parameter( rcond_i = 4, nrm_rcond_i = 5, nrm_err_i = 6 )
      parameter( cmp_rcond_i = 7, cmp_err_i = 8,
     $                   piv_growth_i = 9 )
      INTEGER            LA_LINRX_ITREF_I, LA_LINRX_ITHRESH_I,
     $                   LA_LINRX_CWISE_I
      parameter( la_linrx_itref_i = 1,
     $                   la_linrx_ithresh_i = 2 )
      parameter( la_linrx_cwise_i = 3 )
      INTEGER            LA_LINRX_TRUST_I, LA_LINRX_ERR_I,
     $                   LA_LINRX_RCOND_I
      parameter( la_linrx_trust_i = 1, la_linrx_err_i = 2 )
      parameter( la_linrx_rcond_i = 3 )
*     ..
*     .. External Subroutines ..
      EXTERNAL           saxpy, scopy, sgetrs, sgemv, blas_sgemv_x,
     $                   blas_sgemv2_x, sla_geamv, sla_wwaddw, slamch,
     $                   chla_transtype, sla_lin_berr
      REAL               SLAMCH
      CHARACTER          CHLA_TRANSTYPE
*     ..
*     .. Intrinsic Functions ..
      INTRINSIC          abs, max, min
*     ..
*     .. Executable Statements ..
*
      IF ( info.NE.0 ) RETURN
      trans = chla_transtype(trans_type)
      eps = slamch( 'Epsilon' )
      hugeval = slamch( 'Overflow' )
*     Force HUGEVAL to Inf
      hugeval = hugeval * hugeval
*     Using HUGEVAL may lead to spurious underflows.
      incr_thresh = real( n ) * eps
*
      DO j = 1, nrhs
         y_prec_state = extra_residual
         IF ( y_prec_state .EQ. extra_y ) THEN
            DO i = 1, n
               y_tail( i ) = 0.0
            END DO
         END IF
 
         dxrat = 0.0
         dxratmax = 0.0
         dzrat = 0.0
         dzratmax = 0.0
         final_dx_x = hugeval
         final_dz_z = hugeval
         prevnormdx = hugeval
         prev_dz_z = hugeval
         dz_z = hugeval
         dx_x = hugeval
 
         x_state = working_state
         z_state = unstable_state
         incr_prec = .false.
 
         DO cnt = 1, ithresh
*
*         Compute residual RES = B_s - op(A_s) * Y,
*             op(A) = A, A**T, or A**H depending on TRANS (and type).
*
            CALL scopy( n, b( 1, j ), 1, res, 1 )
            IF ( y_prec_state .EQ. base_residual ) THEN
               CALL sgemv( trans, n, n, -1.0, a, lda, y( 1, j ), 1,
     $              1.0, res, 1 )
            ELSE IF ( y_prec_state .EQ. extra_residual ) THEN
               CALL blas_sgemv_x( trans_type, n, n, -1.0, a, lda,
     $              y( 1, j ), 1, 1.0, res, 1, prec_type )
            ELSE
               CALL blas_sgemv2_x( trans_type, n, n, -1.0, a, lda,
     $              y( 1, j ), y_tail, 1, 1.0, res, 1, prec_type )
            END IF
 
!        XXX: RES is no longer needed.
            CALL scopy( n, res, 1, dy, 1 )
            CALL sgetrs( trans, n, 1, af, ldaf, ipiv, dy, n, info )
*
*         Calculate relative changes DX_X, DZ_Z and ratios DXRAT, DZRAT.
*
            normx = 0.0
            normy = 0.0
            normdx = 0.0
            dz_z = 0.0
            ymin = hugeval
*
            DO i = 1, n
               yk = abs( y( i, j ) )
               dyk = abs( dy( i ) )
 
               IF ( yk .NE. 0.0 ) THEN
                  dz_z = max( dz_z, dyk / yk )
               ELSE IF ( dyk .NE. 0.0 ) THEN
                  dz_z = hugeval
               END IF
 
               ymin = min( ymin, yk )
 
               normy = max( normy, yk )
 
               IF ( colequ ) THEN
                  normx = max( normx, yk * c( i ) )
                  normdx = max( normdx, dyk * c( i ) )
               ELSE
                  normx = normy
                  normdx = max( normdx, dyk )
               END IF
            END DO
 
            IF ( normx .NE. 0.0 ) THEN
               dx_x = normdx / normx
            ELSE IF ( normdx .EQ. 0.0 ) THEN
               dx_x = 0.0
            ELSE
               dx_x = hugeval
            END IF
 
            dxrat = normdx / prevnormdx
            dzrat = dz_z / prev_dz_z
*
*         Check termination criteria
*
            IF (.NOT.ignore_cwise
     $           .AND. ymin*rcond .LT. incr_thresh*normy
     $           .AND. y_prec_state .LT. extra_y)
     $           incr_prec = .true.
 
            IF ( x_state .EQ. noprog_state .AND. dxrat .LE. rthresh )
     $           x_state = working_state
            IF ( x_state .EQ. working_state ) THEN
               IF ( dx_x .LE. eps ) THEN
                  x_state = conv_state
               ELSE IF ( dxrat .GT. rthresh ) THEN
                  IF ( y_prec_state .NE. extra_y ) THEN
                     incr_prec = .true.
                  ELSE
                     x_state = noprog_state
                  END IF
               ELSE
                  IF ( dxrat .GT. dxratmax ) dxratmax = dxrat
               END IF
               IF ( x_state .GT. working_state ) final_dx_x = dx_x
            END IF
 
            IF ( z_state .EQ. unstable_state .AND. dz_z .LE. dz_ub )
     $           z_state = working_state
            IF ( z_state .EQ. noprog_state .AND. dzrat .LE. rthresh )
     $           z_state = working_state
            IF ( z_state .EQ. working_state ) THEN
               IF ( dz_z .LE. eps ) THEN
                  z_state = conv_state
               ELSE IF ( dz_z .GT. dz_ub ) THEN
                  z_state = unstable_state
                  dzratmax = 0.0
                  final_dz_z = hugeval
               ELSE IF ( dzrat .GT. rthresh ) THEN
                  IF ( y_prec_state .NE. extra_y ) THEN
                     incr_prec = .true.
                  ELSE
                     z_state = noprog_state
                  END IF
               ELSE
                  IF ( dzrat .GT. dzratmax ) dzratmax = dzrat
               END IF
               IF ( z_state .GT. working_state ) final_dz_z = dz_z
            END IF
*
*           Exit if both normwise and componentwise stopped working,
*           but if componentwise is unstable, let it go at least two
*           iterations.
*
            IF ( x_state.NE.working_state ) THEN
               IF ( ignore_cwise) GOTO 666
               IF ( z_state.EQ.noprog_state .OR. z_state.EQ.conv_state )
     $              GOTO 666
               IF ( z_state.EQ.unstable_state .AND. cnt.GT.1 ) GOTO 666
            END IF
 
            IF ( incr_prec ) THEN
               incr_prec = .false.
               y_prec_state = y_prec_state + 1
               DO i = 1, n
                  y_tail( i ) = 0.0
               END DO
            END IF
 
            prevnormdx = normdx
            prev_dz_z = dz_z
*
*           Update solution.
*
            IF ( y_prec_state .LT. extra_y ) THEN
               CALL saxpy( n, 1.0, dy, 1, y( 1, j ), 1 )
            ELSE
               CALL sla_wwaddw( n, y( 1, j ), y_tail, dy )
            END IF
 
         END DO
*        Target of "IF (Z_STOP .AND. X_STOP)".  Sun's f77 won't EXIT.
 666     CONTINUE
*
*     Set final_* when cnt hits ithresh.
*
         IF ( x_state .EQ. working_state ) final_dx_x = dx_x
         IF ( z_state .EQ. working_state ) final_dz_z = dz_z
*
*     Compute error bounds
*
         IF (n_norms .GE. 1) THEN
            errs_n( j, la_linrx_err_i ) =
     $           final_dx_x / (1 - dxratmax)
         END IF
         IF ( n_norms .GE. 2 ) THEN
            errs_c( j, la_linrx_err_i ) =
     $           final_dz_z / (1 - dzratmax)
         END IF
*
*     Compute componentwise relative backward error from formula
*         max(i) ( abs(R(i)) / ( abs(op(A_s))*abs(Y) + abs(B_s) )(i) )
*     where abs(Z) is the componentwise absolute value of the matrix
*     or vector Z.
*
*         Compute residual RES = B_s - op(A_s) * Y,
*             op(A) = A, A**T, or A**H depending on TRANS (and type).
*
         CALL scopy( n, b( 1, j ), 1, res, 1 )
         CALL sgemv( trans, n, n, -1.0, a, lda, y(1,j), 1, 1.0, res, 1 )
 
         DO i = 1, n
            ayb( i ) = abs( b( i, j ) )
         END DO
*
*     Compute abs(op(A_s))*abs(Y) + abs(B_s).
*
         CALL sla_geamv ( trans_type, n, n, 1.0,
     $        a, lda, y(1, j), 1, 1.0, ayb, 1 )
 
         CALL sla_lin_berr ( n, n, 1, res, ayb, berr_out( j ) )
*
*     End of loop for each RHS.
*
      END DO
*
      RETURN
*
*     End of SLA_GERFSX_EXTENDED
*
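
The normwise part of the termination logic in the listing above, and the final bound FINAL_DX_X / (1 - DXRATMAX), can be replayed on a list of step norms. The sketch below is a simplification under stated assumptions: it behaves as if IGNORE_CWISE were .TRUE., never escalates precision (a stalled step goes straight to the no-progress state), and takes the norm of the solution as a fixed `normx`. The function name `normwise_bound` and its arguments are hypothetical. One way to read the divisor is as a geometric-series estimate: if each step shrinks by at most a factor DXRATMAX, the remaining steps sum to at most the last step divided by (1 - DXRATMAX).

```python
WORKING, CONV, NOPROG = 1, 2, 3

def normwise_bound(dx_history, normx=1.0, rthresh=0.5, eps=2.0 ** -24):
    """Replay infinity norms of successive corrections; return (state, bound)."""
    state = WORKING
    prev_normdx = float("inf")   # plays the role of HUGEVAL forced to Inf
    dxratmax = 0.0
    dx_x = float("inf")
    final_dx_x = float("inf")
    for normdx in dx_history:
        dx_x = normdx / normx
        dxrat = normdx / prev_normdx   # 0.0 on the first pass
        if dx_x <= eps:
            state = CONV               # correction is below working precision
        elif dxrat > rthresh:
            state = NOPROG             # corrections stopped shrinking fast enough
        else:
            dxratmax = max(dxratmax, dxrat)
        if state != WORKING:
            final_dx_x = dx_x
            break
        prev_normdx = normdx
    if state == WORKING:
        # Iteration limit reached while still making progress.
        final_dx_x = dx_x
    return state, final_dx_x / (1.0 - dxratmax)

state_ok, bound_ok = normwise_bound([1e-2, 1e-4, 1e-6, 1e-9])
state_bad, bound_bad = normwise_bound([1e-2, 8e-3])
```

In the first call the corrections shrink by a factor of about 0.01 per step and the last one falls below `eps`, so the state is `CONV` and the bound is slightly larger than the last relative step. In the second call the ratio 0.8 exceeds `rthresh`, so the state is `NOPROG` and the bound equals the last relative step.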
