C ALGORITHM 729, COLLECTED ALGORITHMS FROM ACM. C THIS WORK PUBLISHED IN TRANSACTIONS ON MATHEMATICAL SOFTWARE, C VOL. 20, NO. 1, MARCH, 1994, P. 160. This Readme-file dated: 10.15.90 This floppy disk contains all the necessary Fortran main programs and subroutines for the extended Levinson algorithms for symmetric and general Toeplitz systems. The programs are collected in one file, which includes two main test-programs and all the necessary subroutines. They appear in alphabetical order. The main test programs are: dsytst - tests the subroutines for symmetric Toeplitz matrices, dgetst - tests the subroutines for general Toeplitz matrices. There are six subroutines for solving symmetric and general Toeplitz systems with the classical and the extended Levinson algorithms: dsytcl - solves a symmetric Toeplitz system using the classical Levinson algorithm, including algorithm and matrix condition estimation, dsyte2 - solves a symmetric Toeplitz system using the extended Levinson algorithm; this subroutine is designed for block steps of size 1 and 2 only, dsytep - solves a symmetric Toeplitz system using the extended Levinson algorithm; this subroutine allows a general, user-specified, maximum block size, dgetcl - solves a general Toeplitz system using the classical Levinson algorithm, including algorithm and matrix condition estimation, dgete2 - solves a general Toeplitz system using the extended Levinson algorithm; this subroutine is designed for block steps of size 1 and 2 only, dgetep - solves a general Toeplitz system using the extended Levinson algorithm; this subroutine is designed for user-specified, maximum block size. All six subroutines are documented in the companion report: P. C. Hansen & T. F. Chan, "Fortran Subroutines for General Toeplitz Systems", Report CAM-90-21, Dept. of Mathematics, UCLA, September 1990; submitted to ACM Trans. Math. Software. There is one subroutine from the Argonne Toeplitz package: tsld1 - solves a general Toeplitz system using the classical Levinson algorithm. This subroutine is described in: O. B. Arushanian et al., "The Toeplitz Package Users' Guide", Report ANL-83-16, Mathematics and Computer Science Division, Argonne National Laboratory; October 1983. There is one subroutine for generating random numbers: urand - from G. E. Forsythe, M. A. Malcolm & C. B. Moler, "Computer Methods for Mathematical Computations", Prentice-Hall, 1977. There are 2 subroutines from LINPACK: dgefa - computes the LU-factorization of a general matrix, dgesl - solves a system of linear algebraic equations, using the LU-factorization from dgefa. These subroutines are described in: J. J. Dongarra et al., "LINPACK Users' Guide", SIAM, 1979. There are several subroutines from the Level 1 BLAS as listed in the LINPACK Users' Guide. These BLAS are collected in one file: dblas - contains all the necessary Level 1 BLAS. There are 9 miscellaneous utility subroutines written by us: dgesm - estimates the smallest singular value of a general matrix, dgesv2 - estimates the smallest singular value of a general 2-by-2 matrix, dsysv2 - estimates the smallest singular value of a symmetric 2-by-2 matrix, dgesl2 - solves a general linear system of two equations with two unknowns, dtplz - given the first row and column of a Toeplitz matrix, this subroutine generates the complete Toeplitz matrix, dsykms - generates the symmetric KMS test matrices from W. F. Trench, "Numerical Solution of the Eigenvalue Problem for Hermitian Toeplitz Matrices", SIAM J. Matrix Anal. Appl. 10 (1989), 135-146, dsyswe - generates the symmetric test matrix from D. R. Sweet, "The Use of pivoting to improve the numerical performance of Toeplitz solvers", SPIE Vol. 696 Advanced Algorithms and Architectures for Signal Processing (1986), dgeswe - generates the general test matrices from the same paper, timing - a subroutine wherein the user must put a call to the specific timing subroutine on the particular computer. These subroutines are described in the above-mentioned manuscript by P. C. Hansen and T. F. Chan. To run the test programs, simply modify the subroutine timing, compile the programs, and execute them. They do not require any input. The results are written on the standard output. The test programs compare our subroutines with the one from the Toeplitz package, concerning accuracy of the computed solution, and computational overhead involved in condition estimation, test for ill-conditioned leading principal submatrices, and performing block steps. Below are sample outputs from the above-mentioned main programs, produced on an Alliant FX/8 computer. ************************************************** * Numerical test of extended Levinson algorithms * * for symmetric Toeplitz matrices. * * Computer: Alliant FX/8. * ************************************************** Legend: 1 - tsld1 (from Argonne Toeplitz package) 2 - dsytcl, classical Levinson algorithm with condition estimation 3 - dsyte2, extended Levinson algorithm with fixed PMAX = 2 4 - dsytep, extended Levinson algorithm, general version with PMAX = 4 5 - dsytep, extended Levinson algorithm, general version with PMAX = 6 6 - dsytep, extended Levinson algorithm, general version with PMAX = 8 Test matrices with random elements: N case error CONDA CONDM PSTEPS time overhead --------------------------------------------------------------------- 512 1 2.82E-11 5.0 00.0% 512 2 1.97E-10 3.06E+05 2.43E+02 5.7 14.0% 00.0% 512 3 4.89E-11 2.04E+04 2.43E+02 6 6.3 26.6% 11.1% 512 4 4.89E-11 2.04E+04 2.43E+02 6 6.9 39.1% 22.0% 512 5 4.89E-11 2.04E+04 2.43E+02 6 6.9 39.0% 21.9% 512 6 4.89E-11 2.04E+04 2.43E+02 6 7.0 39.4% 22.3% --------------------------------------------------------------------- 1024 1 5.39E-10 19.9 00.0% 1024 2 1.30E-09 2.99E+07 2.56E+03 22.7 13.7% 00.0% 1024 3 2.31E-10 7.04E+05 2.56E+03 7 25.9 29.9% 14.2% 1024 4 4.90E-10 7.85E+04 2.56E+03 10 27.6 38.5% 21.8% 1024 5 4.90E-10 7.85E+04 2.56E+03 10 27.6 38.6% 21.9% 1024 6 4.90E-10 7.85E+04 2.56E+03 10 27.7 38.9% 22.1% --------------------------------------------------------------------- 2048 1 5.84E-08 80.5 00.0% 2048 2 5.35E-08 1.46E+07 3.09E+03 91.0 13.0% 00.0% 2048 3 1.02E-09 2.59E+05 3.09E+03 6 102.1 26.7% 12.2% 2048 4 1.02E-09 2.59E+05 3.09E+03 6 110.6 37.4% 21.6% 2048 5 1.02E-09 2.59E+05 3.09E+03 6 110.6 37.3% 21.5% 2048 6 1.02E-09 2.59E+05 3.09E+03 6 111.0 37.8% 22.0% --------------------------------------------------------------------- Test matrix from Sweet (1986): N case error CONDA CONDM PSTEPS time overhead --------------------------------------------------------------------- 6 1 2.00E-02 0.0 6 2 2.00E-02 1.05E+15 3.35E+00 0.0 6 3 2.87E-16 9.94E+00 3.40E+00 1 0.0 6 4 2.87E-16 9.94E+00 3.40E+00 1 0.0 6 5 2.87E-16 9.94E+00 3.40E+00 1 0.0 6 6 2.87E-16 9.94E+00 3.40E+00 1 0.0 --------------------------------------------------------------------- Test matrices from Trench (1989): N case error CONDA CONDM PSTEPS time overhead --------------------------------------------------------------------- 512 1 2.34E-02 5.0 00.0% 512 2 2.43E-02 2.05E+14 3.33E+00 5.7 14.0% 00.0% 512 3 2.71E-14 2.00E+00 2.00E+00 171 6.8 36.3% 19.6% 512 4 2.71E-14 2.00E+00 2.00E+00 171 7.6 53.2% 34.4% 512 5 2.71E-14 2.00E+00 2.00E+00 171 7.6 52.5% 33.8% 512 6 2.71E-14 2.00E+00 2.00E+00 171 7.6 52.7% 34.0% --------------------------------------------------------------------- 1024 1 1.53E-01 20.0 00.0% 1024 2 2.28E-02 2.05E+14 1.20E+01 22.7 13.3% 00.0% 1024 3 3.33E-03 1.40E+11 1.40E+11 341 27.6 37.9% 21.7% 1024 4 3.33E-03 1.40E+11 1.40E+11 341 30.3 51.4% 33.6% 1024 5 3.33E-03 1.40E+11 1.40E+11 341 30.3 51.3% 33.5% 1024 6 3.33E-03 1.40E+11 1.40E+11 341 30.4 51.6% 33.7% --------------------------------------------------------------------- 2048 1 2.63E-02 80.8 00.0% 2048 2 2.46E-02 2.05E+14 6.62E+00 90.9 12.4% 00.0% 2048 3 1.53E-13 2.00E+00 2.00E+00 683 109.9 36.0% 20.9% 2048 4 1.53E-13 2.00E+00 2.00E+00 683 122.0 50.9% 34.2% 2048 5 1.53E-13 2.00E+00 2.00E+00 683 122.1 51.0% 34.3% 2048 6 1.53E-13 2.00E+00 2.00E+00 683 122.1 51.1% 34.4% --------------------------------------------------------------------- FORTRAN STOP ************************************************** * Numerical test of extended Levinson algorithms * * for nonsymmetric Toeplitz matrices. * * Computer: Alliant FX/8. * ************************************************** Legend: 1 - tsld1 (from Argonne Toeplitz package) 2 - dgetcl, classical Levinson algorithm with condition estimation 3 - dgete2, extended Levinson algorithm with fixed PMAX = 2 4 - dgetep, extended Levinson algorithm, general version with PMAX = 4 5 - dgetep, extended Levinson algorithm, general version with PMAX = 6 6 - dgetep, extended Levinson algorithm, general version with PMAX = 8 Test matrices with random elements: N case error CONDA CONDM PSTEPS time overhead --------------------------------------------------------------------- 512 1 2.27E-09 5.1 00.0% 512 2 5.34E-08 3.17E+07 5.28E+02 9.2 81.9% 00.0% 512 3 5.34E-08 3.17E+07 5.28E+02 0 10.4 106.3% 13.4% 512 4 9.77E-08 2.04E+07 5.28E+02 1 11.2 121.5% 21.8% 512 5 9.77E-08 2.04E+07 5.28E+02 1 11.0 117.4% 19.5% 512 6 1.26E-10 2.81E+05 5.28E+02 2 11.1 119.1% 20.4% --------------------------------------------------------------------- 1024 1 3.82E-06 20.3 00.0% 1024 2 4.06E-05 6.15E+10 3.63E+02 36.7 80.8% 00.0% 1024 3 8.96E-10 8.09E+07 3.63E+02 2 41.6 104.9% 13.3% 1024 4 8.59E-11 1.41E+05 3.63E+02 2 44.9 120.8% 22.1% 1024 5 1.09E-10 1.41E+05 3.63E+02 2 44.0 116.4% 19.7% 1024 6 1.09E-10 1.41E+05 3.63E+02 2 45.1 121.8% 22.7% --------------------------------------------------------------------- 2048 1 2.90E-10 82.9 00.0% 2048 2 3.42E-10 1.53E+05 4.71E+01 145.2 75.2% 00.0% 2048 3 1.57E-10 1.53E+05 4.71E+01 3 167.1 101.6% 15.1% 2048 4 1.57E-10 1.53E+05 4.71E+01 3 177.8 114.6% 22.5% 2048 5 1.57E-10 1.53E+05 4.71E+01 3 178.1 114.9% 22.7% 2048 6 1.57E-10 1.53E+05 4.71E+01 3 177.7 114.4% 22.4% --------------------------------------------------------------------- Test matrices from Sweet (1986): N case error CONDA CONDM PSTEPS time overhead --------------------------------------------------------------------- 6 1 3.51E-02 0.0 6 2 4.42E-03 4.83E+14 2.17E+01 0.0 6 3 8.88E-15 1.03E+02 2.18E+01 1 0.0 6 4 8.79E-16 2.35E+01 2.18E+01 1 0.0 6 5 8.79E-16 2.35E+01 2.18E+01 1 0.0 6 6 8.79E-16 2.35E+01 2.18E+01 1 0.0 --------------------------------------------------------------------- 6 1 4.87E-01 0.0 6 2 3.01E-01 4.13E+16 7.38E+00 0.0 6 3 2.76E-16 1.41E+01 1.17E+01 1 0.0 6 4 2.76E-16 1.41E+01 1.17E+01 1 0.0 6 5 2.76E-16 1.41E+01 1.17E+01 1 0.0 6 6 2.76E-16 1.41E+01 1.17E+01 1 0.0 --------------------------------------------------------------------- 13 1 4.66E-10 0.0 13 2 2.95E-10 4.44E+06 1.74E+01 0.0 13 3 3.20E-10 4.44E+06 1.74E+01 1 0.0 13 4 1.53E-10 4.44E+06 1.74E+01 2 0.0 13 5 5.85E-14 2.34E+03 1.74E+01 2 0.0 13 6 2.22E-15 1.29E+02 1.74E+01 2 0.1 --------------------------------------------------------------------- Test matrices from Trench (1989): N case error CONDA CONDM PSTEPS time overhead --------------------------------------------------------------------- 512 1 2.34E-02 5.1 00.0% 512 2 2.43E-02 2.05E+14 3.33E+00 9.2 81.7% 00.0% 512 3 3.54E-13 2.00E+00 2.00E+00 171 11.3 122.9% 22.7% 512 4 3.54E-13 2.00E+00 2.00E+00 171 12.3 143.1% 33.8% 512 5 3.54E-13 2.00E+00 2.00E+00 171 12.2 139.9% 32.1% 512 6 3.54E-13 2.00E+00 2.00E+00 171 12.2 141.1% 32.7% --------------------------------------------------------------------- 1024 1 1.53E-01 20.4 00.0% 1024 2 2.28E-02 2.05E+14 1.20E+01 36.7 80.4% 00.0% 1024 3 7.83E-03 4.35E+10 4.35E+10 341 44.5 118.5% 21.1% 1024 4 7.83E-03 4.35E+10 4.35E+10 341 49.1 141.0% 33.6% 1024 5 7.83E-03 4.35E+10 4.35E+10 341 48.6 138.8% 32.3% 1024 6 7.83E-03 4.35E+10 4.35E+10 341 49.3 142.0% 34.1% --------------------------------------------------------------------- 2048 1 2.63E-02 82.9 00.0% 2048 2 2.46E-02 2.05E+14 6.62E+00 144.5 74.3% 00.0% 2048 3 6.90E-12 2.00E+00 2.00E+00 683 177.4 113.9% 22.7% 2048 4 6.90E-12 2.00E+00 2.00E+00 683 194.4 134.4% 34.5% 2048 5 6.90E-12 2.00E+00 2.00E+00 683 194.3 134.3% 34.4% 2048 6 6.90E-12 2.00E+00 2.00E+00 683 193.6 133.4% 33.9% --------------------------------------------------------------------- FORTRAN STOP Below is a makefile which can be used on Unix systems: dsytst: dsytst.o dsytep.o dblas.o dgefa.o dgesl.o dgesm.o dtplz.o \ timing.o urand.o dgesv2.o dsyte2.o dsytcl.o dgesl2.f dsykms.o \ dsyswe.o dsysv2.o tsld1.o fortran -o dsytst \ dsytst.o dsytep.o dblas.o dgefa.o dgesl.o dgesm.o dtplz.o \ timing.o urand.o dgesv2.o dsyte2.o dsytcl.o dgesl2.f dsykms.o \ dsyswe.o dsysv2.o tsld1.o dgetst: dgetst.o dgetep.o dblas.o dgefa.o dgesl.o dgesm.o dtplz.o \ timing.o urand.o dgesv2.o dgete2.o dgetcl.o dgesl2.f dsykms.o \ dgeswe.o dsysv2.o tsld1.o fortran -o dgetst \ dgetst.o dgetep.o dblas.o dgefa.o dgesl.o dgesm.o dtplz.o \ timing.o urand.o dgesv2.o dgete2.o dgetcl.o dgesl2.f dsykms.o \ dgeswe.o dsysv2.o tsld1.o Per Christian Hansen Tony F. Chan UNI-C (University Computing Center) Dept. of Mathematics Building 305 UCLA Technical University of Denmark 405 Hilgard Ave. DK-2800 Lyngby Los Angeles Denmark CA 90024, USA Tel.: +45 45.93.83.55 Tel.: +1 (213) 825-2601 Email: neupch@wuli.uni-c.dk Email: chan@math.ucla.edu c c DASUM--------------------------------------------------------------- c double precision function dasum(n,dx,incx) c c takes the sum of the absolute values. c jack dongarra, linpack, 3/11/78. c double precision dx(1),dtemp integer i,incx,m,mp1,n,nincx c dasum = 0.0d0 dtemp = 0.0d0 if(n.le.0)return if(incx.eq.1)go to 20 c c code for increment not equal to 1 c nincx = n*incx do 10 i = 1,nincx,incx dtemp = dtemp + dabs(dx(i)) 10 continue dasum = dtemp return c c code for increment equal to 1 c c c clean-up loop c 20 m = mod(n,6) if( m .eq. 0 ) go to 40 do 30 i = 1,m dtemp = dtemp + dabs(dx(i)) 30 continue if( n .lt. 6 ) go to 60 40 mp1 = m + 1 do 50 i = mp1,n,6 dtemp = dtemp + dabs(dx(i)) + dabs(dx(i + 1)) + dabs(dx(i + 2)) * + dabs(dx(i + 3)) + dabs(dx(i + 4)) + dabs(dx(i + 5)) 50 continue 60 dasum = dtemp return end c c DAXPY--------------------------------------------------------------- c subroutine daxpy(n,da,dx,incx,dy,incy) c c constant times a vector plus a vector. c uses unrolled loops for increments equal to one. c jack dongarra, linpack, 3/11/78. c double precision dx(1),dy(1),da integer i,incx,incy,ix,iy,m,mp1,n c if(n.le.0)return if (da .eq. 0.0d0) return if(incx.eq.1.and.incy.eq.1)go to 20 c c code for unequal increments or equal increments c not equal to 1 c ix = 1 iy = 1 if(incx.lt.0)ix = (-n+1)*incx + 1 if(incy.lt.0)iy = (-n+1)*incy + 1 do 10 i = 1,n dy(iy) = dy(iy) + da*dx(ix) ix = ix + incx iy = iy + incy 10 continue return c c code for both increments equal to 1 c c c clean-up loop c 20 m = mod(n,4) if( m .eq. 0 ) go to 40 do 30 i = 1,m dy(i) = dy(i) + da*dx(i) 30 continue if( n .lt. 4 ) return 40 mp1 = m + 1 do 50 i = mp1,n,4 dy(i) = dy(i) + da*dx(i) dy(i + 1) = dy(i + 1) + da*dx(i + 1) dy(i + 2) = dy(i + 2) + da*dx(i + 2) dy(i + 3) = dy(i + 3) + da*dx(i + 3) 50 continue return end c c DDOT---------------------------------------------------------------- c double precision function ddot(n,dx,incx,dy,incy) c c forms the dot product of two vectors. c uses unrolled loops for increments equal to one. c jack dongarra, linpack, 3/11/78. c double precision dx(1),dy(1),dtemp integer i,incx,incy,ix,iy,m,mp1,n c ddot = 0.0d0 dtemp = 0.0d0 if(n.le.0)return if(incx.eq.1.and.incy.eq.1)go to 20 c c code for unequal increments or equal increments c not equal to 1 c ix = 1 iy = 1 if(incx.lt.0)ix = (-n+1)*incx + 1 if(incy.lt.0)iy = (-n+1)*incy + 1 do 10 i = 1,n dtemp = dtemp + dx(ix)*dy(iy) ix = ix + incx iy = iy + incy 10 continue ddot = dtemp return c c code for both increments equal to 1 c c c clean-up loop c 20 m = mod(n,5) if( m .eq. 0 ) go to 40 do 30 i = 1,m dtemp = dtemp + dx(i)*dy(i) 30 continue if( n .lt. 5 ) go to 60 40 mp1 = m + 1 do 50 i = mp1,n,5 dtemp = dtemp + dx(i)*dy(i) + dx(i + 1)*dy(i + 1) + * dx(i + 2)*dy(i + 2) + dx(i + 3)*dy(i + 3) + dx(i + 4)*dy(i + 4) 50 continue 60 ddot = dtemp return end c c DGEFA--------------------------------------------------------------- c subroutine dgefa(a,lda,n,ipvt,info) integer lda,n,ipvt(1),info double precision a(lda,1) c c dgefa factors a double precision matrix by gaussian elimination. c c dgefa is usually called by dgeco, but it can be called c directly with a saving in time if rcond is not needed. c (time for dgeco) = (1 + 9/n)*(time for dgefa) . c c on entry c c a double precision(lda, n) c the matrix to be factored. c c lda integer c the leading dimension of the array a . c c n integer c the order of the matrix a . c c on return c c a an upper triangular matrix and the multipliers c which were used to obtain it. c the factorization can be written a = l*u where c l is a product of permutation and unit lower c triangular matrices and u is upper triangular. c c ipvt integer(n) c an integer vector of pivot indices. c c info integer c = 0 normal value. c = k if u(k,k) .eq. 0.0 . this is not an error c condition for this subroutine, but it does c indicate that dgesl or dgedi will divide by zero c if called. use rcond in dgeco for a reliable c indication of singularity. c c linpack. this version dated 08/14/78 . c cleve moler, university of new mexico, argonne national lab. c c subroutines and functions c c blas daxpy,dscal,idamax c c internal variables c double precision t integer idamax,j,k,kp1,l,nm1 c c c gaussian elimination with partial pivoting c info = 0 nm1 = n - 1 if (nm1 .lt. 1) go to 70 do 60 k = 1, nm1 kp1 = k + 1 c c find l = pivot index c l = idamax(n-k+1,a(k,k),1) + k - 1 ipvt(k) = l c c zero pivot implies this column already triangularized c if (a(l,k) .eq. 0.0d0) go to 40 c c interchange if necessary c if (l .eq. k) go to 10 t = a(l,k) a(l,k) = a(k,k) a(k,k) = t 10 continue c c compute multipliers c t = -1.0d0/a(k,k) call dscal(n-k,t,a(k+1,k),1) c c row elimination with column indexing c do 30 j = kp1, n t = a(l,j) if (l .eq. k) go to 20 a(l,j) = a(k,j) a(k,j) = t 20 continue call daxpy(n-k,t,a(k+1,k),1,a(k+1,j),1) 30 continue go to 50 40 continue info = k 50 continue 60 continue 70 continue ipvt(n) = n if (a(n,n) .eq. 0.0d0) info = n return end c c DGESL--------------------------------------------------------------- c subroutine dgesl(a,lda,n,ipvt,b,job) integer lda,n,ipvt(1),job double precision a(lda,1),b(1) c c dgesl solves the double precision system c a * x = b or trans(a) * x = b c using the factors computed by dgeco or dgefa. c c on entry c c a double precision(lda, n) c the output from dgeco or dgefa. c c lda integer c the leading dimension of the array a . c c n integer c the order of the matrix a . c c ipvt integer(n) c the pivot vector from dgeco or dgefa. c c b double precision(n) c the right hand side vector. c c job integer c = 0 to solve a*x = b , c = nonzero to solve trans(a)*x = b where c trans(a) is the transpose. c c on return c c b the solution vector x . c c error condition c c a division by zero will occur if the input factor contains a c zero on the diagonal. technically this indicates singularity c but it is often caused by improper arguments or improper c setting of lda . it will not occur if the subroutines are c called correctly and if dgeco has set rcond .gt. 0.0 c or dgefa has set info .eq. 0 . c c to compute inverse(a) * c where c is a matrix c with p columns c call dgeco(a,lda,n,ipvt,rcond,z) c if (rcond is too small) go to ... c do 10 j = 1, p c call dgesl(a,lda,n,ipvt,c(1,j),0) c 10 continue c c linpack. this version dated 08/14/78 . c cleve moler, university of new mexico, argonne national lab. c c subroutines and functions c c blas daxpy,ddot c c internal variables c double precision ddot,t integer k,kb,l,nm1 c nm1 = n - 1 if (job .ne. 0) go to 50 c c job = 0 , solve a * x = b c first solve l*y = b c if (nm1 .lt. 1) go to 30 do 20 k = 1, nm1 l = ipvt(k) t = b(l) if (l .eq. k) go to 10 b(l) = b(k) b(k) = t 10 continue call daxpy(n-k,t,a(k+1,k),1,b(k+1),1) 20 continue 30 continue c c now solve u*x = y c do 40 kb = 1, n k = n + 1 - kb b(k) = b(k)/a(k,k) t = -b(k) call daxpy(k-1,t,a(1,k),1,b(1),1) 40 continue go to 100 50 continue c c job = nonzero, solve trans(a) * x = b c first solve trans(u)*y = b c do 60 k = 1, n t = ddot(k-1,a(1,k),1,b(1),1) b(k) = (b(k) - t)/a(k,k) 60 continue c c now solve trans(l)*x = y c if (nm1 .lt. 1) go to 90 do 80 kb = 1, nm1 k = n - kb b(k) = b(k) + ddot(n-k,a(k+1,k),1,b(k+1),1) l = ipvt(k) if (l .eq. k) go to 70 t = b(l) b(l) = b(k) b(k) = t 70 continue 80 continue 90 continue 100 continue return end c c DGESL2-------------------------------------------------------------- c subroutine dgesl2(a11,a21,a12,a22,b1,b2,x1,x2,info) integer info double precision a11,a21,a12,a22,b1,b2,x1,x2 c c Subroutine dgesl1 solves a 2-by-2 system of linear equations c by means of Cramer's rule. c c Underflow or overflow may occur if the matrix is scaled very close c to either limit of double precision numbers on the particular machine. c c On entry c c a11,a21,a12,a22 double precision c the elements of the matrix. c c b1,b2 double precision c the elements of the right-hand side. c c On return c c x1,x2 double precision c the elements of the solution. c c info integer c info = 0 normal value, c info = 1 the matrix is singular c to working precision. c c This version dated 11/27/89. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c Internal variables c double precision d c d = a11*a22 - a12*a21 if (d .eq. 0.0d0) then info = 1 return else info = 0 x1 = (a22*b1 - a12*b2)/d x2 = (a11*b2 - a21*b1)/d return endif c return end c c DGESM--------------------------------------------------------------- c subroutine dgesm(a,lda,n,ipvt,sigmin,z) integer lda,n,ipvt(1) double precision a(lda,1),z(1) double precision sigmin c c dgesm factors a double precision matrix by gaussian elimination c and estimates the smallest singular value of the matrix. c c on entry c c a double precision(lda, n) c the matrix to be factored. c c lda integer c the leading dimension of the array a . c c n integer c the order of the matrix a . c c on return c c a an upper triangular matrix and the multipliers c which were used to obtain it. c the factorization can be written a = l*u where c l is a product of permutation and unit lower c triangular matrices and u is upper triangular. c c ipvt integer(n) c an integer vector of pivot indices. c c sigmin double precision c an estimate of the smallest singular value of a . c c z double precision(n) c a work vector whose contents are usually unimportant. c if a is close to a singular matrix, then z is c an approximate null vector. c c Based on the linpack subroutine dgeco. c This version dated 12/05/89 . c Per Christian Hansen, UNI-C, Technical University of Denmark c c subroutines and functions c c linpack dgefa c blas daxpy,ddot,dscal,dasum c fortran dabs,dmin1,dsign c c internal variables c double precision ddot,ek,t,wk,wkm double precision s,dasum,sm,ynorm integer info,j,k,kb,kp1,l c c c factor c call dgefa(a,lda,n,ipvt,info) c c sigmin = norm(y)/norm(z) where a*z = y and trans(a)*y = e . c trans(a) is the transpose of a . the components of e are c chosen to cause maximum local growth in the elements of w where c trans(u)*w = e . the vectors are rescaled to avoid overflow c using the strategy described in r.g. grimes & j.g. lewis, c "condition number estimation for sparse matrices", SISSC 2 c (1981), 384-388. c c solve trans(u)*w = e c ek = 1.0d0 do 20 j = 1, n z(j) = 0.0d0 20 continue do 100 k = 1, n if (z(k) .ne. 0.0d0) ek = dsign(ek,-z(k)) if (dabs(ek-z(k)) .le. dabs(a(k,k))) go to 30 s = dmin1(1.0d-3,dabs(a(k,k))/dabs(ek-z(k))) call dscal(n,s,z,1) ek = s*ek 30 continue wk = ek - z(k) wkm = -ek - z(k) s = dabs(wk) sm = dabs(wkm) if (a(k,k) .eq. 0.0d0) go to 40 wk = wk/a(k,k) wkm = wkm/a(k,k) go to 50 40 continue wk = 1.0d0 wkm = 1.0d0 50 continue kp1 = k + 1 if (kp1 .gt. n) go to 90 do 60 j = kp1, n sm = sm + dabs(z(j)+wkm*a(k,j)) z(j) = z(j) + wk*a(k,j) s = s + dabs(z(j)) 60 continue if (s .ge. sm) go to 80 t = wkm - wk wk = wkm do 70 j = kp1, n z(j) = z(j) + t*a(k,j) 70 continue 80 continue 90 continue z(k) = wk 100 continue s = 1.0d0/dasum(n,z,1) call dscal(n,s,z,1) c c solve trans(l)*y = w c do 120 kb = 1, n k = n + 1 - kb if (k .lt. n) z(k) = z(k) + ddot(n-k,a(k+1,k),1,z(k+1),1) if (dabs(z(k)) .le. 1.0d0) go to 110 s = dmin1(1.0d-3,1.0d0/dabs(z(k))) call dscal(n,s,z,1) 110 continue l = ipvt(k) t = z(l) z(l) = z(k) z(k) = t 120 continue s = 1.0d0/dasum(n,z,1) call dscal(n,s,z,1) c ynorm = 1.0d0 c c solve l*v = y c do 140 k = 1, n l = ipvt(k) t = z(l) z(l) = z(k) z(k) = t if (k .lt. n) call daxpy(n-k,t,a(k+1,k),1,z(k+1),1) if (dabs(z(k)) .le. 1.0d0) go to 130 s = dmin1(1.0d-3,1.0d0/dabs(z(k))) call dscal(n,s,z,1) ynorm = s*ynorm 130 continue 140 continue s = 1.0d0/dasum(n,z,1) call dscal(n,s,z,1) ynorm = s*ynorm c c solve u*z = v c do 160 kb = 1, n k = n + 1 - kb if (dabs(z(k)) .le. dabs(a(k,k))) go to 150 s = dmin1(1.0d-3,dabs(a(k,k))/dabs(z(k))) call dscal(n,s,z,1) ynorm = s*ynorm 150 continue if (a(k,k) .ne. 0.0d0) z(k) = z(k)/a(k,k) if (a(k,k) .eq. 0.0d0) z(k) = 1.0d0 t = -z(k) call daxpy(k-1,t,a(1,k),1,z(1),1) 160 continue c s = 1.0d0/dasum(n,z,1) sigmin = s*ynorm c return end c c DGESV2-------------------------------------------------------------- c subroutine dgesv2(w,x,y,z,sig1,sig2) double precision sig1,sig2,w,x,y,z c c Compute the singular values sig1 >= sig2 >= 0 of the 2-by-2 c general matrix [ w x ] . c [ y z ] c c This version dated 03/24/1990. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c fortran dabs,dsign,dsqrt c double precision a,b,c,c1,c2,s,s1,s2,t,tau c if (x .eq. y) then call dsysv2(w,x,z,sig1,sig2) return endif c c Generate a rotation that makes the matrix symmetric. c a = w + z b = x - y if (dabs(b) .gt. dabs(a)) then t = a/b s = 1.0d0/sqrt(1.0d0 + t**2) c = s*t else t = b/a c = 1.0d0/dsqrt(1.0d0 + t**2) s = c*t endif a = s*(x + y) + c*(z - w) b = 2.0d0*(c*x - s*z) c c Then diagonalize the symmetric matrix. c if (b .eq. 0.0d0) then c2 = 1.0d0 s2 = 0.0d0 else if (dabs(b) .gt. dabs(a)) then tau = a/b t = dsign(1.0d0,tau)/(dabs(tau) + dsqrt(1.0d0 + tau**2)) else tau = b/a t = tau/(1.0d0 + dsqrt(1.0d0 + tau**2)) endif c2 = 1.0d0/dsqrt(1.0d0 + t**2) s2 = t*c2 endif c c1 = c2*c - s2*s s1 = s2*c + c2*s sig1 = dabs(c1*(w*c2 - x*s2) - s1*(y*c2 - z*s2)) sig2 = dabs(s1*(w*s2 + x*c2) + c1*(y*s2 + z*c2)) c c Sort the singular values. c if (sig1 .lt. sig2) then t = sig1 sig1 = sig2 sig2 = t endif c return end c c DGESWE-------------------------------------------------------------- c subroutine dgeswe(itype,r,s,n) double precision r,s dimension r(*),s(*) c c Generate the three non-symmetric Toeplitz test matrices from c D.R. Sweet, "The use of pivoting to improve the numerical per- c formance of Toeplitz solvers"; in J.M. Speiser (Ed.), "Advanced c Algorithms and Architectures for Signal Processing", Proc. SPIE c 696 (1986), 8-18. c c This version dated 07/10/1990. c Per Christian Hansen, UNI-C, Technical University of Denmark. c double precision eps c c Set small multiple of machine precision. c eps = 5.0d-15 c c Set up the Toeplitz matrices. c if (itype .eq. 1) then n = 6 r(1) = 4.0d0 r(2) = 8.0d0 r(3) = 1.0d0 r(4) = 6.0d0 r(5) = 2.0d0 r(6) = 3.0d0 s(1) = 4.0d0 s(2) = 6.0d0 s(3) = 71.0d0/15.0d0 + eps s(4) = 5.0d0 s(5) = 3.0d0 s(6) = 1.0d0 elseif (itype .eq. 2) then n = 6 r(1) = 8.0d0 r(2) = 4.0d0 r(3) = 1.0d0 r(4) = 6.0d0 r(5) = 2.0d0 r(6) = 3.0d0 s(1) = 8.0d0 s(2) = 4.0d0 s(3) = -3.40d1 + eps s(4) = 5.0d0 s(5) = 3.0d0 s(6) = 1.0d0 else n = 13 r( 1) = 5.0d0 r( 2) = -1.0d0 r( 3) = 6.0d0 r( 4) = 2.0d0 r( 5) = 5.697d0 r( 6) = 5.850d0 r( 7) = 3.0d0 r( 8) = -5.0d0 r( 9) = -2.0d0 r(10) = -7.0d0 r(11) = 1.0d0 r(12) = 1.0d1 r(13) = -1.5d1 s( 1) = 5.0d0 s( 2) = 1.0d0 s( 3) = -3.0d0 s( 4) = 1.2755d1 s( 5) = -1.9656d1 s( 6) = 2.8361d1 s( 7) = -7.0d0 s( 8) = -1.0d0 s( 9) = 2.0d0 s(10) = 1.0d0 s(11) = -6.0d0 s(12) = 1.0d0 s(13) = -5.0d-1 endif c return end c c DGETCL-------------------------------------------------------------- c subroutine dgetcl(n,r,s,b,wantsm,w,y,z,x,conda,condm,sigmin,info) logical wantsm integer info,n double precision b(*),conda,condm,r(*),s(*),sigmin(*),w(*),x(*), - y(*),z(*) c c Subroutine dgetcl solves a system of linear equations with a c general Toeplitz matrix, using the classical Levinson algorithm. c c The numbers conda and condm are estimates of the algorithm c and matrix condition numbers, respectively. If conda is much c greater than condm , then the extended Levinson algorithm is c required in order to 'skip over' a cluster of consecutive ill- c conditioned leading submatrices. We recommend using one of the c subroutines dgete2 and dgetep . c c The array sigmin (if wanted) is a further diagnostic tools: it c holds the estimates for the smallest singular values of all the c leading submatrices. c c On entry c c n integer c the order of the Toeplitz matrix (n .gt. 0). c c r double precision(n) c the first row of the Toeplitz matrix. c c s double precision(n) c the first column of the Toeplitz matrix. c c b double precision(n) c the right-hand side of the linear system. c c wantsm logical c wantsm = true - return sigmin , c wantsm = false - do not return sigmin . c c w double precision(2*n) c workspace of length 2*n . c c y,z double precision(n) c two workspaces of lenght n each, used internally to c hold the solutions to the Yule-Walker problems. c c On return c c x double precision(n) c the solution to the linear system; if b is not required c then b and x may share the same storage locations. c c conda double precision c an estimate of the algorithm condition number, defined c as the 2-norm of the Toeplitz matrix divided by the c smallest singular value of all the leading submatrices. c c condm double precision c an estimate of the classical 2-norm condition number of c the Toeplitz matrix. c c sigmin double precision(n) c if wantsm = true , then sigmin holds the estimates of c the smallest singular values of all the leading submatri- c ces; if wantsm = false , then sigmin is not accessed. c c info integer c info = 0 normal value, c info = 1 illegal n , c info = 2 a step failed because an exactly singular c leading submatrix was encountered. c c This version dated 03/24/90. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c Subroutines and functions c c blas dasum,ddot,idamax c fortran dabs,dmax1 c c Internal variables c integer idamax,ip,k double precision alpha,beta,c,d,dasum,ddot,eta,gamma,phi,sn, - smin,t,yn,zn c c Initialization. c k = 0 info = 0 if (n .le. 0) then info = 1 return endif c c Compute the first elements of x and y . c if (r(1) .eq. 0.0d0) goto 999 x(1) = b(1)/r(1) y(1) = -r(2)/r(1) z(1) = -s(2)/r(1) gamma = r(1) + s(2)*y(1) smin = dabs(r(1)) if (wantsm) sigmin(1) = smin sn = smin k = 1 if (n .eq. 1) goto 200 c c Main loop starts here. c 100 continue c c Estimate the smallest singular value of the leading submatrix. c ip = idamax(k,y(1),1) yn = dabs(y(ip)) ip = idamax(k,z(1),1) zn = dabs(z(ip)) sn = dabs(gamma)/dmax1(1.0d0,yn,zn,yn*zn) if (sn .lt. smin) smin = sn c c Perform a classical Levinson step and update the solutions. c beta = b(k+1) - ddot(k,s(2),1,x(1),-1) c = -r(k+2) - ddot(k,r(2),1,y(1),-1) d = -s(k+2) - ddot(k,s(2),1,z(1),-1) if (gamma .eq. 0.0d0) goto 999 alpha = beta/gamma eta = c/gamma phi = d/gamma do 110 i = 1,k x(i) = x(i) + alpha*y(k-i+1) w(i) = y(i) + eta*z(k-i+1) w(n+i) = z(i) + phi*y(k-i+1) 110 continue do 120 i = 1,k y(i) = w(i) z(i) = w(n+i) 120 continue k = k + 1 x(k) = alpha y(k) = eta z(k) = phi gamma = gamma - c*phi if (wantsm) sigmin(k) = sn c c While k .lt. n repeat the main loop. c if (k .lt. n) goto 100 c c Normal termination of main loop. Compute condition numbers. c 200 continue t = dmax1(dasum(n,r(1),1),dasum(n,s(1),1)) conda = t/smin condm = t/sn return c c Get here only if an exactly singular leading submatrix was encountered. c 999 continue info = 2 if (wantsm) then do 1000 i = 1,n sigmin(i) = 0.0d0 1000 continue endif return c end c c DGETE2-------------------------------------------------------------- c subroutine dgete2(n,r,s,b,wantsm,w,y,z, - x,conda,condm,psteps,sigmin,info) logical wantsm integer info,n,psteps double precision b(*),conda,condm,r(*),s(*),sigmin(*),w(*),x(*), - y(*),z(*) c c Subroutine dgete2 solves a system of linear equations with a c general Toeplitz matrix, using the extended Levinson algorithm c with maximum block-size equal to 2. c c The numbers conda and condm are estimates of the algorithm c and matrix condition numbers, respectively. If conda is much c greater than condm , then a bigger block size is required in order c to 'skip over' a cluster of consecutive ill-conditioned leading c submatrices. We recommend using the subroutine dgetep . c c The quantities psteps and sigmin are further diagnostic tools: c psteps is the number of block-steps taken, and the array sigmin c (if wanted) holds the estimates for the smallest singular values of c all the leading submatrices. Entries of sigmin corresponding to c submatrices that were 'skipped over' are negative. c c On entry c c n integer c the order of the Toeplitz matrix (n .gt. 0). c c r double precision(n) c the first row of the Toeplitz matrix. c c s double precision(n) c the first column of the Toeplitz matrix. c c b double precision(n) c the right-hand side of the linear system. c c wantsm logical c wantsm = true - return sigmin , c wantsm = false - do not return sigmin . c c w double precision(4*n) c workspace of length 4*n . c c y,z double precision(2*n) c two workspaces of lenght 2*n each, used internally to c hold the solutions to the Yule-Walker problems. c c On return c c x double precision(n) c the solution to the linear system; if b is not required c then b and x may share the same storage locations. c c conda double precision c an estimate of the algorithm condition number, defined c as the 2-norm of the Toeplitz matrix divided by the c smallest singular value of all the leading submatrices, c except those that were 'skipped over' by the algorithm. c c condm double precision c an estimate of the classical 2-norm condition number of c the Toeplitz matrix. c c psteps integer c the number of block-steps used to solve the linear system c of equations with the extended Levinson algorithm. c c sigmin double precision(n) c if wantsm = true , then sigmin holds the estimates of c the smallest singular values of all the leading submatri- c ces; if wantsm = false , then sigmin is not accessed. c Entries of sigmin corresponding to submatrices that c were 'skipped over' are negative. c c info integer c info = 0 normal value, c info = 1 illegal n , c info = 2 a block-step failed because 2 consecutive c exactly singular leading submatrices were c encountered. c c This version dated 09/04/90. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c Subroutines and functions c c blas dasum,ddot,idamax c others dgesl2,dgesv2 c fortran dabs,dmax1 c c Internal variables c integer idamax,inf,ip,k,pold double precision a1,a2,b1,b2,c1,c2,dasum,ddot,d1,d2,e1,e2,f1,f2, - gamma,gammao,g12,g21,g22,h,psi,smin,smin1,smin2, - t,tau,u,yn,zn c c Check parameters, treat the n=1 case, and set the factor tau . c psteps = 0 k = 0 info = 0 if (n .eq. 1) then if (r(1) .eq. 0.0d0) goto 999 x(1) = b(1)/r(1) endif if (n .le. 0) then info = 1 return endif tau = 1.0d-1 c c Special processing of the first p-by-p leading submatrix T.p . c Choose the block-size p such that it minimizes sigma.min(T.p) c for p = 1,2 , and solve the three p-by-p systems involving T.p . c smin1 = dabs(r(1)) call dgesv2(r(1),r(2),s(2),r(1),t,smin2) if (smin1 .gt. smin2) then gammao = r(1) x(1) = b(1)/gammao y(1) = -r(2)/gammao z(1) = -s(2)/gammao gamma = gammao + s(2)*y(1) psi = smin1 smin = smin1 if (wantsm) sigmin(1) = smin1 k = 1 pold = 1 c print *,'First step was a 1-step' else if (smin2 .eq. 0.0d0) goto 999 call dgesl2(r(1),s(2),r(2),r(1),b(1),b(2),x(1),x(2),inf) if (inf .eq. 1) goto 999 c1 = -r(2) c2 = -r(3) call dgesl2(r(1),r(2),s(2),r(1),c1,c2,y(1),y(2),inf) c1 = -s(2) c2 = -s(3) call dgesl2(r(1),s(2),r(2),r(1),c1,c2,z(1),z(2),inf) gamma = r(1) + s(2)*y(1) + s(3)*y(2) psi = smin2 smin = smin2 if (wantsm) then sigmin(1) = - smin1 sigmin(2) = smin2 endif k = 2 pold = 2 psteps = 1 c print *,'First step was a 2-step' endif c c Main loop starts here. c if (k .eq. n) goto 400 100 continue c print *,'Enter main loop with k = ',k c c Prepare for a classical Levinson step. c b1 = b(k+1) - ddot(k,s(2),1,x(1),-1) c1 = -r(k+2) - ddot(k,r(2),1,y(1),-1) d1 = -s(k+2) - ddot(k,s(2),1,z(1),-1) ip = idamax(k,y(1),1) yn = dabs(y(ip)) ip = idamax(k,z(1),1) zn = dabs(z(ip)) smin1 = dabs(gamma)/dmax1(1.0d0,yn,zn,yn*zn) c c If the leading submatrix T.k+1 is well-conditioned, or if c k+1 = n , then perform a classical Levinson step ... c if (smin1 .ge. tau*smin .or. k+1 .eq. n) goto 200 c c ... else compute y.k,1 and z.k,1 . If the previous step was a c classical step, then use the simple updating formula ... c c print *,' entering block phase: p = 2, k =',k if (pold .eq. 1) then c print *,' use simple updating formula to get y.k,p-1' t = c1/gammao u = d1/gammao do 110 i = 1,k-1 y(n+i) = y(i+1) - y(1)*y(i) + t*w(3*n+k-i) z(n+i) = z(i+1) - z(1)*z(i) + u*w( n+k-i) 110 continue y(n+k) = -y(1)*y(k) + t z(n+k) = -z(1)*z(k) + u c c ... else if the previous step was the first step of the algorithm, c then compute y.k,1 and z.k,1 directly ... c elseif (pold. eq. k) then c print *,' previous step was the first' a1 = -r(3) a2 = -r(4) call dgesl2(r(1),r(2),s(2),r(1),a1,a2,y(n+1),y(n+2),inf) a1 = -s(3) a2 = -s(4) call dgesl2(r(1),s(2),r(2),r(1),a1,a2,z(n+1),z(n+2),inf) c c ... else update the previous y.k,1 and z.k,1 . c else c print *,' update the previous y.k,1 ' a1 = -r(k+1) - ddot(k-2,r(2),1,y(n+1),-1) a2 = -r(k+2) - ddot(k-2,r(3),1,y(n+1),-1) call dgesl2(gammao,g12,g21,g22,a1,a2,e1,e2,inf) a1 = -s(k+1) - ddot(k-2,s(2),1,z(n+1),-1) a2 = -s(k+2) - ddot(k-2,s(3),1,z(n+1),-1) call dgesl2(gammao,g21,g12,g22,a1,a2,f1,f2,inf) do 120 i = 1,k-2 w(i) = y(n+i) + e1*w(3*n+k-i-1) + e2*z(n+k-i-1) w(2*n+i) = z(n+i) + f1*w( n+k-i-1) + f2*y(n+k-i-1) 120 continue do 130 i = 1,k-2 y(n+i) = w(i) z(n+i) = w(2*n+i) 130 continue y(n+k-1) = e1 y(n+k) = e2 z(n+k-1) = f1 z(n+k) = f2 endif c c Construct the Schur complement Gamma.2 and compute its smallest c singular value. Also, construct the corresponding right-hand sides. c g21 = s(2) + ddot(k,s(3),1,y(1) ,1) g12 = r(2) + ddot(k,s(2),1,y(n+1),1) g22 = r(1) + ddot(k,s(3),1,y(n+1),1) call dgesv2(gamma,g21,g12,g22,h,t) b2 = b(k+2) - ddot(k,s(3),1,x(1),-1) c2 = -r(k+3) - ddot(k,r(3),1,y(1),-1) d2 = -s(k+3) - ddot(k,s(3),1,z(1),-1) c c Estimate the smallest singular value sigma.min(T.k+2) of the c leading submatrix of order k+2 . c ip = idamax(k,y(n+1),1) yn = dmax1(yn,dabs(y(n+ip))) ip = idamax(k,z(n+1),1) zn = dmax1(zn,dabs(z(n+ip))) smin2 = t/dmax1(1.0d0,yn,zn,yn*zn) c c If this T.k+2 is not well-conditioned, then find the best possible c block-size; i.e. the p that maximizes sigma.min(T.k+p) , p = 1,2 . c if (smin2 .lt. tau*smin) then c print *,' do best possible step' h = dmax1(smin1,smin2) if (h .lt. smin) smin = h if (smin1 .gt. smin2) goto 200 if (smin2 .eq. 0.0d0) goto 999 endif c c Perform an extended Levinson step: save the present y.k and z.k , c solve the small systems involving the Schur complement Gamma.2 , c and update the solutions. c c print *,' extended Levinson step with k,p = ',k,2 call dgesl2(gamma,g21,g12,g22,b1,b2,a1,a2,inf) if (inf .eq. 1) goto 999 call dgesl2(gamma,g12,g21,g22,c1,c2,e1,e2,inf) call dgesl2(gamma,g21,g12,g22,d1,d2,f1,f2,inf) do 140 i = 1,k w(n+i) = y(i) w(3*n+i) = z(i) x(i) = x(i) + a1*y(k-i+1) + a2*y(n+k-i+1) w(i) = y(i) + e1*z(k-i+1) + e2*z(n+k-i+1) w(2*n+i) = z(i) + f1*y(k-i+1) + f2*y(n+k-i+1) 140 continue do 150 i = 1,k y(i) = w(i) z(i) = w(2*n+i) 150 continue k = k + 2 x(k-1) = a1 x(k) = a2 y(k-1) = e1 y(k) = e2 z(k-1) = f1 z(k) = f2 gammao = gamma gamma = r(1) + ddot(k,s(2),1,y(1),1) pold = 2 psteps = psteps + 1 if (smin2 .lt. psi) psi = smin2 if (wantsm) then sigmin(k-1) = - smin1 sigmin(k) = smin2 endif goto 300 c c Perform a classical Levinson step: save the present y.k and c z.k , solve for a1 , e1 and f1 , and update the solutions. c 200 continue c print *,' perform a classical Levinson step' a1 = b1/gamma e1 = c1/gamma f1 = d1/gamma do 210 i = 1,k w(n+i) = y(i) w(3*n+i) = z(i) x(i) = x(i) + a1*y(k-i+1) y(n+i) = y(i) + e1*z(k-i+1) z(n+i) = z(i) + f1*y(k-i+1) 210 continue do 220 i = 1,k y(i) = y(n+i) z(i) = z(n+i) 220 continue k = k + 1 x(k) = a1 y(k) = e1 z(k) = f1 gammao = gamma gamma = gamma - c1*f1 pold = 1 if (smin1 .lt. psi) psi = smin1 if (wantsm) sigmin(k) = smin1 c c While k .lt. n repeat the main loop. c 300 continue if (k .lt. n) goto 100 c c Normal termination of main loop. Compute condition numbers. c 400 continue t = dmax1(dasum(n,r(1),1),dasum(n,s(1),1)) conda = t/psi if (pold .eq. 1) then condm = t/smin1 else condm = t/smin2 endif if (conda .lt. condm) conda = condm return c c Get here only if 2 consecutive exactly singular leading submatrices c were encountered. c 999 continue info = 2 if (wantsm) then do 1000 i = k+1,n sigmin(i) = 0.0d0 1000 continue endif return c end c c DGETEP-------------------------------------------------------------- c subroutine dgetep(n,r,s,b,pmax,wantsm,w,y,z, - x,conda,condm,psteps,sigmin,info) logical wantsm integer info,n,pmax,psteps double precision b(*),conda,condm,r(*),s(*),sigmin(*),w(*),x(*), - y(n,*),z(n,*) c c Subroutine dgetep solves a system of linear equations with a c general Toeplitz matrix, using the extended Levinson algorithm c with maximum block-size pmax . c c This routine is designed for pmax .gt. 2 ; subroutines dgetcl c and dgete2 are provided for the special cases pmax = 1 (clas- c sical Levinson) and pmax = 2 . c c The numbers conda and condm are estimates of the algorithm c and matrix condition numbers, respectively. If conda is much c greater than condm , then a bigger pmax is required in order c to 'skip over' a cluster of consecutive ill-conditioned leading c submatrices. c c The quantities psteps and sigmin are further diagnostic tools: c psteps is the number of block-steps taken, and the array sigmin c (if wanted) holds the estimates for the smallest singular values of c all the leading submatrices. Entries of sigmin corresponding to c submatrices that were 'skipped over' are negative. c c On entry c c n integer c the order of the Toeplitz matrix (n .gt. 0). c c r double precision(n) c the first row of the Toeplitz matrix. c c s double precision(n) c the first column of the Toeplitz matrix. c c b double precision(n) c the right-hand side of the linear system. c c pmax integer c the maximum block-size (pmax .gt. 0). c c wantsm logical c wantsm = true - return sigmin , c wantsm = false - do not return sigmin . c c w double precision(lw) c workspace whose length is given by c lw = pmax*(pmax*pmax + 20)/3 + 2*(pmax*pmax - 1) c Examples of lw as a function of pmax is given below: c pmax: 3 4 5 6 7 8 9 10 c lw : 45 78 123 182 257 350 463 598 c c y,z double precision(n,pmax+2) c two two-dimensional workspaces, used internally to hold c the solutions to the Yule-Walker problems. c c On return c c x double precision(n) c the solution to the linear system; if b is not required c then b and x may share the same storage locations. c c conda double precision c an estimate of the algorithm condition number, defined c as the 2-norm of the Toeplitz matrix divided by the c smallest singular value of all the leading submatrices, c except those that were 'skipped over' by the algorithm. c c condm double precision c an estimate of the classical 2-norm condition number of c the Toeplitz matrix. c c psteps integer c the number of block-steps used to solve the linear system c of equations with the extended Levinson algorithm. c c sigmin double precision(n) c if wantsm = true , then sigmin holds the estimates of c the smallest singular values of all the leading submatri- c ces; if wantsm = false , then sigmin is not accessed. c Entries of sigmin corresponding to submatrices that c were 'skipped over' are negative. c c info integer c info = 0 normal value, c info = 1 illegal n or pmax , c info = 2 a block-step failed because pmax consecutive c exactly singular leading submatrices were c encountered. c c This version dated 09/04/90. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c Subroutines and functions c c blas dasum,ddot,idamax c linpack dgefa,dgesl,dgesl2 c others dgesm,dgesv2,dgetcl,dgete2,dtplz c fortran dabs,dmax1,min0 c c Internal variables c integer a,c,d,e,f,g,inf,ip,ip1,ip2,ipvt,ipvt2,idamax, - k,kold,matp,p,pmax1,pmax2,pold double precision dasum,ddot,gamma,gammao,g12,g21,g22,psi,smin, - t,tau,u,yn,zn c c Check parameters and treat special cases. c psteps = 0 if (pmax .lt. 3) then if (pmax .eq. 1) - call dgetcl(n,r,s,b,wantsm,z,y(1,1),y(1,2), - x,conda,condm,sigmin,info) if (pmax .eq. 2) - call dgete2(n,r,s,b,wantsm,z,y(1,3),y(1,1), - x,conda,condm,psteps,sigmin,info) return endif k = 0 info = 0 if (n .eq. 1) then if (r(1) .eq. 0.0d0) goto 999 x(1) = b(1)/r(1) endif if (n .le. 0 .or. pmax .le. 0) then info = 1 return endif c c Initialize the pointers and set the factor tau . c a = 1 + pmax c = a + pmax d = c + pmax e = d + pmax f = e + pmax g = f + pmax ipvt2 = g + pmax*pmax pmax1 = pmax + 1 pmax2 = pmax + 2 tau = 1.0d-1 c c Special processing of the first p-by-p leading submatrix T.p . c Choose the block-size p such that it minimizes sigma.min(T.p) c for p = 1,...,min(n,pmax) . c ip = min0(n,pmax) w(1) = dabs(r(1)) call dgesv2(r(1),r(2),s(2),r(1),t,w(2)) ipvt = ipvt2 do 10 k = 3,ip ipvt = ipvt + (k-1)*k matp = ipvt + k call dtplz(k,r,s,w(matp)) call dgesm(w(matp),k,k,w(ipvt),w(k),w(e)) 10 continue p = idamax(ip,w(1),1) c print *,' first step had p =',p smin = w(p) if (smin .eq. 0.0d0) goto 999 psi = w(p) if (wantsm) then do 20 i = 1,p-1 sigmin(i) = -w(i) 20 continue sigmin(p) = w(p) endif c c Now solve the three p-by-p systems involving T.p . c if (p .eq. 1) then gammao = r(1) x(1) = b(1)/gammao y(1,1) = -r(2)/gammao z(1,1) = -s(2)/gammao else if (p .eq. 2) then call dgesl2(r(1),s(2),r(2),r(1),b(1),b(2),x(1),x(2),inf) t = -r(2) u = -r(3) call dgesl2(r(1),r(2),s(2),r(1),t,u,y(1,1),y(2,1),inf) t = -s(2) u = -s(3) call dgesl2(r(1),s(2),r(2),r(1),t,u,z(1,1),z(2,1),inf) else do 30 i = 1,p x(i) = b(i) y(i,1) = -r(i+1) z(i,1) = -s(i+1) 30 continue ipvt = ipvt2 + (p*(p*p - 1))/3 - 2 matp = ipvt + p call dgesl(w(matp),p,p,w(ipvt),x ,0) call dgesl(w(matp),p,p,w(ipvt),y(1,1),1) call dgesl(w(matp),p,p,w(ipvt),z(1,1),0) endif psteps = 1 endif k = p pold = p c c Compute the initial Schur complement gamma . c gamma = r(1) do 40 i = 1,p gamma = gamma + s(i+1)*y(i,1) 40 continue c c Main loop starts here. c if (k .eq. n) goto 700 100 continue c print *,'Enter main loop with k = ',k c c Prepare for a classical Levinson step. c p = 1 w(a) = b(k+1) - ddot(k,s(2),1,x(1) ,-1) w(c) = -r(k+2) - ddot(k,r(2),1,y(1,1),-1) w(d) = -s(k+2) - ddot(k,s(2),1,z(1,1),-1) ip = idamax(k,y(1,1),1) yn = dabs(y(ip,1)) ip = idamax(k,z(1,1),1) zn = dabs(z(ip,1)) w(1) = dabs(gamma)/dmax1(1.0d0,yn,zn,yn*zn) c c If the leading submatrix T.k+1 is well-conditioned, or if c k+1 = n , then perform a classical Levinson step ... c if (w(1) .ge. tau*smin .or. k+1 .eq. n) goto 500 c c ... else perform an inner loop to determine the optimal c block-size p adaptively. c 200 continue p = p + 1 c print *,' entering block phase: k,p =',k,p c c Compute y.k,p-1 and z.k,p-1 . If the previous step was a c classical step, then use the simple updating formula ... c if (pold .eq. 1) then c print *,' use simple updating formula to get y.k,p-1' t = w(c+p-2)/gammao u = w(d+p-2)/gammao do 210 i = 1,k-1 y(i,p) = y(i+1,p-1) - y(1,p-1)*y(i,1) + t*z(k-i,pmax1) z(i,p) = z(i+1,p-1) - z(1,p-1)*z(i,1) + u*y(k-i,pmax1) 210 continue y(k,p) = -y(1,p-1)*y(k,1) + t z(k,p) = -z(1,p-1)*z(k,1) + u c c ... else if the previous step was the first step of the algorithm, c then compute all y.k,i and z.k,i , i = 1,..,pmax-1 directly ... c elseif (pold. eq. k) then c print *,' previous step was the first' if (p .eq. 2) then do 230 j = 2,pmax if (k .eq. 2) then t = -r(1+j) u = -r(2+j) call dgesl2(r(1),r(2),s(2),r(1),t,u,y(1,j),y(2,j),inf) t = -s(1+j) u = -s(2+j) call dgesl2(r(1),s(2),r(2),r(1),t,u,z(1,j),z(2,j),inf) else do 220 i = 1,k y(i,j) = -r(i+j) z(i,j) = -s(i+j) 220 continue call dgesl(w(matp),k,k,w(ipvt),y(1,j),1) call dgesl(w(matp),k,k,w(ipvt),z(1,j),0) endif 230 continue endif c c ... else update the previous y.k,1 and z.k,1 (for p = 2 ) c and continue along this line. c elseif (p .eq. 2) then c print *,' update the previous y.k,p-1 ' kold = k - pold do 240 i = 1,pold w(e+i-1) = -r(kold+i+2) - ddot(kold,r(i+1),1,y(1,2),-1) w(f+i-1) = -s(kold+i+2) - ddot(kold,s(i+1),1,z(1,2),-1) 240 continue if (pold .eq. 2) then t = w(e) u = w(e+1) call dgesl2(gammao,g12,g21,g22,t,u,w(e),w(e+1),inf) t = w(f) u = w(f+1) call dgesl2(gammao,g21,g12,g22,t,u,w(f),w(f+1),inf) else call dgesl(w(matp),pold,pold,w(ipvt),w(e),1) call dgesl(w(matp),pold,pold,w(ipvt),w(f),0) endif do 250 i = 1,kold y(i,pmax2) = y(i,2) + w(e)*z(kold-i+1,pmax1) z(i,pmax2) = z(i,2) + w(f)*y(kold-i+1,pmax1) 250 continue y(kold+1,2) = w(e) z(kold+1,2) = w(f) do 270 j = 2,pold do 260 i = 1,kold y(i,pmax2) = y(i,pmax2) + w(e+j-1)*z(kold-i+1,j) z(i,pmax2) = z(i,pmax2) + w(f+j-1)*y(kold-i+1,j) 260 continue y(kold+j,2) = w(e+j-1) z(kold+j,2) = w(f+j-1) 270 continue do 280 i = 1,kold y(i,2) = y(i,pmax2) z(i,2) = z(i,pmax2) 280 continue elseif (p .eq. 3) then do 290 i = 1,k-1 y(i,pmax2) = (y(i,2) - y(i+1,1) + y(1,1)*y(i,1))/w(c) z(i,pmax2) = (z(i,2) - z(i+1,1) + z(1,1)*z(i,1))/w(d) y(i,3) = y(i+1,2) - y(1,2)*y(i,1) + w(c+1)*y(i,pmax2) z(i,3) = z(i+1,2) - z(1,2)*z(i,1) + w(d+1)*z(i,pmax2) 290 continue y(k,pmax2) = (y(k,2) + y(1,1)*y(k,1))/w(c) z(k,pmax2) = (z(k,2) + z(1,1)*z(k,1))/w(d) y(k,3) = -y(1,2)*y(k,1) + w(c+1)*y(k,pmax2) z(k,3) = -z(1,2)*z(k,1) + w(d+1)*z(k,pmax2) else do 300 i = 1,k-1 y(i,p) = y(i+1,p-1) - y(1,p-1)*y(i,1) + w(c+p-2)*y(i,pmax2) z(i,p) = z(i+1,p-1) - z(1,p-1)*z(i,1) + w(d+p-2)*z(i,pmax2) 300 continue y(k,p) = -y(1,p-1)*y(k,1) + w(c+p-2)*y(k,pmax2) z(k,p) = -z(1,p-1)*z(k,1) + w(d+p-2)*z(k,pmax2) endif c c Construct the Schur complement Gamma.p , factorize it, and estimate c its smallest singular value. c if (p .eq. 2) then g21 = s(2) + ddot(k,s(3),1,y(1,1),1) g12 = r(2) + ddot(k,s(2),1,y(1,2),1) g22 = r(1) + ddot(k,s(3),1,y(1,2),1) w(g) = gamma w(g+1) = g21 w(g+pmax) = g12 w(g+pmax1) = g22 ipvt = ipvt2 matp = ipvt + 2 w(matp) = gamma w(matp+1) = g21 w(matp+2) = g12 w(matp+3) = g22 call dgesv2(gamma,g12,g21,g22,u,t) else ipvt = ipvt + (p-1)*p matp = ipvt + p ip1 = g + p-1 ip2 = g + (p-1)*pmax do 310 i = 1,p-1 w(ip1) = s(p-i+1) + ddot(k,s(p+1),1,y(1,i),1) w(ip2) = r(p-i+1) + ddot(k,s(i+1),1,y(1,p),1) ip1 = ip1 + pmax ip2 = ip2 + 1 310 continue w(ip2) = r(1) + ddot(k,s(p+1),1,y(1,p),1) ip1 = matp do 330 j = 1,p ip2 = g + (j-1)*pmax do 320 i = 1,p w(ip1) = w(ip2) ip1 = ip1 + 1 ip2 = ip2 + 1 320 continue 330 continue call dgesm(w(matp),p,p,w(ipvt),t,w(e)) endif c c Construct the corresponding right-hand sides. c w(a+p-1) = b(k+p ) - ddot(k,s(p+1),1,x(1) ,-1) w(c+p-1) = -r(k+p+1) - ddot(k,r(p+1),1,y(1,1),-1) w(d+p-1) = -s(k+p+1) - ddot(k,s(p+1),1,z(1,1),-1) c c Estimate the smallest singular value sigma.min(T.k+p) of the c leading submatrix of order k+p . c ip = idamax(k,y(1,p),1) yn = dmax1(yn,dabs(y(ip,p))) ip = idamax(k,z(1,p),1) zn = dmax1(zn,dabs(z(ip,p))) w(p) = t/dmax1(1.0d0,yn,zn,yn*zn) c c If this T.k+p is well-conditioned, then perform an extended c Levinson step ... c if (w(p) .ge. tau*smin) goto 400 c c ... else, while p .le. pmax and k+p .lt. n , repeat the inner c loop. Otherwise, find the best possible block-size; i.e. the p c that maximizes sigma.min(T.k+p) , p = 1,...,min(pmax,p-k) . c if (k+p .eq. n) then p = idamax(p,w(1),1) elseif (p .lt. pmax) then goto 200 else p = idamax(pmax,w(1),1) endif c print *,' do best possible step' if (w(p) .eq. 0.0d0) goto 999 if (w(p) .lt. smin) smin = w(p) ipvt = ipvt2 + (p*(p*p - 1))/3 - 2 matp = ipvt + p if (p .eq. 1) goto 500 c c Perform an extended Levinson step: save the present y.k and z.k , c solve the small systems involving the Schur complement Gamma.p , c and update the solutions. c 400 continue c print *,' extended Levinson step with k,p =',k,p if (p .eq. 2) then t = w(a) u = w(a+1) call dgesl2(gamma,g21,g12,g22,t,u,w(a),w(a+1),inf) t = w(c) u = w(c+1) call dgesl2(gamma,g12,g21,g22,t,u,w(e),w(e+1),inf) t = w(d) u = w(d+1) call dgesl2(gamma,g21,g12,g22,t,u,w(f),w(f+1),inf) else do 410 i = 0,p-1 w(e+i) = w(c+i) w(f+i) = w(d+i) 410 continue call dgesl(w(matp),p,p,w(ipvt),w(a),0) call dgesl(w(matp),p,p,w(ipvt),w(e),1) call dgesl(w(matp),p,p,w(ipvt),w(f),0) endif do 420 i = 1,k x(i) = x(i) + w(a)*y(k-i+1,1) + w(a+1)*y(k-i+1,2) y(i,pmax1) = y(i,1) y(i,pmax2) = y(i,1) + w(e)*z(k-i+1,1) + w(e+1)*z(k-i+1,2) z(i,pmax1) = z(i,1) z(i,pmax2) = z(i,1) + w(f)*y(k-i+1,1) + w(f+1)*y(k-i+1,2) 420 continue x(k+1) = w(a) x(k+2) = w(a+1) y(k+1,1) = w(e) y(k+2,1) = w(e+1) z(k+1,1) = w(f) z(k+2,1) = w(f+1) do 440 j = 3,p do 430 i = 1,k x(i) = x(i) + w(a+j-1)*y(k-i+1,j) y(i,pmax2) = y(i,pmax2) + w(e+j-1)*z(k-i+1,j) z(i,pmax2) = z(i,pmax2) + w(f+j-1)*y(k-i+1,j) 430 continue x(k+j) = w(a+j-1) y(k+j,1) = w(e+j-1) z(k+j,1) = w(f+j-1) 440 continue do 450 i = 1,k y(i,1) = y(i,pmax2) z(i,1) = z(i,pmax2) 450 continue k = k + p gammao = gamma gamma = r(1) + ddot(k,s(2),1,y(1,1),1) pold = p psteps = psteps + 1 if (w(p) .lt. psi) psi = w(p) if (wantsm) then do 460 i = 1,p-1 sigmin(k+i) = -w(i) 460 continue sigmin(k+p) = w(p) endif goto 600 c c Perform a classical Levinson step: save the present y.k and z.k , c solve for w(a) , w(e) and w(f) , and update the solutions. c 500 continue c print *,' perform a classical Levinson step' w(a) = w(a)/gamma w(e) = w(c)/gamma w(f) = w(d)/gamma do 510 i = 1,k y(i,pmax1) = y(i,1) z(i,pmax1) = z(i,1) x(i) = x(i) + w(a)*y(k-i+1,1) y(i,pmax2) = y(i,1) + w(e)*z(k-i+1,1) z(i,pmax2) = z(i,1) + w(f)*y(k-i+1,1) 510 continue do 520 i = 1,k y(i,1) = y(i,pmax2) z(i,1) = z(i,pmax2) 520 continue k = k + 1 x(k) = w(a) y(k,1) = w(e) z(k,1) = w(f) gammao = gamma gamma = gamma - w(c)*w(f) pold = 1 if (w(1) .lt. psi) psi = w(1) if (wantsm) sigmin(k) = w(1) c c While k .lt. n repeat the main loop. c 600 continue if (k .lt. n) goto 100 c c Normal termination of main loop. Compute condition numbers. c 700 continue t = dmax1(dasum(n,r(1),1),dasum(n,s(1),1)) conda = t/psi condm = t/w(p) if (conda .lt. condm) conda = condm return c c Get here only if pmax consecutive exactly singular leading c submatrices were encountered. c 999 continue info = 2 if (wantsm) then do 1000 i = k+1,n sigmin(i) = 0.0d0 1000 continue endif return c end c c DGETST-------------------------------------------------------------- c program dgetst c c This program tests the following subroutines: c - dgetcl c - dgete2 c - dgetep c for solving nonsymmetric Toeplitz systems by means of the Levinson c algorithm. Subroutine dgetcl is an implementation of the classical c Levinson algorithm, while dgete2 and dgetep are implementations of c the extended Levinson algorithm. In dgete2, the maximum block-size c is 2, and in dgetep the user can specify any maximum block-size. c The algorithm is described in Hansen & Chan (1990) c c The subroutines are tested on the following Toeplitz matrices: c - nonsymmetric matrices with random elements, c - the nonsymmetric test-matrix from Sweet (1986), and c - the symmetric KMS test-matrices from Trench (1989). c c The performance of the subroutines is compared with that of the c subroutine tsld1 from the Argonne Toeplitz package. c c References: c P.C. Hansen & T.F. Chan, "Fortran subroutines for general Toeplitz c systems", Report CAM-90-xx, Dept. of Mathematics, UCLA, 1990. c Submitted to ACM Trans. Math. Software. c D.R. Sweet, "The use of pivoting to improve the numerical perfor- c mance of Toeplitz solvers"; in J.M. Speiser (Ed.), Advanced c Algorithms and Architectures for Signal Processing", Proc. c SPIE 696 (1986), 8-18. c W.F. Trence, "Numerical solutions of the eigenvalue problem for c Hermitian Toeplitz matrices", SIAM J. Matrix Anal. Appl. 10 c (1989), 135-146. c c This version dated 09/05/90. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c Subroutines and functions c c toeplitz dgetcl,dgete2,dgetep c others timing,tsld1,urand c fortran dble c c Internal variables c implicit double precision (a-z) logical wantsm integer itype,in,info,iy,n,pmax,psteps real ru1,ru2,time(6),timing double precision b(2048),conda,condm,dble,r(2048),relerr, - s(2048),sigmin(2048),w(8192),x(2048),y(2048,10),z(2048,10) c c Generate random matrices of order n = 512,1024,2048 and solve c by all the subroutines. For dgetep, use pmax equal to 4, 6 and 8. c Compute the overhead involved in the extended Levinson algorithm. c The exact solution consists of all ones. c write(6,1000) write(6,2001) write(6,1001) write(6,1002) iy = 82638346 do 100 in=9,11 n = 2**in c ru1 = urand(iy) ru2 = urand(iy) r(1) = dble(ru1) + 1.0d-7*dble(ru2) s(1) = dble(ru1) + 1.0d-7*dble(ru2) do 10 i=1,n ru1 = urand(iy) ru2 = urand(iy) r(i) = dble(ru1) + 1.0d-7*dble(ru2) ru1 = urand(iy) ru2 = urand(iy) s(i) = dble(ru1) + 1.0d-7*dble(ru2) 10 continue c do 50 i=1,n b(i) = 0.0d0 do 30 j=1,i-1 30 b(i) = b(i) + s(i-j+1) do 40 j=i,n 40 b(i) = b(i) + r(j-i+1) 50 continue c time(1) = timing() call tsld1(r,s(2),b,x,w,y,n) time(1) = timing() - time(1) if (time(1) .eq. 0.0d0) time(1) = 1.0d0 relerr = 0.0d0 do 61 i=1,n 61 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1003) n,1,relerr,time(1),0.0d0 c time(2) = timing() call dgetcl(n,r,s,b,wantsm,w,y,z,x,conda,condm,sigmin,info) time(2) = timing() - time(2) if (time(2) .eq. 0.0d0) time(2) = 1.0d0 if (info .ne. 0) then print *,'return from dgetcl with info =',info goto 999 endif relerr = 0.0d0 do 62 i=1,n 62 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1004) n,2,relerr,conda,condm,time(2), - (time(2)-time(1))/time(1),0.0d0 c time(3) = timing() call dgete2(n,r,s,b,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(3) = timing() - time(3) if (info .ne. 0) then print *,'returned from dgete2 with info =',info goto 999 endif relerr = 0.0d0 do 63 i=1,n 63 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,3,relerr,conda,condm,psteps,time(3), - (time(3)-time(1))/time(1),(time(3)-time(2))/time(2) c pmax = 4 time(4) = timing() call dgetep(n,r,s,b,pmax,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(4) = timing() - time(4) if (info .ne. 0) then print *,'returned from dgetep (pmax=4) with info =',info goto 999 endif relerr = 0.0d0 do 64 i=1,n 64 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,4,relerr,conda,condm,psteps,time(4), - (time(4)-time(1))/time(1),(time(4)-time(2))/time(2) c pmax = 6 time(5) = timing() call dgetep(n,r,s,b,pmax,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(5) = timing() - time(5) if (info .ne. 0) then print *,'returned from dgetep (pmax=6) with info =',info goto 999 endif relerr = 0.0d0 do 65 i=1,n 65 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,5,relerr,conda,condm,psteps,time(5), - (time(5)-time(1))/time(1),(time(5)-time(2))/time(2) c pmax = 8 time(6) = timing() call dgetep(n,r,s,b,pmax,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(6) = timing() - time(6) if (info .ne. 0) then print *,'returned from dgetep (pmax=8) with info =',info goto 999 endif relerr = 0.0d0 do 66 i=1,n 66 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,6,relerr,conda,condm,psteps,time(6), - (time(6)-time(1))/time(1),(time(6)-time(2))/time(2) c write(6,1002) c 100 continue c c Generate the test matrices from Sweet (1986) and solve by means c of all the subroutines. The exact solution is all ones. c write(6,2002) write(6,1001) write(6,1002) do 200 itype=1,3 c call dgeswe(itype,r,s,n) c do 150 i=1,n b(i) = 0.0d0 do 130 j=1,i-1 130 b(i) = b(i) + s(i-j+1) do 140 j=i,n 140 b(i) = b(i) + r(j-i+1) 150 continue c wantsm = .false. c time(1) = timing() call tsld1(r,s(2),b,x,w,y,n) time(1) = timing() - time(1) relerr = 0.0d0 do 161 i=1,n 161 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1003) n,1,relerr,time(1) c time(2) = timing() call dgetcl(n,r,s,b,wantsm,w,y,z,x,conda,condm,sigmin,info) time(2) = timing() - time(2) if (info .ne. 0) then print *,'return from dgetcl with info =',info goto 999 endif relerr = 0.0d0 do 162 i=1,n 162 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1004) n,2,relerr,conda,condm,time(2) c time(3) = timing() call dgete2(n,r,s,b,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(3) = timing() - time(3) if (info .ne. 0) then print *,'returned from dgete2 with info =',info goto 999 endif relerr = 0.0d0 do 163 i=1,n 163 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,3,relerr,conda,condm,psteps,time(3) c pmax = 4 time(4) = timing() call dgetep(n,r,s,b,pmax,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(4) = timing() - time(4) if (info .ne. 0) then print *,'returned from dgetep (pmax=4) with info =',info goto 999 endif relerr = 0.0d0 do 164 i=1,n 164 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,4,relerr,conda,condm,psteps,time(4) c pmax = 6 time(5) = timing() call dgetep(n,r,s,b,pmax,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(5) = timing() - time(5) if (info .ne. 0) then print *,'returned from dgetep (pmax=6) with info =',info goto 999 endif relerr = 0.0d0 do 165 i=1,n 165 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,5,relerr,conda,condm,psteps,time(5) c pmax = 8 time(6) = timing() call dgetep(n,r,s,b,pmax,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(6) = timing() - time(6) if (info .ne. 0) then print *,'returned from dgetep (pmax=8) with info =',info goto 999 endif relerr = 0.0d0 do 166 i=1,n 166 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,6,relerr,conda,condm,psteps,time(6) c write(6,1002) c 200 continue c c Generate three KMS test matrices from Trench (1989) and solve c the systems by means of all the subroutines. The exact solution c is all ones. c write(6,2003) write(6,1001) write(6,1002) do 300 in=9,11 n = 2**in call dsykms(n,r) c do 220 i=1,n 220 s(i) = r(i) c do 250 i=1,n b(i) = 0.0d0 do 230 j=1,i-1 230 b(i) = b(i) + s(i-j+1) do 240 j=i,n 240 b(i) = b(i) + r(j-i+1) 250 continue c wantsm = .false. c time(1) = timing() call tsld1(r,s(2),b,x,w,y,n) time(1) = timing() - time(1) if (time(1) .eq. 0.0d0) time(1) = 1.0d0 relerr = 0.0d0 do 261 i=1,n 261 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1003) n,1,relerr,time(1),0.0d0 c time(2) = timing() call dgetcl(n,r,s,b,wantsm,w,y,z,x,conda,condm,sigmin,info) time(2) = timing() - time(2) if (time(2) .eq. 0.0d0) time(2) = 1.0d0 if (info .ne. 0) then print *,'return from dgetcl with info =',info goto 999 endif relerr = 0.0d0 do 262 i=1,n 262 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1004) n,2,relerr,conda,condm,time(2), - (time(2)-time(1))/time(1),0.0d0 c time(3) = timing() call dgete2(n,r,s,b,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(3) = timing() - time(3) if (info .ne. 0) then print *,'returned from dgete2 with info =',info goto 999 endif relerr = 0.0d0 do 263 i=1,n 263 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,3,relerr,conda,condm,psteps,time(3), - (time(3)-time(1))/time(1),(time(3)-time(2))/time(2) c pmax = 4 time(4) = timing() call dgetep(n,r,s,b,pmax,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(4) = timing() - time(4) if (info .ne. 0) then print *,'returned from dgetep (pmax=4) with info =',info goto 999 endif relerr = 0.0d0 do 264 i=1,n 264 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,4,relerr,conda,condm,psteps,time(4), - (time(4)-time(1))/time(1),(time(4)-time(2))/time(2) c pmax = 6 time(5) = timing() call dgetep(n,r,s,b,pmax,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(5) = timing() - time(5) if (info .ne. 0) then print *,'returned from dgetep (pmax=6) with info =',info goto 999 endif relerr = 0.0d0 do 265 i=1,n 265 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,5,relerr,conda,condm,psteps,time(5), - (time(5)-time(1))/time(1),(time(5)-time(2))/time(2) c pmax = 8 time(6) = timing() call dgetep(n,r,s,b,pmax,wantsm,w,y,z,x, - conda,condm,psteps,sigmin,info) time(6) = timing() - time(6) if (info .ne. 0) then print *,'returned from dgetep (pmax=8) with info =',info goto 999 endif relerr = 0.0d0 do 266 i=1,n 266 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,6,relerr,conda,condm,psteps,time(6), - (time(6)-time(1))/time(1),(time(6)-time(2))/time(2) c write(6,1002) 300 continue c 1000 format(' ','**************************************************',/, - ' ','* Numerical test of extended Levinson algorithms *',/, - ' ','* for nonsymmetric Toeplitz matrices. *',/, - ' ','* Computer: Amdahl 5890. *',/, - ' ','**************************************************',/, - ' ',/, - ' ','Legend:',/, - ' ',' 1 - tsld1 (from Argonne Toeplitz package)',/, - ' ',' 2 - dgetcl, classical Levinson algorithm ', - 'with condition estimation',/, - ' ',' 3 - dgete2, extended Levinson algorithm ', - 'with fixed PMAX = 2',/, - ' ',' 4 - dgetep, extended Levinson algorithm, ', - 'general version with PMAX = 4',/, - ' ',' 5 - dgetep, extended Levinson algorithm, ', - 'general version with PMAX = 6',/, - ' ',' 6 - dgetep, extended Levinson algorithm, ', - 'general version with PMAX = 8') c 1001 format(' ',' N case error CONDA CONDM PSTEPS', - ' time overhead') 1002 format(' ','-------------------------------------------------', - '--------------------') 1003 format(' ',i4,i4,1pe11.2,' ', - ' ',0pf7.1,2pf7.1,'%') 1004 format(' ',i4,i4,1p3e11.2,' ',0pf7.1,2pf7.1,'%',2pf7.1,'%') 1005 format(' ',i4,i4,1p3e11.2,i4 ,0pf7.1,2pf7.1,'%',2pf7.1,'%') c 2001 format(' ',/,' ','Test matrices with random elements:',/,' ') 2002 format(' ',/,' ','Test matrices from Sweet (1986):',/,' ') 2003 format(' ',/,' ','Test matrices from Trench (1989):',/,' ') c 999 stop end c c DSCAL--------------------------------------------------------------- c subroutine dscal(n,da,dx,incx) c c scales a vector by a constant. c uses unrolled loops for increment equal to one. c jack dongarra, linpack, 3/11/78. c double precision da,dx(1) integer i,incx,m,mp1,n,nincx c if(n.le.0)return if(incx.eq.1)go to 20 c c code for increment not equal to 1 c nincx = n*incx do 10 i = 1,nincx,incx dx(i) = da*dx(i) 10 continue return c c code for increment equal to 1 c c c clean-up loop c 20 m = mod(n,5) if( m .eq. 0 ) go to 40 do 30 i = 1,m dx(i) = da*dx(i) 30 continue if( n .lt. 5 ) return 40 mp1 = m + 1 do 50 i = mp1,n,5 dx(i) = da*dx(i) dx(i + 1) = da*dx(i + 1) dx(i + 2) = da*dx(i + 2) dx(i + 3) = da*dx(i + 3) dx(i + 4) = da*dx(i + 4) 50 continue return end c c DSYKMS-------------------------------------------------------------- c subroutine dsykms(n,r) double precision r dimension r(*) c c Generate the symmetric KMS Toeplitz test matrix from W.F. Trench, c "Numerical solution of the eigenvalue problem for Hermitian c Toeplitz matrices", SIAM J. Matrix Anal. Appl. 10 (1989), 135-146. c c This version dated 07/10/1990. c Per Christian Hansen, UNI-C, Technical University of Denmark. c double precision eps c eps = 5.0d-15 r(1) = eps do 10 i=1,n-1 r(i+1) = 5.0d-1**i 10 continue c return end c c DSYSV2-------------------------------------------------------------- c subroutine dsysv2(p,q,r,sig1,sig2) double precision p,q,r,sig1,sig2 c c Compute the singular values sig1 .ge. sig2 .ge. 0 of the 2-by-2 c symmetric matrix ( p q ) . c ( q r ) c c This version dated 12/05/1989. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c fortran dabs,dsign,dsqrt c double precision a,b,c,s,t,tau c c Diagonalize the matrix. c if (q .eq. 0.0d0) then sig1 = dabs(p) sig2 = dabs(r) elseif (p .eq. r) then sig1 = dabs(p + q) sig2 = dabs(p - q) else a = r - p b = 2.0d0*q if (dabs(b) .gt. dabs(a)) then tau = a/b t = dsign(1.0d0,tau)/(dabs(tau) + dsqrt(1.0d0 + tau**2)) else tau = b/a t = tau/(1.0d0 + dsqrt(1.0d0 + tau**2)) endif c = 1.0d0/dsqrt(1.0d0 + t**2) s = t*c sig1 = dabs(c*(p*c - q*s) - s*(q*c - r*s)) sig2 = dabs(s*(p*s + q*c) + c*(q*s + r*c)) endif c c Sort the singular values. c if (sig1 .lt. sig2) then t = sig1 sig1 = sig2 sig2 = t endif c return end c c DSYSWE-------------------------------------------------------------- c subroutine dsyswe(r,n) double precision r dimension r(*) c c Generate the symmetric Toeplitz test matrix from D.R. Sweet, c "The use of pivoting to improve the numerical performance of c Toeplitz solvers"; in J.M. Speiser (Ed.), "Advanced Algorithms c and Architectures for Signal Processing", Proc. SPIE 696 (1986), c 8-18. c c The element r(3) was modified for double precision. c c This version dated 07/11/1990. c Per Christian Hansen, UNI-C, Technical University of Denmark. c double precision eps c eps = 5.0d-15 n = 6 r(1) = 2.0d1 r(2) = 1.5d1 r(3) = 2.5d0 + eps r(4) = 6.0d0 r(5) = 1.0d0 r(6) = -2.0d0 c return end c c DSYTCL-------------------------------------------------------------- c subroutine dsytcl(n,r,b,wantsm,w,y,x,conda,condm,sigmin,info) logical wantsm integer info,n double precision b(*),conda,condm,r(*),sigmin(*),w(*),x(*),y(*) c c Subroutine dsytcl solves a system of linear equations with a c symmetric Toeplitz matrix, using the classical Levinson algorithm. c c The numbers conda and condm are estimates of the algorithm c and matrix condition numbers, respectively. If conda is much c greater than condm , then the extended Levinson algorithm is c required in order to 'skip over' a cluster of consecutive ill- c conditioned leading submatrices. We recommend using one of the c subroutines dsyte2 and dsytep . c c The array sigmin (if wanted) is a further diagnostic tools: it c holds the estimates for the smallest singular values of all the c leading submatrices. c c On entry c c n integer c the order of the Toeplitz matrix (n .gt. 0). c c r double precision(n) c the first row of the symmetric Toeplitz matrix. c c b double precision(n) c the right-hand side of the linear system. c c wantsm logical c wantsm = true - return sigmin , c wantsm = false - do not return sigmin . c c w double precision(n) c workspace of length n . c c y double precision(n) c workspace of lenght n , used internally to hold the c solutions to the Yule-Walker problems. c c On return c c x double precision(n) c the solution to the linear system; if b is not required c then b and x may share the same storage locations. c c conda double precision c an estimate of the algorithm condition number, defined c as the 2-norm of the Toeplitz matrix divided by the c smallest singular value of all the leading submatrices. c c condm double precision c an estimate of the classical 2-norm condition number of c the Toeplitz matrix. c c sigmin double precision(n) c if wantsm = true , then sigmin holds the estimates of c the smallest singular values of all the leading submatri- c ces; if wantsm = false , then sigmin is not accessed. c c info integer c info = 0 normal value, c info = 1 illegal n , c info = 2 a step failed because an exactly singular c leading submatrix was encountered. c c This version dated 12/14/89. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c Subroutines and functions c c blas dasum,ddot,idamax c fortran dabs,dmax1 c c Internal variables c integer idamax,ip,k double precision alpha,beta,c,dasum,ddot,eta,gamma,s,smin,t,yn2 c c Initialization. c k = 0 info = 0 if (n .le. 0) then info = 1 return endif c c Compute the first elements of x and y . c if (r(1) .eq. 0.0d0) goto 999 x(1) = b(1)/r(1) y(1) = -r(2)/r(1) gamma = r(1) + r(2)*y(1) smin = dabs(r(1)) if (wantsm) sigmin(1) = smin s = smin k = 1 if (n .eq. 1) goto 200 c c Main loop starts here. c 100 continue c c Estimate the smallest singular value of the leading submatrix. c ip = idamax(k,y(1),1) yn2 = y(ip)**2 s = dabs(gamma)/dmax1(1.0d0,yn2) if (s .lt. smin) smin = s c c Perform a classical Levinson step and update the solutions. c beta = b(k+1) - ddot(k,r(2),1,x(1),-1) c = -r(k+2) - ddot(k,r(2),1,y(1),-1) if (gamma .eq. 0.0d0) goto 999 alpha = beta/gamma eta = c/gamma do 110 i = 1,k x(i) = x(i) + alpha*y(k-i+1) w(i) = y(i) + eta*y(k-i+1) 110 continue do 120 i = 1,k y(i) = w(i) 120 continue k = k + 1 x(k) = alpha y(k) = eta gamma = gamma - c*eta if (wantsm) sigmin(k) = s c c While k .lt. n repeat the main loop. c if (k .lt. n) goto 100 c c Normal termination of main loop. Compute condition numbers. c 200 continue t = dasum(n,r(1),1) conda = t/smin condm = t/s return c c Get here only if an exactly singular leading submatrix was encountered. c 999 continue info = 2 if (wantsm) then do 1000 i = 1,n sigmin(i) = 0.0d0 1000 continue endif return c end c c DSYTE2-------------------------------------------------------------- c subroutine dsyte2(n,r,b,wantsm,w,y, - x,conda,condm,psteps,sigmin,info) logical wantsm integer info,n,psteps double precision b(*),conda,condm,r(*),sigmin(*),w(*),x(*),y(*) c c Subroutine dsyte2 solves a system of linear equations with a c symmetric Toeplitz matrix, using the extended Levinson algorithm c with maximum block-size equal to 2. c c The numbers conda and condm are estimates of the algorithm c and matrix condition numbers, respectively. If conda is much c greater than condm , then a bigger block size is required in order c to 'skip over' a cluster of consecutive ill-conditioned leading c submatrices. We recommend using the subroutine dsytep . c c The quantities psteps and sigmin are further diagnostic tools: c psteps is the number of block-steps taken, and the array sigmin c (if wanted) holds the estimates for the smallest singular values of c all the leading submatrices. Entries of sigmin corresponding to c submatrices that were 'skipped over' are negative. c c On entry c c n integer c the order of the Toeplitz matrix (n .gt. 0). c c r double precision(n) c the first row of the symmetric Toeplitz matrix. c c b double precision(n) c the right-hand side of the linear system. c c wantsm logical c wantsm = true - return sigmin , c wantsm = false - do not return sigmin . c c w double precision(2*n) c workspace of length 2*n . c c y double precision(2*n) c workspace of lenght 2*n , used internally to c hold the solutions to the Yule-Walker problems. c c On return c c x double precision(n) c the solution to the linear system; if b is not required c then b and x may share the same storage locations. c c conda double precision c an estimate of the algorithm condition number, defined c as the 2-norm of the Toeplitz matrix divided by the c smallest singular value of all the leading submatrices, c except those that were 'skipped over' by the algorithm. c c condm double precision c an estimate of the classical 2-norm condition number of c the Toeplitz matrix. c c psteps integer c the number of block-steps used to solve the linear system c of equations with the extended Levinson algorithm. c c sigmin double precision(n) c if wantsm = true , then sigmin holds the estimates of c the smallest singular values of all the leading submatri- c ces; if wantsm = false , then sigmin is not accessed. c Entries of sigmin corresponding to submatrices that c were 'skipped over' are negative. c c info integer c info = 0 normal value, c info = 1 illegal n , c info = 2 a block-step failed because 2 consecutive c exactly singular leading submatrices were c encountered. c c This version dated 09/04/90. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c Subroutines and functions c c blas dasum,ddot,idamax c others dgesl2,dsysv2 c fortran dabs,dmax1,dmin1 c c Internal variables c integer idamax,inf,ip,k,pold double precision a1,a2,b1,b2,c1,c2,dasum,ddot,e1,e2,gamma,gammao, - g21,g22,psi,s,smin,smin1,smin2,t,tau,yn,yn2 c c Check parameters, treat the n=1 case, and set the factor tau . c psteps = 0 k = 0 info = 0 if (n .eq. 1) then if (r(1) .eq. 0.0d0) goto 999 x(1) = b(1)/r(1) endif if (n .le. 0) then info = 1 return endif tau = 1.0d-1 c c Special processing of the first p-by-p leading submatrix T.p . c Choose the block-size p such that it minimizes sigma.min(T.p) c for p = 1,2 , and solve the two p-by-p systems involving T.p . c smin1 = dabs(r(1)) smin2 = dmin1(dabs(r(1)+r(2)),dabs(r(1)-r(2))) if (smin1 .gt. smin2) then gammao = r(1) x(1) = b(1)/gammao y(1) = -r(2)/gammao gamma = gammao + r(2)*y(1) psi = smin1 smin = smin1 if (wantsm) sigmin(1) = smin1 k = 1 pold = 1 c print *,'First step was a 1-step' else if (smin2 .eq. 0.0d0) goto 999 call dgesl2(r(1),r(2),r(2),r(1),b(1),b(2),x(1),x(2),inf) if (inf .eq. 1) goto 999 c1 = -r(2) c2 = -r(3) call dgesl2(r(1),r(2),r(2),r(1),c1,c2,y(1),y(2),inf) gamma = r(1) + r(2)*y(1) + r(3)*y(2) psi = smin2 smin = smin2 if (wantsm) then sigmin(1) = - smin1 sigmin(2) = smin2 endif k = 2 pold = 2 psteps = 1 c print *,'First step was a 2-step' endif c c Main loop starts here. c if (k .eq. n) goto 400 100 continue c print *,'Enter main loop with k = ',k c c Prepare for a classical Levinson step. c b1 = b(k+1) - ddot(k,r(2),1,x(1),-1) c1 = -r(k+2) - ddot(k,r(2),1,y(1),-1) ip = idamax(k,y(1),1) yn = dabs(y(ip)) yn2 = yn**2 smin1 = dabs(gamma)/dmax1(1.0d0,yn2) c c If the leading submatrix T.k+1 is well-conditioned, or if c k+1 = n , then perform a classical Levinson step ... c if (smin1 .ge. tau*smin .or. k+1 .eq. n) goto 200 c c ... else compute y.k,1 . If the previous step was a classical c step, then use the simple updating formula ... c c print *,' entering block phase: p = 2, k =',k if (pold .eq. 1) then c print *,' use simple updating formula to get y.k,1' t = c1/gammao do 110 i = 1,k-1 y(n+i) = y(i+1) - y(1)*y(i) + t*w(n+k-i) 110 continue y(n+k) = -y(1)*y(k) + t c c ... else if the previous step was the first step of the algorithm, c then compute y.k,1 directly ... c elseif (pold. eq. k) then c print *,' previous step was the first' a1 = -r(3) a2 = -r(4) call dgesl2(r(1),r(2),r(2),r(1),a1,a2,y(n+1),y(n+2),inf) c c ... else update the previous y.k,1 . c else c print *,' update the previous y.k,1 ' a1 = -r(k+1) - ddot(k-2,r(2),1,y(n+1),-1) a2 = -r(k+2) - ddot(k-2,r(3),1,y(n+1),-1) call dgesl2(gammao,g21,g21,g22,a1,a2,e1,e2,inf) do 120 i = 1,k-2 w(i) = y(n+i) + e1*w(n+k-i-1) + e2*y(n+k-i-1) 120 continue do 130 i = 1,k-2 y(n+i) = w(i) 130 continue y(n+k-1) = e1 y(n+k) = e2 endif c c Construct the Schur complement Gamma.2 and compute its smallest c singular value. Also, construct the corresponding right-hand sides. c g21 = r(2) + ddot(k,r(3),1,y(1) ,1) g22 = r(1) + ddot(k,r(3),1,y(n+1),1) call dsysv2(gamma,g21,g22,s,t) b2 = b(k+2) - ddot(k,r(3),1,x(1),-1) c2 = -r(k+3) - ddot(k,r(3),1,y(1),-1) c c Estimate the smallest singular value sigma.min(T.k+2) of the c leading submatrix of order k+2 . c ip = idamax(k,y(n+1),1) yn = dmax1(yn,dabs(y(n+ip))) yn2 = yn**2 smin2 = t/dmax1(1.0d0,yn2) c c If this T.k+2 is not well-conditioned, then find the best possible c block-size; i.e. the p that maximizes sigma.min(T.k+p) , p = 1,2 . c if (smin2 .lt. tau*smin) then c print *,' do best possible step' h = dmax1(smin1,smin2) if (h .lt. smin) smin = h if (smin1 .gt. smin2) goto 200 if (smin2 .eq. 0.0d0) goto 999 endif c c Perform an extended Levinson step: save the present y.k , solve c the small systems involving the Schur complement Gamma.2 , and c update the solutions. c c print *,' perform an extended Levinson step with p = 2' call dgesl2(gamma,g21,g21,g22,b1,b2,a1,a2,inf) if (inf .eq. 1) goto 999 call dgesl2(gamma,g21,g21,g22,c1,c2,e1,e2,inf) do 140 i = 1,k w(n+i) = y(i) x(i) = x(i) + a1*y(k-i+1) + a2*y(n+k-i+1) w(i) = y(i) + e1*y(k-i+1) + e2*y(n+k-i+1) 140 continue do 150 i = 1,k y(i) = w(i) 150 continue k = k + 2 x(k-1) = a1 x(k) = a2 y(k-1) = e1 y(k) = e2 gammao = gamma gamma = r(1) + ddot(k,r(2),1,y(1),1) pold = 2 psteps = psteps + 1 if (smin2 .lt. psi) psi = smin2 if (wantsm) then sigmin(k-1) = - smin1 sigmin(k) = smin2 endif goto 300 c c Perform a classical Levinson step: save the present y.k , solve c for a1 and e1 , and update the solutions. c 200 continue c print *,' perform a classical Levinson step' a1 = b1/gamma e1 = c1/gamma do 210 i = 1,k w(n+i) = y(i) x(i) = x(i) + a1*y(k-i+1) y(n+i) = y(i) + e1*y(k-i+1) 210 continue do 220 i = 1,k y(i) = y(n+i) 220 continue k = k + 1 x(k) = a1 y(k) = e1 gammao = gamma gamma = gamma - c1*e1 pold = 1 if (smin1 .lt. psi) psi = smin1 if (wantsm) sigmin(k) = smin1 c c While k .lt. n repeat the main loop. c 300 continue if (k .lt. n) goto 100 c c Normal termination of main loop. Compute condition numbers. c 400 continue t = dasum(n,r(1),1) conda = t/psi if (pold .eq. 1) then condm = t/smin1 else condm = t/smin2 endif return c c Get here only if 2 consecutive exactly singular leading submatrices c were encountered. c 999 continue info = 2 if (wantsm) then do 1000 i = k+1,n sigmin(i) = 0.0d0 1000 continue endif return c end c c DSYTEP-------------------------------------------------------------- c subroutine dsytep(n,r,b,pmax,wantsm,w,y, - x,conda,condm,psteps,sigmin,info) logical wantsm integer info,n,pmax,psteps double precision b(*),conda,condm,r(*),sigmin(*),w(*),x(*),y(n,*) c c Subroutine dsytep solves a system of linear equations with a c symmetric Toeplitz matrix, using the extended Levinson algorithm c with maximum block-size pmax . c c This routine is designed for pmax .gt. 2 ; subroutines dsytcl c and dsyte2 are provided for the special cases pmax = 1 (clas- c sical Levinson) and pmax = 2 . c c The numbers conda and condm are estimates of the algorithm c and matrix condition numbers, respectively. If conda is much c greater than condm , then a bigger pmax is required in order c to 'skip over' a cluster of consecutive ill-conditioned leading c submatrices. c c The quantities psteps and sigmin are further diagnostic tools: c psteps is the number of block-steps taken, and the array sigmin c (if wanted) holds the estimates for the smallest singular values of c all the leading submatrices. Entries of sigmin corresponding to c submatrices that were 'skipped over' are negative. c c On entry c c n integer c the order of the Toeplitz matrix (n .gt. 0). c c r double precision(n) c the first row of the symmetric Toeplitz matrix. c c b double precision(n) c the right-hand side of the linear system. c c pmax integer c the maximum block-size (pmax .gt. 0). c c wantsm logical c wantsm = true - return sigmin , c wantsm = false - do not return sigmin . c c w double precision(lw) c workspace whose length is given by c lw = pmax*(pmax*pmax + 14)/3 + 2*(pmax*pmax - 1) c Examples of lw as a function of pmax is given below: c pmax: 3 4 5 6 7 8 9 10 c lw : 39 70 113 170 243 334 445 578 c c y double precision(n,pmax+2) c two-dimensional workspace, used internally to hold c the solutions to the Yule-Walker problems. c c On return c c x double precision(n) c the solution to the linear system; if b is not required c then b and x may share the same storage locations. c c conda double precision c an estimate of the algorithm condition number, defined c as the 2-norm of the Toeplitz matrix divided by the c smallest singular value of all the leading submatrices, c except those that were 'skipped over' by the algorithm. c c condm double precision c an estimate of the classical 2-norm condition number of c the Toeplitz matrix. c c psteps integer c the number of block-steps used to solve the linear system c of equations with the extended Levinson algorithm. c c sigmin double precision(n) c if wantsm = true , then sigmin holds the estimates of c the smallest singular values of all the leading submatri- c ces; if wantsm = false , then sigmin is not accessed. c Entries of sigmin corresponding to submatrices that c were 'skipped over' are negative. c c info integer c info = 0 normal value, c info = 1 illegal n or pmax , c info = 2 a block-step failed because pmax consecutive c exactly singular leading submatrices were c encountered. c c This version dated 09/04/90. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c Subroutines and functions c c blas dasum,ddot,idamax c linpack dgefa,dgesl,dgesl2 c others dgesm,dsysv2,dsytcl,dsyte2,dtplz c fortran dabs,dmax1,dmin1,min0 c c Internal variables c integer a,c,e,g,idamax,inf,ip,ip1,ip2,ipvt,ipvt2, - k,kold,matp,p,pmax1,pmax2,pold double precision dasum,ddot,gamma,gammao,g12,g22,psi,s,smin, - t,tau,yn,yn2 c c Check parameters and treat special cases. c psteps = 0 if (pmax .lt. 3) then if (pmax .eq. 1) - call dsytcl(n,r,b,wantsm,y(1,2),y(1,1), - x,conda,condm,sigmin,info) if (pmax .eq. 2) - call dsyte2(n,r,b,wantsm,y(1,3),y(1,1), - x,conda,condm,psteps,sigmin,info) return endif k = 0 info = 0 if (n .eq. 1) then if (r(1) .eq. 0.0d0) goto 999 x(1) = b(1)/r(1) endif if (n .le. 0 .or. pmax .le. 0) then info = 1 return endif c c Initialize the pointers and set the factor tau . c a = 1 + pmax c = a + pmax e = c + pmax g = e + pmax ipvt2 = g + pmax*pmax pmax1 = pmax + 1 pmax2 = pmax + 2 tau = 1.0d-1 c c Special processing of the first p-by-p leading submatrix T.p . c Choose the block-size p such that it minimizes sigma.min(T.p) c for p = 1,...,min(n,pmax) . c ip = min0(n,pmax) w(1) = dabs(r(1)) w(2) = dmin1(dabs(r(1)+r(2)),dabs(r(1)-r(2))) ipvt = ipvt2 do 10 k = 3,ip ipvt = ipvt + (k-1)*k matp = ipvt + k call dtplz(k,r,r,w(matp)) call dgesm(w(matp),k,k,w(ipvt),w(k),w(e)) 10 continue p = idamax(ip,w(1),1) smin = w(p) if (smin .eq. 0.0d0) goto 999 psi = w(p) if (wantsm) then do 20 i = 1,p-1 sigmin(i) = - w(i) 20 continue sigmin(p) = w(p) endif c c Now solve the two p-by-p systems involving T.p . c if (p .eq. 1) then gammao = r(1) x(1) = b(1)/gammao y(1,1) = -r(2)/gammao else if (p .eq. 2) then call dgesl2(r(1),r(2),r(2),r(1),b(1),b(2),x(1),x(2),inf) s = -r(2) t = -r(3) call dgesl2(r(1),r(2),r(2),r(1),s,t,y(1,1),y(2,1),inf) else do 30 i = 1,p x(i) = b(i) y(i,1) = -r(i+1) 30 continue ipvt = ipvt2 + (p*(p*p - 1))/3 - 2 matp = ipvt + p call dgesl(w(matp),p,p,w(ipvt),x ,0) call dgesl(w(matp),p,p,w(ipvt),y(1,1),0) endif psteps = 1 endif pold = p k = p c c Compute the initial Schur complement gamma . c gamma = r(1) do 40 i = 1,p gamma = gamma + r(i+1)*y(i,1) 40 continue c c Main loop starts here. c if (k .eq. n) goto 700 100 continue c print *,'Enter main loop with k = ',k c c Prepare for a classical Levinson step. c p = 1 w(a) = b(k+1) - ddot(k,r(2),1,x(1) ,-1) w(c) = -r(k+2) - ddot(k,r(2),1,y(1,1),-1) ip = idamax(k,y(1,1),1) yn = dabs(y(ip,1)) yn2 = yn**2 w(1) = dabs(gamma)/dmax1(1.0d0,yn2) c c If the leading submatrix T.k+1 is well-conditioned, or if c k+1 = n , then perform a classical Levinson step ... c if (w(1) .ge. tau*smin .or. k+1 .eq. n) goto 500 c c ... else perform an inner loop to determine the optimal c block-size p adaptively. c 200 continue p = p + 1 c print *,' entering block phase: k,p =',k,p c c Compute y.k,p-1 . If the previous step was a classical step, c then use the simple updating formula ... c if (pold .eq. 1) then c print *,' use simple updating formula to get y.k,p-1' t = w(c+p-2)/gammao do 210 i = 1,k-1 y(i,p) = y(i+1,p-1) - y(1,p-1)*y(i,1) + t*y(k-i,pmax1) 210 continue y(k,p) = -y(1,p-1)*y(k,1) + t c c ... else if the previous step was the first step of the algorithm, c then compute all y.k,i , i = 1,..,pmax-1 directly ... c elseif (pold. eq. k) then c print *,' previous step was the first' if (p .eq. 2) then do 230 j = 2,pmax if (k .eq. 2) then s = -r(1+j) t = -r(2+j) call dgesl2(r(1),r(2),r(2),r(1),s,t,y(1,j),y(2,j),inf) else do 220 i = 1,k y(i,j) = -r(i+j) 220 continue call dgesl(w(matp),k,k,w(ipvt),y(1,j),0) endif 230 continue endif c c ... else update the previous y.k,1 (for p = 2 ) and continue c along this line. c elseif (p .eq. 2) then c print *,' update the previous y.k,p-1 ' kold = k - pold do 240 i = 1,pold w(e+i-1) = -r(kold+i+2) - ddot(kold,r(i+1),1,y(1,2),-1) 240 continue if (pold .eq. 2) then s = w(e) t = w(e+1) call dgesl2(r(1),r(2),r(2),r(1),s,t,w(e),w(e+1),inf) else call dgesl(w(matp),pold,pold,w(ipvt),w(e),0) endif do 250 i = 1,kold y(i,pmax2) = y(i,2) + w(e)*y(kold-i+1,pmax1) 250 continue y(kold+1,2) = w(e) do 270 j = 2,pold do 260 i = 1,kold y(i,pmax2) = y(i,pmax2) + w(e+j-1)*y(kold-i+1,j) 260 continue y(kold+j,2) = w(e+j-1) 270 continue do 280 i = 1,kold y(i,2) = y(i,pmax2) 280 continue elseif (p .eq. 3) then do 290 i = 1,k-1 y(i,pmax2) = (y(i,2) - y(i+1,1) + y(1,1)*y(i,1))/w(c) y(i,3) = y(i+1,2) - y(1,2)*y(i,1) + w(c+1)*y(i,pmax2) 290 continue y(k,pmax2) = (y(k,2) + y(1,1)*y(k,1))/w(c) y(k,3) = -y(1,2)*y(k,1) + w(c+1)*y(k,pmax2) else do 300 i = 1,k-1 y(i,p) = y(i+1,p-1) - y(1,p-1)*y(i,1) + w(c+p-2)*y(i,pmax2) 300 continue y(k,p) = -y(1,p-1)*y(k,1) + w(c+p-2)*y(k,pmax2) endif c c Construct the Schur complement Gamma.p , factorize it, and estimate c its smallest singular value. c if (p .eq. 2) then g12 = r(2) + ddot(k,r(3),1,y(1,1),1) g22 = r(1) + ddot(k,r(3),1,y(1,2),1) w(g) = gamma w(g+1) = g12 w(g+pmax) = g12 w(g+pmax1) = g22 ipvt = ipvt2 matp = ipvt + 2 w(matp) = gamma w(matp+1) = g12 w(matp+2) = g12 w(matp+3) = g22 call dsysv2(gamma,g12,g22,s,t) else ipvt = ipvt + (p-1)*p matp = ipvt + p ip1 = g + p-1 ip2 = g + (p-1)*pmax do 310 i = 1,p-1 w(ip1) = r(p-i+1) + ddot(k,r(p+1),1,y(1,i),1) w(ip2) = w(ip1) ip1 = ip1 + pmax ip2 = ip2 + 1 310 continue w(ip2) = r(1) + ddot(k,r(p+1),1,y(1,p),1) ip1 = matp do 330 j = 1,p ip2 = g + (j-1)*pmax do 320 i = 1,p w(ip1) = w(ip2) ip1 = ip1 + 1 ip2 = ip2 + 1 320 continue 330 continue call dgesm(w(matp),p,p,w(ipvt),t,w(e)) endif c c Construct the corresponding right-hand sides. c w(a+p-1) = b(k+p ) - ddot(k,r(p+1),1,x(1) ,-1) w(c+p-1) = -r(k+p+1) - ddot(k,r(p+1),1,y(1,1),-1) c c Estimate the smallest singular value sigma.min(T.k+p) of the c leading submatrix of order k+p . c ip = idamax(k,y(1,p),1) yn = dmax1(yn,dabs(y(ip,p))) yn2 = yn**2 w(p) = t/dmax1(1.0d0,yn2) c c If this T.k+p is well-conditioned, then perform an extended c Levinson step ... c if (w(p) .ge. tau*smin) goto 400 c c ... else, while p .le. pmax and k+p .lt. n , repeat the inner c loop. Otherwise, find the best possible block-size; i.e. the p c that maximizes sigma.min(T.k+p) , p = 1,...,min(pmax,p-k) . c if (k+p .eq. n) then p = idamax(p,w(1),1) elseif (p .lt. pmax) then goto 200 else p = idamax(pmax,w(1),1) endif c print *,' do best possible step' if (w(p) .eq. 0.0d0) goto 999 if (w(p) .lt. smin) smin = w(p) ipvt = ipvt2 + (p*(p*p - 1))/3 - 2 matp = ipvt + p if (p .eq. 1) goto 500 c c Perform an extended Levinson step: save the present y.k , solve c the small systems involving the Schur complement Gamma.p , and c update the solutions. c 400 continue c print *,' perform an extended Levinson step with p =',p if (p .eq. 2) then s = w(a) t = w(a+1) call dgesl2(gamma,g12,g12,g22,s,t,w(a),w(a+1),inf) s = w(c) t = w(c+1) call dgesl2(gamma,g12,g12,g22,s,t,w(e),w(e+1),inf) else do 410 i = 0,p-1 w(e+i) = w(c+i) 410 continue call dgesl(w(matp),p,p,w(ipvt),w(a),0) call dgesl(w(matp),p,p,w(ipvt),w(e),0) endif do 420 i = 1,k x(i) = x(i) + w(a)*y(k-i+1,1) + w(a+1)*y(k-i+1,2) y(i,pmax1) = y(i,1) y(i,pmax2) = y(i,1) + w(e)*y(k-i+1,1) + w(e+1)*y(k-i+1,2) 420 continue x(k+1) = w(a) x(k+2) = w(a+1) y(k+1,1) = w(e) y(k+2,1) = w(e+1) do 440 j = 3,p do 430 i = 1,k x(i) = x(i) + w(a+j-1)*y(k-i+1,j) y(i,pmax2) = y(i,pmax2) + w(e+j-1)*y(k-i+1,j) 430 continue x(k+j) = w(a+j-1) y(k+j,1) = w(e+j-1) 440 continue do 450 i = 1,k y(i,1) = y(i,pmax2) 450 continue k = k + p gammao = gamma gamma = r(1) + ddot(k,r(2),1,y(1,1),1) pold = p psteps = psteps + 1 if (w(p) .lt. psi) psi = w(p) if (wantsm) then do 460 i = 1,p-1 sigmin(k+i) = - w(i) 460 continue sigmin(k+p) = w(p) endif goto 600 c c Perform a classical Levinson step: save the present y.k , solve c for w(a) and w(e) , and update the solutions. c 500 continue c print *,' perform a classical Levinson step' w(a) = w(a)/gamma w(e) = w(c)/gamma do 510 i = 1,k y(i,pmax1) = y(i,1) x(i) = x(i) + w(a)*y(k-i+1,1) y(i,pmax2) = y(i,1) + w(e)*y(k-i+1,1) 510 continue do 520 i = 1,k y(i,1) = y(i,pmax2) 520 continue k = k + 1 x(k) = w(a) y(k,1) = w(e) gammao = gamma gamma = gamma - w(c)*w(e) pold = 1 if (w(1) .lt. psi) psi = w(1) if (wantsm) sigmin(k) = w(1) c c While k .lt. n repeat the main loop. c 600 continue if (k .lt. n) goto 100 c c Normal termination of main loop. Compute condition numbers. c 700 continue t = dasum(n,r(1),1) conda = t/psi condm = t/w(p) return c c Get here only if pmax consecutive exactly singular leading c submatrices were encountered. c 999 continue info = 2 if (wantsm) then do 1000 i = k+1,n sigmin(i) = 0.0d0 1000 continue endif return c end c c DSYTST-------------------------------------------------------------- c program dsytst c c This program tests the following subroutines: c - dsytcl c - dsyte2 c - dsytep c for solving symmetric Toeplitz systems by means of the Levinson c algorithm. Subroutine dsytcl is an implementation of the classical c Levinson algorithm, while dsyte2 and dsytep are implementations of c the extended Levinson algorithm. In dsyte2, the maximum block-size c is 2, and in dsytep the user can specify any maximum block-size. c The algorithm is described in Hansen & Chan (1990) c c The subroutines are tested on the following Toeplitz matrices: c - symmetric matrices with random elements, c - the symmetric test-matrix from Sweet (1986), and c - the symmetric KMS test-matrices from Trench (1989). c c The performance of the subroutines is compared with that of the c subroutine tsld1 from the Argonne Toeplitz package. c c References: c P.C. Hansen & T.F. Chan, "Fortran subroutines for general Toeplitz c systems", Report CAM-90-xx, Dept. of Mathematics, UCLA, 1990. c Submitted to ACM Trans. Math. Software. c D.R. Sweet, "The use of pivoting to improve the numerical perfor- c mance of Toeplitz solvers"; in J.M. Speiser (Ed.), Advanced c Algorithms and Architectures for Signal Processing", Proc. c SPIE 696 (1986), 8-18. c W.F. Trence, "Numerical solutions of the eigenvalue problem for c Hermitian Toeplitz matrices", SIAM J. Matrix Anal. Appl. 10 c (1989), 135-146. c c This version dated 09/05/90. c Per Christian Hansen, UNI-C, Technical University of Denmark. c c Subroutines and functions c c toeplitz dsytcl,dsyte2,dsytep c others timing,tsld1,urand c fortran dble,iabs c c Internal variables c logical wantsm integer iabs,in,info,iy,n,pmax,psteps real ru,time(6),timing double precision b(2048),conda,condm,dble,r(2048),relerr, - sigmin(2048),w(4096),x(2048),y(2048,10) c c Generate random matrices of order n = 512,1024,2048 and solve c by all the subroutines. For dsytep, use pmax equal to 4, 6 and 8. c Compute the overhead involved in the extended Levinson algorithm. c The exact solution consists of all ones. c write(6,1000) write(6,2001) write(6,1001) write(6,1002) iy = 223366 do 100 in=9,11 n = 2**in c do 10 i=1,n ru = urand(iy) 10 r(i) = dble(ru) c do 50 i=1,n b(i) = 0.0d0 do 40 j=1,n 40 b(i) = b(i) + r(iabs(i-j)+1) 50 continue c wantsm = .false. c time(1) = timing() call tsld1(r,r(2),b,x,w,y,n) time(1) = timing() - time(1) if (time(1) .eq. 0.0d0) time(1) = 1.0d0 relerr = 0.0d0 do 61 i=1,n 61 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1003) n,1,relerr,time(1),0.0d0 c time(2) = timing() call dsytcl(n,r,b,wantsm,w,y,x,conda,condm,sigmin,info) time(2) = timing() - time(2) if (time(2) .eq. 0.0d0) time(2) = 1.0d0 if (info .ne. 0) then print *,'return from dsytcl with info =',info goto 999 endif relerr = 0.0d0 do 62 i=1,n 62 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1004) n,2,relerr,conda,condm,time(2), - (time(2)-time(1))/time(1),0.0d0 c time(3) = timing() call dsyte2(n,r,b,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(3) = timing() - time(3) if (info .ne. 0) then print *,'returned from dsyte2 with info =',info goto 999 endif relerr = 0.0d0 do 63 i=1,n 63 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,3,relerr,conda,condm,psteps,time(3), - (time(3)-time(1))/time(1),(time(3)-time(2))/time(2) c pmax = 4 time(4) = timing() call dsytep(n,r,b,pmax,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(4) = timing() - time(4) if (info .ne. 0) then print *,'returned from dsytep (pmax=4) with info =',info goto 999 endif relerr = 0.0d0 do 64 i=1,n 64 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,4,relerr,conda,condm,psteps,time(4), - (time(4)-time(1))/time(1),(time(4)-time(2))/time(2) c pmax = 6 time(5) = timing() call dsytep(n,r,b,pmax,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(5) = timing() - time(5) if (info .ne. 0) then print *,'returned from dsytep (pmax=6) with info =',info goto 999 endif relerr = 0.0d0 do 65 i=1,n 65 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,5,relerr,conda,condm,psteps,time(5), - (time(5)-time(1))/time(1),(time(5)-time(2))/time(2) c pmax = 8 time(6) = timing() call dsytep(n,r,b,pmax,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(6) = timing() - time(6) if (info .ne. 0) then print *,'returned from dsytep (pmax=8) with info =',info goto 999 endif relerr = 0.0d0 do 66 i=1,n 66 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,6,relerr,conda,condm,psteps,time(6), - (time(6)-time(1))/time(1),(time(6)-time(2))/time(2) c write(6,1002) 100 continue c c Generate the test matrix from Sweet (1986) and solve it by means c of all the subroutines. The exact solution is all ones. c write(6,2002) write(6,1001) write(6,1002) call dsyswe(r,n) c do 150 i=1,n b(i) = 0.0d0 do 140 j=1,n 140 b(i) = b(i) + r(iabs(i-j)+1) 150 continue c wantsm = .false. c time(1) = timing() call tsld1(r,r(2),b,x,w,y,n) time(1) = timing() - time(1) relerr = 0.0d0 do 161 i=1,n 161 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1003) n,1,relerr,time(1) c time(2) = timing() call dsytcl(n,r,b,wantsm,w,y,x,conda,condm,sigmin,info) time(2) = timing() - time(2) if (info .ne. 0) then print *,'return from dsytcl with info =',info goto 999 endif relerr = 0.0d0 do 162 i=1,n 162 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1004) n,2,relerr,conda,condm,time(2) c time(3) = timing() call dsyte2(n,r,b,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(3) = timing() - time(3) if (info .ne. 0) then print *,'returned from dsyte2 with info =',info goto 999 endif relerr = 0.0d0 do 163 i=1,n 163 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,3,relerr,conda,condm,psteps,time(3) c pmax = 4 time(4) = timing() call dsytep(n,r,b,pmax,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(4) = timing() - time(4) if (info .ne. 0) then print *,'returned from dsytep (pmax=4) with info =',info goto 999 endif relerr = 0.0d0 do 164 i=1,n 164 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,4,relerr,conda,condm,psteps,time(4) c pmax = 6 time(5) = timing() call dsytep(n,r,b,pmax,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(5) = timing() - time(5) if (info .ne. 0) then print *,'returned from dsytep (pmax=6) with info =',info goto 999 endif relerr = 0.0d0 do 165 i=1,n 165 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,5,relerr,conda,condm,psteps,time(5) c pmax = 8 time(6) = timing() call dsytep(n,r,b,pmax,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(6) = timing() - time(6) if (info .ne. 0) then print *,'returned from dsytep (pmax=8) with info =',info goto 999 endif relerr = 0.0d0 do 166 i=1,n 166 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,6,relerr,conda,condm,psteps,time(6) c write(6,1002) c c Generate three KMS test matrices from Trench (1989) and solve c the systems by means of all the subroutines. The exact solution c is all ones. c write(6,2003) write(6,1001) write(6,1002) do 300 in=9,11 n = 2**in call dsykms(n,r) c do 250 i=1,n b(i) = 0.0d0 do 240 j=1,n 240 b(i) = b(i) + r(iabs(i-j)+1) 250 continue c wantsm = .false. c time(1) = timing() call tsld1(r,r(2),b,x,w,y,n) time(1) = timing() - time(1) if (time(1) .eq. 0.0d0) time(1) = 1.0d0 relerr = 0.0d0 do 261 i=1,n 261 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1003) n,1,relerr,time(1),0.0d0 c time(2) = timing() call dsytcl(n,r,b,wantsm,w,y,x,conda,condm,sigmin,info) time(2) = timing() - time(2) if (time(2) .eq. 0.0d0) time(2) = 1.0d0 if (info .ne. 0) then print *,'return from dsytcl with info =',info goto 999 endif relerr = 0.0d0 do 262 i=1,n 262 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1004) n,2,relerr,conda,condm,time(2), - (time(2)-time(1))/time(1),0.0d0 c time(3) = timing() call dsyte2(n,r,b,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(3) = timing() - time(3) if (info .ne. 0) then print *,'returned from dsyte2 with info =',info goto 999 endif relerr = 0.0d0 do 263 i=1,n 263 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,3,relerr,conda,condm,psteps,time(3), - (time(3)-time(1))/time(1),(time(3)-time(2))/time(2) c pmax = 4 time(4) = timing() call dsytep(n,r,b,pmax,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(4) = timing() - time(4) if (info .ne. 0) then print *,'returned from dsytep (pmax=4) with info =',info goto 999 endif relerr = 0.0d0 do 264 i=1,n 264 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,4,relerr,conda,condm,psteps,time(4), - (time(4)-time(1))/time(1),(time(4)-time(2))/time(2) c pmax = 6 time(5) = timing() call dsytep(n,r,b,pmax,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(5) = timing() - time(5) if (info .ne. 0) then print *,'returned from dsytep (pmax=6) with info =',info goto 999 endif relerr = 0.0d0 do 265 i=1,n 265 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,5,relerr,conda,condm,psteps,time(5), - (time(5)-time(1))/time(1),(time(5)-time(2))/time(2) c pmax = 8 time(6) = timing() call dsytep(n,r,b,pmax,wantsm,w,y,x, - conda,condm,psteps,sigmin,info) time(6) = timing() - time(6) if (info .ne. 0) then print *,'returned from dsytep (pmax=8) with info =',info goto 999 endif relerr = 0.0d0 do 266 i=1,n 266 relerr = relerr + (x(i) - 1.0d0)**2 relerr = dsqrt(relerr/dble(n)) write(6,1005) n,6,relerr,conda,condm,psteps,time(6), - (time(6)-time(1))/time(1),(time(6)-time(2))/time(2) c write(6,1002) 300 continue c 1000 format(' ','**************************************************',/, - ' ','* Numerical test of extended Levinson algorithms *',/, - ' ','* for symmetric Toeplitz matrices. *',/, - ' ','* Computer: Amdahl 5890. *',/, - ' ','**************************************************',/, - ' ',/, - ' ','Legend:',/, - ' ',' 1 - tsld1 (from Argonne Toeplitz package)',/, - ' ',' 2 - dsytcl, classical Levinson algorithm ', - 'with condition estimation',/, - ' ',' 3 - dsyte2, extended Levinson algorithm ', - 'with fixed PMAX = 2',/, - ' ',' 4 - dsytep, extended Levinson algorithm, ', - 'general version with PMAX = 4',/, - ' ',' 5 - dsytep, extended Levinson algorithm, ', - 'general version with PMAX = 6',/, - ' ',' 6 - dsytep, extended Levinson algorithm, ', - 'general version with PMAX = 8') c 1001 format(' ',' N case error CONDA CONDM PSTEPS', - ' time overhead') 1002 format(' ','-------------------------------------------------', - '--------------------') 1003 format(' ',i4,i4,1pe11.2,' ', - ' ',0pf7.1,2pf7.1,'%') 1004 format(' ',i4,i4,1p3e11.2,' ',0pf7.1,2pf7.1,'%',2pf7.1,'%') 1005 format(' ',i4,i4,1p3e11.2,i4 ,0pf7.1,2pf7.1,'%',2pf7.1,'%') c 2001 format(' ',/,' ','Test matrices with random elements:',/,' ') 2002 format(' ',/,' ','Test matrix from Sweet (1986):',/,' ') 2003 format(' ',/,' ','Test matrices from Trench (1989):',/,' ') c 999 stop end c c DTPLZ--------------------------------------------------------------- c subroutine dtplz(p,r,s,T) integer p,q double precision r(*),s(*),T(p,*) c c Given the first row r and the first column s , construct the c complete Toeplitz matrix T of order p . c c This version dated 12/05/1989. c Per Christian Hansen, UNI-C, Technical University of Denmark. c do 20 i = 1,p T(i,i) = r(1) q = 1 do 10 j = i+1,p q = q + 1 T(i,j) = r(q) T(j,i) = s(q) 10 continue 20 continue c return end c c IDAMAX-------------------------------------------------------------- c integer function idamax(n,dx,incx) c c finds the index of element having max. absolute value. c jack dongarra, linpack, 3/11/78. c double precision dx(1),dmax integer i,incx,ix,n c idamax = 0 if( n .lt. 1 ) return idamax = 1 if(n.eq.1)return if(incx.eq.1)go to 20 c c code for increment not equal to 1 c ix = 1 dmax = dabs(dx(1)) ix = ix + incx do 10 i = 2,n if(dabs(dx(ix)).le.dmax) go to 5 idamax = i dmax = dabs(dx(ix)) 5 ix = ix + incx 10 continue return c c code for increment equal to 1 c 20 dmax = dabs(dx(1)) do 30 i = 2,n if(dabs(dx(i)).le.dmax) go to 30 idamax = i dmax = dabs(dx(i)) 30 continue return end c c TIMING-------------------------------------------------------------- c function timing() c c Timing function for testing Toeplitz systems. c Put your local timing function here. c timing = etime(it) c return end c c TSLD1--------------------------------------------------------------- c subroutine tsld1(a1,a2,b,x,c1,c2,m) integer m double precision a1(m),a2(1),b(m),x(m),c1(1),c2(1) c c tsld1 solves the double precision linear system c a * x = b c with the t - matrix a . c c on entry c c a1 double precision(m) c the first row of the t - matrix a . c on return a1 is unaltered . c c a2 double precision(m - 1) c the first column of the t - matrix a c beginning with the second element . c on return a2 is unaltered . c c b double precision(m) c the right hand side vector . c on return b is unaltered . c c c1 double precision(m - 1) c a work vector . c c c2 double precision(m - 1) c a work vector . c c m integer c the order of the matrix a . c c on return c c x double precision(m) c the solution vector. x may coincide with b . c c toeplitz package. this version dated 07/23/82 . c c internal variables c integer i1,i2,n,n1,n2 double precision r1,r2,r3,r5,r6 c c solve the system with the principal minor of order 1 . c r1 = a1(1) x(1) = b(1)/r1 if (m .eq. 1) go to 80 c c recurrent process for solving the system c with the t - matrix for n = 2, m . c do 70 n = 2, m c c compute multiples of the first and last columns of c the inverse of the principal minor of order n . c n1 = n - 1 n2 = n - 2 r5 = a2(n1) r6 = a1(n) if (n .eq. 2) go to 20 c1(n1) = r2 do 10 i1 = 1, n2 i2 = n - i1 r5 = r5 + a2(i1)*c1(i2) r6 = r6 + a1(i1+1)*c2(i1) 10 continue 20 continue r2 = -r5/r1 r3 = -r6/r1 r1 = r1 + r5*r3 if (n .eq. 2) go to 40 r6 = c2(1) c2(n1) = 0.0d0 do 30 i1 = 2, n1 r5 = c2(i1) c2(i1) = c1(i1)*r3 + r6 c1(i1) = c1(i1) + r6*r2 r6 = r5 30 continue 40 continue c2(1) = r3 c c compute the solution of the system with the c principal minor of order n . c r5 = 0.0d0 do 50 i1 = 1, n1 i2 = n - i1 r5 = r5 + a2(i1)*x(i2) 50 continue r6 = (b(n) - r5)/r1 do 60 i1 = 1, n1 x(i1) = x(i1) + c2(i1)*r6 60 continue x(n) = r6 70 continue 80 continue return end c c URAND--------------------------------------------------------------- c real function urand(iy) integer iy c c urand is a uniform random number generator based on theory and c suggestions given in d.e. knuth (1969), vol 2. the integer iy c should be initialized to an arbitrary integer prior to the first call c to urand. the calling program should not alter the value of iy c between subsequent calls to urand. values of urand will be returned c in the interval (0,1). c integer ia,ic,itwo,m2,m,mic double precision halfm real s double precision datan,dsqrt data m2/0/,itwo/2/ if (m2 .ne. 0) go to 20 c c if first entry, compute machine integer word length c m = 1 10 m2 = m m = itwo*m2 if (m .gt. m2) go to 10 halfm = m2 c c compute multiplier and increment for linear congruential method c ia = 8*idint(halfm*datan(1.d0)/8.d0) + 5 ic = 2*idint(halfm*(0.5d0-dsqrt(3.d0)/6.d0)) + 1 mic = (m2 - ic) + m2 c c s is the scale factor for converting to floating point c s = 0.5/halfm c c compute next random number c 20 iy = iy*ia c c the following statement is for computers which do not allow c integer overflow on addition c if (iy .gt. mic) iy = (iy - m2) - m2 c iy = iy + ic c c the following statement is for computers where the c word length for addition is greater than for multiplication c if (iy/2 .gt. m2) iy = (iy - m2) - m2 c c the following statement is for computers where integer c overflow affects the sign bit c if (iy .lt. 0) iy = (iy + m2) + m2 urand = float(iy)*s return end