/* PURPOSE ****************************************************************** RKTEC computes the coefficients of Albrecht's expansion of the local truncation error of Runge-Kutta formulas (see Albrecht [1] and Hosea [3]). In principle the code can calculate truncation error coefficients of arbitrarily high orders for any Runge-Kutta formula. However, the amount of memory required depends on the number of stages in the formula and (generally) increases rapidly with the order. DISCLAIMER ****************************************************************** CAVEAT RECEPTOR: This software is provided free of charge and without warranty. Please acknowledge any substantial use by citing [3]. Correspondence about the code can be sent to its author, M.E. Hosea, at na.hosea@na-net.ornl.gov. COMPILATION ***************************************************************** The program is written in ANSI C. Before compiling it you may wish to change the settings of the parameters EPSILON and NORMTYPE. These parameters are defined just below the prologue in statements of the form #define EPSILON 50.e0*DBL_EPSILON #define NORMTYPE 1 EPSILON specifies the default zero threshold (described below) and should be somewhat larger than the double precision unit roundoff (i.e., the smallest double precision number E such that 1.e0 + E > 1.e0 on the machine). The value 100*E works well (DBL_EPSILON = 2*E if the machine rounds). Note: The only reason for including the header file is to define DBL_EPSILON. If your compiler does not have (e.g. Sun's C compiler), comment out the line "#include " below and define DBL_EPSILON appropriately. An example is provided for use with Sun cc. It is not necessary to define DBL_EPSILON at all if it is not used in the definition of EPSILON. NORMTYPE specifies the default P-norm to use (in addition to the maximum norm). Common choices are the 1-norm (NORMTYPE 1) and the Euclidean norm (NORMTYPE 2). NORMTYPE can be any positive integer, but it should be small to avoid excessive underflow when the norms are computed. This file should be named rktec.c and compiled (on most UNIX systems) by cc -o rktec rktec.c -lm or gcc -o rktec rktec.c -lm Consult your compiler's documentation if this does not appear to work properly. Most compilers have optimization options (usually -O or -On, where n is an optimization level) that can significantly improve the performance of the code. USAGE *********************************************************************** RKTEC computes the truncation error coefficients, tecs, of a Runge-Kutta formula, or a pair of formulas, specified in an input file. Summary output is directed to the screen, and detailed output is directed to an output file. Either form of output may be suppressed. The code will prompt for the input file name, output file name, and maximum order if they are not supplied on the command line. The formula(s) are specified in the input file in the following way: Documentation lines begin with " and may appear anywhere--they are echoed to the output file. The Butcher array for the formula, c | A ----- | b' is supplied one value per line as constants or ratios of constants. The first line must be s = number_of_stages Here number_of_stages is the number of stages in the formula. The formula coefficients may be given in any order after this line. Any value not supplied is assumed to be zero. Spacing does not matter. The coefficients must have the names A, b, c. In the case of a pair of formulas, the second formula can differ only in the weight vector b. The second weight vector is named bhat. For convenience, any name beginning with 'b' followed by any non-blank character except '(' can be used to specify bhat values, but all output will refer to bhat. If only values of bhat are supplied, it is interpreted as the only formula. The formula coefficients are echoed to the output file as they are interpreted by RKTEC, i.e., as double precision floating point numbers. If efficiency indices are to be computed and the number of stages does not reflect the effective cost per step (for example, as with a First-Same-As-Last pair), the effective cost per step (in terms of function evaluations) can be supplied in parentheses following the literal number of stages. A sample input file follows. " Sample input file for RKTEC " The Bogacki-Shampine 3(2) pair [2]. " This pair involves 4 stages and s must be defined " accordingly, but it is First-Same-As-Last, which " means that the effective cost of a step is about 3. " s = 4 (3) " " Weights for the formula of order 3 b(1) = 2/9 b(2) = 1/3 b(3) = 4/9 " " Weights for the formula of order 2 bhat(1) = 7/24 bhat(2) = 0.25e0 bhat(3) = 1/3 bhat(4) = 1/8 " c(1) = 0 c(2) = 1.e+0 / 2.e+0 c(3) = 3.e0 / 4.e0 c(4) = 1 " A(2,1) = 1/2 A(3,2) = 3/4 A(4,1) = 2/9 A(4,2) = 1/3 A(4,3) = 4/9 " " End of sample infile The format of the output depends on whether the input file specifies a single formula or an embedded pair. If a single formula is given, the output file contains the truncation error coefficients, tecs, for each order in the form truncation error coefficient = order condition * scaling factor. The tecs of each order are followed by a summary that includes their maximum norm, P-norm (see option 'nP' below), and maximum scaling factor. The screen output includes this summary information along with the maximum norm of the order conditions and the number of order conditions that are at least as large in magnitude as the zero threshold. If the input file specifies an embedded pair, the output file contains the tecs for both formulas and for the estimate of the local error obtained by subtracting the results of the two formulas. The output is in the format tec(b) - tec(bhat) = tec(error estimate) Summaries for each order include maximum norms and P-norms of the truncation error coefficients for the b formula, the bhat formula, and the local error estimate. The maximum scaling factors for the b and bhat formulas are also reported. The screen output includes the maximum norms and P-norms of the truncation error coefficients for the b and bhat formulas and for the error estimate. The command line for using RKTEC is rktec [-chiqsx] [-b[L]] [-nP] [-oM] [-tR] [infile] [outfile] The meaning and use of infile and outfile, the input and output files, respectively, have already been explained. The options may appear in any order, even between or after the input and output file names. The command line is delimited by blanks, so no blanks are allowed within the expression of an option (e.g., '-t1.0e-14' is a valid option, but '-t 1.0e-14' and '-t1.0 e - 14' are not). The options are 'bL' calculate the second quadrant stability region using L equally spaced radii. If L is not specified it defaults to the value specified by the constant DEFAULT_NSAMPLES. Summary output to the screen includes the average and standard deviation of the radii. If bhat is present the average and standard deviation of the radii for bhat are computed, as well as the average and standard deviation of the magnitude of the difference between the lengths of the radii for b and for bhat . The angles, radii, differences, and summaries are written to outfile. Several arbitrary choices were made for parameters used in calculating the stability region. These are defined in the top section of the code beginning with the constant INFACTOR. The stability calculation facility is provided as a convenience to the user--at present the associated routines are not as sophisticated and reliable as they could be. 'c' perform consistency check on the formula. This is a simple check of the row sums of the matrix A against the corresponding entries of the vector c and a check that the entries of the b and (if present) bhat vectors sum to one. A summary of the results is printed to the screen and more detail is written to the output file. This feature is helpful when trying to find typographical errors in new input files. 'h' show online usage documentation. 'i' compute efficiency indices as in [4]. For this to work properly, the zero threshold must be large enough so that the leading order TECs are the first to be nonzero. Both Error-Per-Step and Error-Per-Unit-Step indices are computed for the maximum norm and the P-norm selected (see option -n). The maximum norm indices appear first on the output line, followed by those obtained using the P-norm. 'q' suppress summary output to screen. 's' suppress writing of outfile. 'x' display a sample infile. Under most operating systems this output can be redirected to a file to produce a sample input file. For example, under UNIX or MS-DOS, rktec -x > bs32.rkm will produce a sample input file named 'bs32.rkm'. 'nP' calculate the P-norm (P a positive integer) in addition to the maximum norm. The default is specified by the parameter NORMTYPE prior to compilation. 'oM' calculate tecs through order M. 'tR' use R as the zero threshold. The default is specified by the parameter EPSILON prior to compilation. If the magnitude of the order condition corresponding to a particular tec is less than this threshold, the tec is considered to be zero and is not written to the output file. The zero threshold may be set to zero to turn off this feature. A positive zero threshold somewhat larger than the double precision unit roundoff can reduce memory requirements by 20% or more. The savings may be quite important for calculating high order tecs, especially for formulas with many stages. It has the additional benefit of making the output clearer by replacing very small non-zero values with zeros. Following is summary output for the sample input file given above. To illustrate the use of options, RKTEC was run on a Sun SPARCstation ELC with the command line rktec -o5 -n2 -t1.0e-14 bs32.rkm -s This line specifies that: 1. Tecs are to be computed through order 5. 2. The 2-norm (Euclidean norm) is to be used. 3. The zero threshold is 1.0e-14. 4. The input file is named 'bs32.rkm'. 5. No output file is to be written. Sample output: ----------------------------------------------------------------------------- RKTEC version 2.1. See rktec -h for usage information. Maximum order = 5 Zero threshold = 1.000000e-14 Input file = bs32.rkm File output suppressed Number of stages = 4 Cost per step = 3 truncation error coefficient summary ---------- maximum norm --------- ------------- 2-norm ------------ order b bhat error est b bhat error est 1 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 2 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 3 0.000e+00 2.083e-02 2.083e-02 0.000e+00 2.083e-02 2.083e-02 4 4.167e-02 3.125e-02 1.128e-02 4.323e-02 3.325e-02 1.560e-02 5 3.125e-02 2.344e-02 7.812e-03 4.007e-02 2.982e-02 1.253e-02 0.820 kbytes of storage were required. ----------------------------------------------------------------------------- REFERENCES ****************************************************************** [1] P. Albrecht, "A New Theoretical Approach to Runge-Kutta Methods", SIAM J. Numer. Anal., 24 (1987), 2, pp. 391-406. [2] P. Bogacki and L.F. Shampine, "A 3(2) Pair of Runge-Kutta Formulas", Appl. Math. Letters, 2 (1989), pp. 1-9. [3] M.E. Hosea, "A New Recurrence for Computing Runge-Kutta Truncation Error Coefficients", SIAM J. Numer. Anal., to appear. [4] M.E. Hosea and L.F. Shampine, "Efficiency Comparisons of Methods for Integrating ODEs", Computers Math. Applic., to appear. REVISION HISTORY ************************************************************ Version 1.0 Initial release version. Version 1.1 Rearranged source code for easier modification. Modularized command line parsing and reading of infile. Modified calctecs() so that it could be called repeatedly. Fixed a bug that caused incorrect scaling factors for high orders. Version 2.0 Modified the algorithm to account for a missing factor of 1/l! in [1]. Output differs from version 1.1 because those scaling factors are now incorporated into the TECs rather than the elementary differentials. Added support for efficiency index computations as in [4]. Made several efficiency hacks, mostly to reduce the amount of time spent in nconc(). Version 2.1 Added option to calculate stability region. Modified readinfile() to check to see if the named output file exists, prompting before overwriting. A carriage return at the prompt for the output file name now results in file output suppression. */ #define VERSION "2.1" /**********************************************************************/ /*** Header files. ***/ /**********************************************************************/ #include #include #include #include #include /* To use Sun cc, comment out the following line. */ #include /* To use Sun cc, uncomment the following line. */ /* #define DBL_EPSILON (nextafter(1.e0,2.e0)-1.e0) */ /**********************************************************************/ /*** EPSILON is the default zero threshold. EPSILON should be ***/ /*** somewhat larger than the double precision unit roundoff. ***/ /*** NORMTYPE is the default P-norm to use along with the max-norm. ***/ /**********************************************************************/ #define EPSILON 50.e0*DBL_EPSILON #define NORMTYPE 1 /**********************************************************************/ /*** The following constants have to do only with calculating the ***/ /*** stability region. Several were chosen somewhat arbitrarily. ***/ /**********************************************************************/ /* factor to use when searching inward along a radius for a point in the stability region. */ #define INFACTOR 0.618e0 /* factor to use when searching outward along a radius for a point outside the stability region. */ #define OUTFACTOR 1.618e0 /* When searching outward, don't bother to look beyond this limit. */ #define BIGENOUGH 1.e+5 /* When searching inward, don't look closer to the origin than this. */ #define SMALLENOUGH 1.e-5 /* The first evaluation along a radius is made this far from the origin. */ #define SMALLSTEP 1.e-1 /* the relative error tolerance for the root solver. */ #define SECANTTOL 1.e-6 /* the maximum number of iterations for root solver */ #define MAXITS 100 /* the default number of equally spaced radii to search */ #define DEFAULT_NSAMPLES 21 /* beginning angle--should be pi/2 */ #define THETABEG 0.5e0 * M_PI /* ending angle--should be pi */ #define THETAEND M_PI /**********************************************************************/ /*** Global data and miscellaneous macros. ***/ /**********************************************************************/ int s, cps, calcinds, order, *composition, quiet, wf; long int *indices; FILE *infp, *outfp; double thresh; #define BUFSIZE 255 char rkfile[BUFSIZE + 1], outfile[BUFSIZE + 1], mline[BUFSIZE + 1]; struct cons { double *first; struct cons *rest; double scaling; } trashcan; struct cmplx { double re; double im; } *zx, *ImzA; #define colsize (s*sizeof(double)) #define consize (s*sizeof(double)+sizeof(struct cons)) #define ijth(arptr,i,j) ((arptr+(i-1)+(j-1)*s)) /**********************************************************************/ /*** Main program. ***/ /**********************************************************************/ int main(argc, argv) int argc; char *argv[]; { double *A, *b, *bhat; struct cons *c; int ccheck, pnorm, nsamples; void parsecmdline(), readinfile(), chkconsist(), calctecs(), stabplot(); trashcan.rest = NULL; trashcan.scaling = 0.e0; parsecmdline(argc, argv, &ccheck, &pnorm, &nsamples); readinfile(&A, &b, &bhat, &c); if (ccheck) chkconsist(A, b, c, bhat); if (nsamples) stabplot(A, b, bhat, nsamples); calctecs(A, b, c, bhat, pnorm); if (wf) fclose(outfp); return (0); } /**********************************************************************/ /*** Error exit routine. ***/ /**********************************************************************/ void errorexit(errcode) char errcode; { int errorlevel; switch (toupper(errcode)) { case 'D': printf( "\nusage: rktec [-chiqsx] [-b[L]] [-nP] [-oM] [-tR] [infile] [outfile]\n"); printf( "Computes the coefficients of Albrecht's expansion of the local truncation\n"); printf( "error of a Runge-Kutta formula (or pair) given in an input file.\n"); printf("options: 'c' do a formula consistency check.\n"); printf(" 'i' compute efficiency indices.\n"); printf( " 'q' suppress output to screen, 's' suppress writing of outfile.\n"); printf(" 'x' display a sample infile.\n"); printf( " 'bL' calculate quadrant II stability region along L radii.\n"); printf(" 'nP' use the P-norm (always get the maximum norm)."); printf(" Default = %i.\n", NORMTYPE); printf(" 'oM' calculate tecs through order M.\n"); printf(" 'tR' use R as the zero threshold. "); printf("Default = %le.\n", thresh); printf( "The input file, infile, specifies the Runge-Kutta formula in terms of its\n"); printf("Butcher array: c | A\n"); printf(" -----\n"); printf(" | b'\n"); printf( "The file may contain comment lines beginning with \". Values in A, b, and\n"); printf( "c are supplied one per line as constants or ratios of constants. These\n"); printf( "values may be given in any order but must be preceded by a line that\n"); printf( "specifies the number of stages in the formula. Any values not supplied\n"); printf( "are set to zero. The second formula of an embedded pair can be specified\n"); printf( "by an additional vector bhat. See rktec -x for a sample infile.\n"); printf( "Version %s. Written by M.E. Hosea. Correspondence about rktec can be\n", VERSION); printf("sent to the author at na.hosea@na-net.ornl.gov.\n"); errorlevel = 0; break; case 'X': printf("\" Sample RKTEC input file: the Bogacki-Shampine 3(2) pair\n"); printf("\" (from Appl. Math. Letters, 2 (1989), pp. 1-9). The number\n"); printf("\" of stages is 4, but the effective cost per step is 3.\n"); printf("s = 4 (3)\n"); printf("\" Weights for the formula of order 3\n"); printf("b(1) = 2/9\n"); printf("b(2) = 1/3\n"); printf("b(3) = 4/9\n"); printf("\" Weights for the formula of order 2\n"); printf("bhat(1) = 7/24\n"); printf("bhat(2) = 0.25e0\n"); printf("bhat(3) = 1/3\n"); printf("bhat(4) = 1/8\n\"\n"); printf("c(2) = 1.e+0 / 2.e+0\n"); printf("c(3) = 3.e0 / 4.e0\n"); printf("c(4) = 1\n"); printf("A(2,1) = 1/2\n"); printf("A(3,2) = 3/4\n"); printf("A(4,1) = 2/9\n"); printf("A(4,2) = 1/3\n"); printf("A(4,3) = 4/9\n"); printf("\" End of sample input file.\n"); errorlevel = 0; break; case 'Q': /* not currently used */ printf("\nError: can't use -q and -s together.\n"); printf("See rktec -h for usage information.\n"); errorlevel = 1; break; case 'I': printf("Cannot open '%s' for reading.\n", rkfile); errorlevel = 2; break; case 'O': /* not currently used */ printf("Cannot open output file '%s'.\n", outfile); errorlevel = 2; break; case 'C': printf("\nError in '%s':\n", rkfile); printf("--> %s", mline); printf("A 'c(i)' line precedes the 's' line.\n"); errorlevel = 3; break; case 'B': printf("\nError in '%s':\n", rkfile); printf("--> %s", mline); printf("A 'b(i)' line precedes the 's' line.\n"); errorlevel = 3; break; case 'A': printf("\nError in '%s':\n", rkfile); printf("--> %s", mline); printf("An 'A(i,j)' line precedes the 's' line.\n"); errorlevel = 3; break; case 'S': printf("\nSyntax error in '%s':\n", rkfile); printf("--> %s", mline); printf("Nonsense 's' line.\n"); errorlevel = 3; break; case 'Y': printf("\nSyntax error in '%s':\n", rkfile); printf("--> %s\n", mline); errorlevel = 3; break; case 'R': printf("\nError in '%s':", rkfile); printf("\n--> %s", mline); printf("Only one 's' line is permitted.\n"); errorlevel = 3; break; case 'L': printf("\nError in '%s':", rkfile); printf("\n--> %s", mline); printf("Subscript out of range.\n"); errorlevel = 3; break; case 'N': printf("\n'%s' does not define a complete method.\n", rkfile); errorlevel = 3; break; case 'M': printf("\nMemory allocation failure.\n"); printf("The malloc function could not allocate the "); printf("amount of memory requested.\n"); errorlevel = 4; break; default: errorlevel = 5; } if (outfp != NULL) fclose(outfp); if (infp != NULL) fclose(infp); exit(errorlevel); } /**********************************************************************/ /*** Utility to check the consistency of the formula. ***/ /**********************************************************************/ void chkconsist(A, b, c, bhat) double *A, *b, *bhat; struct cons *c; { double sumrow, sumb, sumbhat; register int i, j; int flag = 1; int lquiet = quiet; if (!lquiet) { printf("Formula consistency check:\n"); if (thresh == 0.e0) { printf("A positive zero threshold is required for "); printf("on-screen summary of check.\n"); if (wf) { printf("Detail will be written to '%s'.\n\n", outfile); lquiet = 1; } else { printf("Consistency check not performed. "); printf("Use a positive zero threshold or do\n"); printf("not suppress writing of an output file.\n\n"); return; } } } if (wf) fprintf(outfp, "\nFormula consistency check:\n"); for (i = 1, sumb = 0.e0, sumbhat = 0.e0; i <= s; i++) { for (j = s, sumrow = 0.e0; j; j--) sumrow += *ijth(A, i, j); sumrow -= (c->first)[i - 1]; if (!lquiet && (fabs(sumrow) >= thresh)) { if (flag) { printf("*** Inconsistent rows of Butcher array:"); flag = 0; } printf(" %i", i); } if (wf) fprintf(outfp, "Sum of row %3i of A minus c(%3i) = %12.5le.\n", i, i, sumrow); sumb += b[i - 1]; if (bhat != NULL) sumbhat += bhat[i - 1]; } sumb -= 1.e0; if (bhat != NULL) sumbhat -= 1.e0; if (!lquiet) { if (flag) printf("The A matrix and c vector are consistent.\n"); else printf(". ***\n"); if (fabs(sumb) >= thresh) printf("*** The b vector is inconsistent. ***\n"); else printf("The b vector is consistent.\n"); if (bhat != NULL) { if (fabs(sumbhat) >= thresh) printf("*** The bhat vector is inconsistent. ***\n"); else printf("The bhat vector is consistent.\n"); } printf("\n"); } if (wf) { fprintf(outfp, "Sum of entries of b minus 1 = %12.5le.\n", sumb); if (bhat != NULL) fprintf(outfp, "Sum of entries of bhat minus 1 = %12.5le.\n", sumbhat); } return; } /**********************************************************************/ /*** Command line parser. ***/ /**********************************************************************/ void parsecmdline(argc, argv, ccheck, pnorm, nsamples) int argc, *ccheck, *pnorm, *nsamples; char *argv[]; { int i, j, k; quiet = 0; wf = 1; calcinds = 0; *ccheck = 0; *pnorm = NORMTYPE; *nsamples = 0; order = -1; thresh = EPSILON; rkfile[0] = '\0'; outfile[0] = '\0'; for (i = 1; i < argc; i++) { if (argv[i][0] == '-') { k = strlen(argv[i]); for (j = 1; j <= k; j++) { switch (argv[i][j]) { case '?': case 'h': case 'H': errorexit('d'); case 'i': case 'I': calcinds = 1; break; case 'b': if (sscanf(argv[i], "-b%i", nsamples) < 1) *nsamples = DEFAULT_NSAMPLES; break; case 'B': if (sscanf(argv[i], "-b%i", nsamples) < 1) *nsamples = DEFAULT_NSAMPLES; break; case 'c': case 'C': *ccheck = 1; break; case 'n': sscanf(argv[i], "-n%i", pnorm); break; case 'N': sscanf(argv[i], "-N%i", pnorm); break; case 'o': sscanf(argv[i], "-o%i", &order); break; case 'O': sscanf(argv[i], "-O%i", &order); break; case 'q': case 'Q': quiet = 1; /* if (!wf) errorexit('q'); */ break; case 's': case 'S': wf = 0; /* if (quiet) errorexit('q'); */ break; case 't': sscanf(argv[i], "-t%lf", &thresh); break; case 'T': sscanf(argv[i], "-T%lf", &thresh); break; case 'x': case 'X': errorexit('x'); } } } else if (rkfile[0] == '\0') strncpy(rkfile, argv[i], BUFSIZE); else strncpy(outfile, argv[i], BUFSIZE); } if (*pnorm < 1) *pnorm = 1; if (*nsamples < 0) *nsamples = 0; return; } /**********************************************************************/ /*** Special-purpose BLAS-like functions. ***/ /**********************************************************************/ void gemv(alpha, A, x, beta, y) double alpha, beta, *A, *x, *y; { double *Aptr, *xptr, *yptr; register int i, j; int n = s; if (beta == 0.e0) for (i = n, yptr = y; i; i--, yptr++) *yptr = 0.e0; else for (i = n, yptr = y; i; i--, yptr++) *yptr *= beta; for (j = n, xptr = x, Aptr = A; j; j--, xptr++) { for (i = n, yptr = y; i; i--, yptr++, Aptr++) if (*Aptr != 0.e0) *yptr += (alpha * (*Aptr) * (*xptr)); } return; } double dot(x, y) double *x, *y; { double *xptr, *yptr, rval; register int i; rval = 0.e0; for (i = s, xptr = x, yptr = y; i; i--, xptr++, yptr++) rval += ((*xptr) * (*yptr)); return (rval); } void copy(x, y) double *x, *y; { double *xptr, *yptr; register int i; for (i = s, xptr = x, yptr = y; i; i--, xptr++, yptr++) *yptr = *xptr; return; } double maxnorm(x) double *x; { double *xptr, rval, temp; register int i; rval = fabs(*x); for (i = s - 1, xptr = (x + 1); i; i--, xptr++) { temp = fabs(*xptr); if (temp > rval) rval = temp; } return (rval); } /**********************************************************************/ /*** Some functions useful in debugging. ***/ /**********************************************************************/ #define DEBUG 0 #if DEBUG void printmat(A, m, n) double *A; int m, n; { register int i, j; double *dptr; printf("\nTranspose of matrix:\n"); for (j = 0, dptr = A; j < n; j++) { for (i = 0; i < m; i++, dptr++) printf("%le ", *dptr); printf("\n"); } return; } void printlist(list) struct cons *list; { register int i; struct cons *cptr; double *dptr; printf("\n"); for (cptr = list; cptr != NULL; cptr = cptr->rest) { for (i = 0, dptr = cptr->first; i < s; i++, dptr++) printf("%le ", *dptr); printf("\n"); } return; } void printpar(part, mult) int *part, *mult; { int i; double mcoeff(); printf("mcoeff = %lg, %i = ", mcoeff(mult), part[0]); for (i = 1; i <= mult[0]; i++) printf("%i*%i ", part[i], mult[i]); printf("\n"); return; } void printcom(comp, ind) int *comp; long int *ind; { long int i; printf("composition: "); for (i = 1; i <= comp[0]; i++) printf("p(%li)=%i ", ind[i], comp[i]); printf("\n"); return; } #endif /**********************************************************************/ /*** Allocates cons structures and their arrays. ***/ /**********************************************************************/ struct cons *makeconses(n) long int n; { struct cons *rval, *cptr; double *columns, *dptr; long int i; if (n < 1) return (NULL); i = sizeof(struct cons) * n; if (i != (size_t) i) errorexit('m'); if ((rval = (struct cons *) malloc((size_t) i)) == NULL) errorexit('m'); i = colsize * n; if (i != (size_t) i) errorexit('m'); if ((columns = (double *) malloc((size_t) i)) == NULL) errorexit('m'); for (i = n - 1, cptr = rval, dptr = columns; i; i--, cptr++, dptr += s) { cptr->first = dptr; cptr->rest = cptr + 1; cptr->scaling = 1.e0; } cptr->first = dptr; cptr->rest = NULL; cptr->scaling = 1.e0; trashcan.scaling += (double) n; return (rval); } /**********************************************************************/ /*** Returns the length of a list of conses. ***/ /**********************************************************************/ long int length(list) struct cons *list; { long int len; struct cons *ptr; if (list == NULL) return (0); for (len = 1, ptr = list; ptr->rest != NULL; len++, ptr = ptr->rest); return (len); } /**********************************************************************/ /*** Concatenates two lists. ***/ /**********************************************************************/ struct cons *nconc(list1, list2) struct cons *list1, *list2; { struct cons *cptr; if (list1 == NULL) return (list2); if (list2 == NULL) return (list1); for (cptr = list1; cptr->rest != NULL; cptr = cptr->rest); cptr->rest = list2; return (list1); } /**********************************************************************/ /*** Returns a pointer to the nth cons of a list. ***/ /**********************************************************************/ struct cons *nth(n, list) long int n; struct cons *list; { struct cons *cptr; long int i; if (list == NULL) return (NULL); for (cptr = list, i = 1; (cptr->rest != NULL) && (i < n); cptr = cptr->rest, i++); if (i == n) return (cptr); else return (NULL); } /**********************************************************************/ /*** Splits a list after the nth element. Return value is the ***/ /*** first piece and *listaddr is the second. ***/ /**********************************************************************/ struct cons *split(n, listaddr) long int n; struct cons **listaddr; { struct cons *p1, *p2; if ((*listaddr == NULL) || (n < 1)) return (NULL); p1 = *listaddr; p2 = nth(n, *listaddr); if (p2 == NULL) *listaddr = NULL; else { *listaddr = p2->rest; p2->rest = NULL; } return (p1); } /**********************************************************************/ /*** Makes a list of length n. Roots around in the trashcan for ***/ /*** discarded conses before allocating new ones. ***/ /**********************************************************************/ struct cons *makelist(n) long int n; { long int len; struct cons *rval, *cptr, *endptr; rval = split(n, &trashcan.rest); if (rval == NULL) return (makeconses(n)); for (cptr = rval, endptr = NULL, len = 0; cptr != NULL; cptr = cptr->rest, len++) { cptr->scaling = 1.e0; endptr = cptr; } if (len < n) /* The effect is rval=nconc(rval,makeconses(n-len)). */ nconc(endptr, makeconses(n - len)); return (rval); } /**********************************************************************/ /*** Copy a list. ***/ /**********************************************************************/ struct cons *copylist(list) struct cons *list; { struct cons *rval, *cptr1, *cptr2; if (list == NULL) return (NULL); rval = makelist(length(list)); for (cptr1 = list, cptr2 = rval; cptr1 != NULL; cptr1 = cptr1->rest, cptr2 = cptr2->rest) { copy(cptr1->first, cptr2->first); cptr2->scaling = cptr1->scaling; } return (rval); } /**********************************************************************/ /*** Discard a list of conses. ***/ /**********************************************************************/ void trash(listaddr) struct cons **listaddr; { if (*listaddr != NULL) { if (trashcan.rest == NULL) /* for efficiency only */ trashcan.rest = *listaddr; else trashcan.rest = nconc(*listaddr, trashcan.rest); *listaddr = NULL; } return; } /**********************************************************************/ /*** A "safe" function to serve the purpose of gets(). ***/ /**********************************************************************/ void sgets(buffer, buflength) char *buffer; int buflength; { char *charptr; fgets(buffer, buflength, stdin); if ((charptr = strchr(buffer, '\n')) != NULL) *charptr = '\0'; /* strip trailing '\n' */ return; } /**********************************************************************/ /*** Function to read the input file. ***/ /**********************************************************************/ void readinfile(Aptr, bptr, bhatptr, cptr) double **Aptr, **bptr, **bhatptr; struct cons **cptr; { double *dptrA, *dptrb, *dptrc, num, den, *A, *b, *bhat; struct cons *c; int i, j, k; char linetype, *rmline; if (!quiet) printf("\nRKTEC version %s. See rktec -h for usage information.\n\n", VERSION); /*** Get ready to read input file. ***/ if (rkfile[0] == '\0') { printf("Enter Runge-Kutta method filename: "); sgets(rkfile, BUFSIZE); printf("\n"); } if (!strlen(rkfile)) { printf("No input file.\n"); exit(0); } if ((infp = fopen(rkfile, "r")) == NULL) errorexit('i'); /*** Get ready to write output file. ***/ if (wf) { do { if (outfile[0] == '\0') { printf("Enter output filename: "); sgets(outfile, BUFSIZE); } if (!strlen(outfile)) { printf("No output file.\n"); wf = 0; } else { if ((outfp = fopen(outfile, "r+")) != NULL) { fclose(outfp); printf("Warning: '%s' already exists.\n", outfile); printf("Overwrite it (y/n)? "); sgets(mline, BUFSIZE); if (toupper(mline[0]) != 'Y') outfile[0] = '\0'; } } } while (wf && (outfile[0] == '\0')); } if (wf) { if ((outfp = fopen(outfile, "w")) == NULL) { wf = 0; printf("Cannot open '%s' for writing.\n", outfile); printf("File output will be suppressed.\n"); } } else outfp = NULL; if (order < 0) { printf("\nEnter maximum order: "); scanf("%i", &order); } if (!quiet) { printf("\nMaximum order = %i\n", order); printf("Zero threshold = %le\n", thresh); printf("Input file = %s\n", rkfile); if (!wf) printf("File output suppressed\n"); else printf("Output file = %s\n", outfile); } if (wf) { fprintf(outfp, "\nRKTEC version %s\n", VERSION); fprintf(outfp, "Maximum order = %i\n", order); fprintf(outfp, "Zero threshold = %le\n\n", thresh); } if (quiet && !wf && calcinds) printf("Input file = %s\n", rkfile); /*** Read the input file. ***/ s = 0; b = NULL; bhat = NULL; fgets(mline, BUFSIZE, infp); while (!feof(infp)) { if (sscanf(mline, " %c", &linetype) == 1) { if ((rmline = strchr(mline, linetype)) != NULL) rmline++; else errorexit('y'); switch (toupper(linetype)) { case '"': if (wf) fputs(mline, outfp); break; case 'S': if (s) errorexit('r'); k = sscanf(rmline, " = %i ( %i ) ", &s, &cps); if (k < 1) k = sscanf(rmline, " = %i", &s); if (k < 1) errorexit('y'); if ((k == 1) || (cps < 1)) cps = s; if (s < 1) errorexit('s'); if (wf) fprintf(outfp, "s = %i\n", s); if (!quiet) { printf("Number of stages = %i\n", s); if (cps != s) printf("Cost per step = %i\n", cps); printf("\n"); } c = makelist((long int) 1); if (((long int) colsize != (size_t) colsize) || ((long int) (s * colsize) != (size_t) (s * colsize))) errorexit('m'); if ((A = (double *) malloc((size_t) (s * colsize))) == NULL) errorexit('m'); for (i = s, dptrA = A, dptrc = c->first; i; i--, dptrc++) { *dptrc = 0.e0; for (j = s; j; j--, dptrA++) *dptrA = 0.e0; } break; case 'C': if (!s) errorexit('c'); k = sscanf(rmline, " ( %i ) = %lf / %lf", &i, &num, &den); if (k < 2) k = sscanf(rmline, " ( %i ) = %lf", &i, &num); if (k < 2) errorexit('y'); if (k == 3) num /= den; if (i > s) errorexit('l'); (c->first)[i - 1] = num; if (wf) fprintf(outfp, "c(%i) = %le\n", i, num); break; case 'A': if (!s) errorexit('a'); k = sscanf(rmline, " ( %i , %i ) = %lf / %lf", &i, &j, &num, &den); if (k < 3) k = sscanf(rmline, " ( %i , %i ) = %lf", &i, &j, &num); if (k < 3) errorexit('y'); if (k == 4) num /= den; if ((i > s) || (j > s)) errorexit('l'); *ijth(A, i, j) = num; if (wf) fprintf(outfp, "A(%i,%i) = %le\n", i, j, num); break; case 'B': if (!s) errorexit('b'); if ((*rmline != ' ') && (*rmline != '(')) { if ((rmline = strchr(rmline, '(')) == NULL) errorexit('y'); if (bhat == NULL) { if ((bhat = (double *) malloc((size_t) colsize)) == NULL) errorexit('m'); for (i = s, dptrb = bhat; i; i--, dptrb++) *dptrb = 0.e0; } k = sscanf(rmline, " ( %i ) = %lf / %lf", &i, &num, &den); if (k < 2) k = sscanf(rmline, " ( %i ) = %lf", &i, &num); if (k < 2) errorexit('y'); if (k == 3) num /= den; if (i > s) errorexit('l'); bhat[i - 1] = num; if (wf) fprintf(outfp, "bhat(%i) = %le\n", i, num); } else { if (b == NULL) { if ((b = (double *) malloc((size_t) colsize)) == NULL) errorexit('m'); for (i = s, dptrb = b; i; i--, dptrb++) *dptrb = 0.e0; } k = sscanf(rmline, " ( %i ) = %lf / %lf", &i, &num, &den); if (k < 2) k = sscanf(rmline, " ( %i ) = %lf", &i, &num); if (k < 2) errorexit('y'); if (k == 3) num /= den; if (i > s) errorexit('l'); b[i - 1] = num; if (wf) fprintf(outfp, "b(%i) = %le\n", i, num); } break; } } fgets(mline, BUFSIZE, infp); } if (b == NULL) { b = bhat; bhat = NULL; } if (!s || (b == NULL)) errorexit('n'); fclose(infp); *Aptr = A; *bptr = b; *bhatptr = bhat; *cptr = c; return; } /**********************************************************************/ /*** Calculates the "next" partition of an integer, part[0], ***/ /*** excluding partitions that contain ones. ***/ /*** On exit, part[0] = sum of (part[i]*mult[i]), ***/ /*** i = 1,...,mult[0]. Set rewind!=0 to start over. ***/ /*** This is a modification of the algorithm of the same name ***/ /*** appearing in _Combinatorial_Algorithms_ by Nijenhuis and Wilf, ***/ /*** Academic Press, New York, 1975. ***/ /**********************************************************************/ void nexpar(part, mult, rewind) int *part, *mult, *rewind; { int q, f, sum; if (*rewind) { mult[0] = 1; part[1] = part[0]; mult[1] = 1; *rewind = (part[1] <= 3); return; } do { if (part[mult[0]] == 2) { part[mult[0]] = 1; mult[mult[0]] = 2 * mult[mult[0]]; } if (part[mult[0]] <= 1) { sum = mult[mult[0]] + 1; (mult[0])--; } else sum = 1; f = part[mult[0]] - 1; if (mult[mult[0]] != 1) { mult[mult[0]]--; (mult[0])++; } part[mult[0]] = f; mult[mult[0]] = 1 + (int) (sum / f); q = sum % f; if (q > 0) { (mult[0])++; part[mult[0]] = q; mult[mult[0]] = 1; } if (2 * (int) (part[0] / 2) == part[0]) *rewind = (part[1] <= 2); else *rewind = ((part[1] == 3) && (mult[1] <= 1)); } while ((part[mult[0]] == 1) && !(*rewind)); return; } /**********************************************************************/ /*** Calculates the "next" composition of ind[0] parts of an ***/ /*** integer n. On exit, n = sum{ comp[i] }, i = 1,...,comp[0]. ***/ /*** The comp[i] are the nonzero components of the composition and ***/ /*** the ind[i] are their corresponding indices. Set rewind!=0 ***/ /*** to start over. This is a modification of the algorithm of the ***/ /*** same name appearing in _Combinatorial_Algorithms_ by Nijenhuis ***/ /*** and Wilf, Academic Press, New York, 1975. ***/ /**********************************************************************/ void nexcom(comp, ind, n, rewind) int *comp, n, *rewind; long int *ind; { long int oldindp1; register int i; if (*rewind) { comp[1] = n; ind[1] = 1; comp[0] = 1; } else { oldindp1 = ind[1] + 1; ind[1] = 1; comp[1]--; if (comp[1] == 0) { if (comp[0] == 1) { comp[1] = 1; ind[1] = oldindp1; } else { if (ind[2] == oldindp1) { comp[2]++; for (i = 1; i < comp[0]; i++) { ind[i] = ind[i + 1]; comp[i] = comp[i + 1]; } comp[0]--; } else { comp[1] = 1; ind[1] = oldindp1; } } } else { if (comp[0] == 1) { comp[0] = 2; comp[2] = 1; ind[2] = oldindp1; } else { if (ind[2] == oldindp1) comp[2]++; else { for (i = comp[0]; i > 0; i--) { comp[i + 1] = comp[i]; ind[i + 1] = ind[i]; } comp[0]++; comp[2] = 1; ind[2] = oldindp1; } } } } *rewind = (ind[0] == ind[1]); return; } /**********************************************************************/ /*** Calculates the number of terms in the equivalence class ***/ /*** corresponding to part[] and mult[] (see nexpar() above). ***/ /*** This number is actually independent of part[]: ***/ /*** mcoeff(mult[]) = sum{ mult[i] }! / product{ mult[i]! }. ***/ /*** This routine also gives the number of terms for each ***/ /*** composition generated by nexcom(). ***/ /**********************************************************************/ double mcoeff(mult) int *mult; { register int i, j; double y; long int sum; for (i = 1, sum = 1, y = 1.e0; i <= mult[0]; i++) for (j = 1; j <= mult[i]; j++, sum++) y *= ((double) sum / (double) j); return (y); } /**********************************************************************/ /*** Scalar times a list. ***/ /**********************************************************************/ void scalarmult(scaling, list) double scaling; struct cons *list; { struct cons *cptr; for (cptr = list; cptr != NULL; cptr = cptr->rest) cptr->scaling *= scaling; return; } /**********************************************************************/ /*** Componentwise product of arrays in cons1 and cons2. ***/ /**********************************************************************/ void circledot(cons1, cons2, result) struct cons *cons1, *cons2, *result; { double *dptr1, *dptr2, *resptr; register int i; for (i = s, dptr1 = cons1->first, dptr2 = cons2->first, resptr = result->first; i; i--, dptr1++, dptr2++, resptr++) *resptr = *dptr1 * *dptr2; result->scaling = cons1->scaling * cons2->scaling; return; } /**********************************************************************/ /*** Prunes numerically zero columns from a list. ***/ /**********************************************************************/ struct cons *prune(list) struct cons *list; { struct cons *cptr, *tptr, **plink, *rval; rval = list; plink = &rval; cptr = list; while (cptr != NULL) if (maxnorm(cptr->first) < thresh) { *plink = cptr->rest; tptr = cptr; cptr = cptr->rest; tptr->rest = NULL; trash(&tptr); } else { plink = &(cptr->rest); cptr = cptr->rest; } return (rval); } /**********************************************************************/ /*** Binary tensor product of two lists. ***/ /**********************************************************************/ struct cons *cross(list1, list2) struct cons *list1, *list2; { struct cons *result, *l1ptr, *l2ptr, *resptr; if ((list1 == NULL) || (list2 == NULL)) return (NULL); result = makelist(length(list1) * length(list2)); for (resptr = result, l1ptr = list1; l1ptr != NULL; l1ptr = l1ptr->rest) for (l2ptr = list2; l2ptr != NULL; l2ptr = l2ptr->rest, resptr = resptr->rest) circledot(l1ptr, l2ptr, resptr); if (thresh > 0.e0) result = prune(result); return (result); } /**********************************************************************/ /*** Computes list crossed with itself n times. ***/ /**********************************************************************/ struct cons *listpower(list, n) struct cons *list; int n; { struct cons *rval, *resptr, *cptr, *endrval; long int *ind; int i, j, rewind, *comp; if (list == NULL) return (NULL); if (n == 1) return (copylist(list)); comp = composition; ind = indices; ind[0] = length(list); rval = NULL; endrval = NULL; rewind = 1; do { nexcom(comp, ind, n, &rewind); resptr = makelist((long int) 1); cptr = nth(ind[1], list); copy(cptr->first, resptr->first); resptr->scaling = mcoeff(comp) * cptr->scaling; for (j = 2; j <= comp[1]; j++) circledot(cptr, resptr, resptr); for (i = 2; i <= comp[0]; i++) { cptr = nth(ind[i], list); for (j = 1; j <= comp[i]; j++) circledot(cptr, resptr, resptr); } if ((thresh > 0.e0) && (maxnorm(resptr->first) < thresh)) trash(&resptr); else { if (rval == NULL) rval = resptr; else endrval->rest = resptr; /* The effect is rval=nconc(rval,resptr). */ endrval = resptr; /* because length(resptr)==1. */ } } while (!rewind); return (rval); } /**********************************************************************/ /*** Calculates a tensor product term of the recurrence. ***/ /**********************************************************************/ struct cons *tensor(part, mult, R, overwrite) int *part, *mult, overwrite; struct cons **R; { register int i; struct cons *work, *oldwork, *Rpower; double scaling; scaling = mcoeff(mult); if (overwrite && (mult[1] == 1)) { work = R[part[1]]; R[part[1]] = NULL; } else work = listpower(R[part[1]], mult[1]); if (overwrite && (R[part[1]] != NULL)) trash(&(R[part[1]])); if (work != NULL) { scalarmult(scaling, work); scaling = 1.e0; } for (i = 2; i <= mult[0]; i++) { oldwork = work; if (mult[i] > 1) { Rpower = listpower(R[part[i]], mult[i]); work = cross(oldwork, Rpower); trash(&Rpower); } else work = cross(oldwork, R[part[i]]); trash(&oldwork); if ((work != NULL) && (scaling != 1.e0)) { scalarmult(scaling, work); scaling = 1.e0; } } return (work); } /**********************************************************************/ /*** Calculates R_j=((c^j/j!-A*c^(j-1)/(j-1)!)) (+) A*W_(j-1), ***/ /*** maintains cpower, and updates Q[]. ***/ /**********************************************************************/ void update_Q_and_R(j, Q, R, c, cpower, A, invfact) int j; struct cons **Q, **R, *c, *cpower; double *A, *invfact; { struct cons *Rnew, *Qcptr, *Rcptr, *oldcpower, *endR; register int i; double *dptr, temp; if (j < 2) { R[j] = NULL; Q[j] = NULL; if (j == 0) for (i = s, dptr = cpower->first; i; i--, dptr++) *dptr = 1.e0; else circledot(c, cpower, cpower); return; } oldcpower = copylist(cpower); circledot(c, cpower, cpower); R[j] = copylist(cpower); endR = R[j]; gemv(-invfact[j - 1], A, oldcpower->first, invfact[j], R[j]->first); trash(&oldcpower); for (i = 2; i < j; i++) { Rnew = makelist(length(Q[i])); endR->rest = Rnew; /* effect is R[j]=nconc(R[j],Rnew) */ temp = 1.e0 / (double) (j - i); for (Qcptr = Q[i], Rcptr = Rnew; Qcptr != NULL; Qcptr = Qcptr->rest, Rcptr = Rcptr->rest) { gemv(1.e0, A, Qcptr->first, 0.e0, Rcptr->first); Rcptr->scaling = Qcptr->scaling; endR = Rcptr; circledot(c, Qcptr, Qcptr); Qcptr->scaling *= temp; } if (thresh > 0.e0) Q[i] = prune(Q[i]); } if (thresh > 0.e0) R[j] = prune(R[j]); return; } /**********************************************************************/ /*** Calculates Q_j. ***/ /**********************************************************************/ struct cons *calcnewQ(j, R, part, mult) int j, *part, *mult; struct cons **R; { struct cons *cptr; int overwrite, rewind; cptr = NULL; part[0] = j; rewind = 1; do { nexpar(part, mult, &rewind); /* * The following statement is a complicated logical computation that * determines when R[part[1]] may be overwritten. */ overwrite = ((j == (order - 2)) && (part[1] == j)) || ((j == (order - 1)) && ((part[1] == 2) || ((part[1] == 3) && (mult[1] <= 2)) || ((part[1] > 3) && (mult[1] == 1) && ((mult[0] == 1) || (part[2] == 2) || ((part[2] == 3) && (mult[2] == 1)))))); cptr = nconc(cptr, tensor(part, mult, R, overwrite)); } while (!rewind); return cptr; } /**********************************************************************/ /*** Utility to maintain statistics. ***/ /**********************************************************************/ double assimilate(ordcnd, scale, pnorm, maxoc, maximum, sumpow, maxscale, ngtthresh) double ordcnd, scale, *maxoc, *maximum, *sumpow, *maxscale, *ngtthresh; int pnorm; { double absoc, tec, abstec; absoc = fabs(ordcnd); if (absoc >= thresh) { tec = ordcnd * scale; abstec = fabs(tec); (*ngtthresh)++; if (scale > *maxscale) *maxscale = scale; if (absoc > *maxoc) *maxoc = absoc; if (abstec > *maximum) *maximum = abstec; if (abstec > 0.e0) { if (pnorm == 1) (*sumpow) += abstec; else (*sumpow) += pow(abstec, (double) pnorm); } } else tec = 0.e0; return (tec); } /**********************************************************************/ /*** Top level function to compute truncation error coefficients. ***/ /**********************************************************************/ void calctecs(A, b, c, bhat, pnorm) double *A, *b, *bhat; struct cons *c; int pnorm; { double *dptr, *invfact, bytes, scale, invpnorm; double tecb, ocb, maxocb, maxb, normb, maxsb, nzb; double tecbhat, ocbhat, maxocbhat, maxbhat, normbhat, maxsbhat, nzbhat; double tecerrest, maxocerrest, maxerrest, normerrest, maxserrest, nzerrest; struct cons **Q, **R, *cpower, *cptr; int *part, *mult, i, j, hotsb, hotsbhat; char bbuffer[96], bhatbuffer[96]; /* output line buffers */ if (order < 1) return; hotsb = 0; hotsbhat = 0; if (calcinds && (thresh == 0.e0)) { printf("A positive zero threshold is required for the calculation\n"); printf("efficiency indices. Indices will not be calculated.\n\n"); calcinds = 0; } invpnorm = 1.e0 / (double) pnorm; cpower = makelist((long int) 1); if (((R = (struct cons **) malloc((size_t) (order * sizeof(struct cons **)))) == NULL) || ((Q = (struct cons **) malloc((size_t) (order * sizeof(struct cons **)))) == NULL) || ((part = (int *) malloc((size_t) (order * sizeof(int)))) == NULL) || ((mult = (int *) malloc((size_t) (order * sizeof(int)))) == NULL) || ((composition = (int *) malloc((size_t) (order * sizeof(int)))) == NULL) || ((indices = (long int *) malloc((size_t) (order * sizeof(long int)))) == NULL) || ((invfact = (double *) malloc((size_t) ((order + 1) * sizeof(double)))) == NULL)) errorexit('m'); for (j = 1, dptr = invfact + 1, *invfact = 1.e0; j <= order; j++, dptr++) *dptr = *(dptr - 1) / (double) j; bytes = (double) colsize *(s + 1) + (double) 3 *order * sizeof(int) + (double) order *sizeof(long int) + (double) (order + 1) * sizeof(double) + (double) 2 *order * sizeof(struct cons **); if (bhat != NULL) bytes += (double) colsize; if (!quiet) { if (bhat == NULL) { printf(" "); printf("truncation error coefficient summary\n\n"); printf("order conditions maximum-norm %3i-norm", pnorm); printf(" maxscale num>=thresh\n"); } else { printf(" "); printf("truncation error coefficient summary\n\n"); printf(" ---------- maximum norm --------- "); printf("------------- %i-norm ------------", pnorm); printf("\norder b bhat error est "); printf("b bhat error est\n"); } } if (wf) { if (bhat == NULL) fprintf(outfp, "\nKEY: tec = order condition * scaling factor.\n"); else fprintf(outfp, "\nKEY: tec( b ) - tec( bhat ) = tec( error est ).\n"); } for (j = 0; j < order; j++) { update_Q_and_R(j, Q, R, c, cpower, A, invfact); if (wf) fprintf(outfp, "\n--- tecs for order %i:\n", j + 1); ocb = invfact[j + 1] - dot(b, cpower->first) * invfact[j]; maxocb = 0.e0; maxb = 0.e0; normb = 0.e0; maxsb = 1.e0; nzb = 0.e0; scale = 1.e0; tecb = assimilate(ocb, scale, pnorm, &maxocb, &maxb, &normb, &maxsb, &nzb); if (bhat == NULL) { if (wf) fprintf(outfp, "%14.6le (quadrature error coefficient)\n", tecb); } else { ocbhat = invfact[j + 1] - dot(bhat, cpower->first) * invfact[j]; maxocbhat = 0.e0; maxbhat = 0.e0; normbhat = 0.e0; maxsbhat = 1.e0; nzbhat = 0.e0; tecbhat = assimilate(ocbhat, scale, pnorm, &maxocbhat, &maxbhat, &normbhat, &maxsbhat, &nzbhat); maxocerrest = 0.e0; maxerrest = 0.e0; normerrest = 0.e0; maxserrest = 1.e0; nzerrest = 0.e0; tecerrest = tecb - tecbhat; tecerrest = assimilate(tecerrest, scale, pnorm, &maxocerrest, &maxerrest, &normerrest, &maxserrest, &nzerrest); if (wf) { fprintf(outfp, "%14.6le -%14.6le =%14.6le", tecb, tecbhat, tecerrest); fprintf(outfp, " (quadrature error coefficients)\n"); } } for (i = 2; i <= j; i++) { if (i == j) Q[i] = calcnewQ(i, R, part, mult); for (cptr = Q[i]; cptr != NULL; cptr = cptr->rest) { scale = cptr->scaling; ocb = dot(b, cptr->first); tecb = assimilate(ocb, scale, pnorm, &maxocb, &maxb, &normb, &maxsb, &nzb); if (bhat == NULL) { if (wf) if (fabs(ocb) >= thresh) fprintf(outfp, "%14.6le = %14.6le * %lg\n", tecb, ocb, scale); } else { ocbhat = dot(bhat, cptr->first); tecbhat = assimilate(ocbhat, scale, pnorm, &maxocbhat, &maxbhat, &normbhat, &maxsbhat, &nzbhat); tecerrest = assimilate(tecb - tecbhat, 1.e0, pnorm, &maxocerrest, &maxerrest, &normerrest, &maxserrest, &nzerrest); if (wf) if ((fabs(ocb) >= thresh) || (fabs(ocbhat) >= thresh)) fprintf(outfp, "%14.6le -%14.6le =%14.6le\n", tecb, tecbhat, tecerrest); } } if (j == (order - 1)) trash(Q + i); } if (pnorm > 1) { if (normb > 0.e0) normb = pow(normb, invpnorm); if (normbhat > 0.e0) normbhat = pow(normbhat, invpnorm); if (normerrest > 0.e0) normerrest = pow(normerrest, invpnorm); } if (calcinds && j && (maxb > 0.e0) && (!hotsb)) { sprintf(bbuffer, "eff(b): %7.4lg (EPS) %7.4lg (EPUS) %7.4lg (EPS) %7.4lg (EPUS)\n", pow(maxb, -1.e0 / (double) (j + 1)) / (double) cps, pow(maxb, -1.e0 / (double) j) / (double) cps, pow(normb, -1.e0 / (double) (j + 1)) / (double) cps, pow(normb, -1.e0 / (double) j) / (double) cps); hotsb = 1; } else bbuffer[0] = '\0'; if (bhat == NULL) { if (!quiet) { printf("%3i %12.4le %12.4le %12.4le %12.7lg %12.7lg\n", j + 1, maxocb, maxb, normb, maxsb, nzb); if (calcinds) printf("%s", bbuffer); } if (wf) { fprintf(outfp, "summary for order %i:\n", j + 1); fprintf(outfp, "max-norm = %10.4le, %i-norm = %10.4le, maxscale = %10.4le\n", maxb, pnorm, normb, maxsb); if (calcinds) fprintf(outfp, "%s", bbuffer); } if (calcinds && quiet && !wf) if (bbuffer[0] != '\0') printf("%2i %s", j + 1, bbuffer); } else { if (calcinds && j && (maxbhat > 0.e0) && (!hotsbhat)) { sprintf(bhatbuffer, "eff(bhat): %7.4lg (EPS) %7.4lg (EPUS) %7.4lg (EPS) %7.4lg (EPUS)\n", pow(maxbhat, -1.e0 / (double) (j + 1)) / (double) cps, pow(maxbhat, -1.e0 / (double) j) / (double) cps, pow(normbhat, -1.e0 / (double) (j + 1)) / (double) cps, pow(normbhat, -1.e0 / (double) j) / (double) cps); hotsbhat = 1; } else bhatbuffer[0] = '\0'; if (!quiet) { printf("%3i %10.3le %10.3le %10.3le %10.3le %10.3le %10.3le\n", j + 1, maxb, maxbhat, maxerrest, normb, normbhat, normerrest); if (calcinds) { if (bbuffer[0] != '\0') printf("%s", bbuffer); if (bhatbuffer[0] != '\0') printf("%s", bhatbuffer); } } if (wf) { fprintf(outfp, "summary for order %i:\n", j + 1); fprintf(outfp, " b: max-norm = %10.4le, ", maxb); fprintf(outfp, "%i-norm = %10.4le, maxscale = %10.4le\n", pnorm, normb, maxsb); fprintf(outfp, " bhat: max-norm = %10.4le, ", maxbhat); fprintf(outfp, "%i-norm = %10.4le, maxscale = %10.4le\n", pnorm, normbhat, maxsbhat); fprintf(outfp, "errest: max-norm = %10.4le, %i-norm = %10.4le\n", maxerrest, pnorm, normerrest); if (calcinds) { if (bbuffer[0] != '\0') fprintf(outfp, "%s", bbuffer); if (bhatbuffer[0] != '\0') fprintf(outfp, "%s", bhatbuffer); } } if (calcinds && quiet && !wf) { if (bbuffer[0] != '\0') printf("%2i %s", j + 1, bbuffer); if (bhatbuffer[0] != '\0') printf("%2i %s", j + 1, bhatbuffer); } } } if (order >= 3) if (R[2] != NULL) trash(R + 2); trash(&cpower); free((void *) Q); free((void *) R); free((void *) part); free((void *) mult); free((void *) composition); free((void *) indices); free((void *) invfact); bytes += (double) (consize * trashcan.scaling); if (!quiet) printf("\n%12.3lf kbytes of storage were required.\n", bytes / 1024.e0); return; } void zswaps(z1, z2) struct cmplx *z1, *z2; { double temp; temp = z1->re; z1->re = z2->re; z2->re = temp; temp = z1->im; z1->im = z2->im; z2->im = temp; return; } /**********************************************************************/ /*** Solves the linear system ImzA*zx=1. ***/ /**********************************************************************/ int linsolve() { double temp, maxs; struct cmplx multip, invdiag, *aptr, *aptr1, *aptr2, *xptr; int i, j, k, imax; /* This is the right-hand side vector */ for (i = s, xptr = zx; i; i--, xptr++) { xptr->re = 1.e0; xptr->im = 0.e0; } for (i = 1; i < s; i++) { /* search for pivot */ aptr = ijth(ImzA, i, i); for (k = i, aptr1 = aptr, maxs = 0.e0, imax = 0; k <= s; k++, aptr1++) { temp = aptr1->re * aptr1->re + aptr1->im * aptr1->im; if (temp > maxs) { maxs = temp; imax = k; } } if (imax == 0) return (i); /* * swap rows, if necessary. Maybe we should use a reference vector * instead, but profiling suggests there is little to be gained from it. */ if (imax != i) { for (k = i, aptr1 = aptr, aptr2 = aptr + imax - i; k <= s; k++, aptr1 += s, aptr2 += s) zswaps(aptr1, aptr2); zswaps(zx + i - 1, zx + imax - 1); } /* eliminate */ invdiag.re = -aptr->re / maxs; invdiag.im = aptr->im / maxs; for (k = i + 1, aptr2 = aptr + 1, aptr1 = aptr; k <= s; k++, aptr2 = aptr + k - i, aptr1 = aptr) { multip.re = aptr2->re * invdiag.re - aptr2->im * invdiag.im; multip.im = aptr2->re * invdiag.im + aptr2->im * invdiag.re; for (j = s - i, aptr1 += s, aptr2 += s; j; j--, aptr1 += s, aptr2 += s) { aptr2->re += (aptr1->re * multip.re - aptr1->im * multip.im); aptr2->im += (aptr1->re * multip.im + aptr1->im * multip.re); } zx[k - 1].re += (zx[i - 1].re * multip.re - zx[i - 1].im * multip.im); zx[k - 1].im += (zx[i - 1].re * multip.im + zx[i - 1].im * multip.re); } } /* back solve */ for (j = s; j; j--) { aptr = ijth(ImzA, j, j); xptr = zx + j - 1; temp = (aptr->re * aptr->re + aptr->im * aptr->im); multip.re = (xptr->re * aptr->re + xptr->im * aptr->im) / temp; multip.im = (xptr->im * aptr->re - xptr->re * aptr->im) / temp; xptr->re = multip.re; xptr->im = multip.im; if ((multip.re != 0.e0) || (multip.im != 0.e0)) for (i = j - 1, aptr--, xptr--; i; i--, aptr--, xptr--) { if ((aptr->re != 0.e0) || (aptr->im != 0.e0)) { xptr->re -= (aptr->re * multip.re - aptr->im * multip.im); xptr->im -= (aptr->re * multip.im + aptr->im * multip.re); } } } return (0); } /**********************************************************************/ /*** Evaluate abs(R(z))-1. ***/ /**********************************************************************/ double absrzm1(z, A, b) struct cmplx *z; double *A, *b; { struct cmplx *zptr, r, btx; double *dptr; int i, j; if (ImzA == NULL) { ImzA = (struct cmplx *) malloc((size_t) (s * s * sizeof(struct cmplx))); zx = (struct cmplx *) malloc((size_t) (s * sizeof(struct cmplx))); if ((ImzA == NULL) || (zx == NULL)) errorexit('m'); } for (i = s, zptr = ImzA, dptr = A; i; i--) for (j = s; j; j--, zptr++, dptr++) { zptr->re = -(*dptr * z->re); zptr->im = -(*dptr * z->im); if (i == j) zptr->re += 1.e0; } if (linsolve()) return (BIGENOUGH); for (i = s, zptr = zx, dptr = b, btx.re = 0.e0, btx.im = 0.e0; i; i--, zptr++, dptr++) { btx.re += (*dptr * zptr->re); btx.im += (*dptr * zptr->im); } r.re = 1.e0 + z->re * btx.re - z->im * btx.im; r.im = z->re * btx.im + z->im * btx.re; return (sqrt(r.re * r.re + r.im * r.im) - 1.e0); } /**********************************************************************/ /*** Search inward for a negative function value. ***/ /**********************************************************************/ int searchin(z, A, b, xa, fxa, xb, fxb) struct cmplx *z; double *A, *b, *xa, *fxa, *xb, *fxb; { struct cmplx zz; *xa = *xb * INFACTOR; do { zz.re = *xa * z->re; zz.im = *xa * z->im; *fxa = absrzm1(&zz, A, b); if (*fxa <= 0.e0) return (0); *xb = *xa; *fxb = *fxa; *xa *= INFACTOR; } while (*xa >= SMALLENOUGH); return (1); } /**********************************************************************/ /*** Search outward for a positive function value. ***/ /**********************************************************************/ int searchout(z, A, b, xa, fxa, xb, fxb) struct cmplx *z; double *A, *b, *xa, *fxa, *xb, *fxb; { struct cmplx zz; *xb = *xa * OUTFACTOR; do { zz.re = *xb * z->re; zz.im = *xb * z->im; *fxb = absrzm1(&zz, A, b); if (*fxb >= 0.e0) return (0); *xa = *xb; *fxa = *fxb; *xb *= OUTFACTOR; } while (*xb <= BIGENOUGH); return (1); } /**********************************************************************/ /*** Secant iteration with brackets. ***/ /**********************************************************************/ int secant(z, A, b, x) struct cmplx *z; double *A, *b, *x; { double xa, xb, xc, fxc, bmad2, ttxa, x1, fx1, x2, fx2; struct cmplx zz; int itcount; if (*x <= 0.e0) xa = SMALLSTEP; else xa = *x; zz.re = xa * z->re; zz.im = xa * z->im; fx1 = absrzm1(&zz, A, b); if (fx1 == 0.e0) return (0); if (fx1 > 0.e0) { if (xa == SMALLSTEP) { *x = 0.e0; return (0); } xb = xa; fx2 = fx1; if (searchin(z, A, b, &xa, &fx1, &xb, &fx2)) { *x = 0.e0; return (0); } } else { if (searchout(z, A, b, &xa, &fx1, &xb, &fx2)) { *x = BIGENOUGH; return (0); } } itcount = 0; bmad2 = (xb - xa) * 0.5e0; ttxa = SECANTTOL * xa; fxc = fx1; x1 = xa; x2 = xb; while (bmad2 > ttxa) { if (itcount > MAXITS) { /* printf("\n Final bracket: [%lf,%lf] \n", xa, xb); */ *x = xa + bmad2; return (1); } xc = x2 + (x2 - x1) * fx2 / (fx1 - fx2); if ((xc <= xa) || (xc >= xb)) /* outside interval */ xc = xa + bmad2; else if (xc < (xa + ttxa)) /* too close to xa */ xc = xa + ttxa; else if (xc > (xb - ttxa)) /* too close to xb */ xc = xb - ttxa; zz.re = xc * z->re; zz.im = xc * z->im; fxc = absrzm1(&zz, A, b); if (fxc == 0.e0) { *x = xc; return (0); } else if (fxc < 0.e0) { xa = xc; ttxa = SECANTTOL * xa; } else xb = xc; x1 = x2; fx1 = fx2; x2 = xc; fx2 = fxc; bmad2 = (xb - xa) * 0.5e0; itcount++; } *x = xa + bmad2; return (0); } /**********************************************************************/ /*** Algorithm to calculate stability region. ***/ /**********************************************************************/ void stabplot(A, b, bhat, nsamples) double *A, *b, *bhat; int nsamples; { struct cmplx z; double x, xb, theta, xsum, xssq, xmu, xsigma, xbsum, xbssq, xbmu, xbsigma, dsum, dssq, dmu, dsigma; int i, probflag; x = 0.e0; xb = 0.e0; probflag = 0; ImzA = NULL; zx = NULL; xsum = xssq = xbsum = xbssq = dsum = dssq = 0.e0; if (wf) { if (bhat != NULL) fprintf(outfp, "\n\n "); else fprintf(outfp, "\n\n "); fprintf(outfp, "stability region radii\n\n"); fprintf(outfp, "Angle (rad) b radius"); if (bhat != NULL) fprintf(outfp, " bhat radius difference\n"); else fprintf(outfp, "\n"); } for (i = 0; i < nsamples; i++) { theta = THETABEG + (double) i *(THETAEND - THETABEG) / (double) (nsamples - 1); z.re = cos(theta); z.im = sin(theta); if (wf) fprintf(outfp, " %7.5lf ", theta); if (secant(&z, A, b, &x)) probflag = 1; if (wf) fprintf(outfp, "%13.5le ", x); xsum += x; xssq += x * x; if (bhat != NULL) { if (secant(&z, A, bhat, &xb)) probflag = 1; if (wf) fprintf(outfp, "%13.5le %13.5le\n", xb, x - xb); xbsum += xb; xbssq += xb * xb; dsum += fabs(x - xb); dssq += (x - xb) * (x - xb); } else if (wf) fprintf(outfp, "\n"); } xmu = xsum / (double) nsamples; xsigma = sqrt((xssq - nsamples * xmu * xmu) / (double) (nsamples - 1)); if (wf) { fprintf(outfp, "stability region summary:\n"); if (bhat != NULL) fprintf(outfp, " b: "); fprintf(outfp, "Average radius = %10.4le, standard deviation = %10.4le\n", xmu, xsigma); } if (!quiet) { if (bhat != NULL) { printf(" "); printf("stability region summary\n"); printf(" "); printf("(number of samples = %i, pi/2 <= theta <= pi)\n\n", nsamples); printf(" b: "); } else { printf(" "); printf("stability region summary\n"); printf(" "); printf("(number of samples = %i, pi/2 <= theta <= pi)\n\n", nsamples); printf(" "); } printf("Average radius = %10.4le, standard deviation = %10.4le\n", xmu, xsigma); } if (bhat != NULL) { xbmu = xbsum / (double) nsamples; xbsigma = sqrt((xbssq - nsamples * xbmu * xbmu) / (double) (nsamples - 1)); dmu = dsum / (double) nsamples; dsigma = sqrt((dssq - nsamples * dmu * dmu) / (double) (nsamples - 1)); if (wf) { fprintf(outfp, "bhat: Average radius = %10.4le, standard deviation = %10.4le\n", xbmu, xbsigma); fprintf(outfp, "diff: Average |diff| = %10.4le, standard deviation = %10.4le\n", dmu, dsigma); } if (!quiet) { printf( " bhat: Average radius = %10.4le, standard deviation = %10.4le\n", xbmu, xbsigma); printf( " diff: Average |diff| = %10.4le, standard deviation = %10.4le\n", dmu, dsigma); } } if (probflag) { if (!quiet) { printf( "Warning: the root solver had some difficulty finding the boundary\n"); printf( "of the stability region. The stability radii may be incorrect.\n\n"); } if (wf) { fprintf(outfp, "Warning: the root solver had some difficulty finding the boundary\n"); fprintf(outfp, "of the stability region. The stability radii may be incorrect.\n\n"); } } else { if (!quiet) printf("\n"); if (wf) fprintf(outfp, "\n"); } free(ImzA); free(zx); } /**********************************************************************/ /*** End of rktec.c ***/ /**********************************************************************/