NA Digest Tuesday, April 12, 1988 Volume 88 : Issue 15

Today's Editor: Cleve Moler

Today's Topics:


From: Heikki Apiola <>
Date: Sun, 3 Apr 88 20:37 O
Subject: Fortran Code Generation by Computer Algebra Systems

I would like to know about existing computer algebra based (especially
MACSYMA) Fortran code generation packages and their availability.
There have been articles about GENTRAN by Paul S. Wang in J. of Symbolic

Please reply to:

Heikki Apiola
e-mail: apiola@finfun.bitnet

Scientific Computing Centre
State Computer Centre
SF-02101 Espoo


From: Hans D. Mittelmann <aihdm@asuacad.bitnet, na.mittelmann>
Date: 8 Apr 88 19:35 MST
Subject: Temporary Address Change for Hans Mittlemann

My address from 1 May 1988 until 31 July 1988 will be:

Institut fuer Angewandte Mathematik
Universitaet Erlangen-Nuernberg
Martensstrasse 3
8520 Erlangen
West Germany


--Hans Mittelmann


From: Jim Buchanan <>
Date: Tue, 12 Apr 88 16:40:24 EST
Subject: Models of the Human Head or Ear

I am looking for information on mathematical models of the human head
in general, or of the ear in particular. Does anyone have any
information in this area? Thanks in advance.
--Jim Buchanan


From: Warren E Ferguson <smu!ferguson@uunet.UU.NET>
Date: Tue, 12 Apr 88 15:31:27 CDT
Subject: Transcendental Functions on the 80387

I am thinking of having students in one of my classes create a partial
emulator of the INTEL 80387 chip. After looking at the spec sheets for the
80387, it appears that INTEL is using CORDIC's to compute their special
functions. Does anyone know of literature describing which CORDIC's INTEL is
using for these functions? Did members of the IEEE Binary Standard ever
suggest how the Standard could be used to compute the basic transcendental

[Editor's Note: See the announcement of Kahan's lectures at Sun
later in this Digest.]

Warren Ferguson, Southern Methodist University


From: Cleve Moler <>
Date: Tue, 12 Apr 88 22:14:26 PDT
Subject: Inverse Square Root of a Matrix

I was asked recently if I had any software to compute the inverse
square root of a symmetric, positive definite matrix. All I had to
suggest was EISPACK and the obvious diagonalization algorithm.
Has anybody got anything faster? The setting involves lots of
matrices of small order, say less than 10-by-10.


From: [Three levels of forwarding]
Date: Fri, 08 Apr 88 14:05:16 -0800
Subject: International Conference on Applications of Supercomputers

Announcement of ASE 89

The International Conference on Applications of Supercomputers in
Engineering will be held 5-7 September 1989 (yes 1989) at Southampton
University, United Kingdom. This International Conference is convened
to increase the awareness of the potential of the new supercomputers
amongst scientists and engineers.

Papers are invited on the following topics, and the abstracts should reach
the Organizers by 30 December 1988. Topics include:
* computer architecture and current supercomputers
* use and development of transputers
* case studies in
environmental sciences, etc.
* new parallel algorithms for finite differences
and finite and boundary element methods
* optimization problems
* structural and dynamic analysis including crash and impact
simulation applications in automotive engineering
* use of supercomputers in CAD and CIM
* benchmarks and comparison between different processors
* fluid dynamics and atmospheric science applications
* expert system applications
* future trends.

For further information, please contact:

Liz Newman
Conference Secretary
Computational Mechanics Institute
Ashurst Lodge
Southampton, S04 2AA

Telephone: (042129) 3223
Telex: 47388
Fax: (042129) 2853


From: David Hough <dgh@Sun.COM>
Date: Fri, 8 Apr 88 17:47:25 PDT
Subject: Floating-Point Indoctrination by Kahan

Sun Microsystems presents a public series of lectures:

Computer System Support for Scientific and Engineering Computation

Prof. W. Kahan
University of California

The environment for computing
Comparisons of floating-point arithmetics in computers
Programmers' models of floating-point arithmetic
Comparison of computers' floating-point register architectures
Real elementary transcendental functions
Complex arithmetic
Exception handling
Numerical lapses in standard programming languages

A detailed syllabus follows this announcement.


The lecture series will start at 8:00 AM on Tuesday May
3, 1988, in the Training Room at Sun Microsystems building
7, 2700 Coast Avenue, Mountain View, CA. From US 101, take
the Amphitheater Parkway exit in Mountain View and turn left
on Garcia. Turn right on Marine Way, then right on Coast
Avenue to its end.

The first lecture will provide an overview of the sub-
ject matter to be discussed in the course, and the last lec-
ture will summarize the material presented previously and
survey advanced topics and directions for further research.
Approximately 24 lectures will be presented Tuesdays and
Thursdays through approximately July 29. Each lecture will
start at 8:00 AM and last for two hours.

Lectures are open to the public subject to space avail-
able, and persons with experience in some of the lecture
topics are encouraged to attend and contribute to the dis-
cussion. The amount of space available will be determined by
the number of pre-registrants, so persons desiring to insure
a place should register in advance with the Sun Training and
Development Department using the form below.

Persons with training or experience comparable to that
represented by a B. S. degree in Computer Science should be
able to understand most of the conclusions. The mathematical
background required for a deep understanding of the lecture
topics varies considerably, especially for some of the
proofs. Chapter 4 of Knuth's Seminumerical Algorithms exem-
plifies the presentation level - accessible to undergradu-
ates, but containing interesting mathematics at all levels.
Note that the scope of the course is much broader than
Knuth's book.

Written lecture notes will be distributed to regis-
trants about a week after each lecture. These notes may
subsequently be commercially published to supersede the 1973
Implementation of Algorithms lecture notes distributed by

The first and last lectures will be videotaped; regis-
trants will receive a copy of the first for their personal

This lecture series does not provide academic credit.
Registrants interested in obtaining formal recognition of
participation may present an acceptable individual or group
project. Outstanding project reports will be included in
the published course lecture notes. An example of such a
project is to reimplement some of the 4.3 BSD libm functions
using fast table-driven techniques. Prof. Kahan will be
available after lectures for consultation on projects.

Questions pertaining to the technical scope and depth
of the course may be addressed to David Hough,, (415) 691-2268. Questions pertaining to
registration procedures should be directed to Janet Rath,, Sun Training and Development, (415) 494-6058
or 494-8008 x 6058.


Vendors of semiconductor devices, calculators, comput-
ers, and software products oriented toward scientific and
engineering computation are encouraged to contribute techni-
cal literature describing their products for distribution to
attendees. Please contact David Hough after 22 April to
determine quantities required.


The registration fee guarantees a place and provides
lecture notes and a videotape of the initial lecture of 3
May. Registration by 22 April 1988 is recommended, but late
registrants will be accommodated subject to available space
through at least the second class meeting, Thursday 5 May.
The registration fee is $1000, subject to reduction if more
participants register than projected.

Send a check for the registration fee, payable to Sun
Microsystems, with the Registration Form below.

Full-time students at accredited institutions may
register without fee; they will receive no videotape. Such
students should submit current evidence of their status in
lieu of a check.


Please send the following information to:

Sun Microsystems Training and Development
Attention: Janet Rath - Floating-Point Lectures
MS A6-05
2550 Garcia
Mountain View, CA 94043

Your name:

USENET or ARPANET address for electronic mail, if any:

Mailing address:

FULL-TIME STUDENTS ONLY: Your institution, department, next degree and projected date:

Telephone number during business hours:

Highest academic degree and major field:

Circle any of the following languages of which you have a working knowledge:

Ada APL BASIC C C++ Fortran Modula-2 Pascal PL/I

Name computers, if any, of which you have working knowledge of
assembly-language programming:

Name computers, if any, of which you have a working knowledge of the floating-point
architecture, such as number representations, rounding methods, etc.

What do you want to get out of this course?


The following details indicate the scope of the lec-
tures and point to references. While the focus will be on
general principles, particular areas of interest expressed
by attendees will also be discussed.

Lecture Plan

Lecture number Topic

1 Prospective overview from an elementary level (videotaped)
2-21 See details below
22-23 Student project reports
24 Retrospective overview from an advanced level (videotaped)

1. Environment for computing
Gresham's Law: "the bad drives out the good"; similarly the fast
drives out the slow even if the fast is wrong.
Computer Users: sophisticated but numerically naive.
Many layers of software from diverse sources.
Portable software in the public domain: LINPACK, EISPACK, ...
Commercial Libraries: IMSL, NAG, ... [Rice 1983]
Standards: for programming languages but little else.
Proprietary rights, copyright, patent, trade secrets.
L. J. Comrie's trick
Execute-only code, copy protection.
Liability and exculpation: law of torts
Reliability means conformity to reasonable expectations
Marketing claims vs. absolute and negligent liabilities
Obligation to correct errors; change vs. compatibility

2. Comparisons of floating-point arithmetics in computers
Conventional floating-point arithmetics: [Sterbenz 1975]
Radix, Word size, Range/Precision trade-off
Why is binary best? When is decimal best?
Commercially significant machines:
IBM 7090/7094, 360, 370, PC's
Univac 11xx
CDC 6600, Cyber 1xx, 205
Burroughs 6500
Hewlett-Packard 3000
Calculators by Hewlett-Packard and Texas Instruments
IEEE Standard 754:
Chips: Intel 80x87, Motorola 6888X, ATT 32106
National 32081, Weitek 1164/5, AMD 29027,
Fairchild Clipper, Analog Devices 3212/22
BIT, Inmos T800, ...
Micros: IBM PC's, Sun-3 and Sun-4, Apple Macintosh II, ...
Mainframes: ELXSI 6400, HP Spectrum
Nonconventional representations of real numbers:
Logarithmic (Radix 1)
Variable exponent width - why are they all bad ideas? Demmel 1987
Matsui-Iri 1983, Hamada 1987
Symmetric level-index arithmetic, Clenshaw-Olver 1987
Floating-slash rational arithmetic: Kornerup-Matula 1987
Interval arithmetic: Moore, Herzberger and Alefeld
what it is and what good it is, briefly
Multi-precision schemes:
Conventional double and extended precisions
Numerical Turing language - Hull 1987
MP package for Fortran - Brent
MACSYMA Bigfloat - Fateman
Kulisch-Miranker paradigm, IBM ACRITH package 1983
Kahan's indefinitely extended IEEE 754/854
Implementation of algebraic operations +-*/ rem sqrt:
Fast carry propagation: fast prefix calculation, VLSI
Floating-point add/subtract, shifters, normalizers
Multiplication: iterative, Booth re-encoding,
Wallace tree of carry-save adders
Division: higher radix [Taylor]; preconditioned
iterative using fast multiply, can be grubby [Cray]
can be cleaned up and tested [Kahan 1987]
Why keep (division time)/(multiplication time) < 4
Interruptible remainder
Square root by hardware, by software, or without division
Rounding in theory: guard-round-sticky bits
rerounding without double rounding (C. Lee 1988)
carry-free roundings (chop, jam, ROM)
Rounding as practiced on commercially significant machines
DPROD, or multiply-add primitives
Conversions: of precisions, float-integer, BCD-binary
Exceptions: Traps/Faults, Flags, Defaults, Presubstitution,
Invalid operation, Division by zero, Over/Underflow,
as practiced on commercially significant machines
partial overflow on Cray
partial underflow on CDC Cyber 1xx
gradual underflow in IEEE 754/854
NaNs, infinities, and reserved operands
Range extension via over/underflow tallies [Kahan 1966]
Precision doubling, and further extensions [Pichat 1972]
Ultra-wide multiplication (Karatsuba, Cook, Schoenhage)

3. Models of floating-point arithmetic for programmers
Purpose of models: promote correctness proofs and portability
Axioms: non-categorical axiomatization is too loose
32 by Wijngaarden 1966
20 by Brown 1981
Axioms: over-axiomatization is too slow
Kulisch's ringoids, semi-morphism, optimality
crude inequalities, Wilkinson 1959
parameterization, Sterbenz 1973
environmental inquiries in Fortran, ADA, ...
Computer manufacturer's manuals
IEEE standards 754/854
Tests for correct floating-point arithmetic:
Test batteries: the IEEE 754 test-vectors tape
Test patterns: Schryer 1980, NAG 1987
Inquiries and consistency tests: MACHAR Cody 1980, PARANOIA Kahan 1986
P-adic tests for correctly rounded sqrt: Kahan 1966
P-adic tests for correctly rounded division: Kahan 1987
Identities that persist despite roundoff
Weak monotonicity
p-q is exact if 1/2 <= p/q <= 2
When does integer m = (m/n)*n rounded?
When does x = (x+y) - y rounded ?
When does sqrt(x*x) = |x| ?
Avoiding drift:
during conversion, Matula 1968
during iteration; Knuth-Reiser proof
Harry Diamond's theorem
Error analysis:
Conventional "forward" analysis (Hotelling)
"Backward" analysis (Turing, Givens, Bauer, Wilkinson)
Combined approaches: the web of relationships
Misconceptions corrected:
Rounding errors are NOT infinitesimal
Rounding errors are NOT random variables
Unstable computations are often free from cancellation
Cancellation is often a GOOD thing in equation solving,
extrapolation, divided differences
Simulation is easy, prediction difficult - removable singularities
Backward error analysis is an explanation, not a license:
accuracy not transitive under composition
Precision does not limit accuracy (Dekker, Pichat)
Why double+ precision is a good rule of thumb:
double zeros, normal equations, divided differences
What "unstable", "ill-conditioned", "ill-posed" mean; how to
help an ill-posed problem get well
Interval arithmetic vs. Running Error-Analysis
When arbitrarily high precision might not help
Triangle with given side-lengths
Quadratic and Cubic equations: betrayed by books
Polynomial equations in general
Linear systems of equations and eigenproblems
Numerical quadrature and infinite series
Trajectory problems and boundary value problems

4. Comparison of computers' floating-point register archi-
One accumulator: IBM 7094, DEC PDP 10, GE/Honeywell 635
Stack-top: B5500, HP 3000
Small stack: HP calculators, Inmos T800, Intel 80x87
indefinitely long stack intended for 8087
Effect on procedures' argument passing and register saving
Extended width:
internally in HP calculators
Kulisch-Miranker's hidden Super-Accumulator
in registers: IBM 7094, DEC PDP-10, GE 635
IEEE 754: 8087, 68881, WE 32106
Orthogonal registers: IBM 370, CDC 6600, NS 32081, Cray,
DEC VAX, ELXSI 6400, Weitek, AMD, Analog Devices, etc.
Vector registers: Cray; simulated vectors in CDC Cyber 205
Fast vector arithmetic without vector registers.
Influences on programming languages' expression evaluation:
Precisions available
Mixed-precision expressions:
Fundamentalist's Fortran for IBM 370, DEC VAX
Widest available for C on PDP-11; 8087; 68881
"scan for widest" seems best [Corbett]
Conflict between compiler's pseudo-machine and actual registers

5. Real elementary transcendental functions
Argument reduction for log, exponential, and trig functions
How to simulate 2/pi to infinite precision, or not.
CORDIC algorithms = argument reduction carried to its limit
Polynomial, rational, and other "best" approximation
The Remez algorithm, with compensation for roundoff
Interpolation in BIG tables: IBM 370, DEC VAX VMS
J. C. P. Miller's trick to suppress roundoff: Gal
"Kernel function" approach; monotonicity achieved thereby
Special properties for special functions:
Weak monotonicity: EXP(X+Y**2) >= EXP(X)
Cardinal values: cos(0) = 1, |cos(x)| <= 1 .
Huge arguments: cos(3083975227) = -.00000000007486714572...
Removable singularities: does 0**0 = 1 ?
Unremovable singularities: ln(0)=-infinity; ln(-1) is NaN.
Accuracy claims and proofs: the Table-Maker's Dilemma
Testing: Cody 1980, Liu 1987

6. Complex Arithmetic
Mixed Real and Complex expressions
Multiply/Divide despite exceptions in concurrent environments
Different approaches in infinity
The sign of zero: functions like sqrt on slitted domains
Accurate elementary transcendental functions
Peter Tang's complex Remez algorithms

7. Exception handling
An Exception is not an Error unless handled badly
Why we must neither always ABORT nor always merely CONTINUE
Retrospective Diagnostics resolve the dilemma.
Let's not get enmeshed in the operating system's kernel.
Concurrent/parallel/pipelines machines restrict the options:
how can we pause and resume during debugging?
Infinity and NaN/Indefinite/Reserved operand allow CONTINUE option.
Presubstitution, and some of its applications.
Flags permit recovery from ill-presubstituted exceptions.
Circumventing over/underflow in extended products/quotients:
Kahan's 1966 implementations on 7094 and 360.
Barnett's 1987 implementations on VAX and Sun-3.
The problem of Scope:
solved without language support by Apple SANE 1986
with minimal language support:
APL's localization of system variables
proper language support achieves minimal run-time cost:
diminish incidence of forgotten SAVE/RESTOREs
avoid unnecessary SAVE/RESTORE in leaf subprograms.
Comparisons with:
"ON condition" in PL/I and Basic
"drop-through" in ADA.
"Error-Handlers" in DEC VMS Fortran.
Signals thrown and caught by Unix.
Augmented data-structures for Data-Flow.

8. Numerical lapses in standard programming languages:
Misguided attempts:
to define the semantics of arithmetic
Brown's model in ADA, Fortran 8x, ANSI C
Ambiguous environmental inquiries
to define occurrences like 1.0/0.0 and 0.0**0.0 as Errors.
to ignore integer overflow
Running roughshod over parentheses (C)
Moving code out of loops despite side-effects.
Other would-be optimizations like
misuse of register variables causing schizophrenia
misuse of "0-x" instead of "-x", or
"-(x-y)" instead of "y-x" , and
" x-y > 0" instead of "x > y".
Pascal's fixed array-dimensions, lack of double precision,
inadequate provision for precompiled libraries
C's awkward multi-subscripted arrays inside procedures
Fortran's unprotected argument lists
Inability to overload arithmetic operators to introduce:
complex arithmetic
matrix arithmetic
interval arithmetic
Awkward treatment of
"Eureka" problem (Break)
"Lost search-party" problem (compare Exception Handling)
Mixed-precision expressions (Corbett)
Re-running a program with higher precision
Mixed real-complex expressions
Inexact constants, inaccurate conversions
Output too wide for its field
What else should compiler writers know about floating point:
Farnum's 1988 paper


End of NA Digest