\documentstyle{article} % Specifies the document style.
% The preamble begins here.
\title{Numerical Performance Results from the Shallow Water
Equation Test Suite}
\author{
John B. Drake, {\em Ed.}
\thanks{Send correspondence to bbd@ornl.gov or to the
Mathematical Sciences Section, Oak Ridge National Laboratory,
P.O. Box 2008, Oak Ridge, Tennessee 37831-8083.}
}
\date{Updated: 3 May 1993}
\begin{document} % End of preamble and beginning of text.
\maketitle % Produces the title.
\section{Introduction}
The DOE Computer Hardware,
Advanced Mathematics and Model Physics (CHAMMP) program
seeks to provide climate researchers with
an advanced modeling capability for the study of global change issues
and is interested in the development of new methods
for the study of climate dynamics.
The shallow water equations have been used as a kernel for both
oceanic and atmospheric general circulation models and
are useful in evaluating numerical methods for weather
forecasting and climate modeling.
To promote development of new methods, a set of
test cases has been proposed \cite{Williamson-Drake}
and example software and reference solutions provided \cite{Jakob-Hack}.
This report summarizes the performance of
methods that have been applied to the
test cases.
Promising schemes should be subjected to other tests appropriate to
their intended application.
It is hoped that the bibliography provided herein will offer pointers
to the appropriate literature for more
comprehensive studies of the strengths and weaknesses of individual
methods.
\section{Comparison of Algorithms and Computing Platforms}
Table 1 gives a comparison of methods by accuracy and computational
performance.
In Table 1, execution time is given in seconds and represents
the best measurement of the time required to perform a 5-day
integration on a dedicated machine. Dedicated time is not
always available, and timing measurements are often peculiar to a given
installation.
The accuracy reported is the normalized $l_2 (h)$ error as requested
in \cite[eq. 83]{Williamson-Drake}.
Gflops is an estimate of the rate at which floating-point operations are
performed during the integration, in units of $10^9$ operations per second.
Hardware performance monitors are the preferred measurement method.
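As a rough illustration of how the Gflops column relates to the execution
time, suppose an integration performs $N_{\rm flop}$ floating-point operations
in $T$ seconds. The symbols $N_{\rm flop}$ and $T$ are notation assumed here
for the sketch and do not come from the test suite definition. Then
\[
\mbox{Gflops} = \frac{N_{\rm flop}}{10^{9} \, T} .
\]
For the single-processor T42 spectral entry in Table 1 (0.162 Gflops, 3.5 sec)
this implies
\[
N_{\rm flop} \approx 0.162 \times 10^{9} \times 3.5 \approx 5.7 \times 10^{8}
\]
operations. The six-processor entry (0.567 Gflops for 1.0 sec) gives the same
count, as would be expected if both runs perform the same arithmetic.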
\begin{table}[htbp]
\begin{tabular}{||rrrrrrrr||} \hline
Algorithm & Resol. & Machine & $P$ & Accuracy & Gflops & Execution & Notes \\
& & & & & & Time (sec) & \\
\hline
Spectral & T42 & Y-MP & 1 & $10^{-10}$ & 0.162 & 3.5 & 1 \\
Spectral & T42 & Y-MP & 6 & $10^{-10}$ & 0.567 & 1.0 & 2 \\
\hline
Spectral & T213 & Y-MP & 1 & $10^{-10}$ & 0.215 & 690.0 & 2 \\
Spectral & T213 & Y-MP & 6 & $10^{-10}$ & 1.210 & 130.0 & 2 \\
\hline
TIG Model & 2562 & C90 & 1 & $2.5 \times 10^{-4}$& 0.087 & 20.4 & 3 \\
\hline
A-L & $72 \times 44$ & C90& 1 & $2.5 \times 10^{-4}$ & 0.351 & 3.9 & 4 \\
\hline
Icosahedral PIC & 10242 & Y-MP & 1 & $7.5 \times 10^{-4}$ & 0.103 & 26.0 & 5 \\
\hline
\end{tabular}
\caption{Best CPU time and accuracy for Test Case 2}
\end{table}
Table 2 compares the parallel performance of methods. $P$ is the
number of processors used in the computation and $S_P$ is the
parallel speedup with $P$ processors relative to the single-processor time.
If $T_P$ denotes the execution time on $P$ processors, then
$S_P = \frac{T_1}{T_P}$. The parallel efficiency is given by
$E_P = \frac{S_P}{P}$.
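As a worked example of these definitions, take the T42 spectral timings on
the Y-MP from Table 1, $T_1 = 3.5$ and $T_6 = 1.0$ seconds. Treating the
tabulated times as exact (an assumption, since they are rounded measurements),
\[
S_6 = \frac{T_1}{T_6} = \frac{3.5}{1.0} = 3.5, \qquad
E_6 = \frac{S_6}{6} \approx 0.58,
\]
which reproduces the first row of Table 2.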
\begin{table}[htbp]
\begin{tabular}{||rrrrrrrr||} \hline
Algorithm & Resol. & Machine & $P$ & $S_P$ & $E_P$ & Execution & Notes \\
& & & & & & Time (sec) & \\
\hline
Spectral & T42 & Y-MP & 6 & 3.5 & 0.58 & 1.0 & 2 \\
Spectral & T213 & Y-MP & 6 & 5.4 & 0.90 & 130.0 & 2 \\
\hline
Spectral & T21 & iPSC/860 & 64 & 5.6 & 0.08 & 1.37 & 6 \\
Spectral & T42 & iPSC/860 & 128 & 18.4 & 0.14 & 3.92 & 6 \\
Spectral & T85 & iPSC/860 & 128 & 49.6 & 0.39 & 16.9 & 6 \\
\hline
\end{tabular}
\caption{Parallel Performance on Test Case 2}
\end{table}
\section{Notes}
\begin{enumerate}
\item Results of STSWM \cite{Jakob-Hack}. The solution is exactly representable
in the spectral expansion, so the accuracy is not representative. The Y-MP results
were calculated in 64-bit arithmetic.
\item Rudy Jakob's results for multitasked STSWM, reported at the Third CHAMMP
Workshop on Numerical Solution of PDE's in Spherical Geometry.
\item TIG is the twisted icosahedral grid method described in
\cite{Heikes-Randall}. Execution time estimated from 600 sec
timesteps at 0.0284 sec/step on test case 5.
\item Arakawa-Lamb as described in
\cite{Heikes-Randall}. Execution time estimated from 600 sec
timesteps at 0.0284 sec/step on test case 5.
\item The PIC method is applied on an icosahedral grid of 10242
points. 90 timesteps were taken for the 5 day simulation.
Results presented by John Baumgardner at the Third CHAMMP
Workshop on Numerical Solution of PDE's in Spherical Geometry.
\item The Intel iPSC/860 results were computed in 32-bit arithmetic with accuracy
$O(10^{-5})$. The T21 case required 90 timesteps, the T42 case 180,
and the T85 case 360 for the five-day integration.
\end{enumerate}
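The step counts and the execution time estimate quoted in the notes can be
checked with a short calculation. A five-day integration covers
\[
5 \times 86400 = 432000 \mbox{ sec},
\]
so 600 sec timesteps give $432000 / 600 = 720$ steps, and at 0.0284 sec/step
the estimated execution time is
\[
720 \times 0.0284 \approx 20.4 \mbox{ sec},
\]
the value listed for the TIG model in Table 1. By the same arithmetic, and
assuming uniform timesteps (the notes do not state this explicitly), the
90, 180, and 360 steps quoted for the five-day runs correspond to timesteps
of 4800, 2400, and 1200 sec, respectively.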
\section{Literature}
Seven test cases were proposed in \cite{Williamson-Drake}. These
cases collect several tests common in the literature and, in particular,
follow the work in \cite{Browning-Hack-Swarztrauber}. A code to solve
the shallow water equations using the spectral transform method (STSWM) is
described in \cite{Hack-Jakob}. High resolution test case solutions
using the spectral code STSWM are given in \cite{Jakob-Hack}.
The report \cite{Heikes-Randall} compares solutions using an icosahedral
grid twisted to maintain grid symmetry between hemispheres.
Parallel algorithms for the spectral transform are discussed in
\cite{Worley-PartI,Walker-PartII}.
\bibliographystyle{plain}
\bibliography{shallow}
\end{document} % End of document.