SparseBench: a sparse iterative benchmark

Jack Dongarra, Victor Eijkhout
Computer Science Department
University of Tennessee
Knoxville, TN 37996-1301, USA
and
Henk van der Vorst
Universiteit Utrecht
Utrecht, the Netherlands


For comments and questions, mail to sparsebench@cs.utk.edu.

About the benchmark

SparseBench is a benchmark suite of iterative methods on sparse data. Sparse matrices, such as those derived from PDEs, form an important problem area in numerical analysis. Unlike dense matrix operations, sparse matrix operations offer little reuse of data: each matrix element is typically used once per matrix-vector product. Algorithms for sparse matrices are therefore bound more by memory speed than by processor speed.
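The memory-bound argument can be made concrete with a back-of-envelope estimate of flops per byte moved. The sketch below is not part of the benchmark; the 8-byte reals, 4-byte indices, and the idealized "every operand is moved exactly once" traffic model are all assumptions chosen for illustration.

```python
# Back-of-envelope arithmetic intensity (flops per byte of memory traffic).
# Assumptions (not from the benchmark): 8-byte reals, 4-byte integer indices,
# and every operand moved between memory and processor exactly once.

def dense_matmat_intensity(n: int) -> float:
    """C = A*B with n-by-n dense matrices: 2*n**3 flops on 3*n**2 reals."""
    flops = 2 * n**3
    bytes_moved = 3 * n**2 * 8
    return flops / bytes_moved


def crs_matvec_intensity(n: int, nnz: int) -> float:
    """y = A*x with A in compressed row storage holding nnz nonzeros."""
    flops = 2 * nnz
    # values and column indices, row pointers, read x once, write y once
    bytes_moved = nnz * (8 + 4) + (n + 1) * 4 + 2 * n * 8
    return flops / bytes_moved


if __name__ == "__main__":
    n = 1000
    print("dense matrix-matrix:", dense_matmat_intensity(n))
    print("sparse matrix-vector:", crs_matvec_intensity(n, 5 * n))
```

Under these assumptions the dense ratio grows linearly with n, while the sparse ratio stays below 1/6 flop per byte no matter how large the problem is.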

This benchmark uses common iterative methods, preconditioners, and storage schemes to evaluate machine performance on typical sparse operations. The benchmark components are:

Conjugate Gradient and GMRES iterative methods,
Jacobi and ILU preconditioners,
diagonal storage and compressed row storage matrices.
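As an illustration of these components, here is a minimal sketch of a matrix-vector product in both storage schemes, and of Conjugate Gradient with a Jacobi preconditioner. It is not the benchmark's own code: the function names, the array layouts, and the small tridiagonal test matrix are assumptions made for this example.

```python
# Illustrative sketch only; names and data layouts are assumptions,
# not the benchmark's implementation.

def crs_matvec(values, col_idx, row_ptr, x):
    """y = A*x; row i owns entries row_ptr[i] .. row_ptr[i+1]-1."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def diag_matvec(diagonals, offsets, x):
    """y = A*x; diagonals[d][i] holds A[i][i + offsets[d]]."""
    n = len(x)
    y = [0.0] * n
    for diag, off in zip(diagonals, offsets):
        # Only rows whose column index i + off stays inside 0..n-1.
        for i in range(max(0, -off), min(n, n - off)):
            y[i] += diag[i] * x[i + off]
    return y


def jacobi_pcg(matvec, diag, b, tol=1e-10, max_iter=100):
    """Conjugate Gradient preconditioned by the main diagonal of A.

    Assumes A is symmetric positive definite; starts from x = 0.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                               # residual b - A*x for x = 0
    z = [r[i] / diag[i] for i in range(n)]    # apply Jacobi preconditioner
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        if sum(ri * ri for ri in r) ** 0.5 <= tol:
            break
        q = matvec(p)
        alpha = rz / sum(p[i] * q[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * q[i] for i in range(n)]
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


# 4-by-4 tridiagonal matrix with 2 on the diagonal and -1 beside it.
VALUES = [2, -1, -1, 2, -1, -1, 2, -1, -1, 2]
COL_IDX = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
ROW_PTR = [0, 2, 5, 8, 10]
DIAGONALS = [[-1] * 4, [2] * 4, [-1] * 4]     # sub-, main, superdiagonal
OFFSETS = [-1, 0, 1]

if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    b = crs_matvec(VALUES, COL_IDX, ROW_PTR, x)
    print(b, diag_matvec(DIAGONALS, OFFSETS, x))
    print(jacobi_pcg(lambda v: crs_matvec(VALUES, COL_IDX, ROW_PTR, v),
                     [2.0] * 4, b))
```

The compressed row product reads the vector through an index array, while the diagonal product uses only contiguous strides. That difference in memory access is what makes the storage scheme worth benchmarking separately.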

Instructions

Download the file benchmark.tgz below.
Unpack it by

gunzip benchmark.tgz
tar -xf benchmark.tar  or  tar -x -f benchmark.tar

Go into the benchmark directory

cd SparseBench

and configure for your architecture

configure

Install the software and test your machine by

Test -m <machine name>

where "machine name" is an arbitrary name for your machine. If you run 'Test' more than once, only the higher numbers are kept.
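One plausible reading of "only the higher numbers are kept" is a per-problem maximum over runs. The sketch below illustrates that reading only. The dictionary layout, the problem names, and the function `merge_runs` are hypothetical and do not describe how 'Test' actually stores its results.

```python
# Hypothetical illustration of keeping the higher result per problem.
# The record layout and names are assumptions, not the Test script's format.

def merge_runs(kept, new_run):
    """Return, per problem name, the higher of the stored and new Mflop rates."""
    merged = dict(kept)
    for problem, rate in new_run.items():
        if rate > merged.get(problem, float("-inf")):
            merged[problem] = rate
    return merged


if __name__ == "__main__":
    first = {"cg-reg": 40.0, "gmres-crs": 30.0}
    second = {"cg-reg": 35.0, "gmres-crs": 33.0, "cg-crs": 20.0}
    print(merge_runs(first, second))
```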

Mail the results back to the benchmark reporting authority by

Report -m <machine name>

You are strongly encouraged to read the files README and install.ps below, which are also part of the full tgz file.

Benchmark results

These are preliminary benchmark results, obtained mostly on computers owned by the Innovative Computing Laboratory (ICL) of the University of Tennessee. All tests report Megaflop rates for code compiled straight out of the box.

First we report the highest rate found for any problem. This was typically attained on a fairly small problem size, the implication being that the whole problem fit into cache.

Highest performance ranking
Processor	Mflop/s
EV6 [a]	759
Power3 [a]	606
EV6 [b]	438
Power3 [b]	331
EV56	262
PPC G4	198
R12000 [a]	155
UltraSparcII [a]	154
Athlon	154
R12000 [b]	108
Origin	106
UltraSparcII [b]	102
PentiumIII	96
LX164	81
UltraSparcII [c]	47

List of machines used
Processor	Machine	Owned by	Compiler options
Athlon	Athlon 600 MHz	R. Clint Whaley	-O
EV56	DEC Alpha, 433 MHz	ICL, University of Tennessee
EV6 [a]		Geophysik, Freie Universitaet Berlin
EV6 [b]	DEC Alpha, 500 MHz	ICL, University of Tennessee
LX164	Alpha, 533 MHz	ICL, University of Tennessee
Origin	SGI Origin, single processor	NCSA	-O
PPC G4	Macintosh at 450 MHz	ICL, University of Tennessee
PentiumIII	Dell, dual 550 MHz	ICL, University of Tennessee
Power3 [a]	IBM Power3, quad 375 MHz	ICL, University of Tennessee
Power3 [b]	IBM Power3, dual 200 MHz	ICL, University of Tennessee
R12000 [a]	SGI Octane, 270 MHz	ICL, University of Tennessee	-O
R12000 [b]	SGI Indigo	ICL, University of Tennessee	-O
UltraSparcII [a]	Sun Enterprise 450 model 1300, single 296 MHz	ICL, University of Tennessee	-O
UltraSparcII [b]	Sun Ultra5	ICL, University of Tennessee	-O
UltraSparcII [c]	Sun Enterprise, 248 MHz	ICL, University of Tennessee

Next we filter the problems by

Iterative method: gmres or cg;
Storage scheme: reg (regular, i.e. diagonal) or crs (compressed row);
Preconditioner: none or ilu (incomplete LU),

and we report the "asymptotic performance", the expected Mflop rate for large problems that overflow the cache. Asymptotic performance is determined by fitting y = a + b/x through the observations, where y is the measured Mflop rate and x is the data set size in Mbytes. As x grows, the term b/x vanishes, so the fitted value of a is the asymptotic rate.
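The fit can be sketched as an ordinary least-squares regression of y on u = 1/x, which turns y = a + b/x into a straight line in u. The source does not say which fitting procedure the benchmark uses, so least squares is an assumption here, and the observations in the example are synthetic, not benchmark measurements.

```python
# Sketch of fitting y = a + b/x by ordinary least squares on u = 1/x.
# The choice of least squares and the sample data are assumptions.

def fit_asymptote(sizes_mb, mflops):
    """Return (a, b) minimizing the sum of (y - a - b/x)**2."""
    u = [1.0 / x for x in sizes_mb]
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(mflops) / n
    suu = sum((ui - mean_u) ** 2 for ui in u)
    suy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, mflops))
    b = suy / suu
    a = mean_y - b * mean_u
    return a, b


if __name__ == "__main__":
    # Synthetic observations generated from y = 40 + 30/x.
    sizes = [1.0, 2.0, 4.0, 8.0, 16.0]
    rates = [40.0 + 30.0 / x for x in sizes]
    a, b = fit_asymptote(sizes, rates)
    print("asymptotic Mflop rate a =", a, " cache-effect term b =", b)
```

The intercept a is the number that would be reported as asymptotic performance. A positive b reflects the higher rates seen on small problems that fit in cache.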

Asymptotic performance on "gmres" problems
Processor	Mflop/s
EV6 [a]	216
Power3 [a]	209
EV6 [b]	168
Power3 [b]	130
R12000 [a]	78
Origin	71
EV56	60
Athlon	44
LX164	40
PentiumIII	39
PPC G4	38
UltraSparcII [a]	37
R12000 [b]	30
UltraSparcII [c]	23
UltraSparcII [b]	23

Asymptotic performance on "cg" problems
Processor	Mflop/s
EV6 [a]	285
Power3 [a]	254
EV6 [b]	198
Power3 [b]	110
Origin	70
UltraSparcII [a]	57
R12000 [a]	52
PPC G4	45
LX164	45
Athlon	43
EV56	40
PentiumIII	37
UltraSparcII [c]	26
UltraSparcII [b]	21
R12000 [b]	19

Asymptotic performance on "reg" problems
Processor	Mflop/s
EV6 [a]	285
Power3 [a]	254
EV6 [b]	198
Power3 [b]	110
R12000 [a]	78
Origin	71
UltraSparcII [a]	57
EV56	55
PPC G4	45
LX164	45
Athlon	43
PentiumIII	37
R12000 [b]	28
UltraSparcII [c]	26
UltraSparcII [b]	21

Asymptotic performance on "crs" problems
Processor	Mflop/s
Power3 [a]	209
EV6 [a]	209
EV6 [b]	166
Power3 [b]	130
Origin	68
R12000 [a]	63
EV56	60
Athlon	44
LX164	40
PentiumIII	39
UltraSparcII [a]	35
R12000 [b]	30
UltraSparcII [c]	23
UltraSparcII [b]	23
PPC G4	23

Asymptotic performance on "none" problems
Processor	Mflop/s
Power3 [a]	215
EV6 [a]	205
EV6 [b]	158
Power3 [b]	88
R12000 [a]	64
Origin	63
UltraSparcII [a]	40
EV56	40
PPC G4	38
LX164	36
Athlon	33
PentiumIII	27
UltraSparcII [c]	26
R12000 [b]	24
UltraSparcII [b]	21

Asymptotic performance on "ilu" problems
Processor	Mflop/s
EV6 [a]	163
EV6 [b]	132
Power3 [a]	120
Power3 [b]	90
R12000 [a]	62
Origin	57
EV56	39
UltraSparcII [a]	34
Athlon	34
LX164	33
PPC G4	31
PentiumIII	27
R12000 [b]	20
UltraSparcII [c]	16
UltraSparcII [b]	15

#########################################################################

file    readme

file    install.ps
file    install.pdf
for     Installation Guide for the Sparse Iterative Benchmark

file    benchmark.tgz
for     Benchmark of Conjugate Gradient methods, using sparse data storage
,	Sparse benchmark, version 0.9.7, released 17 Nov 2000.
,       Questions/comments to sparsebench@cs.utk.edu
by      Jack Dongarra, Victor Eijkhout, Henk van der Vorst

file    bench.ps
file    bench.pdf
for     Details and results of the Sparse Iterative Benchmark

#########################################################################