

Single synthetic benchmarks like Dhrystone and Whetstone have been quite common for small machines for a decade [8]. But because of increasing cache sizes and better optimizing compilers, which are able to detect and eliminate unnecessary code, they have become obsolete.

In the workstation class the SPEC initiative was a quite successful attempt to create a standardised multi-application benchmark [9]. A set of 13 different full size application codes is used to evaluate the capabilities of workstations in a more realistic way. But many problems remain that have to be solved by the SPEC consortium as well as by other benchmark initiatives in the future.

In 1987 a group of supercomputer vendors and university researchers started an effort known as the `Perfect Club' to overcome these problems, especially for the class of vector supercomputers [10]. Because of the complex architecture of these machines, there was great interest in developing a methodology for fair benchmarking that included the possibility of optimizing the codes. Optimization is very important for vector and parallel computers, as their architectures are too different to be treated equally by a single implementation of a given problem. Therefore, the Perfect Club collected a set of 13 full size scientific application codes (Table 3.1) and described a procedure for optimizing the code and reporting results. For vector computers the Perfect Club was quite successful, but because of the large amount of work such an effort requires, it is falling behind in creating a similar benchmark suite for MPPs.


A big problem with benchmarks that include optimization of full size codes is the effort and time needed to do the job properly. This is one of the major reasons why kernel based benchmarks are still very important. In many cases they are also good first approximations of the performance that can be expected from full codes. Kernels typically range in size from about a hundred to a few thousand lines of code.
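To make the idea of a kernel based benchmark concrete, here is a minimal sketch in Python. The kernel (a DAXPY-style vector update), the function names `daxpy` and `time_kernel`, and the problem size are assumptions chosen for illustration; they are not taken from any of the suites discussed in this section. The sketch times a small computational core, counts its floating-point operations, and reports a rate in Mflop/s.

```python
import time


def daxpy(a, x, y):
    """Illustrative kernel: return a*x + y element-wise (2 flops per element)."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def time_kernel(n=100_000, repeats=5):
    """Run the kernel `repeats` times and return (best_seconds, mflops, checksum).

    The best (smallest) time is used to reduce the influence of system noise.
    """
    x = [float(i) for i in range(n)]
    y = [1.0] * n
    best = float("inf")
    result = []
    for _ in range(repeats):
        start = time.perf_counter()
        result = daxpy(2.0, x, y)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Guard against a zero reading on timers with coarse resolution.
    best = max(best, 1e-12)
    # Consume the result so the measured work is actually used and can be verified.
    checksum = sum(result)
    flops = 2 * n  # one multiply and one add per element
    return best, flops / best / 1e6, checksum


if __name__ == "__main__":
    seconds, mflops, checksum = time_kernel()
    print(f"best time: {seconds:.6f} s, rate: {mflops:.1f} Mflop/s, checksum: {checksum}")
```

The checksum plays two roles: it verifies the result, and it ensures the computed values are used, which matters in compiled languages where an optimizer may remove work whose result is never read. A rate measured this way reflects the Python interpreter as much as the hardware, so it only illustrates the method.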

In the eighties the `Livermore Fortran Kernels' were quite popular [11]. The 24 kernels add up to about a thousand lines of Fortran code and were mainly designed to test vector computers. Because of this they are not well suited for MPPs.

In the last two years the `NAS Parallel Benchmark' (NAS PB) has become very widespread [12]. The most important aspect of this benchmark is that it was designed for studying the performance of parallel supercomputers. The eight problems shown in Table 3.2 are specified as ``pencil and paper'' benchmarks, but examples of their implementations are also available. To achieve comparable results, the problem definitions as well as the reporting of results are exactly specified. The latest results may be found in [13]. The great interest the NAS PB has found in the high performance community shows the strong need for such reliable, fair and appropriate benchmarks.


Not too long ago a new initiative, the `Parkbench' initiative, was formed to establish a scientific methodology for benchmarking especially suited for MPP systems [14]. Parkbench will include not only statement and kernel based benchmarks but also full size application code benchmarks. It is an initiative of many scientists working in this field, and the benchmark will include the NAS PB as well as a variant of the LINPACK benchmark. Its major goal is not just to create another suite of benchmarks, but to continue developing the scientific foundation of supercomputer performance evaluation.

Fri Jun 3 11:30:36 MDT 1994