Machine type: RISC-based distributed-memory multi-processor cluster.
Models: IBM 9076 SP2.
Operating system: AIX (IBM's Unix variant).
Connection structure: Dependent on type of connection (see remarks).
Compilers: XL Fortran, XL C, XL C++.
Note: the figures quoted are for a 64-processor system.
The computational nodes of the SP2 are based on RS/6000 processors with a clock cycle of 15 ns. Because the floating-point units of the SP2 processors can deliver up to 4 results/cycle, this amounts to a peak performance of 266 Mflop/s per node. SP2 configurations are housed in columns, each of which can contain 8--16 processor nodes, depending on the type of node employed: there are two types, thin nodes and wide nodes. Although the processors in these nodes are basically the same, there are some differences. Wide nodes have twice as many MicroChannel slots as thin nodes (8 instead of 4). Furthermore, the maximum memory of a wide node is 2 GB, whereas the maximum for a thin node is 512 MB. More important in terms of performance is the fact that the data cache of a wide node is four times larger than that of a thin node (256 KB instead of 64 KB) and that its memory bus is twice as wide (8 instead of 4 words/cycle). The latter differences explain why a performance gain of a factor of 1.5 has been observed for wide nodes over thin nodes. In the new Thinnode-2 this bandwidth/cache problem should have disappeared.
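The quoted peak performance follows directly from the clock cycle and the number of floating-point results delivered per cycle. A minimal sketch of the arithmetic (variable names are illustrative, not from the source):

```python
# Peak node performance of the SP2 from the figures quoted above.
clock_cycle_s = 15e-9      # 15 ns clock cycle
results_per_cycle = 4      # floating-point results per cycle

clock_rate_hz = 1 / clock_cycle_s               # ~66.7 MHz
peak_flops = results_per_cycle * clock_rate_hz  # 4 results every cycle
peak_mflops = int(peak_flops / 1e6)             # truncated, as quoted

print(f"{peak_mflops} Mflop/s per node")  # prints "266 Mflop/s per node"
```

The same arithmetic scales to a full column: with 16 such nodes, a column peaks at roughly 4.3 Gflop/s.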
IBM envisions the wide node more or less as a server for a column and recommends configurations of one wide node packaged with 14 thin nodes per column (although this may differ with the needs of the user). The SP2 is accessed through a front-end control workstation that also monitors system failures. Failing nodes can be taken off line and exchanged without interrupting service. In addition, fileservers can be connected to the system, while every node can have up to 2 GB. This can greatly speed up applications with significant I/O requirements.