SCI

SCI, which stands for Scalable Coherent Interface, is the oldest of the networks discussed here. SCI became an IEEE/ANSI standard by October 1992. It was born as a reaction to the limitations encountered in the bus-based multiprocessor systems of those days: a working group of vendors and universities tried to devise a solution that would do away with these limitations once and for all. A good introduction to the rationale and the design choices for SCI can be found in [15]. Further discussion can be found in [20].
SCI was designed largely to avoid the usual bus limitations, and this resulted in a ring structure to connect the hosts. In fact, pairs of rings are used to enable reverse signaling between a sending and a receiving host. This follows from one of the design principles of SCI: signals flow continuously in one direction, which enables a very high signal rate without noise interference. A consequence is that the ring is always active, sending zero-length payload messages when no actual messages have to be transferred. The payload of a packet has a fixed size of 0, 16, 64, or 256 bytes, the header is 16 or 32 bytes long, and the packet is closed by a 2-byte CRC check code. Because the check code sits at the end of the packet, it is known immediately after reception whether the data are corrupted, and immediate action can be taken. The limited packet format enables fast reception and checking of the packets.
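The packet sizes quoted above can be tabulated with a few lines of Python. This is only an accounting sketch: it assumes that a packet consists of nothing but header, payload, and the 2-byte trailing CRC, and the function names are illustrative rather than taken from the SCI standard.

```python
# Payload and header sizes in bytes, as quoted in the text.
PAYLOAD_SIZES = (0, 16, 64, 256)
HEADER_SIZES = (16, 32)
CRC_SIZE = 2  # trailing CRC


def packet_size(payload: int, header: int) -> int:
    """Total packet length in bytes (assumed: header + payload + CRC)."""
    if payload not in PAYLOAD_SIZES:
        raise ValueError(f"unsupported payload size: {payload}")
    if header not in HEADER_SIZES:
        raise ValueError(f"unsupported header size: {header}")
    return header + payload + CRC_SIZE


def payload_fraction(payload: int, header: int) -> float:
    """Fraction of the packet bytes that carry payload data."""
    return payload / packet_size(payload, header)


if __name__ == "__main__":
    for header in HEADER_SIZES:
        for payload in PAYLOAD_SIZES:
            total = packet_size(payload, header)
            frac = payload_fraction(payload, header)
            print(f"header={header:2d} payload={payload:3d} "
                  f"total={total:3d} payload fraction={frac:.2f}")
```

Under these assumptions only eight packet lengths exist, which is one way to see why reception and checking can be kept simple.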

A special feature of SCI is its ability to keep the caches of the processors it connects coherent. This is also a consequence of its design history: SCI was meant to be able to replace buses in multiprocessor systems, where such buses keep the processor caches up-to-date, or coherent, via a snoopy-bus protocol (see section ccNUMA). So, as with QsNet, one can use it to implement virtual shared memory. Unlike with QsNet, this has indeed been done: the late Convex Exemplar systems used SCI to connect their nodes, as did Data General. Presently the NUMA-Q systems of the former Sequent, now IBM, use it as a memory interconnect medium. In clusters, SCI is always used as just an internode network. Because of the ring structure of SCI, this means that the network is arranged as a 1-D, 2-D, or 3-D torus, as shown in Figure 17.

Figure 17. SCI networks arranged as 1-D, 2-D, and 3-D toruses.
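The torus arrangement can be made concrete with a small sketch. It assumes nodes are labelled by integer coordinates and that each dimension forms one ring that wraps around; the coordinate scheme and the function name are illustrative and not part of SCI itself.

```python
from typing import List, Tuple

Coord = Tuple[int, ...]


def torus_neighbors(node: Coord, dims: Tuple[int, ...]) -> List[Coord]:
    """Ring neighbors of `node` in a torus with the given extent per dimension.

    Every dimension is a ring, so coordinates wrap around modulo the extent.
    """
    neighbors: List[Coord] = []
    for axis, extent in enumerate(dims):
        for step in (-1, 1):
            other = list(node)
            other[axis] = (node[axis] + step) % extent
            candidate = tuple(other)
            # Skip self-links and duplicates, which occur for extents 1 and 2.
            if candidate != node and candidate not in neighbors:
                neighbors.append(candidate)
    return neighbors


if __name__ == "__main__":
    print(torus_neighbors((0,), (4,)))           # 1-D: a single ring
    print(torus_neighbors((0, 0), (4, 4)))       # 2-D: two rings per node
    print(torus_neighbors((1, 2, 3), (4, 4, 4))) # 3-D: three rings per node
```

A node in a k-D torus therefore sits on k rings, with one upstream and one downstream neighbor on each.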

The torus network has some risks in comparison to other networks like the fat tree or the Clos network: when a node adaptor fails, it incapacitates all the nodes that share the ring on which it lies. For this reason Dolphin networks, one of the SCI vendors, provides software that reconfigures the torus in such a way that a minimal number of nodes becomes unavailable. Furthermore, because of the torus topology it is not possible to add or remove an arbitrary number of nodes in the cluster.
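Both risks can be illustrated with a rough sketch. It assumes, as a worst case before any reconfiguration, that a failed adaptor takes down every ring that passes through it, and that a torus stays a full rectangular grid when it grows. The helper names are hypothetical and unrelated to any vendor software.

```python
from math import prod
from typing import Set, Tuple

Coord = Tuple[int, ...]


def nodes_sharing_a_ring(failed: Coord, dims: Tuple[int, ...]) -> Set[Coord]:
    """Other nodes that lie on at least one ring through `failed`.

    The ring along one axis holds all nodes that differ from `failed`
    only in that axis coordinate.
    """
    affected: Set[Coord] = set()
    for axis, extent in enumerate(dims):
        for value in range(extent):
            if value != failed[axis]:
                node = list(failed)
                node[axis] = value
                affected.add(tuple(node))
    return affected


def growth_step(dims: Tuple[int, ...], axis: int) -> int:
    """Nodes needed to extend `axis` by one while keeping a full grid."""
    return prod(dims) // dims[axis]


if __name__ == "__main__":
    print(len(nodes_sharing_a_ring((0,), (8,))))             # 1-D ring of 8
    print(len(nodes_sharing_a_ring((0, 0), (4, 4))))         # 4 x 4 torus
    print(len(nodes_sharing_a_ring((0, 0, 0), (4, 4, 4))))   # 4 x 4 x 4 torus
    print(growth_step((4, 4), 0))                            # one extra row
```

In this model a 4 x 4 torus can only grow by whole rows or columns of 4 nodes, which is the sense in which the node count cannot be changed arbitrarily.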
Reported bandwidths for SCI-based clusters are up to about 320 MB/s for a Ping-Pong experiment over MPI, with very low latencies of 1 to 2 µs for small messages.
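To see how these two figures interact, one can use a simple linear model in which the one-way time is the latency plus the message size divided by the bandwidth. The model itself is an assumption, not a measurement: it treats 320 MB/s as the asymptotic bandwidth, takes 2 µs as the latency, and reads MB as 10^6 bytes.

```python
def transfer_time_us(message_bytes: int,
                     latency_us: float = 2.0,
                     bandwidth_mb_s: float = 320.0) -> float:
    """One-way time in microseconds under a latency + size/bandwidth model."""
    # 1 MB/s equals 1 byte per microsecond when MB means 10**6 bytes.
    return latency_us + message_bytes / bandwidth_mb_s


def effective_bandwidth_mb_s(message_bytes: int,
                             latency_us: float = 2.0,
                             bandwidth_mb_s: float = 320.0) -> float:
    """Bandwidth actually achieved for a message of the given size."""
    return message_bytes / transfer_time_us(message_bytes, latency_us, bandwidth_mb_s)


if __name__ == "__main__":
    for size in (64, 640, 65536, 1_000_000):
        print(f"{size:>9d} bytes: {transfer_time_us(size):10.2f} us, "
              f"{effective_bandwidth_mb_s(size):7.1f} MB/s")
```

In this model half of the asymptotic bandwidth is reached at latency times bandwidth, that is, at 640 bytes, so small messages are dominated by latency rather than by the link speed.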






Aad van der Steen
Tue Oct 12 13:43:30 CEST 2004