
The Kendall Square Research KSR1.

Machine type: RISC-based distributed-memory multi-processor.
Models: KSR1.
Operating system: KSR OS (OSF/1).
Connection structure: Hierarchical ring.
Compilers: Fortran 77, ANSI C with extensions.

System parameters:

Performance:

Note: The values for and are for a 128 node system.

The KSR1 system employs proprietary 64-bit processors with a peak speed of 40 Mflop/s per processor. Each node has 32 MB of local memory, called the local cache in KSR terms. The KSR1 is unique in that it is a virtual shared-memory machine: data that are not found in the local cache of a node are routed automatically from the node that holds them, and coherency between caches is maintained automatically. For the user the memory therefore behaves as a single shared data space, although accesses to data that must first be routed to the requesting processor take longer. The total of the local memories together with the supporting virtual shared-memory system is called the ALLCACHE system by KSR. The logical address space is very large: 40-bit addresses are supported.
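To put the per-node figures quoted above in perspective, the following minimal sketch scales them to a whole system. The function names are illustrative only, and two assumptions are made that the text does not state: that 1 MB means 2^20 bytes and that the 40-bit addresses are byte addresses.

```python
# Per-node figures taken from the text.
PEAK_MFLOPS_PER_NODE = 40   # peak speed per processor, in Mflop/s
LOCAL_CACHE_MB = 32         # local cache (local memory) per node, in MB
ADDRESS_BITS = 40           # width of a logical address


def aggregate_peak_gflops(nodes):
    """Theoretical peak of the whole system in Gflop/s (simple sum over nodes)."""
    return nodes * PEAK_MFLOPS_PER_NODE / 1000.0


def total_memory_gb(nodes):
    """Sum of all local caches in GB, assuming 1 GB = 1024 MB."""
    return nodes * LOCAL_CACHE_MB / 1024.0


def address_space_bytes():
    """Size of the logical address space, assuming byte addressing."""
    return 2 ** ADDRESS_BITS


if __name__ == "__main__":
    nodes = 128  # the system size mentioned in the note on the tables
    print(aggregate_peak_gflops(nodes), "Gflop/s peak")
    print(total_memory_gb(nodes), "GB of ALLCACHE memory")
    print(address_space_bytes(), "addressable bytes")
```

Under these assumptions the logical address space is far larger than the physical memory of a 128-node system, which is consistent with the remark that the address space is very large.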

The routing is controlled by the so-called ALLCACHE engines, which form a hierarchical ring network. An ALLCACHE engine first attempts to locate requested data within its local group, ALLCACHE Group level 0 (AG:0 in KSR jargon), by sending the request around the ring. When the request cannot be satisfied there, it is passed one level up to AG:1, and so on. The aggregate communication bandwidth in AG:0 is 2 GB/s, while in AG:1 and higher the aggregate bandwidth can be up to 4 GB/s, thus forming a (not too) fat tree. Apart from the local cache of 32 MB, which in another system would have been called local memory, each node has a 0.5 MB "local sub-cache". Cache refills take place in sub-pages of 128 bytes, and refill times are approximately 2, 20, 150, and 570 cycles for the local sub-cache, local cache, AG:0, and AG:1, respectively.
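The search order and the refill times quoted above can be combined into a rough cost model. The sketch below is not KSR software: the `locate` and `expected_cycles` helpers and the `contents` dictionary are hypothetical, and it assumes that each quoted figure is the total cost of an access satisfied at that level rather than an increment added to the levels below it.

```python
# Approximate refill times in cycles, as quoted in the text.
REFILL_CYCLES = {
    "sub-cache": 2,
    "local cache": 20,
    "AG:0": 150,
    "AG:1": 570,
}

# Search order: nearest level first, then one level up at a time.
LEVELS = ["sub-cache", "local cache", "AG:0", "AG:1"]


def locate(address, contents):
    """Return (level, cycles) for the nearest level that holds `address`.

    `contents` maps a level name to the set of addresses present there;
    it is a hypothetical stand-in for the real ALLCACHE lookup.
    """
    for level in LEVELS:
        if address in contents.get(level, set()):
            return level, REFILL_CYCLES[level]
    raise KeyError(f"address {address!r} not found at any modelled level")


def expected_cycles(fractions):
    """Average access cost for a mix of levels at which accesses are satisfied.

    `fractions` maps level names to the fraction of accesses satisfied
    there; the fractions must sum to 1.
    """
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(share * REFILL_CYCLES[level] for level, share in fractions.items())


if __name__ == "__main__":
    contents = {"sub-cache": {0x10}, "local cache": {0x10, 0x20}, "AG:0": {0x30}}
    print(locate(0x30, contents))
    # An invented access mix, purely for illustration.
    mix = {"sub-cache": 0.90, "local cache": 0.08, "AG:0": 0.015, "AG:1": 0.005}
    print(expected_cycles(mix))
```

With the invented mix shown, in which only 2% of the accesses leave the node, the remote accesses already account for more than half of the average cost of about 8.5 cycles. This illustrates why data placement matters on such a machine even though the memory appears shared.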

The KAP preprocessor from Kuck & Associates is available. By inserting compiler directives it controls the data distribution over the nodes (called "tiling" in KSR terms), in a way that is similar but not identical to the High Performance Fortran Forum proposal.






top500@rz.uni-mannheim.de
Tue Nov 14 15:39:09 PST 1995