The Intel Paragon XP.

Machine type	RISC-based distributed-memory multi-processor
Models	Paragon XP/S (MP), XP/E
Operating system	OSF/1, SunMos
Connection structure	2-D mesh (torus)
Compilers	Fortran 77, ADA
Vendor information Web page	http://www.ssd.intel.com/pubs.html

System parameters:

Model	Paragon XP/S	Paragon XP/E
Clock cycle	20 ns	20 ns
Theor. peak performance
Per proc. (64-bit)	75 Mflop/s	75 Mflop/s
64-bit precision	300 Gflop/s	2.1 Gflop/s
Main memory	<=128 GB	<=4.5 GB
Memory/node	<=128 MB	<=128 MB
Communication bandwidth	200 MB/s	200 MB/s
No. of processors	64-4000	4-32
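
The peak figures in the table can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function names are made up for this example, and the processor counts used (4000 for the XP/S, 28 compute nodes for the XP/E) are an inference from the fact that they reproduce the tabulated 300 Gflop/s and 2.1 Gflop/s, not something the table states explicitly.

```python
# Sketch: cross-check the theoretical peak figures from the table.
# Function names are hypothetical; numeric inputs come from the table.

PEAK_PER_PROC_MFLOPS = 75.0  # per-processor 64-bit peak, from the table


def clock_mhz(cycle_ns):
    """Clock frequency in MHz for a given cycle time in nanoseconds."""
    return 1000.0 / cycle_ns


def flops_per_cycle(per_proc_mflops, cycle_ns):
    """Peak floating-point results per clock cycle implied by the table."""
    return per_proc_mflops / clock_mhz(cycle_ns)


def system_peak_gflops(n_procs, per_proc_mflops=PEAK_PER_PROC_MFLOPS):
    """Aggregate peak in Gflop/s: processors times per-processor peak."""
    return n_procs * per_proc_mflops / 1000.0


if __name__ == "__main__":
    print(clock_mhz(20.0))              # 20 ns cycle -> 50 MHz
    print(flops_per_cycle(75.0, 20.0))  # 1.5 results per cycle
    print(system_peak_gflops(4000))     # assumed full XP/S
    print(system_peak_gflops(28))       # assumed XP/E compute nodes only
```

Under these assumptions the XP/S figure corresponds to 4000 processors at 75 Mflop/s each, and the XP/E figure to 28 rather than 32 processors.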

Remarks:

The Paragon is a commercialised offspring of the experimental Touchstone Delta system, which was built for the Concurrent Supercomputing Consortium at CalTech. The Delta system used i860 processors as computational elements in its nodes but, unlike in its predecessor, the iPSC/860, the nodes were arranged not in a hypercube topology but in a 2-D grid (a quite natural topology for the simulation of many physical phenomena, as well as for the solution of linear systems). The Delta system proved to be quite fast for a variety of problems (a speed of 11.9 Gflop/s was reported for a full linear system of order 20,000). The Paragon should do better because of the faster i860/XP processor used in its nodes. In addition, the i860/XP has processor communication hardware on-chip, which raises the communication bandwidth.
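
To make the difference between the two topologies concrete, the following sketch lists the direct neighbours of a node in a hypercube and in a 2-D grid. It is a generic illustration, not Intel's routing scheme: the function names and the node numbering are assumptions, and the `wrap` flag (torus-style wraparound) is included only because the table above lists the connection structure as "2-D mesh (torus)".

```python
# Sketch: direct neighbours of a node in a hypercube versus a 2-D grid.
# Node numbering and function names are assumptions for illustration.


def hypercube_neighbours(node, dim):
    """Neighbours in a dim-dimensional hypercube: flip one address bit."""
    return [node ^ (1 << k) for k in range(dim)]


def mesh_neighbours(row, col, rows, cols, wrap=False):
    """Neighbours of (row, col) in a rows x cols grid.

    With wrap=False the edges are open (plain mesh); with wrap=True
    the indices wrap around, as in a torus.
    """
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if wrap:
            r, c = r % rows, c % cols
        if 0 <= r < rows and 0 <= c < cols:
            result.append((r, c))
    return result


if __name__ == "__main__":
    print(hypercube_neighbours(5, 3))            # three links per node
    print(mesh_neighbours(0, 0, 4, 4))           # corner node, open edges
    print(mesh_neighbours(0, 0, 4, 4, wrap=True))
```

In a hypercube the number of links per node grows with the logarithm of the machine size, whereas in a 2-D grid every node has at most four neighbours regardless of size.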

In November 1993 the Paragon XP/E was introduced. This is an entry-level system with the same characteristics as the XP/S and up to 32 processors. The maximal configuration of the XP/E, the XP/E-28N, has 32 nodes, of which 28 are compute nodes. The others assist with routing, I/O, and other operating system tasks.

The Paragons retain compatibility with the iPSC/860, the Intel hypercube system that preceded them. In particular, the transparent parallel Distributed File System can be used in applications migrated from the iPSC/860. The Paragon also has its own parallel file system.

In 1995 the MP (Multi Processor) node was introduced. In an MP node three i860/XP processors reside on one board and share one address space. The Fortran and C compilers take care of automatic parallelisation within an MP node. The information provided by Intel claims better performance than with single-processor nodes. So far this seems to be consistently, but not spectacularly, true (see Measured Performances).

Measured Performances: As for many systems, results are available for the solution of a large dense linear system. In [4] a speed of 281.1 Gflop/s is reported for a system of order 128,600 on a 6768-node ensemble of XP/S MP systems. No actual systems of this size are in operation; results such as this one are obtained on systems that are put together for the occasion. In [14] results are given for the class B EP, MG, and FT benchmarks: the times obtained on 512 processors were 3.98, 7.01, and 16.17 seconds for the single-node XP, while on an MP-node XP of the same size these times were 2.98, 6.72, and 12.4 seconds, respectively.
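
The gain from the MP nodes can be read off directly from the benchmark times quoted from [14]. The sketch below simply divides the single-node times by the MP-node times; the dictionary and function names are illustrative, and only the six timings come from the text.

```python
# Sketch: ratio of single-node to MP-node times for the class B
# benchmarks on 512 processors. Timings (seconds) are those quoted
# from [14]; all names are hypothetical.

single_node = {"EP": 3.98, "MG": 7.01, "FT": 16.17}
mp_node = {"EP": 2.98, "MG": 6.72, "FT": 12.4}


def speedups(baseline, improved):
    """Return baseline/improved time ratio per benchmark."""
    return {name: baseline[name] / improved[name] for name in baseline}


if __name__ == "__main__":
    for name, ratio in speedups(single_node, mp_node).items():
        print(f"{name}: {ratio:.2f}x")
```

The ratios come out at roughly 1.34 for EP, 1.04 for MG, and 1.30 for FT, in line with the remark that the MP nodes are consistently, but not spectacularly, faster.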

Jack Dongarra
Sat Feb 10 15:12:38 EST 1996