|Machine type||RISC-based distributed-memory multi-processor|
|Models||Paragon XP/S (MP), XP/E|
|Operating system||OSF/1, SunMos|
|Connection structure||2-D mesh (torus)|
|Compilers||Fortran 77, ADA|
|Vendor's information Web page||http://www.ssd.intel.com/pubs.html|
|Model||Paragon XP/S||Paragon XP/E|
|Clock cycle||20 ns||20 ns|
|Theor. peak performance|
|Per Proc. (64-bits)||75 Mflop/s||75 Mflop/s|
|64-bits precision||300 Gflop/s||2.1 Gflop/s|
|Main memory||<=128 GB||<=4.5 GB|
|Memory/node||<=128 MB||<=128 MB|
|Communication bandwidth||200 MB/s||200 MB/s|
|No. of processors||64-4000||4-32|
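The aggregate peak figures in the table can be checked against the per-processor peak with a little arithmetic. The sketch below is only illustrative: the function names are hypothetical, and the reading that the XP/E figure counts only the 28 compute nodes of its largest configuration is an inference from the numbers, not something the vendor data states.

```python
# Values taken from the table above.
PEAK_PER_PROC_MFLOPS = 75.0   # theoretical peak per processor (64-bit)
CLOCK_CYCLE_NS = 20.0         # clock cycle in nanoseconds


def clock_mhz(cycle_ns):
    """Convert a clock cycle in ns to a clock frequency in MHz."""
    return 1000.0 / cycle_ns


def aggregate_peak_gflops(n_procs, per_proc_mflops=PEAK_PER_PROC_MFLOPS):
    """Aggregate theoretical peak in Gflop/s for n_procs processors."""
    return n_procs * per_proc_mflops / 1000.0


print(clock_mhz(CLOCK_CYCLE_NS))                         # 50.0 (MHz)
print(PEAK_PER_PROC_MFLOPS / clock_mhz(CLOCK_CYCLE_NS))  # 1.5 (flops per cycle)
print(aggregate_peak_gflops(4000))                       # 300.0 (largest XP/S)
print(aggregate_peak_gflops(28))                         # 2.1 (XP/E, assuming 28 compute nodes)
```

Under these assumptions the 300 Gflop/s entry corresponds to 4000 processors at 75 Mflop/s each, and the 2.1 Gflop/s entry to 28 processors.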
The Paragon is a commercialised offspring of the experimental Touchstone Delta system, which was built for the Concurrent Supercomputing Consortium at CalTech. The Delta system used i860 processors as computational elements in its nodes but, unlike its predecessor, the iPSC/860, its nodes were arranged not in a hypercube topology but in a 2-D grid (a quite natural topology for many physical simulations as well as for the solution of linear systems). The Delta system proved to be quite fast for a variety of problems (a speed of 11.9 Gflop/s was reported for a full linear system of order 20,000). The Paragon should do better because of the faster i860/XP processor used in its nodes. In addition, the i860/XP has processor communication hardware on-chip, which raises the communication bandwidth.
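The difference between the two topologies can be made concrete with a small sketch. It assumes a plain 2-D mesh without wraparound links and a hypercube whose nodes are addressed by bit strings. The function names and the 16 x 32 example shape are hypothetical and are not taken from any actual Delta or Paragon configuration.

```python
def mesh_neighbors(row, col, rows, cols):
    """Neighbours of node (row, col) in a rows x cols mesh without wraparound."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def hypercube_neighbors(node, dim):
    """Neighbours of a node in a dim-dimensional hypercube: flip one address bit."""
    return [node ^ (1 << bit) for bit in range(dim)]


def mesh_diameter(rows, cols):
    """Largest hop count between two nodes of the mesh (corner to corner)."""
    return (rows - 1) + (cols - 1)


def hypercube_diameter(dim):
    """Largest hop count in a hypercube: one hop per differing address bit."""
    return dim


# Two hypothetical 512-node machines: a 16 x 32 mesh and a 9-dimensional hypercube.
print(len(mesh_neighbors(5, 5, 16, 32)), mesh_diameter(16, 32))   # 4 46
print(len(hypercube_neighbors(0, 9)), hypercube_diameter(9))      # 9 9
```

In the mesh, each interior node talks to four nearest neighbours regardless of machine size, which matches the communication pattern of grid-based simulations. The hypercube has a much smaller diameter, but the number of links per node grows with the machine size.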
In November 1993 the Paragon XP/E was introduced. This is an entry-level system with the same characteristics as the XP/S and up to 32 processors. The maximal configuration of the XP/E, the XP/E-28N, has 32 nodes, of which 28 are compute nodes. The others assist with routing, I/O, and other operating system tasks.
The Paragons retain compatibility with the iPSC/860, the Intel hypercube system that preceded them. In particular, the transparent parallel Distributed File System can be used in applications migrated from the iPSC/860. The Paragon also has its own parallel file system.
In 1995 the MP (Multi Processor) node was introduced. In an MP node three i860/XP processors reside on one board and share one address space. The Fortran and C compilers take care of automatic parallelisation within an MP node. Intel's information claims better performance than with single-processor nodes. So far this seems consistently, but not spectacularly, true (see Measured Performances).
Measured Performances: As for many systems, results are available for the solution of a large dense linear system. A speed of 281.1 Gflop/s is reported for a system of size 128,600 on a 6768-node ensemble of XP/S MP systems. No actual systems of this size are in operation; results like this one are obtained on systems put together for the occasion. For the class B EP, MG, and FT benchmarks, the times obtained on 512 processors were 3.98, 7.01, and 16.17 seconds for the single-node XP, while on the MP-node XP of the same size these times were 2.98, 6.72, and 12.4 seconds, respectively.
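To put the "consistent but not spectacular" gain of the MP node into numbers, the benchmark times quoted above can be turned into speedup ratios. This is a minimal sketch using only the six timings from the text; the variable and function names are hypothetical.

```python
# Class B benchmark times in seconds on 512 processors, as quoted in the text.
single_node = {"EP": 3.98, "MG": 7.01, "FT": 16.17}
mp_node = {"EP": 2.98, "MG": 6.72, "FT": 12.4}


def speedups(base, improved):
    """Ratio of base time to improved time for each benchmark."""
    return {name: base[name] / improved[name] for name in base}


for name, ratio in speedups(single_node, mp_node).items():
    print(f"{name}: {ratio:.2f}x")
# EP: 1.34x
# MG: 1.04x
# FT: 1.30x
```

On these figures the MP node is faster in all three cases, by roughly 4% to 34%.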