
The Fujitsu VPP300 series.

Machine type                   Distributed-memory vector multi-processor
Models                         VX series, VPP300
Operating system               UXP/VPP (a V5.4 based variant of Unix)
Connection structure           Full distributed crossbar
Compilers                      Fortran 90/VP (Fortran 90 Vector compiler), Fortran 90/VPP (Fortran 90 Vector Parallel compiler), C/VP (C Vector compiler), C, C++
Vendor's information Web page  http://www.fujitsu.co.jp/hypertext/Products/Info_process/vpp300/vpp300br.html

System parameters:

Model                      VX                VPP300
Clock cycle                10/7 ns           10/7 ns
Theor. peak performance
  Per proc. (64-bit)       1.6/2.2 Gflop/s   1.6/2.2 Gflop/s
  Maximal (64-bit)         6.4/8.8 Gflop/s   25.6/35.2 Gflop/s
Main memory                <=8 GB            <=32 GB
Memory/node                <=2 GB            <=2 GB
Memory bandwidth/proc.     12.8/18.2 GB/s    12.8/18.2 GB/s
Communication bandwidth    400/570 MB/s      400/570 MB/s
No. of processors          1-4               1-16
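
The paired figures in the table can be cross-checked with a little arithmetic. The sketch below takes the values quoted for the 10 ns clock (the lower value of each pair) and scales them to the 7 ns clock. It assumes that every rate scales linearly with clock frequency; this is an assumption made for illustration, not a statement from the vendor. The function names are hypothetical.

```python
# Sketch: relate the 10 ns and 7 ns figures in the system parameters.
# ASSUMPTION: all rates scale linearly with clock frequency (1 / cycle time).

def scale_to_clock(rate_at_10ns, clock_ns=7.0):
    """Scale a rate quoted for the 10 ns clock to another clock cycle time."""
    return rate_at_10ns * 10.0 / clock_ns


def aggregate_peak(per_proc_gflops, n_procs):
    """Theoretical peak of a system: per-processor peak times processor count."""
    return per_proc_gflops * n_procs


# Per-processor figures for the 10 ns clock, as quoted in the table.
table_10ns = {
    "peak_gflops": 1.6,
    "mem_bw_gb_s": 12.8,
    "comm_bw_mb_s": 400.0,
}

# The same quantities scaled to the 7 ns clock.
scaled_7ns = {name: scale_to_clock(value) for name, value in table_10ns.items()}

for name, value in scaled_7ns.items():
    print(f"{name} at 7 ns: {value:.2f}")

# Maximal configurations: 4 processors (VX) and 16 processors (VPP300).
print("VX, 10 ns:", aggregate_peak(1.6, 4), "Gflop/s")
print("VPP300, 7 ns:", aggregate_peak(2.2, 16), "Gflop/s")
```

Under this assumption the scaled values come out at about 2.29 Gflop/s, 18.29 GB/s and 571 MB/s, slightly above the 2.2 Gflop/s, 18.2 GB/s and 570 MB/s in the table, which suggests that the quoted 7 ns figures were rounded down. The maximal peak figures are simply the per-processor peak multiplied by the number of processors.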

Remarks:

The VPP300 is a successor to the earlier VPP500. It is a much cheaper CMOS implementation of its predecessor, with some important differences. First, a VPX200 front-end system is no longer required. Second, the crossbar that connects the vector nodes is distributed. The cost of a system is therefore scalable: one does not need to buy a complete enclosure with the full crossbar for only a few nodes. The VX series is in fact a smaller version of the VPP300 with a maximum of 4 processors. Both the VX machines and the larger VPP300 systems are air-cooled. The systems are marketed with either a 10 ns or a 7 ns clock.

At this moment the VPP300 is officially available only with up to 16 processors, connected by a direct crossbar. However, it is presumed that larger systems, in which multiple 16-processor machines are connected by a second-level crossbar, will be announced in the first quarter of 1996.

The architecture of the VPP300 nodes is almost identical to that of the VPP500. Each node, called a Processing Element (PE), is a powerful vector processor in its own right, with a peak speed of 2.2 Gflop/s at a 7 ns clock. The vector processor is complemented by a RISC scalar processor with a peak speed of 200 or 285 Mflop/s, depending on the clock speed. The scalar instruction format is 64 bits wide and may cause the execution of three operations in parallel. Each PE has a memory of up to 2 GB and communicates with its fellow PEs at a point-to-point speed of 400 or 570 MB/s. This communication is handled by separate Data Transfer Units (DTUs). To enhance the communication efficiency, the DTU has various transfer modes: contiguous, strided, sub-array, and indirect access. The DTUs also handle the translation of logical to physical PE-ids and of logical in-PE addresses to real addresses. When synchronisation is required, each PE can set its corresponding bit in the Synchronisation Register (SR). The value of the SR is broadcast to all PEs, and synchronisation has occurred when the SR has its bits set for all relevant PEs. This method is comparable to the use of synchronisation registers in shared-memory vector processors and is much faster than synchronising via memory.
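
The synchronisation scheme described above can be illustrated with a toy model. The sketch below is not the hardware interface: the class and method names are hypothetical, and the register is modelled as a plain integer bit mask. It only shows the logic of one bit per PE and the test that all bits of the relevant PEs are set.

```python
class SyncRegister:
    """Toy model of a broadcast synchronisation register, one bit per PE.

    ASSUMPTION: names and interface are invented for illustration only.
    """

    def __init__(self, n_pes):
        self.n_pes = n_pes
        self.bits = 0  # bit i is set once PE i has reached the sync point

    def arrive(self, pe_id):
        """PE `pe_id` sets its own bit; every PE sees the updated value."""
        if not 0 <= pe_id < self.n_pes:
            raise ValueError(f"PE id {pe_id} out of range")
        self.bits |= 1 << pe_id

    def synchronised(self, relevant_pes):
        """True when the bits of all relevant PEs are set."""
        mask = 0
        for pe in relevant_pes:
            mask |= 1 << pe
        return (self.bits & mask) == mask


sr = SyncRegister(n_pes=16)
group = [0, 1, 2, 3]          # the PEs taking part in this synchronisation

for pe in (0, 1, 2):
    sr.arrive(pe)
before_last = sr.synchronised(group)   # PE 3 has not arrived yet

sr.arrive(3)
after_last = sr.synchronised(group)    # all four bits are now set

print(before_last, after_last)
```

Because each PE only has to inspect a register value instead of polling a memory location, a check like `synchronised` amounts to one mask-and-compare operation, which is the reason the text gives for the scheme being faster than synchronising via memory.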

The Fortran compiler that comes with the VPP300 has extensions that enable data decomposition by compiler directives. In many cases this avoids restructuring of the code. The directives differ from those defined in the High Performance Fortran proposal, but it should be easy to adapt them. Furthermore, it is possible to define parallel regions, barriers, etc., via directives, and there are several intrinsic functions to enquire about the number of processors and to execute POST/WAIT commands. A message-passing programming style is also possible by using the PVM or PARMACS communication libraries that are available.
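
To make the idea of data decomposition concrete, the sketch below shows a block distribution of an index range over a number of PEs, which is the kind of bookkeeping that decomposition directives take over from the programmer. It is written in Python and does not reproduce the Fortran 90/VPP directive syntax; the function names and the choice of a block distribution are assumptions made for illustration.

```python
def block_bounds(n, n_pes, pe):
    """Return the half-open index range [lo, hi) owned by PE `pe`.

    The n elements are split into n_pes contiguous blocks; the first
    (n mod n_pes) PEs receive one extra element.
    """
    base, extra = divmod(n, n_pes)
    lo = pe * base + min(pe, extra)
    hi = lo + base + (1 if pe < extra else 0)
    return lo, hi


def partial_sums(data, n_pes):
    """Each (simulated) PE sums only the block of `data` that it owns."""
    sums = []
    for pe in range(n_pes):
        lo, hi = block_bounds(len(data), n_pes, pe)
        sums.append(sum(data[lo:hi]))
    return sums


data = list(range(10))
bounds = [block_bounds(len(data), 4, pe) for pe in range(4)]
local = partial_sums(data, 4)

print(bounds)        # index range owned by each of the 4 PEs
print(sum(local))    # combining the partial results gives the global sum
```

With 10 elements on 4 PEs the first two PEs own three elements each and the last two own two each. In a directive-based style the loop over the data stays as written in the sequential code and the compiler derives such bounds, whereas in a message-passing style with PVM or PARMACS the programmer computes them and exchanges the partial results explicitly.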

Measured Performances: The first VPP300 systems will be delivered in the first quarter of 1996 (initially only with the 10 ns clock). Therefore, no performance figures are available yet.






Jack Dongarra
Sat Feb 10 15:12:38 EST 1996