In 1984 the company announced the SX-1 and SX-2 and began deliveries in 1985. The first two SX-2 systems were domestic deliveries to Osaka University and the Institute for Computational Fluid Dynamics (ICFD). The SX-2 had multiple pipelines, each with one set of add and multiply floating-point units. With a cycle time of 6 ns, each pipelined floating-point unit could peak at 167 Mflop/s. With four pipelines per unit and two floating-point units, the peak performance was about 1.3 Gflop/s. Because of limited memory bandwidth and other issues, the sustained performance in benchmark tests was typically less than half the peak value. The SX-1 had a slightly longer cycle time (7 ns) than the SX-2 and only half the number of pipelines, giving a maximum execution rate of 570 Mflop/s. At the end of 1987, NEC upgraded its supercomputer family with the A-series, which improved the memory and I/O bandwidth. The top model, the SX-2A, had the same theoretical peak performance as the SX-2. Several low-range models were also announced, but today none of these systems qualifies for the TOP500 list.
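The peak figures quoted above follow from the cycle time and the pipeline counts. The short Python sketch below reproduces that arithmetic. It assumes that every pipeline delivers one floating-point result per cycle; the function name `peak_mflops` and its parameters are illustrative and do not come from NEC documentation.

```python
def peak_mflops(cycle_ns, pipelines_per_unit, fp_units):
    """Theoretical peak in Mflop/s.

    Assumption: each pipeline completes one floating-point result per cycle.
    """
    results_per_second = 1e9 / cycle_ns  # one result per cycle, per pipeline
    return results_per_second * pipelines_per_unit * fp_units / 1e6


# SX-2: 6 ns cycle, four pipelines per unit, two units (add and multiply).
sx2 = peak_mflops(6.0, 4, 2)

# SX-1: 7 ns cycle, half the number of pipelines of the SX-2.
sx1 = peak_mflops(7.0, 2, 2)

print(f"SX-2 peak: {sx2:.0f} Mflop/s")
print(f"SX-1 peak: {sx1:.0f} Mflop/s")
```

Under these assumptions the sketch gives roughly 1333 Mflop/s for the SX-2 and 571 Mflop/s for the SX-1, in line with the quoted values of about 1.3 Gflop/s and 570 Mflop/s.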
In 1989 NEC announced a rather aggressive new model, the SX-3, with several important changes. The vector cycle time was brought down to 2.9 ns and the number of pipelines was doubled, but most significantly NEC added multiprocessing capability to its new series. The new top of the range featured four independent arithmetic processors (each with a scalar and a vector processing unit), and NEC pushed its performance up by more than one order of magnitude to an impressive peak of 22 Gflop/s (up from 1.33 Gflop/s on the SX-2A). The combination of these features put the SX-3 at the top of the list of the world's most powerful vector processors. The total memory bandwidth was divided into two halves, each of which featured two vector load paths and one vector store path per pipeline set, as well as one scalar load path and one scalar store path. This gave a total memory bandwidth to the vector units of about 66 GB/s. Like its predecessors, the SX-3 was therefore unable to offer the memory bandwidth needed to sustain peak performance unless most operands were contained in the vector registers. In 1992 NEC announced the SX-3R with a couple of improvements over the first version. The clock cycle was further reduced to 2.5 ns, so that the peak performance increased to 6.4 Gflop/s per processor.
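The SX-3 figures can be checked with the same kind of arithmetic. The sketch below is an illustration under stated assumptions, not NEC's own specification: it reads "the number of pipelines was doubled" as eight pipelines per floating-point unit with two units per processor, assumes one result per pipeline per cycle, and treats the 66 GB/s as a system-wide figure. The three-word-per-flop demand used for comparison is a hypothetical memory-to-memory operation of the form `a[i] = b[i] + c[i]` on 8-byte operands.

```python
def peak_gflops(cycle_ns, pipelines_per_unit, fp_units, processors=1):
    """Theoretical peak in Gflop/s (assumes one result per pipeline per cycle)."""
    return (1.0 / cycle_ns) * pipelines_per_unit * fp_units * processors


# Assumed configuration: 8 pipelines per unit, 2 units, 4 processors.
sx3_system = peak_gflops(2.9, 8, 2, processors=4)
sx3r_cpu = peak_gflops(2.5, 8, 2)

# Memory balance: bytes the memory system can deliver per peak flop.
bandwidth_gb_s = 66.0
bytes_per_flop = bandwidth_gb_s / sx3_system

# Hypothetical demand: two loads and one store of 8-byte words per flop.
demand_bytes_per_flop = 3 * 8

print(f"SX-3 peak (4 processors): {sx3_system:.1f} Gflop/s")
print(f"SX-3R peak per processor: {sx3r_cpu:.1f} Gflop/s")
print(f"Available: {bytes_per_flop:.1f} bytes/flop, "
      f"demand without register reuse: {demand_bytes_per_flop} bytes/flop")
```

With these assumptions the peak values come out at about 22 Gflop/s for the four-processor SX-3 and 6.4 Gflop/s per SX-3R processor, while memory supplies only about 3 bytes per peak flop. That gap is why operands have to be reused from the vector registers to approach peak performance.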