5.2.5 ``Melting''-a Non-crystalline Problem

Up to this point, we had concentrated on the most obvious scientific problems: FFTs, ordinary and partial differential equations, matrices, and so on, all of which were well suited to the lock-step, short-range communication primitives available. Note that some of these, such as the FFT and matrix algorithms, are not strictly ``nearest neighbor'' in the sense of the communication primitives discussed earlier, since they require data to be distributed to nodes more than one step away. These problems, however, are amenable to the ``collective communication'' strategies.

Based on our success with these problems, we began to investigate areas that were not so easily cast into the crystalline methodology. A long-term goal was the support of event-driven simulations, database machines, and transaction-processing systems, which did not appear to be crystalline.

In the shorter term, we wanted to study the physical process of ``melting'' [Johnson:86a] described in Section 14.2. The melting process differs from the applications described thus far in that it inherently involves some indeterminacies: the transition from an ordered solid to a random liquid involves complex and time-varying interactions. In the past, we had solved one such irregular problem, that of N-body gravity [Salmon:86b], by the use of what has since been called the ``long-range-force'' algorithm [Fox:88a]. This is a particularly powerful technique and leads to highly efficient programs that can be implemented with crystalline commands.

The melting problem differs from those suited to the long-range-force algorithm in that the interactions between particles do not extend to infinity, but are localized to some domain whose size depends upon the particular state of the solid/fluid. Consequently, it is very wasteful to use the long-range-force technique, which treats every pair of particles as interacting, yet the randomness of the interactions makes a mapping to a crystalline algorithm difficult (see Figure 5.2).

Figure 5.2: Interprocessor Communication Requirements
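
The waste involved can be illustrated with a small counting exercise. The following is only a sketch under stated assumptions, not code from the original melting study: the function name `count_pairs`, the two-dimensional unit box, the particle count, and the cutoff value are all hypothetical choices made for illustration. It compares the number of pairs an all-pairs method would visit with the number that actually lie within a short interaction range.

```python
import random

def count_pairs(positions, cutoff):
    """Return (all_pairs, pairs_within_cutoff) for a list of 2D points.

    Hypothetical helper for illustration only.
    """
    n = len(positions)
    all_pairs = n * (n - 1) // 2          # what an all-pairs method visits
    near = 0
    for i in range(n):
        xi, yi = positions[i]
        for j in range(i + 1, n):
            xj, yj = positions[j]
            # Compare squared distances to avoid a square root.
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= cutoff ** 2:
                near += 1
    return all_pairs, near

random.seed(0)                             # fixed seed so runs are repeatable
points = [(random.random(), random.random()) for _ in range(200)]
total, near = count_pairs(points, cutoff=0.1)
print("pairs visited by an all-pairs method:", total)
print("pairs actually within the cutoff:   ", near)
```

With a cutoff that is small compared with the box, only a small fraction of the pairs interact, and which pairs those are changes as the particles move. That is the irregularity that resists a fixed, crystalline communication pattern.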

To address these issues effectively, it seemed important to build a communication system that allowed messages to travel between nodes that were not necessarily connected by ``channels,'' yet did not need to involve all nodes collectively.

At this point, an enormous number of issues came up: routing, buffering, queueing, interrupts, and so on. The first cut at solving these problems was a system that never acquired a real name, but was known by the name of its central function, ``rdsort'' [Johnson:85a]. The basic concept was that a message could be sent from any node to any other node, at any time, and the receiving node would have its program interrupted whenever a message arrived. On each interrupt, a user-provided routine called ``rdsort'' was invoked which, as its name implies, had to read, sort, and process the incoming data.
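
The shape of this programming model can be sketched in a few lines. This is a single-process simulation written for illustration, not the original rdsort interface: the `Node` class, the `send` function, the message ``kind'' tag, and the handler signature are all assumptions. Message arrival is modeled as an immediate call to the user's handler, standing in for the interrupt.

```python
from collections import defaultdict

class Node:
    """One simulated processor with a user-supplied message handler."""

    def __init__(self, node_id, handler):
        self.node_id = node_id
        self.handler = handler             # plays the role of the user's rdsort
        self.buckets = defaultdict(list)   # messages sorted by kind

    def deliver(self, source, kind, data):
        # Stand-in for the interrupt: the handler runs as soon as a
        # message arrives, whatever the main program was doing.
        self.handler(self, source, kind, data)

def rdsort(node, source, kind, data):
    """Hypothetical handler: read the message and sort it into a bucket."""
    node.buckets[kind].append((source, data))

def send(nodes, source, dest, kind, data):
    # Any node may send to any other node; no direct channel is assumed.
    nodes[dest].deliver(source, kind, data)

nodes = [Node(i, rdsort) for i in range(8)]
send(nodes, 0, 5, "particle", (0.25, 0.75))
send(nodes, 3, 5, "particle", (0.5, 0.5))
send(nodes, 7, 5, "done", None)
print(dict(nodes[5].buckets))
```

Because this sketch runs the handler synchronously, it hides the hard part of the real system: a true interrupt can arrive while the main program is partway through updating the same data the handler touches, so buffering and mutual exclusion become the programmer's responsibility.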

While simple enough in principle, this programming model was not initially adopted, although it produced an effective solution to the melting problem. To users who came from a number-crunching physics background, the concept of ``interrupts'' was quite alien. Furthermore, the issues of sorting, buffering, mutual exclusion, and so on, raised by the asynchronous nature of the resulting programs, proved hard to code. Without debugging tools, it was extremely hard to develop programs using these techniques. Some of these ideas were taken further by the Reactive Kernel [Seitz:88b] (see Section 16.2), which does not, however, implement ``reaction'' with an interrupt-level handler. The recent development of active messages on the CM-5 has shown the power of the rdsort concepts [Eiken:92a].

Guy Robinson
Wed Mar 1 10:19:35 EST 1995