next up previous contents index
Next: Simulation of the Up: Irregular Loosely Synchronous Previous: Irregular Loosely Synchronous

12.1 Irregular Loosely Synchronous Problems Are Hard

This chapter contains some of the hardest applications we developed within CP at Caltech. The problems are still ``just'' data-parallel with the natural ``massive'' (i.e., large scale as directly proportional to the number of data elements or problem size) loosely synchronous parallelism summarized in Figure 12.1. However, the irregularity of the problem-both static and dynamic-makes the implementation technically hard. Interestingly, after this hard work, we do find very good speedups, that is, this problem class has as much parallelism as the simpler synchronous problems of Chapters 4 and 6. In fact, one finds that it is in this class of problems that parallel machines most clearly outperform traditional vector supercomputers [Fox:89i;89n;90o]. The (dynamic) irregularity makes the parallelism harder to expose, but it does not remove it; however, the irregularity of a problem can make it impossible to get good performance on (SIMD) vector processors.

  
Figure 12.1: Data Parallelism

The problems contained in this chapter are also typical of the hardest challenges for parallelizing compilers.  These applications are not easy to write in a high-level language, such as High Performance Fortran of Chapter 13, in a way that compilers can efficiently extract the parallelism. This area is one of major research activity with interesting contributions from the groups at Yale [Bhatt:92a] and Stanford [Singh:92a] for the N-body  problem described in Section 12.4.

The applications in this chapter can be summarized as follows:

  1. Sections 12.2, 12.3: Adaptive unstructured meshes-the data structure is an irregular graph-where we use the DIME system of Chapter 10 to cope with the ``complication.'' A domain-specific software tool, rather than a general compiler, has been used.

  2. Section 12.6: Adaptive irregular clusters superimposed on a regular grid. This application has essentially the same parallelization issues as region finding in image processing  [Copty:92a].

  3. Section 12.7: Sorting features an adaptive treelike (hierarchical) data structure.

  4. Sections 12.4, 12.5, 12.8: These combine a hierarchical tree structure for fast calculation of forces with the underlying geometric structure of the physical domain. The applications are either ``pure'' particle simulations or a mix of particle and continuum calculations. In the latter case, they generalize to irregular dynamic problems the particle in the cell  methods described in Section 9.3.

We suggest that Chapters 12, 14, and 18 contain some of those applications which should be studied by computer scientists developing new software tools and parallel languages. This is where the application programmer needs help! We have separated off Chapter 14, as the violation of the loose synchronization  condition in this chapter produces different complications from the dynamic irregularity that characterizes the applications of Chapter 12. Chapter 18 contains compound metaproblems  combining all types of problem structure.



next up previous contents index
Next: Simulation of the Up: Irregular Loosely Synchronous Previous: Irregular Loosely Synchronous



Guy Robinson
Wed Mar 1 10:19:35 EST 1995