Users should choose a vendor-supplied BLACS optimized for their computer; this will be the fastest BLACS implementation. If no vendor-supplied BLACS exists, users will have to choose among the publicly available BLACS libraries.
Many distributed-memory computers offer several communication libraries. The SP2, for example, offers the MPI, PVM, and MPL communication libraries. Since implementations of the BLACS exist on top of several of these communication libraries, one may have a choice of several different BLACS implementations. On the SP2, for example, the user can run the MPI, MPL, or PVM version of the BLACS.
Unfortunately, no hard rule exists as to which BLACS implementation will be fastest. However, since the BLACS cannot be faster than the communication library upon which it is built, and since the BLACS typically add little overhead, it is usually best to choose the BLACS implementation that is based on the fastest communication library.
Identifying the fastest communication library may not be trivial. The speed of communication libraries may be reported in different ways. Moreover, the speed of blocking sends is what is reported, because they are faster than nonblocking sends, yet the BLACS must use nonblocking sends or provide its own buffering, so the reported figures may overstate what the BLACS can achieve. Those who are using one of the computers listed in this chapter should refer to Tables 5.2 and 5.3 to see which library we used for timing. In our experience, the fastest communication library was the one native to that particular computer.
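One reason that reported speeds are hard to compare is that a single figure (latency alone, or peak bandwidth alone) does not determine message time at every message size. The following sketch is an illustration only: it assumes the common linear model, time = latency + bytes / bandwidth, which is not something the BLACS prescribe. The library names `lib_a`, `lib_b`, `lib_c` and all numbers are hypothetical placeholders, not measurements of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommLibrary:
    """Hypothetical description of a communication library's performance."""
    name: str
    latency_s: float          # startup cost per message, in seconds
    bandwidth_bytes_s: float  # sustained transfer rate, in bytes per second


def message_time(lib: CommLibrary, nbytes: int) -> float:
    """Estimated time to send one message under the assumed linear model."""
    return lib.latency_s + nbytes / lib.bandwidth_bytes_s


def fastest_library(libs, nbytes: int) -> CommLibrary:
    """Return the library with the smallest estimated time for this size."""
    return min(libs, key=lambda lib: message_time(lib, nbytes))


# Placeholder values chosen only to illustrate the comparison;
# replace them with timings measured on the target machine.
candidates = [
    CommLibrary("lib_a", latency_s=40e-6, bandwidth_bytes_s=35e6),
    CommLibrary("lib_b", latency_s=60e-6, bandwidth_bytes_s=40e6),
    CommLibrary("lib_c", latency_s=200e-6, bandwidth_bytes_s=10e6),
]

for nbytes in (8, 1_000_000):
    best = fastest_library(candidates, nbytes)
    print(f"{nbytes:>9} bytes: {best.name} "
          f"({message_time(best, nbytes) * 1e6:.1f} microseconds)")
```

With these placeholder values, the library with the lowest latency wins for the 8-byte message and the library with the highest bandwidth wins for the 1,000,000-byte message, so the comparison is best made at the message sizes the application actually sends, and with timings of the send mode the BLACS would actually use.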