next up previous contents index
Next: 14.3.4 Load Balancing Up: 14.3.3 Parallel Alpha-Beta Pruning Previous: Analysis of Alpha-Beta

Global Hash Table

The central role of the hash table in providing refutations and telling the program when to go parallel makes it clear that the hash table must be shared among all processors. Local hash tables would not work since the complex, dynamically changing organization of processors makes it very unlikely that a processor will search the same region of the tree in two successive levels of iterative deepening. A shared table is expensive on a distributed-memory machine, but in this case it is worth it.

Each processor contributes an equal amount of memory to the shared hash table. The global hash function maps each chess position to a global slot number consisting of a processor ID and a local slot number. Remote memory is accessed by sending a message to the processor in which the desired memory resides. To insure prompt service to remote memory requests, these messages must cause an interrupt on arrival. The VERTEX system does not support this feature, so we implemented a system called generalized signals [Felten:88b], which allows interrupt-time servicing of some messages without disturbing the running program.

When a processor wants to read a remote slot in the hash table, it sends a message containing the local slot number and the 64-bit collision check to the appropriate processor. When this message arrives the receiving processor is interrupted; it updates the staleness flag and sends the contents of the desired slot back to the requesting processor. The processor which made the request waits until the answer comes back before proceeding.

Remote writing is a bit more complicated due to the possibility of collisions. As explained previously, collisions are resolved by a priority scheme; the decision of whether to overwrite the previous entry must be made by the processor which actually owns the relevant memory. Remote writing is accomplished by sending a message containing the new hash table entry to the appropriate processor. This message causes an interrupt on arrival and the receiver examines the new data and the old contents of that hash table slot and decides which one to keep.

Since hash table data is shared among many processors, any access to the hash table must be an atomic operation. This means we must guarantee that two accesses to the same slot cannot happen at the same time. The generalized signals system provides a critical-section protection feature which can be used to queue remote read and write requests while an access is in progress.

Experiments show that the overhead associated with the global hash table is only a few percent, which is a small price to pay for accurate move ordering.



next up previous contents index
Next: 14.3.4 Load Balancing Up: 14.3.3 Parallel Alpha-Beta Pruning Previous: Analysis of Alpha-Beta



Guy Robinson
Wed Mar 1 10:19:35 EST 1995