Dynamic Process Groups

The dynamic process group functions are built on top of the core PVM routines. A separate library libgpvm3.a must be linked with user programs that make use of any of the group functions. The pvmd does not perform the group functions. This task is handled by a group server that is automatically started when the first group function is invoked. There is some debate about how groups should be handled in a message-passing interface. The issues include efficiency and reliability, and there are tradeoffs between static versus dynamic groups. Some people argue that only tasks in a group can call group functions.

In keeping with the PVM philosophy, the group functions are designed to be very general and transparent to the user, at some cost in efficiency. Any PVM task can join or leave any group at any time without having to inform any other task in the affected groups. Tasks can broadcast messages to groups of which they are not a member. In general, any PVM task may call any of the following group functions at any time. The exceptions are pvm_lvgroup(), pvm_barrier(), and pvm_reduce(), which by their nature require the calling task to be a member of the specified group.

int inum = pvm_joingroup( char *group )
int info = pvm_lvgroup( char *group )
call pvmfjoingroup( group, inum )
call pvmflvgroup( group, info )

These routines allow a task to join or leave a user named group. The first call to pvm_joingroup() creates a group with name group and puts the calling task in this group. pvm_joingroup() returns the instance number (inum) of the process in this group. Instance numbers run from 0 to the number of group members minus 1. In PVM 3, a task can join multiple groups.

If a process leaves a group and then rejoins it, that process may receive a different instance number. Instance numbers are recycled so a task joining a group will get the lowest available instance number. But if multiple tasks are joining a group, there is no guarantee that a task will be assigned its previous instance number.

To assist the user in maintaining a continuous set of instance numbers despite joining and leaving, the pvm_lvgroup() function does not return until the task is confirmed to have left. A pvm_joingroup() called after this return will assign the vacant instance number to the new task. It is the user's responsibility to maintain a contiguous set of instance numbers if the algorithm requires it. If several tasks leave a group and no tasks join, then there will be gaps in the instance numbers.

int tid = pvm_gettid( char *group, int inum )
int inum = pvm_getinst( char *group, int tid )
int size = pvm_gsize( char *group )
call pvmfgettid( group, inum, tid )
call pvmfgetinst( group, tid, inum )
call pvmfgsize( group, size )

The routine pvm_gettid() returns the TID of the process with a given group name and instance number. pvm_gettid() allows two tasks with no knowledge of each other to get each other's TID simply by joining a common group. The routine pvm_getinst() returns the instance number of TID in the specified group. The routine pvm_gsize() returns the number of members in the specified group.

int info = pvm_barrier( char *group, int count )
call pvmfbarrier( group, count, info )

On calling pvm_barrier() the process blocks until count members of a group have called pvm_barrier. In general count should be the total number of members of the group. A count is required because with dynamic process groups PVM cannot know how many members are in a group at a given instant. It is an error for processes to call pvm_barrier with a group it is not a member of. It is also an error if the count arguments across a given barrier call do not match. For example it is an error if one member of a group calls pvm_barrier() with a count of 4, and another member calls pvm_barrier() with a count of 5.

int info = pvm_bcast( char *group, int msgtag )
call pvmfbcast( group, msgtag, info )

pvm_bcast() labels the message with an integer identifier msgtag and broadcasts the message to all tasks in the specified group except itself (if it is a member of the group). For pvm_bcast() ``all tasks'' is defined to be those tasks the group server thinks are in the group when the routine is called. If tasks join the group during a broadcast, they may not receive the message. If tasks leave the group during a broadcast, a copy of the message will still be sent to them.

int info = pvm_reduce(   void (*func)(), void *data,
                          int nitem, int datatype,
                          int msgtag, char *group, int root )
call pvmfreduce(   func, data, count, datatype,
                   msgtag, group, root, info )

pvm_reduce() performs a global arithmetic operation across the group, for example, global sum or global max . The result of the reduction operation appears on root. PVM supplies four predefined functions that the user can place in func. These are

    PvmMax
    PvmMin
    PvmSum
    PvmProduct

The reduction operation is performed element-wise on the input data. For example, if the data array contains two floating-point numbers and func is PvmMax, then the result contains two numbers-the global maximum of each group members first number and the global maximum of each member's second number.

In addition users can define their own global operation function to place in func. See Appendix B for details. An example is given in the source code for PVM. For more information see PVM_ROOT/examples/gexamples.

Note: pvm_reduce() does not block. If a task calls pvm_reduce and then leaves the group before the root has called pvm_reduce, an error may occur.

Next: Program Examples Up: PVM User Interface Previous: Unpacking Data