Collective Communications Module
Applications Programmers Interface
and
User's Guide

Introduction

The Collective Communications Module Applications Programmers Interface (CCMAPI) contains routines for use primarily on distributed memory parallel processing systems. CCMAPI provides most of the functionality of the collective communications routines from MPI and some additional functionality. Also, CCMAPI has a simplified interface, is compatible with Fortran 90 calling semantics, and designed to be used with or without MPI.

This portion of the user's guide provides an introduction and links to information about the routines defined within the API. Also, the links contain simple example applications that use the routines.

CCMAPI supplements existing communications packages. Programs can use CCMAPI and other packages. This portion of the user's guide discusses MPI because there is an obvious link between the functionality provided by MPI collectives and CCMAPI. There are only a few references to particular communication packages in the rest of this document.

This document specifies an applications programmers interface, an API. It does not specify an implementation. A library writer can use MPI, P4, SHMEM, PVM, or Unix sockets to build an implementation of the API. However, there is a MPI reference implementation and one using SHMEM routines. There is also a timing suite and test suite for the API. A tar file with this API is available. These implementations contain notes on building the module from the sources. For questions contact ccm@coherentcognition.com

An HTML version of a slide presentation on this API is also available.

The core routines in this module are:

ccm_init: Initialization routine.
ccm_close: Finalization routine.
ccm_bcast: Broadcast, data is sent from one task to each task.
ccm_barrier: Barrier operation, block until all tasks have reached this routine.
ccm_reduce: Reduces values from all tasks to a single task. Data is sent from every task to one task with an operation, such as, addition, applied to the data while in transit.
ccm_allreduce: Reduction operation with all tasks receiving the final data.
ccm_scatter: Scatter, sends different data from one task to all other tasks.
ccm_scatterv: Scatter operation with different amounts of data going to each task.
ccm_gather: Gathers together values from all tasks to a single task.
ccm_gatherv: Gather operation with different amounts of data from each task.
ccm_alltoall: All tasks exchange the same amount of data.
ccm_alltoallv: All tasks exchange data, possibly different amounts.
ccm_checkin: A special purpose barrier with a timeout feature. Also allows for checking for some deadlock conditions.

Utility routines:

ccm_info: Returns information, the id of the calling task, the number of tasks, and information about supported data types.
ccm_testing: Sets and/or returns the error checking level.
ccm_warning: Sets error conditions and optionally prints a message. Not user callable.
ccm_print_warning: Prints a message.
ccm_clear_warning: Clears warning message buffers.
ccm_error: Prints a message and aborts the program. Not user callable.
ccm_unique: Returns a string that is based on task id. Can be used as a file name.
ccm_time: Returns a time since some point in the past.

Constants and variables:

ccm_reproducible :: integer, parameter: Used to set the algorithm for reduction operations. See ccm_reduce.
ccm_fast :: integer, parameter: Used to set the algorithm for reduction operations. See ccm_reduce.
ccm_auto_print :: logical: If true, print warning messages. See ccm_warning.
ccm_alloff :: integer, parameter: Used to set error checking level. See ccm_testing.
ccm_checksize :: integer, parameter: Used to set error checking level. See ccm_testing.
ccm_deadlock :: integer, parameter: Used to set error checking level. See ccm_testing.
ccm_trace :: integer, parameter: Used to set error checking level. See ccm_testing.
ccm_internal :: integer, parameter: Used to set error checking level. See ccm_testing.

MPI users will recognize some of these routines as being part of that standard. The primary differences are that these routines are:

Designed to be used with Fortran 90. They do not have some of the calling restrictions associated with MPI and Fortran 90.
Designed to be easier to use. They do not need as many calling parameters.
Designed to be used with message passing packages other than MPI. You can mix these routines with SHMEM or Coarray Fortran calls.

The Allocation and Sizes of Arrays

For a call to the module to be compliant with the API, the sum of the number of values sent by tasks sending data must be equal to the sum of the number of values received by tasks receiving data. If a task is sending data then the number of values sent is determined by the size of the input array. The whole array is sent. Likewise, when a task is receiving data the whole output array is overwritten.

This has implications for the sizes of arrays passed to the routines:

ccm_bcast
- The input/output array must be the same size on all tasks
ccm_reduce
- Input arrays must be the same size on all tasks
- Output array on task receiving data must be the same size as the input array
ccm_allreduce
- Input arrays must be the same size on all tasks
- Output array on all tasks must be the same size as the input array
ccm_scatter
- Input array on task sending data must be of size N*(total number of tasks)
- Output array on all tasks must me the same size, N
ccm_gather
- Input array on all tasks must be the same size, N
- Output array on task receiving data must be of size N*(total number of tasks)
ccm_alltoall
- Input array on all tasks must be of size N*(total number of tasks)
- Output array on tasks must be of size N*(total number of tasks)
ccm_scatterv
- On the task sending data, the array ,to_send, must be of size equal to the total number of tasks
- Input array on the task sending data must be of size sum(to_send)
- Output array on task N must be of size equal to the Nth element of to_send.
ccm_gatherv
- On the task receiving data, the array ,to_get, must be of size equal to the total number of tasks
- Output array on the task receiving data must be of size sum(to_send)
- Input array on task N must be of size equal to the Nth element of to_send.
ccm_alltoallv
- On all tasks, the array ,to_get, must be of size equal to the total number of tasks.
- On all tasks, the array ,to_send, must be of size equal to the total number of tasks.
- The input array must be of size sum(to_send).
- The output array must be of size sum(to_get).

If the user sets the ccm_testing flag to "ccm_checksize" then compliance to the requirements discussed above will be checked by the routines. If a requirement is violated an error will be generated and the routine will return without any data being sent. See ccm_testing and the individual routines for error messages. If the flag is not set and a requirement is violated then the routine may generate an unrecoverable error and the program may crash.

It is possible for the size of arrays to be zero. In this case no data is transferred in the call.

However, the API requires that arrays that have the potential to be read or modified by a call be allocated. Thus for ccm_reduce, ccm_scatter, and ccm_scatterv the input array only needs to be allocated on the task that is sending data. For ccm_gather and ccm_gatherv the output array only needs to be allocated on the task receiving data. These arrays can be allocated on all tasks with no effect. If the arrays are not allocated as required then the routine may generate an unrecoverable error and the program may crash.

The following tables summarize the allocation requirements for arrays. P is the number of tasks in the parallel application and N is some integer.

Array Allocation Requirement in Routine		ARRAY
Array Allocation Requirement in Routine		input on root	input on others	output on root	output on others
Routine
	allreduce	na	allocated Size: N	na	allocated Size: N
	alltoall	na	allocated Size: (N*P)	na	allocated Size: (N*P)
	alltoallv	na	allocated Size: varied	na	allocated Size: varied
	bcast	allocated Size: N	allocated Size: N	na*	na*
	gather	allocated Size: N	allocated Size: N	allocated Size: (N*P)	free
	gatherv	allocated Size: varied	allocated Size: varied	allocated Size: varied	free
	reduce	allocated Size: N	allocated Size: N	allocated Size: N	free
	scatter	allocated Size: (N*P)	free	allocated Size: N	allocated Size: N
	scatterv	allocated Size: varied	free	allocated Size: varied	allocated Size: varied

* for bcast input and output arrays are the same

Array Allocation Requirement in Routine		ARRAY
Array Allocation Requirement in Routine		send size on root	send size on others	receive size on root	receive size on others
Routine
	alltoallv	na	allocated Size: P	na	allocated Size: P
	gatherv	na	na	allocated Size: P	free
	scatterv	allocated Size: P	free	na	na

The following arrays are optional. But if they are present they must be allocated as indicated.

Array Allocation Requirement in Routine		ARRAY
Array Allocation Requirement in Routine		send offset on root	send offset on others	receive offset on root	receive offset on others
Routine
	alltoallv	na	allocated Size: P	na	allocated Size: P
	gatherv	na	na	allocated Size: P	free
	scatterv	allocated Size: P	free	na	na

Basic requirements to use these routines.

Ccm_init must be called before any other routines from the API and after the initialization of other communications packages.
Ccm_close must be called before the program exits.
The statement
use ccm
must appear in every routine that calls a subroutine from the API.

If you are new to Fortran 90 you may not be familiar with the "use" statement. The use statement is part of the Fortran 90 standard. A program unit (the main program, a function, or subroutine) can contain a number of use statements. They must be before any data declarations or implicit statements.

The "include" statement is an extension to most implementations of Fortran. In some sense, the use statement is a superset of "include". What is "included" by a use statement is something called a module. A module can contain things like variables, constant (parameter) definitions, functions, subroutines, and interfaces to subroutines. The module "ccm" contains all these.

The interfaces to subroutines are very important for this API. Interfaces allow type checking for arguments to be done at compile time. This promotes program correctness. So, if you had the code


program example
use ccm
real x
...
call ccm_init()
...
...
call ccm_bcast(x)

the compiler would check to see if it is legal to call ccm_bcast with a real value, (obviously it is).

Consider the following expansion on the example


program example
use ccm
double precision bigx
real array(10)
integer i
...
call ccm_init()
...
...
call ccm_bcast(x)
call ccm_bcast(i)
call cmm_bcast(array)

We call ccm_bcast with a real double precision value. Next we call ccm_bcast with an integer. Finally, we call the routine with a real array.

Why is this unusual? Each one of these calls to ccm_bcast is doing the broadcast with different data types and must send different amounts of information. Assume for the moment that a real double precision is 8 bytes, an integer is 4, and a normal real is 4 bytes. For ccm_bcast to work correctly, it must know to send 8 bytes in the first call, 4 bytes in the second call, and 4*10=40 bytes in the third call.

There is an interface for the routine ccm_bcast in the ccm module. The type checking enabled by having an interface for ccm_bcast allows "seeing" the data type inside the subroutine and, thus, the length of the data to be sent. This works with arrays because the lengths of arrays are also available.

The Fortran 90 terminology is that the module provides a "generic subroutine." You can call a generic subroutine with any type allowed by the author of the subroutine. It is up to the author to write the code (the writer of the ccm module here) to handle the various data types passed to the generic subroutine.

Most routines in the ccm module have fewer parameters than the comparable MPI routines. For example, the data sent is the only parameter for a call to bcast. The MPI broadcast routine has six parameters: the actual data, the data type, the amount of data to send, a "root" processor, a "communicator", and an error flag. For ccm_bcast the data type and the amount of data to be sent are determined by the compiler, thus, less prone to error. The root processor and communicator information have a default values applicable for most applications. The user's guide shows methods for overriding defaults and accessing the error flags.

For information on individual routines click on the links given above. The Fortran 90 intrinsic function DATE_AND_TIME is used in many of the examples. A description of the routine can be found here.

[an error occurred while processing this directive]

Acknowledgment, Copyright, and License
This publication (and the related software) made possible throught support porvided by DoD High Performance Computin Modernizaion Program (HPCMP) Programming Environment and Trainig (PET) activities through Mississippi State University under the terms of Contract No. N62306-01-D-7110. Additional information: Mississippi State University Subcontract No. 0606808-01090729-24 Task Order Number: N62306-01-D-7110/0009 Agreement No. N62306-01-D-7110 High Performance Computing Modernization Program Task Number: CE 019 Title: SPMD Collective Communication Module This software, the Collective Communications Module Application Programmers Interface and Reference implementations, are released under the following License. License Copyright (c) 2002 Department of Defense, High Performance Computing Modernization Program Copyright (c) 2002 Coherent Cognition Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: This Software resulted from work developed under a U.S. Government contract. Therefore the Government is granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable worldwide license in this computer software to reproduce, prepare derivative works, and perform publicly and display publicly. The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Acknowledgment, Copyright, and License


         
         
         
This publication (and the related software) made possible throught support
porvided by DoD High Performance Computin Modernizaion Program (HPCMP)
Programming Environment and Trainig (PET) activities through Mississippi
State University under the terms of Contract No. N62306-01-D-7110.

   Additional information:
   
    Mississippi State University Subcontract No. 0606808-01090729-24
    Task Order Number: N62306-01-D-7110/0009
    Agreement No. N62306-01-D-7110
    High Performance Computing Modernization Program Task Number: CE 019
    Title: SPMD Collective Communication Module
    

This software, the Collective Communications Module Application Programmers
Interface and Reference implementations, are released under the following
License.


                               License
                               
Copyright (c) 2002 Department of Defense, High Performance Computing
                   Modernization Program
Copyright (c) 2002 Coherent Cognition


Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

This Software resulted from work developed under a  U.S. Government contract.
Therefore the Government is granted for itself and others acting on its behalf a
paid-up, nonexclusive, irrevocable worldwide license in this computer software
to reproduce, prepare derivative works, and perform publicly and display
publicly.

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.