
#### Inner Products

The computation of an inner product of two vectors
is easily parallelized: each processor computes the
inner product of its own segments of the two vectors,
yielding a local inner product (LIP).
On distributed-memory machines the LIPs must then
be sent to other processors
to be combined into the global inner product. This can be done either
with an all-to-all send, after which every processor sums
the LIPs itself, or by a global accumulation in one processor, followed by
a broadcast of the final result.
Either way, this step requires communication.
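
The two combination strategies can be illustrated with a small sketch that simulates the processors inside a single Python process. No real message passing takes place here; the function names (`split_segments`, `local_inner_products`, `combine_all_to_all`, `combine_accumulate_broadcast`) are hypothetical and chosen only for this illustration. Each function returns a list with one entry per simulated processor, holding the value that processor would have after the communication step.

```python
def split_segments(n, p):
    """Divide indices 0..n-1 into p contiguous (start, stop) ranges."""
    base, extra = divmod(n, p)
    bounds, start = [], 0
    for rank in range(p):
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def local_inner_products(x, y, p):
    """Each simulated processor computes the LIP of its own segment."""
    return [
        sum(x[i] * y[i] for i in range(lo, hi))
        for lo, hi in split_segments(len(x), p)
    ]


def combine_all_to_all(lips):
    """All-to-all send: every processor receives all LIPs and sums them itself."""
    p = len(lips)
    return [sum(lips) for _ in range(p)]


def combine_accumulate_broadcast(lips):
    """Global accumulation in one processor, then a broadcast of the result."""
    total = 0
    for lip in lips:            # accumulation on a single processor
        total += lip
    return [total] * len(lips)  # broadcast: every processor gets the same value


x = list(range(1, 11))   # 1, 2, ..., 10
y = [2] * 10
lips = local_inner_products(x, y, 4)
print(lips)                                # [12, 30, 30, 38]
print(combine_all_to_all(lips))            # [110, 110, 110, 110]
print(combine_accumulate_broadcast(lips))  # [110, 110, 110, 110]
```

Both variants leave every processor holding the same global inner product; they differ only in how many messages are exchanged and where the additions are performed.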

For shared memory machines, the accumulation of LIPs can be
implemented as a critical section where all processors add their local
result in turn to the global result, or as a piece of serial
code, where one processor performs the summations.
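
As a rough sketch of these two shared-memory variants, the following uses Python threads in place of processors and a `threading.Lock` to play the role of the critical section. The function names are hypothetical, and the example illustrates only the structure of the accumulation, not its performance.

```python
import threading


def inner_product_critical_section(x, y, num_threads=4):
    """Each thread adds its LIP to a shared result inside a critical section."""
    n = len(x)
    total = [0]                 # shared global result
    lock = threading.Lock()     # guards the critical section

    def worker(rank):
        lo = rank * n // num_threads
        hi = (rank + 1) * n // num_threads
        lip = sum(x[i] * y[i] for i in range(lo, hi))
        with lock:              # only one thread at a time updates the result
            total[0] += lip

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def inner_product_serial_sum(x, y, num_threads=4):
    """Threads store their LIPs; one thread then sums them serially."""
    n = len(x)
    lips = [0] * num_threads    # one slot per thread, so no lock is needed

    def worker(rank):
        lo = rank * n // num_threads
        hi = (rank + 1) * n // num_threads
        lips[rank] = sum(x[i] * y[i] for i in range(lo, hi))

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(lips)            # serial code executed by the calling thread


x = list(range(1, 11))
y = [2] * 10
print(inner_product_critical_section(x, y))  # 110
print(inner_product_serial_sum(x, y))        # 110
```

With floating-point data, the critical-section variant adds the LIPs in whatever order the threads happen to arrive, so its result may differ in the last bits from run to run; the serial variant always sums in the same order.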

Susan Blackford
2000-11-20