source: proiecte/hpl/openmpi_compiled/share/man/man3/MPI_Reduce.3 @ 97

.\"Copyright 2006-2008 Sun Microsystems, Inc.
.\" Copyright (c) 1996 Thinking Machines Corporation
.TH MPI_Reduce 3 "Dec 08, 2009" "1.4" "Open MPI"
.SH NAME
\fBMPI_Reduce\fP \- Reduces values on all processes within a group.

.SH SYNTAX
.ft R
.SH C Syntax
.nf
#include <mpi.h>
int MPI_Reduce(void *\fIsendbuf\fP, void *\fIrecvbuf\fP, int\fI count\fP,
        MPI_Datatype\fI datatype\fP, MPI_Op\fI op\fP, int\fI root\fP, MPI_Comm\fI comm\fP)

.SH Fortran Syntax
.nf
INCLUDE 'mpif.h'
MPI_REDUCE(\fISENDBUF, RECVBUF, COUNT, DATATYPE, OP, ROOT, COMM,
                IERROR\fP)
        <type>  \fISENDBUF(*), RECVBUF(*)\fP
        INTEGER \fICOUNT, DATATYPE, OP, ROOT, COMM, IERROR\fP

.SH C++ Syntax
.nf
#include <mpi.h>
void MPI::Intracomm::Reduce(const void* \fIsendbuf\fP, void* \fIrecvbuf\fP,
        int \fIcount\fP, const MPI::Datatype& \fIdatatype\fP, const MPI::Op& \fIop\fP,
        int \fIroot\fP) const

.SH INPUT PARAMETERS
.ft R
.TP 1i
sendbuf
Address of send buffer (choice).
.TP 1i
count
Number of elements in send buffer (integer).
.TP 1i
datatype
Data type of elements of send buffer (handle).
.TP 1i
op
Reduce operation (handle).
.TP 1i
root
Rank of root process (integer).
.TP 1i
comm
Communicator (handle).

.SH OUTPUT PARAMETERS
.ft R
.TP 1i
recvbuf
Address of receive buffer (choice, significant only at root).
.ft R
.TP 1i
IERROR
Fortran only: Error status (integer).

.SH DESCRIPTION
.ft R
The global reduce functions (MPI_Reduce, MPI_Op_create, MPI_Op_free, MPI_Allreduce, MPI_Reduce_scatter, MPI_Scan) perform a global reduce operation (such as sum, max, logical AND, etc.) across all the members of a group. The reduction operation can be either one of a predefined list of operations, or a user-defined operation. The global reduction functions come in several flavors: a reduce that returns the result of the reduction at one node, an all-reduce that returns this result at all nodes, and a scan (parallel prefix) operation. In addition, a reduce-scatter operation combines the functionality of a reduce and a scatter operation.
.sp
MPI_Reduce combines the elements provided in the input buffer of each process in the group, using the operation op, and returns the combined value in the output buffer of the process with rank root. The input buffer is defined by the arguments sendbuf, count, and datatype; the output buffer is defined by the arguments recvbuf, count, and datatype; both have the same number of elements, with the same type. The routine is called by all group members using the same arguments for count, datatype, op, root, and comm. Thus, all processes provide input buffers and output buffers of the same length, with elements of the same type. Each process can provide one element, or a sequence of elements, in which case the combine operation is executed element-wise on each entry of the sequence. For example, if the operation is MPI_MAX and the send buffer contains two elements that are floating-point numbers (count = 2 and datatype = MPI_FLOAT), then recvbuf(1) = global max(sendbuf(1)) and recvbuf(2) = global max(sendbuf(2)).
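The element-wise rule can be illustrated without MPI. The following is a minimal sequential sketch, not part of Open MPI: model_reduce_max is a hypothetical helper, and sendbufs[p] merely stands in for the send buffer that process p would pass to MPI_Reduce with op = MPI_MAX.

```c
/* Sequential model of an element-wise MPI_MAX reduction.
 * Assumption: sendbufs[p] holds the `count` floats that process p
 * would supply; a real program would call MPI_Reduce instead. */
void model_reduce_max(const float *const *sendbufs, int nprocs,
                      int count, float *recvbuf)
{
    for (int k = 0; k < count; ++k) {
        /* Entry k of the result depends only on entry k of each input. */
        float best = sendbufs[0][k];
        for (int p = 1; p < nprocs; ++p) {
            if (sendbufs[p][k] > best)
                best = sendbufs[p][k];
        }
        recvbuf[k] = best;
    }
}
```

With three modeled processes supplying {1, 9}, {4, 2}, and {3, 5}, the modeled receive buffer ends up as {4, 9}: each position is reduced independently of the others.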
.sp
.SH USE OF IN-PLACE OPTION
When the communicator is an intracommunicator, you can perform a reduce operation in-place (the output buffer is used as the input buffer).  Use the variable MPI_IN_PLACE as the value of \fIsendbuf\fR at the root process.  In this case, the input data is taken at the root from the receive buffer, where it will be replaced by the output data.
.sp
Note that MPI_IN_PLACE is a special kind of value; it has the same restrictions on its use as MPI_BOTTOM.
.sp
Because the in-place option converts the receive buffer into a send-and-receive buffer, a Fortran binding that includes INTENT must mark the receive buffer as INOUT, not OUT.
.sp
.SH WHEN COMMUNICATOR IS AN INTER-COMMUNICATOR
.sp
When the communicator is an inter-communicator, the root process in the first group combines data from all the processes in the second group and then performs the \fIop\fR operation.  The first group defines the root process.  That process uses MPI_ROOT as the value of its \fIroot\fR argument.  The remaining processes use MPI_PROC_NULL as the value of their \fIroot\fR argument.  All processes in the second group use the rank of that root process in the first group as the value of their \fIroot\fR argument.  Only the send buffer arguments are significant in the second group, and only the receive buffer arguments are significant in the root process of the first group.
.sp
.SH PREDEFINED REDUCE OPERATIONS
.sp
The set of predefined operations provided by MPI is listed below (Predefined Reduce Operations). That section also enumerates the datatypes each operation can be applied to. In addition, users may define their own operations that can be overloaded to operate on several datatypes, either basic or derived. This is further explained in the description of the user-defined operations (see the man pages for MPI_Op_create and MPI_Op_free).
.sp
The operation op is always assumed to be associative. All predefined operations are also assumed to be commutative. Users may define operations that are assumed to be associative, but not commutative. The ``canonical'' evaluation order of a reduction is determined by the ranks of the processes in the group. However, the implementation can take advantage of associativity, or associativity and commutativity, in order to change the order of evaluation. This may change the result of the reduction for operations that are not strictly associative and commutative, such as floating point addition.
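To see why the evaluation order matters for floating-point addition, consider the following small sketch. The helper names sum_left and sum_right are hypothetical and only model two possible groupings of the same three operands; they do not reflect how any particular MPI implementation orders a reduction.

```c
/* Two groupings of the same three-operand sum.  In exact arithmetic
 * they agree; in floating point they may not. */

/* (a + b) + c */
float sum_left(float a, float b, float c)
{
    float t = a + b;
    return t + c;
}

/* a + (b + c) */
float sum_right(float a, float b, float c)
{
    float t = b + c;
    return a + t;
}
```

For a = 1e20f, b = -1e20f, c = 1.0f, sum_left yields 1.0 because the large terms cancel first, while sum_right yields 0.0 because 1.0 is absorbed when added to -1e20f. A reduction whose grouping differs between runs or process counts can therefore return slightly different sums.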
.sp
Predefined operators work only with the MPI types listed below (Predefined Reduce Operations, and the section MINLOC and MAXLOC, below).  User-defined operators may operate on general, derived datatypes. In this case, each argument that the reduce operation is applied to is one element described by such a datatype, which may contain several basic values. This is further explained in Section 4.9.4 of the MPI Standard, "User-Defined Operations."

The following predefined operations are supplied for MPI_Reduce and related functions MPI_Allreduce, MPI_Reduce_scatter, and MPI_Scan. These operations are invoked by placing the following in op:
.sp
.nf
        Name                Meaning
     ---------           --------------------
        MPI_MAX             maximum
        MPI_MIN             minimum
        MPI_SUM             sum
        MPI_PROD            product
        MPI_LAND            logical and
        MPI_BAND            bit-wise and
        MPI_LOR             logical or
        MPI_BOR             bit-wise or
        MPI_LXOR            logical xor
        MPI_BXOR            bit-wise xor
        MPI_MAXLOC          max value and location
        MPI_MINLOC          min value and location
.fi
.sp
The two operations MPI_MINLOC and MPI_MAXLOC are discussed separately below (MINLOC and MAXLOC). For the other predefined operations, we enumerate below the allowed combinations of op and datatype arguments. First, define groups of MPI basic datatypes in the following way:
.sp
.nf
        C integer:            MPI_INT, MPI_LONG, MPI_SHORT,
                              MPI_UNSIGNED_SHORT, MPI_UNSIGNED,
                              MPI_UNSIGNED_LONG
        Fortran integer:      MPI_INTEGER
        Floating-point:       MPI_FLOAT, MPI_DOUBLE, MPI_REAL,
                              MPI_DOUBLE_PRECISION, MPI_LONG_DOUBLE
        Logical:              MPI_LOGICAL
        Complex:              MPI_COMPLEX
        Byte:                 MPI_BYTE
.fi
.sp
Now, the valid datatypes for each operation are specified below.
.sp
.nf
        Op                              Allowed Types
     ----------------         ---------------------------
        MPI_MAX, MPI_MIN                C integer, Fortran integer,
                                                floating-point

        MPI_SUM, MPI_PROD               C integer, Fortran integer,
                                                floating-point, complex

        MPI_LAND, MPI_LOR,              C integer, logical
        MPI_LXOR

        MPI_BAND, MPI_BOR,              C integer, Fortran integer, byte
        MPI_BXOR
.fi
.sp
\fBExample 1:\fR A routine that computes the dot product of two vectors that are distributed across a group of processes and returns the answer at process zero.
.sp
.nf
    SUBROUTINE PAR_BLAS1(m, a, b, c, comm)
    INCLUDE 'mpif.h'
    INTEGER m, comm, i, ierr
    REAL a(m), b(m)       ! local slice of array
    REAL c                ! result (at process zero)
    REAL sum

    ! local sum
    sum = 0.0
    DO i = 1, m
       sum = sum + a(i)*b(i)
    END DO

    ! global sum
    CALL MPI_REDUCE(sum, c, 1, MPI_REAL, MPI_SUM, 0, comm, ierr)
    RETURN
    END
.fi
.sp
\fBExample 2:\fR A routine that computes the product of a vector and an array that are distributed across a group of processes and returns the answer at process zero.
.sp
.nf
    SUBROUTINE PAR_BLAS2(m, n, a, b, c, comm)
    INCLUDE 'mpif.h'
    INTEGER m, n, comm, i, j, ierr
    REAL a(m), b(m,n)    ! local slice of array
    REAL c(n)            ! result
    REAL sum(n)

    ! local sum
    DO j = 1, n
      sum(j) = 0.0
      DO i = 1, m
        sum(j) = sum(j) + a(i)*b(i,j)
      END DO
    END DO

    ! global sum
    CALL MPI_REDUCE(sum, c, n, MPI_REAL, MPI_SUM, 0, comm, ierr)

    ! return result at process zero (and garbage at the other nodes)
    RETURN
    END
.fi

.SH MINLOC AND MAXLOC
.ft R
The operator MPI_MINLOC is used to compute a global minimum and also an index attached to the minimum value. MPI_MAXLOC similarly computes a global maximum and index. One application of these is to compute a global minimum (maximum) and the rank of the process containing this value.

.sp
The operation that defines MPI_MAXLOC is
.sp
.nf
         ( u )    ( v )      ( w )
         (   )  o (   )   =  (   )
         ( i )    ( j )      ( k )

where

    w = max(u, v)

and

         ( i            if u > v
         (
   k   = ( min(i, j)    if u = v
         (
         ( j            if u < v


MPI_MINLOC is defined similarly:

         ( u )    ( v )      ( w )
         (   )  o (   )   =  (   )
         ( i )    ( j )      ( k )

where

    w = min(u, v)

and

         ( i            if u < v
         (
   k   = ( min(i, j)    if u = v
         (
         ( j            if u > v


.fi
.sp

Both operations are associative and commutative. Note that if MPI_MAXLOC is
applied to reduce a sequence of pairs (u(0), 0), (u(1), 1),\ ..., (u(n-1),
n-1), then the value returned is (u, r), where u is the maximum of u(i)
over all i and r is the index of the first global maximum in the sequence.
Thus, if each process supplies a value and its rank within the group, then
a reduce operation with op = MPI_MAXLOC will return the maximum value and
the rank of the first process with that value. Similarly, MPI_MINLOC can be
used to return a minimum and its index. More generally, MPI_MINLOC computes
a lexicographic minimum, where elements are ordered according to the first
component of each pair, and ties are resolved according to the second
component.
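The pairwise rule can be written as an ordinary C function. This is an illustrative sketch only: pair_t, maxloc_combine, and minloc_combine are hypothetical names, and the code models the combine step on a single process rather than calling MPI. It assumes the usual definition in which MPI_MAXLOC keeps the larger value, MPI_MINLOC keeps the smaller value, and ties go to the smaller index.

```c
/* Hypothetical value/index pair, mirroring the (value, index)
 * arguments that MPI_MAXLOC and MPI_MINLOC operate on. */
typedef struct {
    double val;
    int    idx;
} pair_t;

/* Keep the larger value; on a tie keep the smaller index. */
pair_t maxloc_combine(pair_t a, pair_t b)
{
    if (a.val > b.val) return a;
    if (a.val < b.val) return b;
    pair_t r = { a.val, (a.idx < b.idx) ? a.idx : b.idx };
    return r;
}

/* Keep the smaller value; on a tie keep the smaller index. */
pair_t minloc_combine(pair_t a, pair_t b)
{
    if (a.val < b.val) return a;
    if (a.val > b.val) return b;
    pair_t r = { a.val, (a.idx < b.idx) ? a.idx : b.idx };
    return r;
}
```

Folding the pairs (3, 0), (7, 1), (7, 2) with maxloc_combine gives (7, 1): the maximum value together with the first index at which it occurs.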
.sp
The reduce operation is defined to operate on arguments that consist of a
pair: value and index. For both Fortran and C, types are provided to
describe the pair. The potentially mixed-type nature of such arguments is a
problem in Fortran. The problem is circumvented, for Fortran, by having the
MPI-provided type consist of a pair of the same type as value, and coercing
the index to this type also. In C, the MPI-provided pair type has distinct
types and the index is an int.
.sp
In order to use MPI_MINLOC and MPI_MAXLOC in a reduce operation, one must
provide a datatype argument that represents a pair (value and index). MPI
provides nine such predefined datatypes. The operations MPI_MAXLOC and
MPI_MINLOC can be used with each of the following datatypes:
.sp
.nf
    Fortran:
    Name                     Description
    MPI_2REAL                pair of REALs
    MPI_2DOUBLE_PRECISION    pair of DOUBLE-PRECISION variables
    MPI_2INTEGER             pair of INTEGERs

    C:
    Name                     Description
    MPI_FLOAT_INT            float and int
    MPI_DOUBLE_INT           double and int
    MPI_LONG_INT             long and int
    MPI_2INT                 pair of ints
    MPI_SHORT_INT            short and int
    MPI_LONG_DOUBLE_INT      long double and int
.fi
.sp
The data type MPI_2REAL is equivalent to:
.nf
    MPI_TYPE_CONTIGUOUS(2, MPI_REAL, MPI_2REAL)
.fi
.sp
Similar statements apply for MPI_2INTEGER, MPI_2DOUBLE_PRECISION, and
MPI_2INT.
.sp
The datatype MPI_FLOAT_INT is as if defined by the following sequence of
instructions.
.sp
.nf
    type[0] = MPI_FLOAT
    type[1] = MPI_INT
    disp[0] = 0
    disp[1] = sizeof(float)
    block[0] = 1
    block[1] = 1
    MPI_TYPE_STRUCT(2, block, disp, type, MPI_FLOAT_INT)
.fi
.sp
Similar statements apply for MPI_LONG_INT and MPI_DOUBLE_INT.
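The displacement sizeof(float) corresponds to where a C compiler typically places the int member of a matching struct. The sketch below checks this with offsetof; the struct tag float_int and the helper names are hypothetical, and the result is an assumption about common ABIs (4-byte float, 4-byte-aligned int), not a guarantee of the C language.

```c
#include <stddef.h>

/* Hypothetical C struct intended to match MPI_FLOAT_INT. */
struct float_int {
    float value;
    int   index;
};

/* Byte offset of the index member within the struct. */
size_t index_displacement(void)
{
    return offsetof(struct float_int, index);
}

/* Total size of one pair, including any padding the compiler adds. */
size_t pair_extent(void)
{
    return sizeof(struct float_int);
}
```

On such platforms index_displacement() equals sizeof(float). For pairs whose members differ in size, such as a double followed by an int, the compiler may add trailing padding, so it is safer to query the layout with offsetof and sizeof than to assume it.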
.sp
\fBExample 3:\fR Each process has an array of 30 doubles, in C. For each of
the 30 locations, compute the value and rank of the process containing the
largest value.
.sp
.nf
        \&...
        /* each process has an array of 30 doubles: ain[30];
         * comm and root are assumed to be set in the elided code above
         */
        double ain[30], aout[30];
        int  ind[30];
        struct {
            double val;
            int   rank;
        } in[30], out[30];
        int i, myrank, root;

        MPI_Comm_rank(comm, &myrank);
        for (i=0; i<30; ++i) {
            in[i].val = ain[i];
            in[i].rank = myrank;
        }
        MPI_Reduce(in, out, 30, MPI_DOUBLE_INT, MPI_MAXLOC, root, comm);
        /* At this point, the answer resides on process root
         */
        if (myrank == root) {
            /* read values and ranks out
             */
            for (i=0; i<30; ++i) {
                aout[i] = out[i].val;
                ind[i] = out[i].rank;
            }
        }
.sp
.fi
\fBExample 4:\fR  Same example, in Fortran.
.sp
.nf
    \&...
    ! each process has an array of 30 doubles: ain(30);
    ! comm and root are assumed to be set in the elided code above

    DOUBLE PRECISION ain(30), aout(30)
    INTEGER ind(30)
    DOUBLE PRECISION in(2,30), out(2,30)
    INTEGER i, myrank, root, ierr

    CALL MPI_COMM_RANK(comm, myrank, ierr)
    DO i = 1, 30
        in(1,i) = ain(i)
        in(2,i) = myrank    ! myrank is coerced to a double
    END DO

    CALL MPI_REDUCE(in, out, 30, MPI_2DOUBLE_PRECISION, MPI_MAXLOC, root, &
                    comm, ierr)
    ! At this point, the answer resides on process root

    IF (myrank .EQ. root) THEN
        ! read values and ranks out
        DO i = 1, 30
            aout(i) = out(1,i)
            ind(i) = out(2,i)  ! rank is coerced back to an integer
        END DO
    END IF
.fi
.sp
\fBExample 5:\fR Each process has a nonempty array of values.  Find the minimum global value, the rank of the process that holds it, and its index on that process.
.sp
.nf
    #define  LEN   1000

    /* comm and root are assumed to be set elsewhere */
    float val[LEN];        /* local array of values */
    int count;             /* local number of values */
    int i, myrank, minrank, minindex;
    float minval;

    struct {
        float value;
        int   index;
    } in, out;

    /* local minloc */
    in.value = val[0];
    in.index = 0;
    for (i=1; i < count; i++)
        if (in.value > val[i]) {
            in.value = val[i];
            in.index = i;
        }

    /* global minloc */
    MPI_Comm_rank(comm, &myrank);
    in.index = myrank*LEN + in.index;
    MPI_Reduce(&in, &out, 1, MPI_FLOAT_INT, MPI_MINLOC, root, comm);
    /* At this point, the answer resides on process root
     */
    if (myrank == root) {
        /* read answer out
         */
        minval = out.value;
        minrank = out.index / LEN;
        minindex = out.index % LEN;
    }
.fi
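The index arithmetic in Example 5 packs a rank and a local index into a single int so that one MPI_MINLOC pair can carry both. A minimal sketch of that encoding follows; the function names are hypothetical. It assumes every local index is smaller than LEN and that rank * LEN fits in an int.

```c
#define LEN 1000   /* upper bound on the local array length, as in Example 5 */

/* Pack (rank, local_index) into one int; requires 0 <= local_index < LEN. */
int encode_location(int rank, int local_index)
{
    return rank * LEN + local_index;
}

/* Recover the rank from a packed location. */
int decode_rank(int encoded)
{
    return encoded / LEN;
}

/* Recover the local index from a packed location. */
int decode_index(int encoded)
{
    return encoded % LEN;
}
```

For instance, rank 3 with local index 42 encodes to 3042, which decodes back to rank 3 and index 42. Because the encoded value grows with the rank, ties between equal minima resolve to the lowest rank first and then to the lowest local index.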
.sp
All MPI objects (e.g., MPI_Datatype, MPI_Comm) are of type INTEGER in Fortran.
.SH NOTES ON COLLECTIVE OPERATIONS

The reduction functions
.RI ( MPI_Op )
do not return an error value.  As a result,
if the functions detect an error, all they can do is either call
.I MPI_Abort
or silently skip the problem.  Thus, if you change the error handler from
.I MPI_ERRORS_ARE_FATAL
to something else, for example,
.IR MPI_ERRORS_RETURN ,
then no error may be indicated.

The reason for this is the performance cost of ensuring that
all collective routines return the same error value.

.SH ERRORS
Almost all MPI routines return an error value; C routines as the value of the function and Fortran routines in the last argument. C++ functions do not return errors. If the default error handler is set to MPI::ERRORS_THROW_EXCEPTIONS, then on error the C++ exception mechanism will be used to throw an MPI::Exception object.
.sp
Before the error value is returned, the current MPI error handler is
called. By default, this error handler aborts the MPI job, except for I/O function errors. The error handler may be changed with MPI_Comm_set_errhandler; the predefined error handler MPI_ERRORS_RETURN may be used to cause error values to be returned. Note that MPI does not guarantee that an MPI program can continue past an error.

.SH SEE ALSO
.ft R
.sp
MPI_Allreduce
.br
MPI_Reduce_scatter
.br
MPI_Scan
.br
MPI_Op_create
.br
MPI_Op_free