.\"
.\" Copyright (c) 2008       Los Alamos National Security, LLC  All rights reserved.
.\" Copyright (c) 2008-2009  Sun Microsystems, Inc.  All rights reserved.
.\"
.\" Man page for ORTE's Hostfile functionality
.\"
.\" .TH name     section center-footer   left-footer  center-header
.TH ORTE_HOSTS 7 "Dec 08, 2009" "1.4" "Open MPI"
.\" **************************
.\"    Name Section
.\" **************************
.SH NAME
.
OpenRTE Hostfile and HOST Behavior \- Overview of OpenRTE's support for user-supplied hostfiles and comma-delimited lists of hosts
.
.\" **************************
.\"    Description Section
.\" **************************
.SH DESCRIPTION
.
.PP
OpenRTE supports several levels of user-specified host lists based on an established
precedence order. Users can specify a \fIdefault hostfile\fP that contains a list of
nodes available to all app_contexts given on the command line. Only \fIone\fP default
hostfile can be provided for any job. In addition, users
can specify a \fIhostfile\fP that contains a list of nodes to be used for a specific
app_context, or can provide a comma-delimited list of nodes to be used for that
app_context via the \fI-host\fP command line option.
.sp
The precedence order applied to these various options depends to some extent on
the local environment. The following table illustrates how host and hostfile directives
work together to define the set of hosts upon which a job will execute
in the absence of a resource manager (RM):
.sp
.nf
 default
 hostfile      host        hostfile       Result
----------    ------      ----------      -----------------------------------------
 unset        unset          unset        Job is co-located with mpirun
 unset         set           unset        Host defines resource list for the job
 unset        unset           set         Hostfile defines resource list for the job
 unset         set            set         Hostfile defines resource list for the job,
                                          then host filters the list to define the final
                                          set of nodes available to each application
                                          within the job
  set         unset          unset        Default hostfile defines resource list for the job
  set          set           unset        Default hostfile defines resource list for the job,
                                          then host filters the list to define the final
                                          set of nodes available to each application
                                          within the job
  set          set            set         Default hostfile defines resource list for the job,
                                          then hostfile filters the list, and then host filters
                                          the list to define the final set of nodes available
                                          to each application within the job
.fi
.sp
This changes somewhat in the presence of an RM, as that entity specifies the
initial allocation of nodes. In this case, the default hostfile, hostfile and host
directives are all used to filter the RM's specification so that a user can utilize different
portions of the allocation for different jobs. This is done according to the same precedence
order as in the prior table, with the RM providing the initial pool of nodes.
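.sp
As a sketch of the last row of the table, assume a default hostfile named
\fIallhosts\fP that lists node1 through node4 and a hostfile named \fImyhosts\fP
that lists node2 and node3. The file names, node names, and application name
are hypothetical and serve only as an illustration:
.sp
.nf
mpirun --default-hostfile allhosts -hostfile myhosts -host node3 ./my_app
.fi
.sp
Following the precedence described above, the default hostfile supplies
node1 through node4, the hostfile narrows that list to node2 and node3, and
the host directive narrows it again, so node3 should be the only node left
available to my_app.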
.sp
.
.\" **************************
.\"    Relative Indexing
.\" **************************
.SH RELATIVE INDEXING
.
.PP
Once an initial allocation has been specified (whether by an RM, default hostfile, or hostfile),
subsequent hostfile and -host specifications can be made using relative indexing. This allows a
user to stipulate which hosts are to be used for a given app_context by their relative
position in the allocation rather than by their host names.
.sp
This is best understood through a few examples. Consider the case
where an RM has allocated the user a set of nodes named "foo1, foo2, foo3, foo4". The user
wants the first app_context to have exclusive use of the first two nodes, and a second app_context
to use the last two nodes. Of course, the user could print out the allocation to find the names
of the nodes allocated to them and then use -host to specify this layout, but this is cumbersome
and would require hand-manipulation for every invocation.
.sp
A simpler method is to utilize OpenRTE's relative indexing capability to specify the desired
layout. In this case, a command line of:
.sp
mpirun -pernode -host +n1,+n2 ./app1 : -host +n3,+n4 ./app2
.sp
.PP
would provide the desired pattern. The "+" syntax indicates that the information is being
provided as a relative index to the existing allocation. Two methods of relative indexing
are supported:
.sp
.TP
.B +n<#>
A relative index into the allocation referencing the <#>th node. OpenRTE will replace
the reference with the <#>th node in the allocation.
.
.
.TP
.B +e[:<#>]
A request for <#> empty nodes, i.e., OpenRTE is to replace this reference with
<#> nodes that have not yet been used by any other app_context. If the ":<#>" is not
provided, OpenRTE will replace the reference with all empty nodes. Note that OpenRTE
does track the empty nodes that have been assigned in this manner, so multiple
uses of this option will result in assignment of unique nodes up to the limit of the
available empty nodes. Requests for more empty nodes than are available will generate
an error.
.sp
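.PP
As a sketch of the "+e" syntax, consider again the four-node allocation from the
example above (the application names are placeholders):
.sp
.nf
mpirun -pernode -host +e:2 ./app1 : -host +e:2 ./app2
.fi
.sp
Given the behavior described above, each app_context should receive two nodes
that no other app_context is using, so the four allocated nodes are split
between app1 and app2 without naming any of them. Requesting a fifth empty
node in this allocation would be expected to generate an error.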
.PP
Relative indexing can be combined with absolute naming of hosts in any arbitrary manner,
and can be used in hostfiles as well as with the -host command line option. In addition,
any slot specification provided in hostfiles will be respected. Thus, a user can specify
that only a certain number of slots from a relatively indexed host are to be used for a
given app_context.
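.sp
For instance, a relative index and an absolute host name can appear in the same
-host list. The following line is only a sketch that reuses the node names from
the four-node allocation above, with a placeholder application name:
.sp
.nf
mpirun -host +n1,foo4 ./my_app
.fi
.sp
Here my_app would be restricted to the node referenced by +n1 together with the
node named foo4.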
.sp
Another example may help illustrate this point. Consider the case where a user has a default
hostfile containing:
.sp
.nf
dummy1 slots=4
dummy2 slots=4
dummy3 slots=4
dummy4 slots=4
dummy5 slots=4
.fi
.sp
.PP
This may, for example, be a hostfile that describes a set of commonly-used resources that
the user wishes to execute applications against. For this particular application, the user
plans to map byslot, and wants the first two ranks to be on the node at relative index +n2
of any allocation, the next rank to land on an empty node, one rank specifically on dummy4,
the next rank on the +n2 node again, and finally any remaining ranks on
whatever empty nodes are left. To accomplish this, the user provides a hostfile of:
.sp
.nf
+n2 slots=2
+e:1
dummy4 slots=1
+n2
+e
.fi
.sp
.PP
The user can now use this information in combination with OpenRTE's sequential mapper to
obtain their specific layout:
.sp
.nf
mpirun --default-hostfile dummyhosts -hostfile mylayout -mca rmaps seq ./my_app
.fi
.sp
.PP
which will result in:
.nf
.sp
rank0 being mapped to dummy3
.br
rank1 to dummy1 as the first empty node
.br
rank2 to dummy4
.br
rank3 to dummy3
.br
rank4 to dummy2 and rank5 to dummy5 as the last remaining unused nodes
.sp
.fi
Note that the sequential mapper ignores the slots arguments, as it only
maps one rank at a time to each node in the list.
.sp
If the default round-robin mapper had been used, then the mapping would have resulted in:
.sp
.nf
ranks 0 and 1 being mapped to dummy3 since two slots were specified
.br
ranks 2-5 on dummy1 as the first empty node, which has four slots
.br
rank6 on dummy4 since the hostfile specifies that only a single slot from that node is to be used
.br
ranks 7 and 8 on dummy3 since only two slots remain available
.br
ranks 9-12 on dummy2 since it is the next available empty node and has four slots
.br
ranks 13-16 on dummy5 since it is the last remaining unused node and has four slots
.fi
.sp
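.PP
To check a layout such as the ones above before relying on it, the mapping can be
printed at launch time. The following sketch assumes that the installed mpirun
supports the --display-map option; consult orterun(1) to confirm this for a
given installation:
.sp
.nf
mpirun --default-hostfile dummyhosts -hostfile mylayout --display-map ./my_app
.fi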
.PP
Thus, the use of relative indexing can allow for complex mappings to be ported across
allocations, including those obtained from automated resource managers, without the need
for manual manipulation of scripts and/or command lines.
.
.
.\" **************************
.\"    See Also Section
.\" **************************
.
.SH SEE ALSO
  orterun(1)
.