1 | <?xml version="1.0" encoding="UTF-8"?> |
---|
2 | |
---|
3 | <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.0//EN" "http://www.oasis-open.org/docbook/xml/4.0/docbookx.dtd"> |
---|
4 | |
---|
5 | <article> |
---|
6 | <title>The Hoard Memory Allocator</title> |
---|
7 | |
---|
8 | <articleinfo> |
---|
9 | <author> |
---|
10 | <firstname>Emery</firstname> |
---|
11 | <surname>Berger</surname> |
---|
12 | <affiliation>University of Massachusetts Amherst</affiliation> |
---|
13 | <street>Department of Computer Science</street> |
---|
14 | <city>Amherst</city> |
---|
15 | <state>Massachusetts</state> |
---|
16 | <country>USA</country> |
---|
17 | <email>emery@cs.umass.edu</email> |
---|
18 | </author> |
---|
19 | <pubdate>2004-12-08</pubdate> |
---|
20 | <revhistory> |
---|
21 | <revision> |
---|
22 | <revnumber>1.1</revnumber> |
---|
23 | <date>2004-12-08</date> |
---|
24 | <authorinitials>EDB</authorinitials> |
---|
25 | <revremark>Improved formatting</revremark> |
---|
26 | </revision> |
---|
27 | <revision> |
---|
28 | <revnumber>1.0</revnumber> |
---|
29 | <date>2004-12-06</date> |
---|
30 | <authorinitials>EDB</authorinitials> |
---|
31 | <revremark>First draft</revremark> |
---|
32 | </revision> |
---|
33 | </revhistory> |
---|
34 | <copyright> |
---|
35 | <year>2004</year> |
---|
36 | <holder role="mailto:emery@cs.umass.edu">Emery Berger</holder> |
---|
37 | </copyright> |
---|
38 | <abstract> |
---|
39 | Documentation for the Hoard scalable memory allocator, including build and usage directions for several platforms. |
---|
40 | </abstract> |
---|
41 | </articleinfo> |
---|
42 | |
---|
43 | <!-- |
---|
44 | add mailing list and website info. |
---|
45 | proper web links |
---|
46 | big line between sections |
---|
47 | |
---|
48 | --> |
---|
49 | |
---|
50 | <blockquote> |
---|
51 | <para> |
---|
52 | <emphasis role="strong">hoard:</emphasis> |
---|
53 | To amass and put away (anything valuable) for preservation, security, |
---|
54 | or future use; to treasure up: esp. money or wealth. |
---|
55 | </para> |
---|
56 | <para><emphasis>Oxford English Dictionary</emphasis> |
---|
57 | </para> |
---|
58 | </blockquote> |
---|
59 | |
---|
60 | <sect1 id="intro"> |
---|
61 | <title>Introduction</title> |
---|
62 | <para> |
---|
63 | The Hoard memory allocator is a fast, scalable, and memory-efficient |
---|
64 | memory allocator for shared-memory multiprocessors. It runs on a |
---|
65 | variety of platforms, including Linux, Solaris, and Windows. |
---|
66 | </para> |
---|
67 | <sect2> |
---|
68 | <title>Why Hoard?</title> |
---|
69 | <sect3> |
---|
70 | <title>Contention</title> |
---|
71 | <para> |
---|
72 | Multithreaded programs often do not scale because the heap is a |
---|
73 | bottleneck. When multiple threads simultaneously allocate or |
---|
74 | deallocate memory from the allocator, the allocator will serialize |
---|
75 | them. Programs making intensive use of the allocator actually slow |
---|
76 | down as the number of processors increases. Your program may be |
---|
77 | allocation-intensive without you realizing it, for instance, if your |
---|
78 | program makes many calls to the C++ Standard Template Library (STL). |
---|
79 | </para> |
---|
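<para>
One way to see how allocation-intensive ordinary STL code can be is to
count heap allocations directly. The following is a minimal sketch,
not part of Hoard: the counter <literal>g_new_calls</literal> and the
helper <literal>count_list_allocations</literal> are hypothetical names
introduced for illustration, and the replacement
<literal>operator new</literal> simply forwards to malloc.
</para>
<programlisting><![CDATA[
```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <list>
#include <new>

// Illustrative instrumentation: counts every call to the global operator new.
static std::atomic<std::size_t> g_new_calls{0};

// Replacement global operator new that counts calls and forwards to malloc.
void* operator new(std::size_t size) {
    g_new_calls.fetch_add(1, std::memory_order_relaxed);
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

// Matching replacements so memory obtained above is released with free.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many heap allocations happen while appending n ints to a list.
std::size_t count_list_allocations(std::size_t n) {
    std::list<int> values;  // constructed before sampling the counter
    const std::size_t before = g_new_calls.load();
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(static_cast<int>(i));  // each node is heap-allocated
    }
    return g_new_calls.load() - before;
}
```
]]></programlisting>
<para>
On common standard library implementations,
<literal>count_list_allocations(1000)</literal> reports at least one
allocation per element, so a loop that looks like plain container code
is in fact a loop of allocator calls.
</para>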
80 | </sect3> |
---|
81 | <sect3> |
---|
82 | <title>False Sharing</title> |
---|
83 | <para> |
---|
84 | The allocator can cause other problems for multithreaded code. It can |
---|
85 | lead to <emphasis>false sharing</emphasis> in your application: |
---|
86 | threads on different CPUs can end up with memory in the same cache |
---|
87 | line, or chunk of memory. Accessing these falsely shared cache lines |
---|
88 | can be hundreds of times slower than accessing unshared cache lines. |
---|
89 | </para> |
---|
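<para>
The effect can be reproduced without any allocator involved. The
sketch below is illustrative only and does not show how Hoard itself
avoids false sharing: two threads each update their own counter, once
with the counters packed side by side and once with each counter
aligned to its own cache line. The 64-byte line size and all names
(<literal>PackedCounters</literal>, <literal>PaddedCounters</literal>,
<literal>run_two_writers</literal>) are assumptions made for the
example.
</para>
<programlisting><![CDATA[
```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Assumed cache-line size; 64 bytes is common on x86 but not universal.
constexpr std::size_t kAssumedCacheLine = 64;

// Two counters stored next to each other: likely to share one cache line.
struct PackedCounters {
    volatile long a = 0;
    volatile long b = 0;
};

// Each counter aligned to its own (assumed) cache line.
struct PaddedCounters {
    alignas(kAssumedCacheLine) volatile long a = 0;
    alignas(kAssumedCacheLine) volatile long b = 0;
};

// Runs two threads that each increment only their own counter.
// There is no data race: the threads never touch the same object.
// Returns the sum of both counters after the threads finish.
template <typename Counters>
long run_two_writers(long iterations) {
    Counters c;
    std::thread t1([&c, iterations] {
        for (long i = 0; i < iterations; ++i) c.a = c.a + 1;
    });
    std::thread t2([&c, iterations] {
        for (long i = 0; i < iterations; ++i) c.b = c.b + 1;
    });
    t1.join();
    t2.join();
    return c.a + c.b;
}
```
]]></programlisting>
<para>
Both layouts compute the same result. Timing
<literal>run_two_writers&lt;PackedCounters&gt;</literal> against
<literal>run_two_writers&lt;PaddedCounters&gt;</literal> with a large
iteration count on a multiprocessor is one way to observe the cost of
the shared line; the size of the gap depends on the hardware.
</para>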
90 | </sect3> |
---|
91 | <sect3> |
---|
92 | <title>Blowup</title> |
---|
93 | <para> |
---|
94 | Multithreaded programs can also lead the allocator to blow up memory |
---|
95 | consumption. This effect can multiply the amount of memory needed to |
---|
96 | run your application by the number of CPUs on your machine: four CPUs |
---|
97 | could mean that you need four times as much memory. Hoard is a fast |
---|
98 | allocator that solves all of these problems. |
---|
99 | </para> |
---|
100 | </sect3> |
---|
101 | </sect2> |
---|
102 | |
---|
103 | <sect2> |
---|
104 | <title>How Do I Use Hoard?</title> |
---|
105 | <para> |
---|
106 | Hoard is a drop-in replacement for malloc() and related allocation functions. In general, you just |
---|
107 | link it in or set just one environment variable. You do not have to |
---|
108 | change your source code in any way. See the section "Using Hoard" |
---|
109 | below for platform-specific directions. |
---|
110 | </para> |
---|
111 | </sect2> |
---|
112 | |
---|
113 | <sect2> |
---|
114 | <title>Who's Using Hoard?</title> |
---|
115 | <para> |
---|
116 | Companies using Hoard in their products and servers include <ulink |
---|
117 | url="http://www.aol.com">AOL</ulink>, <ulink |
---|
118 | url="http://www.bt.com">British Telecom</ulink>, <ulink |
---|
119 | url="http://www.businessobjects.com">Business Objects</ulink> |
---|
120 | (formerly Crystal Decisions), <ulink |
---|
121 | url="http://www.entrust.com">Entrust</ulink>, <ulink |
---|
122 | url="http://www.novell.com">Novell</ulink>, <ulink |
---|
123 | url="http://www.openwave.com">OpenWave Systems</ulink> (for their |
---|
124 | Typhoon and Twister servers), and <ulink |
---|
125 | url="http://www.reuters.com">Reuters</ulink>. |
---|
126 | </para> |
---|
127 | |
---|
128 | <para> |
---|
129 | Open source projects using Hoard include the Bayonne GNU telephony |
---|
130 | server, the <ulink |
---|
131 | url="http://supertech.lcs.mit.edu/cilk/">Cilk</ulink> parallel |
---|
132 | programming language, the <ulink |
---|
133 | url="http://www.cs.dartmouth.edu/research/DaSSF/index.html">Dartmouth |
---|
134 | Scalable Simulation Framework</ulink>, and the <ulink |
---|
135 | url="http://www.gnu.org/software/commoncpp/">GNU Common C++</ulink> |
---|
136 | system. |
---|
137 | </para> |
---|
138 | </sect2> |
---|
139 | </sect1> |
---|
140 | |
---|
141 | <sect1 id="Building Hoard"> |
---|
142 | <title>Building Hoard</title> |
---|
143 | <para> |
---|
144 | You can use the available pre-built binaries or build Hoard |
---|
145 | yourself. Hoard is written to work on Windows and any variant of UNIX |
---|
146 | that supports threads, and should compile out of the box. Rather than |
---|
147 | using Makefiles or configure scripts, Hoard includes custom scripts |
---|
148 | that all start with the prefix compile. |
---|
149 | </para> |
---|
150 | |
---|
151 | <sect2> |
---|
152 | <title>Platform-specific directions</title> |
---|
153 | <sect2> |
---|
154 | <title>Linux and Solaris Builds</title> |
---|
155 | <para> |
---|
156 | You can compile Hoard out of the box for Linux and Solaris using the |
---|
157 | GNU compilers (g++) just by running the <filename>compile</filename> |
---|
158 | script: |
---|
159 | |
---|
160 | <programlisting> |
---|
161 | ./compile |
---|
162 | </programlisting> |
---|
163 | |
---|
164 | </para> |
---|
165 | </sect2> |
---|
166 | <sect2> |
---|
167 | <title>Windows Builds</title> |
---|
168 | <para> |
---|
169 | There are now three alternative ways of using Hoard with Windows. |
---|
170 | |
---|
171 | <itemizedlist> |
---|
172 | |
---|
173 | <listitem> |
---|
174 | <para> |
---|
175 | The first approach builds a DLL, <filename>libhoard.dll</filename>, and |
---|
176 | its associated library <filename>libhoard.lib</filename>. |
---|
177 | |
---|
178 | <programlisting> |
---|
179 | .\compile-dll |
---|
180 | </programlisting> |
---|
181 | </para> |
---|
182 | </listitem> |
---|
183 | |
---|
184 | <listitem> |
---|
185 | <para> |
---|
186 | The second approach relies on Microsoft Research's <ulink |
---|
187 | url="http://research.microsoft.com/sn/detours">Detours</ulink>. With |
---|
188 | Detours, you can take advantage of Hoard without having to relink your |
---|
189 | applications. Install Detours into <filename |
---|
190 | class="directory">C:\detours</filename>, and then build the Hoard |
---|
191 | detours library: |
---|
192 | |
---|
193 | <programlisting> |
---|
194 | .\compile-detours |
---|
195 | </programlisting> |
---|
196 | </para> |
---|
197 | </listitem> |
---|
198 | |
---|
199 | <listitem> |
---|
200 | <para> |
---|
201 | The third approach generates winhoard, which replaces malloc/new calls |
---|
202 | in your program <emphasis>and</emphasis> in any DLLs it might use. |
---|
203 | |
---|
204 | <programlisting> |
---|
205 | .\compile-winhoard |
---|
206 | </programlisting> |
---|
207 | </para> |
---|
208 | </listitem> |
---|
209 | </itemizedlist> |
---|
210 | </para> |
---|
211 | </sect2> |
---|
212 | </sect2> |
---|
213 | </sect1> |
---|
214 | |
---|
215 | |
---|
216 | <sect1 id="Using Hoard"> |
---|
217 | <title>Using Hoard</title> |
---|
218 | <sect2> |
---|
219 | <title>UNIX</title> |
---|
220 | <para> |
---|
221 | On UNIX, you can set the <envar>LD_PRELOAD</envar> variable to use |
---|
222 | Hoard instead of the system allocator for any program not linked with |
---|
223 | the "static option" (that's most programs). Below are settings for |
---|
224 | Linux and Solaris. |
---|
225 | </para> |
---|
226 | <sect3> |
---|
227 | <title>Linux</title> |
---|
228 | <para> |
---|
229 | <programlisting> |
---|
230 | LD_PRELOAD="/path/libhoard.so:/usr/lib/libdl.so" |
---|
231 | </programlisting> |
---|
232 | </para> |
---|
233 | </sect3> |
---|
234 | <sect3> |
---|
235 | <title>Solaris</title> |
---|
236 | <para> |
---|
237 | Depending on whether you are using the GNU-compiled version (as |
---|
238 | produced by <filename>compile</filename>) or the Sun |
---|
239 | Workshop-compiled versions (produced by |
---|
240 | <filename>compile-sunw</filename>), your settings will be slightly |
---|
241 | different. |
---|
242 | </para> |
---|
243 | |
---|
244 | <informaltable frame="none"> |
---|
245 | <tgroup cols="2"> |
---|
246 | <colspec colwidth="1in"/> |
---|
247 | <thead> |
---|
248 | <row> |
---|
249 | <entry>Version</entry> |
---|
250 | <entry>Setting</entry> |
---|
251 | </row> |
---|
252 | </thead> |
---|
253 | <tbody> |
---|
254 | <row valign="center"> |
---|
255 | <entry>GNU-compiled</entry> |
---|
256 | <entry> |
---|
257 | <programlisting> |
---|
258 | LD_PRELOAD="/path/libhoard.so:/usr/lib/libdl.so" |
---|
259 | </programlisting> |
---|
260 | </entry> |
---|
261 | </row> |
---|
262 | <row valign="center"> |
---|
263 | <entry>Sun-compiled (<emphasis>32-bit</emphasis>)</entry> |
---|
264 | <entry> |
---|
265 | <programlisting> |
---|
266 | LD_PRELOAD="/path/libhoard_32.so" |
---|
267 | </programlisting> |
---|
268 | </entry> |
---|
269 | </row> |
---|
270 | <row valign="center"> |
---|
271 | <entry>Sun-compiled (<emphasis>64-bit</emphasis>)</entry> |
---|
272 | <entry> |
---|
273 | <programlisting> |
---|
274 | LD_PRELOAD="/path/libhoard_64.so:/usr/lib/64/libCrun.so.1:/usr/lib/64/libdl.so" |
---|
275 | </programlisting> |
---|
276 | </entry> |
---|
277 | </row> |
---|
278 | </tbody> |
---|
279 | </tgroup> |
---|
280 | </informaltable> |
---|
281 | <note> |
---|
282 | <para> |
---|
283 | For some security-sensitive applications, Solaris requires that you place |
---|
284 | libraries used in <envar>LD_PRELOAD</envar> into the <filename |
---|
285 | class="directory">/usr/lib/secure</filename> directory. In that event, |
---|
286 | after copying these libraries into <filename |
---|
287 | class="directory">/usr/lib/secure</filename>, set |
---|
288 | <envar>LD_PRELOAD</envar> by omitting the absolute locations of the libraries, as follows: |
---|
289 | <programlisting> |
---|
290 | LD_PRELOAD="libhoard.so:libCrun.so.1:libdl.so" |
---|
291 | </programlisting> |
---|
292 | </para> |
---|
293 | </note> |
---|
294 | |
---|
295 | </sect3> |
---|
296 | </sect2> |
---|
297 | |
---|
298 | <sect2> |
---|
299 | <title>Windows</title> |
---|
300 | <para> |
---|
301 | There are three ways to use Hoard on Windows. |
---|
302 | </para> |
---|
303 | <orderedlist> |
---|
304 | <listitem> |
---|
305 | Using Detours |
---|
306 | <para> |
---|
307 | By using Detours, you can take advantage of Hoard's benefits without |
---|
308 | relinking your Windows application (as long as it is dynamically |
---|
309 | linked to the C runtime libraries). You will need to use one of the |
---|
310 | two included Detours tools (<filename>setdll.exe</filename> or |
---|
311 | <filename>withdll.exe</filename> in the <filename |
---|
312 | class="directory">detours/</filename> directory) in conjunction with |
---|
313 | this version of Hoard. To <emphasis>temporarily</emphasis> use Hoard |
---|
314 | as the allocator for a given application, use <filename>withdll</filename>: |
---|
315 | |
---|
316 | <programlisting> |
---|
317 | withdll -d:hoarddetours.dll myprogram.exe |
---|
318 | </programlisting> |
---|
319 | |
---|
320 | If you want your program to use Hoard without having to invoke |
---|
321 | <filename>withdll</filename> every time, you can use |
---|
322 | <filename>setdll</filename> to add it to your executable: |
---|
323 | |
---|
324 | <programlisting> |
---|
325 | setdll -d:hoarddetours.dll myprogram.exe myprogram.exe |
---|
326 | </programlisting> |
---|
327 | |
---|
328 | You can later remove Hoard from your executable as follows: |
---|
329 | |
---|
330 | <programlisting> |
---|
331 | setdll -r:hoarddetours.dll myprogram.exe |
---|
332 | </programlisting> |
---|
333 | </para> |
---|
334 | </listitem> |
---|
335 | <listitem> |
---|
336 | Using <filename>winhoard</filename> |
---|
337 | <para> |
---|
338 | Another method is to use <filename>winhoard</filename>. Winhoard, |
---|
339 | like Detours, replaces malloc/new calls from your program and any DLLs |
---|
340 | it might use (leaving <filename>HeapAlloc</filename> calls |
---|
341 | intact). One advantage is that it does not require Detours to do this. |
---|
342 | |
---|
343 | To use the Winhoard version, link your executable with |
---|
344 | <filename>usewinhoard.obj</filename> and |
---|
345 | <filename>winhoard.lib</filename>, and then use |
---|
346 | <filename>winhoard.dll</filename>: |
---|
347 | |
---|
348 | <programlisting> |
---|
349 | cl /Ox /MD /c usewinhoard.cpp |
---|
350 | cl /Ox /MD myprogram.cpp usewinhoard.obj winhoard.lib |
---|
351 | </programlisting> |
---|
352 | |
---|
353 | </para> |
---|
354 | </listitem> |
---|
355 | <listitem> |
---|
356 | Using <filename>libhoard</filename> |
---|
357 | <para> |
---|
358 | |
---|
359 | The last method is to link directly with the |
---|
360 | <filename>libhoard</filename> DLL. This approach is simple, but only |
---|
361 | suitable for small applications, since it will not affect malloc calls |
---|
362 | in any other DLL you might load. To use this option, you should put |
---|
363 | the following into your source code as the very first lines: |
---|
364 | |
---|
365 | <programlisting> |
---|
366 | #if defined(USE_HOARD) |
---|
367 | #pragma comment(lib, "libhoard.lib") |
---|
368 | #endif |
---|
369 | </programlisting> |
---|
370 | |
---|
371 | This stanza should be in the first part of a header file included by |
---|
372 | all of your code. It ensures that Hoard loads before any other library |
---|
373 | (you will need <filename>libhoard.lib</filename> in your path). When |
---|
374 | you execute your program, as long as <filename>libhoard.dll</filename> |
---|
375 | is in your path, your program will run with Hoard instead of the |
---|
376 | system allocator. Note that you must compile your program with the |
---|
377 | <filename>/MD</filename> flag, as in: |
---|
378 | |
---|
379 | <programlisting> |
---|
380 | cl /MD /G6 /Ox /DUSE_HOARD=1 myprogram.cpp |
---|
381 | </programlisting> |
---|
382 | |
---|
383 | Hoard will not work if you use another switch (like |
---|
384 | <filename>/MT</filename>) to compile your program. |
---|
385 | </para> |
---|
386 | </listitem> |
---|
387 | </orderedlist> |
---|
388 | </sect2> |
---|
389 | </sect1> |
---|
390 | |
---|
391 | |
---|
392 | <sect1 id="FAQs"> |
---|
393 | <title>Frequently Asked Questions</title> |
---|
394 | <qandaset> |
---|
395 | |
---|
396 | <qandaentry> |
---|
397 | <question> |
---|
398 | <para>What kind of applications will Hoard speed up?</para> |
---|
399 | </question> |
---|
400 | <answer> |
---|
401 | <para> |
---|
402 | Hoard will always improve the performance of multithreaded programs |
---|
403 | running on multiprocessors that make frequent use of the heap (calls |
---|
404 | to malloc/free or new/delete, as well as many STL functions). Because |
---|
405 | Hoard avoids false sharing, Hoard also speeds up programs that only |
---|
406 | occasionally call heap functions but access these objects frequently. |
---|
407 | </para> |
---|
408 | </answer> |
---|
409 | </qandaentry> |
---|
410 | |
---|
411 | <qandaentry> |
---|
412 | <question> |
---|
413 | <para>I'm using the STL but not seeing any performance improvement. Why not?</para> |
---|
414 | </question> |
---|
415 | <answer> |
---|
416 | <para> |
---|
417 | To benefit from Hoard, you have to tell the STL to use malloc instead of its internal custom memory allocator, as in: |
---|
418 | |
---|
419 | <programlisting> |
---|
420 | typedef list&lt;unsigned int, malloc_alloc&gt; mylist; |
---|
421 | </programlisting> |
---|
422 | |
---|
423 | </para> |
---|
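<para>
Standard libraries that do not provide <literal>malloc_alloc</literal>
can be given an equivalent allocator by hand. The following is a
minimal sketch under that assumption, not code shipped with Hoard:
<literal>MallocAllocator</literal> and
<literal>fill_and_count</literal> are hypothetical names, and the
allocator does not handle over-aligned types.
</para>
<programlisting><![CDATA[
```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <list>
#include <new>

// Hypothetical minimal allocator that forwards every request to malloc/free.
template <typename T>
struct MallocAllocator {
    using value_type = T;

    MallocAllocator() noexcept = default;

    // Converting constructor required so containers can rebind the allocator.
    template <typename U>
    MallocAllocator(const MallocAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Guard against overflow in n * sizeof(T).
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        void* p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

// All instances are interchangeable because they hold no state.
template <typename T, typename U>
bool operator==(const MallocAllocator<T>&, const MallocAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const MallocAllocator<T>&, const MallocAllocator<U>&) noexcept {
    return false;
}

// Same shape as the typedef above, with the hand-written allocator.
typedef std::list<unsigned int, MallocAllocator<unsigned int> > mylist;

// Fills a list with n elements through the malloc-backed allocator.
std::size_t fill_and_count(std::size_t n) {
    mylist values;
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(static_cast<unsigned int>(i));
    }
    return values.size();
}
```
]]></programlisting>
<para>
Every list node then comes from malloc, so whichever allocator
provides malloc in the running process serves those requests.
</para>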
424 | </answer> |
---|
425 | </qandaentry> |
---|
426 | |
---|
427 | |
---|
428 | <qandaentry> |
---|
429 | <question> |
---|
430 | <para> Have you compared Hoard against mtmalloc or libumem?</para> |
---|
431 | </question> |
---|
432 | <answer> |
---|
433 | <para> |
---|
434 | Yes. Hoard is much faster than either. For example, here's an |
---|
435 | execution of threadtest on Solaris: |
---|
436 | |
---|
437 | <informaltable frame="none"> |
---|
438 | <tgroup cols="2"> |
---|
439 | <tbody> |
---|
440 | <row> |
---|
441 | <entry>Default:</entry> |
---|
442 | <entry>4.60 seconds</entry> |
---|
443 | </row> |
---|
444 | <row> |
---|
445 | <entry>Libmtmalloc:</entry> |
---|
446 | <entry>6.23 seconds</entry> |
---|
447 | </row> |
---|
448 | <row> |
---|
449 | <entry>Libumem:</entry> |
---|
450 | <entry>5.47 seconds</entry> |
---|
451 | </row> |
---|
452 | <row> |
---|
453 | <entry>Hoard 3.2:</entry> |
---|
454 | <entry>1.99 seconds</entry> |
---|
455 | </row> |
---|
456 | </tbody> |
---|
457 | </tgroup> |
---|
458 | </informaltable> |
---|
459 | </para> |
---|
460 | </answer> |
---|
461 | </qandaentry> |
---|
462 | |
---|
463 | <qandaentry> |
---|
464 | <question> |
---|
465 | <para> |
---|
466 | What systems does Hoard work on? |
---|
467 | </para> |
---|
468 | </question> |
---|
469 | <answer> |
---|
470 | <para> |
---|
471 | Hoard has been successfully tested on numerous Windows, Linux and |
---|
472 | Solaris systems, including a 4-processor x86 box running Windows |
---|
473 | NT/2000, a 4-processor x86 box running Red Hat Linux 6.0 and 6.1, and a |
---|
474 | 16-processor Sun Enterprise server running Solaris. |
---|
475 | </para> |
---|
476 | </answer> |
---|
477 | </qandaentry> |
---|
478 | |
---|
479 | <qandaentry> |
---|
480 | <question> |
---|
481 | <para> |
---|
482 | Have you compared Hoard with SmartHeap SMP? |
---|
483 | </para> |
---|
484 | </question> |
---|
485 | <answer> |
---|
486 | <para> |
---|
487 | We tried SmartHeap SMP but it did not work on our Suns (due to an |
---|
488 | apparent race condition in the code). |
---|
489 | </para> |
---|
490 | </answer> |
---|
491 | </qandaentry> |
---|
492 | </qandaset> |
---|
493 | </sect1> |
---|
494 | |
---|
495 | <sect1 id="More Info"> |
---|
496 | <title>More Information</title> |
---|
497 | <para> |
---|
498 | The first place to look for Hoard-related information is at the Hoard |
---|
499 | web page, <ulink url="http://www.hoard.org">www.hoard.org</ulink>. |
---|
500 | </para> |
---|
501 | |
---|
502 | <para> |
---|
503 | There are two mailing lists you should consider being on if you are a |
---|
504 | user of Hoard. If you are just interested in being informed of new |
---|
505 | releases, join the <ulink |
---|
506 | url="http://groups.yahoo.com/group/hoard-announce/">Hoard-Announce</ulink> |
---|
507 | list. For general Hoard discussion, join the <ulink |
---|
508 | url="http://groups.yahoo.com/group/hoard/">Hoard</ulink> mailing |
---|
509 | list. You can also search the archives of these lists. |
---|
510 | </para> |
---|
511 | </sect1> |
---|
512 | |
---|
513 | <sect1 id="License Info"> |
---|
514 | <title>License Information</title> |
---|
515 | |
---|
516 | <para> |
---|
517 | The use and distribution of Hoard is governed by the GNU General |
---|
518 | Public License as published by the <ulink |
---|
519 | url="http://www.fsf.org">Free Software Foundation</ulink>: see the |
---|
520 | included file <filename>COPYING</filename> for more details. |
---|
521 | </para> |
---|
522 | |
---|
523 | <para> |
---|
524 | Because of the restrictions imposed by this license, most commercial |
---|
525 | users of Hoard have purchased commercial licenses, which are arranged |
---|
526 | through the University of Texas at Austin. <ulink |
---|
527 | url="mailto:software@otc.utexas.edu">Contact Richard Friedman</ulink> |
---|
528 | at the Office of Technology Commercialization at The University of |
---|
529 | Texas at Austin for more information (phone: (512) 471-4738). |
---|
530 | </para> |
---|
531 | |
---|
532 | </sect1> |
---|
533 | |
---|
534 | </article> |
---|