The Hoard Multiprocessor Memory Allocator

"...if you'll be running on multiprocessor machines, ... use Emery Berger's excellent Hoard multiprocessor memory management code. It's a drop-in replacement for the C and C++ memory routines and is very fast on multiprocessor machines." (Debugging Applications for Microsoft .NET and Microsoft Windows, Microsoft Press, 2003)

The Hoard memory allocator is a fast, scalable, and memory-efficient memory allocator for shared-memory multiprocessors. It runs on a variety of platforms, including Linux, Solaris, and Windows. Hoard is a drop-in replacement for malloc() and the related memory routines, so no changes to your source code are necessary: just link it in or set a single environment variable (see Using Hoard for more information). Hoard can dramatically improve the performance of multithreaded programs running on multiprocessors.

Why Hoard?

There are a number of problems with existing memory allocators that make Hoard a better choice.

Contention

Multithreaded programs often do not scale because the heap is a bottleneck. When multiple threads allocate or deallocate memory at the same time, the allocator serializes their requests, so only one thread makes progress while the others wait. Programs that make intensive use of the allocator can actually slow down as the number of processors increases. Your program may be allocation-intensive without your realizing it, for instance if it makes many calls to the C++ Standard Template Library (STL).
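One way to see whether the heap is a bottleneck for a given allocator is to time the same number of allocations per thread while varying the thread count. The following is only a minimal sketch of that kind of measurement, not a benchmark from the Hoard distribution; the function name run_alloc_benchmark and its parameters are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical helper: starts `num_threads` threads, each performing
// `iterations` malloc/free pairs of `size` bytes, and returns the elapsed
// wall-clock time in seconds.
double run_alloc_benchmark(unsigned num_threads, std::size_t iterations,
                           std::size_t size) {
    assert(num_threads > 0 && size > 0);

    auto worker = [iterations, size] {
        for (std::size_t i = 0; i < iterations; ++i) {
            void* p = std::malloc(size);
            if (p == nullptr) std::abort();
            // Write through a volatile pointer so the compiler cannot
            // optimize the allocation away.
            static_cast<volatile char*>(p)[0] = 1;
            std::free(p);
        }
    };

    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t) threads.emplace_back(worker);
    for (auto& th : threads) th.join();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling run_alloc_benchmark(1, n, 64), then run_alloc_benchmark(2, n, 64), and so on gives each thread the same amount of work. With an allocator that scales, the elapsed time should stay roughly flat as threads are added (up to the number of processors); with one that serializes requests, it tends to grow. Because the code only calls malloc() and free(), the same binary can be measured with and without Hoard.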

False Sharing

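The allocator can cause other problems for multithreaded code. It can lead to false sharing in your application: threads on different CPUs can end up with separately allocated objects in the same cache line (the small, fixed-size chunk of memory that a CPU cache transfers as a unit). Every write by one thread then invalidates that line in the other CPU's cache, even though the threads never touch the same data. Accessing these falsely shared cache lines can be hundreds of times slower than accessing unshared cache lines.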
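Whether two objects can be falsely shared comes down to address arithmetic. The sketch below is an illustration rather than part of Hoard: it assumes a 64-byte cache line, and the names kAssumedCacheLineBytes, same_cache_line, and consecutive_allocations_share_line are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumption: 64-byte cache lines. This is common on current hardware but
// not universal; adjust it for the machine you are targeting.
constexpr std::uintptr_t kAssumedCacheLineBytes = 64;

// True if the bytes at addresses a and b lie in the same cache line.
bool same_cache_line(const void* a, const void* b) {
    const auto ia = reinterpret_cast<std::uintptr_t>(a);
    const auto ib = reinterpret_cast<std::uintptr_t>(b);
    return ia / kAssumedCacheLineBytes == ib / kAssumedCacheLineBytes;
}

// Hypothetical probe: makes two back-to-back allocations of `size` bytes,
// which a program could hand to two different threads, and reports whether
// they ended up in the same cache line. The answer depends entirely on the
// allocator in use.
bool consecutive_allocations_share_line(std::size_t size) {
    assert(size > 0);
    void* first = std::malloc(size);
    void* second = std::malloc(size);
    if (first == nullptr || second == nullptr) std::abort();
    const bool shared = same_cache_line(first, second);
    std::free(second);
    std::free(first);
    return shared;
}
```

If consecutive_allocations_share_line(8) returns true and the two blocks are written by threads on different CPUs, those writes contend for one cache line even though the threads share no data.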

Blowup

Multithreaded programs can also cause the allocator to blow up memory consumption. This effect can multiply the amount of memory needed to run your application by the number of CPUs on your machine: four CPUs could mean that you need four times as much memory.

Hoard is a fast allocator that solves all of these problems.

Press

Intel highlights the benefits of using Hoard (a previous, slower version) on a 4-way Xeon system.

Sun concludes that Hoard is more space-efficient than their own allocators.

Who's Using Hoard?

Companies using Hoard in their products and servers include AOL, British Telecom, Business Objects (formerly Crystal Decisions), Entrust, Novell, OpenWave Systems (for their Typhoon and Twister servers), and Reuters.

Open source projects using Hoard include the Bayonne GNU telephony server, the Cilk parallel programming language, the Dartmouth Scalable Simulation Framework, and the GNU Common C++ system.

Hoard is also part of several major Linux distributions, including Debian and Novell's SUSE.

More Information

The first place to look for Hoard-related information is at the Hoard web page, www.hoard.org.

There are two mailing lists you should join if you are a user of Hoard. If you are just interested in being informed of new releases, join the Hoard-Announce list. For general Hoard discussion, join the Hoard mailing list. You can also search the archives of these lists.

Technical Information

For technical details of a previous version of Hoard, read "Hoard: A Scalable Memory Allocator for Multithreaded Applications," by Emery D. Berger, Kathryn S. McKinley, Robert D. Blumofe, and Paul R. Wilson, presented at the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IX), Cambridge, MA, November 2000.