[Gllug] HugePages

Nix nix at esperi.org.uk
Sat Oct 30 12:17:50 UTC 2010


On 29 Oct 2010, James Courtier-Dutton told this:

> 2) Google a bit to understand how memory allocation works.
> A quick example.
> An application requests 4 bytes of memory.
> The kernel assigns a free page to that application and allocates a
> small section of the whole page to store the 4 bytes in.

More specifically, malloc() hunts for four free bytes in its arenas. It
can have many of these: one growable arena extended via sbrk(), and zero
or more others allocated via mmap(). Large allocations go straight into
their own mmap()ed regions, because those can be released back to the OS
when empty, while the sbrk()ed region is generally fragmented enough that
shrinking it is hopeless.
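
A toy sketch of that size-based decision, in Python for brevity. It
assumes glibc's documented default M_MMAP_THRESHOLD of 128 KiB and treats
it as fixed (real glibc adjusts it at runtime, and other malloc()
implementations differ). The function name backing_for is made up for
illustration.

```python
# Toy model only: a real malloc() is far more involved than this.
# Assumption: glibc's default M_MMAP_THRESHOLD (128 KiB), treated as fixed.
MMAP_THRESHOLD = 128 * 1024


def backing_for(request_bytes, threshold=MMAP_THRESHOLD):
    """Return which kind of region this toy allocator would use."""
    if request_bytes >= threshold:
        # Gets its own mapping, which can be unmapped again when freed.
        return "mmap"
    # Carved out of an existing arena, e.g. the sbrk()ed heap.
    return "arena"


print(backing_for(4))                 # arena
print(backing_for(16 * 1024 * 1024))  # mmap
```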

> Now, the next request for 4 more bytes of memory from the same
> application will use the un-used space of the 4K page.

(As above.)

> But, if a different process requests 4 bytes of memory, it cannot be
> allocated from the same page as the previous process, so a second 4K
> page is allocated to the new process.

This is because a page is the minimum-sized protection domain on the
CPU, and we can't have one process accidentally overwriting another
process's memory. (Also, the page would probably have a different
address in both processes, but this is a less severe problem, mostly
requiring a lack of absolute-addressed pointers within the page: shared
libraries can have different addresses in every process they're mapped
into, but the pages are still shared.)
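
A rough illustration of the "no absolute pointers" point, as a Python toy
rather than real process memory. The page layout, the base addresses
0x1000 and 0x3000, and the helper read_string are all invented for the
example: the page stores an offset to its string instead of an address,
so the same bytes work wherever they are mapped.

```python
import struct

# A 'page' whose first 4 bytes hold the OFFSET (not the address) of a
# NUL-terminated string stored later in the same page.
page = struct.pack("<I", 16).ljust(16, b"\0") + b"hello\0"


def read_string(memory, base):
    """Follow the offset stored at `base` and return the string it names."""
    (offset,) = struct.unpack_from("<I", memory, base)
    start = base + offset
    end = memory.index(b"\0", start)
    return memory[start:end]


# Two hypothetical processes with the same page at different addresses.
proc_a = bytes(0x1000) + page
proc_b = bytes(0x3000) + page

print(read_string(proc_a, 0x1000))  # b'hello'
print(read_string(proc_b, 0x3000))  # b'hello'
```

Had the page stored the absolute address 0x1010 instead of the offset 16,
the second lookup would have landed in the wrong place.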

> Now, if huge pages are being used and those huge pages are 1Gig
> (although in your case they are probably 2MB pages), all it would take
> is 72 processes requesting 4 bytes of memory each and that will
> consume 72GB of RAM.

This is one of many reasons why transparent hugepages are preferable:
they only kick in if you allocate a lot of memory. It's ridiculous to
use a hugepage for a four byte allocation :) (but a good limit case).
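
The arithmetic behind that limit case, as a quick Python sketch. It
assumes each process touches exactly one page and ignores everything else
a process needs; minimum_ram is just an illustrative name.

```python
KIB, MIB, GIB = 1024, 1024**2, 1024**3


def minimum_ram(processes, page_size):
    """Bytes of RAM consumed if every process touches just one page."""
    return processes * page_size


# 72 processes, each asking for four bytes, at three page sizes.
for name, size in [("4K", 4 * KIB), ("2M", 2 * MIB), ("1G", GIB)]:
    print(name, "pages:", minimum_ram(72, size), "bytes")
```

With 4K pages the total is 288 KiB, with 2MB pages it is 144 MiB, and
with 1GB pages it is the full 72 GiB.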

> 3) Memory fragmentation.
> This is one of the worst problems and it is a very difficult problem to fix.

Mostly 'cos you can't move blocks of memory around without cooperation
from the application, which almost no applications are willing to
provide (for good reason).

> Quick example.
> An application requests 4 bytes of memory (call it request 1). The
> application is assigned one 4K page.
> The application then requests more small amounts of memory (call it
> request 2-200)
> Say that on request 201, it asks for 3 more bytes of memory, it has
> used all of the 4K page, so the application is assigned a second 4K
> page to handle request 201.
> The application then frees the memory from request 2-200.
> So, request 1 and request 201 still exist.
> The application has been allocated 8K of RAM resources, but only
> request 1 and 201 still exist, totalling 7 bytes. So, now although the
> application only needs 7 bytes of memory, it is using 2 pages, and thus
> 8K of RAM. Each page of RAM has to stay allocated until the process
> has freed all requests against that page of RAM.

Yep. It's even worse than that. If this allocation was from the sbrk()ed
region, which with four-byte allocations it almost certainly was, even
the first page cannot be freed, as the sbrk()ed region must be freed
from the top down, in last-in-first-out order.

(this obviously does not apply to hugepages, though.)
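
A toy Python model of the sbrk() point above. It assumes a simple bump
allocator with no headers and no reuse, and picks 21 bytes for requests
2-200 purely so that request 201 lands on the second page; none of those
numbers come from a real malloc(). It compares an idealised "release any
empty page" policy with the top-down-only release of an sbrk()ed region.

```python
PAGE = 4096


def bump_allocate(sizes):
    """Pack requests end to end; return a (start, end) span per request."""
    spans, top = [], 0
    for size in sizes:
        spans.append((top, top + size))
        top += size
    return spans


def pages_held_per_page(live):
    """Idealised policy: a page is released once nothing live is on it."""
    pages = set()
    for start, end in live:
        pages.update(range(start // PAGE, (end - 1) // PAGE + 1))
    return len(pages)


def pages_held_sbrk(live):
    """sbrk()-style policy: the break can only drop to the highest live byte."""
    if not live:
        return 0
    top = max(end for _, end in live)
    return -(-top // PAGE)  # ceiling division


# Request 1 (4 bytes), requests 2-200 (assumed 21 bytes each), request 201.
spans = bump_allocate([4] + [21] * 199 + [3])

live = [spans[0], spans[200]]            # requests 2-200 have been freed
print(pages_held_per_page(live), pages_held_sbrk(live))  # 2 2

live = [spans[200]]                      # now request 1 is freed as well
print(pages_held_per_page(live), pages_held_sbrk(live))  # 1 2
```

Even with request 1 gone, the sbrk() model still holds both pages,
because request 201 sits above the first page and pins the break.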

> So, this is one of the reasons why using too many hugepages is a bad
> idea.

(and, again, doesn't apply to hugepages).

> Java does in fact help you here, because although a native x86
> application cannot get out of the memory fragmentation problem
> described, the Java Memory manager can because it can move memory from
> one page to another without the Java application having to know. This

Yep. :)

> is one of the advantages of a software VM over a hardware implemented
> VM.

Well, technically the hardware-implemented VM is doing just the same thing:
it's moving memory around between physical pages without its virtual
address(es) changing at all. You'd really be annoyed if it wasn't.
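
A toy page table in Python to make that concrete. The class ToyMMU and
the page and frame numbers are invented for illustration, and a real
kernel and MMU do a great deal more; the point is only that the contents
can move to a different physical frame while the virtual page number the
program uses stays put.

```python
class ToyMMU:
    """Minimal virtual-to-physical mapping: one dict lookup per access."""

    def __init__(self):
        self.frames = {}      # physical frame number -> contents
        self.page_table = {}  # virtual page number -> physical frame number

    def map(self, vpage, frame, data):
        self.frames[frame] = bytearray(data)
        self.page_table[vpage] = frame

    def read(self, vpage):
        return bytes(self.frames[self.page_table[vpage]])

    def migrate(self, vpage, new_frame):
        """Move the contents to another frame and repoint the mapping."""
        old_frame = self.page_table[vpage]
        self.frames[new_frame] = self.frames.pop(old_frame)
        self.page_table[vpage] = new_frame


mmu = ToyMMU()
mmu.map(vpage=7, frame=100, data=b"payload")
before = mmu.read(7)
mmu.migrate(7, new_frame=42)
after = mmu.read(7)
print(before == after)  # True: same virtual page, different physical frame
```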
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



