Brain Dump

Memory Allocation

Tags
comp-arch

Memory allocation is the process of obtaining and managing memory from the heap at runtime. So-called memory allocators oversee this, including managing memory retrieved from the OS and re-purposing freed memory for new allocations. They must track which bytes are currently allocated and which are available for use.

#include <stddef.h>
#include <unistd.h>

void* malloc(size_t size) {
    // Ask the system for more bytes by extending the heap space.
    // sbrk returns (void *) -1 on failure.
    void *p = sbrk(size);
    if (p == (void *) -1)
        return NULL; // No space left
    return p;
}

void free(void *ptr) { /* Do nothing */ }
Code Snippet 1: A rudimentary malloc implementation that only fetches more memory and never frees anything. code:basic-malloc

In code:basic-malloc we've presented the simplest malloc implementation possible. This one has a few drawbacks:

  • System calls are slow compared to library calls, so we should instead reserve a large amount of memory up front and only occasionally ask the system for more (see code:chunked-malloc after this list).
  • Memory isn't reused when it's freed; the allocator just keeps asking for a bigger heap.
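
To address the first drawback we can request memory from the system in large chunks and hand out pieces of the current chunk, only calling sbrk again when the chunk runs out. Below is a minimal sketch of this idea; the chunk size, variable names, and the decision to simply abandon the unused tail of an exhausted chunk are arbitrary choices for illustration, and free still does nothing.

#include <stddef.h>
#include <unistd.h>

#define CHUNK_SIZE (64 * 1024) // ask the OS for 64 KiB at a time (arbitrary)

static char  *chunk_start = NULL; // front of the unused part of the current chunk
static size_t chunk_left  = 0;    // bytes of the current chunk still unused

void *malloc(size_t size) {
    // Only fall back to the (slow) system call when the current chunk
    // can't satisfy the request. Any unused tail of the old chunk is
    // simply abandoned, and alignment is ignored for simplicity.
    if (size > chunk_left) {
        size_t request = size > CHUNK_SIZE ? size : CHUNK_SIZE;
        void *p = sbrk(request);
        if (p == (void *) -1)
            return NULL; // No space left
        chunk_start = p;
        chunk_left  = request;
    }
    // Hand out the front of the chunk and bump past it.
    void *result = chunk_start;
    chunk_start += size;
    chunk_left  -= size;
    return result;
}

void free(void *ptr) { (void) ptr; /* Still does nothing */ }
Code Snippet 2: A chunked bump allocator that amortises the cost of sbrk by requesting memory in large blocks. code:chunked-malloc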

Placement Strategies

As a program keeps running it allocates and frees memory throughout its lifetime. Each free operation returns a block of memory to the allocator's pool, which can be reused by a later malloc call so long as the block is at least as large as the requested size.

In some cases there may be multiple ways to satisfy an allocation request. For example, if the heap has a 30KiB unallocated block followed by a 2KiB unallocated block and we receive a request for 2KiB, we may return the 2KiB block because it meets the requirement exactly, or we may split the 30KiB block in two and return the appropriate part.

These considerations determine the placement strategy of an allocator. You can see examples of such strategies in fig:malloc_placement_worst, fig:malloc_placement_first, fig:malloc_placement_best.

Figure 1: A worst fit storage placement strategy which just finds the largest hole of sufficient size and splits it into 2. fig:malloc_placement_worst

Figure 2: A first fit storage placement strategy which just finds the first hole that is of sufficient size. fig:malloc_placement_first

Figure 3: A best fit storage placement strategy which finds the smallest hole of sufficient size. fig:malloc_placement_best
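
To make the strategies above concrete, here is a sketch of how first fit, best fit and worst fit might be written as searches over an explicit free list. The free_block struct and the list layout are assumptions made purely for illustration; the implementation described later in this note uses an implicit list instead.

#include <stddef.h>

// A hypothetical node in an explicit free list: each hole records its
// usable size and a pointer to the next hole.
struct free_block {
    size_t size;
    struct free_block *next;
};

// First fit: return the first hole large enough for the request.
struct free_block *first_fit(struct free_block *head, size_t size) {
    for (struct free_block *b = head; b != NULL; b = b->next)
        if (b->size >= size)
            return b;
    return NULL;
}

// Best fit: scan the whole list and return the smallest hole that still
// fits, e.g. preferring the exact 2KiB hole over splitting the 30KiB one.
struct free_block *best_fit(struct free_block *head, size_t size) {
    struct free_block *best = NULL;
    for (struct free_block *b = head; b != NULL; b = b->next)
        if (b->size >= size && (best == NULL || b->size < best->size))
            best = b;
    return best;
}

// Worst fit: scan the whole list and return the largest hole that fits
// (the caller would then split it).
struct free_block *worst_fit(struct free_block *head, size_t size) {
    struct free_block *worst = NULL;
    for (struct free_block *b = head; b != NULL; b = b->next)
        if (b->size >= size && (worst == NULL || b->size > worst->size))
            worst = b;
    return worst;
}
Code Snippet 3: First-, best- and worst-fit searches over a hypothetical explicit free list. code:placement-strategies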

Considerations

  • Need to minimise fragmentation (maximise memory utilisation).
  • Need high performance.
  • Implementations tend to be fiddly, with lots of pointer manipulation using linked lists and pointer arithmetic.
  • Both fragmentation and performance depend on the allocation profile, which can be measured after the fact but not predicted in advance. In practice a special-purpose allocator can often out-perform a general-purpose implementation (see code:pool-alloc after this list).
  • The allocator doesn't know the program's memory allocation requests in advance.
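
As an illustration of the special-purpose point, below is a sketch of a fixed-size pool allocator. Because every object has the same (assumed) size it suffers no external fragmentation and both operations are constant-time pointer swaps; the sizes, names and layout here are all made up for the example.

#include <stddef.h>

#define OBJ_SIZE  64   // every allocation is exactly this big (assumed)
#define POOL_OBJS 1024 // number of objects the pool holds (assumed)

static _Alignas(max_align_t) char pool[OBJ_SIZE * POOL_OBJS]; // one static slab
static void *free_head = NULL; // singly linked list threaded through free slots
static int   initialised = 0;

static void pool_init(void) {
    // Thread every slot onto the free list: the first bytes of each free
    // slot store a pointer to the next free slot.
    for (int i = 0; i < POOL_OBJS; i++) {
        void **slot = (void **) (pool + i * OBJ_SIZE);
        *slot = (i + 1 < POOL_OBJS) ? (void *) (pool + (i + 1) * OBJ_SIZE) : NULL;
    }
    free_head = pool;
    initialised = 1;
}

void *pool_alloc(void) {
    if (!initialised)
        pool_init();
    if (free_head == NULL)
        return NULL;            // pool exhausted
    void *obj = free_head;
    free_head = *(void **) obj; // pop the head of the free list
    return obj;
}

void pool_free(void *obj) {
    *(void **) obj = free_head; // push the slot back onto the free list
    free_head = obj;
}
Code Snippet 4: A fixed-size pool allocator, an example of a special-purpose allocator with constant-time alloc and free. code:pool-alloc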

See some general tips on writing memory allocators here.

Example Implementation

Conceptually you can implement a memory allocator by maintaining a linked list of blocks, where each entry encompasses the actual memory returned to the caller. Each block consists of a small header containing metadata about the block (for example, whether it is free or allocated), the usable space of the block, and a boundary tag (which records the size of the previous block, so that a newly freed block can easily find and merge with the block before it when both are free). Instead of maintaining a pointer in each block to the next block, we simply lay blocks out contiguously so that where one block ends in memory the next one begins. This is possible because all of the heap memory is managed by the allocator and can therefore be treated as a single uniform chain of blocks.
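
A sketch of what that block layout might look like in C follows; the struct names and fields here are assumptions chosen for illustration, since the description above is only conceptual.

#include <stdbool.h>
#include <stddef.h>

// Metadata that prefixes every block's usable space.
typedef struct meta {
    size_t size;    // usable bytes in this block, excluding header and tag
    bool   is_free; // true if the block is currently unallocated
} meta;

// Boundary tag that terminates every block. It records the size of the
// block it ends, which from the *next* block's point of view is the size
// of the previous block, letting a newly freed block find the header of
// the block before it and merge with it when both are free.
typedef struct btag {
    size_t size;
} btag;

// In memory a single block is laid out as
//     [ meta | usable space (meta.size bytes) | btag ]
// and the next block's meta starts immediately after the btag.
Code Snippet 5: A possible header and boundary-tag layout for the blocks described above. code:block-layout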

Suppose we have a pointer p to the start of the heap, i.e. to the first block's header. We give p the type char * so that pointer arithmetic moves forward in bytes, even though it points at the metadata struct that prefixes the block (so in the expressions below, p->size stands for the size field of that header, which in real code requires casting p to the metadata type). We can move to the end of the metadata with p + sizeof(meta). We can move to the end of the usable space with p + sizeof(meta) + p->size. We can move to the end of the block (and the start of the next block) with p + sizeof(meta) + p->size + sizeof(btag).
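
Using the hypothetical meta and btag structs from code:block-layout, that arithmetic might look like the following. This is only a sketch of the navigation helpers, not the full allocator, and it ignores alignment concerns a real implementation would have to handle.

// p always points at the start of a block, i.e. at its header. It is a
// char * so the arithmetic below moves in bytes; we cast to meta * or
// btag * whenever we need to read a field.

char *block_data(char *p) {
    return p + sizeof(meta); // end of the header, start of the usable space
}

char *block_data_end(char *p) {
    meta *m = (meta *) p;
    return p + sizeof(meta) + m->size; // end of the usable space, start of the btag
}

char *next_block(char *p) {
    meta *m = (meta *) p;
    return p + sizeof(meta) + m->size + sizeof(btag); // end of block = next header
}

char *prev_block(char *p) {
    // The boundary tag sitting immediately before this block's header
    // tells us how big the previous block's usable space is.
    btag *t = (btag *) (p - sizeof(btag));
    return p - sizeof(btag) - t->size - sizeof(meta);
}
Code Snippet 6: Navigation helpers expressing the pointer arithmetic above. code:block-navigation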