In the beginning, there were stack buffer overflows everywhere. Overflowing data on the stack made for a quick and easy way to subvert a program to run code provided by an attacker. Initially, this meant simply overwriting the saved return address on the stack with the location of shellcode typically on the stack and perhaps prefaced by a NOP sled, depending on how accurately the attacker could predict addresses of data in the corrupted stack.
One of the initial responses to this was to mark stack memory as non-executable (NX), so that an attacker couldn’t simply point EIP to instructions dumped onto the stack. Naturally, as this is a cat-and-mouse game, attackers figured out that they could setup the stack to return into libraries, allowing useful attacker specified code to run without the need to introduce new instructions. Now, the combination of randomized memory (ASLR) and data execution prevention (DEP) make it increasingly difficult to exploit a stack overflow on a modern operating system.
The answer to this came in 2007 in one of my favorite technical papers titled, “The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86)” by Hovav Shacham. This groundbreaking paper (and subsequent Black Hat 2008 talk) lays out an expansion of return-to-libc exploitation with a new name, ‘Return Oriented Programming (ROP)’. The basic idea of ROP, like its predecessor ret-to-libc, is to chain together instructions in memory marked as executable using short instruction sequences ending in the x86 ret instruction.
Unlike returning to a library and using existing code as it was (in some ways) intended, ROP takes advantage of unintended code sequences occurring in a program. As described in the paper, the technique works particularly well on x86 architectures due to the lack of instruction alignment and the high-density of the instruction set. The x86 machine code is similar to a page of writing with no punctuation and therefore, can be read to mean any number of things depending on where you start reading.
For example, “To win friends and influence” could be re-read to make new English words like “Tow in friend sand influence”. Putting this into context of machine instructions, a binary may contain the following byte sequence within executable code: “806b891: 8b 15 58 c3 0c 08”. If the CPU starts reading at 0x806b891, the opcode 0x8b indicates data should be moved from EDX (0x15) to the address 0x80cc358. Starting reading at 0x806b893 the CPU would interpret “pop eax; ret”.
This is a fundamental building block for ROP, also known as a gadget. Research has shown that even simplistic programs with minimal functionality contain enough instructions added by the compiler to create a Turing complete gadget set.
The idea is that by searching for the ret opcode (0xC3) within x86 machine code, automated tools can disassemble backwards from that point to generate a list of useful ROP gadgets. For a gadget to be considered useful, it must allow the caller to manipulate registers and memory in a consistent and reliable manner. This means avoiding long sequences with undesirable pointer dereferences or register manipulations. Gadgets with more than four instructions will generally create headaches but can still be used in a pinch. Several tools are freely available for enumerating and classifying useful gadgets from a target binary (for example, http://www.ropshell.com/).
Earlier this month, Bas van den Berg, held a ROP primer workshop at the BSides London. Unfortunately, I was unable to attend his session, but luckily, I did get to chat a moment with Bas who was kind enough to give me the VM used in the workshop. It turned out that this proved to be an interesting way to pass a little extra time sitting in London’s airport while waiting on my rescheduled departure.
I’d now like to share this experience by walking through my quick and dirty approach to creating a ROP chain exploit for the provided vulnerable target.
The program itself (level0 on the VM) is quite simple. Although no source code was provided it was easy enough to find that user-supplied data from stdin was being directly copied onto a stack buffer without a bounds check. Some sample output is as follows:
Providing a longer input string causes an interesting segfault:
The “A” (0x41) values ended up overwriting the saved return address, causing a SEGFAULT when the CPU tried to fetch an instruction from the bogus address.
The next step in the process is just like with traditional stack overflow exploitation in that we need to identify the offset between the start of crafted input and the saved return address. This is most easily achieved using a pattern generator similar to:
We can confirm this as follows:
The segfault at deadbeef is enough to confirm that we have EIP control, so the next step is to determine how to setup a fake stack so that the program will execute shellcode. As expected (since this is a ROP exercise), the stack is not executable, so we will have to work with executable memory within the program itself.
I started this process by uploading the binary to ropshell.com for quick analysis. A complete analysis of the binary is available at http://ropshell.com/ropsearch?h=6785d72a7337d3ef7367bc9246f88d55. I picked the following gadgets to construct my ROP chain:
The end goal here is to setup the stack such that I can invoke the execve system call to get a root shell (the challenge binary is suid root). To do this, I will need to arrange some data structures in memory for the system call, load EAX with the execve syscall number (0xb), and then invoke a system call (int 0x80). Setting EAX and triggering the system call are trivial but creating the argument and environment data structures required for the system call requires a little more creativity.
Before getting into that however, I think it would be useful to demonstrate how a ROP gadget is consumed. As a simple example, we can set EAX with an arbitrary value 0x00031337 by crafting input such that the address of ‘pop eax; ret’ will be at EIP followed by the value we desire for EAX:
In the above figure, I have prepared a crafted input with a single stage ROP chain and set a breakpoint at the gadget. As you can see (with a little help from the GDB peda addon), control has reached the ‘pop eax’ and the value on the top of the stack is 0x31337 as desired. Stepping into the next instruction shows EAX is in fact loaded as expected:
In general, this is how ROP chaining works to create fake stack frames. Each gadget address is specified followed by any stack values needed for execution of the gadget and then, of course, the address of the next gadget in the chain.
While there are many different ways to spawn a shell through ROP, my approach was to pick a predictable address in the program’s data section and use this to arrange the needed data structures. My rw memory starts at 0x080ca660, so I started by loading this address into ECX and the filling a register (I used EAX) with the first 4 bytes of the filename I wanted to execute (/bin) and then moving EAX into the dereferenced address at ECX (my storage area).
This process is repeated, advancing ECX 4 bytes each time until the complete null terminated command filename is specified. I then used the next 4 bytes of my memory pool to store the address of where the filename is specified (this makes up the argument pointer). For the env pointer, I used similar logic. With the command address loaded into EBX, the execve call number in EAX, the argument pointer in ECX, and the envp pointer in EDX, it is now time to call the ‘int 0x80’ gadget to invoke the system call.
Exploiting the target now is just a matter of supplying the crafted input and then interacting with the shell it has spawned:
Title image courtesy of ShutterStock