Saturday, May 22, 2010

Stack-smashing remains fun and profitable

Buffer overflows have long been a rich source of security vulnerabilities. Unfortunately, its popularity led to a family of knee-jerk reactions: W^X; Data Execution Prevention (DEP); the NX (No eXecute) bit; executable space protection.

As the terms suggest, code is divorced from data. For example, an operating system may forbid execution of any instructions lying in the stack. A simple buffer exploit now causes the program to halt, rather than execute injected code.

My knee-jerk counter-reaction was suspicion and hostility. Shouldn’t we be focusing on the causes of buffer overflow disease, and not its symptoms? Also, some of the most beautiful and fundamental results in computer science involve feeding a Turing machine to a Turing machine. Data is code and code is data.

Self-modifying code considered awesome

Self-modifying code is fascinating by virtue of its self-referential nature, but it is not just a pretty face: it has practical applications. Just-in-time compilation is perhaps the best-known example. However, I find it most useful for nested functions in C.

Thomas Breuel describes how a C compiler can be modified to allow nested function via trampolining. Briefly, for each nested function, we generate its code as usual, but we also add a local variable to the scope where it is defined: we add an array of bytes, whose contents are opcodes that simply rig a pointer to make it look like we’re in the current stack frame before jumping to the nested function.

Standard C code expecting a function pointer still works when passed a pointer to this array. When they call the function pointer, they jump to the array, where they run the stack-frame-rigging code before continuing on with the call. By magic, the code in the nested function executes within the desired scope. One pointer does the work of two.

Observe we require the array to be local and populated at runtime, because only then do we know the address of the current stack frame. We cannot setup this tomfoolery in advance. In other words, we must execute code we just placed on the stack.

Like duct tape, treating data as code deftly solves a host of problems. W^X and friends block this avenue, or at least make it less efficient. Breuel writes:

"There are, however, some architectures and/or operating systems that forbid a program to generate and execute code at runtime. We consider this restriction arbitrary and consider it poor hardware or software design. Implementations of programming languages such as FORTH, Lisp, or Smalltalk can benefit significantly from the ability to generate or modify code quickly at runtime."

Return-oriented programming

Hovav Shacham recently filled me in on return-oriented programming. I’m a fan. I now have more than gut feelings to support my stance. I can gloat and say "I told you so".

Return-oriented programming skirts around stupid restrictions via one level of indirection. Instead of putting code in the stack, we put pointers to the code in the stack. We choose to point at code that soon runs into a return instruction. On typical architectures, a return instruction increments the stack pointer before jumping to the address to which it points. Hence the stack pointer becomes a sort of indirect instruction pointer: we run a snippet of code until it hits a return statement, which causes us to run the next snippet of code, and so on.

Thus using a compromised stack, attackers cleverly glue together snippets of code in executable parts of memory, such as the BIOS or the standard C library. On popular systems there’s enough stuff to do anything. Tools for automating the process exist; namely, you can write source and have it compile to a sequence of cherry-picked memory addresses. It’s the return-to-libc attack on steroids.

I had hoped return-oriented programming could cut both ways; does it allow self-modifying code even in the presence of an NX bit? Sadly, on further reflection, return-oriented programming appears to have no legitimate uses. Arbitrary code execution is only possbile within the stack frame of a function we control, and, for example, we still have no means of adjusting the static link pointer when qsort() invokes our nested function. Hopefully I overlooked a sneaky trick.

In short, thanks to return-oriented programming, executable space protection is a minor inconvenience for the bad guys, and a major inconvenience for the good guys.