Having phun with Symbolic Execution (SE)

This year at 30c3 I gave a talk about triggering deep vulnerabilities using symbolic execution. The talk gives an overview about symbolic execution and shows how you can use symbolic execution as a new tool in your reverse engineering toolbox. Let's say you have a specific location in a binary and you want to know what kind of input can trigger a specific fault at that location. If you configure it correctly then symbolic execution can help you find that input.

This post starts with a general description of what symbolic execution is and then goes into detail on how to install FuzzBALL, our symbolic execution engine. In the end I present some examples how you can use FuzzBALL (which is a fairly complex tool and new users usually have a rough start).

Symbolic Execution (SE)

Symbolic Execution, or SE for short is an alternate way to execute code (programs, binaries, drivers, or even full operating systems). Instead of using specific concrete values, SE uses abstract, symbolic values for register and memory values. Operations on these values are carried along the symbolic execution of the code. Think of symbolic execution as a form of execution that concatenates the different computations on a value into a long formula.

At decision points or control flow transfers that depend on symbolic values (e.g., conditional jumps when the flags depend on symbolic values, or indirect control flow transfer when the address or pointer is symbolic) symbolic execution must follow all possible paths. For conditional jumps, the symbolic execution engine splits the execution into two paths. On one path the SE engine assumes that the branch condition is true and on the other path the SE engine assumes a false branch condition, extending the symbolic formulas with this information.

The SE engine proves that a specific condition is true at a specific location. When using SE, the programmer encodes a set of conditions that should be reached at one or more locations. If the SE engine finds a symbolic input that satisfies the condition then it can calculate one or more possible concrete inputs that trigger the vulnerability condition by solving the path formulas.

A common problem of SE engines is the limited scalability. Due to the exponential path explosion explained above (at each conditional branch that depends on symbolic input the number of paths is doubled) the number of symbolic input bytes and the amount of code that can be executed symbolically is limited.

Different SE engines

Several different SE engines exist and they are all tailored to different use cases. FuzzBALL is a SE engine that can be used to execute arbitrary binary code. At the core, code is translated into a high-level intermediate representation (using VEX and Vine) and conditions can be encoded on the binary level. FuzzBALL is used to generate proof of concept exploits (an input that triggers a bug) given a vulnerability condition in the application. KLEE is a SE engine that compiles programs into LLVM intermediate representation (therefore relies on the source code of the application) and tests if it can trigger a vulnerability condition. KLEE is used to find bugs in applications. S2E builds on KLEE and selectively executes large systems symbolically. The goal of S2E is similar to KLEE but on a larger scale (in the amount of code that is executed in one instance, KLEE can be used for large scale analysis where a large set of smaller programs, e.g., the binutils can be analyzed), with its main contributions in choosing which path to analyze. Next to these three systems there is some prior and a lot of follow up work. The goal of this blog post is not to go into too much details but the interested reader can take the FuzzBALL, KLEE, and S2E papers as a start and explore from there.

I just highlighted 3 different SE engines and there are many more. In addition, many engines support different SMT sovlers that can also be tailored to different use cases. Just the question of choosing the right SE engine for a task would merit a couple of blog posts.

Setting up an SE environment

For the rest of this blog post we will concentrate on the FuzzBALL symbolic execution engine (the one I know best). FuzzBALL currently runs on any x86-32 system (or x86-64 with x86-32 backward compatibility libraries) and is easy to set up on any Debian-based host. You can also follow the great INSTALL file in the FuzzBALL repository.

Use the following magic commands to install the necessary build tools:

$ sudo apt-get install build-essential valgrind qemu binutils-multiarch \
                       binutils-dev ocaml ocaml-findlib libgdome2-ocaml-dev \
                       camlidl libextlib-ocaml-dev ocaml-native-compilers \
                       libocamlgraph-ocaml-dev libsqlite3-ocaml-dev \
                       subversion git libgmp3-dev automake
$ sudo apt-get build-dep binutils-multiarch ocaml

Download a set of necessary repositories:

$ cd ~; mkdir SE; cd SE
$ git clone https://github.com/bitblaze-fuzzball/fuzzball
$ svn co -r2737 svn://svn.valgrind.org/vex/trunk vex-r2737
$ cd vex-r2737; make -f Makefile-gcc
$ cd ..
$ svn co -r1673 https://svn.code.sf.net/p/stp-fast-prover/code/trunk/stp  stp-r1673
$ cd stp-r1673; patch -p0 <../fuzzball/stp/stp-r1668-true-ce.patch
$ ./clean-install.sh --with-prefix=$(pwd)/install
$ cp install/bin/stp install/lib/libstp.a ../fuzzball/stp
$ cd ../fuzzball
$ ./autogen.sh
$ ./configure --with-vex=`pwd`/../vex-r2737
$ make

And you should be done and have a readily compiled version of FuzzBALL with all the libraries you will need for x86-32 symbolic execution.

Now download and install the examples from here and unpack them in our playground:

$ cd ~/SE; mkdir playground; cd playground
$ wget -c https://nebelwelt.net/blog/static/2014/0114/SE-testbed.tar.bz2
$ tar -xvjf SE-testbed.tar.bz2

Our first symbolically executed piece of code

Now after we have initialized and compiled our execution environment it's time to execute our first piece of code symbolically. Enter the SE directory and change to the first code example:

$ cd ~/SE/playground/simple
$ make
$ ./run_se.sh

What's actually happening there? Well, let's start with the actual code:

#define MAGIC 42
// add some concrete values into the process memory
char arr[256];
for (i=0; i<256; i++)
    arr[i] = i;
// buf0[0] is symbolic
// copy the concrete value from arr[symb] into buf0[42]
buf0[42] = arr[buf0[0]];

// check if the copied value was the magic value defined above
if (buf0[42] == MAGIC) {
            printf("Correctly recovered value 42\n");
}

So, if we assume that buf0 is symbolic and somewhat input dependent then what kind of values do we need to place in buf0[0] to trigger the printf statement. This example may look a little bit hypothetical and constructed (and it is) but it illustrates an interesting use case for FuzzBALL. We have some (binary) program and a pre-defined condition that we want to trigger. The FuzzBALL SE engine is then used to produce a concrete input (e.g., a specific memory buffer, or a file) that triggers the pre-defined condition.

FuzzBALL supports a large set of configuration and runtime options and I invite you to look at them using ./fuzzball/exec_utils/fuzzball -h. In our example directory we have conveniently set up a config file and a run script that then uses this config file. In the config file we specify the locations and basic settings for FuzzBALL. In the run script we specify the runtime configuration for FuzzBALL. For the simple example we use the following parameters:

-random-seed $NUMBER (FussBALL uses random values, a common seed makes the choices reproducible)
--solver stpvc (we use the STP constraint solver)
-linux-syscalls (we emulate linux system calls)
-fuzz-start-addr (concrete execution until SE starts at this address)
-fuzz-end-addr (SE ends at this address)
-symbolic-region 0xc0fe0000+1 (one symbolic byte at address 0xc0fe0000)
-check-condition-at (encodes the location and the condition)
-finish-on-nonfalse-cond (let's stop SE if we find a satisfying input)

The most important arguments are fuzz-start-addr, fuzz-end-addr as they set the constraints where the magic happens and symbolic-region which defines what values are tracked symbolically and not concrete.

The condition is checked at a specific location (or at multiple location) and can check the values of registers or memory locations. In the simple example we use "0xc0de:R_EAX:reg32_t == 0x2A:reg32_t". This condition checks if the 32-bit register EAX contains the 32-bit value 0x2A when the instruction pointer is at the location 0xc0de. You can play around with the conditions and look at the machine code of the simple example as well (objdump -d ./simple). Play around with some FuzzBALL options and try to trigger different values or different conditions.

Getting a little help for some Vortex wargames

The Vortex wargames are an awesome way to learn more about system security. In a bunch of different levels you learn about different aspects of system security and software vulnerabilities. You have to understand the concepts, write some code, and exploit these vulnerabilities to reach levels higher up. I really recommend that you try them out to see how far you can go.

For some of the levels SE can be used to simplify the process of exploiting the given vulnerabilities (some jokingly say without even understanding the core vulnerability). In this part we look at Vortex level 01 where we have a simple program that reads some input from stdin:

unsigned char buf[512];
unsigned char *ptr = buf + (sizeof(buf)/2);
...
while((x = getchar()) != EOF) {
  switch(x) {
    case '\n': print(buf, sizeof(buf)); continue; break;
    case '\\': ptr--; break;
    default: e(); if(ptr > buf + sizeof(buf)) continue;
             ptr++[0] = x; break;
   }
 }

The experienced hacker will see that this code defines a simple state machine that can be used to trigger the vulnerability condition hidden in e(). But let's try not to think too hard and use a brute-force SE approach.

We setup the SE engine in playground/vortex:

$ cd ~/SE/playground/vortex
$ make
$ ./run_se.sh

If your GCC decides to use a weird stack layout (well, some GCC optimizations tend to reorder the variables on the stack, mitigating the intended exploit) then you can use my precompiled versions of the binaries (static-vortex and run_se-static.sh).

We have modified the original vortex source code in some small places. The sestart and sestop function calls are added to make the binary grepable (the only purpose is that your gcc might generate different code and the call to these non inlineable functions allows me to more conveniently search for the SE start address). Also, we made the buffer smaller, reducing the hardness of the problem. We can then use the SE generated result to abstract the solution and adapt it to larger buffers as well (or we let the SE run for a longer amount of time). This example already gives us a good overview of the scalability issues that SE faces and if we use more than 10-20 symbolic bytes then we reach the scalability boundaries.

I don't want to spoil the fun too much and give you the exact results, so I hope that you play around a little with the binaries and enjoy solving this vortex level.

Splitting SE along transformation boundaries: PDF encodings

The last example takes a real program that we have only in binary form. We look at a binary that decodes a HEX string first and then does a RLE decompression on the output of the first transformation. The code would look a little bit like:

ASCIIHexDecode(buf0, len0, buf1, 4096);
if (RunLengthDecode(buf1, len1, buf2, 4096) != -1) {
  if (strncmp(argv[3], (char*)buf2, strlen(argv[3])) == 0) {
     buf0[len0] = 0;
     buf1[len1] = 0;
     buf2[strlen(argv[3])] = 0;
     printf("Correctly recovered str\n");
  }
}

In the binary we have two transformations: one that decodes HEX strings and one that decompresses RLE encoded buffers. These transformations are implemented just like the PDF specification recommends them and the fun thing is that PDF allows recursive encodings, so an object inside a PDF file can be encoded using a set of transformations and may include other PDFs as well.

Reversing all transformations in one large step is usually not feasible for SE because of the state explosion. There are just too many symbolic bytes to carry along and too much state that accumulates. But single transformations are completely in the scope of transformations. In the HICFG paper, the first TR, or the second TR we define an analysis technique that allows a concretization of symbolic values at transformation boundaries, minimizing the state that needs to be carried from one transformation to the other. If you are interested in the details then I'd advise you to read the papers. On a high level, we split a large single SE computation into several smaller SE computations with intermediate concrete execution steps.

Looking at our binary we can define two transformations, reversing the RLE encoding first and the HEX encoding in a second step. This allows us to define a target string that is then RLE compressed and HEX encoded using the RLE decompression code and the HEX decoding code.

You can execute the example as follows:

$ cd ~/SE/playground/hexrle
$ ./run_se.sh

Again, I invite you to look at the individual files, play around with them, peek into the binaries using objdump and have fun with the SE engine.

Wrapping up

In this blog post I've given you a gentle introduction into symbolic execution, how you can use symbolic execution to reverse engineer a given input that triggers a condition of your choosing and how you can optimize the SE process to scale to a larger amount of symbolic input bytes. I hope you enjoyed the read! I'm always happy about comments, feedback, or questions. You can find me on twitter or just send me a mail.