This year at 30c3 I
gave a talk about triggering deep vulnerabilities using symbolic execution. The talk gives an overview
about symbolic execution and shows how you can use symbolic execution as a new
tool in your reverse engineering toolbox. Let's say you have a specific
location in a binary and you want to know what kind of input can trigger a
specific fault at that location. If you configure it correctly then symbolic
execution can help you find that input.
This post starts with a general description of what symbolic execution is and
then goes into detail on how to install FuzzBALL, our symbolic execution
engine. In the end I present some examples how you can use FuzzBALL (which is a
fairly complex tool and new users usually have a rough start).
Symbolic Execution (SE)
Symbolic Execution, or SE for short is an alternate way to execute code
(programs, binaries, drivers, or even full operating systems). Instead of using
specific concrete values, SE uses abstract, symbolic values for register and
memory values. Operations on these values are carried along the symbolic
execution of the code. Think of symbolic execution as a form of execution that
concatenates the different computations on a value into a long formula.
At decision points or control flow transfers that depend on symbolic values
(e.g., conditional jumps when the flags depend on symbolic values, or indirect
control flow transfer when the address or pointer is symbolic) symbolic
execution must follow all possible paths. For conditional jumps, the symbolic
execution engine splits the execution into two paths. On one path the SE engine
assumes that the branch condition is true and on the other path the SE engine
assumes a false branch condition, extending the symbolic formulas with this
information.
The SE engine proves that a specific condition is true at a specific location.
When using SE, the programmer encodes a set of conditions that should be reached
at one or more locations. If the SE engine finds a symbolic input that
satisfies the condition then it can calculate one or more possible concrete
inputs that trigger the vulnerability condition by solving the path formulas.
A common problem of SE engines is the limited scalability. Due to the
exponential path explosion explained above (at each conditional branch that
depends on symbolic input the number of paths is doubled) the number of
symbolic input bytes and the amount of code that can be executed symbolically
is limited.
Different SE engines
Several different SE engines exist and they are all tailored to different use
cases. FuzzBALL is a SE
engine that can be used to execute arbitrary binary code. At the core, code is
translated into a high-level intermediate representation (using VEX and Vine)
and conditions can be encoded on the binary level. FuzzBALL is used to generate
proof of concept exploits (an input that triggers a bug) given a vulnerability
condition in the application. KLEE is a SE
engine that compiles programs into LLVM intermediate representation
(therefore relies on the source code of the application) and tests if it can
trigger a vulnerability condition. KLEE is used to find bugs in applications.
S2E builds on KLEE and selectively executes
large systems symbolically. The goal of S2E is similar to KLEE but on a larger
scale (in the amount of code that is executed in one instance, KLEE can be used
for large scale analysis where a large set of smaller programs, e.g., the
binutils can be analyzed), with its main contributions in choosing which path
to analyze. Next to these three systems there is some prior and a lot of follow
up work. The goal of this blog post is not to go into too much details but the
interested reader can take the FuzzBALL, KLEE, and S2E papers as a start and
explore from there.
I just highlighted 3 different SE engines and there are many more. In addition,
many engines support different SMT sovlers that can also be tailored to
different use cases. Just the question of choosing the right SE engine for a
task would merit a couple of blog posts.
Setting up an SE environment
For the rest of this blog post we will concentrate on the FuzzBALL symbolic
execution engine (the one I know best). FuzzBALL currently runs on any x86-32
system (or x86-64 with x86-32 backward compatibility libraries) and is easy to
set up on any Debian-based host. You can also follow the great INSTALL file in
the FuzzBALL repository.
Use the following magic commands to install the necessary build tools:
$ sudo apt-get install build-essential valgrind qemu binutils-multiarch \
binutils-dev ocaml ocaml-findlib libgdome2-ocaml-dev \
camlidl libextlib-ocaml-dev ocaml-native-compilers \
libocamlgraph-ocaml-dev libsqlite3-ocaml-dev \
subversion git libgmp3-dev automake
$ sudo apt-get build-dep binutils-multiarch ocaml
Download a set of necessary repositories:
$ cd ~; mkdir SE; cd SE
$ git clone https://github.com/bitblaze-fuzzball/fuzzball
$ svn co -r2737 svn://svn.valgrind.org/vex/trunk vex-r2737
$ cd vex-r2737; make -f Makefile-gcc
$ cd ..
$ svn co -r1673 https://svn.code.sf.net/p/stp-fast-prover/code/trunk/stp stp-r1673
$ cd stp-r1673; patch -p0 <../fuzzball/stp/stp-r1668-true-ce.patch
$ ./clean-install.sh --with-prefix=$(pwd)/install
$ cp install/bin/stp install/lib/libstp.a ../fuzzball/stp
$ cd ../fuzzball
$ ./autogen.sh
$ ./configure --with-vex=`pwd`/../vex-r2737
$ make
And you should be done and have a readily compiled version of FuzzBALL with all
the libraries you will need for x86-32 symbolic execution.
Now download and install the examples from here and unpack
them in our playground:
$ cd ~/SE; mkdir playground; cd playground
$ wget -c https://nebelwelt.net/blog/static/2014/0114/SE-testbed.tar.bz2
$ tar -xvjf SE-testbed.tar.bz2
Our first symbolically executed piece of code
Now after we have initialized and compiled our execution environment it's time
to execute our first piece of code symbolically. Enter the SE directory and
change to the first code example:
$ cd ~/SE/playground/simple
$ make
$ ./run_se.sh
What's actually happening there? Well, let's start with the actual code:
#define MAGIC 42
// add some concrete values into the process memory
char arr[256];
for (i=0; i<256; i++)
arr[i] = i;
// buf0[0] is symbolic
// copy the concrete value from arr[symb] into buf0[42]
buf0[42] = arr[buf0[0]];
// check if the copied value was the magic value defined above
if (buf0[42] == MAGIC) {
printf("Correctly recovered value 42\n");
}
So, if we assume that buf0 is symbolic and somewhat input dependent then what
kind of values do we need to place in buf0[0] to trigger the printf statement.
This example may look a little bit hypothetical and constructed (and it is) but
it illustrates an interesting use case for FuzzBALL. We have some (binary)
program and a pre-defined condition that we want to trigger. The FuzzBALL SE
engine is then used to produce a concrete input (e.g., a specific memory
buffer, or a file) that triggers the pre-defined condition.
FuzzBALL supports a large set of configuration and runtime options and I invite
you to look at them using ./fuzzball/exec_utils/fuzzball -h. In our example
directory we have conveniently set up a config file and a run script that then
uses this config file. In the config file we specify the locations and basic
settings for FuzzBALL. In the run script we specify the runtime configuration
for FuzzBALL. For the simple example we use the following parameters:
-random-seed $NUMBER (FussBALL uses random values, a common seed makes the choices reproducible)
--solver stpvc (we use the STP constraint solver)
-linux-syscalls (we emulate linux system calls)
-fuzz-start-addr (concrete execution until SE starts at this address)
-fuzz-end-addr (SE ends at this address)
-symbolic-region 0xc0fe0000+1 (one symbolic byte at address 0xc0fe0000)
-check-condition-at (encodes the location and the condition)
-finish-on-nonfalse-cond (let's stop SE if we find a satisfying input)
The most important arguments are fuzz-start-addr, fuzz-end-addr as they
set the constraints where the magic happens and symbolic-region which defines
what values are tracked symbolically and not concrete.
The condition is checked at a specific location (or at multiple location) and
can check the values of registers or memory locations. In the simple example we
use "0xc0de:R_EAX:reg32_t == 0x2A:reg32_t". This condition checks if the
32-bit register EAX contains the 32-bit value 0x2A when the instruction pointer
is at the location 0xc0de. You can play around with the conditions and look at
the machine code of the simple example as well (objdump -d ./simple). Play
around with some FuzzBALL options and try to trigger different values or
different conditions.
Getting a little help for some Vortex wargames
The Vortex wargames are an awesome way
to learn more about system security. In a bunch of different levels you learn
about different aspects of system security and software vulnerabilities. You
have to understand the concepts, write some code, and exploit these
vulnerabilities to reach levels higher up. I really recommend that you try them
out to see how far you can go.
For some of the levels SE can be used to simplify the process of exploiting the
given vulnerabilities (some jokingly say without even understanding the core
vulnerability). In this part we look at Vortex level 01 where we have a simple
program that reads some input from stdin:
unsigned char buf[512];
unsigned char *ptr = buf + (sizeof(buf)/2);
...
while((x = getchar()) != EOF) {
switch(x) {
case '\n': print(buf, sizeof(buf)); continue; break;
case '\\': ptr--; break;
default: e(); if(ptr > buf + sizeof(buf)) continue;
ptr++[0] = x; break;
}
}
The experienced hacker will see that this code defines a simple state machine
that can be used to trigger the vulnerability condition hidden in e(). But
let's try not to think too hard and use a brute-force SE approach.
We setup the SE engine in playground/vortex:
$ cd ~/SE/playground/vortex
$ make
$ ./run_se.sh
If your GCC decides to use a weird stack layout (well, some GCC optimizations
tend to reorder the variables on the stack, mitigating the intended exploit)
then you can use my precompiled versions of the binaries (static-vortex and
run_se-static.sh).
We have modified the original vortex source code in some small places. The
sestart and sestop function calls are added to make the binary grepable (the
only purpose is that your gcc might generate different code and the call to
these non inlineable functions allows me to more conveniently search for the SE
start address). Also, we made the buffer smaller, reducing the hardness of the
problem. We can then use the SE generated result to abstract the solution and
adapt it to larger buffers as well (or we let the SE run for a longer amount of
time). This example already gives us a good overview of the scalability issues
that SE faces and if we use more than 10-20 symbolic bytes then we reach the
scalability boundaries.
I don't want to spoil the fun too much and give you the exact results, so I
hope that you play around a little with the binaries and enjoy solving this
vortex level.
Splitting SE along transformation boundaries: PDF encodings
The last example takes a real program that we have only in binary form. We look
at a binary that decodes a HEX string first and then does a RLE decompression
on the output of the first transformation. The code would look a little bit
like:
ASCIIHexDecode(buf0, len0, buf1, 4096);
if (RunLengthDecode(buf1, len1, buf2, 4096) != -1) {
if (strncmp(argv[3], (char*)buf2, strlen(argv[3])) == 0) {
buf0[len0] = 0;
buf1[len1] = 0;
buf2[strlen(argv[3])] = 0;
printf("Correctly recovered str\n");
}
}
In the binary we have two transformations: one that decodes HEX strings and one
that decompresses RLE encoded buffers. These transformations are implemented
just like the PDF specification recommends them and the fun thing is that PDF
allows recursive encodings, so an object inside a PDF file can be encoded using
a set of transformations and may include other PDFs as well.
Reversing all transformations in one large step is usually not feasible for SE
because of the state explosion. There are just too many symbolic bytes to carry
along and too much state that accumulates. But single transformations are
completely in the scope of transformations. In the HICFG paper, the first TR, or the second TR we define an analysis
technique that allows a concretization of symbolic values at transformation
boundaries, minimizing the state that needs to be carried from one
transformation to the other. If you are interested in the details then I'd
advise you to read the papers. On a high level, we split a large single SE
computation into several smaller SE computations with intermediate concrete
execution steps.
Looking at our binary we can define two transformations, reversing the RLE
encoding first and the HEX encoding in a second step. This allows us to define
a target string that is then RLE compressed and HEX encoded using the
RLE decompression code and the HEX decoding code.
You can execute the example as follows:
$ cd ~/SE/playground/hexrle
$ ./run_se.sh
Again, I invite you to look at the individual files, play around with them,
peek into the binaries using objdump and have fun with the SE engine.
Wrapping up
In this blog post I've given you a gentle introduction into symbolic execution,
how you can use symbolic execution to reverse engineer a given input that
triggers a condition of your choosing and how you can optimize the SE process
to scale to a larger amount of symbolic input bytes. I hope you enjoyed the
read! I'm always happy about comments, feedback, or questions. You can find me
on twitter or just send me a mail.