The last three weeks I've been traveling through China, Hong Kong, and Macau on
an interesting security tour thanks to this year's AsiaCCS being held in Xi'an,
China. AsiaCCS was right after Oakland, so I flew directly from San Francisco to
Xi'an, China, and then continued on to visit friends in Beijing, Shanghai, and Hong
Kong/Macau. Overall, Asia has been a breathtaking experience with an immense
set of impressions that I'll try to summarize in a country-specific blog post.
Here, I'll focus on the research aspects and the conference.
AsiaCCS is in the process of migrating from a symposium to a full conference.
While not exactly in the big four, the conference is still fairly competitive
and has a tendency to accept good papers that just did not make it at the big
four. One of the challenges, shared with CCS proper, is that there is no
physical PC meeting and therefore no overall quality control and discussion of
the papers. This shows, in my opinion, in the slightly higher randomness in
paper selection that we see each year compared to conferences that have real
on-site PC meetings.
This was my first AsiaCCS and I left with an OK impression. The conference was
fairly well organized with an exciting dinner and great people to talk to.
Compared to the big four, AsiaCCS has the problem that many people only attend
for a day and then head off for sightseeing, so if you want to meet with others
you have to actively coordinate (compared to other conferences where you'll just
walk into each other by accident as, let's be honest, San Jose does not have
that much to offer for tourists). On the downside, Internet access in China is a
mess, with lots of sites blocked, connections timing out, and even VPN sessions
being randomly killed and subverted by the Great Firewall. After a couple of
annoying weeks I found out that an SSH SOCKS proxy works much better than OpenVPN.
On the paper side, I attended all interesting sessions and also asked my fair
share of questions. There were a bunch of interesting papers and keynotes that
I'll discuss below. In general, I enjoyed the diverse keynotes, especially
Michael Backes' call for privacy research and Giovanni Vigna's shameless plug
for angr and collaborative systems research.
TouchAlytics
Let me begin this blog post with a shameless plug of our TouchAlytics work. We
finally published our TouchAlytics paper at AsiaCCS
(slides)
and I was the only author who was willing and managed to get a visa for China.
In our paper we propose a forgery-resistant touch-based authentication method
that uses how people react and adapt to different environments as biometrics
instead of something people "have" as in classical biometrics.
Our authentication method samples a user in different environments (that we
control) and then uses this information to subtly and continuously change the
underlying environment. As the user adapts her behavior, she is authenticated
against the different profiles we collected. As attackers do not know what
environment is used during the authentication, they cannot forge an
authentication, even with perfect information of all possible environments.
In our prototype we add an adaptive layer between the touchscreen sensor and the
display that allows us to stretch individual strokes in both dimensions. The
application therefore receives slightly different strokes than the user executes
on the touchscreen. Due to the different app behavior, the user will adapt her
strokes accordingly and we use this adaptation to identify and authenticate the
user based on the slight variances. Our authentication framework is both stable
and sensitive, i.e., it allows us to differentiate between different settings
for a single user and between different users. This work moves biometrics from
a "what you have" to a "how you react"-based authentication.
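To make the idea concrete, here is a toy sketch (not the implementation from the paper). The profile values, the single "stroke length" feature, the stretch factors, and the tolerance are all invented for illustration. The verifier secretly picks an environment and only accepts a stroke that matches the profile enrolled for that environment.

```python
import random

# Hypothetical stretch factors the adaptive layer can apply.
ENVIRONMENTS = (0.8, 1.0, 1.2)

# Hypothetical enrolled profiles: for each environment, the mean stroke
# length (in pixels) the user produced during enrollment.
PROFILES = {
    "alice": {0.8: 260.0, 1.0: 210.0, 1.2: 175.0},
    "bob": {0.8: 300.0, 1.0: 250.0, 1.2: 230.0},
}
TOLERANCE = 10.0  # assumed acceptance threshold in pixels


def pick_environment(rng):
    """The verifier secretly picks which stretch factor is active."""
    return rng.choice(ENVIRONMENTS)


def authenticate(user, environment, observed_length):
    """Accept only if the stroke matches the profile of the active environment."""
    expected = PROFILES[user][environment]
    return abs(observed_length - expected) <= TOLERANCE


rng = random.Random(0)
env = pick_environment(rng)

# The genuine user adapts to the active environment (small natural variance).
genuine = authenticate("alice", env, PROFILES["alice"][env] + 3.0)

# A forger replays a perfect stroke recorded in a different environment.
other_env = next(e for e in ENVIRONMENTS if e != env)
forged = authenticate("alice", env, PROFILES["alice"][other_env])
```

In this toy setup the genuine stroke is accepted and the replayed one is rejected, because the forger cannot know which environment is active.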
An angr'y keynote
The best talk at AsiaCCS was Giovanni's angr'y keynote in my opinion. Based on
the premise that hacking is awesome, Giovanni and his group want to automate
awesomeness. Hacking can be manifold and can involve hacking the user through
social engineering, hacking the process through weak password resets, weak PINs,
or bruteforce attacks, or hacking the code. Hacking the code is the most
involved as actual knowledge and intelligence are needed. The question is whether
we can incorporate this domain knowledge and intelligence into a tool. Angr is a
framework that tries to achieve that.
On the one hand, binary code is incredibly difficult to analyze as it has a (very)
low abstraction level, no structured types, no modules, and no defined functions.
In addition, compiler optimizations make code very complex. On the other hand,
binary code is truthful: what you see is what you execute. In manual
vulnerability analysis, a very intelligent person stares at the code and sees
what she can find. This approach discovers deep and complex vulnerabilities but
does not scale. The holy grail of vulnerability research is a magic tool that,
when run, finds the vulnerability and develops a patch/exploit for it.
Automatic vulnerability analysis systems have a high level policy and try to
force violations. Such an approach requires replayability, i.e., the ability to
generate attack instances. These systems try to generate inputs that, when fed
to the program, generate a violation. Such a violation is then a proof-of-concept
exploit (depending on the high level policy). An orthogonal aspect is semantic
insight, i.e., the ability to understand the root cause of the crash which will
allow the attacker to abstract and generalize from the single fault.
Automatic vulnerability analysis systems face a set of trade-offs: high
replayability implies low coverage; low replayability implies false positives;
semantic insight implies high overhead; and combining replayability with
semantic insight implies low scalability and a lack of soundness, which results
in false negatives. Heuristics therefore need to balance these options to
achieve good results. Both static and dynamic analyses can be used to
explore the search space.
Static analysis has the advantage of high coverage but is complex and runs into
the aliasing problem. Dynamic analysis, on the other hand, has high replayability
and does not worry about aliasing but runs into coverage problems. So far, angr
has focused on being a binary analysis toolkit, providing static analysis and
symbolic execution. For the DARPA Cyber Grand Challenge, the UC Santa Barbara
folks extended angr and combined angr and AFL into Driller. Surprisingly, fuzzing
is the most effective technique at finding bugs. Generating random inputs and
feeding those into a program discovers a large number of vulnerabilities but has
the tendency to get stuck with limited coverage. AFL tries to address the
coverage problem through a path-guided analysis that records which paths were
already evaluated and forces input mutations to evaluate alternate paths. In
Driller, whenever AFL gets stuck it evaluates the paths using symbolic execution
to find alternate inputs that trigger new paths.
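The hand-off between fuzzing and symbolic execution can be sketched with a toy loop. This is neither AFL nor Driller: the target program, the mutation strategy, and `solve_stuck_branch` are all hypothetical. In particular, `solve_stuck_branch` merely returns an input that satisfies the magic-value check, standing in for what a symbolic execution engine would compute.

```python
import random

MAGIC = b"\xde\xad\xbe\xef"


def target(data):
    """Toy program under test; returns the set of branch IDs it executed."""
    covered = {"entry"}
    if len(data) >= 4:
        covered.add("len_ok")
        if data[:4] == MAGIC:  # hard to hit with single-byte mutations
            covered.add("magic_ok")
            if len(data) > 4 and data[4] > 0x7F:
                covered.add("bug")
    return covered


def mutate(data, rng):
    """Overwrite one random byte or append a random byte."""
    buf = bytearray(data)
    if buf and rng.random() < 0.5:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)


def solve_stuck_branch(seed_input):
    """Stand-in for symbolic execution: satisfy the magic comparison."""
    return MAGIC + seed_input[4:]


def fuzz(rounds=2000, stuck_after=200, seed=0):
    rng = random.Random(seed)
    queue = [b"AAAAA"]            # inputs that produced new coverage
    seen = set(target(queue[0]))  # coverage observed so far
    since_new = 0
    for _ in range(rounds):
        candidate = mutate(rng.choice(queue), rng)
        new = target(candidate) - seen
        if new:
            seen |= new
            queue.append(candidate)
            since_new = 0
        else:
            since_new += 1
        if since_new >= stuck_after:  # fuzzer is stuck: ask the "solver"
            solved = solve_stuck_branch(queue[-1])
            if target(solved) - seen:
                seen |= target(solved)
                queue.append(solved)
            since_new = 0
    return seen
```

Without the hand-off, the fuzzer in this sketch never gets past the magic-value check; with it, the solved input becomes a new seed and ordinary mutation can then reach the branch behind the check.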
In the later part of the keynote, Giovanni also talked about some details of the
Cyber Grand Challenge: infrastructure availability (never segfault your
infrastructure), analysis scalability (how to cope with limited resources), and
the performance/security trade-off.
In our current system the attacker is at an inherent advantage as it takes one
single vulnerability to bring down a system but the defender needs to cover all
bases. We need to move forward as a community to provide better analysis tools
and better general defense techniques that actually hold up to attacks.
Automatic Dynamic Firmware Analysis at Scale: A Case Study on Embedded Web Interfaces
In this work, Andrei Costin, Apostolis Zarras, and Aurelien Francillon extend
their framework for automatic firmware collection and extraction. In addition to
searching for simple bugs using pattern matching, they now force-run the images
inside QEMU and exercise the web interface in the image. Their existing
infrastructure already collects information and extracts the individual files of
the firmware. They have now built a QEMU emulator that runs some of the binaries
in the firmware. Many IoT devices like routers primarily run a single suid/root
binary that encapsulates all the functionality of the device.
After getting the binary to run through some hackery, they run basic
vulnerability discovery and penetration testing tools to find vulnerabilities in
the services, with good results. A problem they ran into was that those
service binaries often make hardware-specific calls and hit kernel issues that
reduce their coverage.
In the last year we (Craig West, Jacek Rzeniewicz, and I) have looked at a
similar problem. We got stuck at the same location, where the service binaries
were calling into the kernel or reading/writing privileged flash areas that we
could not easily emulate. Also, instead of running simple penetration testing
tools it would be much more interesting to run something like AFL. We have tried
integrating AFL into our own QEMU-based framework (yeah, we went down almost the
same path in our research) but could not get the path-based feedback to work and
AFL was therefore limited in the input it could generate. This might be an
interesting project to continue, so if anyone is interested, please reach out.
Preventing Page Faults from Telling your Secrets
For SGX, the operating system manages individual memory pages of enclaves. This
enables a side channel where the OS restricts the number of pages an enclave
gets and learns which pages are accessed by observing the resulting page faults.
Shweta Shinde, Zheng Leong Chua, Viswesh Narayanan, and Prateek Saxena present a
compiler-based defense against such pigeon-hole attacks that makes all page
accesses deterministic. The defense assumes that the OS cannot distinguish
between accesses on a single page (i.e., the OS cannot learn the offset in the
page, just the page itself). The goal is to make the program page-fault oblivious.
The programmer then marks which part of the program is hardened against attacks
and marks code and data that is accessed. The compiler then rearranges code and
data on that page. Code and data are then moved onto staging pages before they
are used and only executed/accessed from those pages. The programmer controls
selective optimization and everything is hand tuned. Such an approach is a
simple solution but does not scale to larger code bases and involves a lot of
manual effort. A hardware-based extension allows an application to enforce a
contract that specific pages cannot be unloaded and the enclave is then informed
about page faults (such a mechanism is available in newer SGX versions).
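The core idea of staging pages can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the compiler transformation from the paper: the addresses, the 4 KiB page size, and the two lookup functions are invented, and the model only captures which pages are touched.

```python
PAGE_SIZE = 4096


def page_trace(addresses):
    """What a page-fault-observing OS sees: page numbers, never offsets."""
    return [addr // PAGE_SIZE for addr in addresses]


def lookup_unprotected(secret_bit):
    """Secret-dependent access to two tables that live on different pages."""
    table0, table1 = 0x10000, 0x20000
    return [table1 if secret_bit else table0]


def lookup_staged(secret_bit):
    """Copy both tables to one staging page, then access only that page."""
    table0, table1, staging = 0x10000, 0x20000, 0x30000
    copy_accesses = [table0, table1]         # always both, always same order
    offset = 0x800 if secret_bit else 0x000  # differs only within the page
    return copy_accesses + [staging + offset]


leaky = (page_trace(lookup_unprotected(0)), page_trace(lookup_unprotected(1)))
oblivious = (page_trace(lookup_staged(0)), page_trace(lookup_staged(1)))
```

In the unprotected version the two page traces differ, so the OS learns the secret bit. In the staged version both traces are identical because the secret only influences the offset inside the staging page.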
While individual accesses are no longer observable, copying code and data to
staging pages still leaks information. In addition, the programmer effort will
reduce automation and will make it hard to deploy such a defense. While the work
presents an interesting start, I wonder about how much more this can be
automated and how effective the side-channel reduction is in practice (e.g., for
larger applications).
Cross Processor Cache Attacks
In this attack paper, Gorka Irazoqui, Thomas Eisenbarth, and Berk Sunar present
an interesting side channel that is based on the cache coherency protocol
instead of cache access times. All existing cache side channels like flush and
reload or prime and probe rely on inclusiveness and will not port to other
architectures like AMD. The authors target exclusive last level caches as
present on AMD architectures. Their attack enables a cross-CPU attack called
invalidate and transfer.
The attack uses the cache coherency protocol to invalidate a memory block
(flushing the block from all caches) and then waits for computation to happen.
Afterwards, the same block is requested again and the access time is measured.
If another CPU has already requested the block then the cache-to-cache transfer
is faster than fetching the block from DRAM, resulting in a side channel.
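The three steps (invalidate, wait, timed reload) can be modeled in a few lines. This is a pure simulation to show the decision logic: the latency numbers are invented placeholders, not measurements, and a real attack would use cache-flush and cycle-counter instructions on actual hardware.

```python
DRAM_LATENCY = 300      # assumed cycles, illustrative only
TRANSFER_LATENCY = 150  # assumed cycles for a cache-to-cache transfer
THRESHOLD = (DRAM_LATENCY + TRANSFER_LATENCY) // 2


class ToyCoherentMemory:
    """Tracks only whether another CPU currently caches the shared block."""

    def __init__(self):
        self.cached_remotely = False

    def invalidate(self):
        # Step 1: the attacker flushes the block from all caches.
        self.cached_remotely = False

    def victim_access(self):
        # The victim on another CPU pulls the block into its cache.
        self.cached_remotely = True

    def timed_reload(self):
        # Step 3: the attacker reloads the block and measures the latency.
        return TRANSFER_LATENCY if self.cached_remotely else DRAM_LATENCY


def invalidate_and_transfer(memory, victim_step):
    """Return True if the attacker concludes the victim touched the block."""
    memory.invalidate()
    victim_step(memory)  # step 2: wait while the victim computes
    return memory.timed_reload() < THRESHOLD


saw_access = invalidate_and_transfer(ToyCoherentMemory(), lambda m: m.victim_access())
saw_nothing = invalidate_and_transfer(ToyCoherentMemory(), lambda m: None)
```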
ROPMEMU: A Framework for the Analysis of Complex Code-Reuse Attacks
Mariano Graziano, Davide Balzarotti, and Alain Zidouemba present a framework to
analyze ROP attacks. ROP attacks are incredibly difficult to understand as the
control flow is immensely complex. ROPMEMU uses heuristics to map ROP gadgets
into equivalent instructions. Individual ROP gadgets are decompiled and mapped
to simplified instructions using a set of heuristics for flattening and
simplification.
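As a rough illustration of what flattening and simplification mean here, consider the following toy sketch. It is not the framework's code: gadgets are plain lists of instruction strings, and the stack data is passed separately from the gadget addresses, which is a simplification of a real ROP payload.

```python
def flatten_chain(chain, stack_values):
    """Turn a chain of gadgets into one straight-line instruction list.

    chain: list of gadgets, each a list of instruction strings ending in "ret".
    stack_values: data words the payload placed on the stack, consumed by "pop".
    """
    stack = list(stack_values)
    flat = []
    for gadget in chain:
        for insn in gadget:
            if insn == "ret":
                continue  # only glue between gadgets, dropped when flattening
            op, _, operand = insn.partition(" ")
            if op == "pop":
                # Simplify: a pop of attacker-supplied data is a constant load.
                flat.append(f"mov {operand}, {stack.pop(0):#x}")
            else:
                flat.append(insn)
    return flat


chain = [["pop rax", "ret"], ["pop rbx", "ret"], ["add rax, rbx", "ret"]]
flat = flatten_chain(chain, [0x10, 0x20])
# flat == ['mov rax, 0x10', 'mov rbx, 0x20', 'add rax, rbx']
```

Three gadgets collapse into three ordinary instructions, which is far easier to read than the original chain of addresses and stack data.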
While this works on some examples, the approach has not yet been tested on
larger ROP frameworks and ROP programs. Also, the heuristics will likely break
for arbitrary programs and will need more work. I wonder if such generalized
gadget decompilation is even possible in the general case. The problem they
address is interesting and their framework does well for simple attacks (thereby
adding value). On the other hand, I wonder how generalizable the results are to
arbitrary (hand crafted) ROP programs as they are even more difficult to
analyze, simplify, and decompile than decompiling handwritten assembly programs
into, e.g., C.
Defenses likely to be broken soonish
In addition to the above-mentioned attacks and defenses, AsiaCCS also had a fair
share of incremental defenses that are likely to be broken at the next
conference. In "Juggling the Gadgets: Binary-level Code Randomization using
Instruction Displacement", Hyungjoon Koo and Michalis Polychronakis assume that
fine-grained instruction randomization is in place and they target some
remaining static sequences, trying to complete the randomization and to protect
against, e.g., JIT-ROP attacks. Unfortunately, 2.5% of gadgets remain, which is
likely enough for an attacker to carry out an attack (as we've seen with
coarse-grained CFI protections). Therefore, I'm not too optimistic about this
defense. It extends a complicated defense mechanism even more and still leaves a
large set of remaining gadgets at the attacker's disposal.
In "No-Execute-After-Read: Preventing Code Disclosure in Commodity Software",
Jan Werner, George Baltas, Rob Dallara, Nathan Otterness, Kevin Snow, Fabian
Monrose, and Michalis Polychronakis present another mechanism to protect against
JIT-ROP attacks that relies on destructive code reads. As the same set of
authors just showed at Oakland two weeks before this conference, such protections
are broken by design as an attacker can (i) find a prefix before the gadget to
find the actual gadget, (ii) reload libraries after gadget discovery, or (iii)
generate multiple equal gadgets through, e.g., JavaScript. This defense is,
as-is, broken before publication (by the same authors).
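Bypass (iii) is easy to see in a toy model of destructive code reads. The class below is a hypothetical simulation, not the mechanism from either paper: reading a code address marks it as destroyed, and executing a destroyed address fails. The attacker is assumed to know that two identical gadget copies exist, for example because they were generated through JavaScript.

```python
class DestructiveCodeMemory:
    """Toy code memory in which every read destroys the bytes it discloses."""

    def __init__(self, code):
        self.code = dict(code)  # address -> gadget bytes
        self.destroyed = set()

    def read(self, addr):
        """Disclose the bytes, then make them unusable for execution."""
        self.destroyed.add(addr)
        return self.code[addr]

    def execute(self, addr):
        if addr in self.destroyed:
            raise RuntimeError("executing destroyed code")
        return self.code[addr]


GADGET = b"\x58\xc3"  # x86-64 encoding of: pop rax; ret

# Two identical copies of the same gadget at different addresses.
mem = DestructiveCodeMemory({0x1000: GADGET, 0x2000: GADGET})

leaked = mem.read(0x1000)     # disclosure destroys the copy at 0x1000 ...
reused = mem.execute(0x2000)  # ... but the twin at 0x2000 is still intact
```

The attacker learns the gadget contents from one copy and executes the other, so the destructive read never gets in the way.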