The last three weeks I've been traveling through China, Hong Kong, and Macau on an interesting security tour, thanks to this year's AsiaCCS being held in Xi'an, China. AsiaCCS was right after Oakland, so I flew directly from San Francisco to Xi'an and then continued on to visit friends in Beijing, Shanghai, and Hong Kong/Macau. Overall, Asia has been a breathtaking experience with an immense set of impressions that I'll try to summarize in a country-specific blog post. Here, I'll focus on the research aspects and the conference.
AsiaCCS is in the process of migrating from a symposium to a full conference. While not exactly one of the big four, the conference is still fairly competitive and tends to accept good papers that just did not make the cut at the big four. One of the challenges, shared with CCS proper, is that there is no physical PC meeting and therefore no overall quality control and discussion of the papers. In my opinion, this shows in the slightly higher randomness in paper selection that we see each year compared to conferences with real on-site PC meetings.
This was my first AsiaCCS and I left with an OK impression. The conference was fairly well organized, with an exciting dinner and great people to talk to. Compared to the big four, AsiaCCS has the problem that many people only attend for a day and then head off for sight-seeing, so if you want to meet with others you actively have to coordinate (at other conferences you'll just walk into each other by accident as, let's be honest, San Jose does not have that much to offer for tourists). On the downside, the Internet is a mess in China, with lots of sites blocked, connections timing out, and even VPN sessions being randomly killed and subverted by the Great Firewall. After a couple of annoying weeks I found out that an SSH SOCKS proxy works much better than OpenVPN.
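For anyone facing the same problem: the tunnel itself is a plain OpenSSH one-liner, and a few lines of Python (using the PySocks package) suffice to route a script's traffic through it. A minimal sketch of the setup that worked for me; host name and port are placeholders:

```python
# The tunnel is plain OpenSSH, started once in the background:
#   ssh -fND 1080 user@server.example.org
# i.e., a SOCKS5 proxy on localhost:1080, no remote command.
import socket

import socks  # pip install PySocks

# Route all sockets in this process through the SSH SOCKS proxy.
socks.set_default_proxy(socks.SOCKS5, "127.0.0.1", 1080)
socket.socket = socks.socksocket

import urllib.request

# Quick reachability check through the tunnel.
print(urllib.request.urlopen("https://www.google.com", timeout=10).status)
```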
On the paper side, I attended all the sessions that interested me and asked my fair share of questions. There were a bunch of interesting papers and keynotes that I'll discuss below. In general, I enjoyed the diverse keynotes, especially Michael Backes' call for privacy research and Giovanni Vigna's shameless plug for angr and collaborative systems research.
TouchAlytics
Let me begin this blog post with a shameless plug of our own TouchAlytics work. We finally published our TouchAlytics paper at AsiaCCS (slides), and I was the only author who was both willing to go and able to get a visa for China. In our paper we propose a forgery-resistant touch-based authentication method that uses how people react and adapt to different environments as a biometric, instead of something people "have" as in classical biometrics.
Our authentication method samples a user in different environments (that we control) and then uses this information to subtly and continuously change the underlying environment. As the user adapts her behavior, she is authenticated against the different profiles we collected. As attackers do not know which environment is used during authentication, they cannot forge an authentication, even with perfect knowledge of all possible environments.
In our prototype we add an adaptive layer between the touchscreen sensor and the display that allows us to stretch individual strokes in both dimensions. The application therefore receives slightly different strokes than the ones the user executes on the touchscreen. Due to the different app behavior, the user adapts her strokes accordingly, and we use this adaptation to identify and authenticate the user based on the slight variances. Our authentication framework is both stable and sensitive, i.e., it allows us to differentiate between different settings for a single user and between different users. This work moves biometrics from "what you have" to "how you react"-based authentication.
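To make the mechanism concrete, here is a toy sketch (my illustration for this post, not the actual implementation from the paper): the adaptive layer rescales a stroke before the application sees it, and authentication compares a feature of the user's adapted strokes against the profile recorded under the same hidden perturbation.

```python
import math

def perturb_stroke(stroke, sx, sy):
    """Stretch a stroke (a list of (x, y) samples) by sx/sy before it
    reaches the application, as the adaptive layer would."""
    return [(x * sx, y * sy) for (x, y) in stroke]

def stroke_length(stroke):
    """One simple behavioral feature: total path length."""
    return sum(math.dist(a, b) for a, b in zip(stroke, stroke[1:]))

def authenticate(stroke, profile, threshold=2.0):
    """Compare the adapted stroke against the (mean, std) profile
    collected under the same hidden perturbation; accept only if the
    behavior matches this specific environment."""
    mean, std = profile
    z = abs(stroke_length(stroke) - mean) / std
    return z < threshold
```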
An angr'y keynote
In my opinion, the best talk at AsiaCCS was Giovanni's angr'y keynote. Based on the premise that hacking is awesome, Giovanni and his group want to automate awesomeness. Hacking can be manifold: it can involve hacking the user through social engineering, hacking the process through weak password resets, weak PINs, or brute-force attacks, or hacking the code. Hacking the code is the most involved, as actual knowledge and intelligence are needed. The question is whether we can incorporate that domain knowledge and intelligence into a tool. angr is a framework that tries to achieve exactly that.
Binary code, on the one hand, is incredibly difficult to analyze: it has a (very) low abstraction level, no structured types, no modules, and no defined functions, and compiler optimizations make the code very complex. On the other hand, binary code is truthful: what you see is what you execute. In manual vulnerability analysis, a very intelligent person stares at the code and sees what she can find. This approach discovers deep and complex vulnerabilities but does not scale. The holy grail of vulnerability research is a magic tool that, when run, finds the vulnerability and develops a patch/exploit for it.
Automatic vulnerability analysis systems have a high-level policy and try to force violations of it. Such an approach requires replayability, i.e., the ability to generate attack instances: the systems try to generate inputs that, when fed to the program, trigger a violation. Such a violation is then a proof-of-concept exploit (depending on the high-level policy). An orthogonal aspect is semantic insight, i.e., the ability to understand the root cause of the crash, which allows the attacker to abstract and generalize from the single fault.
A problem automatic vulnerability analysis systems face is that these goals conflict: high replayability implies low coverage; low replayability implies false positives; semantic insight implies high overhead; and replayability combined with semantic insight implies low scalability and a lack of soundness, which results in false negatives. Heuristics therefore need to balance these options to achieve good results. Both static and dynamic analyses can be used to explore the search space.
Static analysis has the advantage of high coverage but is complex and runs into the aliasing problem. Dynamic analysis, on the other hand, has high replayability and does not worry about aliasing, but runs into coverage problems. So far, angr has focused on being a binary analysis toolkit, providing static analysis and symbolic execution. For the DARPA Cyber Grand Challenge, the UC Santa Barbara folks extended angr and combined it with AFL into Driller. Surprisingly, fuzzing is the most effective technique at finding bugs: generating random inputs and feeding them into a program discovers a large number of vulnerabilities but tends to get stuck with limited coverage. AFL addresses the coverage problem through a path-guided analysis that records which paths were already evaluated and forces input mutations to explore alternate paths. In Driller, whenever AFL gets stuck, symbolic execution is used to find alternate inputs that trigger new paths.
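To give a flavor of the symbolic-execution half, here is what asking angr for a new input looks like (plain angr rather than Driller itself; the binary path and addresses below are made-up placeholders):

```python
import angr

proj = angr.Project("./challenge_binary", auto_load_libs=False)
state = proj.factory.entry_state()
simgr = proj.factory.simulation_manager(state)

# Ask angr for an input that reaches a branch the fuzzer never covered.
simgr.explore(find=0x400b00, avoid=0x400c50)

if simgr.found:
    # Concretize stdin: this input can be fed back to the fuzzer so it
    # can continue past the point where it got stuck.
    print(simgr.found[0].posix.dumps(0))
```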
In the later part of the keynote, Giovanni also talked about some details of the Cyber Grand Challenge: infrastructure availability (never segfault your infrastructure), analysis scalability (how to cope with limited resources), and the performance/security trade-off.
In today's systems the attacker is at an inherent advantage: a single vulnerability is enough to bring down a system, while the defender needs to cover all bases. We need to move forward as a community to provide better analysis tools and better general defense techniques that actually hold up to attacks.
Automatic Dynamic Firmware Analysis at Scale: A Case Study on Embedded Web Interfaces
In this work, Andrei Costin, Apostolis Zarras, and Aurelien Francillon extend their framework for automatic firmware collection and extraction. In addition to searching for simple bugs using pattern matching, they now run the images inside QEMU and exercise the web interface contained in the image. Their existing infrastructure already collects firmware and extracts its individual files; they have now built a QEMU-based emulator that runs some of the binaries in the firmware. Many IoT devices like routers primarily run a single suid/root binary that encapsulates all the functionality of the device.
After getting the binary to run through some hackery, they ran basic vulnerability discovery and penetration testing tools against the services and had good results. A problem they ran into was that those service binaries often make hardware-specific calls and hit kernel issues, which reduces coverage.
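For context, the usual emulation setup looks roughly like the following (my sketch, not the authors' tooling; all paths are placeholders): copy a statically linked qemu-arm-static into the extracted root file system so the firmware's all-in-one service binary can run via chroot.

```python
import shutil
import subprocess

ROOTFS = "./extracted_firmware/rootfs"  # output of the firmware extractor
SERVICE = "/usr/sbin/httpd"             # the single suid/root web binary

# Make the user-mode emulator visible inside the chroot.
shutil.copy("/usr/bin/qemu-arm-static", ROOTFS)

# Needs root; hardware-specific ioctls and NVRAM accesses are exactly
# where this kind of emulation tends to fall over.
subprocess.run(["chroot", ROOTFS, "/qemu-arm-static", SERVICE], check=False)
```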
In the last year we (Craig West, Jacek Rzeniewicz, and I) have looked at a similar problem. We got stuck at the same spot, where the service binaries were calling into the kernel or reading/writing privileged flash areas that we could not easily emulate. Also, instead of running simple penetration testing tools, it would be much more interesting to run something like AFL. We tried integrating AFL into our own QEMU-based framework (yes, we went down almost the same path in our research) but could not get the path-based feedback to work, so AFL was limited in the inputs it could generate. This might be an interesting project to continue, so if anyone is interested, please reach out.
ORIGEN: Automatic Extraction of Offset-Revealing Instructions for Cross-Version Memory Analysis
An interesting challenge for forensics tools is to recover data structures from memory images. Unfortunately, these data structures change as new fields are added or removed whenever software evolves. The fingerprints used to recover them are therefore tied to specific software versions.
Qian Feng, Aravind Prakash, Minghua Wang, Curtis Carmony, and Heng Yin evaluated how software evolves and how structures change between releases. They line up code that accesses the same struct across releases and extract the field offsets encoded in the individual instructions. As these offsets change from release to release, aligning the accessing code recovers the new offsets and thereby reveals how the struct changed.
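As I understand it, the core offset-extraction step boils down to something like the following capstone-based sketch (my illustration, not ORIGEN's implementation; the instruction bytes are hand-picked examples):

```python
from capstone import Cs, CS_ARCH_X86, CS_MODE_64
from capstone.x86 import X86_OP_MEM

md = Cs(CS_ARCH_X86, CS_MODE_64)
md.detail = True  # needed to get operand information

def struct_offsets(code, base=0x1000):
    """Collect the displacements of all memory operands, i.e., the
    struct field offsets the code accesses."""
    offsets = []
    for insn in md.disasm(code, base):
        for op in insn.operands:
            if op.type == X86_OP_MEM:
                offsets.append(op.mem.disp)
    return offsets

# mov rax, [rdi+0x18] in release A vs. mov rax, [rdi+0x20] in release B:
old = struct_offsets(b"\x48\x8b\x47\x18")  # -> [0x18]
new = struct_offsets(b"\x48\x8b\x47\x20")  # -> [0x20]
print(list(zip(old, new)))  # the field moved from offset 0x18 to 0x20
```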
During the Q&A I wondered how resilient the approach is to changes in compiler settings and across compiler optimizations. Such changes will unfortunately disrupt the tracking and pose some difficulties, so more research is needed in that regard. Nevertheless, this work is an interesting approach to the challenge.
Preventing Page Faults from Telling your Secrets
In SGX, the operating system still manages the individual memory pages of enclaves. This enables a side channel: the OS can restrict the number of pages an enclave gets and, by observing the resulting page faults, learn which pages are accessed.
Shweta Shinde, Zheng Leong Chua, Viswesh Narayanan, and Prateek Saxena present a compiler-based defense against such pigeon-hole attacks that makes all page accesses deterministic, i.e., it makes the program page-fault oblivious. The defense assumes that the OS cannot distinguish between accesses within a single page (the OS cannot learn the offset in the page, just the page itself).
The programmer marks which part of the program should be hardened and which code and data it accesses. The compiler then rearranges that code and data: both are moved onto staging pages before they are used and are only executed/accessed from those pages. The programmer controls this selective optimization and everything is hand-tuned. Such an approach is a simple solution but does not scale to larger code bases and involves a lot of manual effort. A hardware-based extension allows an application to enforce a contract that specific pages cannot be unloaded, with the enclave being informed about page faults (such a mechanism is available in newer SGX versions).
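In spirit, the transformation looks something like this toy Python sketch (heavily simplified; the real defense operates at the compiler and page level, and the staging area is a small, fixed set of pages). The point is that which page gets touched must not depend on the secret:

```python
PAGE = 4096

def leaky_lookup(table: bytes, secret: int) -> int:
    # Faults in exactly the page holding the secret-dependent entry,
    # so the OS learns secret // PAGE.
    return table[secret]

def oblivious_lookup(table: bytes, secret: int) -> int:
    # Copy every page of the table onto the staging area first, so the
    # page-level access pattern is identical for every possible secret.
    staging = bytearray(len(table))
    for start in range(0, len(table), PAGE):
        staging[start:start + PAGE] = table[start:start + PAGE]
    return staging[secret]  # secret-dependent access hits only staging
```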
While individual accesses are no longer observable, copying code and data to staging pages still leaks information. In addition, the required programmer effort limits automation and will make such a defense hard to deploy. The work presents an interesting start, but I wonder how much more of it can be automated and how effective the side-channel reduction is in practice (e.g., for larger applications).
Cross Processor Cache Attacks
In this attack paper, Gorka Irazoqui, Thomas Eisenbarth, and Berk Sunar present an interesting side channel that is based on the cache coherency protocol instead of cache access times. Existing cache side channels like flush-and-reload or prime-and-probe rely on inclusive caches and do not port to architectures like AMD's. The authors therefore target the exclusive last-level caches present on AMD architectures with a cross-CPU attack they call invalidate-and-transfer.
The attack uses the cache coherency protocol to invalidate a memory block (flushing the block from all caches) and then waits for the victim's computation to happen. Afterwards, the same block is requested again and the access time is measured. If another CPU has requested the block in the meantime, the transfer time is lower than fetching it from DRAM, resulting in a side channel.
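The probe loop is the classic flush/wait/reload pattern; in the sketch below, flush() and timed_access() are hypothetical helpers standing in for the coherency-protocol invalidation and a cycle-accurate timer (neither is reachable from pure Python), and the threshold is a made-up cutoff:

```python
THRESHOLD_CYCLES = 200  # made-up cutoff between cache-to-cache and DRAM latency

def probe(addr, flush, timed_access, wait_for_victim):
    flush(addr)           # coherency protocol evicts the block everywhere
    wait_for_victim()     # let the victim on the other CPU run
    cycles = timed_access(addr)
    # A fast reload means the block was transferred from the victim's
    # cache, i.e., the victim accessed this address.
    return cycles < THRESHOLD_CYCLES
```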
ROPMEMU: A Framework for the Analysis of Complex Code-Reuse Attacks
Mariano Graziano, Davide Balzarotti, and Alain Zidouemba present a framework to analyze ROP attacks. ROP attacks are incredibly difficult to understand as their control flow is immensely complex. ROPMEMU emulates the chain and uses a set of flattening and simplification heuristics to map individual ROP gadgets into equivalent, simplified instructions.
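The general recipe of emulating a chain to recover a linear instruction trace can be sketched with Unicorn (my illustration with a toy gadget and placeholder addresses; ROPMEMU's actual emulation and simplification machinery is far more involved):

```python
import struct

from unicorn import Uc, UC_ARCH_X86, UC_MODE_64, UC_HOOK_CODE
from unicorn.x86_const import UC_X86_REG_RSP, UC_X86_REG_RAX

mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(0x400000, 0x10000)                 # gadget code
mu.mem_map(0x7FF000, 0x1000)                  # stack holding the chain
mu.mem_write(0x400000, b"\x58\xc3")           # toy gadget: pop rax; ret
STOP = 0x400010                               # sentinel "next gadget"
chain = struct.pack("<QQ", 0xDEADBEEF, STOP)  # value for pop, then ret target
mu.mem_write(0x7FF000, chain)

def trace(uc, address, size, user_data):
    print(hex(address))                       # linearized instruction trace

mu.hook_add(UC_HOOK_CODE, trace)
mu.reg_write(UC_X86_REG_RSP, 0x7FF000)
mu.emu_start(0x400000, STOP)                  # "return" into the chain
print(hex(mu.reg_read(UC_X86_REG_RAX)))       # 0xdeadbeef
```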
While this works on some examples, the approach has not yet been tested on larger ROP frameworks and ROP programs, and the heuristics will likely break for arbitrary programs and need more work. I wonder if such generalized gadget decompilation is even possible in the general case. The problem they address is interesting and their framework does well for simple attacks (thereby adding value). On the other hand, I wonder how well the results generalize to arbitrary (hand-crafted) ROP programs, which are even harder to analyze, simplify, and decompile than handwritten assembly is to decompile into, e.g., C.
Defenses likely to be broken soonish
In addition to the above-mentioned attacks and defenses, AsiaCCS also had its fair share of incremental defenses that are likely to be broken at the next conference. In "Juggling the Gadgets: Binary-level Code Randomization using Instruction Displacement", Hyungjoon Koo and Michalis Polychronakis assume that fine-grained instruction randomization is in place and target the remaining static sequences, trying to complete the randomization and to protect against, e.g., JIT-ROP attacks. Unfortunately, 2.5% of gadgets remain, which is likely enough for an attacker to carry out an attack (as we've seen with coarse-grained CFI protections). I'm therefore not too optimistic about this defense: it extends an already complicated defense mechanism even further and still leaves a large set of gadgets at the attacker's disposal.
In "No-Execute-After-Read: Preventing Code Disclosure in Commodity Software", Jan Werner, George Baltas, Rob Dallara, Nathan Otternes, Kevin Snow, Fabian Monrose, and Michalis Polychronakis present another mechanism to protect against JIT-ROP attacks that relies on destructive code reads. As the same set of authors just showed at Oakland 2 weeks before this conference, such protections are broken by design as an attacker can (i) find a prefix before the gadget to find the actual gadget, (ii) reload libraries after gadget discovery, or (iii) generate multiple equal gadgets through, e.g., JavaScript. This defense is, as-is, broken before publication (by the same authors).