Oakland from a system security perspective

This year's Oakland (the IEEE Symposium on Security and Privacy, formerly held in Oakland, California) has been a wild ride. Just a little more than a week before Oakland I've been in the bay area at the Usenix Security PC meeting at Google in Mountain View, talking to many folks I saw again at Oakland. A little unfortunately, the deadline for CCS overlapped with the Oakland conference, so most folks were pretty busy wrapping up their CCS submissions during at least the first day of Oakland (and up into the night as the CCS deadline was 4am local time).

As I am getting scientifically more mature ("older"), the individual paper presentations become less important (as I've likely already read the ones in my core area and skimmed the ones that I'm otherwise interested in). Nowadays, meeting other folks aka the hallway track dominates my conference schedule. One skill I'm still terrible at is introducing people and finding good seats with interesting folks during lunches and dinners. I guess, that's just part of my introvert legacy and a skill that I have to become better at.

It was interesting to catch up with old friends and also to meet other colleagues. Several important topics stood out and were discussed multiple times across many dimensions: (i) review culture in system security, (ii) moving towards a rolling submission model, and (iii) benchmarking (crimes) in system security. All these topics carry a lot of tension from different angles and therefore warrant their own blog posts. Bear with me and stay tuned.

At the beginning of the conference, Ulfar Erlingsson gave a great peek into the review culture and fed us some interesting statistics. Oakland had 411 initial submissions whereas 399 received two initial reviews (the others were dropped due to format issues). Out of the 399 submissions, 205 made it to round 2, the others received early rejects. 98 papers then continued to round 3 and 55 were accepted as part of the PC discussion. What surprised me a little was that there were only few Systematization of Knowledge papers in the program. I wonder if there were just fewer submitted or if the review was more competitive for them.

Despite the interesting and tempting hallway track (and the CCS deadline), I tried to attend as many talks related to system security as I could. Overall, there were several talks that made the conference worthwhile. I've also added a large set of the talks to the proposed reading list for the system security seminar in fall semester. Some of the papers, I'd like to quickly highlight, hopefully encouraging others to go read the underlying papers and start a more open discussion.

SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis

In this "survey" paper, Yan Shoshitaishvili, Ruoyu Wang, Christopher Salls, Nick Stephens, Mario Polino, Andrew Dutcher, John Grosen, Siji Feng, Christophe Hauser, Christopher Kruegel, and Giovanni Vigna present their angr framework. Angr is basically a reimplementation of the state of the art of binary analysis techniques and a combination of static analysis and symbolic execution techniques. At the core, angr is built to have a large set of flexible modules that can be combined in different ways, e.g., a dynamic loader can be used to analyze the structure of a binary that is then passed to a disassembler where a symbolic execution engine then tries to fine a satisfying assignment of the program representation of a function to trigger a vulnerability condition. The whole platform is implemented in Python and relies on VEX to disassemble binary blobs. Building on VEX has the advantage that a large set of architectures are supported and allows cross-platform binary analysis.

The strength of angr lies in the compositional framework, allowing a user to build up a sophisticated analysis platform based on a set of reusable modules. Compared to other binary analysis toolkits that exist, angr is open-source and can be repurposed and retargeted to other tasks. So far, a lot of research has focused on single-task binary analysis platforms that are not compositional and cannot be repurposed to vastly different tasks. Each new research question forced users to build a new prototype implementation, often starting from scratch. Angr is a step towards an open platform of binary analysis engines.

What I am wondering is, if VEX is the optimal format for binary analysis. Several formats have been proposed and used in different prototypes (VEX, BAP IR, QEMU Tiny Code, and many other ad hoc formats). These intermediate representations are usually rather low level and close to an RISC-like ISA to reduce complexity (CISC is so incredibly complex and vast that nobody wants to design an IR that is complex enough to capture all of it). Especially for x86/x86-64, there is a lot of translation overhead going down and up again which makes efficient analysis difficult. I wonder if we should look into better low level intermediate representations, especially as the x86 architecture will stick with us for a while.

A Tough call: Mitigating Advanced Code-Reuse Attacks At The Binary Level

After their PathArmor paper from last year, Victor van der Veen and Enes Goktas, Moritz Contag and Andre Pawlowski, Xi Chen, Sanjay Rawat, Herbert Bos, Thorsten Holz, and Elias Athanasopoulos and Cristiano Giuffrida present an extension of the Control-Flow Integrity (CFI) policy presented there that increases the precision of the enforced policy. CFI is hard to get right as precision of the underlying policy is always a problem, especially for binaries. Source-based solutions usually compute a type-based approximation of a CFI policy that ensures tight bounds on each indirect call. For binary-only solutions, such rich type information is not available.

TypeArmor recovers as much of the function prototype information as possible, figuring out the arity of each function (i.e., how many arguments the function takes). For each call site, the same analysis concludes how many arguments are passed to the function and then the two pairs are matched. This analysis allows the distinction of CFI target classes based on the arity of functions, which is more precise than the lump set of all functions. This policy is comparable to Google's forward-edge arity CFI policy in IFCC that we looked at for our CFI survey paper. Different from Google's CFI policy, Victor et al. propose to trash any unused caller-saved register. The underlying assumption is that this hinders exploits as it makes it much harder for the attacker to keep control over register values across function calls. This aspect basically proposes a new calling convention that trashes any unused registers across function calls, making exploits less likely as the number of arguments will have to match the desired registers, adding another degree of complexity.

It might be interesting to explore such security-sensitive calling conventions on the compiler level as well. Clearing all unused and caller-saved registers before calling will increase the complexity of exploit development. I wonder how much this would increase actual security in practice and how we could quantify this increase.

Return of the Zombie Gadgets: Undermining Destructive Code Reads via Code-Inference Attacks

In a slightly weird turn of events, Kevin Z. Snow, Roman Rogowski, Fabian Monrose, Jan Werner, Hyungjoon Koo and Michalis Polychronakis presented the attack paper against destructive code reads before presenting their defense based on destructive code reads at AsiaCCS one week later. At the core, Kevin et al. claim that destructive code reads will not work as an attacker can always figure out a set of gadgets and then leverage relative offsets to create actual gadgets.

Fine-grained randomization defenses only work if the attacker cannot read the binary code using an information leak. Otherwise, the attacker would just leak the binary code and find the gadgets using dynamic ROP gadget finding (JIT-ROP). Unfortunately, code cannot easily be made execute-only as there is a lot of interleaving between code and data. The underlying assumption of destructive code reads is that instructions that read the code will destroy the underlying gadget as a side effect, i.e., after a code region has been read it cannot be executed or execution of that code region will trap. In that sense, destructive code reads are a protection against JIT-ROP to re-enable fine-grained randomization as an effective defense (seems as if we are layering hot-patches to mitigate different defenses here).

Not surprisingly, destructive code reads are not effective as an attacker who is interested in sequence A of bytes can search for the prefix B if she knows that BA exists in the program. If B is unique, reading B will allow the attacker to use gadget A which has not been compromised through the destructive read. The authors call this disassociation and code inference about gadget locations. In addition to disassociation, two alternative methods allow mitigation of destructive reads: singularity (generating the same gadgets all over again, i.e., through JIT gadget generation) and persistence (reloading a library after disclosure or across processes).

HDFI: Hardware-assisted Data-Flow Isolation

Data-Flow Integrity (DFI) protects software from corruptions based on data-flow. Only locations (instructions) that are allowed to write to a certain location are allowed to execute the actual write (or read). Castro et al. proposed DFI (OSDI'06) as a compiler-based enforcement mechanism. A static analysis used type inference to find equivalence classes, only instructions of the correct equivalence class are allowed to read/write a memory location. A runtime mechanism tracks equivalence classes for each memory location.

The HDFI paper by Chengyu Son, Hyungon Moon, Monjur Alam, Insu Yun, Byoungyoung Lee, Taesoo Kim, Wenke Lee, and Yunheung Paek, implements an ISA extension for Data-Flow Integrity. In the paper, the authors assume that a compiler-based type analysis assigns types (the pointer analysis is not their contribution). The tables are then checked using the new hardware instructions, significantly reducing overhead.

One of the questions I had was how severe the changes on the underlying architecture are given that a lot of additional information needs to be tracked. As I'm not working close to the hardware I cannot assess the likelihood of such a mechanism to succeed but I found the discussion and approach of implementing such a defense mechanism in hardware interesting. On more general terms, it would be interesting to see how (small) ISA extensions could significantly reduce the performance overhead of defense mechanisms. E.g., a generic two-level lookup for metadata (similar to a page table lookup) as used by, e.g., binary translators, meta data managers, and other mechanisms would significantly improve many defense mechanisms, yet still stay generic enough to allow different mechanisms to compete.

Data-Oriented Programming: On the Expressiveness of Non-Control Data Attacks

We have known that data-oriented attacks are powerful for a while. Last year we have published Control-Flow Bending and more recently there was Control Jujutsu, both papers looking at different forms of data-oriented attacks that allow Turing-complete computation. Control-Flow Bending is a more general approach that requires (lots of) human analysis while Control Jujutsu would allow for more automation.

In this paper, Hong Hu, Shweta Shinde, Adrian Sendroiu, Zheng Leong Chua, Prateek Saxena, and Zhenkai Liang look at more powerful automation of data-oriented attacks. The authors build an attack by evaluating the program and finding a path through the program that does not change control-flow explicitly but implicitly. Instead of corrupting code pointers, they use memory safety violations to steer execution to arbitrary computation. This is an interesting paper that shows how far we can (currently) scale automatic analysis before we run into state explosions. Future research will look at finding better heuristics to tune the search, hopefully restricting the search space so that more powerful adversarial programs can be engineered.

Dedup Est Machina: Memory Deduplication as an Advanced Exploitation Vector

This was a simple, yet beautiful paper by Erik Bosman, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida that combines memory deduplication to leak memory secrets and rowhammer to escalate privileges. In a first step, the authors use memory dedupllication available in Windows 10 to leak ASLR base addresses. Memory deduplication merges equal pages across processes. Writing to a page triggers a copy-on-write which the attacker can user as a side channel. See our earlier paper on CAIN for more details. In a second step, they use rowhammer to toggle "god mode" in Internet Explorer, allowing the malicious JavaScript to escape the sandbox.

LAVA: Large-scale Automated Vulnerability Addition

The LAVA paper by Dolan-Gavitt, Patrick Hulin, Engin Kirda, Tim Leek, Andrea Mambretti, Wil Robertson, Frederick Ulrich, and Ryan Whelan discusses a large-scale framework to evaluate and test defense mechanisms. Evaluating defense mechanisms based on a large scale corpus is a hard problem. This paper proposes an automated framework that leverages dynamic taint analysis to automatically generate test cases with vulnerabilities from benign programs. These test cases can then be used to test defense mechanisms and let defense mechanisms compete against each others. Defense mechanisms can be dynamic, i.e., protecting against memory corruptions or type confusions at runtime) or static, i.e., finding vulnerabilities through static analysis at compile time.

Such a framework has long been overdue and it would allow us to let different mechanisms compete with each other, maybe even ranking different defense mechanisms in how well they protect against different forms of attack. The framework first generates a set of test cases based on a bug/error specification and then evaluates the different selected mechanisms against each other.

There are plans to open-source the framework and the data-set but due to corporate policies this will take more time.

blogroll

social