On one hand, GPUs expose broad functionality for graphics and machine learning workloads, on the other hand, this functionality may be exploited due to large amounts of unvetted code, complex functionality, and the information gap between user-space application, kernel, and the auxiliary GPU. We introduce a novel framework that allows repurposing of WebGL security checks from the Chrome browser to protect the Android kernel against active exploitation from malicious apps at low performance overhead.
This post discusses the open-source release of our CCS'18 Milkomeda paper. This is joint work between Zhihao Yao, Saeed Mirzamohammadi, Ardalan Amiri Sani, and Mathias Payer.
New usage scenarios result in new threats
With the rise of machine learning workloads and modern games that require powerful computation, GPUs have become massively parallel computing co-processors that expose a complex and versatile interface. This complex interface enables the flexibility and performance required by modern workloads but increases the attack surface. Current operating systems do not enforce scheduling and privilege separation between different GPU workloads. The operating system simply exposes the interface to user-space programs, enabling them to use the vast functionality of the hardware at low overhead.
The intended usage scenario for GPUs is a user-space library that provides access to GPU functionality that then calls into the kernel driver which forwards the data to the GPU itself where the computation is executed. This scenario exposes bugs at three different locations. First, the user-space library may be buggy, crashing the calling process. Second, the kernel driver may be buggy, crashing the kernel (and all processes). Third, the code running on the GPU may be buggy, crashing the GPU (and all kernels currently running on the GPU). It's interesting to note that a user-space program is not restricted to the functionality exported by the user-space library but the (often only partially documented) functionality of the kernel driver.
The original threat model focused on local applications either running gaming workloads or machine learning workloads. These trusted workloads may crash if they are programmed incorrectly but there was no focus on a security angle. With the rise of graphics functionality in browsers, the threat model is changing. The exposed threat surface of the GPU kernel interface is exploited through several attacks against, e.g., Google Chrome where bugs in the GPU render process allow further privilege escalation.
WebGL: exposing ioctl to JavaScript
Recently, WebGL enables untrusted websites to access the OpenGL interface through JavaScript. While this is great news for JavaScript programmers that want to program 3D workloads, this is terrible news for security as a highly complex interface is now exposed to untrusted code.
("This is fine" comic by KC Green.)
Given the uncertain provenance of the code and the large amount of security vulnerabilities in user-space libraries and kernel drivers, the Google Chrome team deployed a safety net: an interposition layer that checks every GPU call before it is sent to the GPU. A local shim library forwards the GPU call to a separate process where the GPU state is replicated and the call is verified given the current GPU state. After passing the checks, the call is forwarded by the separate process to the GPU. While this adds some overhead due to the inter-process communication overhead (and the checks), it protects the kernel and the GPU from unvetted calls.
The figure above shows the Chrome WebGL security checks. WebGL calls are sent to the secure process where they are checked and then forwarded to the kernel.
These WebGL checks are limited along two dimensions. First, they are restricted to OpenGL calls and do not cover, e.g., the CUDA computational interface (due to both the massive additional complexity and the close-source nature of the computational interfaces). Second, they are incomplete. Due to the almost 1-1 mapping between OpenGL and WebGL, the amount of functionality is massive and checks are therefore reactive. The Chrome developers have added checks for frequently attacked interfaces or interfaces with certain bug patterns. The checks are continuously extended and improved, increasing the guarantees with every release.
Android GPU security
Android is exposed to similar issues as WebGL: untrusted applications ("apps") access the exposed GPU interface either through native libraries (hopefully the ones supplied by the Android systems) but may also access the native ioctl interface directly.
The figure above shows the default Android security stack: OpenGL (and other GPU calls) are never vetted and applications have direct access to the exposed interface.
The only reason why we did not yet see a large amount of local privilege escalation attacks against Android through the GPU interface is the combination of lack of knowledge about this interface and the availability of easier targets. With the deployment of new defenses on Android, the GPU interface will become a prime target.
Milkomeda: reuse checks
We decouple the GPU interface from user-space processes and force all interactions with the GPU through our interposition layer. The key idea of our system is to automatically reuse the WebGL checks from Chrome, extracting them dynamically and weaving them into our interposition layer. This allows us to reduce the cost of check development. Additionally, new checks will be imported automatically when they are added to Chrome, further reducing maintenance cost.
For WebGL, performance is less critical than for "native" Android applications. We therefore design and implement a safe area in the application process that executes the GPU checks. During regular execution, the safe area remains hidden. Through a call gate that is injected when GPU functionality is accessed, control-flow is transferred to this safe area. All safety critical arguments are copied into the safe area where they are checked. Non-safety critical arguments (as specified by the checks) can remain in the process and the checks can inspect them without additional overhead.
The figure above shows the Milkomeda layout: OpenGL calls are redirected to the safe area where they are vetted and checked before being forwarded to the kernel, protecting the Android system from potentially malicious applications at low performance overhead.
For more technical details and a discussion of design and implementation trade-offs, please refer to the ACM CCS'18 Milkomeda paper. We have also released the full source code of our Milkomeda implementation on GitHub, ready for reproduction of our results as well as future extensions!