On differences between the CFI, CPS, and CPI properties

At OSDI'14 we published our paper on [1] where we introduce two new security properties that protect programs against control-flow hijack attacks enabled by memory corruption vulnerabilities. The design space in this area is already very cluttered and we use this blog post to highlight the differences between the individual properties.

Some limitations of this post: we do not discuss actual implementations and vulnerabilities in these implementations but compare the properties itself (assuming the best implementation that is theoretically possible). Where we cite overhead numbers we will refer to strongest prototype implementations of the properties. This post is also limited in scope, for a complete overview of all defense mechanisms we refer to the Eternal War in Memory paper and the related work section of our [1] paper.

Control-Flow Integrity (CFI)

The CFI property [2] ensures that the control-flow of the application stays within the control-flow graph, as determined ahead of execution. The original CFI paper, as well as many its follow-ups determine this CFG using static analysis. This analysis determines, for each indirect call site, a set of functions that can be called from that call site on any program execution. All indirect call sites are then instrumented with runtime checks to enforce that the function being actually called is one of the functions from the statically-determined set for that call site.

CFI does not prevent memory corruptions, instead it simply does a sanity check on function pointers before they're actually used. The original CFI paper also proposed to enforce the integrity of return addresses on the stack through the use of a shadow stack.

CFI Limitations

CFI restricts the set of targets to which an attacker might redirect the control-flow of a program, however, this set necessarily contains all functions that might be called at a given call site on any program execution, not just the current one.

To illustrate this, consider the classical callback function pattern. Say we have a Linux driver that installs a callback for a read() operation on the device file that it manages. The driver sets one of its functions as the callback upon its initialization, and the kernel will call this function whenever a read() operation on the corresponding device file is performed. Now, the indirect call instruction in the kernel that does this dispatch is the same for all drivers, so on different executions this instruction might call a callback from any driver. CFI can, at most, enforce that this instruction calls any callback for a read() operation from any driver. An attacker who can overwrite the callback pointer might set it to any function from the above set. If this call is followed by another indirect function call, an attacker might even setup a chain of functions to execute.

Unfortunately, there are further limitations in practice. Doing the above requires precise full-program static analysis: it must look at the entire kernel and all possible drivers that the kernel can ever load. This is hard in practice, and we are not aware of such analysis being ever done precisely on non-trivial systems (not to mention that it's also a theoretically undecidable problem). In practice, the CFI enforcement mechanism simply will not know which functions the dispatch instruction in the above example might call, so it will just enforce that it calls any valid function. This was the case for the original CFI implementation and, although the follow-up work can reduce this set in certain situations (such as virtual function calls), any CFI implementation we are aware of would have cases where this set contains almost every program function as well.

An potential attack against CFI mechanism was first introduced by Göktas et al. in the [3] paper that appeared at the Oakland'14 conference. The paper introduces several new types of gadgets (i.e., fragments of existing application code that might be reused by an attacker in a way that was not intended by the application), including call gadgets. Such gadgets start at a regular function prologue and end with an indirect function call, which enables to transfer control to the next gadget in the chain. The program might have enough variety of such linkable gadgets to perform meaningful computation: which is called a call-oriented programming (analogous to return-oriented programming, which similarly reuses gadgets that end with a return instruction). Göktas relied on a combination of call gadgets and other types of gadgets to mount a practical attack against a coarse-grained CFI implementation. This work can be generalized, for some programs, to bypasses a fine-grained CFI enforcement mechanism. We do have both simple and realistic examples of such programs. E.g., we where able to construct such an attack for the Perl interpreter (although, to be fair, we did not automate or weaponize the attack yet).

Code-Pointer Separation (CPS)

CPS prevents an attacker from forging a code pointer out of thin air. At most, an attacker can cause a program to call a function whose address was previously taken during current execution. In the above example of a Linux driver, the dispatch instruction can be potentially subverted to call, e.g., a valid callback function of a registered driver, but not a callback from a driver that is not registered on the current run of the kernel or a function that is never used as a callback.

Causing a program to call wrong function under CPS requires to (i) find an indirect function call that gets the pointer to be called through another level of indirection, e.g., struct_ptr->func_ptr() but not just func_ptr(); (ii) use a memory corruption to overwrite the pointer that is used to get the function pointer indirectly, e.g., struct_ptr, to a location that already contains a valid function pointer at a proper offset, and (iii) ensure that the program would not crash if the overwritten pointer, e.g., struct_ptr, is used after it was overwritten but before the function is called.

CPS ensures precise protection of the return address using safe stack: this address cannot be overwritten by the attacker in the first place. Moreover, the safe stack also protects all register spills as well as most of the local variables. The safe stack, as implemented for CPI and CPS, has much better performance characteristics than a shadow stack. For details, see our CPI paper.

Code-Pointer Integrity (CPI)

CPI enforces memory safety for all sensitive pointers, which include all code pointers (just as under CPS) but also all pointers that are used to access other sensitive pointers indirectly. All such pointers are protected through runtime memory safety checks. CPI guarantees that an attacker will not be able to use a memory corruption vulnerability in order to directly change the value of any sensitive pointer and cause the program to jump to a wrong location as a result. The only attack vector that remains open is data-based attacks, preventing those requires full memory safety.

In contrast to CFI, CPI does not execute any sanity checks on the code pointers before using them, but instead prevents an attacker from corrupting such pointers in the first place.

Property comparison

Let's compare the security CFI, CPS, and CPI security policies on several attack scenarios:

  1. An attacker attempts to overwrite the return pointer on the stack:
  • CFI: the strong CFI, as introduced by Abadi et al., includes optional return address protection that is based on the shadow stack, which can prevent such attack. To the best of our knowledge, the shadow stack causes around 5% extra performance overhead on average, hence Abadi et al. made it optional. Many modern CFI solutions provide weaker return pointer protection for performance reasons.
  • CPS & CPI: the return address cannot be overwritten by a corrupt memory access through the safe stack, which has nearly zero performance overhead on average.
  1. An attacker attempts to overwrite a code pointer stored in memory using memory corruption vulnerability:
  • CFI: the attacker may redirect the control-flow to any target in the set of targets permitted for the call site where the pointer is used. In practice, this set often includes every function in the program (as was the case for both the original CFI implementation and multiple follow-ups) or wide sets of functions (e.g., all functions with the same number of arguments). The most recent CFI implementation [4] reduces the size of these sets for virtual function calls in C++, but not generic indirect function calls.
  • CPS & CPI: code pointers cannot be overwritten by corrupt data pointer dereferences.
  1. An attacker attempts to overwrite a data pointer that is used to access the code pointer indirectly (e.g., struct_ptr in struct_ptr->func_ptr()):
  • CFI: same limitations as above.
  • CPS: the attacker may redirect the control flow to any valid function whose address has already been taken during current execution and is still stored in the safe memory.
  • CPI: all such indirect pointers are protected and cannot be overwritten by a corrupt pointer dereferences.
  1. An attacker attempts to change program data and cause it to take different execution path (e.g., by changing the decisions made in the if statements):
  • CFI: no guarantees
  • CPS: no guarantees
  • CPI: no guarantees, but some of the pointers are protected.

Comparing CFI and CPS shows that CPS is stronger for scenario B, protecting the application from attack. For scenario C, CFI may allow a smaller set of targets then CPS, however, CPS makes it harder to make the program use wrong code pointer in the first place. The security comparison between CFI and CPS in this case depends on the program being protected.

Comparing CPS and CPI shows that CPI is stronger than CPS for scenario C.

Comparing CFI and CPI shows that CPI is stronger than CFI in scenarios B and C due to the memory safety guarantees that it provides for all sensitive pointers. CFI enforces a static property that was determined at compile time, whereas CPI enforces actual integrity of all sensitive pointers.

Performance comparison

Comparing performance is hard, especially as most CFI implementations are not open-source. Also, we did not yet have time to evaluate the open-source VTV implementation. As some indication at performance we list the performance numbers as reported in the actual papers. Please be aware that these numbers were measured on different systems under different configurations and serve -- at best -- as an indication of actual performance overhead.

Safe Stack / CPS / CPI:

Property Average Median Maximum Benchmark Reported in
Safe stack 0.0% 0.0% 4.1% C/C++ SPEC2006 CPU [1]
CPS 1.9% 0.4% 17.2% C/C++ SPEC2006 CPU [1]
CPI 8.4% 0.4% 44.2% C/C++ SPEC2006 CPU [1]

VTV and IFCC ([4]):

The paper introduces VTV (virtual-table verification) and IFCC (indirect function-call checks). The former protects virtual function calls, ensuring that, at any given call site, only the virtual function that corresponds to a static type of that call site or its subtype might be called. The IFCC restricts indirect function calls to either any valid function, or any function with the same number of arguments as the call site.

Note that neither VTV nor IFCC provide return addresses protection.

[4] reports "1% to 8.4% overhead" for VTV only as evaluated on 7 (out of 19) SPEC2006 CPU benchmarks and Google Chrome. The reported overhead for IFCC only is around 4%.