Type++: A Type-Safe C++ Dialect

The C++ language combines massive potential for raw power with an equally massive risk of type and memory safety violations. The developer is responsible for securing all executed code and for guaranteeing both type safety and memory safety. We focus on type safety. In C++, developers can cast objects from one type to another. While upcasts in the type lattice are always safe, downcasts may be unsafe because the size (and fields) of a child object may differ from those of the parent. In existing code bases, few casts, if any, are validated at runtime. The reason is not just the performance cost but an inherent incompatibility: because of C++'s relationship with C, most objects are plain arrays of bytes without any associated type information. Only a few classes, namely those with virtual functions, carry type information. To implement virtual dispatch, C++ implementations add a hidden field to each object of such a class that identifies its type, so that a virtual call invokes the method matching the underlying type of the object. Only a small percentage of classes have virtual methods, and only those can be checked at runtime.
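
The difference between checkable and uncheckable casts can be seen in a minimal sketch. All class and function names below are hypothetical and chosen only for illustration; the point is that `dynamic_cast` works only for classes with at least one virtual function, while a `static_cast` downcast on a plain class is never validated.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical classes, for illustration only.
struct Base { int id = 0; };                     // no virtual functions: no runtime type information
struct Derived : Base { long extra[4] = {}; };   // child object is larger than the parent

struct PolyBase { virtual ~PolyBase() = default; };  // virtual function: objects carry type information
struct PolyDerived : PolyBase { int payload = 42; };
struct PolyOther : PolyBase {};

static_assert(!std::is_polymorphic<Base>::value, "Base carries no type information");
static_assert(std::is_polymorphic<PolyBase>::value, "PolyBase carries type information");
static_assert(sizeof(Derived) > sizeof(Base), "a bad downcast would reach past the object");

// dynamic_cast consults the runtime type and yields nullptr on a mismatch.
inline bool is_poly_derived(PolyBase* p) {
    assert(p != nullptr);
    return dynamic_cast<PolyDerived*>(p) != nullptr;
}

// For the non-polymorphic hierarchy, only static_cast compiles, and it checks nothing:
//   Base b;
//   Derived* d = static_cast<Derived*>(&b);  // compiles, but is undefined behavior
//   dynamic_cast<Derived*>(&b);              // does not compile: Base is not polymorphic
```

Calling `is_poly_derived` on a `PolyOther` object returns `false`, whereas the commented-out `static_cast` would silently hand back a pointer to memory that is not a `Derived`.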

To ensure compatibility between C and C++, existing mechanisms stored type metadata disjoint from the objects themselves. For each allocated object they recorded type information, and for each cast they checked that the object being cast actually has the expected type. This disjoint metadata is expensive to manage and leads to incompatibilities, because cleaning up the metadata after objects are destroyed is often omitted.
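
As a rough sketch of the disjoint-metadata approach, consider a side table keyed by object address. This is not the code of any particular tool; `TypeTable`, `TypeId`, and the hook names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical type identifiers for a two-class hierarchy.
enum class TypeId : std::uint32_t { Base, Derived };

struct Base { int id = 0; };
struct Derived : Base { int extra = 0; };

// Disjoint metadata: type information lives in a side table, not in the object.
class TypeTable {
public:
    // Must be called for every allocation.
    void on_alloc(const void* p, TypeId t) {
        assert(p != nullptr);
        table_[p] = t;
    }
    // Must be called for every deallocation; forgetting it leaves a stale entry.
    void on_free(const void* p) { table_.erase(p); }

    // A downcast to Derived is allowed only if the table records a Derived at p.
    bool may_cast_to_derived(const void* p) const {
        auto it = table_.find(p);
        return it != table_.end() && it->second == TypeId::Derived;
    }
    std::size_t size() const { return table_.size(); }

private:
    std::unordered_map<const void*, TypeId> table_;
};
```

Every allocation, deallocation, and cast pays for a table operation. If `on_free` is skipped, the stale entry keeps describing whatever object later reuses that address, which is the kind of incompatibility described above.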

Our key idea with type++ is to create a dialect of C++ that explicitly incorporates type information into all classes. This embedded type information, which is present in every allocated object, allows efficient type checks for all casts. In our implementation, we make all type casts explicit and use this runtime information to validate each cast.
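
To illustrate the general idea of embedded type information (not the actual type++ mechanism, whose implementation is not described here), the sketch below stores a tag inside every object and checks it on each downcast. `TypeTag`, `Shape`, and `checked_cast_to_circle` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tags for a flat, three-class hierarchy.
enum class TypeTag : std::uint8_t { Shape, Circle, Square };

struct Shape {
    TypeTag tag;  // embedded type information, present in every object
    explicit Shape(TypeTag t = TypeTag::Shape) : tag(t) {}
};

struct Circle : Shape {
    double radius = 1.0;
    Circle() : Shape(TypeTag::Circle) {}
};

struct Square : Shape {
    double side = 2.0;
    Square() : Shape(TypeTag::Square) {}
};

// Checked downcast: returns nullptr unless the embedded tag says "Circle".
// Exact-match comparison suffices only because this hierarchy is flat.
inline Circle* checked_cast_to_circle(Shape* s) {
    if (s == nullptr || s->tag != TypeTag::Circle) {
        return nullptr;
    }
    return static_cast<Circle*>(s);
}
```

Because the tag travels with the object, there is no side table to update or clean up, and the check is a single load and compare. A deeper hierarchy would need a subtype test instead of an equality test.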

Our evaluation shows that even for large projects, only a few lines need to be changed. For example, across the millions of lines of code in the SPEC CPU benchmarks, we change only 125 lines for SPEC CPU2006 and 131 lines for SPEC CPU2017. For Chromium, we change only 229 lines of code to protect large parts of its 35 million lines of code. The performance overhead is around 1%. We conclude that enforcing type safety across even large projects is feasible with minimal code changes. Developers should aim for full type safety and protect their code against type confusion attacks!

This work was a collaboration among Nicolas Badoux, Flavio Toffalini, Yuseok Jeon, and Mathias Payer, all at HexHive, as part of Nicolas' main PhD project. It received all artifact evaluation badges and, at the conference, a distinguished paper award.
