After heavy advertising of the NSF TTP (Transition to Practice) program at
several conferences throughout last year, I decided to submit a proposal
last fall. The NSF TTP program is supposed to help transition research into
practice, either by forming a company to commercialize a prototype or by
developing a full usable implementation of a research prototype.
I thought I had identified a key issue with software security that I wanted to
address. At every security conference, several papers propose new
mitigations and sanitizers to protect against different attack
vectors. These defenses are generally evaluated using some prototype
implementation, often on an old version of LLVM. Few academic defenses
are open sourced and even fewer (none?) are upstreamed or integrated into LLVM
itself; most are simply left to bit rot. This poses two problems: usability and
maintainability.
First, and most severely, these defenses are not usable and therefore don't
drive the state of security forward. While they claim their little space in the
academic landscape, they will not be used in practice. To be usable, defenses
must be part of the core compiler platform, as LLVM-CFI and the different
sanitizers are. Developers can then enable them by simply using a compiler
switch.
Second, even if open-sourced, they often rely on an old, obscure version of
LLVM. If the prototype can be compiled at all, it may not be compatible with
recent software. For example, compiling Google Chromium generally relies on the
most recent head version of LLVM and any older LLVM version will throw errors.
Open-sourced prototypes are generally not integrated into the LLVM development
platform and therefore will rot away quickly.
These two problems are not necessarily a fault of academia. The job of academics
is to provide a reasonable working prototype that shows the feasibility of the
system for reasonable software. Providing a complete implementation for any
software is usually too complex, and upstreaming and maintaining the software
forever would incur too much overhead. A graduate student should rather work on
the next research problem than on maintaining code. Maintaining code and
upstreaming software is not part of a graduate student's job profile.
The goal of my TTP proposal was to identify high-profile mitigations and
sanitizers and turn them from research prototypes into usable defenses by
integrating them into the LLVM platform and making them available to the general
public. Following the idea of LLVM-CFI, the proposed mitigations focused on
control-flow hijacking and sanitization focused on type safety -- both important
and upcoming areas that have several gaps that need to be filled. For example,
while there are three concurrent type safety sanitizer prototypes, none of them
has been integrated and upstreamed into LLVM due to the long and difficult
upstreaming process.
My thought was that getting sanitizers and mitigations into LLVM was a core
achievement that is reasonable in itself as a transition-to-practice exercise.
The upstreaming process takes a lot of time and resources, including
turning a research prototype into a full implementation, testing, code review,
and online discussion. Upstreaming makes a defense available to all developers
at the flick of a compiler switch. This by itself vastly increases the impact of a
given defense. For the proposal, I identified reasonable mitigations and
sanitizers and proposed a plan on how to get them into production. The core of
the proposal focused on the upstreaming process and guaranteeing code quality
with another part focusing on outreach such as speaking at LLVM developer
conferences, hacker conferences, and industry conferences to spread information
about the different sanitizers and mitigations -- with the goal of fitting into
the existing dissemination process.
Today, I received the reviews and the proposal was ranked "low competitive" and
not funded. The main points against the proposal were (and I quote):
- "While the goals of incorporating work into LLVM is certainly worthwhile and
potentially high impact, there is no target adopter that is identified in the
proposal, nor is there a milestone as to when an early adopter is to be
named."
- "This is a technically sound proposal, but with a weak transition plan, which
is very important in the TTP program."
- "The PI emphasized presenting at conferences as a key transition and outreach.
From the proposal itself, the PI also notes 'Many developers are not aware of
the current research (and do not care)'. These developers are not at key
conferences and do not care if they do attend. The proposal should identify
some more concrete activities to interact with this community."
- "This transition to practice proposal has no supporting letters from any
industry collaborators."
The negative reviewers all point towards the lack of industry interaction and
want to see both a concrete plan on how to reach out to individual programmers
(programmers using LLVM, not the LLVM developers) and industry letters. I
consider both of these comments unreasonable. LLVM is an open-source product and
therefore steered by the open-source community, so industry letters are out of
scope. Still, LLVM is the main compiler used for many software systems (Google
and Apple use LLVM for all their platforms), so defenses that land in LLVM have
high impact.
In the proposal, I made the point that educating individual programmers (not
LLVM developers) was unreasonable as most programmers will not care about
security. The proposed approach was to convince LLVM maintainers that the
mitigations are reasonable, turning them on by default and thereby protecting
the general set of programmers. Reaching out to LLVM maintainers happens at LLVM
developer meetings, which I proposed to attend.
Orthogonally, the benefit of sanitizers can be emphasized by providing features
that help programmers during software development. Tutorials and testing
platforms will educate programmers and show them how to use these sanitizers to
test their code, thereby increasing security and making their software more
resilient against attacks.
Overall, I'm a little disappointed at the reviews. The reviewers primarily
focused on industry collaboration and commercialization. Bringing increased
security to open-source products, to programmers, and, through indirection, to
all software products compiled by LLVM seems to be out of scope for an NSF SaTC
transition to practice proposal.
Let me add as a disclaimer that the NSF reviews are generally good, insightful,
and deep. The reviewers generally do a great job and the merit-based review,
while harsh, provides a fair evaluation of the submitted proposals. A challenge
for these panels is the alignment and calibration across panels as the dynamic
can vastly differ from one panel to another. And, unfortunately, some reviews
are subpar. What made me a little sad is that this was the second NSF proposal
that was rejected with two very short reviews. I'll include one in its full
length:
"A strength is this proposal focuses on mitigations of Control Flow Hijacking
from C or C++ code bases which addresses an area that is vulnerable for many
current attack vectors. A further strength is the proposed further development
of a testing environment using open source."
"A strength is that this proposal envisions regular outreach with developer and
compiler communities. A related weakness is a lack of concrete plans for
outreach other than for one international hacker conference. The outreach plan
would benefit from some re-thinking."
"Very specifically deals with practices that can easily be taught in
universities. The team may want to explore if there are any synergies with the
Software Assurance Marketplace https://continuousassurance.org."
This review has very little useful information. If I received such a review at
a conference, I'd complain. I've had several such NSF reviews. Having served on
NSF panels myself and serving on lots of program committees, seeing such reviews
makes me a little sad. Reviewing is part of the academic job profile. If you are
not interested in doing a good job (or not able to, due to other constraints),
then decline. But this discussion should be the topic of another blog post.
In the spirit of sharing negative results, I figured that this rejection would
be a good example. I thought that the idea, enabling broad usage of academic
sanitizers and mitigations, was a great fit for the NSF SaTC TTP as it would
increase the security guarantees of our systems at large, indirectly by
protecting software compiled with the updated compiler.
The lesson I learned from this TTP proposal is to explicitly state my
assumptions. For example, upstreaming into LLVM is a significant achievement
that indirectly allows all programmers to profit from new default defense
mechanisms. Another point is to clarify the use of open-source software and the
development practices used for open-source software. I'll also have to clarify
differences between academic conferences where new research is discussed,
developer conferences where new features such as sanitizers or mitigations are
discussed, and hacker conferences where applied usage of these tools is
discussed. NSF draws reviewers from different areas and not everyone will be
familiar with these different nuances and terms.
As always, please let me know your comments, thoughts, and concerns. I'm always
happy to share proposals (both funded and unfunded) given reasonable requests.