This blog has promoted control flow integrity (CFI) as a game changing security mitigation and encouraged its use. We wanted to take our own security advice and start securing software we use. To that end, we decided to apply CFI to facebook’s osquery, a cross-platform codebase with which we are deeply familiar. Using osquery, we could compare clang’s implementation of CFI (ClangCFI) against Visual Studio’s Control Flow Guard (CFGuard).
That comparison never happened.
Instead, this blog post is going to be about a very important but underappreciated aspect of security mitigations: development costs and ease of use. We will describe our adventures in applying control flow integrity protections to osquery, and how seemingly small tradeoffs in security mitigations have serious implications for usability.
The plan was simple: we would enable CFGuard for the Windows build of osquery, and ClangCFI for the Linux build of osquery. The difference between the protected and unprotected builds on osquery’s test suite would be the quantitative measurement. We’d contribute our patches back to the osquery code, resulting in a great blog post and a more secure osquery.
We got the Windows build of osquery running with CFGuard in about 15 minutes. Here is the pull request to enable CFGuard on osquery for Windows. The changes are two lines in one CMake script.
Even after weeks of effort, we still haven’t managed to enable ClangCFI on the Linux build. The discrepancy is a direct result of well meaning security choices with surprisingly far reaching consequences. The effort wasn’t for naught; we reported two clang bugs (one and two), hit a recently resolved issue, and had very insightful conversations with clang developers. They very patiently explained details of ClangCFI, identified the issues we were seeing, and graciously offered debugging assistance.
Let’s take a step-by-step walk through each security choice and the resulting consequences.
ClangCFI is stricter than CFGuard
For every protected indirect call, ClangCFI permits fewer valid destinations than CFGuard. This is good: fewer destinations means less ways to turn a bug into an exploit. ClangCFI also detects more potential errors than CFGuard (e.g. cast checks, ensuring virtual call destinations fall in the object hierarchy, etc.).
The specifics of what each CFI scheme permits has critical usability implications. For ClangCFI, an indirect call destination must match the type signature at the call site. The ClangCFI virtual method call checks are even stricter. For example, ClangCFI checks that the destination method belongs to the same object hierarchy. For CFGuard, an indirect call destination can be a valid function entry point .
ClangCFI’s type signature validation and virtual method call checks require whole-program analysis. The whole program analysis requirement results in two additional requirements:
- In general, every linked object and static library that comprise the final program must be built with CFI enabled .
- Link-time optimization (LTO) is required when using ClangCFI, because whole-program analysis is not possible until link time.
The new requirements are sensible: requiring CFI on everything ensures no part of the program is unprotected. LTO not only allows for whole-program analysis but also whole-program optimization, potentially offsetting CFI-related performance losses.
The looser validation standard used by CFGuard is less secure, but does not require whole-program analysis. Objects built with CFGuard validate indirect calls; objects built without CFGuard do not. Both objects can coexist in the same program. The linker, however, must be aware of CFGuard in order to emit a binary with appropriate tables and flags in the PE header.
ClangCFI is all or nothing. CFGuard is incremental.
In general, ClangCFI must be enabled for every object file and static library in a program: it is an error to link CFI-enabled code with non-CFI code . The error is easy to make but difficult to identify because the linker does not inspect objects for ClangCFI protections. The linker will not report errors, but the resulting executable will fail runtime CFI checks.Osquery, by design, statically links every dependency, including libc++. Those dependencies statically link other dependencies, and so on. To enable ClangCFI for osquery, we would have to enable ClangCFI for the entire osquery dependency tree. As we’ll see in the next section, that is a difficult task. We could not justify that kind of time commitment for this blog post, although we would love to do this in the future.
CFGuard can be applied on a per-compilation unit level. The documentation for CFGuard explicitly mentions that it is permissible to mix and match CFG-enabled and non-CFG objects and libraries . Calls across DSOs (i.e DLLs, in Windows terminology) are fully supported. This flexibility was critical for enabling CFGuard for osquery; we enabled CFGuard for osquery itself and linked against existing unprotected dependencies. Fortunately, Windows ships with CFGuard-protected system libraries that are utilized when the main program image supports CFGuard. The unprotected code is limited to static libraries used while building osquery.
ClangCFI is too strict for some codebases
ClangCFI is too strict for some codebases. This is not clangs’ fault: some code uses shortcuts and conveniences that may not be strictly standards compliant. We ran into this issue when trying to enable ClangCFI for strongSwan. Our goal was to attempt a smaller example than osquery, and to create a security-enhanced version of strongSwan for Algo, our VPN solution.
We were not able to create a CFI-enabled version of strongSwan because libstrongswan, the core component of strongSwan, uses an OOP-like system for C. This system wraps most indirect calls with an interface that fails ClangCFI’s strict checks. ClangCFI is technically correct: the type signatures of caller and callee should match. In practice, there is shipping code where they do not.
Thankfully ClangCFI has a feature to relax strictness: the CFI blacklist. The blacklist will disable CFI checks for source files, functions, or types matching a regular expression. Unfortunately, in this case, almost every indirect call site would have to be blacklisted, making CFI effectively useless.
CFGuard is unlikely to cause the same issue: there is (probably) some code that indirect calls to the middle of a function, but such code is orders of magnitude more rare than mismatched type signatures.
From a security perspective, ClangCFI is “better” than CFGuard. It is stricter, it requires the whole program to be protected, and it tests for more runtime errors. It is possible to utilize ClangCFI to protect large and complex codebases: the excellent Google Chrome team does it. However, the enhanced security comes with a steep cost. Enabling ClangCFI can turn into a complex undertaking that requires considerable developer time and rigorous testing.
Conversely, CFGuard is considerably more flexible. A program can mix guarded and unguarded code, and CFGuard is much less likely to break existing code. These compromises make CFGuard much easier to enable for existing codebases.
Our experience using ClangCFI and CFGuard reflects these tradeoffs. A ClangCFI-enabled osquery would be more secure than the CFGuard-enabled osquery. However, the CFGuard-enabled osquery for Windows exists right now. The ClangCFI-enabled osquery for Linux is still a work-in-progress after weeks of trial and error.
 This is not strictly true. For example, suppressed functions are function entry points but invalid indirect call destinations.
 Again, this not strictly true; there are specific exceptions to the mixing rule. For example, the CFI support library is not built with CFI. Linking CFI and non-CFI objects is fine if every function in the non-CFI object is only called directly. See this comment by Evgeniy Stepanov.
 from this page: “… a mixture of CFG-enabled and non-CFG enabled code will execute fine.”
Pingback: A walk down memory lane | Trail of Bits Blog
Pingback: 2017 in review | Trail of Bits Blog