For my winternship and springternship at Trail of Bits, I researched novel techniques for symbolic execution on cryptographic protocols. I analyzed various implementation-level bugs in cryptographic libraries, and built a prototype Manticore-based concolic unit testing tool, Sandshrew, that analyzed C cryptographic primitives under a symbolic and concrete environment.
Sandshrew is a first step for crypto developers to easily create powerful unit test cases for their implementations, backed by advancements in symbolic execution. While it can be used as a security tool to discover bugs, it also can be used as a framework for cryptographic verification.
Playing with Cryptographic Verification
When choosing and implementing crypto, our trust should lie in whether or not the implementation is formally verified. This is crucial, since crypto implementations often introduce new classes of bugs like bignum vulnerabilities, which can appear probabilistically. Therefore, by ensuring verification, we are also ensuring functional correctness of our implementation.
There are a few ways we could check our crypto for verification:
- Traditional fuzzing. We can use fuzz testing tools like AFL and libFuzzer. This is not optimal for coverage, as finding deeper classes of bugs requires time. In addition, since they are random tools, they aren’t exactly “formal verification,” so much as a sotchastic approximation thereof.
- Extracting model abstractions. We can lift source code into cryptographic models that can be verified with proof languages. This requires learning purely academic tools and languages, and having a sound translation.
- Just use a verified implementation! Instead of trying to prove our code, let’s just use something that is already formally verified, like Project Everest’s HACL* library. This strips away configurability when designing protocols and applications, as we are only limited to what the library offers (i.e HACL* doesn’t implement Bitcoin’s secp256k1 curve).
What about symbolic execution?
Due to its ability to exhaustively explore all paths in a program, using symbolic execution to analyze cryptographic libraries can be very beneficial. It can efficiently discover bugs, guarantee coverage, and ensure verification. However, this is still an immense area of research that has yielded only a sparse number of working implementations.
Why? Because cryptographic primitives often rely on properties that a symbolic execution engine may not be able to emulate. This can include the use of pseudorandom sources and platform-specific optimized assembly instructions. These contribute to complex SMT queries passed to the engine, resulting in path explosion and a significant slowdown during runtime.
One way to address this is by using concolic execution. Concolic execution mixes symbolic and concrete execution, where portions of code execution can be “concretized,” or run without the presence of a symbolic executor. We harness this ability of concretization in order to maximize coverage on code paths without SMT timeouts, making this a viable strategy for approaching crypto verification.
After realizing the shortcomings in cryptographic symbolic execution, I decided to write a prototype concolic unit testing tool, sandshrew. sandshrew verifies crypto by checking equivalence between a target unverified implementation and a benchmark verified implementation through small C test cases. These are then analyzed with concolic execution, using Manticore and Unicorn to execute instructions both symbolically and concretely.
Writing Test Cases
We first write and compile a test case that tests an individual cryptographic primitive or function for equivalence against another implementation. The example shown in Figure 1 tests for a hash collision for a plaintext input, by implementing a libFuzzer-style wrapper over the
MD5() function from OpenSSL. Wrappers signify to sandshrew that the primitive they wrap should be concretized during analysis.
Sandshrew leverages a symbolic environment through the robust Manticore binary API. I implemented the
manticore.resolve() feature for ELF symbol resolution and used it to determine memory locations for user-written
SANDSHREW_* functions from the GOT/PLT of the test case binary.
Once Manticore resolves out the wrapper functions, hooks are attached to the target crypto primitives in the binary for concretization. As seen in Figure 2, we then harness Manticore’s Unicorn fallback instruction emulator, UnicornEmulator, to emulate the call instruction made to the crypto primitive. UnicornEmulator concretizes symbolic inputs in the current state, executes the instruction under Unicorn, and stores modified registers back to the Manticore state.
All seems well, except this: if all the symbolic inputs are concretized, what will be solved after the concretization of the call instruction?
Restoring Symbolic State
Before our program tests implementations for equivalence, we introduce an unconstrained symbolic variable as the returned output from our concretized function. This variable guarantees a new symbolic input that continues to drive execution, but does not contain previously collected constraints.
Mathy Vanhoef (2018) takes this approach to analyze cryptographic protocols over the WPA2 protocol. We do this in order to avoid the problem of timeouts due to complex SMT queries.
As seen in Figure 3, this is implemented through the concrete_checker hook at the
SANDSHREW_* symbol, which performs the unconstrained re-symbolication if the hook detects the presence of symbolic input being passed to the wrapper.
Once symbolic state is restored, sandshrew is then able to continue to execute symbolically with Manticore, forking once it has reached the equivalence checking portion of the program, and generating solver solutions.
Here is Sandshrew performing analysis on the example MD5 hash collision program from earlier:
The prototype implementation of Sandshrew currently exists here. With it comes a suite of test cases that check equivalence between a few real-world implementation libraries and the primitives that they implement.
Sandshrew has a sizable test suite for critical cryptographic primitives. However, analysis still becomes stuck for many of the test cases. This may be due to the large statespace needing to be explored for symbolic inputs. Arriving at a solution is probabilistic, as the Manticore z3 interface often times out.
With this, we can identify several areas of improvement for the future:
- Add support for allowing users to supply concrete input sets to check before symbolic execution. With a proper input generator (i.e., radamsa), this potentially hybridizes Sandshrew into a fuzzer as well.
- Implement Manticore function models for common cryptographic operations. This can increase performance during analysis and allows us to properly simulate execution under the Dolev-Yao verification model.
- Reduce unnecessary code branching using opportunistic state merging.
Sandshrew is an interesting approach at attacking the problem of cryptographic verification, and demonstrates the awesome features of the Manticore API for efficiently creating security testing tools. While it is still a prototype implementation and experimental, we invite you to contribute to its development, whether through optimizations or new example test cases.
Working at Trail of Bits was an awesome experience, and offered me a lot of incentive to explore and learn new and exciting areas of security research. Working in an industry environment pushed me to understand difficult concepts and ideas, which I will take to my first year of college.
Pingback: Trail of Bits @ ICSE 2019 – Recap | Trail of Bits Blog
Pingback: DeepState Now Supports Ensemble Fuzzing | Trail of Bits Blog