Maat: Symbolic execution made easy

Boyan Milanov

February 23, 2022

symbolic-execution, research-practice, program-analysis

We have released Maat, a cross-architecture, multi-purpose, and user-friendly symbolic execution framework. It provides common symbolic execution capabilities such as dynamic symbolic execution (DSE), taint analysis, binary instrumentation, environment simulation, and constraint solving.

Maat is easy to use, is based on p-code (the intermediate representation (IR) language of the popular Ghidra framework), prioritizes runtime performance, and has both a C++ and a Python API. Our goal is to create a powerful and flexible framework that can be used by both experienced security engineers and beginners who want to get started with symbolic execution.

While our Manticore tool offers a high-level interface to symbolically explore binaries, Maat is a lower-level symbolic execution toolkit that can be easily integrated into other projects or used to build stand-alone analysis tools. For a straight-to-the-point example, read our tutorial on how to solve a basic reverse engineering challenge with Maat.

A user-friendly, flexible API

Maat has a C++ programmatic API that can be used in low-level or performance-critical projects. It also offers Python bindings allowing users to easily and quickly write portable analysis scripts.

The API has been designed to give the user as much control as possible. Its debugger-like interface can be used to start, pause, and even rewind the symbolic execution process. Users can instrument the target code with arbitrary callback functions that are triggered by certain events (such as register and memory accesses and branch operations), write custom dynamic analyses, modify the program state at runtime, specify a particular state at which the process should stop, and even perform path exploration on a portion of a binary.
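To make the debugger-like workflow concrete, here is a minimal toy sketch in Python. It does not use Maat's API: the ToyEngine class, the Event and Action enums, the three-instruction mini language, and every method name are hypothetical, chosen only to illustrate event-triggered callbacks, stopping at a chosen state, and rewinding through a snapshot.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, Dict, List, Tuple


class Event(Enum):
    REG_WRITE = auto()
    BRANCH = auto()


class Action(Enum):
    CONTINUE = auto()
    HALT = auto()


@dataclass
class ToyEngine:
    """Hypothetical stand-in for an execution engine (not Maat's API)."""
    # Instructions: ("set", reg, value), ("add", reg, value), ("jnz", reg, target)
    program: List[Tuple[str, str, int]]
    regs: Dict[str, int] = field(default_factory=dict)
    pc: int = 0
    hooks: Dict[Event, List[Callable]] = field(default_factory=dict)
    snapshots: List[Tuple[int, Dict[str, int]]] = field(default_factory=list)

    def add_hook(self, event: Event, callback: Callable) -> None:
        self.hooks.setdefault(event, []).append(callback)

    def take_snapshot(self) -> None:
        # Save a copy of the program counter and registers.
        self.snapshots.append((self.pc, dict(self.regs)))

    def restore_snapshot(self) -> None:
        # Rewind to the most recently saved state.
        self.pc, self.regs = self.snapshots.pop()

    def _fire(self, event: Event, *args) -> bool:
        # Run every callback for the event; report whether any asked to halt.
        results = [cb(self, *args) for cb in self.hooks.get(event, [])]
        return Action.HALT in results

    def run(self) -> str:
        while self.pc < len(self.program):
            op, reg, val = self.program[self.pc]
            if op == "jnz":
                taken = self.regs.get(reg, 0) != 0
                # Branch callbacks fire before the branch is executed.
                if self._fire(Event.BRANCH, self.pc, taken):
                    return "halted"
                self.pc = val if taken else self.pc + 1
            else:
                new = val if op == "set" else self.regs.get(reg, 0) + val
                self.regs[reg] = new
                self.pc += 1
                # Register-write callbacks fire after the write is applied.
                if self._fire(Event.REG_WRITE, reg, new):
                    return "halted"
        return "exited"


# A countdown loop: eax goes 3, 2, 1, 0, then the branch falls through.
program = [("set", "eax", 3), ("add", "eax", -1), ("jnz", "eax", 1), ("set", "ebx", 42)]
writes = []


def log_write(engine, reg, value):
    writes.append((reg, value))
    return Action.CONTINUE


def stop_on_fallthrough(engine, pc, taken):
    return Action.CONTINUE if taken else Action.HALT


engine = ToyEngine(program)
engine.add_hook(Event.REG_WRITE, log_write)
engine.add_hook(Event.BRANCH, stop_on_fallthrough)
engine.take_snapshot()       # remember the initial state
status = engine.run()        # stops at the first branch that is not taken
halted_pc = engine.pc
engine.restore_snapshot()    # rewind to the initial state
print(status, halted_pc, writes, engine.pc, engine.regs)
```

A real engine tracks symbolic expressions and path constraints rather than plain integers, but the control flow is the same: callbacks observe or veto each step, and snapshots make it cheap to return to an earlier point and try something else.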

Last but not least, Maat’s execution engine has customizable settings that allow users to control its fundamental behavior in processing symbolic data. These settings include policies for dealing with symbolic pointers, saving state constraints, and making symbolic simplifications, among other customizations. The default settings prioritize soundness over performance and suit the most general use cases, but advanced users can tailor the engine to their own use cases and bypass certain limitations of the defaults.

Rich architecture support

With Maat, we want to bring symbolic execution capabilities to as many architectures as possible. To do so, we based Maat’s symbolic execution engine on p-code, the IR language used by Ghidra. By basing Maat on p-code, we were able to leverage Ghidra’s awesome C++ library, sleigh, for disassembling and lifting binary code, which already supports a very broad range of architectures. The cherry on top: Maat uses a separate standalone version of sleigh, so you don’t have to install Ghidra to use Maat.

The use of sleigh brings three major advantages to Maat:

- The ability to perform symbolic execution on any architecture supported by Ghidra
- The reliability of a very popular, open-source, and actively supported disassembling and lifting library
- The possibility of adding new architectures using the sleigh specification language
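Why an IR makes the engine architecture-independent can be seen in a small sketch. Once machine code has been lifted, the engine only needs to understand a handful of IR operations. The opcode names below (COPY, INT_ADD, INT_MULT, INT_XOR) are real p-code operations, but the tuple encoding, the Sym and BinOp classes, and the helper functions are hypothetical simplifications and do not reflect how Maat or sleigh represent p-code internally.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Sym:
    """A named symbolic variable."""
    name: str


@dataclass(frozen=True)
class BinOp:
    """A symbolic expression node: op applied to lhs and rhs."""
    op: str
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[int, Sym, BinOp]
MASK32 = 0xFFFFFFFF

# Concrete semantics of each binary operation on 32-bit values.
CONCRETE = {
    "INT_ADD": lambda a, b: (a + b) & MASK32,
    "INT_MULT": lambda a, b: (a * b) & MASK32,
    "INT_XOR": lambda a, b: (a ^ b) & MASK32,
}
INFIX = {"INT_ADD": "+", "INT_MULT": "*", "INT_XOR": "^"}


def execute(ops, state: Dict[str, Expr]) -> Dict[str, Expr]:
    """Run IR ops; string operands name registers, ints are constants."""
    state = dict(state)
    for opcode, out, *inputs in ops:
        values = [state[v] if isinstance(v, str) else v for v in inputs]
        if opcode == "COPY":
            state[out] = values[0]
        elif all(isinstance(v, int) for v in values):
            state[out] = CONCRETE[opcode](*values)   # fold concrete values
        else:
            state[out] = BinOp(opcode, *values)      # keep it symbolic
    return state


def render(expr: Expr) -> str:
    if isinstance(expr, int):
        return str(expr)
    if isinstance(expr, Sym):
        return expr.name
    return f"({render(expr.lhs)} {INFIX[expr.op]} {render(expr.rhs)})"


def evaluate(expr: Expr, model: Dict[str, int]) -> int:
    """Substitute concrete values for symbols and compute the result."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, Sym):
        return model[expr.name]
    return CONCRETE[expr.op](evaluate(expr.lhs, model), evaluate(expr.rhs, model))


# Hypothetical lifted code: eax = (edi + 4) ^ (6 * 7), with edi symbolic.
ops = [
    ("COPY", "eax", "edi"),
    ("INT_ADD", "eax", "eax", 4),
    ("COPY", "ecx", 6),
    ("INT_MULT", "ecx", "ecx", 7),
    ("INT_XOR", "eax", "eax", "ecx"),
]
final = execute(ops, {"edi": Sym("arg0")})
print(render(final["eax"]))                    # ((arg0 + 4) ^ 42)
print(evaluate(final["eax"], {"arg0": 38}))    # 0
```

Nothing in execute depends on the source architecture: supporting a new processor is a matter of lifting its instructions to the same operations.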

While Maat has been tested only on x86 and x64 so far, we plan to add interfaces for other architectures soon. We are particularly excited by the prospect of introducing support in Maat for exotic architectures that are not currently supported by any existing tool; sleigh’s unrivaled architecture support makes this possible. Another thrilling opportunity is the use of Maat to perform symbolic execution on virtual machine bytecode such as Java, Dalvik, and Ethereum bytecode.

Performance-driven

It can be a struggle to scale symbolic execution to real-world applications. For generic, binary-only symbolic execution tools, significant runtime overhead is inherent to lifting and executing an IR; it is simply unavoidable. That being said, in any reasonable day-to-day workflow, scripts that run within minutes instead of hours can make all the difference. We thus put care into the design and implementation of Maat so that it runs as fast as possible while also yielding useful results.

The core of Maat is written entirely in C++, many developers’ language of choice for optimizations and performance. We do our best to write efficient code without sacrificing code readability or restricting features. Maat’s runtime performance can vary widely depending on the amount of symbolic computation, the number of calls to the SMT solver, and the cost of user-provided analysis callbacks, but our early experimental measurements are quite promising, with 100,000 to 300,000 instructions symbolically executed per second on a typical laptop (2.3 GHz Intel Core i7, 32 GB RAM).
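As a rough back-of-the-envelope illustration of what that throughput range means in practice, the sketch below estimates how long a trace would take at each end of the range. The trace length of one billion instructions is a hypothetical figure, and the estimate ignores solver time and callback overhead, which can dominate real runs.

```python
def estimated_seconds(instruction_count: int, instructions_per_second: int) -> float:
    """Time to execute a trace at a constant throughput."""
    return instruction_count / instructions_per_second


LOW_RATE, HIGH_RATE = 100_000, 300_000   # range quoted in the text
TRACE_LENGTH = 1_000_000_000             # hypothetical trace size

slowest = estimated_seconds(TRACE_LENGTH, LOW_RATE)
fastest = estimated_seconds(TRACE_LENGTH, HIGH_RATE)
print(f"{fastest / 60:.1f} to {slowest / 60:.1f} minutes")
```

Under these assumptions, a billion-instruction trace lands somewhere between just under an hour and just under three hours, so the measured threefold spread in throughput is the difference between a script that fits in a lunch break and one that does not.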

We also plan on adding and exposing introspection capabilities to allow users to identify runtime bottlenecks. This will not only help end users to optimize their analysis scripts for their specific use cases but also enable us to make more fundamental improvements to Maat’s core components.

How to get started

To install Maat, simply run python3 -m pip install pymaat. Then check out our series of tutorials for guidance on using it. The series currently offers a few basic tutorials; our long-term goal is to provide a more comprehensive series that covers the basics of the framework as well as advanced applications and complex features.
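After installing, a quick way to confirm that the Python bindings are visible to your interpreter is to ask the import system whether the module can be found. This sketch assumes the pymaat package installs a top-level module named maat; check the installation instructions if the lookup fails.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "maat" as the module name is an assumption about how pymaat is packaged.
maat_available = is_installed("maat")
print("Maat importable:", maat_available)
```

If this prints False, the package was most likely installed for a different Python interpreter than the one running the check.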

Curious readers can check out Maat’s source code on GitHub! Along with the tutorials, you will find installation instructions and C++/Python API documentation on Maat’s website.

Finally, join our GitHub discussions for questions and feedback—let us know what you think!