Buttercup is now open-source!

Trail of Bits

August 08, 2025

aixcc, research-practice, darpa, machine-learning, tool-release


We’re thrilled to announce that Trail of Bits won second place in DARPA’s AI Cyber Challenge (AIxCC)! Now that the competition has ended, we can finally make Buttercup, our cyber reasoning system (CRS), open source. We can’t wait to see how the security community uses, extends, and benefits from it.

To ensure as many people as possible can use Buttercup, we created a standalone version that runs on a typical laptop. We’ve also tuned this version to work within an AI budget appropriate for individual projects rather than a large-scale competition. Alongside the standalone version, we’re open-sourcing the versions of Buttercup that competed in AIxCC’s semifinal and final rounds.

In the rest of this post, we’ll provide a high-level overview of how Buttercup works, how to get started using it, and what’s in store for it next. If you’d prefer to go straight to the code, check it out here on GitHub.

How Buttercup works

Buttercup is a fully automated, AI-driven system for discovering and patching vulnerabilities in open-source software. Buttercup has four main components:

Orchestration/UI: Coordinates the overall actions of Buttercup’s other components and displays information about vulnerabilities discovered and patches generated by the system. In addition to a typical web interface, Buttercup also reports its logs and system events to a SigNoz telemetry server to make it easy for users to see what Buttercup is doing.
Vulnerability discovery: Uses AI-augmented mutational fuzzing to find program inputs that demonstrate vulnerabilities in the program. Buttercup’s vulnerability discovery engine is based on OSS-Fuzz/Clusterfuzz and uses libFuzzer and Jazzer to find vulnerabilities.
Contextual analysis: Uses traditional static analysis tools to create queryable program models that provide context to the AI models used in vulnerability discovery and patching. Buttercup uses tree-sitter and CodeQuery to build the program model.
Patch generation: A multi-agent system for creating and validating software patches for vulnerabilities discovered by Buttercup. Buttercup’s patch generation system uses seven distinct AI agents to create robust patches that fix the vulnerabilities it finds without breaking the program’s other functionality.
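
To make the idea of mutational fuzzing concrete, here is a minimal toy sketch in Python. It is not Buttercup’s engine, which builds on libFuzzer and Jazzer; the names mutate, toy_target, and fuzz are hypothetical and exist only for illustration. The seeds argument stands in for the place where an AI-based input generator could contribute starting inputs.

```python
import random


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random mutation: flip a bit, insert a byte, or delete a byte."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 and buf:
        pos = rng.randrange(len(buf))
        buf[pos] ^= 1 << rng.randrange(8)
    elif choice == 1:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def toy_target(data: bytes) -> None:
    """Hypothetical buggy program: fails on any input that starts with b'BUG'."""
    if data[:3] == b"BUG":
        raise RuntimeError("simulated memory-safety violation")


def fuzz(seeds, target, iterations, rng):
    """Mutate corpus entries, run the target, and collect crashing inputs."""
    corpus = list(seeds)
    crashes = []
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            target(candidate)
        except Exception as exc:
            # A crashing input is what the post calls a proof of vulnerability.
            crashes.append((candidate, repr(exc)))
        else:
            # Simplification: real fuzzers keep only inputs that add coverage.
            corpus.append(candidate)
    return crashes
```

Calling fuzz([b"BUx"], toy_target, 10_000, random.Random()) will often, though not always, stumble onto a crashing input. Production fuzzers add coverage feedback, sanitizers, and timeouts on top of this basic loop.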

The following flowchart depicts how these components help Buttercup discover and patch vulnerabilities:

Figure 1: Conceptual overview of Buttercup’s vulnerability discovery and patching pipeline

When Buttercup is started, it waits for tasking from the user in the form of an OSS-Fuzz-compatible source code repository. Once tasked, Buttercup retrieves the code repository, builds the program with and without various sanitizers enabled, and begins fuzzing the program with the help of an AI-based input generator. When inputs trigger sanitizers, timeouts, or crashes in the program, these inputs are recorded as proofs of vulnerability (PoVs).

Next, Buttercup’s orchestrator deduplicates PoVs and sends unique crashes to the patch generation system for patching. The patch generation system, using information from the contextual analysis system, iteratively creates, tests, and refines patches until it generates a patch that 1) prevents the PoV and its duplicates from triggering the vulnerability and 2) maintains the program’s other functions. Finally, Buttercup’s orchestrator retains the PoVs and patches so they can be reported to the user.
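
The deduplicate-then-patch flow described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not Buttercup’s implementation: the PoV fields, the idea of grouping by a crash signature, and the propose, still_crashes, and tests_pass callables are all hypothetical stand-ins for the orchestrator, the AI agents, and the validation steps.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class PoV:
    input_bytes: bytes
    # Assumption: a signature such as sanitizer type plus top stack frames.
    crash_signature: str


def deduplicate(povs: Iterable[PoV]) -> dict[str, list[PoV]]:
    """Group PoVs that share a crash signature so each bug is patched once."""
    groups: dict[str, list[PoV]] = {}
    for pov in povs:
        groups.setdefault(pov.crash_signature, []).append(pov)
    return groups


def generate_patch(
    povs: list[PoV],
    propose: Callable[[str], str],
    still_crashes: Callable[[str, PoV], bool],
    tests_pass: Callable[[str], bool],
    max_attempts: int = 5,
) -> Optional[str]:
    """Create, test, and refine patches until one satisfies both conditions."""
    feedback = ""
    for _ in range(max_attempts):
        patch = propose(feedback)
        # Condition 1: the PoV and its duplicates no longer trigger the bug.
        failing = [p for p in povs if still_crashes(patch, p)]
        if failing:
            feedback = f"{len(failing)} PoV(s) still trigger the vulnerability"
            continue
        # Condition 2: the program's other functionality still works.
        if not tests_pass(patch):
            feedback = "patch breaks existing functionality"
            continue
        return patch
    return None  # give up after max_attempts; the PoV is still reported
```

Feeding the reason for each rejection back into the next proposal is what makes the loop iterative rather than a series of independent guesses.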

Getting started

We’ve made it easy for individual users to get Buttercup up and running on a typical laptop. Buttercup works best on x86-64 Linux systems, and it partially supports ARM64 systems such as macOS devices. To run Buttercup, you’ll need at least 8 CPU cores, 16 GB of RAM, 100 GB of free disk space, and an active network connection. You’ll also need to provide an API key for at least one third-party LLM provider, such as OpenAI or Anthropic. Don’t worry: we make it easy to set a cost limit so Buttercup doesn’t run up an unexpectedly large bill.
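
If you want to check a machine against these minimums before installing anything, a small script along these lines may help. It is a rough sketch that is not part of Buttercup; the function names are hypothetical, the RAM probe relies on os.sysconf and so only works on Unix-like systems, and it does not test the network connection or API keys.

```python
import os
import shutil

# Minimums quoted in the text above.
MIN_CPU_CORES = 8
MIN_RAM_GB = 16
MIN_FREE_DISK_GB = 100
# Decimal gigabytes, so a "16 GB" machine is not rejected for kernel-reserved memory.
GB = 10 ** 9


def total_ram_gb():
    """Return physical RAM in GB, or None where os.sysconf cannot report it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / GB
    except (AttributeError, ValueError, OSError):
        return None


def check_requirements(cpu_cores, ram_gb, free_disk_gb):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if cpu_cores is None or cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU cores: found {cpu_cores}, need at least {MIN_CPU_CORES}")
    if ram_gb is None or ram_gb < MIN_RAM_GB:
        problems.append(f"RAM (GB): found {ram_gb}, need at least {MIN_RAM_GB}")
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(f"Free disk (GB): found {free_disk_gb}, need at least {MIN_FREE_DISK_GB}")
    return problems


if __name__ == "__main__":
    free_gb = shutil.disk_usage(".").free / GB
    for problem in check_requirements(os.cpu_count(), total_ram_gb(), free_gb):
        print("WARNING:", problem)
```

Keeping check_requirements free of system calls means the thresholds can be tested with plain numbers, while the platform-specific probing stays in a few lines at the bottom.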

All you need to do is clone Buttercup’s code repository, ensure that you have a few common system packages installed, and run a few easy commands in your terminal:

Setup: Guides the user through installing Buttercup on the system and configuring it with AI API keys.
Deploy: Creates a fully local cluster with all of Buttercup’s components running within pods. Here’s what it looks like when Buttercup is started and ready to process a new task:
Figure 2: Buttercup ready to find and patch vulnerabilities
Send task: Sends Buttercup a sample code repository with an intentionally inserted vulnerability to demonstrate Buttercup’s capabilities. It takes Buttercup less than 10 minutes to find and patch the vulnerability.
Open UI: Starts Buttercup’s browser-based UI so you can see the PoVs and patches that Buttercup has produced. Here’s what the Buttercup web UI looks like when it finds vulnerabilities and patches them:

Figure 3: Buttercup web UI after a vulnerability has been discovered and patched

Figure 4: Detailed view of a PoV in the Buttercup web UI

These are just the basics. Check out Buttercup’s documentation for more information, including how to run Buttercup on your own software targets!

What’s next for Buttercup

Many of the improvements and capabilities we wanted to build into Buttercup during the AIxCC ended up on the cutting room floor due to competition constraints. Now that the competition is over, we’re free to work on upgrading and maintaining the standalone version of Buttercup to make it as capable as possible. If you’re interested in contributing to Buttercup’s success, we welcome you to join us!

Stay tuned for more updates on Buttercup’s life after the AIxCC!

If you are interested in the versions of Buttercup that we submitted to the AIxCC semifinal (ASC) and final (AFC) competitions, you can find them at the links below. Please note that these versions were designed to interact with DARPA’s competition infrastructure, which has since been shut down. We are not actively maintaining these versions of Buttercup.

For background on the challenge, see our previous posts on the AIxCC: