Our team won the runner-up prize of $3M at DARPA’s AI Cyber Challenge, demonstrating Buttercup’s world-class automated vulnerability discovery and patching capabilities with remarkable cost efficiency.
Now that DARPA’s AI Cyber Challenge (AIxCC) has officially ended, we can finally make Buttercup, our CRS (Cyber Reasoning System), open source!
While the AIxCC winner has not yet been announced, differences in the finalists’ approaches show that there are multiple viable paths forward to using AI for vulnerability detection.
Prompt injection pervades discussions about security for LLMs and AI agents. But there is little public information on how to write powerful, discreet, and reliable prompt injection exploits. In this post, we will design and implement a prompt injection exploit targeting GitHub’s Copilot Agent, with a focus on maximizing reliability and minimizing the odds of detection.
In my first month at Trail of Bits as an AI/ML security engineer, I found two remotely accessible memory corruption bugs in NVIDIA’s Triton Inference Server during a routine onboarding practice.
We’re releasing pajaMAS: a curated set of MAS hijacking demos that illustrate important principles of MAS security.
Today we’re announcing the beta release of mcp-context-protector, a security wrapper for LLM apps using the Model Context Protocol (MCP). It defends against the line jumping attacks documented earlier in this blog series, such as prompt injection via tool descriptions and ANSI terminal escape codes.
Datasig generates compact, unique fingerprints for AI/ML datasets that let you compare training data with high accuracy—without needing access to the raw data itself.
This critical capability helps AIBOM (AI bill of materials) tools detect data-borne vulnerabilities that traditional security tools completely miss.
This post describes how many examples of MCP software store long-term API keys for third-party services in plaintext on the local filesystem, often with insecure, world-readable permissions.
This post describes attacks using ANSI terminal code escape sequences to hide malicious instructions to the LLM, leveraging the line jumping vulnerability we discovered in MCP.
This post explains how malicious MCP servers can exploit the Model Context Protocol to covertly exfiltrate entire conversation histories by injecting trigger phrases into tool descriptions, allowing for targeted data theft against specific organizations.
This post is about a vulnerability in the Model Context Protocol (MCP) called “Line Jumping,” where malicious servers can inject prompts through tool descriptions to manipulate AI model behavior without being explicitly invoked, effectively bypassing security measures designed to protect users.
Trail of Bits’ Cyber Reasoning System “Buttercup” is competing in DARPA’s AI Cyber Challenge Finals, which now features increased budgets, multiple rounds, diverse challenge types, and the ability to use custom AI models.
While Trail of Bits is known for developing security tools like Slither, Medusa, and Fickling, our engineering efforts extend far beyond our own projects. Throughout 2024, our team has been deeply engaged with the broader security ecosystem, tackling challenges in open-source tools and infrastructure that security engineers rely on every day. This year, our engineers […]
AI-enabled code assistants (like GitHub’s Copilot, Continue.dev, and Tabby) are making software development faster and more productive. Unfortunately, these tools are often bad at Solidity. So we decided to improve them! To make it easier to write, edit, and understand Solidity with AI-enabled tools, we have: Added support for Solidity into Tabby […]