We bypassed human approval protections for system command execution in AI agents, achieving RCE in three agent platforms.
We’ve added a pickle file scanner to Fickling that uses an allowlist approach to protect AI/ML environments from malicious pickle files that could compromise models or infrastructure.
Our business operations intern at Trail of Bits built two AI-powered tools that became permanent company resources—a podcast workflow that saves 1,250 hours annually and a Slack exporter that enables efficient knowledge retrieval across the organization.
In this blog post, we’ll detail how attackers can exploit image scaling on Gemini CLI, Vertex AI Studio, Gemini’s web and API interfaces, Google Assistant, Genspark, and other production AI systems. We’ll also explain how to mitigate and defend against these attacks, and we’ll introduce Anamorpher, our open-source tool that lets you explore and generate these crafted images.
Our team won the runner-up prize of $3M at DARPA’s AI Cyber Challenge, demonstrating Buttercup’s world-class automated vulnerability discovery and patching capabilities with remarkable cost efficiency.
Now that DARPA’s AI Cyber Challenge (AIxCC) has officially ended, we can finally make Buttercup, our CRS (Cyber Reasoning System), open source!
While the AIxCC winner has not yet been announced, differences in the finalists’ approaches show that there are multiple viable paths forward to using AI for vulnerability detection.
Prompt injection pervades discussions about security for LLMs and AI agents. But there is little public information on how to write powerful, discreet, and reliable prompt injection exploits. In this post, we will design and implement a prompt injection exploit targeting GitHub’s Copilot Agent, with a focus on maximizing reliability and minimizing the odds of detection.
In my first month at Trail of Bits as an AI/ML security engineer, I found two remotely accessible memory corruption bugs in NVIDIA’s Triton Inference Server during a routine onboarding practice.
We’re releasing pajaMAS: a curated set of MAS hijacking demos that illustrate important principles of MAS security.
Today we’re announcing the beta release of mcp-context-protector, a security wrapper for LLM apps using the Model Context Protocol (MCP). It defends against the line jumping attacks documented earlier in this blog series, such as prompt injection via tool descriptions and ANSI terminal escape codes.
Datasig generates compact, unique fingerprints for AI/ML datasets that let you compare training data with high accuracy—without needing access to the raw data itself.
This critical capability helps AIBOM (AI bill of materials) tools detect data-borne vulnerabilities that traditional security tools completely miss.
This post describes how many examples of MCP software store long-term API keys for third-party services in plaintext on the local filesystem, often with insecure, world-readable permissions.
This post describes attacks using ANSI terminal code escape sequences to hide malicious instructions to the LLM, leveraging the line jumping vulnerability we discovered in MCP.
Malicious MCP servers can inject trigger phrases into tool descriptions to exfiltrate entire conversation histories and steal sensitive credentials and IP.
MCP’s ’line jumping’ vulnerability lets malicious servers inject prompts through tool descriptions to manipulate AI behavior before tools are ever invoked.
Trail of Bits’ Buttercup competes in DARPA’s AIxCC Finals with expanded resources, multiple rounds, new challenge types, and custom AI model capabilities.
While Trail of Bits is known for developing security tools like Slither, Medusa, and Fickling, our engineering efforts extend far beyond our own projects. Throughout 2024, our team has been deeply engaged with the broader security ecosystem, tackling challenges in open-source tools and infrastructure that security engineers rely on every day. This year, our engineers […]
AI-enabled code assistants (like GitHub’s Copilot, Continue.dev, and Tabby) are making software development faster and more productive. Unfortunately, these tools are often bad at Solidity. So we decided to improve them! To make it easier to write, edit, and understand Solidity with AI-enabled tools, we have: Added support for Solidity into Tabby […]
This is a joint post with the Hugging Face Gradio team; read their announcement here! You can find the full report with all of the detailed findings from our security audit of Gradio 5 here. Hugging Face hired Trail of Bits to audit Gradio 5, a popular open-source library that provides a web interface that […]
At DEF CON, Michael Brown, Principal Security Engineer at Trail of Bits, sat down with Michael Novinson from Information Security Media Group (ISMG) to discuss four critical areas where AI/ML is revolutionizing security. Here’s what they covered: AI/ML techniques surpass the limits of traditional software analysis As Moore’s law slows down after 20 years of […]
Today we’re going to provision some cloud infrastructure the Max Power way: by combining automation with unchecked AI output. Unfortunately, this method produces cloud infrastructure code that 1) works and 2) has terrible security properties. In a nutshell, AI-based tools like Claude and ChatGPT readily provide extremely bad cloud infrastructure provisioning code, […]
With DARPA’s AI Cyber Challenge (AIxCC) semifinal starting today at DEF CON 2024, we want to introduce Buttercup, our AIxCC submission. Buttercup is a Cyber Reasoning System (CRS) that combines conventional cybersecurity techniques like fuzzing and static analysis with AI and machine learning to find and fix software vulnerabilities. The system is designed to operate […]
Today, we present the second of our open-source AI security audits: a look at security issues we found in an open-source retrieval augmented generation (RAG) application that could lead to chatbot output poisoning, inaccurate document ingestion, and potential denial of service. This audit follows up on our previous work that identified 11 security vulnerabilities in […]
Earlier this week, at Apple’s WWDC, we finally witnessed Apple’s AI strategy. The videos and live demos were accompanied by two long-form releases: Apple’s Private Cloud Compute and Apple’s On-Device and Server Foundation Models. This blog post is about the latter. So, what is Apple releasing, and how does it compare to […]