← All Authors

Evan Sultanik

11 posts

Detecting code copying at scale with Vendetect

Vendetect is our new open-source tool for detecting copied and vendored code between repositories. It uses semantic fingerprinting to identify similar code even when variable names change or comments disappear. More importantly, unlike academic plagiarism detectors, it understands version control history, helping you trace vendored code back to its exact source commit.

Preventing account takeover on centralized cryptocurrency exchanges in 2025

This blog post highlights key points from our new white paper Preventing Account Takeovers on Centralized Cryptocurrency Exchanges, which documents ATO-related attack vectors and defenses tailored to CEXes. Imagine trying to log in to your centralized cryptocurrency exchange (CEX) account and your password and username just… don’t work. You […]

libmagic: The Blathering

A couple of years ago we released PolyFile: a utility to identify and map the semantic structure of files, including polyglots, chimeras, and schizophrenic files. It’s a bit like file, binwalk, and Kaitai Struct all rolled into one. PolyFile initially used the TRiD definition database for file identification. However, […]

PDF is Broken: a justCTF Challenge

Trail of Bits sponsored the recent justCTF competition, and our engineers helped craft several of the challenges, including D0cker, Go-fs, Pinata, Oracles, and 25519. In this post we’re going to cover another of our challenges, titled PDF is broken, and so is this file. It demonstrates some of the PDF file format’s idiosyncrasies in a […]

Graphtage: A New Semantic Diffing Tool

Graphtage is a command line utility and underlying library for semantically comparing and merging tree-like structures such as JSON, JSON5, XML, HTML, YAML, and TOML files. Its name is a portmanteau of “graph” and “graftage” (i.e., the horticultural practice of joining two trees together so they grow as one). Read on for what Graphtage does differently and better, why we developed it, how it works, and directions for using it as a library.

Two New Tools that Tame the Treachery of Files

Parsing is hard, even when a file format is well specified. But when the specification is ambiguous, it leads to unintended and strange parser and interpreter behaviors that make file formats susceptible to security vulnerabilities. What if we could automatically generate a “safe” subset of any file format, along with an associated, verified parser? That’s […]

Empire Hacking: Ethereum Edition 2

On December 12, over 150 attendees joined a special, half-day Empire Hacking to learn about pitfalls in smart contract security and how to avoid them. Thank you to everyone who came, to our superb speakers, and to BuzzFeed for hosting this meetup at their office. Watch the presentations again It’s hard to find such rich […]