← All Authors
Evan Sultanik
We optimized the route for visiting every NYC subway station using algorithms from combinatorial optimization, creating a 20-hour tour that beats the existing world record by 45 minutes.
Vendetect is our new open-source tool for detecting copied and vendored code between repositories. It uses semantic fingerprinting to identify similar code even when variable names change or comments disappear. More importantly, unlike academic plagiarism detectors, it understands version control history, helping you trace vendored code back to its exact source commit.
Deptective, our new open-source tool, automatically finds the packages needed to install software dependencies. It does so not based on the software’s self-reported requirements, but by observing what the software needs at runtime.
This blog post highlights key points from our new white paper Preventing Account Takeovers on Centralized Cryptocurrency Exchanges, which documents ATO-related attack vectors and defenses tailored to CEXes. Imagine trying to log in to your centralized cryptocurrency exchange (CEX) account and your password and username just… don’t work. You […]
A couple of years ago we released PolyFile: a utility to identify and map the semantic structure of files, including polyglots, chimeras, and schizophrenic files. It’s a bit like file, binwalk, and Kaitai Struct all rolled into one. PolyFile initially used the TRiD definition database for file identification. However, […]
You just cloned a fresh source code repository and want to get a quick sense of its dependencies. Our tool, it-depends, can get you there. We are proud to announce the release of it-depends, an open-source tool for automatic enumeration of dependencies. You simply point it to a source code repository, and it will build […]
Many machine learning (ML) models are Python pickle files under the hood, and it makes sense. The use of pickling conserves memory, enables start-and-stop model training, and makes trained models portable (and, thereby, shareable). Pickling is easy to implement, is built into Python without requiring additional dependencies, and supports serialization of custom […]
Trail of Bits sponsored the recent justCTF competition, and our engineers helped craft several of the challenges, including D0cker, Go-fs, Pinata, Oracles, and 25519. In this post we’re going to cover another of our challenges, titled PDF is broken, and so is this file. It demonstrates some of the PDF file format’s idiosyncrasies in a […]
Graphtage is a command line utility and underlying library for semantically comparing and merging tree-like structures such as JSON, JSON5, XML, HTML, YAML, and TOML files. Its name is a portmanteau of “graph” and “graftage” (i.e., the horticultural practice of joining two trees together so they grow as one). Read on for what Graphtage does differently and better, why we developed it, how it works, and directions for using it as a library.
Parsing is hard, even when a file format is well specified. But when the specification is ambiguous, it leads to unintended and strange parser and interpreter behaviors that make file formats susceptible to security vulnerabilities. What if we could automatically generate a “safe” subset of any file format, along with an associated, verified parser? That’s […]
On December 12, over 150 attendees joined a special, half-day Empire Hacking to learn about pitfalls in smart contract security and how to avoid them. Thank you to everyone who came, to our superb speakers, and to BuzzFeed for hosting this meetup at their office. Watch the presentations again It’s hard to find such rich […]