Detecting Bad OpenSSL Usage

by William Wang, UCLA

OpenSSL is one of the most popular cryptographic libraries out there; even if you aren’t using C/C++, chances are your programming language’s biggest libraries use OpenSSL bindings as well. It’s also notoriously easy to mess up due to the design of its low-level API. Yet many of these mistakes fall into easily identifiable patterns, which raises the possibility of automated detection.

As part of my internship this past winter and spring, I’ve been prototyping a tool called Anselm, which lets developers describe and search for patterns of bad behavior. Anselm is an LLVM pass, meaning that it operates on an intermediate representation of code that sits between source code and machine code. Anselm’s primary advantage over source-level static analysis is that it can operate on any programming language that compiles to LLVM bitcode, or on any closed-source binary that can be lifted back into bitcode. Anselm can target any arbitrary sequence of function calls, but its original purpose was to inspect OpenSSL usage, so let’s start there.

OpenSSL

The design of OpenSSL makes it difficult for beginners to understand and work with. Its naming conventions are inconsistent across the library, and it offers several (arguably too many) options and modes for each primitive. For example, due to the evolution of the library, there exist both high-level (EVP) and low-level methods that can be used to accomplish the same task (e.g. DSA signatures or EC signing operations). To make matters worse, its documentation can be inconsistent and difficult to read.

Beyond being difficult to use, the library’s design choices make it dangerous to use. The API is inconsistent about returning error codes versus pointers (with and without ownership), and it exhibits other surprising behavior. Without rigorously checking error codes and defending against null pointers, unexpected program behavior and process termination can occur.

So what types of errors can Anselm detect? It depends on what the developer specifies, but that could be anything from mismanaging the OpenSSL error queue to reusing initialization vectors. It’s also important to remember that these are heuristics, and misidentifying both good and bad behavior is always possible. Now, let’s get into how the tool works.

Function Calls

While the primary motivation of this project was to target OpenSSL, the library itself doesn’t actually matter. One can view OpenSSL usage as a sequence of API calls, such as EVP_EncryptUpdate and EVP_EncryptFinal_ex. But we could easily replace those names with anything else, and the idea remains the same. Hence bad behavior is a pattern of any function calls (not just OpenSSL’s) which we would like to detect.

My main approach was to search through all possible paths of execution in a function, looking for bad sequences of API calls. Throughout this post, I’ll be using OpenSSL’s symmetric encryption functions in my examples. Let’s consider EVP_EncryptUpdate, which encrypts a block of data, and EVP_EncryptFinal_ex, which pads the plaintext before a final encryption. Naturally, they should not be called out of order:

EVP_EncryptFinal_ex(ctx, ciphertext + len, &len);
...
EVP_EncryptUpdate(ctx, ciphertext, &len, plaintext, plaintext_len);

This should also be flagged, since the bad sequence remains a possibility:

EVP_EncryptFinal_ex(ctx, ciphertext + len, &len);
...
if (condition) {
  EVP_EncryptUpdate(ctx, ciphertext, &len, plaintext, plaintext_len);
}

I worked with LLVM BasicBlocks, which represent a list of instructions that always execute together. BasicBlocks can have multiple successors, each reflecting different paths of execution. A function, then, is a directed graph of many BasicBlocks. There is a single root node, and any leaf node represents an end of execution.

Finding all possible executions amounts to performing a depth-first search (DFS) starting from the root node. However, notice that the graph can contain cycles; this is analogous to a loop in code. If we performed a blind DFS, we could get stuck in an infinite loop. On the other hand, ignoring previously visited nodes can lead to missed behavior. I resolved this by limiting the length of any path; once the limit is reached, Anselm stops exploring further (the limit is customizable).

One issue remains, which is that performing DFS over an entire codebase can be very time-consuming. Even if our exact pattern is simple, it still needs to be matched against all possible paths generated by the search. To solve this, I first prune the graph of any BasicBlock that does not contain any relevant API calls. This is done by pointing any irrelevant node’s predecessors to each of its successors, removing the middleman.

In practice, this dramatically reduces the complexity of a graph for faster path-finding: entire if statements and while loops can be eliminated without any consequences! It also makes any path limit much more reasonable.
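To make the traversal concrete, here is a minimal sketch (not Anselm’s actual code) of a depth-bounded DFS over LLVM basic blocks that records the name of each direct call along every path; kMaxPathLength and collectPaths are hypothetical names:

#include <string>
#include <vector>

#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/CFG.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"

using namespace llvm;

// Hypothetical depth bound; Anselm exposes this as a configurable limit.
static const unsigned kMaxPathLength = 64;

// Walk the CFG from BB, recording the name of every directly called function.
// Path is passed by value so each branch of the search keeps its own copy.
static void collectPaths(BasicBlock *BB, unsigned Depth,
                         std::vector<std::string> Path,
                         std::vector<std::vector<std::string>> &Paths) {
  for (Instruction &I : *BB)
    if (auto *CI = dyn_cast<CallInst>(&I))
      if (Function *Callee = CI->getCalledFunction())
        Path.push_back(Callee->getName().str());

  // Stop at a leaf block, or once the depth bound is hit (cycles would
  // otherwise make a blind DFS run forever).
  if (Depth >= kMaxPathLength || succ_empty(BB)) {
    Paths.push_back(std::move(Path));
    return;
  }

  for (BasicBlock *Succ : successors(BB))
    collectPaths(Succ, Depth + 1, Path, Paths);
}

Running something like this over the pruned graph keeps both the number of paths and the depth bound manageable.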

Matching Values

Although solely examining function calls is a good start, we can do better. Consider OpenSSL contexts, which are created by EVP_CIPHER_CTX_new and must be initialized with an algorithm, key, and IV before actual use. In the following situation, we want every context to be initialized by EVP_EncryptInit_ex:

EVP_CIPHER_CTX *ctx1 = EVP_CIPHER_CTX_new();
EVP_CIPHER_CTX *ctx2 = EVP_CIPHER_CTX_new();
EVP_EncryptInit_ex(ctx1, EVP_aes_256_cbc(), NULL, key, iv);

An EVP_EncryptInit_ex call does follow EVP_CIPHER_CTX_new here, yet ctx2 is obviously not initialized properly. A more precise pattern would be, “Every context returned from EVP_CIPHER_CTX_new should later be initialized by EVP_EncryptInit_ex.”

I addressed this by matching arguments and return values — checking whether they pointed to the same LLVM Value object in memory. Contexts are a prime situation to match values, but we can use the same technique to detect repeated IVs:

EVP_EncryptInit_ex(ctx1, EVP_aes_256_cbc(), NULL, key1, iv);
EVP_EncryptInit_ex(ctx2, EVP_aes_256_cbc(), NULL, key2, iv);

Internally, Anselm uses regex capture groups to perform this analysis; every execution path is a string of function calls and Value pointers, while a bad behavior is defined by some regex pattern.
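As a rough illustration (this is not Anselm’s actual encoding), imagine each path serialized as a string of calls whose arguments are the addresses of their LLVM Values; a repeated-IV rule then becomes a capture group plus a backreference:

#include <iostream>
#include <regex>
#include <string>

int main() {
  // Two EVP_EncryptInit_ex calls that receive the same Value (0x99) as the IV.
  std::string path =
      "EVP_EncryptInit_ex(0x10,0x20,0x0,0x30,0x99);"
      "EVP_EncryptInit_ex(0x11,0x21,0x0,0x31,0x99);";

  // Capture the first call's IV argument, then require the second call's IV
  // to be the same pointer via the backreference \1.
  std::regex repeated_iv(
      R"(EVP_EncryptInit_ex\([^,]*,[^,]*,[^,]*,[^,]*,([^)]*)\);)"
      R"(.*EVP_EncryptInit_ex\([^,]*,[^,]*,[^,]*,[^,]*,\1\);)");

  std::cout << std::boolalpha << std::regex_search(path, repeated_iv) << "\n";
}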

Pattern Format

Towards the end of my internship, I also defined a format for developers to specify bad behaviors, which Anselm translates into a (somewhat messy) regex pattern. Every line begins with a function call, followed by its return value and arguments. If you don’t care about a value, use an underscore. Otherwise, define a token which you can use elsewhere. Hence a rule banning repeat IVs would look like this:

EVP_EncryptInit_ex _ _ _ _ _ iv
EVP_EncryptInit_ex _ _ _ _ _ iv

Since the iv token is reused, Anselm constrains its search to only match functions which contain the same Value pointer at that argument position.

I also defined a syntax to perform negative lookaheads, which tells Anselm to look for the absence of specific function calls. For example, if I wanted to prevent any context from being used before initialized, I would prepend an exclamation mark like so:

EVP_CIPHER_CTX_new ctx
! EVP_EncryptInit_ex _ ctx _ _ _ _
EVP_EncryptUpdate _ ctx _ _ _ _

In English, this pattern identifies any pair of EVP_CIPHER_CTX_new and EVP_EncryptUpdate calls on the same context that does not have an EVP_EncryptInit_ex call sandwiched in between.

Final Notes

With its current set of tools, Anselm is capable of interpreting a wide range of function call patterns and searching for them in LLVM bitcode. Of course, it’s still a prototype and there are improvements to be made, but the main ideas are there and I’m proud of how the project turned out. Thanks to Trail of Bits for supporting these types of internships — it was a lot of fun!

Verifying Windows binaries, without Windows

TL;DR: We’ve open-sourced a new library, μthenticode, for verifying Authenticode signatures on Windows PE binaries without a Windows machine. We’ve also integrated it into recent builds of Winchecksec, so that you can use it today to verify signatures on your Windows executables!

svcli

As a library, μthenticode aims to be a breeze to integrate: It’s written in cross-platform, modern C++ and avoids the complexity of the CryptoAPI interfaces it replaces (namely WinVerifyTrust and CertVerifyCertificateChainPolicy). You can use it now as a replacement for many of SignTool’s features, and more are on the way.

A quick Authenticode primer

Authenticode is Microsoft’s code signing technology, comparable in spirit (but not implementation) to Apple’s Gatekeeper.

At its core, Authenticode supplies (or can supply, as optional features) a number of properties for signed programs:

  • Authenticity: A program with a valid Authenticode signature contains a chain of certificates sufficient for validating that signature. Said chain is ultimately rooted in a certificate stored in the user’s Trusted Publishers store, preventing self-signed certificates without explicit opt-in by the user.
  • Integrity: Each Authenticode signature includes a cryptographic hash of the signed binary. This hash is compared against the binary’s in-memory representation at load time, preventing malicious modifications.
    • Authenticode can also embed cryptographic hashes for each page of memory. These are used with forced integrity signing, which is necessary for Windows kernel drivers and requires a special Microsoft cross-signed “Software Publisher Certificate” instead of a self-signed or independently trusted Certificate Authority (CA).
  • Timeliness: Authenticode supports embedding countersignatures from a Timestamping Authority (TSA), allowing the signature on a binary to potentially outlive the expiration dates of its signing certificates. Such countersignatures also prevent backdating of a valid signature, making it more difficult for an attacker to re-use an expired signing certificate.

Like all code signing technologies, there are things Authenticode can’t do or guarantee about a program:

  • That it has no bugs: Anybody can write buggy software and sign it, either with a self-signed certificate or by purchasing one from a CA that’s been cross-signed by Microsoft.
  • That it runs no code other than itself: The Windows execution contract is notoriously lax (e.g., the DLL loading rules for desktop applications), and many applications support some form of code execution as a feature (scripts, plugins, sick WinAMP skins, etc). Authenticode has no ability to validate the integrity or intent of code executed outside of the initial signed binary.

Similarly, there are some things that Authenticode, like all PKI implementations, is susceptible to:

  • Misplaced trust: CAs want to sell as many certificates as possible, and thus have limited incentives to check the legitimacy of entities that purchase from them. Anybody can create a US LLC for a few hundred bucks.
  • Stolen certificates: Code-signing and HTTPS certificates are prime targets for theft; many real-world campaigns leverage stolen certificates to fool users into trusting malicious code. Companies regularly check their secret material into source control systems, and code signing certificates are no exception.
  • Fraudulent certificates: Flame infamously leveraged a novel chosen-prefix attack on MD5 to impersonate a Microsoft certificate that had been accidentally trusted for code signing. Similar attacks on SHA-1 are now tractable at prices reasonable for nation states and organized crime.

All told, Authenticode (and all other forms of code signing) add useful authenticity and integrity checks to binaries, provided that you trust the signer and their ability to store their key material.

With that said, let’s take a look at what makes Authenticode tick.

Parsing Authenticode signatures: spicy PKCS#7

In a somewhat unusual move for 2000s-era Microsoft, most of the Authenticode format is actually documented and available for download. A few parts are conspicuously under-defined or marked as “out of scope”; we’ll cover some of them below.

At its core, Authenticode has two components:

  • The certificate table, which contains one or more entries, each of which may be a SignedData.
  • SignedData objects, which are mostly normal PKCS#7 containers (marked with a content type of SignedData per RFC 2315).

The certificate table

The certificate table is the mechanism by which Authenticode signatures are embedded into PE files.

It has a few interesting properties:

  • Accessing the certificate table involves reading the certificate table directory in the data directory table. Unlike every other entry in the data directory table, the certificate directory’s RVA field is not a virtual address—it’s a direct file offset. This is a reflection of the behavior of the Windows loader, which doesn’t actually load the certificates into the address space of the program.
  • Despite this, real-world tooling appears to be inflexible about placement and subsequent parsing of the certificate table. Microsoft’s tooling consistently places the certificate table at the end of the PE (after all sections); many third-party tools naively seek to the certificate table offset and parse until EOF, allowing an attacker to trivially append additional certificates[1].

Once located, actually parsing the certificate table is straightforward: It’s an 8 byte-aligned blob of WIN_CERTIFICATE structures:


win_certificate.png
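Since the figure doesn’t reproduce here, the layout (per MSDN’s WIN_CERTIFICATE documentation) is roughly as follows; the fixed-width types stand in for the Windows DWORD/WORD/BYTE typedefs:

#include <cstdint>

// WIN_CERTIFICATE, as documented on MSDN (Windows typedefs replaced with
// fixed-width types for a cross-platform sketch).
struct WIN_CERTIFICATE {
  std::uint32_t dwLength;         // length of this entry, header included
  std::uint16_t wRevision;        // WIN_CERT_REVISION_1_0 or WIN_CERT_REVISION_2_0
  std::uint16_t wCertificateType; // e.g. WIN_CERT_TYPE_PKCS_SIGNED_DATA
  std::uint8_t  bCertificate[1];  // variable-length certificate data
};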

…with some fields of interest:

  • wRevision: the “revision” of the WIN_CERTIFICATE.
    • MSDN only recently fixed the documentation for this field: WIN_CERT_REVISION_2_0=0x0200 is the current version for Authenticode signatures; WIN_CERT_REVISION_1_0=0x0100 is for “legacy” signatures. I haven’t been able to find the latter in the wild.
  • wCertificateType: the kind of encapsulated certificate data.
    • MSDN documents four possible values for wCertificateType, but we’re only interested in one: WIN_CERT_TYPE_PKCS_SIGNED_DATA.
  • bCertificate: the actual certificate data. For WIN_CERT_TYPE_PKCS_SIGNED_DATA, this is the (mostly) PKCS#7 SignedData mentioned above.

As you might have surmised, the structure of the certificate table allows for multiple independent Authenticode signatures. This is useful for deploying a program across multiple versions of Windows, particularly versions that might have legacy certificates in the Trusted Publishers store or that don’t trust a particular CA for whatever reason.

Authenticode’s SignedData

Microsoft helpfully[2] supplies this visualization of their SignedData structure:

authenticode-signeddata

This is almost a normal PKCS#7 SignedData, with a few key deviations:

  • Instead of one of the RFC 2315 content types, the Authenticode SignedData’s contentInfo has a type of SPC_INDIRECT_DATA_OBJID, which Microsoft defines as 1.3.6.1.4.1.311.2.1.4[3].
  • The structure corresponding to this object identifier (OID) is documented as SpcIndirectDataContent. Microsoft conveniently provides its ASN.1 definition:
    spc_indirect
    (Observe that the custom AlgorithmIdentifier is actually just X.509’s AlgorithmIdentifier—see RFC 3279 and its updates.)

    ⚠ The code below does no error handling or memory management; read the μthenticode source for the full version. ⚠

    Given the ASN.1 definitions above, we can use OpenSSL’s (hellish and completely undocumented) ASN.1 macros to parse Microsoft’s custom structures:


    spc_indirect_macros
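For readers without the image, a hedged sketch of what those declarations look like with OpenSSL’s asn1t.h template macros follows; the field and type names track the Authenticode document, but μthenticode’s real declarations may differ:

#include <openssl/asn1t.h>
#include <openssl/x509.h>

typedef struct {
  ASN1_OBJECT *type;
  ASN1_TYPE *value;
} SpcAttributeTypeAndOptionalValue;

ASN1_SEQUENCE(SpcAttributeTypeAndOptionalValue) = {
    ASN1_SIMPLE(SpcAttributeTypeAndOptionalValue, type, ASN1_OBJECT),
    ASN1_OPT(SpcAttributeTypeAndOptionalValue, value, ASN1_ANY)
} ASN1_SEQUENCE_END(SpcAttributeTypeAndOptionalValue)

IMPLEMENT_ASN1_FUNCTIONS(SpcAttributeTypeAndOptionalValue)

typedef struct {
  X509_ALGOR *digestAlgorithm;   // just X.509's AlgorithmIdentifier
  ASN1_OCTET_STRING *digest;
} DigestInfo;

ASN1_SEQUENCE(DigestInfo) = {
    ASN1_SIMPLE(DigestInfo, digestAlgorithm, X509_ALGOR),
    ASN1_SIMPLE(DigestInfo, digest, ASN1_OCTET_STRING)
} ASN1_SEQUENCE_END(DigestInfo)

IMPLEMENT_ASN1_FUNCTIONS(DigestInfo)

typedef struct {
  SpcAttributeTypeAndOptionalValue *data;
  DigestInfo *messageDigest;
} SpcIndirectDataContent;

ASN1_SEQUENCE(SpcIndirectDataContent) = {
    ASN1_SIMPLE(SpcIndirectDataContent, data, SpcAttributeTypeAndOptionalValue),
    ASN1_SIMPLE(SpcIndirectDataContent, messageDigest, DigestInfo)
} ASN1_SEQUENCE_END(SpcIndirectDataContent)

IMPLEMENT_ASN1_FUNCTIONS(SpcIndirectDataContent)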

Actually checking the signature

With our structures in place, we can use OpenSSL’s (mostly) undocumented PKCS#7 API to parse our SignedData and indirect data contents:

verify1

…and then validate them:

verify2

Voilà: the basics of Authenticode. Observe that we pass PKCS7_NOVERIFY, as we don’t necessarily have access to the entire certificate chain—only Windows users with the relevant cert in their Trusted Publishers store will have that.
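As a simplified, hedged sketch of that check (not μthenticode’s exact code), assume the SpcIndirectDataContent covered by the signature has already been re-serialized into indirect_data:

#include <cstdint>
#include <vector>

#include <openssl/bio.h>
#include <openssl/pkcs7.h>
#include <openssl/x509.h>

bool check_signature(PKCS7 *p7, const std::vector<std::uint8_t> &indirect_data) {
  if (p7 == nullptr || !PKCS7_type_is_signed(p7))
    return false;

  // Wrap the signed contents in a memory BIO so PKCS7_verify can hash them.
  BIO *contents = BIO_new_mem_buf(indirect_data.data(),
                                  static_cast<int>(indirect_data.size()));

  // An empty store: with PKCS7_NOVERIFY we check the signature against the
  // certificates embedded in the SignedData, but skip chain verification.
  X509_STORE *store = X509_STORE_new();
  int ok = PKCS7_verify(p7, /*certs=*/nullptr, store, contents,
                        /*out=*/nullptr, PKCS7_NOVERIFY);

  X509_STORE_free(store);
  BIO_free(contents);
  return ok == 1;
}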

Calculating and checking the Authenticode hash

Now that we have authenticity (modulo the root certificate), let’s do integrity.

First, let’s grab the hash embedded in the Authenticode signature, for eventual comparison:

authenticode_digest

Next, we need to compute the binary’s actual hash. This is a little involved, thanks to a few different fields:

  • Every PE has a 32-bit CheckSum field that’s used for basic integrity purposes (i.e., accidental corruption). This field needs to be skipped when calculating the hash, as it’s calculated over the entire file and would change with the addition of certificates.
  • The certificate data directory entry itself needs to be skipped, since relocating and/or modifying the size of the certificate table should not require any changes to pre-existing signatures.
  • The certificate table (and constituent signatures) itself, naturally, cannot be part of the input to the hash.
  • To ensure a consistent hash, Authenticode stipulates that sections are hashed in ascending order by the value of each section header’s PointerToRawData, not the order of the section headers themselves. This is not particularly troublesome, but requires some additional bookkeeping.

μthenticode’s implementation of the Authenticode hashing process is a little too long to duplicate below, but in pseudocode:

  1. Start with an empty buffer.
  2. Insert all PE headers (DOS, COFF, Optional, sections) into the buffer.
  3. Erase the certificate table directory entry and CheckSum field from the buffer, in that order (to avoid rescaling the former’s offset).
  4. Use pe-parse’s IterSec API to construct a list of section buffers. IterSec yields sections in file offset order as of #129.
  5. Skip past the certificate table and add trailing data to the buffer, if any exists.
  6. Create and initialize a new OpenSSL message digest context using the NID retrieved from the signature.
  7. Toss the buffer into EVP_DigestUpdate and finish with EVP_DigestFinal (see the sketch after this list).
  8. Compare the result with the Authenticode-supplied hash.
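A hedged sketch of those final hashing steps, assuming the assembled buffer and the embedded hash are already in hand (check_authenticode_hash is a hypothetical helper, not μthenticode’s API):

#include <cstdint>
#include <vector>

#include <openssl/evp.h>

bool check_authenticode_hash(const std::vector<std::uint8_t> &buf,
                             int digest_nid,
                             const std::vector<std::uint8_t> &embedded_hash) {
  // The NID comes from the signature's digestAlgorithm (e.g., NID_sha256).
  const EVP_MD *md = EVP_get_digestbynid(digest_nid);
  if (md == nullptr)
    return false;

  std::vector<std::uint8_t> computed(EVP_MAX_MD_SIZE);
  unsigned int len = 0;

  EVP_MD_CTX *ctx = EVP_MD_CTX_new();
  EVP_DigestInit_ex(ctx, md, nullptr);
  EVP_DigestUpdate(ctx, buf.data(), buf.size());
  EVP_DigestFinal_ex(ctx, computed.data(), &len);
  EVP_MD_CTX_free(ctx);

  computed.resize(len);
  return computed == embedded_hash;
}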

Other bits and pieces

We haven’t discussed the two remaining major Authenticode features: page hashes and timestamp countersignatures.

Page hashes

As mentioned above, page hashes are conspicuously not documented in the Authenticode specification, and are described as stored in a “[…] binary structure [that] is outside the scope of this paper.”

Online information on said structure is limited to a few resources:

  • The VirtualBox source code references OIDs for two different versions of the page hashes structure:
    • SPC_PE_IMAGE_PAGE_HASHES_V1_OBJID: 1.3.6.1.4.1.311.2.3.1
    • SPC_PE_IMAGE_PAGE_HASHES_V2_OBJID: 1.3.6.1.4.1.311.2.3.2

    These OIDs are not listed in Microsoft’s OID reference or in the OID repository[4], although they do appear in Wintrust.h.

  • At least one fork of osslsigncode has support for generating and validating page hashes, and grants us further insight:
    • The V1 OID represents SHA-1 page hashes; V2 represents SHA2-256.
    • The serializedData of each SpcSerializedObject is an ASN.1 SET, each member of which is an ASN.1 SEQUENCE, to the effect of:
      page_hash_asn1
      (The definitions above are my reconstruction from the body of get_page_hash_link; osslsigncode confusingly reuses the SpcAttributeTypeAndOptionalValue type for Impl_SpcPageHash and constructs the rest of the contents of SpcSerializedObject manually.)

As far as I can tell, osslsigncode only inserts one Impl_SpcPageHash for the entire PE, which it calculates in pe_calc_page_hash. The code in that function is pretty dense, but it seems to generate a table of structures as follows:

page_hash

…where IMPL_PAGE_HASH_SIZE is determined by the hash algorithm used (i.e., by Impl_SpcPageHash.type), and the very first entry in the table is a null-padded “page hash” for just the PE headers with page_offset=0. This table is not given an ASN.1 definition—it’s inserted directly into Impl_SpcPageHash.pageHashes.
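For reference, my reading of a single table entry is sketched below; this is an unofficial reconstruction from osslsigncode, not a documented structure:

#include <cstdint>

// One entry of the (undocumented) page hash table. IMPL_PAGE_HASH_SIZE is
// 20 bytes for V1 (SHA-1) or 32 bytes for V2 (SHA2-256); 32 is shown here.
struct ImplPageHashEntry {
  std::uint32_t page_offset;  // file offset of the hashed region
  std::uint8_t hash[32];      // IMPL_PAGE_HASH_SIZE bytes of digest
};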

Timestamp countersignatures

Unlike page hashes, Authenticode’s timestamp countersignature format is relatively well documented, both in official and third-party sources.

Just as the Authenticode SignedData is mostly a normal PKCS#7 SignedData, Authenticode’s timestamp format is mostly a normal PKCS#9 countersignature. Some noteworthy bits include:

  • When issuing a timestamp request (TSR) to a timestamp authority (TSA), the request takes the form of an HTTP 1.1 POST containing a DER-encoded, then base64-encoded ASN.1 message:
    timestamp_asn1
    …where countersignatureType is the custom Microsoft OID 1.3.6.1.4.1.311.3.2.1 (i.e., SPC_TIME_STAMP_REQUEST_OBJID) and content is the original Authenticode PKCS#7 ContentInfo.
  • The TSA response is a PKCS#7 SignedData, from which the SignerInfo is extracted and embedded into the main Authenticode SignedData. The certificates from the TSA response are similarly embedded into the certificate list as unauthenticated attributes.

Wrapup

We’ve covered all four major components of Authenticode above: verifying the signature, checking the integrity of the file against the verified hash, calculating page hashes, and verifying any timestamp countersignatures.

μthenticode itself is still a work in progress, and currently only has support for signatures and the main Authenticode hash. You can help us out by contributing support for page hash parsing and verification, as well as timestamp signature validation!

μthenticode’s APIs are fully documented and hosted, and most can be used immediately with a peparse::parsed_pe *:

uthenticode_example

Check out the svcli command-line tool for an applied example, including retrieving the embedded Authenticode hash.

Prior work and references

μthenticode was written completely from scratch and uses the official Authenticode document supplied by Microsoft as its primary reference. When that was found lacking, the following resources came in handy:

The following resources were not referenced, but were discovered while researching this post:

  • jsign: A Java implementation of Authenticode

Want the scoop on our open-source projects and other security innovations? Contact us or sign up for our newsletter!


  1. Adding additional certificates without this parsing error is still relatively trivial, but requires the attacker to modify more parts of the PE rather than just append to it.
  2. For some definitions of helpful.
  3. The OID tree for Authenticode shows lots of other interesting OIDs, most of which aren’t publicly documented.
  4. I’ve submitted them, pending approval by the repository’s administrator.

Emerging Talent: Winternship 2020 Highlights

The Trail of Bits Winternship is our winter internship program where we invite 10-15 students to join us over the winter break for a short project that has a meaningful impact on information security. They work remotely with a mentor to create or improve tools that solve a single impactful problem. These paid internships give student InfoSec engineers real industry experience, plus a publication for their resumé—and sometimes even lead to a job with us (congrats, Samuel Caccavale!).

Check out the project highlights from some of our 2020 Winternship interns below.

Aaron Yoo—Anvill Decompiler

UCLA

This winter break I added a tool to the Anvill decompiler that produces “JSON specifications” for LLVM bitcode functions. These bitcode-derived specifications tell Anvill about the physical location (register or memory) of important values such as function arguments and return values. Anvill depends on this location information to intelligently lift machine code into high-quality, easy-to-analyze LLVM bitcode.

A typical specification looks something like this:

{
    "arch": "amd64",
    "functions": [
        {
            "demangled_name": "test(long, long)",
            "name": "_Z4testll",
            "parameters": [
                {
                    "name": "param1",
                    "register": "RDI",
                    "type": "l"
                },
...

Overall, I had fun learning so much about ABIs and using the LLVM compiler infrastructure. I got to see my tool operate in some sample full “machine code to C” decompilations, and overcame tricky obstacles, such as how a single high-level parameter or return value can be split across many machine-level registers. My final assessment: Decompilers are pretty cool!

Paweł Płatek—DeepState and Python

AGH University of Science and Technology

The main goal of my winternship project was to find and fix bugs in the Python part of the DeepState source code. Most of the bugs I found were making it nearly impossible to build DeepState for fuzzing or to use the fuzzer’s executors. Now, the build process and executors work correctly, and have been tested and better documented. I’ve also identified and described places where more work is needed (in GitHub issues). Here are the details of my DeepState project:

Cmake—Corrected build options; added the option to build only examples; added checks for compilers and support for Honggfuzz and Angora; improved cmake for examples (so the examples are automatically found).

Docker—Split docker build into two parts—deepstate-base and deepstate—to make it easier to update the environment; added multi-stage docker builds (so fuzzers can be updated separately, without rebuilding everything from scratch); added --jobs support and Angora.

CI—Rewrote a job that builds and pushes docker images to use GitHub action that supports caching; added fuzzer tests.

Fuzzer executors (frontends)—Unified argument and input/output directory handling; reviewed each fuzzer’s documentation so that executors set appropriate runtime flags; reimplemented pre- and post-execution methods; added methods for discovering fuzzer executables (based on FUZZERNAME_HOME environment variables and a CLI flag); reimplemented logging; fixed compilation functions; rewrote the methods related to fuzzers’ runtime statistics; reimplemented the run method so that the management routine that retrieves statistics and synchronizes seeds is called periodically and exceptions are handled properly, and added a cleanup function and the ability to restart the process; and examined each fuzzer’s directory structure and seed-synchronization capabilities and, based on that, implemented fuzzer resuming and fixed the ensembling methods.

Fuzzer testing—Created a basic test that checks whether executors can compile correctly and whether fuzzers can find a simple bug; created test for seed synchronization.

Documentation—Split documentation into several files, and added chapters on fuzzer executor usage and harness writing.

Philip Zhengyuan Wang—Manticore

University of Maryland

During my winternship, I helped improve Manticore’s versatility and functionality. Specifically, I combined it with Ansible (an automated provisioning framework) and Protobuf (a serialization library) to allow users to run Manticore in the cloud and better understand what occurs during a Manticore run.

As it stands, Manticore is very CPU-intensive; it competes for CPU time with other user processes when running locally. Running a job on a remotely provisioned VM in which more resources could be diverted to a Manticore run would make this much less of an issue.

To address this, I created “mcorepv” (short for Manticore Provisioner), a Python wrapper for Manticore’s CLI and Ansible/DigitalOcean that allows users to select a run destination (local machine/remote droplet) and supply a target Manticore Python script or executable along with all necessary runtime flags. If the user decides to run a job locally, mcorepv executes Manticore analysis in the user’s current working directory and logs the results.

Things get more interesting if the user decides to run a job remotely—in this case, mcorepv will call Ansible and execute a playbook to provision a new DigitalOcean droplet, copy the user’s current working directory to the droplet, and execute Manticore analysis on the target script/executable. While the analysis is running, mcorepv streams logs and Manticore’s stdout back in near real time via Ansible so a user may frequently check on the analysis’ progress.

Manticore should also simultaneously stream its list of internal states and their statuses (ready, busy, killed, terminated) to the user via a protobuf protocol over a socket in order to better describe the analysis’ status and resource consumption (this is currently a work in progress). To make this possible, I developed a protobuf protocol to represent Manticore’s internal state objects and allow for serialization, along with a terminal user interface (TUI). Once started on the droplet, Manticore spins up a TCP server that provides a real-time view of the internal state lists. The client can then run the TUI locally, which will connect to the Manticore server and display the state lists. Once the job has finished, the Manticore server is terminated, and the results of the Manticore run along with all logs are copied back to the user’s local machine where they may be further inspected.

There’s still some work to be done to ensure Manticore runs bug-free in the cloud. For example:

  • Port forwarding must be set up on the droplet and local machine to ensure Manticore’s server and client TUI can communicate over SSH.
  • The TUI needs additional optimization and improvement to ensure the user gets the right amount of information they need.
  • Mcorepv and its Ansible backend need to be more rigorously tested to ensure they work properly.

I’m glad that in my short time at Trail of Bits, I was able to help move Manticore one step closer to running anywhere, anytime.

Fig 1: Proof of concept—Manticore TUI displaying a list of state objects and log messages to the user.

Samuel Caccavale—Go

Northeastern University

During my winternship, I developed AST- and SSA-based scanners to find bugs in Go code that were previously overlooked by tools like GoSec and errcheck. One unsafe code pattern I targeted was use of a type assertion’s value before checking whether the assertion succeeded in value, ok := foo.(T). While errcheck will detect type assertions that do not bind the ok value (and therefore cause a panic if the type assertion fails), it can’t exhaustively check whether value is used only in a context where ok is true. The example below is the most trivial case where errcheck and the SSA approach diverge; the SSA approach will correctly detect the usage of safe as safe, and the usage of unsafe as unsafe:

package main

import ("fmt")

func main() {
    var i interface{} = "foo"

    safe, ok := i.(string)
    if ok {
        fmt.Println(safe)
    }

    unsafe, ok := i.(string)
    fmt.Println(ok)
    if true {
        fmt.Println(unsafe)
    }
}

Taylor Pothast—Mishegos

Vanderbilt University

During my winternship, I worked on improving the performance of mishegos, Trail of Bits’ differential fuzzer for x86_64 decoders, by switching the cohort collection and output components from their original proof-of-concept JSON format to a compact binary format.

To do this, I learned about mishegos’ internal workings, its in-memory result representations, binary parsing, and how to write Kaitai Struct definitions. My work was merged on the final day of my internship, and is now the default output format for mishegos runs.

To make my work compatible with mishegos’s preexisting analysis tools, I also added a helper utility, mish2jsonl, for converting the new binary output into a JSON form mostly compatible with the original output format. Finally, I updated the analysis tools to handle the changes I made in the JSON-ified output format, including new symbolic fields for each fuzzed decoder’s state.

Thomas Quig—Crytic and Slither

University of Illinois Urbana-Champaign

While at Trail of Bits, I integrated Slither’s smart contract upgradeability checks into Crytic, Trail of Bits’ CI service for Ethereum security. Because smart contracts are immutable once deployed, users need a way to upgrade their contracts if they ever need to be updated. To resolve this issue, the user can have the old contract, a new contract, and a proxy contract. The proxy contract contains what are essentially function pointers that can be modified to point towards the new contract. Risks arise if this process is done incorrectly (e.g., incorrect global variable alignment). With Slither, the user can check to make sure those risks are mitigated.

The integration process was much more complicated than I thought it would be, but I was still able to successfully implement the checks. Learning the codebase and the multiple languages required to work with it simultaneously was quite difficult, but manageable. Crytic now grabs the list of all smart contracts and displays them within settings so they can be chosen. The user can pick which contract is the old contract, the new contract, and the proxy contract. Upgradeability checks are then run on those contracts, and the output is displayed to a new findings page in an easily analyzable JSON format.

I enjoyed my time at Trail of Bits. My mentors helped me learn the infrastructure while giving me opportunities to work independently. I gained significant experience in a short period of time and learned about many topics I didn’t expect to like.

William Wang—OpenSSL and Anselm

UCLA

This winter I worked on Anselm. OpenSSL is one of the most popular cryptography libraries for developers, but it’s also shockingly easy to misuse. Still, many instances of improper use fall into specific patterns, which is where Anselm comes in. My main goal was to prototype a system for detecting these behaviors.

I spent most of my time writing an LLVM pass to detect OpenSSL API calls and form a simplified graph of nodes representing possible execution paths. In larger codebases, traversing through each individual IR block can be time consuming. By “compressing” the graph to its relevant nodes (API calls) first, Anselm enables more complex analyses on it.

I also explored and began implementing heuristics for bad behaviors of varying complexity. For example, mandating that a cipher context be initialized only involves searching the children of nodes where one is created. Similarly, other ordering requirements can be enforced through the call graph alone. However, a bug like repeated IVs/nonces likely requires argument/return value analysis, which I’d like to research more in the future.

While I’m happy with what I accomplished over the winternship, there’s a lot more to be done. In addition to fleshing out the ideas mentioned above, I’d like to polish up the final interface so other developers can easily write their own heuristics. Working with LLVM IR also means that Anselm can theoretically operate on other languages that use OpenSSL bindings. I’m excited to tackle these and other challenges in the future!

See you in the fall?

We’ve selected our summer 2020 interns, but if you’d like to apply (or suggest an intern) for fall, contact us!

Reinventing Vulnerability Disclosure using Zero-knowledge Proofs

We, along with our partner Matthew Green at Johns Hopkins University, are using zero-knowledge (ZK) proofs to establish a trusted landscape in which tech companies and vulnerability researchers can communicate reasonably with one another without fear of being sabotaged or scorned. Over the next four years, we will push the state of the art in ZK proofs beyond theory, and supply vulnerability researchers with software that produces ZK proofs of exploitability. This research is part of a larger effort funded by the DARPA program Securing Information for Encrypted Verification and Evaluation (SIEVE).

Much recent ZK work has focused on blockchain applications, where participants must demonstrate to the network that their transactions follow the underlying consensus rules. Our work will substantially broaden the types of statements provable in ZK. By the end of the program, users will be able to prove statements about programs that run on an x86 processor.

Why ZK proofs of exploitability?

Software makers and vulnerability researchers have a contentious relationship when it comes to finding and reporting bugs. Disclosing too much information about a vulnerability could ruin the reward for a third-party researcher while premature disclosure of a vulnerability could permanently damage the reputation of a software company. Communication between these parties commonly breaks down, and the technology industry suffers because of it.

Furthermore, in many instances companies are unwilling to engage with security teams and shrug off potential hazards to user privacy. In these situations, vulnerability researchers are put in a difficult position: stay silent despite knowing users are at risk, or publicly disclose the vulnerability in an attempt to force the company into action. In the latter scenario, researchers may themselves put users in harm’s way by informing attackers of a potential path to exploitability.

ZK proofs of exploitability will radically shift how vulnerabilities are disclosed, allowing companies to precisely define bug bounty scope and researchers to unambiguously demonstrate they possess valid exploits, all without risking public disclosure.

Designing ZK proofs

In ZK proofs, a prover convinces a verifier that they possess a piece of information without revealing the information itself. For example, Alice might want to convince Bob that she knows the SHA256 preimage, X, of some value, Y, without actually telling Bob what X is. (Maybe X is Alice’s password). Perhaps the most well-known industrial application of ZK proofs is found in privacy-preserving blockchains like Zcash, where users want to submit transactions without revealing their identity, the recipient’s identity, or the amount sent. To do this, they submit a ZK proof showing that the transaction follows the blockchain’s rules and that the sender has sufficient funds. (Check out Matt Green’s blog post for a good introduction to ZK proofs with examples.)

Now you know why ZK proofs are useful, but how are these algorithms developed, and what tradeoffs need to be evaluated? There are three metrics to consider for developing an efficient system: prover time, verifier time, and bandwidth, which is the amount of data each party must send to the other throughout the protocol. Some ZK proofs don’t require interaction between the prover and verifier, in which case the bandwidth is just the size of the proof.

And now for the hard part, and the main barrier that has prevented ZK proofs from being used in practice. Many classical ZK protocols require that the underlying statement be first represented as a Boolean or arithmetic circuit with no loops (i.e., a combinatorial circuit). For Boolean circuits, we can use AND, NOT, and XOR gates, whereas arithmetic circuits use ADD and MULT gates. As you might imagine, it can be challenging to translate a statement you wish to prove into such a circuit, and it’s particularly challenging if that problem doesn’t have a clean mathematical formulation. For example, any program with looping behavior must be unrolled prior to circuit generation, which is often impossible when the program contains data-dependent loops.

circuits

Examples of Boolean and arithmetic circuits

Furthermore, circuit size has an impact on the efficiency of ZK protocols. Prover time usually has at least a linear relationship with circuit size (often with large constant overhead per gate). Therefore, ZK proofs in vulnerability disclosure require the underlying exploit to be captured by a reasonably sized circuit.

Proving Exploitability

Since ZK proofs generally accept statements written as Boolean circuits, the challenge is to prove the existence of an exploit by representing it as a Boolean circuit that returns “true” only if the exploit succeeds.

We want a circuit that accepts if the prover has some input to a program that leads to an exploit, like a buffer overflow that lets the attacker gain control of program execution. Since most binary exploits are both binary- and processor-specific, our circuit will have to accurately model whatever architecture the program was compiled to. Ultimately, we need a circuit that accepts only when a genuine run of the program occurs and an exploit is triggered during that execution.

Taking a naive approach, we would develop a circuit that represents “one step” of whatever processor we want to model. Then, we’d initialize memory to contain the program we want to execute, set the program counter to wherever we want program execution to start, and have the prover run the program on their malicious input—i.e., just repeat the processor circuit until the program finishes, and check whether an exploit condition was met at each step. This could mean checking whether the program counter was set to a known “bad” value, or a privileged memory address was written to. An important caveat about this strategy: The entire processor circuit will need to run at each step (even though only one instruction is being actually executed), because singling out one piece of functionality would leak information about the trace.

Unfortunately, this approach will produce unreasonably large circuits, since each step of program execution will require a circuit that models both the core processor logic and the entirety of RAM. Even if we restrict ourselves to machines with 50 MB of RAM, producing a ZK statement about an exploit whose trace uses 100 instructions will cause the circuit to be at least 5 GB. This approach is simply too inefficient to work for meaningful programs. The key problem is that circuit size scales linearly with both trace length and RAM size. To get around this, we follow the approach of SNARKs for C, which divides the program execution proof into two pieces, one for core logic and the other for memory correctness.

To prove logic validity, the processor circuit must be applied to every sequential pair of instructions in the program trace. The circuit can verify whether one register state can legitimately follow the other. If every pair of states is valid, then the circuit accepts. Note, however, that memory operations are assumed to be correct. If the transition function circuit were to include a memory checker, it would incur an overhead proportional to the size of the RAM being used.

Trace Proof (1)

Checking processor execution validity without memory

Memory can be validated by having the prover also input a memory-sorted execution trace. This trace places one trace entry before the other if the first accesses a lower-numbered memory address than the second (with timestamps breaking ties). This trace lets us do a linear sweep of memory-sorted instructions and ensure consistent memory access. This approach avoids creating a circuit whose size is proportional to the amount of RAM being used. Rather, this circuit is linear in the trace size and only performs basic equality checks instead of explicitly writing down the entirety of RAM at each step.

Memory Proof (1)

Checking a memory sorted trace
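As a concrete (non-circuit) illustration of that linear sweep, here is a sketch in ordinary code; in the ZK setting each comparison becomes an equality gate, and the field names here are my own rather than anything from the literature:

#include <cstddef>
#include <cstdint>
#include <vector>

struct TraceEntry {
  std::uint64_t addr;       // memory address accessed
  std::uint64_t timestamp;  // position in the original execution trace
  bool is_store;            // true for writes, false for reads
  std::uint64_t value;      // value written (stores) or observed (loads)
};

// Assumes `sorted` is ordered by (addr, timestamp): every load must observe
// the value carried by the immediately preceding access to the same address.
bool memory_consistent(const std::vector<TraceEntry> &sorted) {
  for (std::size_t i = 1; i < sorted.size(); i++) {
    const TraceEntry &prev = sorted[i - 1];
    const TraceEntry &cur = sorted[i];
    if (cur.addr != prev.addr)
      continue;  // first access to a new address; nothing to check
    if (!cur.is_store && cur.value != prev.value)
      return false;  // a load saw something other than the last value written
  }
  return true;
}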

The final issue we need to address is that nothing stops the prover from using a memory-sorted trace that doesn’t correspond to the original and create a bogus proof. To fix this, we have to add a “permutation checker” circuit that verifies we’ve actually sorted the program trace by memory location. More discussion on this can be found in the SNARKs for C paper.

Modeling x86

Now that we know how to prove exploits in ZK, we need to model relevant processor architectures as Boolean circuits. Prior work has done this for a RISC architecture called TinyRAM, which was designed to run efficiently in a ZK context. Unfortunately, TinyRAM is not used in commercial applications and is therefore not realistic for providing ZK proofs of exploitability in real-world programs, since many exploits rely on architecture-specific behavior.

We will begin our development of ZK proofs of exploitability by modeling a relatively simple microprocessor that is in widespread use: the MSP430, a simple chip found in a variety of embedded systems. As an added bonus, the MSP430 is also the system the Microcorruption CTF runs on. Our first major goal is to produce ZK proofs for each Microcorruption challenge. With this “feasibility demonstration” complete, we will set our sights on x86.

Moving from a simple RISC architecture to an infamously complex CISC machine comes with many complications. Circuit models of RISC processors clock in between 1–10k gates per cycle. On the other hand, if our x86 processor turned out to contain somewhere on the order of 100k gates, a ZK proof for an exploit that takes 10,000 instructions to complete would produce a proof of size 48 gigabytes. Since x86 is orders of magnitude more complex than MSP430, naively implementing its functionality as a Boolean circuit would be impractical, even after separating the proof of logic and memory correctness.

Our solution is to take advantage of the somewhat obvious fact that no program uses all of x86. It may be theoretically possible for a program to use all 3,000 or so instructions supported by x86, but in reality, most only use a few hundred. We will use techniques from program analysis to determine the minimal subset of x86 needed by a given binary and dynamically generate a processor model capable of verifying its proper execution.

Of course, there are some x86 instructions that we cannot support, since some x86 instructions implement data-dependent loops. For example, repz repeats the subsequent instruction until rcx is 0. The actual behavior of such an instruction cannot be determined until runtime, so we cannot support it in our processor circuit, which must be a combinatorial circuit. To handle such instructions, we will produce a static binary translator from normal x86 to our program-specific x86 subset. This way, we can handle the most complex x86 instructions without hard-coding them into our processor circuit.

A new bug disclosure paradigm

We are very excited to start this work with Johns Hopkins University and our colleagues participating in the DARPA SIEVE Program. We want to produce tools that radically shift how vulnerabilities are disclosed, so companies can precisely specify the scope of their bug bounties and vulnerability researchers can securely submit proofs that unambiguously demonstrate they possess a valid exploit. We also anticipate that the ZK disclosure of vulnerabilities can serve as a consumer protection mechanism. Researchers can warn users about the potential dangers of a given device without making that vulnerability publicly available, which puts pressure on companies to fix issues they may not have been inclined to otherwise.

More broadly, we want to transition ZK proofs from academia to industry, making them more accessible and practical for today’s software. If you have a specific use for this technology that we haven’t mentioned here, we’d like to hear from you. We’ve built up expertise navigating the complex ecosystem of ZK proof schemes and circuit compilers, and can help!

Bug Hunting with Crytic

Crytic, our Github app for discovering smart contract flaws, is kind of a big deal: It detects security issues without human intervention, providing continuous assurance while you work and securing your codebase before deployment.

Crytic finds many bugs no other tools can detect, including some that are not widely known. Right now, Crytic has 90+ detectors, and we’re continuously adding new checks and improving existing ones. It runs these bug and optimization detectors on every commit, and evaluates custom security properties that you add yourself too!

Today, we’re sharing twelve issues in major projects that were found solely by Crytic, including some of high severity.

Here are the issues we discovered. Read on for where we found them:

  1. Unused return value can allow free token minting
  2. registerVerificationKey always returns empty bytes
  3. MerkleTree.treeHeight and MerkleTree.zero can be constant
  4. Modifiers can return the default value
  5. Dangerous strict equality allows the contract to be trapped
  6. ABI encodePacked collision
  7. Missing inheritance is error-prone
  8. Msg.value is used two times in fundWithAward
  9. Reentrancy might lead to theft of tokens
  10. [Pending acknowledgement]
  11. [Pending acknowledgement]
  12. [Pending acknowledgement]

Ernst & Young Nightfall

Crytic found three bugs in E&Y’s Nightfall project, including one critical vulnerability that could allow anyone to mint free tokens.

Issue 1: Unused return value can allow minting free tokens

Description

FTokenShield.mint does not check the return value of transferFrom. If FTokenShield is used with a token that does not revert in case of incorrect transfers, and only returns false (e.g., BAT), anyone can mint free commitments—and an attacker can mint tokens for free.

A similar issue with less impact occurs in FTokenShield.burn: here fToken.transfer is not checked.

Recommendation

Check the return value of transferFrom and transfer.

Crytic report

Issue 2: registerVerificationKey always returns empty bytes

Description

FTokenShield.registerVerificationKey and NFTokenShield.registerVerificationKey return an empty bytes32. It is unclear what the correct returned value should be. This might lead to unexpected behavior for third parties and contracts.

Recommendation

Consider either returning a value, or removing the return value from the signature.

Crytic report

Issue 3: MerkleTree.treeHeight and MerkleTree.zero can be constant

Description

treeHeight and zero can be declared constant in MerkleTree.sol to allow the compiler to optimize this code.

Recommendation

State variables that never change can be declared constant to save gas.

Crytic report

DeFiStrategies

Crytic found one unusual issue in DeFiStrategies: The lack of placeholder execution in a modifier leads the caller function to return the default value. Additionally, Crytic found an issue related to strict equality on the return of a balanceOf call.

Issue 4: Modifiers can return the default value

Description

The SuperSaverZap.onlyInEmergency() and SuperSaverZap.stopInEmergency() modifiers do not revert in case of invalid access. If a modifier neither executes the function body (the _ placeholder) nor reverts, the function returns its default value, which can be misleading for the caller.

Recommendation

Replace the if() condition by a require in both modifiers.

Crytic report

Issue 5: Dangerous strict equality allows the contract to be trapped

Description

ERC20toUniPoolZapV1_General.addLiquidity has strict equality on the _UniSwapExchangeContractAddress balance. This behavior might allow an attacker to trap the contract by sending tokens to _UniSwapExchangeContractAddress.

Recommendation

Change == to <= in the comparison.

Crytic report

DOSNetwork

Crytic found another issue that is not well known in DOSNetwork: If abi.encodePacked is called with multiple dynamic arguments, it can return the same value for different arguments.

Issue 6: ABI encodePacked Collision

Description

DOSProxy uses the encodePacked Solidity function with two consecutive strings (dataSource and selector): eth-contracts/contracts/DOSProxy.sol.

This schema is vulnerable to a collision, where two calls with a different dataSource and selector can result in the same queryId (see the Solidity documentation for more information).

Recommendation

Do not use more than one dynamic type in encodePacked, and consider hashing both dataSource and selector with keccak256 first.

Crytic Report

ethereum-oasis/Baseline

Crytic found an architectural issue in the Baseline Protocol, among others: A contract implementing an interface did not inherit from it.

Issue 7: Missing inheritance is error-prone

Description

Shield is an implementation of ERC1820ImplementerInterface, but it does not inherit from the interface. This behavior is error-prone and might prevent the implementation or the interface from updating correctly.

Recommendation

Inherit Shield from ERC1820ImplementerInterface.

Crytic report

EthKids

Crytic found another unusual issue in EthKids: this.balance includes the amount of the current transaction (msg.value), which might lead to incorrect value computation.

Issue 8: Msg.value is used two times in fundWithAward

Description

The use of this.balance in fundWithAward does not account for the value already added by msg.value. As a result, the price computation is incorrect.

fundWithAward computes the token amount by calling calculateReward with msg.value:

function fundWithAward(address payable _donor) public payable onlyWhitelisted {
   uint256 _tokenAmount = calculateReward(msg.value, _donor);

calculateReward calls calculatePurchaseReturn, where _reserveBalance is this.balance and _depositAmount is msg.value:

  return bondingCurveFormula.calculatePurchaseReturn(_tokenSupply, _tokenBalance, address(this).balance, _ethAmount);

In calculatePurchaseReturn, baseN is then computed by adding _depositAmount (msg.value) to _reserveBalance (this.balance):

uint256 baseN = _depositAmount.add(_reserveBalance);

msg.value is already present in this.balance. For example, if this.balance is 10 ether before the transaction and msg.value is 1 ether, this.balance will be 11 ether during the transaction. As a result, msg.value is counted twice, and calculatePurchaseReturn is incorrectly computed.

Recommendation

Change the price computation so that _reserveBalance does not include the amount sent in the transaction.

Crytic report

HQ20

Finally, Crytic found a well-known issue in HQ20: reentrancy. This reentrancy is interesting as it occurs if the contract is used with a token with callback capabilities (such as ERC777). This is similar to the recent uniswap and lendf.me hacks.

Issue 9: Reentrancy might lead to theft of tokens

Description

Classifieds calls transferFrom on external tokens without following the check-effects-interaction pattern. This leads to reentrancy that can be exploited by an attacker if the destination token has a callback mechanism (e.g., an ERC777 token).

There are two methods with reentrancy issues:

Recommendation

Follow the check-effects-interaction pattern.

Crytic report

Start using Crytic today!

Crytic can save your codebase from critical flaws and help you design safer code. What’s not to like?

Sign up for Crytic today. Got questions? Join our Slack channel (#crytic) or follow @CryticCi on Twitter.

Announcing the 1st International Workshop on Smart Contract Analysis

At Trail of Bits we do more than just security audits: We also push the boundaries of research in vulnerability detection tools, regularly present our work in academic conferences, and review interesting papers from other researchers (see our recent Real World Crypto and Financial Crypto recaps).

In this spirit, we and Northern Arizona University are pleased to co-organize the 1st International Workshop on Smart Contract Analysis (WoSCA 2020). Co-located with ISSTA 2020, our workshop will bring together researchers from all over the world to discuss static and dynamic approaches for analyzing smart contracts. Topics include, but are not limited to:

  • Analysis-based vulnerability discovery (e.g. heuristics-based static analysis, fuzzing)
  • Sound analysis (e.g. model checking, temporal logic)
  • Code optimization (e.g. code-size reduction, gas-cost estimation)
  • Code understanding (e.g. decompilation, reverse engineering)
  • Code monitoring (e.g. debugging, fault detection)
  • Intermediate representation (e.g. design, specification)

WoSCA 2020 also actively promotes open and reproducible research. We especially encourage you to submit papers that show how to improve existing tools or propose new ones!

Submit your paper (up to 8 pages) by Friday, May 22, 2020 (AoE). Update: the deadline has been extended to Friday, June 26, 2020 (AoE)!

And don’t forget: We’re still accepting submissions for the 10k Crytic Research Prize, which honors published academic papers built around or on top of our blockchain tools. Get involved!

Revisiting 2000 cuts using Binary Ninja’s new decompiler

It’s been four years since my blog post “2000 cuts with Binary Ninja.” Back then, Binary Ninja was in a private beta, and the response to the blog post surprised its developers at Vector35. Over the past few years I’ve largely preferred to use IDA and HexRays for reversing, and then use Binary Ninja for any scripting. My main reason for sticking with HexRays? Well, it’s an interactive decompiler, and Binary Ninja didn’t have a decompiler…until today!

Today, Vector35 released their decompiler to their development branch, so I thought it’d be fun to revisit the 2000 cuts challenges using the new decompiler APIs.

Enter High Level IL

The decompiler is built on yet another IL, the High Level IL. High Level IL (HLIL) further abstracts the existing ILs: Low Level IL, Medium Level IL, and five others largely used for scripts. HLIL has aggressive dead code elimination, constant folding, switch recovery, and all the other things you’d expect from a decompiler, with one exception: Binary Ninja’s decompiler doesn’t target C. Instead, they’re focusing on readability, and I think that’s a good thing, because C is full of many idiosyncrasies and implicit operations/conversions, which makes it very difficult to understand.

Using one of our 2000 cuts challenges, let’s compare Binary Ninja’s decompiler to its IL abstractions.

First, the original disassembly:

The ASM disassembly has 43 instructions: three are arithmetic, 27 are memory operations (loads and stores), four are transfers, one is a comparison, and eight are control-flow instructions. Of these, we care mainly about the control-flow instructions and a small slice of the memory operations.

Next, Low Level IL (LLIL):

LLIL mode doesn’t remove any instructions; it simply provides a consistent interface to write scripts against, regardless of the architecture being disassembled.

With Medium Level IL (MLIL), things start to get better:

In MLIL, stack variables are identified, dead stores are eliminated, constants are propagated, and calls have parameters. And we’re down to 17 instructions (39% of the original). Since MLIL understands the stack frame, all the arithmetic operations are removed and all the “push” and “pop” instructions are eliminated.

Now, in the decompiler, we’re down to only six instructions (or eight, if you count the variable declarations):

Also available in linear view:


HLIL Linear View

Using the new decompiler API to solve 2000 cuts

For the original challenge, we had to extract hard-coded stack canaries out of hundreds of nearly identical executables. I was able to do this with very little difficulty using the LLIL API, but I had to trace some values around and it was a bit unergonomic.

Let’s try it again, but with the HLIL API¹:

#!/usr/bin/env python3
import sys
import binaryninja
# 4 years ago we had to specify the loader format and sleep while analysis
# finished
target = sys.argv[1]
bv = binaryninja.BinaryViewType.get_view_of_file(target, update_analysis=True)
# start is our entry point and has two calls, the second call is main
# grabbing start isn't too different than before
start = bv.get_function_at(bv.entry_point)
start_blocks = list(start.high_level_il) # start only has one block
start_calls = [x for x in start_blocks[0] if x.operation == binaryninja.HighLevelILOperation.HLIL_CALL]
call_main = start_calls[1] # second call is main
# Main has a single call to our handler function
main = bv.get_function_at(call_main.dest.constant)
main_blocks = list(main.high_level_il) # main only has one block
main_calls = [x for x in main_blocks[0] if x.operation == binaryninja.HighLevelILOperation.HLIL_CALL]
call_handler = main_calls[0] # first call is handler
# Here's where the real improvements lie
# Handler has our cookie, which is compared with memcmp in the
# last call of the first block, but our call is folded into the if condition
handler = bv.get_function_at(call_handler.dest.constant)
handler_blocks = list(handler.high_level_il)
# grab all the HLIL_IF instructions out of the first block; there should be only one.
if_insn = [x for x in handler_blocks[0] if x.operation == binaryninja.HighLevelILOperation.HLIL_IF]
# The call to memcmp is the left side of the condition, the right side is '0':
# if(memcmp(buf, "cookie", 4) == 0)
call_memcmp = if_insn[0].condition.left
# Now pull the cookie's data pointer out of the call to memcmp
# arg0 is our input buffer
# arg1 is the cookie data pointer
# arg2 is the size of the compare
cookie_ptr = call_memcmp.params[1].constant
# Read the first 4 bytes to get the cookie value; we could also use the
# count argument of the memcmp here
cookie = bv.read(cookie_ptr, 4)
print(f"Cookie: {cookie}")


Running the script produces the desired result:

We quickly process 1000 cuts

Conclusion

The HLIL solution is a bit more concise than our original, and easier to write. Overall, I’m enjoying Binary Ninja’s new decompiler, and I’m pretty excited to throw it at architectures not supported by other decompilers, like MIPS, 68k, and 6502. Hopefully, this is the decade of the decompiler wars!

If you’re interested in learning more about Binary Ninja, check out our Automated Reverse Engineering with Binary Ninja training.

Footnotes
  1. For help writing Binary Ninja plugins that interact with the ILs, check out my bnil-graph plugin, which was recently updated to support HLIL.

Announcing our first virtual Empire Hacking

At Trail of Bits, we’ve all been working remotely due to COVID-19. But the next Empire Hacking event will go on, via video conference!

  • When: April 14th @ 6PM
  • How: RSVP via this Google Form or on Meetup. We’ll email you an invitation early next week.

Come talk shop with us!

Every two months, Empire Hacking is hosted by companies that share our passion for investigating the latest in security engineering. We bring together interesting speakers and people in the field, share new information, make connections, and enjoy continuing our conversations at a nearby bar afterwards. (Bring your own quarantini this time.)

Don’t miss our April lineup

Enarx: Secured, Attested Execution on Any Cloud

Lily Sturmann and Mark Bestavros, security engineers at Red Hat in Boston, will tell us the latest about their work on the open-source Enarx project, which is currently underway on AMD’s SEV and Intel’s SGX hardware. Enarx makes it simple to deploy confidential workloads to a variety of TEEs in the public cloud by handling deployment and attestation. It uses WebAssembly to offer developers a wide range of compatible language choices for workloads, eliminating the need to rewrite the application for a particular TEE’s platform or SDK.

Towards Automated Vulnerability Patching

If you know Trail of Bits, you know we find a lot of bugs, and we think you’ll enjoy this peek at what happens after bugs are found. Carson Harmon will give a brief high-level overview of the challenges associated with vulnerability patching, and our progress in creating tools to assist humans with the process. He’s been working on DARPA CHESS, one of our government-sponsored projects that focuses on automatically finding and fixing bugs in C/C++ programs.

We look forward to seeing you (virtually) on April 14th at 6PM to continue the conversation.

An Echidna for all Seasons

TL;DR: We have improved Echidna with tons of new features and enhancements since it was released—and there’s more to come.

Two years ago, we open-sourced Echidna, our property-based smart contract fuzzer. Echidna is one of the tools we use most in smart contract assessments. According to our records, Echidna was used in about 35% of our smart contract audits during the past two years. These include several high-profile audits such as MakerDAO, 0x, and Balancer. Since the first release of Echidna, we have been adding new features as well as fixing bugs. Here’s a quick look at what we’ve done.

New features

We expanded the capabilities of Echidna with a large set of exciting new features. Some of the most important ones are:

Support for several compilation frameworks using crytic-compile: Integration with crytic-compile allowed Echidna to test complex Truffle projects, and even smart contracts in other languages, such as Vyper, right out of the box. It is completely transparent for the user (if you are an Echidna user, you are already using it!) and it was one of the most important features we implemented in Echidna last year.

Assertion testing: Solidity’s assert can be used as an alternative to explicit Echidna properties, especially if the conditions you’re checking are directly related to the correct use of some complex code deep inside a function. Assertion testing also lets you check for implicit asserts inserted by the compiler, such as out-of-bounds array accesses, without writing an explicit property. Add checkAsserts: true in your Echidna configuration file (as sketched below) and it will take care of the rest.

An assertion failure is discovered in Vera’s MakerDAO example
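For reference, here is a minimal sketch of what that configuration might look like; the checkAsserts flag is the only option taken from the paragraph above, and everything else is left at Echidna’s defaults:

checkAsserts: true # treat failing assert statements as property violations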

Restrict the functions to call during a fuzzing campaign: Not all functions in a smart contract are created equal. Some of them are not useful during property-based testing and will only slow down the campaign. That’s why Echidna can either blacklist or whitelist functions to call during a fuzzing campaign. Here’s an Echidna configuration that avoids “f1” and “f2” methods during a fuzzing campaign:

filterBlacklist: true # or use false for whitelisting
filterFunctions: ["f1", "f2"]

Save and load the corpus collected during a fuzzing campaign: If coverage support is enabled, Echidna can load and save the complete corpus collected in JSON. If a corpus is available at the beginning of a fuzzing campaign, Echidna will use it immediately. This means that Echidna will not start from scratch, which is particularly useful during CI tests to speed up the verification of complex properties. Add coverage: true and corpusDir: "corpus" to your Echidna configuration and create a “corpus” directory to save the inputs generated by Echidna.

Pretty-printed example of a transaction from a corpus.
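A sketch of the corresponding configuration, using only the two options named above (the directory name is just an example, and the directory must exist before the campaign starts):

coverage: true      # collect coverage information during the campaign
corpusDir: "corpus" # load the corpus from (and save it to) this directory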

Detect transactions with high gas consumption: Excessive gas usage can be a pain for developers and users of smart contracts. There are few tools available for detecting transactions with large gas consumption, especially if detecting the transaction requires reaching unusual states of the contract via other transactions. Echidna recently added support for detecting this kind of issue. Use estimateGas: true in your Echidna configuration (see the sketch below) to report high-gas transactions to your console.

Discovery of a transaction consuming a large amount of gas
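As above, a minimal configuration sketch for this feature, assuming nothing beyond the estimateGas flag described in the text:

estimateGas: true # report the transactions with the highest gas consumption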

Extended testing of complex contracts: Echidna also improved the testing of complex contracts with two cool features. First, it allows initializing a fuzzing campaign with arbitrary transactions using Etheno. Second, it can test more than one contract at the same time, calling any public or external function of any tested contract. Use multi-abi: true in your Echidna configuration to test more than one contract at the same time.
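And one last configuration sketch for multi-contract testing, again using only the option named above:

multi-abi: true # fuzz the public and external functions of every tested contract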

Keeping up to date with the latest research

We are following the latest developments in smart contract fuzzing papers to make sure Echidna is up to date. Our researchers compare open-source fuzzers to Echidna, and integrate any new approach that proves to be effective for finding faults or generating more interesting inputs. In fact, from time to time, we test examples presented in research papers to make sure Echidna can solve them very efficiently! We also regularly attend conferences to discuss novel fuzzing techniques, and even financially support new research papers that improve our tools.

Echidna solves the example presented in Harvey’s paper

Looking forward

And we’re not taking a break! In fact, we have a pipeline of improvements and new features coming to Echidna in the near future, including enhanced coverage feedback, array generation and corpus mutations, and Slither integration. We are also excited to share that we have added Echidna support to crytic.io, our continuous assurance platform for smart contracts.

Echidna integration for automatic assertion checking in crytic.io

In summary

In two years, Echidna has evolved from an experimental tool into an essential resource for fuzzing smart contracts and identifying correctness/security issues. We continue to push the limits of what is possible by fuzzing smart contracts, and keep our open-source tools updated for community use. Learn more about testing your smart contracts with Echidna in our Building Secure Contracts training.

Do you have smart contracts to test with Echidna? Are you interested in reviewing your Echidna scripts or training on how to use it effectively? Drop us a line! Trail of Bits has years of experience in performing smart contract security assessments, addressing everything from minimalistic tokens to complex staking and voting platforms.

Announcing the Zeek Agent

(This post is cross-posted on the Zeek blog and the Trail of Bits blog.)

The Zeek Network Security Monitor provides a powerful open-source platform for network traffic analysis. However, from its network vantage point, Zeek lacks access to host-level semantics, such as the process and user accounts that are responsible for any connections observed. The new Zeek Agent fills this gap by interfacing Zeek directly with your endpoints, providing semantic context that’s highly valuable for making security decisions. The agent collects endpoint data through custom probes and, optionally, by interfacing to osquery and making most of its tables available to Zeek.

We are releasing the Zeek Agent as open-source code under the liberal BSD license. Head over to the releases page and try it now with your macOS and Linux endpoints; Windows support is coming soon! Please share your ideas, requests and other feedback with us by filing an issue on GitHub or joining the #zeek-agent channel on the new Zeek Slack.


What’s new here and why it’s useful

Traditionally, network security monitors only receive network traffic passively intercepted between hosts (endpoints). While that vantage point is very powerful—the network does not lie!—it does not provide the complete picture, and it can be challenging to understand the broader context of who is doing what on the endpoints. This approach makes analyzing encrypted traffic particularly challenging: Since passive monitoring cannot assess the actual content of the communication, defenders often remain in the dark about its legitimacy.

The Zeek Agent closes this gap by adding an endpoint-level vantage point to host activity. Just like Zeek itself, the policy-neutral agent does not perform any detection. Instead, it collects a stream of host-side events (“new process,” “socket opened,” “user logged in”) and feeds those events into Zeek’s standard processing pipeline, where they become available to Zeek scripts just like traditional network-derived events. By representing both network and host activity inside the same event abstraction, this setup lets users deploy all of Zeek’s powerful machinery to cross-correlate the two vantage points. For example, Zeek could now tag network connections with the endpoint-side services that initiated them (e.g., sshd). The agent can also let Zeek create new logs that record endpoint information, as seen in this example:


Zeek logs of network connections and listening ports, enriched with relevant process context from the endpoint (provided by the Zeek Agent)

A short history and background of the project

In 2018 at the University of Hamburg, Steffen Haas developed an initial prototype of the agent. Originally called zeek-osquery, this prototype was a powerful demonstration of the agent approach, but it had certain technical limitations that precluded production usage.

In 2019, Corelight hired Trail of Bits to update and improve Haas’ zeek-osquery software prototype. While the prototype was developed as a set of patches that implemented functionality directly within the osquery core, Trail of Bits suggested a different approach that would be more suitable for long-term development, and a better match for existing deployments. The new version began as a port of the existing code to an osquery extension which could be packaged and distributed independently. Eventually, this process evolved into a clean rewrite to produce an entirely new agent that can operate both in a standalone fashion and with osquery. The agent’s built-in data sources leverage the same underlying process-monitoring capabilities as osquery, but in a way more compatible with how Linux systems are configured for real-world use. Trail of Bits also designed the agent to easily support more custom data sources in the future.

How the Zeek Agent works

Like osquery, the agent provides system information in the form of a database, using the SQLite library. Table plugins publish the actual data, and are allocated and registered during startup. Once they export the necessary methods to report the schema and generate data, the internal framework will create the required SQLite table abstractions and attach them to the database.

Most data sources will inspect the system at query time and report their findings at that specific point in time. However, some tables may want to keep listening for system activity even when no query is being executed. This is commonly referred to as an evented table, which typically uses threads/callbacks to continuously record system events in the background. The process_events table works exactly like this, and allows Zeek scripts to look at past process executions.

Additional data sources can be imported from osquery, if it happens to be installed and running on the same system, thanks to the extensions socket. With this design, everything appears to come from a single unified database, allowing users to seamlessly join built-in and osquery tables together.

The tables can be accessed via scripts as soon as the agent connects to a Zeek server instance. Data can be requested either by running a single-shot SQL statement or as a scheduled query that runs automatically and reports data at specified intervals.

On the Zeek side, scripts fully control the data requested from the agent by defining corresponding SQL queries. Results stream into Zeek continuously, and transparently turn into standard Zeek events that handlers can hook into, just as if the events were derived from network traffic.

How to get started

The Zeek Agent documentation summarizes how to build and install the agent. On the Zeek side, there’s a new Zeek Agent framework you can install through Zeek’s package manager. See its documentation for more information on installation and usage. For an example of a Zeek script requesting information from the agent, see this script that turns process activity into a Zeek log on Linux.

What’s next for the Zeek Agent?

Right now, we welcome feedback on the Zeek Agent regarding usability, functionality, and deployment models. We’re curious to see what use-cases the Zeek community comes up with, and we encourage users to publish Zeek packages that leverage the agent’s capabilities. The best places to leave feedback are the agent’s GitHub issues, the #zeek-agent channel on the new Zeek Slack, and the Zeek mailing list.

The agent is an open-source project, so we also appreciate code contributions; just file GitHub pull requests. If you are interested in sponsoring specific work on the Zeek Agent, please contact Corelight.

We continue to extend the agent: We just completed an initial port to macOS, and we’re working on Windows support, as well. We will be extending the Zeek-side agent script framework, and we’re also adding a multi-hop routing layer to Zeek’s underlying communication system, Broker, to facilitate deployment of the Zeek Agent across numerous endpoints.

About Trail of Bits Security Engineering Services

As your secure development partner, Trail of Bits has helped some of the world’s leading security software companies bring reliable products to market. Our engineering team’s goal is to write secure code and build tools that users can trust to protect their organizations and data. With our teams’ combined software development experience and research into system-level security, we regularly identify foundational gaps: missing capabilities, opportunities for improvement, and even potential vulnerabilities. We will review existing software architecture and provide recommendations or fixes, enhance feature sets or write new capabilities, and improve your security testing using the best available tools.

If your organization has security software plans but lacks the time or dedicated engineering resources to ensure the final product adheres to the best practices in secure coding, contact us. We would love to hear from you.

About Corelight

Corelight delivers powerful network traffic analysis (NTA) solutions that help organizations defend themselves more effectively by transforming network traffic into rich logs, extracted files, and security insights. Corelight Sensors are built on Zeek (formerly called “Bro”), the open-source network security monitoring framework that generates actionable, real-time data for thousands of security teams worldwide. Zeek has become the “gold standard” for incident response, threat hunting, and forensics in large enterprises and government agencies worldwide. Founded by the team behind Zeek, Corelight makes a family of virtual and physical network sensors that take the pain out of deploying open-source Zeek and expand its performance and capabilities. Corelight is based in San Francisco, California, and its global customers include Fortune 500 companies, large government agencies, and major research universities. For more information, visit https://www.corelight.com.