Trail of Bits @ ICSE 2019 – Recap

Three weeks ago, we presented our work on Slither at WETSEB, an ICSE workshop. ICSE is a top-tier academic conference focused on software engineering, and this edition of the event went very well: the organizers do their best to attract industry participants and engage them in the discussions. With so many talks running in parallel, we often wished we could be in two rooms at once. Here are some of the talks we recommend:

Note: Some of the following papers are only accessible with a paid account. We will do our best to update the links as soon as the papers become freely available.

Static Analysis

At Trail of Bits, we spend a lot of effort building reliable static analyzers. For example, McSema allows us to lift binaries directly to LLVM bitcode, and Slither applies static analysis to smart contracts. ICSE offered a rich variety of talks on this topic. We noticed two interesting trends for improving the scalability of static analysis: combining lightweight and expensive analyses, and tuning analysis parameters online.

SMT-Based Refutation of Spurious Bug Reports in the Clang Static Analyzer

By Mikhail R. Gadelha, Enrico Steffinlongo, Lucas C. Cordeiro, Bernd Fischer, Denis A. Nicole (pdf)

This work on clang illustrates the first of those trends. The authors use an SMT solver to refute simple false alarms emitted by the clang static analyzer, filtering out spurious reports that the existing analysis cannot rule out on its own. While the technique is not novel by itself, it is nice to see it concretely implemented within the compiler, where it can be used out of the box, without significant overhead.

Resource-Aware Program Analysis via Online Abstraction Coarsening

By Kihong Heo, Hakjoo Oh, Hongseok Yang (pdf)

In this work, the authors tune the parameters of the analysis on the fly according to available resources (e.g., free RAM). The technique enables or disables flow-sensitivity per variable. First, each variable gets a score representing how much flow-sensitivity matters for it. Then, a controller decides how many variables should be treated flow-sensitively; if a fixpoint is not reached, the controller can change that number on the fly. This work is a nice step toward adaptive static analyzers that adjust themselves to real-world conditions.
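To make the idea concrete, here is a toy paraphrase of the controller (our own sketch in Python, not the paper’s algorithm; the per-variable memory cost is a made-up placeholder):

def choose_flow_sensitive(scores, free_ram_mb, mb_per_variable=2):
    """Pick which variables get flow-sensitive treatment under a memory budget.

    scores: precomputed importance score per variable name
    mb_per_variable: hypothetical cost estimate, not a number from the paper
    """
    budget = max(0, free_ram_mb // mb_per_variable)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:budget])

# The controller re-runs this as resources change, growing or shrinking the set.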

SMOKE: Scalable Path-Sensitive Memory Leak Detection for Millions of Lines of Code

By Gang Fan, Rongxin Wu, Qingkai Shi, Xiao Xiao, Jinguo Zhou, Charles Zhang (pdf)

SMOKE tackles the problem of finding memory leaks in large codebases. The intuition behind this work is that most memory objects can be proven safe from leaking without complex analysis, and only a small number of objects require a costly one. The approach is twofold: first, a fast but imprecise analysis over the so-called use-flow graph representation filters out most of the leak candidates; then, a precise and costly analysis runs on the remaining objects. It is interesting to note that they first use a linear-time solver to filter out obvious results and apply an SMT solver only to the remaining cases. SMOKE is built on top of LLVM and appears to scale to projects of up to 8 million lines of code. The tool and the data to reproduce the experiments are available (the source code is not).

Testing

We are always interested in state-of-the-art testing techniques to improve input generation, using either fuzzing or symbolic execution. For instance, we recently added new techniques for concolic execution of cryptographic primitives and symbolic path merging to Manticore, our symbolic execution engine.

Generating Random Structurally Rich Algebraic Data Type Values (AST)

By Agustín Mista, Alejandro Russo (pdf)

The authors designed an approach for generating algebraic data type values in Haskell. The generated values are more “uniform” in terms of how often each value constructor is randomly generated (and having variety is important for uncovering more bugs). After generation, these values can be used to test other programs (not necessarily written in Haskell). The authors’ system works at compile time to generate Haskell code that, in turn, generates values. Source code is available here.

DifFuzz: Differential Fuzzing for Side-Channel Analysis

By Shirin Nilizadeh, Yannic Noller, Corina S. Pasareanu (pdf)

DifFuzz uses differential fuzzing to detect side channels. The idea is, instead of maximizing code coverage, to maximize the difference in resources used (e.g., time or memory consumed) between two inputs. The inputs are meant to share the same public part but differ in their private part. If the difference in resources used is above a given threshold, an attacker can deduce information about the private part. The paper presents a fair evaluation on specific benchmarks. The code is available here.
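The core objective is easy to state as code. Here is a toy rendering (our own sketch, not DifFuzz’s implementation), where run_cost stands for whatever resource measurement the fuzzer observes:

def side_channel_score(run_cost, public, secret_a, secret_b):
    # run_cost(public, secret) might measure instructions executed or bytes allocated
    return abs(run_cost(public, secret_a) - run_cost(public, secret_b))

# The fuzzer mutates inputs to maximize this score; a score above a chosen
# threshold suggests the secret leaks through timing or memory usage.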

SLF: Fuzzing without Valid Seed Inputs

By Wei You, Xuwei Liu, Shiqing Ma, David Mitchel Perry, Xiangyu Zhang, Bin Liang (pdf)

This interesting paper shows a technique to improve fuzzing when no seeds or source code are available. SLF uses a complex technique divided into multiple steps. To start, SLF uses AFL to identify fields in the input data. Then, it uses a lightweight dynamic analysis to determine which fields are checked during execution (e.g., while parsing a file) and what type of checks are used: arithmetic, index/offset, count, or if-then-else. Moreover, SLF identifies when checks depend on each other. Finally, the tool implements an array of techniques to generate and mutate input for every field, depending on its type.

The experimental evaluation is fair and shows good results in terms of finding new bugs and improving coverage in testing complex programs. However, SLF is not open source so we cannot verify the results presented in this paper.

Grey-box Concolic Testing on Binary Code

By Jaeseung Choi, Joonun Jang, Choongwoo Han, Sang Kil Cha (pdf)

This paper presents Eclipser, a fuzzer that borrows some ideas from symbolic execution, but keeps them scalable for use in larger and more complex programs. We enjoyed their ideas so much that we described their paper in detail in a recent blog post, and integrated Eclipser in DeepState, our Google-Test-like property-based testing tool for C and C++.

Blockchain

Blockchain remains a hot topic in academia, and several new conferences specialize in the area. ICSE did not escape the fervor: several papers related to blockchain were presented. While quality varied, we found the following works promising:

Smarter smart contract development tools (WETSEB)

By Michael Coblenz, Joshua Sunshine, Jonathan Aldrich, Brad A. Myers (pdf)

This talk presented Obsidian, a new smart contract language designed to be safer than existing languages (e.g. Solidity). Obsidian has interesting properties, including a kind of user-level pre- and post-condition type system. Obsidian tries to statically prove the conditions, and adds dynamic checks when needed. The current implementation only compiles to Hyperledger Fabric. The language is still young, though promising. The authors are running a user study to improve the language design.

Gigahorse: Thorough, Declarative Decompilation of Smart Contracts

By Neville Grech, Lexi Brent, Bernhard Scholz, Yannis Smaragdakis (pdf)

Gigahorse is an EVM decompiler. The authors use Datalog, a declarative logic programming language in an unexpected way: they wrote the decompilation steps as Datalog rules and combined them with an external fixpoint loop to overcome the language limitations. A web service is available at https://contract-library.com, though source code is not provided.

Automated Repair

Automatic bug patching is an interesting, but complex, topic. We tackled this challenge during the CGC competition, and we have preliminary results for smart contracts through slither-format. We were definitely interested to review the academic trends in this area of research. Several papers showed promising work, with the caveat that they generally focus on one type of issue, and some evaluations generated incorrect patches.

SapFix: Automated End-to-End Repair at Scale

By A. Marginean, J. Bader, S. Chandra, M. Harman, Y. Jia, K. Mao, A. Mols, A. Scott (pdf)

SapFix is an automated end-to-end fault-fixing tool deployed at Facebook, designed to work at scale. The system focuses on null pointer dereferences and takes advantage of two other tools, Sapienz and Infer. The work shows an interesting combination of heuristics and templates to create the least painful experience for the user. For example, the system combines information from a dynamic crash with the Infer static analyzer to improve fault localization, and it abandons a patch if no developer reviews it within seven days. The paper presented promising results.

On Reliability of Patch Correctness Assessment

By Xuan-Bach D. Le, Lingfeng Bao, David Lo, Xin Xia, Shanping Li, and Corina Pasareanu (pdf)

This work assesses the validity of patch-generation tools. This type of validation is, unfortunately, underrepresented at conferences. The authors evaluated eight automatic software repair tools with 35 developers. The paper shows that the tools' results are not as promising as claimed, though they remain useful as complements to better-established tools.

Poster Session

The poster session was meant to present ongoing work and allowed direct interaction with the authors. We found several promising works.

Demand-driven refinement of points-to analysis

By Chenguang Sun, Samuel Midkiff (pdf)

This work follows the scalability trend we saw during the static analysis session. The goal is to improve points-to analysis by slicing out only the program elements needed to answer targeted queries.

WOK: statical program slicing in production

By Bogdan-Alexandru Stoica, Swarup K. Sahoo, James Larus, Vikram Adve (pdf)

The authors are working toward scalable dynamic slicing of programs by taking advantage of dataflow information gathered statically and of modern hardware support (e.g., Intel Processor Trace). Their preliminary evaluation shows real potential for the technique.

Validity fuzzing and parametric generators for effective random testing

By Rohan Padhye, Caroline Lemieux, Koushik Sen, Mike Papadakis, Yves Le Traon (pdf)

This ongoing work tries to improve generator-based testing (à la QuickCheck) by guiding input generation with two features: (1) the use of a validator to discard semantically invalid inputs, and (2) a conversion from raw bits to structured inputs. The goal is to preserve both the syntax and the semantics of inputs, especially highly structured ones (e.g., file formats).

Optimizing seed inputs in fuzzing with machine learning

By Liang Cheng, Yang Zhang, Yi Zhang, Chen Wu, Zhangtan Li, Yu Fu, Haisheng Li (pdf)

The authors aim to improve input generation through machine learning, following the techniques of Learn&Fuzz. The insight is to let a neural network learn the correlation between inputs and execution-trace coverage, with the goal of generating inputs that are more likely to explore unseen code. Their first experiment on PDF files shows encouraging results compared with previous work.

Contributing to Academic Research

In our work, we spend a fair amount of our time building reliable software based on the latest research available (see Manticore, McSema, DeepState, or our blockchain toolchains). We enjoy exchanging our vision of the technology with the academic community, and we are happy to provide technical support for the use of our tools in academic research. If your work leads you to try our tools, contact us for support!

Why you should go to QueryCon this week

QueryCon takes place this week at the Convene Conference Center in Downtown Manhattan, Thursday, June 20 through Friday, June 21. If you don’t have a ticket yet, get one while you can.

QueryCon is an annual conference about osquery, the open source project that’s helping many top tech companies manage their endpoints. We’ve been big fans of osquery since Facebook hired us to bring Windows support to osquery in 2016. Now we have an entire group in our Engineering practice devoted to helping clients harness the power of osquery through new features and fixes. We jumped at the chance to bring QueryCon to New York at the invitation of Kolide, the original conference organizers. We got tons of value out of QueryCon 2018. We’re super excited to bring the conference to the east coast, and to reconnect in person with the growing and vibrant osquery community.

For most of us who went to 2018 QueryCon, attending this year’s event is a no-brainer. But what about those people who missed the magic and aren’t sure? For those of you still on the fence, here are some reasons to join:

Want to know what’s going on with your endpoints? You need osquery.

If you’re an IT or Security Operations professional, and you haven’t heard of osquery yet, you’re likely in the minority. osquery is quickly becoming a standard foundational tool that top tech firms use to flexibly manage their endpoints.

With osquery, you can expose your fleet’s machine data as a high-performance relational database. Using simple, standardized SQL queries, you can explore operating system data such as running processes, loaded kernel modules, open network connections, browser plugins, hardware events or file hashes.

The solution offers great benefits in isolation or as part of a multifaceted fleet management system. It’s become so ubiquitous that leading commercial endpoint management solution providers, such as Carbon Black, are harnessing osquery to support certain use cases through solutions like Live Query.

Meet someone who can build your killer feature

Being open source, osquery is free to use, transparent, and developed according to users’ needs. Unlike other tools where you have to hope that a vendor company’s feature list overlaps with your needs, with osquery, you can take control by developing new killer features yourself. That is, if you have a team of security-focused C++ developers on staff. If you don’t, QueryCon is a great place to meet those people! Whether you’re interested in learning about better osquery development techniques, hiring a team to help you, or campaigning for a community dev path that aligns with your goals, QueryCon is the place to go to get your company’s wishlist of features into the osquery project.

The community is currently evolving – get a front-row seat to the latest developments

As many community veterans know, the osquery community has seen many recent changes. Late last year, Facebook heavily refactored the entire codebase, migrating osquery away from standard development tools like CMake and integrating it with Facebook’s internal tooling. In order to maintain functionality for the majority of enterprise users who rely on standard dependencies to run osquery, Trail of Bits developed and announced a community-oriented osquery fork, osql. At this year’s QueryCon, Teddy Reed of Facebook will be announcing and discussing plans to transfer stewardship of osquery from Facebook to an open-source foundation, using Trail of Bits’ osql code. Want to ask Teddy questions about the change? He’ll be offering a live Q&A on the first day of the conference! Want to know more about osql? Trail of Bits’ Stefano Bonicatti and Mark Mossberg will let you know what to expect from osql.

It’s single-track, with select and engaged attendees, and no sales talks

We’re keeping with some excellent decisions made by QueryCon 2018. The talks are single-track and laser-focused on providing the osquery community with relevant information. You don’t have to wonder whether some talks will be relevant; we did that work for you in our speaker screening. You can also expect a high-quality audience at QueryCon: other osquery users, developers, and community managers who can help you with your own deployment, help you build your killer feature, or help you weigh in on the direction of this open-source project to match your company’s needs. Expect everyone you meet to be interested in building an open-source tool that makes security better for everyone.

Finally, as a means of protecting the special culture created in the 2018 QueryCon, we’re keeping with the “No sales talks” rule. Speakers are welcome to share new features or tools that they have built as part of an informative development, deployment, or community management topic, but talks submitted that primarily peddle a commercial product or service are strictly prohibited. We think this enhances the QueryCon experience, focuses the conversation on the tool’s progress, and ensures that information presented is trustworthy.

There are still a few tickets left

Want to join this year’s QueryCon? You can still buy tickets on the Eventbrite page and get more information, including the speaker schedule, on the conference website. We hope to see you there!

Leaves of Hash

Trail of Bits has released Indurative, a cryptographic library that enables authentication of a wide variety of data structures without requiring users to write much code. Indurative is useful for everything from data integrity to trustless distributed systems. For instance, developers can use Indurative to add Binary Transparency to a package manager — so users can verify the authenticity of downloaded binaries — in only eight lines of code.

Under the hood, Indurative uses Haskell’s new DerivingVia language extension to automatically map types that instantiate FoldableWithIndex to sparse Merkle tree representations, then uses those representations to create and verify inclusion (or exclusion) proofs. If you understood what that means, kudos, you can download Indurative and get started. If not, then you’re in luck! The whole rest of this blog post is written for you.

“That looks like a tree, let’s call it a tree”

In 1979, Ralph Merkle filed a patent for a hash-based signature scheme. This patent introduced several novel ideas, perhaps most notably that of an “authentication tree,” or, as it’s now known, a Merkle tree. This data structure is now almost certainly Merkle’s most famous work, even if it was almost incidental to the patent in which it was published, as it vastly improves efficiency for an incredible variety of cryptographic problems.

Hash-based signatures require a “commitment scheme” in which one party sends a commitment to a future message such that i) there is exactly one message they can send that satisfies the commitment, ii) given a message, it is easy to check whether it satisfies the commitment, and iii) the commitment doesn’t give away the message’s contents. Commitment schemes are used everywhere from Twitter to multi-party computation.

Typically, a commitment is just a hash (or “digest”) of the message. Anyone can hash a message and see if it’s equal to the commitment. Finding a different message with the same hash is a big deal. That didn’t quite work for Merkle’s scheme though: he wanted to commit to a whole set of different messages, then give an inclusion proof that a message was in the set without revealing the whole thing. To do that, he came up with this data structure:

An example of a binary hash tree. Hashes 0-0 and 0-1 are the hash values of data blocks L1 and L2, respectively, and hash 0 is the hash of the concatenation of hashes 0-0 and 0-1 (Image and caption from Wikimedia)

Think of a binary tree where each node has an associated hash. The leaves are each associated with the hash of a message in the set. Each branch is associated with the hash of its children’s hashes, concatenated. In this scheme, we can then just publish the top hash as a commitment. To prove some message is included in the set, we start at the leaf associated with its hash and walk up the tree. Every time we walk up to a branch, we keep track of the side we entered from and the hash associated with the node on the other side of that branch. We can then check proofs by redoing the concatenation and hashing at each step, and making sure the result is equal to our earlier commitment.

This is a lot easier to understand by example. In the image above, to prove L3’s inclusion, our proof consists of [(Left, Hash 1-1), (Right, Hash 0)] because we enter Hash 1 from the left, with Hash 1-1 on the other side, then Top Hash from the right, with Hash 0 on the other side. To check this proof, we evaluate hash(Hash 0 + hash(hash(L3) + Hash 1-1)). If this is equal to Top Hash, the proof checks! Forging these proofs is, at each step, as hard as finding a hash collision, and proof size is logarithmic in message set size.
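Here is a minimal Python sketch of that check; sha256 and the (side, sibling) proof encoding are our own illustrative choices, not a standard:

import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def fold_proof(leaf: bytes, proof) -> bytes:
    """Walk a proof, a list of (side, sibling_hash) pairs, from a leaf up to a candidate root."""
    acc = H(leaf)
    for side, sibling in proof:  # side: which side we entered the branch from
        acc = H(acc + sibling) if side == "left" else H(sibling + acc)
    return acc

def verify_inclusion(leaf: bytes, proof, root: bytes) -> bool:
    return fold_proof(leaf, proof) == root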

This has all kinds of applications. Tahoe-LAFS, Git, and ZFS (see: Wikipedia) all use it for ensuring data integrity. It appears in decentralization applications from IPFS to Bitcoin to Ethereum (see again: Wikipedia). Lastly, it makes certificate transparency possible (more on that later). The ability to authenticate a data structure turns out to solve all kinds of hard computer science problems.

“You meet your metaphor, and it’s good”

Of course, a Merkle tree is not the only authenticated data structure possible. It’s not hard to imagine generalizing the approach above to trees of arbitrary branch width, and even trees with optional components. We can construct authenticated versions of pretty much any DAG-like data structure, or just map elements of the structure onto a Merkle tree.

In fact, as Miller et al. found in 2014, we can construct a programming language where all data types are authenticated. In Authenticated Data Structures, Generically the authors create a fork of the OCaml compiler to do exactly that, and prove it to be both sound and performant. The mechanics for doing so are fascinating, but beyond the scope of this post. I highly recommend reading the paper.

One interesting thing to note in Miller et al.’s paper is that they re-contextualize the motivation for authenticated data structures. Earlier in this post, we talked about Merkle trees as useful for commitment schemes and data integrity guarantees, but Miller et al. instead choose to frame them as useful for delegation of data. Specifically, the paper defines an authenticated data structure as one “whose operations can be carried out by an untrusted prover, the results of which a verifier can efficiently check as authentic.”

If we take a moment to think, we can see that this is indeed true. If I have a Merkle tree with millions of elements in it, I can hand it over to a third party, retaining only the top hash, then make queries to this data expecting both a value and an inclusion proof. As long as the proof checks, I know that my data hasn’t been tampered with. In the context of trustless distributed systems, this is significant (we’ll come back to exactly why later, I promise).

In fact, I can authenticate not just reads, but writes! When I evaluate an inclusion proof, the result is a hash that I check against the digest I have saved. So if I request the value at some index in the tree, save the proof, and then write to that same index, I can evaluate the old proof with the value I’m writing to learn what the digest will be after the write has taken place. Once again, an example may be helpful.

Recall our earlier (diagrammed) example, where to prove L3’s inclusion, our proof consists of [(Left, Hash 1-1), (Right, Hash 0)]. If we want to write a new value, we first retrieve L3 and the associated proof. Then, just as we checked our proof by calculating hash(Hash 0 + hash(hash(L3) + Hash 1-1)) and ensured it was equal to the root hash, we calculate hash(Hash 0 + hash(hash(new_L3) + Hash 1-1)) and update our saved digest to the result. If this isn’t intuitive, looking back at the diagram can be really helpful.
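Reusing the fold_proof helper from the earlier sketch, the authenticated write boils down to one line: evaluate the old proof against the new value and save the result as the new digest.

new_root = fold_proof(new_value, proof_for_old_value)  # digest to save after the write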

The combination of authenticated reads and writes allows for some very powerful new constructions. Specifically, by judiciously adding authentication “checkpoints” to a program in Miller et al.’s new language, we can cryptographically ensure that a client and server always agree on program state, even if the client doesn’t retain any of the data the program operates on! This is game-changing for systems that distribute computation to semi-trusted nodes (yes, like blockchains).

This sounds like a wild guarantee with all manner of caveats, but it’s much less exciting than that. Programs ultimately run on overcomplicated Turing machines. Program state is just what’s written to the tape. Once you’ve accepted that all reads and writes can be authenticated for whatever data structure you’d like, the rest is trivial. Much of Miller et al.’s contribution is ultimately just nicer semantics!

“We love the things we love for what they are”

So far, we’ve achieved some fairly fantastical results. We can write code as usual, and cryptographically ensure client and server states are synchronized without one of them even having the data operated upon. This is a powerful idea, and it’s hard not to read it and seek to expand on it or apply it to new domains. Consequently, there have been some extremely impressive developments in the field of authenticated data structures even since 2014.

One work I find particularly notable is Authenticated Data Structures, as a Library, for Free! by Bob Atkey, written in 2016. Atkey builds upon Miller et al.’s work so that it no longer requires a custom compiler, a huge step toward practical adoption. It does require that developers provide an explicit serialization for their data type, as well as a custom retrieval function, but it works with real production OCaml code relatively seamlessly.

There is still, however, the problem of indexing. Up until now we’ve been describing our access in terms of Merkle tree leaves. This works pretty well for data structures like an array, but it’s much harder to figure out how to authenticate something like a hashmap. Mapping the keys to leaves is trivial, but how do you verify that there was a defined value for a given key in the first place?

Consider a simple hashmap from strings to integers. If the custodian of the authenticated hashmap claims that some key “hello” has no defined value, how do we verify that? The delegator could keep a list of all keys and authenticate that, but that’s ugly and inelegant, and effectively grows our digest size linearly with dataset size. Ideally, we’d still like to save only one hash, and synchronizing this key list between client and server is fertile breeding ground for bugs.

Fortunately, Ben Laurie and Emilia Kasper of Google developed a novel solution for this in 2016. Their work is part of Trillian, the library that enables certificate transparency in Chrome. In Revocation Transparency, they introduce the notion of a sparse Merkle tree, a Merkle tree of infeasible size (in their example, depth 256, so a node per thousand atoms in the universe) where we exploit the fact that almost all leaves in this tree have the same value to compute proofs and digests in efficient time.

I won’t go too far into the technical details, but essentially, with 2^256 leaves, each leaf can be assigned a 256-bit index. That means that given some set of key/value data, we can hash each key (yielding a 256-bit digest) and get a unique index into the tree. We associate the hash of the value with that leaf, and have a special null hash for leaves not associated with any value. There’s another diagram below I found very helpful:

“An example sparse Merkle tree of height=4 (4-bit keys) containing 3 keys. The 3 keys are shown in red and blue. Default nodes are shown in green. Non-default nodes are shown in purple.” (Image and caption from AergoBlog)

Now we also know the hash of every layer-two branch that isn’t directly above one of our defined nodes, since it’s just hash(hash(null) + hash(null)). Extending this further, for a given computation we only need to keep track of nodes above at least one of our defined nodes; every other value can be calculated quickly on demand. Calculating a digest, generating a proof, and checking a proof are all logarithmic in the size of our dataset. Also, we can verify that a key has no associated value by simply returning a retrieval proof valid for a null hash.
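A compact sketch of that “default hash” trick (our own illustration, not Trillian’s code): the hash of an empty subtree at each depth is precomputed once, so the 2^256-leaf tree never has to be materialized.

import hashlib

H = lambda data: hashlib.sha256(data).digest()
DEPTH = 256

empty = [H(b"")]  # hash of an unoccupied leaf
for _ in range(DEPTH):
    empty.append(H(empty[-1] + empty[-1]))  # empty subtree one level taller

# empty[DEPTH] is the root of a sparse Merkle tree containing no keys at all.
# An exclusion proof for a key is just an inclusion proof whose leaf hash is empty[0].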

Sparse Merkle trees, while relatively young, have already seen serious interest from industry. Obviously, they are behind Revocation Transparency, but they’re also being considered for Ethereum and Loom. There are more than a few libraries (Trillian being the most notable) that just implement a sparse Merkle tree data store. Building tooling on top of them isn’t particularly hard (check out this cool example).

“Give me a land of boughs in leaf”

As exciting as all these developments are, one might still wish for a “best of all worlds” solution: authenticated semantics for data structures as easy to use as Miller et al.’s, implemented as a lightweight library like Atkey’s, and with the support for natural indexing and exclusion proofs of Laurie and Kasper’s. That’s exactly what Indurative implements.

Indurative uses a new GHC feature called DerivingVia that landed in GHC 8.6 last summer. DerivingVia is designed to allow for instantiating polymorphic functions without either bug-prone handwritten instances or hacky, unsound templating and quasiquotes. It uses Haskell’s newtype system so that library authors can write one general instance which developers can automatically specialize to their type.

DerivingVia means that Indurative can offer authenticated semantics for essentially any indexed type that can be iterated through with binary-serializable keys and values. Indurative works out-of-the-box on containers from the standard library, containers and unordered-containers. It can derive these semantics for any container meeting these constraints, with any hash function (and tree depth), and any serializable keys and values, without the user writing a line of code.

Earlier we briefly discussed the example of adding binary transparency to a package-management server in less than ten lines of code. If developers don’t have to maintain parallel states between the data structures they already work with and their Merkle tree authenticated store, we hope that they can focus on shipping features without giving up cryptographic authenticity guarantees.

Indurative is still alpha software. It’s not very fast yet (it can be made waaaay faster), it may have bugs, and it uses kind of sketchy Haskell (UndecidableInstances, but I think we do so soundly). It’s also new and untested cryptographic software, so you might not want to rely on it for production use just yet. But, we’ve worked hard on commenting all the code and writing tests because we think that even if it isn’t mature, it’s really interesting. Please try it, let us know how it works, and let us know what you want to see.

If you have hard cryptographic engineering problems, and you think something like Indurative might be the solution, drop us a line.

Announcing Manticore 0.3.0

Earlier this week, Manticore leapt forward to version 0.3.0. Advances for our symbolic execution engine now include: “fast forwarding” through concrete execution that you don’t care about, support for Linux binaries statically compiled for AArch64, and an interface for selectively solving for interesting test cases. We’ve been working really hard on these and other features over the past quarter, and we’re excited to share what we’ve built.

Executor Refactor

Felipe Manzano completed a major refactor of Manticore’s state machine. It now uses the multiprocessing module, which could make it easier one day to implement distributed symbolic execution. You can read more details about the state machine in the pull request description. Be advised that it does introduce a few small changes to the API, the most important of which are:

  • You must now explicitly call the finalize method in order to dump test cases after a run. That means that you can inspect a state before deciding whether to invest the time to solve for a test case.
  • The will_start_run callback has been renamed to will_run
  • The solver singleton must now be accessed explicitly as Z3Solver.instance() (a short sketch of the updated API follows this list)
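Here is a rough sketch of what those changes look like in practice (module paths and the callback signature reflect our best understanding of the 0.3.0 API; consult the release notes and documentation for the authoritative version):

from manticore.native import Manticore
from manticore.core.plugin import Plugin
from manticore.core.smtlib.solver import Z3Solver

class Watcher(Plugin):
    # formerly will_start_run_callback
    def will_run_callback(self, ready_states):
        print("about to start exploration")

m = Manticore("./target_binary")  # hypothetical statically linked target
m.register_plugin(Watcher())
m.run()       # explore; no test cases are written yet
m.finalize()  # solve for and dump test cases now

solver = Z3Solver.instance()  # the solver singleton is accessed explicitly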

Unicorn Preloading

Manticore models native instructions in Python, a language that is not known for speed. Instruction throughput is only a tiny fraction of what you’d expect on a concrete CPU, which can be really unfortunate when the code you care about is buried deep within a binary. You might spend several minutes waiting for Manticore to execute a complicated initialization routine before it ever reaches anything of interest.

To handle cases like this, we’ve added a Unicorn emulator plugin that allows Manticore to “fast forward” through concrete execution that you don’t care about. Unicorn is a fast native CPU emulator that leverages QEMU’s JIT engine for better performance. By replacing Manticore’s executor with Unicorn for unimportant initialization routines, we’ve encountered speed improvements of up to 50x. See an example of how to invoke the Unicorn emulator on the pull request.

AArch64 Support

Over the past four months, Nikita Karetnikov added support for Linux binaries statically compiled for AArch64. Since it’s a brand-new architecture, we’ve left in many of the debugging components in order to help us diagnose issues, a decision that may make it a bit slower than other architectures. With the growing popularity of ARMv8 CPUs for platforms ranging from embedded development boards to server farms, we look forward to receiving feedback on this new architecture.

System Call Audit

To provide an accurate symbolic execution environment, Manticore needs symbolic models of all the Linux system calls. Previously, we implemented only a subset of the most common system calls, and Manticore would throw an exception as soon as it encountered an unimplemented call. This is enough to execute many binaries, but there’s room for improvement.

With the 0.3.0 release, we’ve added a dozen new system calls, and added “stubs” to account for the ones we haven’t implemented. Now, instead of throwing an exception when it encounters an unimplemented call, Manticore will attempt to pretend that the call completed successfully. The program may still break afterwards, but we’ve found that this technique is often “good enough” to analyze a variety of problematic binaries. Just be sure to keep your eyes peeled for the “Unimplemented system call” warning message, since further analysis may be unsound if Manticore has ignored an important syscall!

Symbolic EVM Tests

One of the important guarantees that Manticore provides is that when it executes a transaction with a symbol, the result holds for all possible values of that symbol. In order for this to be trustworthy, the symbolic implementation of each instruction needs to be correct. That’s why we’ve extended our continuous integration pipeline to automatically run Manticore against the Frontier version of the Ethereum VM tests on each new commit. This will ensure that throughout further development, you’ll always be able to rely on Manticore to correctly reason about your code.

Black

We believe in clean code, which is why we’ve run Manticore through the black autoformatter. Black deterministically formats your code according to a fairly strict reading of the pycodestyle conventions so that you can focus on the content instead of the formatting. From now on, you should run black -t py36 -l 100 . on your branch before submitting a pull request.

What’s Next?

We believe that security tools are only beneficial if people actually use them, so we want to make Manticore easier for everyone to use. Over the next few months, we have big plans for Manticore’s usability, including improvements to our documentation, updating our examples repository, and conducting a formal usability study. Don’t think we’ll let the code languish, though! Our next release should include support for crytic-compile, making it even easier to analyze smart contracts in Manticore. We’ll continue working towards improved performance and eventual support for EVM Constantinople.

You can download Manticore 0.3.0 from our GitHub, via PyPI, or as a pre-built Docker container.

Using osquery for remote forensics

System administrators use osquery for endpoint telemetry and daily monitoring. Security threat hunters use it to find indicators of compromise on their systems. Now another audience is discovering osquery: forensic analysts. While osquery core is great for querying various system-level data remotely, forensics extensions give it the ability to inspect deeper-level data structures and metadata that aren't available even to a user at the local system. We continued our collaboration with Crypsis, a security consulting company, to show some immediate scenarios where osquery comes in handy for forensic analysts.

Previously, we announced and briefly introduced the features of the new NTFS forensics extension that we added to our osquery-extensions repository. Today, we’ll demonstrate some familiar real-world use-cases for forensic analysts interested in leveraging osquery in their incident response efforts.

Identifying “Timestomping” Attacks

Every interaction with a filesystem leaves a trace. Attackers who want to remain undetected for as long as possible need to clean up these traces. File timestamps, if left unmodified, provide a great deal of information about the attacker’s timeline and behavior. They’re a common focus for both the attacker and the forensic analyst. “Timestomping” is the common name for the anti-forensics tactic of destroying filesystem timestamp evidence of the attacker’s file modifications.

When it comes to covering up evidence in timestamps, NTFS is a little more complicated than other filesystems. To explain, we’ll have to explore some of NTFS’s structure.

The core element of NTFS is the Master File Table (MFT), which stores an entry for every single file on the system. Every entry in the MFT contains a number of attributes that store metadata describing the file. One attribute – $STANDARD_INFORMATION ($SI) – stores a collection of timestamps. Standard files also have a $FILE_NAME ($FN) attribute that contains its own set of timestamps. The timestamps in the $SI attribute roughly correlate to interactions with the contents of the file. The timestamps in the $FN attribute roughly correlate to interactions with the location and name of the file. Finally, directory entries in the MFT have an index attribute that stores a copy of the $FN attribute (including timestamps) for all files in that directory.

Example 1: Timestamp Inconsistency

The simplest example of a timestamp attack is to change the file-creation date to a time prior to incursion. Done poorly, the $FN creation timestamp and $SI creation timestamp won’t match. The discrepancy stands out. To use osquery to find files in a directory whose timestamps don’t match, for example, I’d run the following:

SELECT path,fn_btime,btime from ntfs_file_data where device="\\.\PhysicalDrive0" and partition=3 and directory="/Users/mmyers/Desktop/test_dir" and fn_btime != btime;

We can also look for other forms of timestamp inconsistency. Perhaps the file-creation times are left alone, and thus match, but the last modified time was set to some earlier time to avoid detection. Would you trust a file whose MFT entry’s modified time predates its creation time? Me neither:

SELECT filename, path from ntfs_file_data where device="\\.\PhysicalDrive0" and partition=2 and path="/Users/Garret/Downloads" and (fn_btime > ctime OR btime > ctime);

Example 2: Timestamp Missing Full Precision

Attackers can be lazy sometimes and timestomp a file with a built-in system utility. These utilities have a lower precision for time values than the operating system would naturally use. An analyst can spot this kind of forgery by checking the nanosecond portion of the timestamp — it’s unlikely to be all zeros, unless it has been tampered with.

NTFS timestamps are 64-bit values counting 100-nanosecond intervals since January 1, 1601 (UTC). For example, consider the NTFS timestamp 131683876627452045. If you have a Windows command prompt handy, that’s Monday, April 16, 2018 9:27:43 PM — to be specific, it’s 9:27:42 PM and 0.7452045 seconds, but it was rounded up. Pretty specific! This is what a natural file timestamp looks like.

However, a file timestamp that has been set by a system utility will only have seconds-level precision, and that’s as much detail as most user interfaces show. 131683876620000000 is also Monday, April 16, 2018 9:27:42 PM, but in integer representation it sticks out like a sore thumb. This timestamp was forged.
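For illustration (this is not part of the extension), converting the raw integer and checking its sub-second portion is a few lines of Python:

from datetime import datetime, timedelta, timezone

NTFS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def ntfs_to_datetime(ticks: int) -> datetime:
    # NTFS timestamps count 100-nanosecond intervals since 1601-01-01 UTC
    return NTFS_EPOCH + timedelta(microseconds=ticks // 10)

def looks_timestomped(ticks: int) -> bool:
    # built-in utilities set whole seconds only, so the low seven digits are zero
    return ticks % 10_000_000 == 0

print(ntfs_to_datetime(131683876627452045))   # 2018-04-16 21:27:42.745204+00:00
print(looks_timestomped(131683876620000000))  # True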

At first use, it might seem odd for osquery to output the NTFS timestamps in integer representation, but it serves to make this type of forgery easy to spot for an experienced forensic analyst.

Locating Evidence of Deleted Files

A user clicks a bad link or opens a bad email attachment. The malware goes to work. It downloads a couple of payloads, deploys them, collects some data on the system into a file, sends that data upstream, then deletes itself and all downloaded files from the filesystem. All neat and tidy, right?
Well, maybe not. The contents of those files might not be available any longer, but NTFS is lazy about cleaning up metadata for files, especially in the context of directory indices. A complete explanation of NTFS and directory index management is beyond the scope of this post, but we can provide a high-level overview (readers who are inclined to learn more might wish to read NTFS.com or the documentation by Russon and Fledel of the Linux-NTFS project).

Like any file on NTFS, every directory has an entry in the MFT. These entries have various attributes. The relevant attribute here is the index attribute, which in turn contains copies of the $FN attributes of the directory’s child files, arranged in a tree structure. As files are added and removed from the directory, the contents of the index attribute are updated. Entries in the index are not deleted, though—they’re simply marked as inactive, and may be overwritten later as new entries are added. Even though a file was deleted, a copy of its $FN attribute may still remain in its parent directory’s index for some time afterwards.

The NTFS forensic extension makes finding these entries relatively simple.

Example 3: A Directory’s Unused Filename Entries

Let’s delete all of the files from the last example, and empty the Recycle Bin. Then, let’s look at the unused entries in that folder’s directory index by running the following query:

SELECT parent_path,filename,slack from ntfs_indx_data WHERE parent_path="/Users/mmyers/Desktop/test_dir" and slack!=0;

There’s more information available than just filenames. Since the entire $FN attribute is stored, there are time stamps available as well. We can reconstruct a partial timeline of file activity in a directory just from the index entries. Some extra work is required, though: since directory indices are filename-based, renaming a file will in effect cause the old entry to be marked as inactive, and create a new entry in the index. Differentiating a renamed file from a deleted one will require additional analysis.
Also note that there were three files deleted, but only two files left artifacts in slack. When looking at unused data structures, we are often only seeing a partial record of what used to be there.

Getting Started

This extension offers a fast and convenient way to perform filesystem forensics on Windows endpoints as a part of an incident response. Go get it – and our other osquery extensions – from our repository. We’re committed to maintaining and extending our collection of extensions. Take a look, and see what else we have available. Visit the osquery community on Slack if you need help.

Helping incident responders with remote forensics is an area of increasing capability for osquery. Besides our NTFS forensics extension, osquery already supports file carving, system activity queries, and audit-based monitoring. There is undoubtedly still more that could be added to osquery: remote memory carving, USB device history retrieval, or filesystem forensic metadata for other filesystems.

Join us on June 20th-21st for QueryCon!

Trail of Bits is hosting the QueryCon osquery conference in New York City, June 20th and 21st, 2019. As we have demonstrated in this article with the NTFS forensics extension, there are many potential use cases for osquery extensions, and several QueryCon 2019 talks will explore them specifically. Victor Vrantchan will give a lesson on how to use extensions and logger plugins to integrate osquery with your existing logging infrastructure, and Atul Kabra will speak about enriching osquery with ‘event-driven’ extensions.

As of the time of this writing, tickets for QueryCon are still available! Purchase yours today, and meet with the others from the osquery user and developer community. Bring your ideas for extensions, and participate in the workshop. We look forward to seeing you there!

Fuzzing Unit Tests with DeepState and Eclipser

If unit tests are important to you, there’s now another reason to use DeepState, our Google-Test-like property-based testing tool for C and C++. It’s called Eclipser, a powerful new fuzzer very recently presented in an ICSE 2019 paper. We are proud to announce that Eclipser is now fully integrated into DeepState.

Eclipser provides many of the benefits of symbolic execution in fuzzing, without the high computational and memory overhead usually associated with symbolic execution. It combines “the best of both white-box and grey-box fuzzing” using only lightweight instrumentation and, most critically, never calling an expensive SMT or SAT solver. Eclipser is the first in what we hope (perhaps with your help) to make a series of push-button front-ends to promising tools that require more work to apply than AFL or libFuzzer. Eclipser allows DeepState to quickly detect more hard-to-reach bugs.

What Makes Eclipser Special?

Traditional symbolic execution, supported by DeepState through tools such as Manticore and angr, keeps track of path constraints: conditions on a program’s input such that the program will take a particular path given an input satisfying the constraint. Unfortunately, solving such conditions is difficult and expensive, especially since many constraints are infeasible: they cannot be solved.

Many workarounds for the high cost of solving path constraints have been proposed, but most symbolic-execution based tools are still limited in scalability and prone to failure when asked to produce long paths or handle complex code. Eclipser builds on ideas developed in KLEE and MAYHEM to substitute approximate path constraints for path constraints. These conditions are (as the name suggests) less precise, but much easier to solve. Critically, they don’t require a slow solver. Eclipser still has to solve these approximate, “easy” constraints, but it can assume they are either simple and linear (in which case inexpensive techniques suffice) or at least monotonic, in which case Eclipser uses a binary search instead of a solver call. If the real constraint is neither linear nor monotonic, Eclipser will not be able to generate relevant inputs, but fuzzing may let it make progress despite this failure. In practice, symbolic execution will also often fail because of such constraints, but with a solver timeout, after wasting considerable computational effort. Eclipser will produce some input much more quickly (though not necessarily one satisfying the too-hard-to-solve conditions).
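As a toy illustration of why no solver call is needed (our own sketch, not Eclipser’s implementation), a monotonic approximate constraint can be solved by binary search over candidate input values:

def solve_monotonic(distance, lo=0, hi=2**32 - 1):
    """Find x with distance(x) == 0, assuming distance increases monotonically in x."""
    while lo <= hi:
        mid = (lo + hi) // 2
        d = distance(mid)  # Eclipser observes this by actually running the target
        if d == 0:
            return mid
        if d < 0:
            lo = mid + 1
        else:
            hi = mid - 1
    return None  # not monotonic after all; fall back to plain fuzzing

# For a branch like `2 * intInput + 1 == 31337` (the paper's running example):
print(solve_monotonic(lambda x: 2 * x + 1 - 31337))  # 15668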

Why Should You Care?

Eclipser is interesting primarily because the authors report that it performed better in terms of code coverage on coreutils than KLEE, better in terms of bugs detected on LAVA-M benchmarks than AFLFast, LAF-intel, VUzzer, and Steelix and, most compellingly, better in terms of bugs detected on real Debian packages than AFLFast and LAF-intel. The Debian experiments produced eight new CVEs.

Given this promising performance, we decided to integrate Eclipser into DeepState, making it easy to apply the Eclipser approach to your unit testing. Out of the box, DeepState could already be used with Eclipser. The fuzzer works with any binary that takes a file as input. DeepState works with all file-based fuzzers we have tried. However, it is important to use the right arguments to DeepState with Eclipser, or else Eclipser’s QEMU-based instrumentation will not work. It also takes some manual effort to produce standalone test cases and crashing inputs for DeepState, since Eclipser stores tests in a custom format not usable by other tools. We therefore added a simple front-end to make your life (and our life) easier.

The Eclipser Paper Example

The DeepState examples directory has the code for a DeepState-ized version of the main example used in the Eclipser paper:

#include <deepstate/DeepState.hpp>
#include <assert.h>
#include <string.h>   /* strcmp */

using namespace deepstate;

/* Crashes only when both the integer check and the string check are satisfied. */
int vulnfunc(int32_t intInput, char *strInput) {
   if (2 * intInput + 1 == 31337)
      if (strcmp(strInput, "Bad!") == 0)
         assert(0);
   return 0;
}

TEST(FromEclipser, CrashIt) {
   /* 9 symbolic bytes: a 4-byte int, a 4-byte string, and a terminating NUL. */
   char *buf = (char*)DeepState_Malloc(9);
   buf[8] = 0;
   vulnfunc(*((int32_t*) &buf[0]), &buf[4]);
}

The easiest way to try this example out is to build the DeepState docker image (yes, DeepState now makes it easy to create a full-featured docker image):

$ git clone https://github.com/trailofbits/deepstate
$ cd deepstate
$ docker build -t deepstate . -f docker/Dockerfile
$ docker run -it deepstate bash

Building the docker image will take a while: DeepState, AFL, and libFuzzer are quick, but building Eclipser is a fairly involved process.

Once you are inside the DeepState docker image:

$ cd deepstate/build/examples
$ deepstate-eclipser ./FromEclipser --timeout 30 --output_test_dir eclipser-FromEclipser

Eclipser doesn’t need the full 30 seconds; it produces a crashing input almost immediately, saving it in eclipser-FromEclipser/crash-0. The other fuzzers we tried, AFL and libFuzzer, fail to find a crashing input even if given four hours to generate tests. They generate and execute, respectively, tens and hundreds of millions of inputs, but none that satisfy the conditions to produce a crash. Even using libFuzzer’s value profiles does not help.

Running the experiments yourself is easy:

$ mkdir foo; echo foo > foo/foo
$ afl-fuzz -i foo -o afl-FromEclipser -- ./FromEclipser_AFL --input_test_file @@ --no_fork --abort_on_fail

and

$ mkdir libFuzzer-FromEclipser
$ export LIBFUZZER_EXIT_ON_FAIL=TRUE
$ ./FromEclipser_LF libFuzzer-FromEclipser -use_value_profile=1

You’ll want to interrupt both of these runs when you get tired of waiting.

Both angr and Manticore find this crashing input in a few seconds. The difference is that while Eclipser is able to handle this toy example as well as a binary analysis tool, the binary analysis tools fail to scale to complex problems like testing an ext3-like file system, testing Google’s leveldb, or code requiring longer tests to hit interesting behavior, like a red-black-tree implementation. Eclipser is exciting because it outperforms libFuzzer on both the file system and the red-black-tree, but can still solve “you need symbolic execution” problems like FromEclipser.cpp.

Behind the Scenes: Adding Eclipser Support to DeepState

As noted above, in principle there’s literally “nothing to” adding support for Eclipser, or most file-based fuzzers: DeepState makes it easy for any fuzzer that communicates inputs through files to generate values for parameterized unit tests. However, figuring out the right DeepState arguments to use with a given fuzzer can be difficult. At first we thought Eclipser simply wasn’t working; it turns out it doesn’t work if DeepState forks to run tests. Once we ran DeepState with no_fork, everything went smoothly. Part of our goal in producing front-ends like deepstate-eclipser is to make sure you never have to deal with such mysterious failures. The full code for setting up Eclipser runs, parsing command-line options (translating DeepState tool argument conventions into Eclipser’s arguments), and getting Eclipser to produce standalone test files from the results takes only 57 lines of code. We’d love to see users submit more simple “front-ends” to other promising fuzzers that require a little extra setup to use with DeepState!

So, Is This the Best Fuzzer?

Will some advance in test-generation technology, like Eclipser, obsolete DeepState’s goal of supporting many different back-ends? The answer is “not likely.” While Eclipser is exciting, our preliminary tests indicate that it performs slightly worse than everyone’s favorite workhorse fuzzer, AFL, on both the file system and the red-black tree. In fact, even with the small set of testing problems we’ve explored in some depth using DeepState, we see instances where Eclipser performs best, instances where libFuzzer performs best, and instances where AFL performs best. Some bugs in the red-black tree required a specialized symbolic execution test harness to find (and Eclipser doesn’t help there, we found out). Moreover, even when one fuzzer performs best overall for an example, it may not be best at finding some particular bug for that example.

The research literature and practical wisdom of fuzzer use repeatedly show that, even when a fuzzer is good enough to “beat” other fuzzers (and thus get a paper published at ICSE), it will always have instances where it performs worse than an “old,” “outdated” fuzzer. In fuzzing, diversity is not just helpful, it’s essential, if you really want the best chance to find every last bug. No fuzzer will be best for all programs under test, or for all bugs in a given real-world program.

The authors of the Eclipser paper recognize this, and note that their technique is complementary to that used in the Angora fuzzer. Angora shares some of Eclipser’s goals, but relies on metaheuristics about branch distances, rather than approximate path conditions, and uses fine-grained taint analysis to penetrate some branches Eclipser cannot handle. Angora also requires source code. One big advantage of Eclipser is that, unlike AFL (in non-QEMU mode) or libFuzzer, it doesn’t require you to rebuild any libraries you want to test with DeepState with additional instrumentation. At the time the Eclipser paper was written, Angora was not available to compare against, but it was recently released and is another good candidate for full integration with DeepState.

Eclipser is a great horse to add to your fuzzer stable, but it won’t win every race. As new and exciting fuzzers emerge, DeepState’s ability to support many fuzzers will only become more important. Using a diverse array of fuzzers is easy if it’s a matter of changing a variable and doing FUZZER=FOO make; deepstate-foo ./myprogram, and practically impossible if it requires rewriting your tests for every tool. In the near future, we plan to make life even easier, and support an automated ensemble mode where DeepState makes use of multiple fuzzers to test your code even more aggressively, without any effort on your part other than deciding how many cores you want to use.

Announcing Automated Reverse Engineering Trainings

Trail of Bits is excited to announce new training offerings for automated reverse engineering with Binary Ninja.

We’ve been raving about Vector35’s Binary Ninja for years. We’ve used it to:

That work, and a whole lot of correspondence, has garnered high praise from an author of Binary Ninja:

Josh is without a doubt our most knowledgeable Binary Ninja user. We pay attention very closely to any of his feedback and we couldn’t think of a better third-party instructor to teach about how to use Binary Ninja to solve reverse engineering problems.

– Jordan Wiens, Co-Founder, Vector35

If you’re doing any amount of manual reverse engineering, you really should consider learning to use Binary Ninja. Its API is much clearer than its competitors’, there’s more documentation on it as well as lots of examples, and you can find what you need quickly.

Binary Ninja is a much more modern design than other binary analysis tools. Vector35 built it from the ground up with the intention to continue to innovate on top of it, and avoid handcuffing themselves with past design choices. They’re constantly adding more new features and better analysis, which is exposed to allow you to write plugins on top of it and create your own tooling.

Automation is also much easier: because those analyses are built in, you don’t have to implement them yourself. Everything is lifted to an architecture-agnostic intermediate language, so you can perform the same analysis on any architecture that Binary Ninja can disassemble. If you write your own architecture plugin and implement the lifter using the API, you get all of that analysis for free.
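
To give a concrete sense of that automation, here is a minimal sketch of a headless script that walks every function’s MLIL and prints its call sites. This is our own illustration rather than an excerpt from the training material, and “example.bin” is a placeholder path.

    # Minimal sketch: enumerate call sites via the architecture-agnostic MLIL.
    # "example.bin" is a placeholder, not a real sample.
    import binaryninja
    from binaryninja import MediumLevelILOperation

    bv = binaryninja.BinaryViewType.get_view_of_file("example.bin")
    bv.update_analysis_and_wait()

    for func in bv.functions:
        for block in func.medium_level_il:
            for instr in block:
                if instr.operation == MediumLevelILOperation.MLIL_CALL:
                    print("{}: call at {:#x} -> {}".format(
                        func.name, instr.address, instr.dest))

Because the loop operates on the lifted MLIL rather than raw disassembly, the same script works unchanged on any architecture for which Binary Ninja (or your own plugin) provides a lifter.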

If that weren’t enough to get your attention, Binary Ninja is significantly less expensive than its major competitors.

Master Binary Ninja with Help from Industry Experts

You could learn Binary Ninja by yourself. Vector35 has done a great job publishing helpful materials, managing a healthy Slack community, and giving informative presentations.

However, if you can’t bill for hours spent studying, consider our modular trainings. They can be organized to suit your company’s needs. You choose the number of skills and days to spend honing them. Here’s what you can learn and accomplish:

  • Reverse Engineering with Binary Ninja (1 day)
    By the end of this one-day module, you will be able to reverse engineer software and automate simple tasks, and you’ll be ready to dive into the primary module, Automated Reverse Engineering.
  • Automated Reverse Engineering with Binary Ninja (2 days)
    Take your reverse engineering skills to the next level. This two-day training module dives deeper into the Python API. By the end of the module, you will be able to automate common analysis tasks, as well as extend Binary Ninja’s built-in functionality with plugins.
  • Automated Malware Analysis with Binary Ninja (2 days)
    Building on the Automated Reverse Engineering module, this two-day module provides a toolbox for tackling the advanced techniques that malware uses to hide or obscure its functionality. By the end of the module, you will be able to write plugins that detect and deobfuscate strings and control flow to make sense of a binary’s functionality, as well as script detection routines that identify malicious behavior for batch processing.
  • Automated Vulnerability Research with Binary Ninja (2 days)
    Adding to the Automated Reverse Engineering module, this two-day module gives you the tools to automate bug-hunting tasks in binary applications, then write exploit payloads in C with Binary Ninja. Exercises are provided in a friendly Capture-the-Flag format.
  • Custom Loaders and Architectures (1 day)
    This one-day module trains you to expand Binary Ninja’s support for new file types and architectures. You will also learn how to extend existing architecture plugins. At the end of the module, you will be able to reverse engineer an instruction set, and implement disassemblers, lifters, and loader plugins.
  • Extending Binary Ninja with the C++ API (1 day)
    This one-day module demonstrates the differences between the various APIs and how to write effective Binary Ninja plugins in C++. At the end of the module, you will be able to develop standalone applications that interface with Binary Ninja’s core.

Download a PDF containing all of these modules’ descriptions.

Empower Your Analysts to do More

Reverse engineering offers tremendous potential, but if you do it manually, you’re wasting a lot of time and intelligence. Automate your reverse engineering with Binary Ninja, and accelerate your capabilities with our training modules.

Contact us to schedule a training.

Slither: The Leading Static Analyzer for Smart Contracts

We have published an academic paper on Slither, our static analysis framework for smart contracts, in the International Workshop on Emerging Trends in Software Engineering for Blockchain (WETSEB), colocated with ICSE.

Our paper shows that Slither outperforms other static analysis tools at finding issues in smart contracts in terms of speed, robustness, and the balance between true detections and false positives. The paper also details how a sophisticated intermediate representation based on Static Single Assignment (SSA) form, a key advance in the development of modern optimizing compilers, lets Slither work quickly and effectively, and makes it easy to extend Slither to new tasks.

Overview and applications

First, we describe Slither’s design and what it can do. Slither is a static analysis framework that provides fine-grained information about smart contract code and has the flexibility to support many applications. The framework is currently used for the following:

  • Automated vulnerability detection. A large variety of smart contract bugs can be detected without user intervention or additional specification effort.
  • Automated optimization detection. Slither detects code optimizations that the compiler misses.
  • Code understanding. Slither summarizes and displays contracts’ information to aid your study of the codebase.
  • Assisted code review. A user can interact with Slither through its API.

Slither works as follows:

  1. It takes as initial input the Solidity Abstract Syntax Tree (AST) generated by the Solidity compiler. Slither works out of the box with the most common frameworks, including Truffle, Embark, and Dapp. You just point Slither at a contract to analyze.
  2. It then generates important information, such as the contract’s inheritance graph, the control flow graph (CFG), and the list of all expressions in the contract.
  3. Slither then translates the code of the contract into SlithIR, an internal representation language that makes precise and accurate analyses easier to write.
  4. Finally, Slither runs a set of pre-defined analyses that provide enhanced information to the other modules (e.g., data flow computation and identification of protected function calls).

Fig. 1: How Slither works
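
To give a flavor of the API behind the workflow above, here is a minimal sketch (ours, not from the paper) that loads a contract, walks the inheritance graph and each function’s CFG, and prints the SlithIR operations; “Token.sol” is a placeholder path.

    # Minimal sketch: inspecting a contract through Slither's Python API.
    # "Token.sol" is a placeholder; any contract that solc can compile works.
    from slither.slither import Slither

    slither = Slither("Token.sol")  # parses the solc AST and builds SlithIR

    for contract in slither.contracts:
        bases = [base.name for base in contract.inheritance]
        print("{} inherits from {}".format(contract.name, bases))
        for function in contract.functions:
            for node in function.nodes:      # CFG nodes
                for ir in node.irs:          # SlithIR operations
                    print("  {}: {}".format(function.name, ir))

The same objects (contracts, functions, CFG nodes, and SlithIR operations) are what Slither’s built-in detectors consume.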

Slither vs. the World

An important part of our paper focuses on comparing Slither to other smart contract static analysis tools. We contrast Slither (release 0.5.0) with other open-source static analysis tools to detect vulnerabilities in Ethereum smart contracts: Securify (revision 37e2984), SmartCheck (revision 4d3367a) and Solhint (release 1.1.10). We decided to focus our evaluation almost exclusively on the tools’ reentrancy detectors, since reentrancy is one of the oldest, best understood, and most dangerous security issues. Figure 2 shows the classic example of a simple reentrant contract that can be exploited to drain all of its ether by calling withdrawBalance with a fallback function that calls withdrawBalance again.

Fig. 2: An exploitable reentrant contract

The reentrancy detector is one of the few that is available in all the tools we evaluated. Furthermore, we experimented with one thousand of the most used contracts (those with the largest number of transactions) for which Etherscan provides the source code, to obtain the following results:

Fig. 3: Slither outperforms the other tools in every category

We ran each tool on this dataset of one thousand contracts with a timeout of 120 seconds per contract, using only the reentrancy detectors; we manually disabled the other detection rules to avoid introducing bias into the measurements.
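
Restricting Slither to a single detector, as we did in this experiment, is straightforward both from the command line (in recent releases, slither contract.sol --detect reentrancy-eth) and from the Python API. The sketch below is illustrative rather than the exact harness we used; detector names and classes have shifted slightly between releases, and “Vulnerable.sol” is a placeholder path.

    # Minimal sketch: run only a reentrancy detector on a single contract.
    # "Vulnerable.sol" is a placeholder; ReentrancyEth matches recent releases.
    from slither.slither import Slither
    from slither.detectors import all_detectors

    slither = Slither("Vulnerable.sol")
    slither.register_detector(all_detectors.ReentrancyEth)

    for detector_results in slither.run_detectors():
        for finding in detector_results:
            print(finding["description"])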

In summary, we observed the following strengths in our tool in terms of vulnerability detection:

  • Accuracy. The False positives, Flagged contracts, and Detections per contract rows summarize accuracy results. Our experiments reveal that Slither is the most accurate tool, with the lowest false positive rate (10.9%), followed by Securify (25%). By contrast, SmartCheck and Solhint have extremely high false positive rates: 73.6% and 91.3% (!), respectively.
    Additionally, we include the number of contracts for which at least one reentrancy is detected (flagged contracts) and the average number of findings per flagged contract. On one hand, SmartCheck flags a larger number of contracts, confirming its high false positive rate: it flags about seven times as many contracts as Slither, and its false positive rate is roughly seven times higher. On the other hand, Securify flags a very small number of contracts, which indicates that it misses true positives found by the other tools; note that although Securify flags far fewer contracts than Slither, a larger share of what it flags is false positives.
  • Performance. The Average execution time and Timed-out analyses rows summarize performance results, confirming that Slither is the fastest tool, followed by Solhint, SmartCheck, and, finally, Securify. In our experiments, Slither was typically as fast as a simple linter such as Solhint, and faster than tools that parse Solidity source code (SmartCheck) or analyze compiled contracts (Securify).
  • Robustness. The Failed analyses row summarizes robustness results, showing that Slither is the most robust tool, followed by Solhint, SmartCheck, and Securify. Slither failed on only 0.1% of the contracts, while Solhint failed on around 1.2%. SmartCheck and Securify are less robust, failing on 10.22% and 11.20% of the contracts, respectively.

We also compared Slither to Surya, the most similar tool for smart contract code understanding. We found that Slither includes all the important information provided by Surya, but is able to integrate more advanced information due to the static analyses it performs. Code understanding tools that do not incorporate deeper analyses are limited to superficial information, while Slither is easily extensible to more sophisticated code summarization tasks.

The Talk

The paper was presented by our security engineers, Josselin Feist and Gustavo Grieco, at WETSEB 2019 on May 27th.

Beyond the Paper

Slither is constantly evolving. We recently released version 0.6.4, and several improvements and features have been added since we wrote the paper, including automated checks for upgradeable contracts and Visual Studio integration. We are proud to have more than 30 open-source detectors, and Slither has about the same number of private detectors for race conditions, weak cryptography, and other critical flaws.

Slither is the core of crytic.io, our continuous assurance system (think “Travis-CI but for Ethereum”), which unleashes all the Slither analyses to protect smart contracts.

Contact us, or join the Empire Hacking Slack, if you need help integrating Slither into your development process, or if you want to learn more about Slither’s capabilities.

Announcing the community-oriented osquery fork, osql

For months, Facebook has been heavily refactoring the entire osquery codebase, migrating osquery away from standard development tools like CMake and integrating it with Facebook’s internal tooling. Their intention was to improve code quality, implement additional tests, and move the project to a more modular architecture. In practice, the changes sacrificed support for a number of architectures, operating systems, and a variety of useful developer tools that integrate well only with the standard build system preferred by the open-source C++ community.

Worse still, the project’s new inward focus has greatly delayed the review of community contributions — effectively stalling development of features or fixes for the needs of the community — without a clear end in sight. Lacking a roadmap or predictable release cycle, user confidence in the project has fallen. Enterprises are postponing their planned osquery deployments and searching for alternative solutions.

Many of the most secure organizations in the world have already invested in making osquery the absolute best endpoint management solution for their needs. Being forced to look elsewhere would be a waste of their investment, and leave them relying on less effective alternatives. That is why we are announcing the community-oriented osquery fork: osql.

What are the goals of osql?

With osql, we are committed to restoring the community’s confidence in the osquery project, to making the development process more open and predictable, and to reviewing and accepting community contributions more quickly. Above all, our goal is direct community participation.

An open and transparent development process

In the immediate term, osql will be maintained as a “soft-fork.” We will closely track Facebook’s upstream updates without diverging from the codebase. Plenty of completed work is simply waiting upstream, in Pull Requests. We prepared a workflow through which the osql project can accept Pull Requests that the community deems stable enough to be shipped, but which have been ignored by the upstream maintainers. The community can pick and choose its priorities from those contributions, and incorporate them into the next release of osql.


The osql organization on GitHub will be a hub for community projects

Continuous Integration, Continuous Delivery

We’ve also integrated a much-needed public CI using Azure Pipelines, which builds and runs tests at each commit. Find the results here. The CI will help us build, test, and release faster and more frequently. We are committing to releasing a new osql binary (package installer) on a regular monthly cadence, and we will communicate the changes users can expect in each release; they will know when to expect it, and that the version they download has passed all tests.


Determine if the latest code is building for all platforms, at a glance

Restoring standard tool support for developers

We rewrote the build system from scratch to return it to CMake, the C++ community’s de facto standard for building projects. This effort was non-trivial, but we believe it was central to preserving the project’s compatibility with open-source toolchains. The libraries and tools that form the foundation of modern C++ development, such as Boost and the LLVM/Clang compiler toolchain, all support CMake natively. The most-used third-party libraries use CMake as well, making it easy to include them in a CMake-based project.

Developers benefit from built-in CMake support in their IDEs. Visual Studio, VS Code, CLion and QtCreator can all easily open a project from its CMakeLists file, enabling a precise view of the project’s structure and the outputs of its build process. They’ll also regain the convenience of CMake-supporting static analyzer frameworks, like Clang’s scan-build, which helps discover critical bugs across an entire project.

By re-centering everything around a CMake build process, we made osql a more developer-friendly project than upstream osquery. If you would like to see for yourself and begin contributing to osql, check out the build guide.


Work conveniently in the Visual Studio Code IDE, with CMake integration

What’s next for osql

Our work is just beginning! We plan to continue improving the automation of osql releases. Initially, osql releases will be unsigned binaries/packages. The next priority for the project is to add a secure code-signing step to the CI procedure, so that every release is a binary signed by the “osql” organization.

The osquery project’s build process used to let you choose whether to download or build the third-party dependencies, thanks to easily modifiable Homebrew formulas. Not only that, you could also choose where these dependencies were downloaded from. That is no longer true of current osquery, but we will restore the ability in osql (a task made easier thanks to CMake).

We also plan to extend the public CI for osql to enable it to test PRs opened against upstream osquery. This will help the community review those PRs, and provide a kind of quality assurance for their inclusion in a future release of osql.

In the longer term, thanks to CMake’s support for building on various platforms, it will be possible to build osql for whatever new systems the community demands.

Want More? Let’s Talk

When we originally ported osquery to Windows, we didn’t imagine it would become so big, or that it would outgrow what Facebook alone could maintain. A whole community of organizations now deploy and depend on osquery. That’s why we’ve launched osql, the community-oriented osquery fork. If you are part of this community and are interested in porting to other platforms, need special features from the project, or want some customization done to the core, join our osquery/osql support group or contact us!

Announcing QueryCon 2019

Exciting news: We’re hosting the second annual QueryCon on June 20th-21st in New York City, co-sponsored by Kolide and Carbon Black!

Register here

QueryCon has become the foremost event for the osquery and osql open-source community. QueryCon brings together core maintainers, developers, and end-users to teach, discuss, and collaborate on Facebook’s award-winning open-source endpoint detection tool.

Last year’s inaugural conference (hosted by Kolide in San Francisco) boasted 120 attendees, 16 speakers, and talk topics ranging from ‘super features’ to ‘the extensions skunkworks’ to ‘catching everything with osquery events.’ This year, we’re switching coasts and growing the event in honor of the growing community. Join us for what is sure to be a great event!

Event details

Conference room at the venue

  • When: June 20th – 21st
  • Where: Convene at 32 Old Slip in downtown Manhattan, just steps from Wall Street and the New York Stock Exchange.
  • What to expect:
    • Two days of talks by osquery and osql engineers, users, and fans — no sales talks
    • Structured time to discuss and collaborate on fixing issues and improving the project
    • Networking with users and experts
    • Sponsored afterparty in downtown Manhattan

Learn more and register

Make sure to buy your tickets ASAP — last year’s event sold out!

Call for Papers

Would you like to be a featured speaker at this year’s QueryCon? You’re in luck: Speaker slots are still open.

Apply here!

About Trail of Bits

It’s no secret that we are huge fans of osquery. From porting osquery to Windows in 2016 to launching our osquery extension repo last year, we’ve been one of the leading contributors to the tool’s development.

Trail of Bits helps secure the world’s most targeted organizations and products. We combine high-end security research with a real-world attacker mentality to reduce risk and fortify code.

We’re a security research and engineering firm headquartered in New York City. Our engineering services team works closely with business customers in tech, defense, and finance on quick-response feature development, bug fixes, and integration of the tools they depend on for endpoint detection and response, event log aggregation, secure software updates, and security testing.

We leverage the best of open-source software for our work, and regularly contribute enhancements to these projects as a result. In this way, we plan to bring projects like osquery, Santa, Omaha and StreamAlert to parity with the leading proprietary alternatives.