Slither – a Solidity static analysis framework

Slither is the first open-source static analysis framework for Solidity. Slither is fast and precise; it can find real vulnerabilities in a few seconds without user intervention. It is highly customizable and provides a set of APIs to inspect and analyze Solidity code easily. We use it in all of our security reviews. Now you can integrate it into your code-review process.

We are open sourcing the core analysis engine of Slither. This core provides advanced static-analysis features, including an intermediate representation (SlithIR) with taint tracking capabilities on top of which complex analyses (“detectors”) can be built. We have built many detectors, including ones that detect reentrancy and suicidal contracts. We are open sourcing some as examples.

If you are a smart-contract developer, a security expert, or an academic researcher, then you will find Slither invaluable.

Built for continuous integration

Slither has a simple command line interface. To run all of its detectors on a Solidity file, this is all you need: $ slither contract.sol

You can integrate Slither into your development process without any configuration. Run it on each commit to check that you are not adding new bugs.
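For instance, a minimal pre-commit hook might look like the sketch below. It simply runs slither on each staged Solidity file; the hook assumes a non-zero exit status means Slither reported findings, so adapt the check to your own workflow.

#!/usr/bin/env python3
# pre-commit hook sketch: run Slither on every staged Solidity file.
# Assumption: a non-zero exit status from Slither indicates findings.
import subprocess
import sys

staged = subprocess.check_output(
    ['git', 'diff', '--cached', '--name-only', '--diff-filter=ACM']
).decode().splitlines()

failed = False
for path in staged:
    if path.endswith('.sol'):
        # Same invocation as above: slither <file>
        result = subprocess.run(['slither', path])
        failed = failed or result.returncode != 0

sys.exit(1 if failed else 0)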

Helps automate security reviews

Slither provides an API to inspect Solidity code via custom scripts. We use this API to rapidly answer unique questions about the code we’re reviewing. We have used Slither to:

  • Identify code that can modify a variable’s value.
  • Isolate the conditional logic statements that are influenced by a particular variable’s value.
  • Find other functions that are transitively reachable as a result of a call to a particular function.

For example, the following script will show which function(s) in myContract write to the state variable myVar:

# function_writing.py
import sys
from slither.slither import Slither

if len(sys.argv) != 2:
    print('Usage: python function_writing.py file.sol')
    sys.exit(-1)

# Init slither
slither = Slither(sys.argv[1])

# Get the contract
contract = slither.get_contract_from_name('myContract')

# Get the variable
myVar = contract.get_state_variable_from_name('myVar')

# Get the functions writing the variable
funcs_writing_myVar = contract.get_functions_writing_to_variable(myVar)

# Print the result
print('Functions that write to "myVar": {}'.format([f.name for f in funcs_writing_myVar]))

Figure 1: Slither API Example

Read the API documentation and the examples to start harnessing Slither.

Aids in understanding contracts

Slither comes with a set of predefined “printers” which show high-level information about the contract. We included four that work out-of-the-box to print essential security information: a contract summary, a function summary, a graph of inheritance, and an authorization overview.

1. Contract summary printer

Gives a quick summary of the contract, showing the functions and their visibility:

Figure 2: Contract Summary Printer

2. Function summary printer

Shows useful information for each function, such as the state variables read and written, or the functions called:

Figure 3: Function Summary Printer

3. Inheritance printer

Outputs a graph highlighting the inheritance dependencies of all the contracts:

Figure 4: Inheritance Printer

4. Authorization printer

Shows what a user with privileges can do on the contract:

Figure 5: Authorization Printer

See Slither’s documentation for information about adding your own printers.

A foundation for research

Slither uses its own intermediate representation, SlithIR, to build innovative vulnerability analyses on Solidity. It provides access to the CFG of the functions, the inheritance of the contracts, and lets you inspect Solidity expressions.

Many academic tools, such as Oyente or MAIAN, advanced the state of the art when they were released. However, each academic team had to invent its own framework, built for only the limited scope of their particular area of interest. Maintenance quickly became a challenge. In contrast, Slither is a generic framework. Because it’s capable of the widest possible range of security analyses, it is regularly maintained and used by our open source community.

If you are an academic researcher, don’t spend time and effort parsing and recovering information from smart contracts. Prototype your new innovations on top of Slither, complete your research sooner, and ensure it maintains its utility over time.

It’s easy to extend Slither’s capabilities with new detector plugins. Read the detector documentation to start writing your own.
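To give a rough idea of the shape of a detector (this is a sketch, not one of our shipped detectors), the skeleton below flags constructors that perform external calls. The module path, required class attributes, and result format vary between Slither versions, so defer to the detector documentation for the authoritative interface.

# my_detector.py -- sketch of a Slither detector plugin.
# Module paths, required attributes, and the result format may differ
# between Slither versions; see the detector documentation.
from slither.detectors.abstract_detector import AbstractDetector, DetectorClassification


class ConstructorExternalCall(AbstractDetector):
    """Hypothetical detector: flag constructors that make external calls."""

    ARGUMENT = 'constructor-external-call'
    HELP = 'Constructor performs an external call'
    IMPACT = DetectorClassification.INFORMATIONAL
    CONFIDENCE = DetectorClassification.MEDIUM

    def detect(self):
        results = []
        for contract in self.slither.contracts:
            for function in contract.functions:
                # high_level_calls lists calls into other contracts
                if function.is_constructor and function.high_level_calls:
                    results.append({'contract': contract.name,
                                    'function': function.name})
        return results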

Next steps

Slither can find real vulnerabilities in a few seconds with minimal or no user interaction. We use it on all of our Solidity security reviews. You should too!

Many of our ongoing projects will improve Slither, including:

  • API enhancements: Now that we have open sourced the core, we intend to provide the most effective static analysis framework possible.
  • More precise built-in analyses: We plan to make several new layers of information, such as value tracking, accessible to the API.
  • Toolchain integration: We plan to combine Slither with Manticore, Echidna, and Truffle to automate the triage of issues.

Questions about Slither’s API and its core framework? Join the Empire Hacking Slack. Need help integrating Slither into your development process? Want access to our full set of detectors? Contact us.

Introduction to Verifiable Delay Functions (VDFs)

Finding randomness on the blockchain is hard. A classic mistake developers make when trying to acquire a random value on-chain is to use quantities like future block hashes, block difficulty, or timestamps. The problem with these schemes is that they are vulnerable to manipulation by miners. For example, suppose we are trying to run an on-chain lottery where users guess whether the hash of the next block will be even or odd. A miner then could bet that the outcome is even, and if the next block they mine is odd, discard it. Here, tossing out the odd block slightly increases the miner’s probability of winning the lottery. There are many real-world examples of “randomness” being generated via block variables, but they all suffer from the unavoidable problem that it is computationally easy for observers to determine how choices they make will affect the randomness generated on-chain.

Another related problem is electing leaders and validators in proof of stake protocols. In this case it turns out that being able to influence or predict randomness allows a miner to affect when they will be chosen to mine a block. There are a wide variety of techniques for overcoming this issue, such as Ouroboros’s verifiable secret-sharing scheme. However, they all suffer from the same pitfall: a non-colluding honest majority must be present.

In both of the above scenarios it is easy for attackers to see how different inputs affect the result of a pseudorandom number generator. This led Boneh et al. to define verifiable delay functions (VDFs). VDFs are functions that require a moderate amount of sequential computation to evaluate, but once a solution is found, it is easy for anyone to verify that it is correct. Think of VDFs as a time delay imposed on the output of some pseudorandom generator. This delay prevents malicious actors from influencing the output of the pseudorandom generator, since all inputs will be finalized before anyone can finish computing the VDF.

When used for leader selection, VDFs offer a substantial improvement over verifiable random functions. Instead of requiring a non-colluding honest majority, VDF-based leader selection only requires the presence of any honest participant. This added robustness is due to the fact that no amount of parallelism will speed up the VDF, and any non-malicious actor can easily verify anyone else’s claimed VDF output is accurate.

VDF Definitions

Given a delay time t, a verifiable delay function f must be both

  1. Sequential: anyone can compute f(x) in t sequential steps, but no adversary with a large number of processors can distinguish the output of f(x) from random in significantly fewer steps
  2. Efficiently verifiable: Given the output y, any observer can verify that y = f(x) in a short amount of time (specifically log(t)).

In other words, a VDF is a function which takes exponentially more time to compute (even on a highly parallel processor) than it does to verify on a single processor. Also, the probability of a verifier accepting a false VDF output must be extremely small (chosen by a security parameter λ during initialization). The condition that no one can distinguish the output of f(x) from random until the final result is reached is essential. Suppose we are running a lottery where users submit 16-bit integers and the winning number is determined by giving a seed to a VDF that takes 20 min to compute. If an adversary can learn 4 bits of the VDF output after only 1 min of VDF computation, then they might be able to alter their submission and boost their chance of success by a factor of 16!

Before jumping into VDF constructions, let’s examine why an “obvious” but incorrect approach to this problem fails. One such approach would be repeated hashing. If the computation of some hash function h takes t steps to compute, then using f = h(h(...h(x))) as a VDF would certainly satisfy the sequential requirement above. Indeed, it would be impossible to speed this computation up with parallelism since each application of the hash depends entirely on the output of the previous one. However, this does not satisfy the efficiently verifiable requirement of a VDF. Anyone trying to verify that f(x) = y would have to recompute the entire hash chain. We need the evaluation of our VDF to take exponentially more time to compute than to verify.
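To make the failure concrete, here is a small Python sketch of the hash-chain approach: evaluation is sequential as desired, but verification has no shortcut, so the verifier does exactly as much work as the prover.

import hashlib

def hash_chain(x: bytes, t: int) -> bytes:
    # f(x) = h(h(...h(x))): t sequential SHA-256 applications
    for _ in range(t):
        x = hashlib.sha256(x).digest()
    return x

# Evaluation is inherently sequential: each step needs the previous digest.
y = hash_chain(b'seed', 1_000_000)

# But the only way to check y is to recompute the entire chain, so this
# construction is not efficiently verifiable and is not a VDF.
assert y == hash_chain(b'seed', 1_000_000)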

VDF Candidates

There are currently three candidate constructions that satisfy the VDF requirements. Each one has its own potential downsides. The first was outlined in the original VDF paper by Boneh et al. and uses injective rational maps. However, evaluating this VDF requires a somewhat large amount of parallel processing, leading the authors to refer to it as a “weak VDF.” Later, Pietrzak and Wesolowski independently arrived at extremely similar constructions based on repeated squaring in groups of unknown order. At a high level, here’s how the Pietrzak scheme works.

  1. To set up the VDF, choose a time parameter T, a finite abelian group G of unknown order, and a hash function H from bytes to elements of G.
  2. Given an input x, let g = H(x) and evaluate the VDF by computing y = g^(2^T) via T sequential squarings in G

The repeated squaring computation is not parallelizable and reveals nothing about the end result until the last squaring. These properties are both due to the fact that we do not know the order of G. That knowledge would allow attackers to use group theory based attacks to speed up the computation.

Now, suppose someone asserts that the output of the VDF is some number z (which may or may not be equal to y). This is equivalent to showing that z = v^(2^(T/2)) and v = g^(2^(T/2)). Since both of the previous equations have the same exponent, they can be verified simultaneously by checking a random linear combination, e.g., v^r · z = (g^r · v)^(2^(T/2)), for a random r in {1, …, 2^λ} (where λ is the security parameter). More formally, the prover and verifier perform the following interactive proof scheme:

  1. The prover computes v = g^(2^(T/2)) and sends v to the verifier
  2. The verifier sends a random r in {1, …, 2^λ} to the prover
  3. Both the prover and verifier compute g1 = g^r · v and z1 = v^r · z
  4. The prover and verifier recursively prove that z1 = g1^(2^(T/2))

The above scheme can be made non-interactive using a technique called the Fiat-Shamir heuristic. Here, the prover generates a challenge r at each level of the recursion by hashing (G, g, z, T, v) and appending v to the proof. In this scenario the proof contains log2(T) group elements, and generating it requires approximately (1 + 2/√T)·T group multiplications.
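The toy Python sketch below walks through the non-interactive protocol end to end. It uses a small RSA-style modulus whose factorization we know, which is fine for illustration but useless for security (and it ignores the low-order caveat discussed in the next section); the constants and helper names are ours, not from any production implementation.

import hashlib

# Toy group: integers modulo an RSA-style modulus with a KNOWN factorization.
# Real instantiations need a group of unknown order (see below).
N = 100003 * 100019
LAMBDA = 16               # toy security parameter; real schemes use ~128 bits

def evaluate(g, T):
    # Sequential evaluation: y = g^(2^T) mod N via T repeated squarings.
    y = g
    for _ in range(T):
        y = (y * y) % N
    return y

def challenge(g, y, T, v):
    # Fiat-Shamir challenge derived from the transcript.
    data = '{}|{}|{}|{}|{}'.format(N, g, y, T, v).encode()
    return 1 + int.from_bytes(hashlib.sha256(data).digest(), 'big') % (2 ** LAMBDA)

def prove(g, T):
    # Prover: evaluate the VDF and emit the log2(T) midpoints as the proof.
    # (A real prover reuses stored intermediate values instead of recomputing.)
    y = evaluate(g, T)
    proof, gi, yi, t = [], g, y, T
    while t > 1:
        v = evaluate(gi, t // 2)        # midpoint: v = gi^(2^(t/2))
        r = challenge(gi, yi, t, v)
        gi = (pow(gi, r, N) * v) % N    # combined base:  gi^r * v
        yi = (pow(v, r, N) * yi) % N    # combined claim: v^r * yi
        proof.append(v)
        t //= 2
    return y, proof

def verify(g, y, T, proof):
    # Verifier: only log2(T) rounds of cheap LAMBDA-bit exponentiations.
    gi, yi, t = g, y, T
    for v in proof:
        r = challenge(gi, yi, t, v)
        gi = (pow(gi, r, N) * v) % N
        yi = (pow(v, r, N) * yi) % N
        t //= 2
    return t == 1 and yi == (gi * gi) % N   # base case: yi must equal gi^2

T = 2 ** 12                                 # T must be a power of two here
y, proof = prove(5, T)
assert verify(5, y, T, proof)
assert not verify(5, (y * 7) % N, T, proof)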

Security Analysis of Pietrzak Scheme

The security of Pietrzak’s scheme relies on the security of the low order assumption: it is computationally infeasible for an adversary to find an element of low order in the group being used by the VDF. To see why finding an element of low order breaks the scheme, first assume that a malicious prover Eve found some element m of small order d. Then Eve sends z·m to the verifier (where z is the valid output). The invalid output will be accepted with probability 1/d since:

  1. When computing the second step of the recursion, we will have the base element g1 = g^r · v, where v = g^(2^(T/2)) · m, and need to show that g1^(2^(T/2)) = v^r · (z·m)
  2. The m term on the left hand side is m^(2^(T/2))
  3. The m term on the right hand side is m^(r+1)
  4. Since m has order d, these two will be equal when r+1 = 2^(T/2) mod d, which happens with probability 1/d

To see a full proof of why the low order assumption is both necessary and sufficient to show Pietrzak’s scheme is sound, see Boneh’s survey of recent VDF constructions.

The security analysis assumes that one can easily generate a group of unknown order that satisfies the low order assumption. We will see below that no currently known groups satisfy these constraints while also being amenable to a trustless setup, i.e., a setup in which no party can subvert the VDF protocol.

For example, let’s try to use everyone’s favorite family of groups: the integers modulo the product of two large primes (RSA groups). These groups have unknown order, since finding the order requires factoring a large integer. However, they do not satisfy the low order assumption. Indeed, the element -1 is always of order 2. This situation can be remedied by taking the quotient of an RSA group G by the subgroup {1, -1}. In fact, if the modulus of G is a product of strong primes (primes p such that (p-1)/2 is also prime), then after taking the aforementioned quotient there are no elements of low order other than 1.

This analysis implies that RSA groups are secure for Pietrzak’s VDF, but there’s a problem. To generate an RSA group, someone has to know the factorization of the modulus N. Devising a trustless RSA group selection protocol, where no one knows the factorization of the modulus N, is therefore an interesting and important open problem in this area.

Another avenue of work towards instantiating Pietrzak’s scheme involves using the class group of an imaginary quadratic number field. This family of groups does not suffer from the above issue where selection requires a trusted third party. Simply choosing a large negative prime discriminant (with several caveats) will generate a group whose order is computationally infeasible to determine, even for the person who chose it. However, unlike RSA groups, the difficulty of finding low-order elements in class groups of quadratic number fields is not well studied and would require more investigation before any such scheme could be used.

State of VDFs and Open Problems

As mentioned in the previous section, both the Pietrzak and Wesolowski schemes rely on generating a group of unknown order. Doing so without a trusted party is difficult in the case of RSA groups, but class groups seem to be a somewhat promising avenue of work. Furthermore, the Wesolowski scheme assumes the existence of groups that satisfy something called the adaptive root assumption, which is not well studied in the mathematical literature. There are many other open problems in this area, including constructing quantum resistant VDFs, and the potential for ASICs to ruin the security guarantees of VDF constructions in practice.

As for industry adoption of VDFs, several companies in the blockchain space are trying to use VDFs for consensus algorithms. Chia, for example, uses the repeated squaring technique outlined above, and is currently running a competition for the fastest implementation of this scheme. The Ethereum Foundation also appears to be developing a pseudorandom number generator that combines RANDAO with VDFs. While both are very exciting projects that will be hugely beneficial to the blockchain community, this remains a very young area of research. Take any claim of security with a grain of salt.

How to Spot Good Fuzzing Research

Of the nearly 200 papers on software fuzzing that have been published in the last three years, most of them—even some from high-impact conferences—are academic clamor. Fuzzing research suffers from inconsistent and subjective benchmarks, which keeps this potent field in a state of arrested development. We’d like to help explain why this has happened and offer some guidance for how to consume fuzzing publications.

Researchers play a high-stakes game in their pursuit of building the next generation of fuzzing tools. A major breakthrough can render obsolete what was once the state of the art. Nobody is eager to use the world’s second-greatest fuzzer. As a result, researchers must somehow demonstrate how their work surpasses the state of the art in finding bugs.

The problem is trying to objectively test the efficacy of fuzzers. There isn’t a set of universally accepted benchmarks that is statistically rigorous, reliable, and reproducible. Inconsistent fuzzing measurements persist throughout the literature and prevent meaningful meta-analysis. That was the motivation behind the paper “Evaluating Fuzz Testing,” to be presented by Andrew Ruef at the 2018 SIGSAC Conference on Computer and Communications Security in Toronto.

“Evaluating Fuzz Testing” offers a comprehensive set of best practices for constructing a dependable frame of reference for comparing fuzzing tools. Whether you’re submitting your fuzzing research for publication, peer-reviewing others’ submissions, or trying to decide which tool to use in practice, the recommendations from Ruef and his colleagues establish an objective lens for evaluating fuzzers. In case you don’t have time to read the whole paper, we’re summarizing the criteria we recommend you use when evaluating the performance claims in fuzzing research.

Quick Checklist for Benchmarking Fuzzers

  1. Compare new research against popular baseline tools like american fuzzy lop (AFL), Basic Fuzzing Framework (BFF), libFuzzer, Radamsa, and Zzuf. In lieu of a common benchmark, reviewing research about these well-accepted tools will prepare you to ascertain the quality of other fuzzing research. The authors note that “there is a real need for a solid, independently defined benchmark suite, e.g., a DaCapo or SPEC for fuzz testing.” We agree.
  2. Outputs should be easy to read and compare. Ultimately, this is about finding the fuzzer that delivers the best results. “Best” is subjective (at least until that common benchmark comes along), but evaluators’ work will be easier if they can interpret fuzzers’ results easily. As Ruef and his colleagues put it: “Clear knowledge of ground truth avoids overcounting inputs that correspond to the same bug, and allows for assessing a tool’s false positives and false negatives.”
  3. Account for differences in heuristics. Heuristics influence how fuzzers start and pursue their searches through code paths. If two fuzzers’ heuristics lead them to different targets, then the fuzzers will produce different results. Evaluators have to account for that influence in order to compare one fuzzer’s results against another’s.
  4. Targets are representative datasets with distinguishable bugs, like the Cyber Grand Challenge binaries, LAVA-M, and Google’s fuzzer test suite, and native programs like nm, objdump, cxxfilt, gif2png, and FFmpeg. Again for lack of a common benchmark suite, fuzzer evaluators should look for research that used one of the above datasets (or, better yet, one native and one synthetic). Doing so can encourage researchers to ‘fuzz to the test,’ which doesn’t do anyone any good. Nevertheless, these datasets provide some basis for comparison. Related: As we put more effort into fuzzers, we should invest in refreshing datasets for their evaluation, too.
  5. Fuzzers are configured to begin in similar and comparable initial states. If two fuzzers’ configuration parameters reflect different priorities, then the fuzzers will yield different results. It isn’t realistic to expect that all researchers will use the same configuration parameters, but it’s quite reasonable to expect that those parameters are specified in their research.
  6. Timeout values are at least 24 hours. Of the 32 papers the authors reviewed, 11 capped timeouts at “less than 5 or 6 hours.” The results of their own tests of AFL and AFLFast varied by the length of the timeout: “When using a non-empty seed set on nm, AFL outperformed AFLFast at 6 hours, with statistical significance, but after 24 hours the trend reversed.” If all fuzzing researchers allotted the same time period—24 hours—for their fuzz runs, then evaluators would have one less variable to account for.
  7. Consistent definitions of distinct crashes throughout the experiment. Since there’s some disagreement in the profession about how to categorize unique crashes and bugs (by the input or by the bug triggered), evaluators need to seek the researchers’ definition in order to make a comparison. That said, beware the authors’ conclusion: “experiments we carried out showed that the heuristics [for de-duplicating or triaging crashes] can dramatically over-count the number of bugs, and indeed may suppress bugs by wrongly grouping crashing inputs.”
  8. Consistent input seed files. The authors found that fuzzers’ “performance can vary substantially depending on what seeds are used. In particular, two different non-empty inputs need not produce similar performance, and the empty seed can work better than one might expect.” Somewhat surprisingly, many of the 32 papers evaluated did not carefully consider the impact of seed choices on algorithmic improvements.
  9. At least 30 runs per configuration with variance measured. With that many runs, anomalies can be ignored. Don’t compare the results of single runs (“as nearly ⅔ of the examined papers seem to,” the authors report!). Instead, look for research that not only performed multiple runs, but also used statistical tests to account for variance in those tests’ performance (see the sketch after this list).
  10. Prefer bugs discovered over code coverage metrics. We at Trail of Bits believe that our work should have a practical impact. Though code coverage is an important criterion in choosing a fuzzer, this is about finding and fixing bugs. Evaluators of fuzz research should measure performance in terms of known bugs, first and foremost.

Despite how obvious or simple these recommendations may seem, the authors reviewed 32 high-quality publications on fuzzing and did not find a single paper that aligned with all 10. Then they demonstrated how conclusive the results from rigorous and objective experiments can be by using AFLFast and AFL as a case study. They determined that, “Ultimately, while AFLFast found many more ‘unique’ crashing inputs than AFL, it only had a slightly higher likelihood of finding more unique bugs in a given run.”

The authors’ results and conclusions showed decisively that in order to advance the science of software fuzzing, researchers must strive for disciplined statistical analysis and better empirical measurements. We believe this paper will begin a new chapter in fuzzing research by providing computer scientists with an excellent set of standards for designing, evaluating, and reporting software fuzzing experiments in the future.

In the meantime, if you’re evaluating a fuzzer for your work, approach with caution and this checklist.

Ethereum security guidance for all

We came away from ETH Berlin with two overarching impressions: first, many developers were hungry for any guidance on security, and second, too few security firms were accessible.

When we began taking on blockchain security engagements in 2016, there were no tools engineered for the work. Useful documentation was hard to find and hidden among many bad recommendations.

We’re working to change that by offering standing office hours, sharing our aggregation of the best Ethereum security references on the internet, and maintaining a list of contact information for bug reporting.

We want to support the community to produce more secure smart contracts and decentralized apps.

Ethereum security office hours

Once every other week, our engineers will host a one-hour video chat where we’ll take all comers and answer Ethereum security questions at no cost. We’ll help guide participants through using our suite of Ethereum security tools and reference the essential knowledge and resources that people need to know.

Office hours will be noon Eastern Standard Time (GMT-5) on the first and third Tuesdays of the month. We’ll post a sign up form on our Twitter and the Empire Hacking Slack one day ahead of time (EDIT: the signup link is now live).

Crowdsourced blockchain security contacts

It’s a little ironic, but most security researchers have struggled to report vulnerabilities. Sometimes, the reporting process itself puts unnecessary burden on the reporter. The interface may not support the reporter’s language. Or, as Project Zero’s Natalie Silvanovich recently shared, it may come down to legalities:

“When software vendors start [bug bounties], they often remove existing mechanisms for reporting vulnerabilities…” and “…without providing an alternative for vulnerability reporters who don’t agree or don’t want to participate in [a rewards] program for whatever reason.”

We routinely identify previously unknown flaws in smart contracts, decentralized applications, and blockchain software clients. In many cases, it has been difficult or impossible to track down contact information for anyone responsible. When that happens, we have to leave the vulnerability unreported and simply hope that no one malicious discovers it.

This is not ideal, so we decided to do something about it. We are crowdsourcing a directory of security contacts for blockchain companies. This directory, Blockchain Security Contacts, identifies the best way to contact an organization’s security team so that you can report vulnerabilities directly to those who can resolve them.

If you work on a security team at a blockchain company, please add yourself to the directory!

Security contact guidance

The directory is just the first step. Even with the best of intentions, many companies rush into bug bounties without fully thinking through the legal and operational ramifications. They need guidance for engaging with security researchers most effectively.

At a minimum, we recommend:

Ethereum security references

Over the course of our work in blockchain security, we’ve curated the best community-maintained and open-source Ethereum security references on the internet. These are the references we rely on the most. They’re the most common resources that every team developing a decentralized application needs to know about, including:

  • Resources for secure development, CTFs & wargames, and even specific podcast episodes
  • Security tools for visualization, linting, bug finding, verification, and reversing
  • Pointers to related communities

This is a community resource we want to grow as the community does. We’re committed to keeping it up to date.

With that all said, please contact us if you’d like help securing your blockchain software.

Effortless security feature detection with Winchecksec

We’re proud to announce the release of Winchecksec, a new open-source tool that detects security features in Windows binaries. Developed to satisfy our analysis and research needs, Winchecksec aims to surpass current open-source security feature detection tools in depth, accuracy, and performance without sacrificing simplicity.

Feature detection, made simple

Winchecksec takes a Windows PE binary as input, and outputs a report of the security features baked into it at build time. Common features include:

  • Address-space layout randomization (ASLR) and 64-bit-aware high-entropy ASLR (HEASLR)
  • Authenticity/integrity protections (Authenticode, Forced Integrity)
  • Data Execution Prevention (DEP), better known as W^X or No eXecute (NX)
  • Manifest isolation
  • Structured Exception Handling (SEH) and SafeSEH
  • Control Flow Guard (CFG) and Return Flow Guard (RFG)
  • Guard Stack (GS), better known as stack cookies or canaries

Winchecksec’s two output modes are controlled by one flag (-j): the default plain-text tabular mode for humans, and a JSON mode for machine consumption. In action:

[Screenshot: Winchecksec output in plain-text and JSON modes]

Did you notice that Winchecksec distinguishes between “Dynamic Base” and ASLR above? This is because setting /DYNAMICBASE at build-time does not guarantee address-space randomization. Windows cannot perform ASLR without a relocation table, so binaries that explicitly request ASLR but lack relocation entries (indicated by IMAGE_FILE_RELOCS_STRIPPED in the image header’s flags) are silently loaded without randomized address spaces. This edge case was directly responsible for turning an otherwise moderate use-after-free in VLC 2.2.8 into a gaping hole (CVE-2017-17670). The underlying toolchain error in mingw-w64 remains unfixed.
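To illustrate the two PE flags behind this distinction (this is not how Winchecksec itself is implemented; it is a sketch using the pefile Python library), the check boils down to:

# aslr_check.py -- sketch of the Dynamic Base vs. effective ASLR check.
import sys
import pefile

IMAGE_FILE_RELOCS_STRIPPED = 0x0001               # COFF header Characteristics bit
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040    # optional header DllCharacteristics bit

pe = pefile.PE(sys.argv[1])
dynamic_base = bool(pe.OPTIONAL_HEADER.DllCharacteristics
                    & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE)
relocs_stripped = bool(pe.FILE_HEADER.Characteristics & IMAGE_FILE_RELOCS_STRIPPED)

print('Dynamic Base:', dynamic_base)
# The loader can only randomize the image base if relocations are present.
print('Effective ASLR:', dynamic_base and not relocs_stripped)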

Similarly, applications that run under the CLR are guaranteed to use ASLR and DEP, regardless of the state of the Dynamic Base/NX compatibility flags or the presence of a relocation table. As such, Winchecksec will report ASLR and DEP as enabled on any binary that indicates that it runs under the CLR. The CLR also provides safe exception handling but not via SafeSEH, so SafeSEH is not indicated unless enabled.

How do other tools compare?

Not well:

  • Microsoft released BinScope in 2014, only to let it wither on the vine. BinScope performs several security feature checks and provides XML and HTML outputs, but relies on .pdb files for its analysis of binaries. As such, it’s impractical for any use case outside of the Microsoft Secure Development Lifecycle. BinSkim appears to be the spiritual successor to BinScope and is actively maintained, but uses an obtuse, overengineered format for machine consumption. Like BinScope, it also appears to depend on the availability of debugging information.
  • The Visual Studio toolchain provides dumpbin.exe, which can be used to dump some of the security attributes present in the given binary. But dumpbin.exe doesn’t provide a machine-consumable output, so developers are forced to write ad-hoc parsers. To make matters worse, dumpbin.exe provides a dump, not an analysis, of the given file. It won’t, for example, explain that a program with stripped relocation entries and Dynamic Base enabled isn’t ASLR-compatible. It’s up to the user to put two and two together.
  • NetSPI maintains PESecurity, a PowerShell script for testing many common PE security features. While it provides a CSV output option for programmatic consumption, it lags behind dumpbin.exe (and the other compiled tools listed below) in performance, to say nothing of Winchecksec.
  • There are a few small feature detectors floating around the world of plugins and gists, like this one, this one, and this one (for x64dbg!). These are generally incomplete (in terms of checks), difficult to interact with programmatically, sporadically maintained, and/or perform ad-hoc PE parsing. Winchecksec aims for completeness in the domain of static checks, is maintained, and uses official Windows APIs for PE parsing.

Try it!

Winchecksec was developed as part of Sienna Locomotive, our integrated fuzzing and triaging system. As one of several triaging components, Winchecksec informs our exploitability scoring system (reducing the exploitability of a buffer overflow, for example, if both DEP and ASLR are enabled) and allows us to give users immediate advice on improving the baseline security of their applications. We expect that others will develop additional use cases, such as:

  • CI/CD integration to make a base set of security features mandatory for all builds (see the sketch after this list).
  • Auditing entire production servers for deployed applications that lack key security features.
  • Evaluating the efficacy of security features in applications (e.g., whether stack cookies are effective in a C++ application with a large number of buffers in objects that contain vtables).
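For the CI/CD case, a gate might look like the sketch below. The invocation and JSON field names are illustrative guesses, so check them against the schema your Winchecksec build actually emits before relying on this.

# ci_gate.py -- hypothetical CI gate built on Winchecksec's JSON mode (-j).
import json
import subprocess
import sys

# Hypothetical keys for the mitigations we want to require.
REQUIRED = ['aslr', 'dep', 'cfg']

def missing_features(binary_path):
    out = subprocess.check_output(['winchecksec', '-j', binary_path])
    report = json.loads(out)
    return [feature for feature in REQUIRED if not report.get(feature)]

if __name__ == '__main__':
    ok = True
    for binary in sys.argv[1:]:
        missing = missing_features(binary)
        if missing:
            ok = False
            print('{}: missing {}'.format(binary, ', '.join(missing)))
    # Fail the build if any binary lacks a required security feature.
    sys.exit(0 if ok else 1)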

Get Winchecksec on GitHub now. If you’re interested in helping us develop it, try out this crop of first issues.

Protecting Software Against Exploitation with DARPA’s CFAR

Today, we’re going to talk about a hard problem that we are working on as part of DARPA’s Cyber Fault-Tolerant Attack Recovery (CFAR) program: automatically protecting software from 0-day exploits, memory corruption, and many currently undiscovered bugs. You might be thinking: “Why bother? Can’t I just compile my code with exploit mitigations like stack guard, CFG, or CFI?” These mitigations are wonderful, but require source code and modifications to the build process. In many situations it is impossible or impractical to change the build process or alter program source code. That’s why our solution for CFAR protects binary installations for which source isn’t available or editable.

CFAR is very intuitive and deceptively simple. The system runs multiple versions, or ‘variants,’ of the software in parallel, and uses comparisons between these variants to identify when one or more have diverged from the others in behavior. The idea is akin to an intrusion detection system that compares program behavior against variants of itself running on identical input, instead of against a model of past behavior. When the system detects behavioral divergence, it can infer that something unusual, and possibly malicious, has happened.

Like all DARPA programs, CFAR is a large and difficult research problem. We are only working on a small piece of it. We have coordinated this blog post with our teammates – Galois, Immunant, and UCI – each of whom has more details about their respective contributions to the CFAR project.

We are excited to talk about CFAR not just because it’s a hard and relevant problem, but because one of our tools, McSema, is a part of our team’s versatile LLVM-based solution. As a part of this post, we get to show examples of lesser-known McSema features, and explain why they were developed. Perhaps most exciting of all, we’re going to show how to use McSema and the UCI multicompiler to harden off-the-shelf binaries against exploitation.

Our CFAR Team

The overall goal of CFAR is to detect and recover from faults in existing software without impacting core functionality. Our team’s responsibility was to produce an optimal set of variants to mitigate and detect fault-inducing inputs. The other teams were responsible for the specialized execution environment, for red-teaming, and so on. Galois’s blog post on CFAR describes the program in greater detail.

The variants must behave identically to each other and to the original application, and present compelling proof that behavior will remain identical for all valid inputs. Our teammates have developed transformations and provided equivalence guarantees for programs with available source code. The team has devised a multicompiler-based solution for variant generation using the Clang/LLVM toolchain.

McSema’s Role

We have been working on generating program variants of binary-only software, because source code may be unavailable for proprietary or older applications. Our team’s source code based toolchain works at the LLVM intermediate representation (IR) level. Transforming and hardening programs at the IR level allows us to manipulate program structure without altering the program’s source code. Using McSema, we could translate binary-only programs to LLVM IR, and re-use the same components for both source-level and binary-only variant generation.

Accurately translating programs for CFAR required us to bridge the gap between machine-level semantics and program-level semantics. Machine-level semantics are the changes to processor and memory state caused by individual instructions. Program-level semantics (e.g., functions, variables, exceptions, and try/catch blocks) are more abstract concepts that represent program behavior. McSema was designed to be a translator for machine level semantics (the name “McSema” derives from “machine code semantics”). However, to accurately transform the variants required for CFAR, McSema would have to recover program semantics as well.

We are actively working to recover more and more program semantics, and many common use-cases are already supported. In the following section we’ll discuss how we handle two particularly important semantics: stack variables and global variables.

Stack Variables

The compiler can place the data backing function variables in one of several locations. The most common location for program variables is the stack, a region of memory specifically made for storing temporary information and easily accessible to the calling function. Variables that the compiler stores on the stack are called… stack variables!

int sum_of_squares(int a, int b) {
  int a2 = a * a;
  int b2 = b * b;
  return a2+b2;
}
Binary view of the sum_of_squares function
Figure 1: Stack variables for a simple function shown both at the source code level, and at the binary level. At the binary level, there is no concept of individual variables, just bytes in a large block of memory.

When attackers turn bugs into exploits, they often rely on stack variables being in a specific order. The multicompiler can mitigate this class of exploits by generating program variants, where no two variants have stack variables in the same order. We wanted to enable this stack variable shuffling for binaries, but there was a problem: there is no concept of stack variables at the machine code level (Figure 1). Instead, the stack is just a large contiguous block of memory. McSema faithfully models this behavior and treats the program stack as an indivisible blob. This, of course, makes it impossible to shuffle stack variables.

Stack Variable Recovery

The process of converting a block of memory that represents the stack into individual variables is called stack variable recovery. McSema implements stack variable recovery as a three-step process.

First, McSema identifies stack variable bounds during disassembly, via the disassembler’s (e.g., IDA Pro’s) heuristics and, where present, DWARF-based debugging information. There is prior research on identifying stack variable bounds without such hints, which we plan to utilize in the future. Second, McSema attempts to identify which instructions in the program reference which stack variable. Every reference must be accurately identified, or the resulting program will not function. Finally, McSema creates an LLVM-level variable for each recovered stack variable and rewrites instructions to reference these LLVM-level variables instead of the prior monolithic stack block.

Stack variable recovery works for many functions, but it isn’t perfect. McSema will default to the classic behavior of treating the stack as a monolithic block when it encounters functions with the following characteristics:

  • Varargs functions. Functions that use a variable number of arguments (like the common printf family of functions) have a variable sized stack frame. This variance makes it difficult to determine which instruction references which stack variable.
  • Indirect stack references. Compilers also rely on a predetermined layout of stack variables, and will generate code that accesses a variable via the address of an unrelated variable.
  • No stack-frame pointer. As an optimization, the stack-frame pointer can serve as a general purpose register. This optimization makes it difficult for us to detect possible indirect stack references.

Stack variable recovery is a part of the CFG recovery process, and is currently implemented in the IDAPython CFG recovery code (in collect_variable.py). It can be invoked via the --recover-stack-vars argument to mcsema-disass. For an example, see the code accompanying this blog post, which is described more in the Lifting and Diversifying a Binary section.

Global Variables

Global variables can be accessed by all functions in a program. Since these variables are not tied to a specific function, they are typically placed in a special section of the program binary (Figure 2). As with stack variables, the specific ordering of global variables can be exploited by attackers.

bool is_admin = false;
int set_admin(int uid) {
  is_admin = 0 == uid;
}
The set_admin function
The is_admin global variable
Figure 2: Global variables as seen at source code level and at the machine code level. Global variables are typically placed into a special section in the program (in this case, into .bss).

Like the stack, McSema treats each data section as a large block of memory. One major difference between stack and global variables is that McSema knows where global variables start, because they are referenced directly from multiple locations. Unfortunately that is not enough information to shuffle around the global variable layout. McSema also needs to know where every variable ends, which is harder. Currently we rely on DWARF debug information to identify global variable sizes, but look forward to implementing approaches that would work on binaries without DWARF information.

Currently, global variable recovery is implemented separately from normal CFG recovery (in var_recovery.py). That script creates an “empty” CFG, filled with only global variable definitions. The normal CFG recovery process will further populate the file with the real control flow graph, referencing the pre-populated global variables. We will show an example of using global variable recovery later.

Lifting and Diversifying A Binary

In the remainder of this blog post, we’ll refer to the process of generating new program variants via the multicompiler as ‘diversification.’ For this specific example, we will lift and diversify a simple C++ application that uses exception handling (including a catch-all clause) and global variables. While this is just a simple example, program semantics recovery is meant to work on large, real applications: our standard test program is the Apache2 web server.

First, let’s familiarize ourselves with the standard McSema workflow (i.e. without any diversification), which is to lift the example binary to LLVM IR, then compile that IR back down into a runnable program. To get started, please build and install McSema. We provide detailed instructions in the official McSema README.

Next, build and lift the program using the provided script (lift.sh). The script will need to be edited to match your McSema installation.

After running lift.sh, you should have two programs: example and example-lifted, along with some intermediate files.

The example program squares two numbers and passes the result to the set_admin function. If both the numbers are 5, then the program throws the std::runtime_error exception. If the numbers are 0, then the global variable is_admin is set to true. Finally, if two numbers are not supplied to the program, then it throws std::out_of_range.

The four different cases can be demonstrated via the following program invocations:

$ ./example
Starting example program
Index out of range: Supply two arguments, please
$ ./example 0 0
Starting example program
You are now admin.
$ ./example 1 2
Starting example program
You are not admin.
$ ./example 5 5
Starting example program
Runtime error: Lucky number 5

We can see that example-lifted, the same program after being lifted and re-created by McSema, behaves identically:

$ ./example-lifted
Starting example program
Index out of range: Supply two arguments, please
$ ./example-lifted 0 0
Starting example program
You are now admin.
$ ./example-lifted 1 2
Starting example program
You are not admin.
$ ./example-lifted 5 5
Starting example program
Runtime error: Lucky number 5

Now, let’s diversify the lifted example program. To start, install the multicompiler. Next, edit the lift.sh script to specify a path to your multicompiler installation.

It’s time to build the diversified version. Run the script with the diversify argument (./lift.sh diversify) to generate a diversified binary. The diversified example looks different at the binary level than the original (Figure 3), but has the same functionality:

$ ./example-diverse
Starting example program
Index out of range: Supply two arguments, please
$ ./example-diverse 0 0
Starting example program
You are now admin.
$ ./example-diverse 1 2
Starting example program
You are not admin.
$ ./example-diverse 5 5
Starting example program
Runtime error: Lucky number 5
Differences between the normal and diversified binaries
Figure 3: The normal lifted binary (left) and its diversified equivalent (right). Both binaries are functionally identical, but look different at the binary level. Binary diversification protects software by preventing certain classes of bugs from turning into exploits.

Open example-lifted and example-diverse in your favorite disassembler. Your binaries may not be identical to the ones in the screenshot, but they should be different from each other.

Let’s review what we did. It’s really quite amazing. We started by building a simple C++ program that used exceptions and global variables. Then we translated the program into LLVM bitcode, identified stack and global variables, and preserved exception-based control flow. We then transformed it using the multicompiler, and created a new, diversified binary with the same functionality as the original program.

While this was just a small example, this approach scales to much larger applications, and provides a means to rapidly create diversified programs, whether starting with source code or with a previous program binary.

Conclusion

We would first like to thank DARPA, without whom this work would not be possible, for providing ongoing funding for CFAR and other great research programs. We would also like to thank our teammates — Galois, Immunant and UCI — for their hard work creating the multicompiler, transformations, providing equivalence guarantees for variants, and for making everything work together.

We are actively working to improve stack and global variable recovery in McSema. Not only will these higher-level semantics create more diversification and transformation opportunities, but they will also allow for smaller, leaner bitcode, faster re-compiled binaries, and more thorough analyses.

We believe there is a bright future for CFAR and similar technologies: the number of available cores per machine continues to increase, as does the need for secure computing. Many software packages can’t utilize these cores for performance, so it is only natural to use the spare cores for security. McSema, the multicompiler, and other CFAR technologies show how we can put these extra cores in service to stronger security guarantees.

If you think some of these technologies can be applied to your software, please contact us. We’d love to hear from you. To learn more about CFAR, the multicompiler, and other technologies developed under this program, please read our teammates’ blog posts at the Galois blog and the Immunant blog.

Disclaimer

The views, opinions and/or findings expressed are those of the author and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

Rattle – an Ethereum EVM binary analysis framework

Most smart contracts have no verified source code, but people still trust them to protect their cryptocurrency. What’s more, several large custodial smart contracts have had security incidents. The security of contracts that exist on the blockchain should be independently ascertainable.

Ethereum VM (EVM) Bytecode

Ethereum contracts are compiled to EVM bytecode – the instruction set of the Ethereum Virtual Machine. As blocks are mined, EVM code is executed and its resulting state is encoded into the blockchain forever. Everyone has access to this compiled EVM code for every smart contract on the blockchain — but reviewing EVM directly isn’t easy.

EVM is a RISC Harvard-architecture stack machine, which is fairly distinct in the world of computer architectures. EVM has around 200 instructions which push and pop values from a stack, occasionally performing specific actions on them (e.g. ADD takes two arguments off the stack, adds them together, and pushes the result back onto the stack). If you’re familiar with reverse polish notation (RPN) calculators, then stack machines will appear similar. Stack machines are easy to implement but difficult to reverse-engineer. As a reverse-engineer, I have no registers, local variables, or arguments that I can label and track when looking at a stack machine.

For these reasons, I created Rattle, a framework which turns the stack machine into an infinite-register SSA form.

Rattle

Rattle is an EVM binary static analysis framework designed to work on deployed smart contracts. Rattle takes EVM byte strings, uses a flow-sensitive analysis to recover the original control flow graph, lifts the control flow graph into an SSA/infinite register form, and optimizes the SSA – removing DUPs, SWAPs, PUSHs, and POPs. Converting the stack machine to SSA form removes 60%+ of EVM instructions and presents a much friendlier interface to those who wish to read the smart contracts they’re interacting with.

Demo

As an example, we will analyze the infamous King of Ether contract.

First in Ethersplay, our Binary Ninja plug-in for analyzing Ethereum Smart Contracts:

Figure 1: The King of Ether contract as disassembled by Ethersplay

In Ethersplay, we can see there are 43 instructions and 5 basic blocks. The majority of the instructions are pure stack manipulation instructions (e.g. PUSH, DUP, SWAP, POP). Interspersed in the blocks are the interesting instructions (e.g. CALLVALUE, SLOAD, etc.).

Now, analyze the contract with Rattle and observe the output for the same function. We run Rattle with optimizations, so constants are folded and unneeded blocks are removed.

$ python3 rattle-cli.py --input inputs/kingofether/KingOfTheEtherThrone.bin -O

The Rattle CLI interface generates graphviz files for each function that it can identify and extract.

Figure 2: The King of Ether contract as recovered by Rattle

As you can see, Rattle optimized the numberOfMonarchs() function to only 12 instructions. Rattle eliminated 72% of the instructions, assigned registers you can track visually, and removed an entire basic block. What’s more, Rattle recovered the used storage location and the ABI of the function.

Rattle will help organizations and individuals study the contracts they’re interacting with and establish an informed degree of trust in the contracts’ security. If your contracts’ source code isn’t available or can’t be verified, then you should run Rattle.

Get Rattle on our GitHub and try it out for yourself.

Contract upgrade anti-patterns

A popular trend in smart contract design is to promote the development of upgradable contracts. At Trail of Bits, we have reviewed many upgradable contracts and believe that this trend is going in the wrong direction. Existing techniques to upgrade contracts have flaws, increase the complexity of the contract significantly, and ultimately introduce bugs. To highlight this point, we are releasing a previously unknown flaw in the Zeppelin contract upgrade strategy, one of the most common upgrade approaches.

In this article, we are going to detail our analysis of existing smart contract upgrade strategies, describe the weaknesses we have observed in practice, and provide recommendations for contracts that require upgrades. In a follow-up blog post, we will detail a method, contract migration, that achieves the same benefits with few of the downsides.

An overview of upgradable contracts

Two ‘families’ of patterns have emerged for upgradable smart contracts:

  • Data separation, where logic and data are kept in separate contracts. The logic contract owns and calls the data contract.
  • Delegatecall-based proxies, where logic and data are kept in separate contracts, also, but the data contract (the proxy) calls the logic contract through delegatecall.

The data separation pattern has the advantage of simplicity. It does not require the same low-level expertise as the delegatecall pattern. The delegatecall pattern has received lots of attention recently. Developers may be inclined to choose this solution because documentation and examples are easier to find.

Using either of these patterns comes at considerable risk, an aspect of this trend that has gone unacknowledged thus far.

Data separation pattern

The data separation pattern keeps logic and data in separate contracts. The logic contract, which owns the data contract, can be upgraded if required. The data contract is not meant to be upgraded. Only the owner can alter its content.

Figure 1: High-level overview of the data separation upgrade pattern

When considering this pattern, pay special attention to these two aspects: how to store data, and how to perform the upgrade.

Data storage strategy

If the variables needed across an upgrade will remain the same, you can use a simple design where the data contract holds these variables, with their getters and setters. Only the contract owner should be able to call the setters:

contract DataContract is Owner {
  uint public myVar;

  function setMyVar(uint new_value) onlyOwner public {
    myVar = new_value;
  }
}

Figure 2: Data storage example (using onlyOwner modifier)

You have to clearly identify the state variables required. This approach is suitable for ERC20 token-based contracts since they only require the storage of their balances.

If a future upgrade requires new persistent variables, they could be stored in a second data contract. You can split the data across separate contracts, but at the cost of additional logic contract calls and authorization. If you don’t intend to upgrade the contract frequently, the additional cost may be acceptable.

Nothing prevents the addition of state variables to the logic contract. These variables will not be kept during an upgrade, but can be useful for implementing the logic. If you want to keep them, you can migrate them to the new logic contract, too.

Key-value pair

A key-value pair system is an alternative to the simple data storage solution described above. It is more amenable to evolution but also more complex. For example, you can declare a mapping from a bytes32 key value to each base variable type:

contract DataContract is Owner {
  mapping(bytes32 => uint) uintStorage;

  function getUint(bytes32 key) view public returns(uint) {
    return uintStorage[key];
  }

  function setUint(bytes32 key, uint new_val) onlyOwner public {
    uintStorage[key] = new_val;
  }
}

Figure 3: Key-Value Storage Example (using onlyOwner modifier)

This solution is often called the Eternal Storage pattern.

How to perform the upgrade

This pattern offers several different strategies, depending on how the data are stored.

One of the simplest approaches is to transfer the ownership of the data contract to a new logic contract and then disable the original logic contract. To disable the previous logic contract, implement a pausable mechanism or set its pointer to 0x0 in the data contract.

Figure 4: Upgrade by deploying a new logic contract and disabling the old one

Another solution involves forwarding the calls from the original logic contract to the new version:

Figure 5: Upgrade by deploying a new logic contract and forwarding calls to it from the old one

This solution is useful if you want to allow users to call the first contract. However, it adds complexity; you have to maintain more contracts.

Finally, a more complex approach uses a third contract as an entry point, with a changeable pointer to the logic contract:

Figure 6: Upgrade by deploying a proxy contract that calls a new logic contract

A proxy contract provides the user with a constant entry point and a distinction of responsibilities that is clearer than the forwarding solution. However, it comes with additional gas costs.

Cardstack and Rocket-pool have detailed implementations of the data separation pattern.

Risks of the data separation pattern

The simplicity of the data separation pattern is more perceived than real. This pattern adds complexity to your code, and necessitates a more complex authorization schema. We have repeatedly seen clients deploy this pattern incorrectly. For example, one client’s implementation achieved the opposite effect, where a feature was impossible to upgrade because some of its logic was located in the data contract.

In our experience, developers also find the Eternal Storage pattern challenging to apply consistently. We have seen developers store their values as bytes32, then apply type conversions to retrieve the original values. This increases the complexity of the data model and the likelihood of subtle flaws. Developers unfamiliar with complex data structures will make mistakes with this pattern.

Delegatecall-based proxy pattern

Like the data separation method, the proxy pattern splits a contract in two: one contract holding the logic and a proxy contract holding the data. What’s different? In this pattern, the proxy contract calls the logic contract with delegatecall, which reverses the call direction used in the data separation pattern.

Figure 7: Visual representation of the proxy pattern

In this pattern, the user interacts with the proxy. The contract holding the logic can be updated. This solution requires mastering delegatecall to allow one contract to use code from another.

Let’s review how delegatecall works.

Background on delegatecall

delegatecall allows one contract to execute code from another contract while keeping the context of the caller, including its storage. A typical use-case of the delegatecall opcode is to implement libraries. For example:

pragma solidity ^0.4.24;

library Lib {

  struct Data { uint val; }

  function set(Data storage self, uint new_val) public {
    self.val = new_val;
  }
}

contract C {
  Lib.Data public myVal;

  function set(uint new_val) public {
    Lib.set(myVal, new_val);
  }
}

Figure 8: Library example based on delegatecall opcode

Here, two contracts will be deployed: Lib and C. A call to Lib in C will be done through delegatecall:

Figure 9: EVM opcodes of a call to Lib.set (Ethersplay output)

As a result, when Lib.set changes self.val, it changes the value stored in C’s myVal variable.

Solidity looks like Java or JavaScript, which are object-oriented languages. It’s familiar, but that familiarity brings misconceptions and assumptions along with it. In the following example, a programmer might assume that two state variables with the same name will share the same storage, but Solidity lays out storage by declaration order, not by name.

pragma solidity ^0.4.24;

contract LogicContract {
  uint public a;

  function set(uint val) public {
    a = val;
  }
}

contract ProxyContract {
  address public contract_pointer;
  uint public a;

  constructor() public {
    contract_pointer = address(new LogicContract());
  }

  function set(uint val) public {
    // Note: the return value of delegatecall should be checked
    contract_pointer.delegatecall(bytes4(keccak256("set(uint256)")), val);
  }
}

Figure 10: Dangerous delegatecall usage

Figure 11 represents the code and the storage variables of both of the contracts at deployment:

Figure 11: Memory illustration of Figure 10

What happens when the delegatecall is executed? LogicContract.set will write to ProxyContract.contract_pointer instead of ProxyContract.a. This storage corruption happens because:

  • LogicContract.set is executed within the context of ProxyContract.
  • LogicContract knows only one state variable: a. Any store to this variable will be done on the first storage slot (see the Layout of State Variables in Storage documentation).
  • The first storage slot of ProxyContract holds contract_pointer. As a result, LogicContract.set will write to the ProxyContract.contract_pointer variable instead of ProxyContract.a (see Figure 12).
  • At this point, the storage of ProxyContract has been corrupted.

If a had been the first variable declared in ProxyContract, delegatecall would not have corrupted the storage.

Figure 12: LogicContract.set will write the first element in storage: ProxyContract.contract_pointer

Use delegatecall with caution, especially if the called contract has state variables declared.

Let’s review the different data-storage strategies based on delegatecall.

Data storage strategies

There are three approaches to separating data and logic when using the proxy pattern:

  • Inherited storage, which uses Solidity inheritance to ensure that the caller and the callee have the same storage layout (see the sketch after this list).
  • Eternal storage, which is the key-value storage version of the data separation that we saw above.
  • Unstructured storage, which is the only strategy that does not suffer from potential storage corruption due to an incorrect layout. It relies on inline assembly code and custom management of storage slots.
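
As a rough sketch of the inherited storage strategy (contract names are illustrative), the proxy and the logic contract inherit the same storage base so their layouts cannot diverge, avoiding the corruption shown in Figure 10:

pragma solidity ^0.4.24;

// Illustrative sketch of inherited storage: both contracts inherit the same
// base, so their state variables occupy the same storage slots.
contract SharedStorage {
  address public contract_pointer; // slot 0 in every derived contract
  uint public a;                   // slot 1
}

contract LogicContract is SharedStorage {
  function set(uint val) public {
    a = val; // writes slot 1 in the caller's storage when reached via delegatecall
  }
}

contract ProxyContract is SharedStorage {
  constructor() public {
    contract_pointer = address(new LogicContract());
  }

  function set(uint val) public {
    // Note: the return value of delegatecall should be checked
    contract_pointer.delegatecall(bytes4(keccak256("set(uint256)")), val);
  }
}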

See ZeppelinOS for a more thorough review of these approaches.

How to perform an upgrade

To upgrade the code, the proxy contract needs to point to a new logic contract. The previous logic contract is then discarded.
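
A bare-bones sketch of the proxy side of this upgrade (illustrative only) is a changeable pointer plus a fallback function that forwards every call. It deliberately ignores the storage-layout and existence-check pitfalls discussed in this post, which a real implementation must address:

pragma solidity ^0.4.24;

// Illustrative only: upgrading is a single pointer change on the proxy.
// Warning: `logic` occupies slot 0, so the logic contract's layout must
// account for it (e.g., via inherited or unstructured storage).
contract Proxy is Owner {
  address public logic;

  function upgradeTo(address new_logic) onlyOwner public {
    logic = new_logic; // the previous logic contract is simply abandoned
  }

  function () payable public {
    address target = logic;
    require(target != address(0));
    assembly {
      calldatacopy(0, 0, calldatasize)
      let result := delegatecall(gas, target, 0, calldatasize, 0, 0)
      returndatacopy(0, 0, returndatasize)
      switch result
      case 0 { revert(0, returndatasize) }
      default { return(0, returndatasize) }
    }
  }
}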

Risks of delegatecall

In our experience with clients, we have found that it is difficult to apply the delegatecall-based proxy pattern correctly. The proxy pattern requires that storage layouts stay consistent between contract and compiler upgrades. A developer unfamiliar with EVM internals can easily introduce critical bugs during an upgrade.

Only one approach, unstructured storage, overcomes the storage layout requirement, but it requires low-level storage handling, which is difficult to implement and review. Due to its high complexity, unstructured storage is only meant to store state variables that are critical for the upgradability of the contract, such as the pointer to the logic contract. Further, this approach hinders static analysis of Solidity (for example, by Slither), costing the contract the guarantees provided by these tools.

Preventing storage layout corruption with automated tools is an ongoing area of research. No existing tool can verify that an upgrade is safe from compromise, so upgrades that rely on delegatecall lack automated safety guarantees.

Breaking the proxy pattern

To illustrate this point, we have discovered and are now disclosing a previously unknown security issue in the Zeppelin proxy pattern, rooted in the complex semantics of delegatecall. It affects all the Zeppelin implementations that we have investigated. The issue highlights how easy it is to misuse a low-level Solidity mechanism, and how likely it is that an implementation of this pattern will contain flaws.

What is the bug?

The Zeppelin Proxy contract does not check for the contract’s existence prior to returning. As a result, the proxy contract may return success to a failed call and result in incorrect behavior should the result of the call be required for application logic.

Low-level calls, including assembly, lack the protections offered by high-level Solidity calls. In particular, low-level calls will not check that the called account has code. The Solidity documentation warns:

The low-level call, delegatecall and callcode will return success if the called account is non-existent, as part of the design of EVM. Existence must be checked prior to calling if desired.

If the destination of delegatecall has no code, then the call will successfully return. If the proxy is set incorrectly, or if the destination was destroyed, any call to the proxy will succeed but will not send back data.

A contract calling the proxy may change its own state under the assumption that its interactions are successful, even though they are not.

If the caller does not check the size of the data returned, which is the case for any contract compiled with Solidity 0.4.22 or earlier, then any call will succeed. The situation is slightly better for recently compiled contracts (Solidity 0.4.24 and up) thanks to the check on returndatasize. However, that check won’t protect calls that do not expect data in return.

ERC20 tokens are at considerable risk

Many ERC20 tokens have a known flaw that prevents the transfer functions from returning data. As a result, these contracts support a call to transfer that may return no data. In such a case, the lack of an existence check, as detailed above, may lead a third party to believe that a token transfer succeeded when it did not, and may result in the theft of funds.

Exploit scenario

Bob’s ERC20 smart contract is a proxy contract based on delegatecall. The proxy is set incorrectly, due to human error, a flaw in the code, or a malicious actor. Any call to the token will appear to succeed but return no data.

Alice’s exchange handles ERC20 tokens that do not return data on transfer. Eve has no tokens. Eve calls the deposit function of Alice’s exchange for 10,000 tokens, which calls transferFrom of Bob’s token. The call succeeds. Alice’s exchange credits Eve with 10,000 tokens. Eve sells the tokens and receives ether for free.

How to avoid this flaw

During an upgrade, check that the new logic contract has code. One solution is to use the extcodesize opcode. Alternatively, you can check for the existence of the target each time delegatecall is used.
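
A hedged sketch of such a check (contract and function names are illustrative) uses extcodesize before accepting a new logic contract:

pragma solidity ^0.4.24;

// Illustrative only: reject upgrade targets that have no code.
// Caveat: extcodesize also returns 0 for an address whose constructor
// is still executing.
contract UpgradeableWithCheck is Owner {
  address public logic;

  function isContract(address target) internal view returns (bool) {
    uint size;
    assembly { size := extcodesize(target) }
    return size > 0;
  }

  function upgradeTo(address new_logic) onlyOwner public {
    require(isContract(new_logic)); // a destroyed or mistyped address is refused
    logic = new_logic;
  }
}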

There are tools that can help. For instance, Manticore is capable of reviewing your smart contract code to check a contract’s existence before any calls are made to it. This check was designed to help mitigate risky proxy contract upgrades.

Recommendations

If you must design a smart contract upgrade solution, use the simplest solution possible for your situation.

In all cases, avoid the use of inline assembly and low-level calls. The proper use of this functionality requires extreme familiarity with the semantics of delegatecall, and the internals of Solidity and EVM. Few teams whose code we’ve reviewed get this right.

Data separation recommendations

If you need to store data, opt for the simple data storage strategy over key-value pairs (aka Eternal Storage). This method requires writing less code and depends on fewer moving parts. There is simply less that can go wrong.

Use the contract-discard solution to perform upgrades. Avoid the forwarding solution, since it requires building forwarding logic that may be too complex to implement correctly. Only use the proxy solution if you need a fixed address.

Proxy pattern recommendations

Check for the destination contract’s existence prior to calling delegatecall. Solidity will not perform this check on your behalf. Neglecting the check may lead to unintended behavior and security issues. You are responsible for these checks if relying upon low-level functionality.

If you are using the proxy pattern, you must:

  • Have a detailed understanding of Ethereum internals, including the precise mechanics of delegatecall and detailed knowledge of Solidity and EVM internals.
  • Carefully consider the order of inheritance, as it impacts the memory layout.
  • Carefully consider the order in which variables are declared. For example, variable shadowing or type changes (as noted in the next point) can subvert the programmer’s intent when interacting with delegatecall.
  • Be aware that the compiler may use padding and/or pack variables together. For example, if two consecutive uint256 are changed to two uint8, the compiler can store the two variables in one slot instead of two.
  • Confirm that the variables’ storage layout is respected if a different version of solc is used or if different optimizations are enabled. Different versions of solc compute storage offsets in different ways. The storage order of variables may impact gas costs, the storage layout, and thus the result of delegatecall.
  • Carefully consider the contract’s initialization. Depending on the proxy variant, state variables may not be initializable during construction. As a result, there is a potential race condition during initialization that needs to be mitigated.
  • Carefully consider the names of functions in the proxy to avoid function-selector collisions. A proxy function whose four-byte selector (derived from the Keccak hash of its signature) matches that of the intended function will be called instead, which could lead to unpredictable or malicious behavior.

Concluding remarks

We strongly advise against the use of these patterns for upgradable smart contracts. Both strategies have the potential for flaws, significantly increase complexity, introduce bugs, and ultimately decrease trust in your smart contract. Strive for simple, immutable, and secure contracts rather than importing a significant amount of code to postpone feature and security issues.

Further, security engineers who review smart contracts should not recommend complex, poorly understood, and potentially insecure upgrade mechanisms. We ask the Ethereum security community to consider the risks before endorsing these techniques.

In a follow-up blog post, we will describe contract migration, our recommended approach to achieve the benefits of upgradable smart contracts without their downsides. A contract migration strategy is essential in case of private key compromise, and helpful in avoiding the need for other upgrades.

In the meantime, you should contact us if you’re concerned that your upgrade strategy may be insecure.

Introducing windows-acl: working with ACLs in Rust

Access Control Lists (ACLs) are an integral part of the Microsoft Windows security model. In addition to controlling access to secured resources, they are also used in sandboxing, event auditing, and specifying mandatory integrity levels. They are also exceedingly painful to programmatically manipulate, especially in Rust. Today, help has arrived — we released windows-acl, a Rust crate that simplifies the manipulation of access control lists on Windows.

A short background on Windows ACLs

Windows has two types of ACLs: discretionary (DACL) and system (SACL). Every securable object in Windows (such as files, registry keys, events, etc.) has an associated security descriptor with a DACL and SACL.

Figure: Security descriptor with its DACL and SACL

The difference between a DACL and a SACL lies in the type of entries they hold. DACLs control an entity’s access to a resource. For instance, to deny a user read access to a registry key, the registry key’s DACL needs to contain an access-denied entry for that user. SACLs specify which actions generate an audit event and set mandatory integrity labels on resources. For example, to audit access failures by a user group on a specific file, that file’s SACL must contain an audit entry for the user group.

Working with ACLs today

Adding and removing access control entries (ACEs) from an existing ACL requires the creation of a new ACL. The removal process is fairly simple: copy all of the existing entries over except the one to be deleted. Insertion is a bit more difficult for discretionary ACLs. The access rights that a DACL grants a user depend on the ordering of its ACEs, and correctly inserting a new ACE at the preferred location is the responsibility of the developer. Furthermore, the to-be-inserted ACE must not conflict with existing entries. For instance, if the existing DACL has an access-allowed entry granting a user read/write privileges and the to-be-added ACE is an access-allowed entry granting that user only read privileges, the existing entry must not be copied into the new DACL. No one wants to deal with this complexity, especially in Rust.

How windows-acl simplifies the task

Let’s take a look at appjaillauncher-rs for a demonstration of windows-acl’s simplicity. Last year, I ported the original C++ AppJailLauncher to Rust. appjaillauncher-rs sandboxes Windows applications using AppContainers. For sandboxing to work correctly, I needed to modify ACLs on specific resources — it was an arduous task without the help of Active Template Library classes (see CDacl and CSacl), the .NET framework, or PowerShell. I ended up with a solution that was both imperfect and complicated. After implementing windows-acl, I went back and updated appjaillauncher-rs to use windows-acl. With windows-acl, I have a modular library that handles the difficulties of Windows ACL programming. It provides an interface that simplifies the act of adding and removing DACL and SACL entries.

For example, adding a DACL access allow entry requires the following code:

match ACL::from_file_path(string_path, false) {
    Ok(mut acl) => {
        let sid = string_to_sid(string_sid).unwrap_or(Vec::new());
        if sid.capacity() == 0 {
            return false;
        }
        acl.remove(
            sid.as_ptr() as PSID, 
            Some(AceType::AccessAllow), None
        ).unwrap_or(0);
        if !acl.allow(sid.as_ptr() as PSID, true, mask).unwrap_or_else(|code| {
               false
           }) {
            return false;
        }
    },
    ...
}

Similarly for removing a DACL access allow entry:

match ACL::from_file_path(string_path, false) {
    Ok(mut acl) => {
        let sid = string_to_sid(string_sid).unwrap_or(Vec::new());
        if sid.capacity() == 0 {
            return false;
        }
        let result = acl.remove(
                         sid.as_ptr() as PSID, 
                         Some(AceType::AccessAllow), 
                         None);
        if result.is_err() {
            return false;
        }
    },
    ...
}

Potential applications and future work with windows-acl

The windows-acl crate opens up new possibilities for writing Windows security tools in Rust. The ability to manipulate SACLs allows us to harness the full power of the Windows Event auditing engine. The Windows Event auditing engine is instrumental in providing information to detect endpoint compromise. We hope this work contributes to the Windows Rust developers community and inspires the creation of more Rust-based security tools!

Get an open-source security multiplier

An increasing number of organizations and companies (including the federal government) rely on open-source projects in their security operations architecture, secure development tools, and beyond.

Open-source solutions offer numerous advantages to development-savvy teams ready to take ownership of their security challenges. Teams can implement foundational capabilities, like “process logs” or “access machine state,” swiftly, with no need to wait for purchasing approval. They can build custom components on top of open-source code to fit their company’s needs perfectly. Furthermore, open-source solutions are transparent, ‘return’ great value for the dollars spent (since investment improves the tool itself rather than paying for a license), and receive maintenance from a community of fellow users.

What’s the catch?

Open source can potentially mean faster, easier, and better solutions, but there’s one thing missing: Expert engineering support.

How do you pick the right tool for your needs? How do you know this open-source solution works? That it’s safe? What do you do when no one fixes reported bugs? What do you do when you need a new killer feature but no one in your company can build it? What happens when breaking changes are introduced and you need to upgrade?

How we can help

We’re security researchers who specialize in evaluating the security of code and building security solutions. If you want to leverage open-source security technology, we can help you pick the right tool, clean up technical debt, build new necessary features, and maintain it for you. Our developers fix bugs, update out-of-date components or practices, harden vulnerable code, and, when necessary, can re-engineer things completely. We provide the way forward through all your show-stopping issues.

How can I ensure this open-source project is right for my needs? We can find open-source solutions that best suit the needs of your organization. We can polish up whatever needs fixing, build new essential features, and then maintain your final solution.

What if I need a new feature? Whether it’s porting software to a new OS, integrating the tool with others in your stack, or leveraging an existing tool for a new use case, our team excels at building security solutions. Once built, we work with the open-source project teams to merge features into the public repos so the features are continuously maintained through updates.

How can I fix known bugs? We can fix them and work through technical debt to prevent more bugs in the future.

What if breaking changes are introduced in open-source dependencies? We can re-engineer solutions to maintain functionality.

Is the open-source software safe? We can review it for security best practices, ensure there’s no malicious code, and harden the code to minimize risk to your company.

How do I know this open-source solution works? We can review its code, understand the mechanics of how it works, and test edge cases to confirm consistent intended behavior. If we find anything broken or poorly engineered, we can fix it. If the project is truly beyond repair, we can re-engineer the system completely to work for your organization.

How we do it

We’ve begun hosting support groups for companies leveraging open-source technology in three areas:

  • Security Operations: this team supports technology that keeps company fleets and users safe from network-level attacks.
  • Secure Development: this team automates software testing, hardens software, and builds security into modern development practices.
  • Core Infrastructure: this team improves essential core infrastructure that requires maintenance or re-engineering to mitigate newly discovered attacks.

For each group, we help clients pick the best open-source project to suit their needs, update and fix needed functionality, build custom features to perfectly fit client requirements, and maintain the technology so that it’s effective and safe.

Take a look at some examples of our Security Operations group’s success stories:

Facebook’s osquery

Endpoint monitoring tool that transforms your fleets’ system data into a queryable database.

Successes so far:

AirBnB’s StreamAlert

A serverless, real-time data analysis framework which empowers users to ingest, analyze, and set up alerts on data from any environment, using customized data sources and alerting logic.

Successes so far:

Google Santa

A binary whitelisting/blacklisting system for macOS that helps administrators track naughty or nice binaries.

Successes so far:

Google Omaha + CrystalNix Omaha Server

The open-source version of Google Update. Developers can use it to install requested software and keep it up to date. (This set of enhancements is being publicly released soon)

Successes so far:

  • Created scripts to simplify the build process.
  • Simplified the process of rebranding Omaha to work with a custom Omaha server.
  • Updated the CrystalNix Omaha server to support the latest Google Omaha client with SHA256.

Where we’d like to help next

There’s still so much room for improvement, both in the security posture of companies that leverage open-source tooling and in the capabilities of the technologies themselves. In security operations, we can leverage promising open-source technology for memory forensics, user account security, binary analysis, and secret management. We can enhance software testing capabilities in fuzzing, symbolic execution, and binary lifting. We can wipe out industry-wide risks by re-engineering, improving, and maintaining widely adopted tools for forensics, package management, and document parsing.

How can we help you?

How can we help make open-source security solutions work better for you? Let us know!