Empire Hacking: Ethereum Edition 2

On December 12, over 150 attendees joined a special, half-day Empire Hacking to learn about pitfalls in smart contract security and how to avoid them. Thank you to everyone who came, to our superb speakers, and to BuzzFeed for hosting this meetup at their office.

Watch the presentations again

It’s hard to find such rich collections of practical knowledge about securing Ethereum on the internet. For many of you, it was even harder to attend the event. That’s why we’re posting these recordings. We hope you find them useful.

Anatomy of an unsafe smart contract programming language

Our own Evan Sultanik gave an introduction to blockchains and smart contracts, followed by a dissection of Solidity: the most popular smart contract programming language (slides).

Takeaways

  • Solidity harbors many unsafe features that allow even experienced, competent programmers to easily shoot themselves in the foot.
  • Solidity is changing quickly, which is both bad and good. Developers must keep pace with new compiler releases and beware the implications of contract upgradability.
  • There is an effort to introduce an intermediate representation to the compiler. Early indications suggest that it suffers from many of the same design decisions that have plagued Solidity.

Evaluating digital asset security fundamentals

Shamiq Islam of Coinbase discussed problems in the security of digital assets and their ecosystems. These problems pose unique and interesting challenges to cryptocurrency exchanges like Coinbase.

Takeaways

  • The supply chain is untrustworthy. How do you validate that an asset issuer’s node, smart contract, or wallet code is authentic?
  • Security communications channels are immature. If you find a bug, how do you know where to report it? Conversely, how can exchanges like Coinbase become aware if a bug is found? Monitoring a Telegram chat for each asset does not scale.
    [For the time being, use our directory of Blockchain Security Contacts.]
  • How do we know when a smart contract owner is malicious? The code itself may be secure, but the owner’s key is a single point of failure. If compromised, it can arbitrarily modify the accounting of an asset in most cases.

Contract upgrade risks and recommendations

Our own Josselin Feist compared several different strategies for upgrading smart contracts. This talk covers everything you need to know to decide whether and how to implement upgradability in your contracts (slides).

Takeaways

  • Upgradability is useful for developers as it allows for features to be added and bugs to be fixed after the fact. However, it also adds complexity and increases the likelihood of deployment mistakes.
  • Use the simplest upgrade system that suits your needs. Compared to data separation, the delegatecall proxy pattern is very fragile and adds even more complexity.
  • Instead of these upgradability patterns, consider contract migration. Migration is more involved, but it allows for recovery from many more scenarios.

How to buidl an enterprise-grade mainnet Ethereum client

S. Matthew English of PegaSys highlighted the trials and tribulations of implementing a new Ethereum client. Many of the insights in the talk apply just as well to any large-scale software engineering project.

Takeaways

  • Building an Ethereum client is hard. The protocol itself is poorly documented, uses non-standard computer science concepts, and continues to evolve even as existing clients do.
  • Team communication, architectural design, and incremental progress validation were important factors in the successful development of PegaSys.
  • Pantheon is now syncing with Ethereum mainnet, has been open-sourced, and is available for download.

Failures in on-chain privacy

Ian Miers of Cornell Tech and The Zcash Foundation provided an overview of privacy issues in cryptocurrencies. Cryptocurrencies might not be as private as you thought.

Takeaways

  • Privacy in cryptocurrency has been misunderstood since the very beginning. It’s important that we figure it out now before it’s too late to fix.
  • Decoy-based approaches may seem successful in isolation, but the privacy claims they make break down in real-world scenarios.
  • Stronger approaches are deployed and efficient, but they still need important work to improve usability for end users.

Secure micropayment protocols

Yondon Fu of Livepeer highlighted some security requirements unique to micropayment methods. He shared how Livepeer is making micropayments securely scale.

Takeaways

  • Micropayments are useful for a variety of applications, particularly those involving ongoing or high-volume transactions. However, high deployment and transaction costs have stymied widespread adoption.
  • Client security considerations common to most micropayment methods include protecting the hot signing key and ensuring timely confirmation of any additional required transactions, even when gas prices fluctuate.
  • Important considerations for probabilistic micropayments include secure random number generation and protection from replay attacks and double spends.

Designing the Gemini dollar: a regulated, upgradeable, transparent stablecoin

Brandon Arvanaghi of Gemini Trust explained the design decisions that went into the regulated, upgradable, and transparent Gemini dollar, comparing and contrasting it with other implementations.

Takeaways

  • Upgradability in smart contracts provides a means for response to illicit activities and bugs, but can reduce transparency and expand the attack surface.
  • Contract modularity, ownership distribution, and “time-locked” upgrades help mitigate these issues.
  • Take every opportunity to provide multi-level mitigations. Gemini ensures that even if an attacker were to compromise a contract with all of its underlying logic (Impl), its custodian/owner-contract would need to be compromised too, as it is the sole entity to confirm printed tokens.

Property testing with Echidna and Manticore for secure smart contracts

Our own JP Smith introduced the concept of property-based testing and its application to smart contracts. This includes strategies for picking good properties and testing them thoroughly (slides).

Takeaways

  • Unit testing is not always sufficient: it exercises one case at a time and typically focuses on known cases and failure modes. Property testing aims to cover unknown cases by specifying generic code invariants.
  • Echidna is a tool for property testing smart contracts. It is extremely fast and can discover new transaction sequences that violate code properties.
  • When property testing with such tools, you’re likely to hit conditions that individual unit tests would typically miss.

Simple is hard: Making your awesome security thing usable

Patrick Nielsen and Amber Baldet of Clovyr went down infosec memory lane of great-ideas-that-didn’t-quite-catch-on to help attendees think about how to get people to use what they build (slides).

Takeaways

  • A great idea or tool can often be derailed by its (lack of) usability, undermining its potential to deliver immense real-world value. Sweat the “boring stuff” if you want your worthwhile work to be worth it.
  • Most end users don’t change settings or look for anything beyond the defaults, and most devs don’t want to mess with complex configs. Practice simplicity at every opportunity and do as much as you can in the background for both.
  • Regular people care about simplicity, stability and cost. Power users care about implementation details. Developers care about approachability, utility, and ops overhead. Businesses care about technical risk and the bottom line. Put yourself in the shoes of each; don’t expect them to change priorities just because you made something innovative or pure.
  • Practice what you preach; if we (in the security and “crypto” communities) use tools that are fundamentally insecure or data hungry, how can we expect others to act differently?

Attend the next Empire Hacking on February 12. Join the meetup to RSVP.

How to write a rootkit without really trying

We open-sourced a fault injection tool, KRF, that uses kernel-space syscall interception. You can use it today to find faulty assumptions (and resultant bugs) in your programs. Check it out!


This post covers intercepting system calls from within the Linux kernel, via a plain old kernel module.

We’ll go through a quick refresher on syscalls and why we might want to intercept them and then demonstrate a bare-bones module that intercepts the read(2) syscall.

But first, you might be wondering:

What makes this any different from $other_fault_injection_strategy?

Other fault injection tools rely on a few different techniques:

  • There’s the well-known LD_PRELOAD trick, which really intercepts the syscall wrapper exposed by libc (or your language runtime of choice). This often works (and can be extremely useful for e.g. spoofing the system time within a program or using SOCKS proxies transparently), but comes with some major downsides:
    • LD_PRELOAD only works when libc (or the target library of choice) has been dynamically linked, but newer languages (read: Go) and deployment trends (read: fully static builds and non-glibc Linux containers) have made dynamic linkage less popular.
    • Syscall wrappers frequently deviate significantly from their underlying syscalls: depending on your versions of Linux and glibc open() may call openat(2), fork() may call clone(2), and other calls may modify their flags or default behavior for POSIX compliance. As a result, it can be difficult to reliably predict whether a given syscall wrapper invokes its syscall namesake.
  • Dynamic instrumentation frameworks like DynamoRIO or Intel PIN can be used to identify system calls at either the function or machine-code level and instrument their calls and/or returns. While this grants us fine-grained access to individual calls, it usually comes with substantial runtime overhead.

Injecting faults within kernelspace sidesteps the downsides of both of these approaches: it rewrites the actual syscalls directly instead of relying on the dynamic loader, and it adds virtually no runtime overhead (beyond checking to see whether a given syscall is one we’d like to fault).

What makes this any different from $other_blog_post_on_syscall_interception?

Other blog posts address the interception of syscalls, but many:

  • Grab the syscall table by parsing the kernel’s System.map, which can be unreliable (and is slower than the approach we give below).
  • Assume that the kernel exports sys_call_table and that extern void *sys_call_table will work (not true on Linux 2.6+).
  • Involve prodding large ranges of kernel memory, which is slow and probably dangerous.

Basically, we couldn’t find a recent (>2015) blog post that described a syscall interception process that we liked. So we developed our own.

Why not just use eBPF or kprobes?

eBPF can’t intercept syscalls. It can only record their parameters and return values.

The kprobes API might be able to perform interception from within a kernel module, although I haven’t come across a really good source of information about it online. In any case, the point here is to do it ourselves!

Will this work on $architecture?

For the most part, yes. You’ll need to make some adjustments to the write-unlocking macro for non-x86 platforms.

What’s a syscall?

A syscall, or system call, is a function1 that exposes some kernel-managed resource (I/O, process control, networking, peripherals) to user-space processes. Any program that takes user input, communicates with other programs, changes files on disk, uses the system time, or contacts another device over a network (usually) does so via syscalls.2

The core UNIX-y syscalls are fairly primitive: open(2), close(2), read(2), and write(2) for the vast majority of I/O; fork(2), kill(2), signal(2), exit(2), and wait(2) for process management; and so forth.

The socket management syscalls are mostly bolted on to the UNIX model: send(2) and recv(2) behave much like read(2) and write(2), but with additional transmission flags. ioctl(2) is the kernel’s garbage dump, overloaded to perform every conceivable operation on a file descriptor where no simpler means exists. Despite these additional complexities, the underlying principle behind their use (and interception) remains the same. If you’d like to dive all the way in, Filippo Valsorda maintains an excellent Linux syscall reference for x86 and x86_64.

Unlike regular function calls in user-space, syscalls are extraordinarily expensive: on x86 architectures, int 80h (or the more modern sysenter/syscall instructions) causes both the CPU and the kernel to execute slow interrupt-handling code paths as well as perform a privilege-context switch.3
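
To make this concrete, here is a quick user-space illustration of the raw syscall interface (a sketch, not part of the kernel module we build below): invoking read(2) through libc’s syscall(3) helper, with no read() wrapper involved. The syscall number is an assumption that you’re on x86_64 Linux, where __NR_read is 0.

import ctypes

libc = ctypes.CDLL(None, use_errno=True)  # gives us access to libc's syscall(3)
SYS_read = 0                              # assumption: x86_64 Linux (__NR_read == 0)

buf = ctypes.create_string_buffer(64)
n = libc.syscall(SYS_read, 0, buf, 64)    # read up to 64 bytes from stdin (fd 0)
print("read {} bytes: {!r}".format(n, buf.raw[:n]))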

Why intercept syscalls?

For a few different reasons:

  • We’re interested in gathering statistics about a given syscall’s usage, beyond
    what eBPF or another instrumentation API could (easily) provide.
  • We’re interested in fault injection that can’t be avoided by static linking or manual syscall(3) invocations (our use case).
  • We’re feeling malicious, and we want to write a rootkit that’s hard to remove from user-space (and possibly even kernel-space, with a few tricks).4

Why do I need fault injection?

Fault injection finds bugs in places that fuzzing and conventional unit testing often won’t:

  • NULL dereferences caused by assuming that particular functions never fail (are you sure you always check whether getcwd(2) succeeds?) Are you sure that you’re doing better than systemd?
  • Memory corruption caused by unexpectedly small buffers, or disclosure caused by unexpectedly large buffers
  • Integer over/underflow caused by invalid or unexpected values (are you sure you’re not making incorrect assumptions about stat(2)‘s atime/mtime/ctime fields?)

Getting started: Finding the syscall table

Internally, the Linux kernel stores syscalls within the syscall table, an array
of __NR_syscalls pointers. This table is defined as sys_call_table, but has not been directly exposed as a symbol (to kernel modules) since Linux 2.5.

First, we need to get the syscall table’s address, ideally without using the System.map file or scanning kernel memory for well-known addresses. Luckily for us, Linux provides a better interface than either of these: kallsyms_lookup_name.

This makes retrieving the syscall table as easy as:

static unsigned long *sys_call_table;

int init_module(void) {
  sys_call_table = (void *)kallsyms_lookup_name("sys_call_table");

  if (sys_call_table == NULL) {
    printk(KERN_ERR "Couldn't look up sys_call_table\n");
    return -1;
  }

  return 0;
}

Of course, this only works if your Linux kernel was compiled with CONFIG_KALLSYMS=1. Debian and Ubuntu provide this, but you may need to test in other distros. If your distro doesn’t enable kallsyms by default, consider using a VM for one that does (you weren’t going to test this code on your host, were you?).

Injecting our replacement syscalls

Now that we have the kernel’s syscall table, injecting our replacement should be as easy as:

static unsigned long *sys_call_table;
static typeof(sys_read) *orig_read;

/* asmlinkage is important here -- the kernel expects syscall parameters to be
 * on the stack at this point, not inside registers.
 */
asmlinkage long phony_read(int fd, char __user *buf, size_t count) {
  printk(KERN_INFO "Intercepted read of fd=%d, %lu bytes\n", fd, count);

  return orig_read(fd, buf, count);
}

int init_module(void) {
  sys_call_table = (void *)kallsyms_lookup_name("sys_call_table");

  if (sys_call_table == NULL) {
    printk(KERN_ERR "Couldn't look up sys_call_table\n");
    return -1;
  }

  orig_read = (typeof(sys_read) *)sys_call_table[__NR_read];
  sys_call_table[__NR_read] = (void *)&phony_read;

  return 0;
}

void cleanup_module(void) {
  /* Don't forget to fix the syscall table on module unload, or you'll be in
   * for a nasty surprise!
   */
  sys_call_table[__NR_read] = (void *)orig_read;
}

…but it isn’t that easy, at least not on x86: sys_call_table is write-protected by the CPU itself. Attempting to modify it will cause a page fault (#PF) exception.5 To get around this, we twiddle the 16th bit of the cr0 register, which controls the write-protect state:

#define CR0_WRITE_UNLOCK(x) \
  do { \
    write_cr0(read_cr0() & (~X86_CR0_WP)); \
    x; \
    write_cr0(read_cr0() | X86_CR0_WP); \
  } while (0)

Then, our insertions become a matter of:

CR0_WRITE_UNLOCK({
  sys_call_table[__NR_read] = (void *)&phony_read;
});

and:

CR0_WRITE_UNLOCK({
  sys_call_table[__NR_read] = (void *)orig_read;
});

and everything works as expected…almost.

We’ve assumed a single processor; there’s an SMP-related race condition bug in the way we twiddle cr0. If our kernel task were preempted immediately after disabling write-protect and placed onto another core with WP still enabled, we’d get a page fault instead of a successful memory write. The chances of this happening are pretty slim, but it doesn’t hurt to be careful by implementing a guard around the critical section:

#define CR0_WRITE_UNLOCK(x) \
  do { \
    unsigned long __cr0; \
    preempt_disable(); \
    __cr0 = read_cr0() & (~X86_CR0_WP); \
    BUG_ON(unlikely((__cr0 & X86_CR0_WP))); \
    write_cr0(__cr0); \
    x; \
    __cr0 = read_cr0() | X86_CR0_WP; \
    BUG_ON(unlikely(!(__cr0 & X86_CR0_WP))); \
    write_cr0(__cr0); \
    preempt_enable(); \
  } while (0)

(The astute will notice that this is almost identical to the “rare write” mechanism from PaX/grsecurity. This is not a coincidence: it’s based on it!)

What’s next?

The phony_read above just wraps the real sys_read and adds a printk, but we could just as easily have it inject a fault:

asmlinkage long phony_read(int fd, char __user *buf, size_t count) {
  return -ENOSYS;
}

…or a fault for a particular user:

asmlinkage long phony_read(int fd, char __user *buf, size_t count) {
  if (current_uid().val == 1005) {
    return -ENOSYS;
  } else {
    return orig_read(fd, buf, count);
  }
}

…or return bogus data:

asmlinkage long phony_read(int fd, char __user *buf, size_t count) {
  unsigned char kbuf[1024];

  memset(kbuf, 'A', sizeof(kbuf));
  copy_to_user(buf, kbuf, sizeof(kbuf));

  return sizeof(kbuf);
}

Syscalls happen under task context within the kernel, meaning that the
current task_struct is valid. Opportunities for poking through kernel structures abound!

Wrap up

This post covers the very basics of kernel-space syscall interception. To do anything really interesting (like precise fault injection or statistics beyond those provided by official introspection APIs), you’ll need to read a good kernel module programming guide6 and do the legwork yourself.

Our new tool, KRF, does everything mentioned above and more: it can intercept and fault syscalls with per-executable precision, operate on an entire syscall “profile” (e.g., all syscalls that touch the filesystem or perform process scheduling), and can fault in real-time without breaking a sweat. Oh, and static linkage doesn’t bother it one bit: if your program makes any syscalls, KRF will happily fault them.

Other work

Outside of kprobes for kernel-space interception and LD_PRELOAD for user-space interception of wrappers, there are a few other clever tricks out there:

  • syscall_intercept is loaded through LD_PRELOAD like a normal wrapper interceptor, but actually uses capstone internally to disassemble (g)libc and instrument the syscalls that it makes. This only works on syscalls made by the libc wrappers, but it’s still pretty cool.
  • ptrace(2) can be used to instrument syscalls made by a child process, all within user-space. It comes with two considerable downsides, though: it can’t be used in conjunction with a debugger, and it returns (PTRACE_GETREGS) architecture-specific state on each syscall entry and exit. It’s also slow. Chris Wellons’s awesome blog post covers ptrace(2)‘s many abilities.

  1. More of a “service request” than a “function” in the ABI sense, but thinking about syscalls as a special class of functions is a serviceable-enough fabrication.
  2. The number of exceptions to this continues to grow, including user-space networking stacks and the Linux kernel’s vDSO for many frequently called syscalls, like time(2).
  3. No process context switch is necessary. Linux executes syscalls within the same underlying kernel task that the process belongs to. But a processor context switch does occur.
  4. I won’t detail this because it’s outside of this post’s scope, but consider that init_module(2) and delete_module(2) are just normal syscalls.
  5. Sidenote: this is actually how CoW works on Linux. fork(2) write-protects the pre-duplicated process space, and the kernel waits for the corresponding page fault to tell it to copy a page to the child.
  6. This one’s over a decade old, but it covers the basics well. If you run into missing symbols or changed signatures, you should find the current equivalents with a quick search.

On Bounties and Boffins

Trying to make a living as a programmer participating in bug bounties is the same as convincing yourself that you’re good enough at Texas Hold ‘Em to quit your job. There’s data to back this up in Fixing a Hole: The Labor Market for Bugs, a chapter in New Solutions for Cybersecurity, by Ryan Ellis, Keman Huang, Michael Siegel, Katie Moussouris, and James Houghton.

Bug bounties follow a Pareto distribution, exhibiting the same characteristics as the distribution of wealth and other sociological phenomena. A select few, the boffins, report the largest number and highest quality bug reports and earn the lion’s share of bounties. The rest of the population fights over the boffins’ table scraps.

Fixing a Hole does not offer much to encourage software companies looking to improve their security through bug bounty programs. HackerOne, a company that hosts bug bounty programs, boasts that over 300,000 people “have signed up” to help organizations improve their security. It’s nice to think that you have 300,000 sets of eyes scrutinizing your code, but this number includes zombie accounts and people who never find a single bug. In reality, only an elite few are getting the work done and cashing in.

So why not just hire the boffins as consultants instead of gamifying your company’s security? The authors of Fixing a Hole argue that bug bounties should be designed to incentivize the elite. They say that making bounties invite-only lowers the operational cost of managing a tsunami of trivial, non-issue, and duplicate bugs. (Only 4-5% of bugs from Google, Facebook, and GitHub’s public-facing bounty programs were eligible for payment.) According to the authors, a small number of bounty hunters are indispensable and hold significant power to shape the market for bug bounty programs. Based on this, hiring security consultants under terms and conditions that can be controlled seems more practical.

The data undermining bug bounties

In Fixing a Hole, independent researchers funded by Facebook studied data from 61 HackerOne bounty programs over 23 months and one Facebook program over 45 months. The HackerOne data set includes bounty programs from Twitter, Square, Slack, Coinbase, Flash, and others. Usernames could be tracked across different programs in the HackerOne data set, but not to the Facebook data set.


Top: Participants, sales, and payments for Facebook (45 months) and HackerOne (23 months). Bottom: Population according to bug sales per person.

The prolific group at the top doesn’t limit itself to one program. It sweeps across multiple programs selling bugs across different technologies. What’s more, they also report the most critical bugs that are worth the most money. On average, those in the top 1% submitted bugs to nearly five different programs.

The authors included averages for sales per seller, earnings per seller, and price per transaction, but these values skew the analysis of uneven distributions, so they are disregarded in this review. (e.g., If 90 people end up earning a $10/hour wage, and 10 people earn $1,000/hour, the average wage is $109/hour, which isn’t characteristic of either population.)

Surprisingly, the variance of the data was not reported in Fixing a Hole. When populations are stratified, as the authors discover, the variance of the individual groups reveals some surprising insights. Consequently, a lot of information about the most interesting population, the top-performing 5%, is omitted.

We reproduced and overlaid some of the plots here to show the main theme in the report: there is a small group of prolific participants in bug bounty programs. The larger the data set, the more pronounced this trend is. For the entirety of the HackerOne and Facebook data sets, the 7% of participants with 10 or more bugs were paid for 1,622 bounties, while the other 93% of the population earned 2,523.

[Plot: bounties paid per participant, overlaid for the HackerOne and Facebook data sets]

The most interesting group was arbitrarily lumped into the 10-or-more-bugs category. (See how the line trends upward at the end of the plot? You don’t want to do that. Plot all your data.) The top 1% (6 participants) in the HackerOne data landed 161 bounties and the top 1% (7 participants) in the Facebook data accounted for 274 bugs. That’s an average of 27 and 39 bugs per person, respectively! There may be stratification even among the top earners, but without knowledge of this distribution at the top (i.e., the variance), it remains a mystery.

As productive as the top 1% are, their earnings are equally depressing. The top seven participants in the Facebook data set averaged 0.87 bugs per month, earning an average yearly salary of $34,255, slightly less than what a pest control worker makes in Mississippi.

It gets worse for the top six earners from the HackerOne data set. Averaging 1.17 bugs per month, they earn a yearly average of $16,544. (Two outlying data points appear in the notes of Fixing a Hole mentioning that Google’s Chromium Rewards Program paid $60,000 for a single submission and one Facebook participant earned $183,000 in 21 months or a $104,000/year average.)

If your heart is breaking and you’re wondering whether this top 1% is wondering where their next meal is coming from, it’s more likely that these are security professionals working a side hustle. You can get really good at finding a few classes of critical bugs, then set scanners and alerts for when relevant bounty programs come online. You find your bugs, submit your proof, grab your dollars, then move on.

What’s better than bug bounties

Who are these top bug bounty performers, and what’s their background? What separates them from the rest? There’s no way to tell from the data, but the authors suggest three possibilities: improved skills over time, differences in raw talent, and professionals vs. hobbyists. (I believe that some top performers may work as teams under one account or are individuals who specialize in a few types of critical bugs and keep a watchful eye for low-hanging fruit when new programs launch.) Whoever they are, they’re indispensable and need to be incentivized to join bug bounty programs. For this, the authors offer three solutions:

  1. Keep the talent pool exclusive through invite-only programs that are closed to the public. This ensures that the most talented will not lose any bounties to lesser talent—even the low-hanging fruit.
  2. Escalate prices with successive valid submissions to deter people from straying to other programs.
  3. Offer grants to talented researchers, and pay them even if no bugs are found.

There’s not much difference between this advice and simply reaching out to a consulting firm for a code audit. Plus, an exclusive bug bounty program faces the chicken-or-egg paradox for the participants: How do you get an invite when you aren’t given the opportunity to establish a reputation? Also, there’s a lot less control and a lot more risk to holding bug bounties than most people are aware of.

The economics of bug bounty programs are turbulent, because there’s a competing offensive market in play. Exploitable zero-days can fetch up to millions of dollars from the right buyer. Anyone who discovers a critical bug can choose not to disclose it to the vendor and try to sell it elsewhere for much more. Fixing a Hole recommends that work should be directed toward incentivizing disclosure to vendors, but offers no practical details beyond that. There’s no evidence to suggest that researchers compare defensive and offensive bounty programs searching for the highest sale. Our opinion is that the decision to disclose or not disclose to vendors is mostly a moral one.

So who is currently incentivized to participate in bug bounty programs? Two groups: Citizens of economically disadvantaged countries, who can take advantage of US dollar exchange rates; and students who want to improve their security skills and learn the tools of the trade. After reading Fixing a Hole, I wasn’t convinced that the elite boffins are incentivized to participate in bug bounty programs. Perhaps they should wield some of their indispensable power to demand more from the market.

What do La Croix, octonions, and Second Life have in common?

This year for CSAW CTF, Trail of Bits contributed two cryptography problems. In the first problem, you could combine two bugs to break DSA much like the PlayStation 3 firmware hackers. The other challenge, weirder and mathier, was split into two parts: one for the qualifiers, one for finals. This challenge, “Holywater,” was some of the most fun I’ve ever had making a CTF problem.

The qualifier challenge was a pretty textbook CTF cryptography challenge. Contestants began with a script and a text file of past outputs (preserved on GitHub), and had to recover a secret passphrase. Spoilers follow below, so stop reading here if you’d like to try it yourself first.

Before diving into my own solution, I first want to commend Galhacktic Trendsetters for their excellent writeup (if any of you Trendsetters are reading this, get in touch, I’d love to mail you some swag). They covered the mathematical foundations of the attack with eloquence, a topic which I won’t get into in quite as much depth here. It’s also an excellent walkthrough of the thought process that lets a team start with nothing but a python file and a few hex strings and develop a working attack in less than 48 hours.

The challenge’s Python file didn’t make that easy. It was called “lattice.py,” which might immediately suggest it has something to do with lattice cryptography. The method names included, among other things, “isogeny,” “gaussian,” and “wobble.” Even the above writeup acknowledges some confusion about the terms’ meanings.

In reality, more or less every name in that file is a red herring. It implements HK17 key exchange, a proposed new post-quantum key exchange mechanism that was proven totally unworkable by Li, Liu, Pan, and Xie. The mathematical construction underlying HK17 is not lattices or isogenies, but octonions! Octonions are eight-dimensional hypercomplex numbers used in theoretical physics with a number of counterintuitive properties.

Perhaps the easiest way to understand octonions is by constructing them from scratch. Most readers will already be familiar with complex numbers, a two-dimensional superset of real numbers that is algebraically closed, a property that makes many kinds of math much easier. We construct the complex numbers using the Cayley-Dickson construction. Effectively, we double the number of dimensions and define multiplication much as we would in a direct sum (though not in exactly the same way).

We can repeat this process on complex numbers to yield a four-dimensional set of numbers known as the quaternions. Readers with graphics programming experience may be familiar, as quaternions allow for efficient computation of rotations in three-dimensional space, and are thus used by many graphics libraries. One more application of the Cayley-Dickson process takes us to eight dimensions; the octonions we use for our cryptosystem.

However, the Cayley-Dickson process cannot preserve every property of a number system we might want. Complex numbers, unlike their real counterparts, are not orderable (they can’t just be laid out end to end on a line). Quaternions are also unorderable, but unlike reals or complex numbers, have noncommutative multiplication! If a and b are quaternions, a * b and b * a can yield different numbers. This gradual loss of invariants continues with octonions, which aren’t even associative; if d, e, and f are octonions, (d * e) * f may well not equal d * (e * f).
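
If you want to see these failures concretely, here is a toy sketch (not the challenge code) of the Cayley-Dickson construction in Python, representing each number as a nested pair built up from reals. It is just enough to watch commutativity fail for the quaternions and associativity fail for the octonions.

def conj(x):
  return (conj(x[0]), neg(x[1])) if isinstance(x, tuple) else x  # reals are self-conjugate

def neg(x):
  return (neg(x[0]), neg(x[1])) if isinstance(x, tuple) else -x

def add(x, y):
  return (add(x[0], y[0]), add(x[1], y[1])) if isinstance(x, tuple) else x + y

def mul(x, y):
  if not isinstance(x, tuple):
    return x * y
  a, b = x
  c, d = y
  # Cayley-Dickson doubling: (a, b)(c, d) = (ac - d*b, da + bc*), where * is conjugation
  return (add(mul(a, c), neg(mul(conj(d), b))),
          add(mul(d, a), mul(b, conj(c))))

# Quaternions as pairs of complex numbers (themselves pairs of reals):
i = ((0, 1), (0, 0))
j = ((0, 0), (1, 0))
print(mul(i, j), mul(j, i))  # k and -k: quaternion multiplication is not commutative

# Octonions as pairs of quaternions:
zero_q = ((0, 0), (0, 0))
e = (i, zero_q)
f = (j, zero_q)
g = (zero_q, ((1, 0), (0, 0)))  # the new unit introduced by the doubling step
print(mul(mul(e, f), g) == mul(e, mul(f, g)))  # False: octonion multiplication is not associative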

“The real numbers are the dependable breadwinner of the family, the complete ordered field we all rely on. The complex numbers are a slightly flashier but still respectable younger brother: not ordered, but algebraically complete. The quaternions, being noncommutative, are the eccentric cousin who is shunned at important family gatherings. But the octonions are the crazy old uncle nobody lets out of the attic: they are nonassociative.” – John Baez

This is fairly gnarly, by the standards of numbers we choose to use, and explains to a degree why octonions aren’t used frequently (keep the attic door shut!). However, it also appears to allow for exactly the kind of hard problem we want when building a key exchange system! By working with polynomials over the octonions, the HK17 authors create a Diffie-Hellman style key exchange system they claim is quantum-hard.

However, in real life this system can be reliably broken by college students over the course of a weekend (nine teams solved it). Octonions’ odd multiplication rules end up making factoring far easier! With a few octonion identities and a high schooler’s knowledge of linear algebra, the cryptosystem reduces to four variables in four linear equations, and can be solved in O(1) by a python script that runs almost instantaneously.

An astute reader may pause here, with complete knowledge of the problem, and wonder “why was this challenge called Holywater?” The answer has nothing to do with octonion key exchange, and everything to do with my plans for the second half of the problem. The HK17 draft defined systems not just on octonions, but on unit quaternions (quaternions of magnitude one) as well! And, since quaternions are used by so many more programmers (as mentioned above, for graphics) that opens some interesting doors.

Specifically, it means we can now define our system in Linden Scripting Language, the official scripting language of Second Life. I’ve always been a bit of a programming language snob. For a while, I thought PHP was the absolute bottom of the barrel. Nothing could possibly be worse than that fractal of bad design, created largely by accident. Later in life I began working on blockchain security, and learned about the language Solidity. Suffice to say, my mind has since changed. Neither language, however, compares to the absolute tire fire that is Linden Scripting Language. Seriously, just read how you parse JSON.

LSL has a built-in quaternion type, and, while the “Differences Between Math’s Quaternions and LSL’s [Quaternions]” might seem foreboding, they are completely workable for our purposes. And, writing the whole challenge in LSL meant the competitors could have even more fun reverse engineering. However, I needed help to develop the Second Life scripts, design objects for them to be attached to, lease space in Second Life, and generally do the non-mathy parts of the whole project.

This is where the name comes in. The final part was called “Holywater 2: La Croix” specifically to entice Dan “Pamplemousse” Hlavenka, a friend of mine who loves both LSL and La Croix more than any other person I know of. He was willing to help with every part of the Second Life portion, but only if we made the challenge La Croix themed in every way we could, to spread the gospel to the next generation.
Competitors were greeted by the below acrostic, which, when the blanks are filled in, describes both a location in Second Life and a half-dozen La Croix flavors.

yeah i SMOKE WEED
                                P    P
                                E  M A
                               PA  AOM
                               UC BNRP
maps.Secondlife.com/secondlife/_______/23/233/1
     M                         E PRONE
     O                          PRR GM
     K                          EIY EO
     E                          AC   U
                                RO   S
     W                           T   S
     E                               E
     E
     D

Once teams arrive, they find themselves inside a giant can of La Croix, underwater (and with particle effects for carbonation). The location in Second Life was renamed “Mud City” after La Crosse, Wisconsin, home of the beverage. They are then presented with two glowing orbs, reading “Click here for everything you need” and “Click here to die instantly.”

These labels are accurate. That did not stop many people from repeatedly clicking the “die instantly” orb, however, perhaps in an attempt at some sort of reincarnation-based cryptanalysis. The “everything you need” orb, in contrast, gives the player an IBM Selectric typeball. Since unit quaternions describe rotations, we elected to encode the message by physically rotating one such typeball (as in normal Selectric operation), agreeing on rotations via HK17 key exchange in Second Life’s chat box. Users could see a script attached to the typeball that outlined the whole process, though again, some attempted other strategies (see below).

Nonetheless, the math was much the same, if harder to apply. This time only two teams (MIT and CMU) found the final flag (another clever La Croix reference), with the first blood winning a case of La Croix for each team member as a bonus on top of the (unusually high) 600 points (typically, challenges are 100 points if extremely easy, 500 points if extremely hard). By reversing the script and scraping the chat, the same process that worked for quals can work here. All that’s left is rotating your typeball and watching which letter is on the bottom.

Dan’s lease on the land in Second Life is now up, so the challenge is unfortunately closed to the public. Dan’s La Croix contributions ended up far more popular than I expected though, so perhaps this challenge won’t be the last to feature the beverage. This challenge is perhaps less applicable than the qualifier, but its lesson remains valid: if you’re securing a remote-control typewriter sending La Croix secrets in Second Life, don’t use HK17.

P.S.: While the last minute removal of csaw.tv meant this never saw the light of competition, you can enjoy this La Croix themed playlist Dan and I made for a special csaw.tv only accessible from Second Life.

Fuzzing Like It’s 1989

With 2019 a day away, let’s reflect on the past to see how we can improve. Yes, let’s take a long look back 30 years and reflect on the original fuzzing paper, An Empirical Study of the Reliability of UNIX Utilities, and its 1995 follow-up, Fuzz Revisited, by Barton P. Miller.

In this blog post, we are going to find bugs in modern versions of Ubuntu Linux using the exact same tools as described in the original fuzzing papers. You should read the original papers not only for context, but for their insight. They proved to be very prescient about the vulnerabilities and exploits that would plague code over the decade following their publication. Astute readers may notice the publication date for the original paper is 1990. Even more perceptive readers will observe the copyright date of the source code comments: 1989.

A Quick Review

For those of you who didn’t read the papers (you really should), this section provides a quick summary and some choice quotes.

The fuzz program works by generating random character streams, with the option to generate only printable, control, or non-printable characters. The program uses a seed to generate reproducible results, which is a useful feature modern fuzzers often lack. A set of scripts execute target programs and check for core dumps. Program hangs are detected manually. Adapters provide random input to interactive programs (1990 paper), network services (1995 paper), and graphical X programs (1995 paper).
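
The core loop is simple enough to sketch in a few lines of Python (an approximation of the methodology, not the original fuzz tool, using spell as an example target):

import random
import subprocess

random.seed(1989)  # a fixed seed keeps the generated input reproducible
data = bytes(random.randrange(256) for _ in range(100000))

try:
  proc = subprocess.run(["spell"], input=data, capture_output=True, timeout=30)
  if proc.returncode < 0:  # the process was killed by a signal, e.g. SIGSEGV
    print("crash: signal", -proc.returncode)
except subprocess.TimeoutExpired:
  print("hang: no exit within 30 seconds")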

The 1990 paper tests four different processor architectures (i386, CVAX, Sparc, 68020) and five operating systems (4.3BSD, SunOS, AIX, Xenix, Dynix). The 1995 paper has similar platform diversity. In the first paper, 25-33% of utilities fail, depending on the platform. In the 1995 follow-on, the numbers range from 9%-33%, with GNU (on SunOS) and Linux being by far the least likely to crash.

The 1990 paper concludes that (1) programmers do not check array bounds or error codes, (2) macros make code hard to read and debug, and (3) C is very unsafe. The extremely unsafe gets function and C’s type system receive special mention. During testing, the authors discover format string vulnerabilities years before their widespread exploitation (see page 15). The paper concludes with a user survey asking about how often users fix or report bugs. Turns out reporting bugs was hard and there was little interest in fixing them.

The 1995 paper mentions open source software and includes a discussion of why it may have fewer bugs. It also contains this choice quote:

When we examined the bugs that caused the failures, a distressing phenomenon emerged: many of the bugs discovered (approximately 40%) and reported in 1990 are still present in their exact form in 1995. …

The techniques used in this study are simple and mostly automatic. It is difficult to understand why a vendor would not partake of a free and easy source of reliability improvements.

It would take another 15-20 years for fuzz testing to become standard practice at large software development shops.

I also found this statement, written in 1990 to be prescient of things to come:

Often the terseness of the C programming style is carried to extremes; form is emphasized over correct function. The ability to overflow an input buffer is also a potential security hole, as shown by the recent Internet worm.

Testing Methodology

Thankfully, after 30 years, Dr. Miller still provides full source code, scripts, and data to reproduce his results, which is a commendable goal that more researchers should emulate. The scripts and fuzzing code have aged surprisingly well. The scripts work as is, and the fuzz tool required only minor changes to compile and run.

For these tests, we used the scripts and data found in the fuzz-1995-basic repository, because it includes the most modern list of applications to test. As per the top-level README, these are the same random inputs used for the original fuzzing tests. The results presented below for modern Linux used the exact same code and data as the original papers. The only thing changed is the master command list to reflect modern Linux utilities.

Updates for 30 Years of New Software

Obviously there have been some changes in Linux software packages in the past 30 years, although quite a few tested utilities still trace their lineage back several decades. Modern versions of the same software audited in the 1995 paper were tested, where possible. Some software was no longer available and had to be replaced. The justification for each replacement is as follows:

  • cfe → cc1: This is a C preprocessor, equivalent to the one used in the 1995 paper.
  • dbx → gdb: This is a debugger, equivalent to the one used in the 1995 paper.
  • ditroff → groff: ditroff is no longer available.
  • dtbl → gtbl: A GNU Troff equivalent of the old dtbl utility.
  • lisp → clisp: A Common Lisp implementation.
  • more → less: Less is more!
  • prolog → swipl: There were two choices for Prolog: SWI Prolog and GNU Prolog. SWI Prolog won out because it is an older and more comprehensive implementation.
  • awk → gawk: The GNU version of awk.
  • cc → gcc: The default C compiler.
  • compress → gzip: GZip is the spiritual successor of the old Unix compress.
  • lint → splint: A GPL-licensed rewrite of lint.
  • /bin/mail → /usr/bin/mail: This should be an equivalent utility at a different path.
  • f77 → fort77: There were two possible choices for a Fortran77 compiler: GNU Fortran and Fort77. GNU Fortran is recommended for Fortran 90, while Fort77 is recommended for Fortran77 support. The f2c program is actively maintained, and its changelog records entries dating back to 1989.

Results

The fuzzing methods of 1989 still find bugs in 2018. There has, however, been progress.

Measuring progress requires a baseline, and fortunately, there is a baseline for Linux utilities. While the original fuzzing paper from 1990 predates Linux, the 1995 re-test uses the same code to fuzz Linux utilities on the 1995 Slackware 2.1.0 distribution. The relevant results appear on Table 3 of the 1995 paper (pages 7-9). GNU/Linux held up very well against commercial competitors:

The failure rate of the utilities on the freely-distributed Linux version of UNIX was second-lowest at 9%.

Let’s examine how the Linux utilities of 2018 compare to the Linux utilities of 1995 using the fuzzing tools of 1989:

              Ubuntu 18.10 (2018)  Ubuntu 18.04 (2018)  Ubuntu 16.04 (2016)  Ubuntu 14.04 (2014)  Slackware 2.1.0 (1995)
Crashes       1 (f77)              1 (f77)              2 (f77, ul)          2 (swipl, f77)       4 (ul, flex, indent, gdb)
Hangs         1 (spell)            1 (spell)            1 (spell)            2 (spell, units)     1 (ctags)
Total Tested  81                   81                   81                   81                   55
Crash/Hang %  2%                   2%                   4%                   5%                   9%

Amazingly, the Linux crash and hang count is still not zero, even for the latest Ubuntu release. The f2c program called by f77 triggers a segmentation fault, and the spell program hangs on two of the test inputs.

What Are The Bugs?

There are few enough bugs that I could manually investigate the root cause of some issues. Some results, like a bug in glibc, were surprising while others, like an sprintf into a fixed-sized buffer, were predictable.

The ul crash

The bug in ul is actually a bug in glibc. Specifically, it is an issue reported here and here (another person triggered it in ul) in 2016. According to the bug tracker it is still unfixed. Since the issue cannot be triggered on Ubuntu 18.04 and newer, the bug has been fixed at the distribution level. From the bug tracker comments, the core issue could be very serious.

f77 crash

The f77 program is provided by the fort77 package, which itself is a wrapper script around f2c, a Fortran77-to-C source translator. Debugging f2c reveals the crash is in the errstr function when printing an overly long error message. The f2c source reveals that it uses sprintf to write a variable length string into a fixed sized buffer:

errstr(const char *s, const char *t)
#endif
{
  char buff[100];
  sprintf(buff, s, t);
  err(buff);
}

This issue looks like it’s been a part of f2c since inception. The f2c program has existed since at least 1989, per the changelog. A Fortran77 compiler was not tested on Linux in the 1995 fuzzing re-test, but had it been, this issue would have been found earlier.

The spell Hang

This is a great example of a classical deadlock. The spell program delegates spell checking to the ispell program via a pipe. The spell program reads text line by line and issues a blocking write of line size to ispell. The ispell program, however, will read at most BUFSIZ/2 bytes at a time (4096 bytes on my system) and issue a blocking write to ensure the client received spelling data processed thus far. Two different test inputs cause spell to write a line of more than 4096 characters to ispell, causing a deadlock: spell waits for ispell to read the whole line, while ispell waits for spell to acknowledge that it read the initial corrections.
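
The pattern is easy to reproduce outside of spell. Here is a minimal Python sketch that deadlocks by design, with cat standing in for ispell: the parent issues one large blocking write while the child’s replies back up in a pipe that nobody is draining.

import subprocess

# cat echoes stdin to stdout, much as ispell echoes corrections back to spell.
child = subprocess.Popen(["cat"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

big_line = b"A" * (1 << 20)  # far larger than the (typically 64 KiB) pipe buffer
child.stdin.write(big_line)  # blocks once the stdin pipe fills...
# ...while cat blocks writing the echoed data to a stdout pipe we never read: deadlock.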

The units Hang

Upon initial examination this appears to be an infinite loop condition. The hang looks to be in libreadline and not units, although newer versions of units do not suffer from the bug. The changelog indicates some input filtering was added, which may have inadvertently fixed this issue. While a thorough investigation of the cause and correction was out of scope for this blog post, there may still be a way to supply hanging input to libreadline.

The swipl Crash

For completeness I wanted to include the swipl crash. However, I did not investigate it thoroughly, as the crash has been long-fixed and looks fairly benign. The crash is actually an assertion (i.e. a thing that should never occur has happened) triggered during character conversion:

[Thread 1] pl-fli.c:2495: codeToAtom: Assertion failed: chrcode >= 0
C-stack trace labeled "crash":
  [0] __assert_fail+0x41
  [1] PL_put_term+0x18e
  [2] PL_unify_text+0x1c4
…

It is never good when an application crashes, but at least in this case the program can tell something is amiss, and it fails early and loudly.

Conclusion

Fuzzing has been a simple and reliable way to find bugs in programs for the last 30 years. While fuzzing research is advancing rapidly, even the simplest attempts that reuse 30-year-old code are successful at identifying bugs in modern Linux utilities.

The original fuzzing papers do a great job at foretelling the dangers of C and the security issues it would cause for decades. They argue convincingly that C makes it too easy to write unsafe code and should be avoided if possible. More directly, the papers show that even naive fuzz testing still exposes bugs, and such testing should be incorporated as a standard software development practice. Sadly, this advice was not followed for decades.

I hope you have enjoyed this 30-year retrospective. Be on the lookout for the next installment of this series: Fuzzing In The Year 2000, which will investigate how Windows 10 applications compare against their Windows NT/2000 equivalents when faced with a Windows message fuzzer. I think that you can already guess the answer.

$10,000 research fellowships for underrepresented talent

The Trail of Bits SummerCon Fellowship program is now accepting applications from emerging security researchers with excellent project ideas. Fellows will explore their research topics with our guidance and then present their findings at SummerCon 2019. We will be reserving at least 50% of our funding for marginalized, female-identifying, transgender, and non-binary candidates. If you’re interested in applying, read on!

Why we’re doing this

Inclusion is a serious and persistent issue for the infosec industry. According to the 2017 (ISC)2 report on Women in Cybersecurity, only 11% of the cybersecurity workforce identify as women, a deficient proportion that hasn’t changed since 2013. Based on a 2018 (ISC)2 study, the issue is worse for women of color, who report facing pervasive discrimination, unexplained denial or delay in career advancement, exaggerated highlights of mistakes and errors, and tokenism.

Not only is this ethically objectionable, it makes no business sense. In 2012, McKinsey & Company found, with ‘startling consistency,’ that “for companies ranking in the top quartile of executive-board diversity, Returns on Equity (ROE) were 53 percent higher, on average, than they were for those in the bottom quartile. At the same time, Earnings Before Tax and Interest (EBTI) margins at the most diverse companies were 14 percent higher, on average, than those of the least diverse companies.”

The problem is particularly conspicuous at infosec conferences: a dearth of non-white non-male speakers, few female attendees, and pervasive reports of sexual discrimination. That’s why Trail of Bits and one of the longest-running hacker conferences, SummerCon, decided to collaborate to combat the issue. Through this fellowship, we’re sponsoring and mentoring emerging talent that might not otherwise get enough funding, mentorship, and exposure, and then shining a spotlight on their research.

Funding and mentorship to elevate your security research

The Trail of Bits SummerCon Fellowship provides awarded fellows with:

  • $10,000 grant to fund a six-month security research project
  • Dedicated research mentorship from a security engineer at Trail of Bits
  • An invitation to present findings at SummerCon 2019

50% of the program spots are reserved for marginalized, people of color, female-identifying, transgender, and non-binary candidates. Applicants of all genders, races, ethnicities, sexual orientations, ages, and abilities are encouraged to apply.

The research topics we’ll support

Applicants should bring a low-level programming or security research project that they’ve been wanting to tackle but have lacked the time or resources to pursue. They’ll have strong skills in low-level or systems programming, reverse engineering, program analysis (including dynamic binary instrumentation, symbolic execution, and abstract interpretation), or vulnerability analysis.

We’re especially interested in research ideas that align with our areas of expertise. That way, we can better support applicants. Think along the lines of:

How do I apply?

Apply here!

We’re accepting applications until January 15th. We’ll announce fellowship recipients in February.

Interested in applying? Go for it!

Submissions will be judged by a panel of experts from the SummerCon foundation, including Trail of Bits. Good luck!

CSAW CTF Crypto Challenge: Breaking DSA

The Trail of Bits cryptographic services team contributed two cryptography CTF challenges to the recent CSAW CTF. Today we’re going to cover the easier one, titled “Disastrous Security Apparatus – Good luck, ‘k?”

This problem involves the Digital Signature Algorithm (DSA) and the way an apparently secure algorithm can be made entirely insecure through surprising implementation quirks. The challenge relies on two bugs, one of which was the source of the PlayStation 3 firmware hack, while the other is a common source of security vulnerabilities across countless software products. Despite both of these issues having been known for many years, a large number of software developers (and even security engineers) are unfamiliar with them.

If you’re interested in solving the challenge yourself get the code here and host it locally. Otherwise, read on so you can learn to spot these sorts of problems in code you write or review.

Flags need capturing

Participants were given the source code (main.py) and an active HTTP server they could contact. This server was designed to look roughly like an online signing server. It had an endpoint that signed payloads sent to it and a partially implemented login system with password reset functionality.

The enumerated set of routes:

  • /public_key, which returned a DSA public key’s elements (p, q, g, y) as integers encoded in a JSON structure.
  • /sign/, which performed a SHA1 hash of the data passed, then signed the resulting hash with the DSA private key and returned two integers (r, s) in a JSON structure.
  • /forgotpass, which generated a URL for resetting a user’s password using random.getrandbits.
  • /resetpass, an unimplemented endpoint that returned a 500 if called.
  • /challenge, returned a valid Fernet token.
  • /capture, which, when presented with a valid DSA signature for a valid Fernet token, yielded the flag.

To capture the flag we’ll need to recover the DSA private key and use that to sign an encrypted payload from the /challenge endpoint. We then submit both the challenge value and the signature to /capture. This allows the server to verify you’ve recovered the private key. Let’s go!

DSA signing, the Disastrous Security Apparatus in action

A complete DSA key is made up of 5 values: p, q, g, x, and y.

p, q, g, and y are all public values. The /public_key endpoint on the server gives these values and can be used to verify that a given signature is valid. The private value, x, is what we need. A DSA signature is normally computed as follows

  1. First pick a k where 0 < k < q
  2. Compute the value r. Conceptually this is (g^k mod p) mod q. However, as g and k are both large numbers it is very slow to compute this value directly. Fortunately, modular exponentiation completes the calculation very quickly. In Python you can calculate this via the built-in pow function: pow(g, k, p) % q.
  3. Calculate the modular multiplicative inverse of k modulo q. That is, kinv such that (k * kinv) % q = 1
  4. Compute the hash of the message you want to sign. This particular code uses SHA1 and then converts the byte string into a big endian integer. To do this in Python: int.from_bytes(hashlib.sha1(data).digest(), 'big') (Python 3 required!)
  5. Finally, calculate s using kinv * (h + r * x) % q

The signer implementation in main.py conveniently possesses this exact code

def sign(ctf_key: DSAPrivateKeyWithSerialization, data: str) -> Tuple[int, int]:
  data = data.encode("ascii")
  pn = ctf_key.private_numbers()
  g = pn.public_numbers.parameter_numbers.g
  q = pn.public_numbers.parameter_numbers.q
  p = pn.public_numbers.parameter_numbers.p
  x = pn.x
  k = random.randrange(2, q)
  kinv = _modinv(k, q)
  r = pow(g, k, p) % q
  h = hashlib.sha1(data).digest()
  h = int.from_bytes(h, "big")
  s = kinv * (h + r * x) % q
  return (r, s)

To confirm that r and s are correct you can also perform a DSA verification.

  1. Compute w, the modular inverse of s modulo q
  2. Calculate u1 = (h * w) % q
  3. Calculate u2 = (r * w) % q
  4. Calculate v, defined as ((g ** u1) * (y ** u2)) % p % q. This will need to be done via modular exponentiation!

At this point v should be equal to r.
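
If you’d rather check this in code, here is a minimal sketch of the verification in Python, using the same names (p, q, g, y, r, s, h) as above. It is our illustration, not part of the challenge’s main.py, and it assumes Python 3.8+ for the modular inverse via pow:

def dsa_verify(p: int, q: int, g: int, y: int, r: int, s: int, h: int) -> bool:
    # Reject out-of-range signature components first.
    if not (0 < r < q and 0 < s < q):
        return False
    w = pow(s, -1, q)        # modular inverse of s mod q (Python 3.8+)
    u1 = (h * w) % q
    u2 = (r * w) % q
    # Modular exponentiation keeps this fast even with large p.
    v = (pow(g, u1, p) * pow(y, u2, p)) % p % q
    return v == r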

Tricksy math, ruining our security

We’ve seen the math involved in generating and verifying a DSA signature, but we really want to use the set of values we know to recover a value we do not (x, the private scalar). Recall this equation?

s = (kinv * (h + r * x)) % q

A DSA signature is composed of two values: r and s. We also know h is the value that is being signed and with a signing oracle we pick that value. Finally, we know q as that is part of the public key that is used to verify a DSA signature. This leaves us with two unknowns: kinv and x. Let’s solve for x:

  1. s = (kinv * (h + r * x)) % q
  2. (s * k) % q = (h + r * x) % q Note: multiplying both sides by k cancels kinv, since (k * kinv) % q = 1.
  3. ((s * k) - h) % q = (r * x) % q
  4. (rinv * ((s * k) - h)) % q = x

rinv is calculated just like kinv (the modular multiplicative inverse).

As you can see from the final equation, if we can determine the k used for any given signature tuple (r, s) then we can recover the private scalar. But k is generated via random.randrange so it’s not predictable.
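
To make that concrete, here is a minimal sketch of the recovery step, assuming you somehow know the k used for a particular signature (the helper name recover_x is ours, not the challenge’s, and it assumes Python 3.8+):

def recover_x(r: int, s: int, h: int, k: int, q: int) -> int:
    # x = rinv * (s*k - h) mod q, per the derivation above.
    rinv = pow(r, -1, q)     # modular inverse of r mod q (Python 3.8+)
    return (rinv * ((s * k) - h)) % q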

RNGs and global state oh my!

Random number generation is hard. Python’s random module uses a global singleton instance of the Mersenne Twister (MT) to provide a fast and statistically random RNG. However, MT is emphatically not a cryptographically secure pseudo-random number generator (CSPRNG). Both Python’s documentation and MT’s reference documentation call this out, but documenting dangerous APIs turns out to be insufficient to prevent misuse. In the case of MT, observing 624 consecutive 32-bit outputs is sufficient to reconstruct the internal state of the RNG and predict all future outputs, despite MT having a period of 2^19937 − 1. If a user can view the output of the MT RNG via another endpoint, they can use those outputs to predict the output of random.randrange. Enter /forgotpass, the Chekhov’s gun of this challenge.

/forgotpass is implemented as follows:

@app.route("/forgotpass")
def returnrand() -> str:
  # Generate a random value for the reset URL so it isn't guessable
  random_value = binascii.hexlify(
      struct.pack(">Q", random.getrandbits(64)))
  return "https://innitech.local/resetpass/{}".format(
    random_value.decode("ascii")
  )

So every call to that endpoint will get a random 64-bit integer packed in big endian form. But how do we turn this into a working MT instance?

Playing twister

We now know how to get chunks of data from the MT instance, but how do we process that data and use it to predict future output? First we need our own MT implementation:

class ClonedMersenneTwister:
    length = 624

    def __init__(self, state):
        self.state = state[:]
        self.index = 0

    def next(self):
        if self.index == 0:
            self.generate_numbers()

        y = self.state[self.index]
        y = y ^ (y >> 11)
        y = y ^ (y << 7) & 2636928640
        y = y ^ (y << 15) & 4022730752
        y = y ^ (y >> 18)

        self.index = (self.index + 1) % self.length
        return y

    def generate_numbers(self):
        for i in range(self.length):
            y = ((self.state[i] & 0x80000000) +
                 ((self.state[(i + 1) % self.length]) & 0x7fffffff))
            self.state[i] = self.state[(i + 397) % self.length] ^ (y >> 1)
            if y % 2:
                self.state[i] ^= 2567483615

You can see from the code in next that the internal state has a series of bit shifts, AND, and OR operations applied to it that the MT algorithm refers to as “tempering.” To recover the original state we’ll need to invert those operations.
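
A minimal sketch of that inversion is below; the helper names are ours, and the constants are the standard MT19937 tempering masks. Each loop simply re-applies the shift-and-xor until every affected bit of the 32-bit word has been recovered:

def undo_xor_rshift(y, shift):
    # Invert y = y ^ (y >> shift) over a 32-bit word.
    result = y
    for _ in range(32 // shift):
        result = y ^ (result >> shift)
    return result

def undo_xor_lshift_mask(y, shift, mask):
    # Invert y = y ^ (y << shift) & mask, recovering bits from the bottom up.
    result = y
    for _ in range(32 // shift):
        result = y ^ ((result << shift) & mask)
    return result & 0xFFFFFFFF

def untemper(y):
    # Undo the tempering steps in reverse order.
    y = undo_xor_rshift(y, 18)
    y = undo_xor_lshift_mask(y, 15, 4022730752)  # 0xEFC60000
    y = undo_xor_lshift_mask(y, 7, 2636928640)   # 0x9D2C5680
    y = undo_xor_rshift(y, 11)
    return y

Untempering each observed 32-bit output yields the corresponding internal state word, and 624 of them fill the cloned state array.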

Are you the Keymaster?

We have all the pieces. Let’s put them together.

First, we need to make calls to /forgotpass to obtain the internal state of the RNG and build a local clone. We’ll need to split the reset code at the end of the URL and turn it into two values of the internal state since it is 64-bits of data and we’re cloning a 32-bit instance of MT.
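
For example, one /forgotpass response can be turned back into two 32-bit MT outputs roughly like this (a sketch; in CPython, getrandbits(64) emits the low-order 32-bit word first, so the low half comes earlier in the MT output stream, which is worth double-checking on your Python version):

def reset_code_to_mt_outputs(reset_url: str):
    # The server hex-encodes struct.pack(">Q", random.getrandbits(64)).
    token = reset_url.rsplit("/", 1)[-1]
    value = int(token, 16)
    low, high = value & 0xFFFFFFFF, value >> 32
    return low, high     # in the order the underlying MT produced them

Repeating this 312 times yields the 624 consecutive outputs needed to rebuild the state, assuming nothing else on the server consumes the global RNG in between.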

Once that’s complete, we’ll make a call to /sign with some data we want to sign and get back r and s; any data will do. We can then use r, s, p, q, g, and the value we get from our cloned RNG (which is the k we predict the server will use) to solve for x.

To confirm the x we’ve calculated is correct, we can compute pow(g, x, p), the result of which will be equal to y.

Finally, we’ll make a call to /challenge to obtain a Fernet token, sign it with the private key (using SHA256 as the hash), and submit the token and signature to the /capture endpoint to capture the flag!
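
Signing the challenge token with the recovered key mirrors the server’s own sign function, only with SHA256 as described above. A minimal sketch, assuming x, p, q, and g have already been recovered or fetched, Python 3.8+ for the modular inverse, and a function name of our own choosing:

import hashlib
import secrets

def sign_with_recovered_key(token: bytes, x: int, p: int, q: int, g: int):
    # Standard DSA signing with the recovered private scalar x, hashing with SHA256.
    h = int.from_bytes(hashlib.sha256(token).digest(), "big")
    while True:
        k = secrets.randbelow(q - 2) + 2          # any nonce in [2, q) works now
        r = pow(g, k, p) % q
        s = (pow(k, -1, q) * (h + r * x)) % q     # pow(k, -1, q) needs Python 3.8+
        if r != 0 and s != 0:
            return (r, s)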

Wrapping it up

During the 36-hour CSAW finals, 28 of the 44 teams were able to capture this flag. That’s a pretty good success rate for a challenge that leveraged an unexpected relationship between the password reset token generation and the nonce generation for a DSA signature. Coupled with the brittleness of an algorithm like DSA, this apparently mild issue causes a catastrophic and unrecoverable breach of security, yet the majority of participating teams were able to exploit it.

In the real world where you may be building or auditing systems that deal with sensitive data like this, remember that the use of non-CSPRNG sources for randomness should be carefully investigated. If high performance or reproducibility of sequences is not a hard requirement then a CSPRNG is a better choice. If you do not have legacy constraints, then your systems should avoid signature algorithms with failure modes like this. Deterministic nonce generation (RFC 6979) can significantly mitigate risk, but, where feasible, more robust signing algorithms like ed25519 (RFC 8032) are a better choice.

10 Rules for the Secure Use of Cryptocurrency Hardware Wallets

Earlier this year, the Web3 Foundation (W3F) commissioned Trail of Bits for a security review and assessment of the risks in storing cryptocurrency. Everyone who owns cryptocurrency — from large institutions to individual enthusiasts — shares the W3F’s concerns. In service to the broader community, the W3F encouraged us to publish our recommendations for the secure use of hardware wallets: the small tamper-resistant peripherals that enable users to securely create and protect cryptocurrency accounts.

Whether your cryptocurrency holdings amount to a few Satoshis or a small fortune, you will find this post useful.

Today’s hardware wallets require diligent security procedures (Image credit: Gareth Halfacree)

The Advent of Hardware Wallets

In the early days of cryptocurrency, users simply generated their payment addresses (in cryptographic terms, their public/private key pairs) using the client software on a standard PC. Unfortunately, once cryptocurrency became a hot commodity, securely storing an account private key on a general-purpose computer — using a “software wallet” — became a liability. Software wallet files could be lost or deleted, and they were targeted for theft. Most users were unprepared for the hefty responsibility of securely and reliably storing their private keys. In part, this drove the adoption of custodial storage services, such as those offered by cryptocurrency exchanges.

Years of massive, unpunished thefts from these services convinced many users that they could never trust a third party with their cryptocurrency holdings the same way that they might trust a regulated bank holding fiat currency. So, in the past couple of years, hardware wallets have gained popularity as useful tools for protecting cryptocurrency accounts without relying on a custodial service.

A Foolproof Solution?

Hardware wallets are a kind of consumer-grade Hardware Security Module (HSM), with a similar purpose: a device that embodies a tamper-resistant vault, inside of which the user’s cryptographic identity (in this case, a cryptocurrency account) can be created and used without the private key ever leaving the device. Fundamentally, a hardware wallet only needs to take a transaction created on a host computer, sign it to make it valid, and output the signed transaction for the host computer to publish to the blockchain.

In practice, it’s not so simple. Users must properly initialize their wallets. Sometimes the devices have firmware updates. Then there’s the matter of recovery codes (also known as the BIP39 recovery phrase or seed words). Hardware wallets are a huge improvement over storing private keys on a sheet of paper in a fire safe, or in a directory on a laptop, but hardware wallets still carry risks. Users need to take some safety precautions. In the words of Bruce Schneier, “Security is a process, not a product.”

10 Rules for the Secure Use of Cryptocurrency Hardware Wallets

1: Purchase the device from a trusted source, preferably direct from the vendor, new and unopened

Avoid any unnecessary supply-chain risk. Buying a device directly from the manufacturer (e.g., Ledger or Trezor) rather than from a reseller minimizes the risk of acquiring a counterfeit or a device tampered with by a middleman. At least one malicious eBay reseller has reportedly devised clever schemes to defraud buyers even while selling them genuine and unopened products (see rule #3).

2: Never use a pre-initialized hardware wallet

If a user accepts a pre-initialized hardware wallet, they are putting their cryptocurrency into a wallet that is potentially just a copy of a wallet controlled by an attacker. Ensure that you (and only you) properly initialize your hardware wallet before use. Follow the initialization instructions from your hardware wallet vendor’s website (for example, the instructions for a Ledger brand wallet; instructions for a Trezor wallet).

The kind of prompt you want to see, new out of the box (Ledger Nano S pictured)

3: Never use a pre-selected set of recovery words, only ones generated on-device

Never accept pre-selected recovery words. Always initialize a hardware wallet from a clean slate with on-device generation of new random recovery words. Anyone that knows the recovery words has complete control over the wallet, the ability to watch it for activity, and the ability to steal all of its coins. Effectively, the words are the secret key.

In December 2017, a hardware wallet reseller reportedly packaged a counterfeit scratch-off card in the box with each device delivered to their customers. The scratch-off card revealed a list of recovery words, and the card instructed the buyer to set up their device using a recovery step, rather than initializing it to securely generate a new set of words. This was a clever scam to trick users into using a pre-configured wallet (see rule #2).

Beware reseller frauds like this one: official-looking pre-selected recovery words (Image credit: Ledger, Reddit user ‘moodyrocket’)

4: Prefer a device that is able to provide an attestation of its integrity

While resetting or initializing a device ought to be sufficient, there is hypothetically still a risk of buying a counterfeit or tampered hardware wallet. Before you buy one, confirm that you’ll be able to verify the provenance, authenticity, or integrity of the new hardware wallet. Look for software provided by the device maker that can interrogate a Secure Element on the device and provide an attestation of the device’s integrity. Follow the verification instructions from the vendor of your wallet (for example, Ledger’s instructions to use secure element attestation to check device integrity). There are, however, still gaps in the attestation capabilities of today’s wallets. Users ought to continue to demand better and more complete attestation.

5: Test your recovery words

Data protection 101 is “always test your backup”: in this case, your backup is the set of recovery words. Using a spare hardware wallet device, use the recorded recovery words to initialize the test wallet. Eliminate any doubt that the recorded words can successfully recover the original wallet’s state. After testing the correctness of the recovery words, reset/wipe this test device. Do not use a general-purpose computer or software wallet to verify the recovery words. Follow the instructions from your vendor for performing a recovery dry-run to test your seed words (the steps for Trezor wallet users and the steps for Ledger users).

6: Protect your recovery words separately and equally to the hardware wallet. Do not take a picture of them. Do not type them into anything.

Write the recovery words by hand — do not type them into a computer or photograph them to be printed — and then laminate the paper (preferably archival-quality acid-free paper for long-term storage). Store it in an opaque tamper-evident sealed envelope (example) for assurance that it has not been viewed without authorization. Remember that the device’s PIN code is no protection against an attacker with physical access if the recovery words are stored alongside the device. Do not store them together.

Write the words down, but don’t take a photo like this one!

7: Verify the software you use to communicate with the hardware wallet; understand that a backdoored desktop UI is part of your threat model

Hardware wallets rely on desktop software for initiating transactions, updating the hardware wallet’s firmware, and other sensitive operations. Users of cryptocurrency software should demand reproducible builds and code-signed executables to prevent tampering by an attacker post-installation. The advantage of code-signing, relative to manual verification with a tool like GPG, is that code signatures are automatically verified by the operating system on every launch of the application, whereas manual verification is typically only performed once, if at all. Even verifiable software, though, can still be subverted at runtime. Recognize that general-purpose computing devices are exposed to potentially risky data from untrusted sources on a routine basis.

8: Consider using a high assurance workstation, even with a hardware wallet

By dedicating a workstation to the single task of operating the hardware wallet, it can be locked down to a greater degree because it is not used for day-to-day tasks, nor exposed to as many potential sources of compromise. Consider operating your hardware wallet only from an immutable host PC configuration. This workstation would be offline only, and dedicated to the task of transaction creation and signing using the hardware wallet. First, lock down the system’s firmware configuration (e.g., restrict boot devices, disable network boot, etc.) to ensure the integrity of the boot process. Then, the boot media can be protected either by Secure Boot using a TPM-backed encrypted SSD / hard drive, or — for true immutability — by burning and verifying a trusted OS image onto a write-once DVD-R media and storing the DVD-R in a tamper-evident bag alongside the hardware wallet.

9: Consider an M-of-N multi-signature wallet with independently stored devices

“Multi-signature” refers to requiring more than one key to authorize a transaction. This is a fantastic protection against a single point of failure. Consider creating a multi-signature wallet with keys generated and kept in hardware wallets stored in physically separate locations. Note that if the devices will be in the custody of different individuals, carefully consider how to coordinate and make decisions to spend from the wallet. For added paranoia, the hardware wallets could be of different device brands. Then, even in the unlikely case that an employee at one of the hardware wallet manufacturers were to have successfully backdoored their devices, they would still only control one of the keys in your multi-signature wallet.

10: Consider manually verifying the generation of a new multi-signature address

Related to rules #7 and #8, note that multi-signature wallets are created by “joining” several private key-holders into a single address defined by a script. In the case of Bitcoin, this is called a P2SH address (“pay-to-script-hash”). This part of the address creation is done in the desktop software UI using public keys, not on the hardware wallet. If a compromised workstation provides the script basis during the generation of a new P2SH address, then the attacker may be able to join or control the multi-sig wallet. For example, attacker-controlled or -subverted desktop software could secretly turn a 2-of-3 wallet into a 2-of-5 wallet with two additional public keys inserted by the attacker. Remember, a hardware wallet does not entirely preclude the need to secure the host that interfaces with it.

More Secure, More Usable Solutions Still Needed

This discussion of the risks and recommendations surrounding cryptocurrency hardware wallets illustrates the challenges the broader security industry faces when designing other kinds of fixed-function devices for private key protection, such as U2F tokens and Secure Enclaves.

For well over a decade, security researchers have promoted the goal of “usable security.” Usable security is simply the idea that secure computing should be easy to do right, and hard to do wrong. Compare the usability of a modern secure messaging client, for example, with the cumbersome and error-prone key management required to use GPG. Getting usability right is the difference between protecting a few thousand technologists and protecting tens of millions of regular users.

Avoid complacency. Demand safer, better designed devices that aren’t prone to traps and mistakes. The best hardware wallet should be a little bit boring! We hope that in the future, safe and usable hardware wallets will be a commodity device that we can take for granted.

Until then, we will continue doing our part to build security awareness independently and in collaboration with organizations like the W3F. If you work for a company that creates hardware wallets, we welcome you to contact us for help protecting your users.

Return of the Blockchain Security Empire Hacking

Remember last December’s Empire Hacking? The one where we dedicated the event to sharing the best information about blockchain and smart contract security? Let’s do that again, and let’s make it a tradition: a half-day mini-conference focused exclusively on a single topic every December. On December 12, please join us at BuzzFeed’s NYC offices to hear 10 excellent speakers share their knowledge of blockchain security in an event that will assuredly expand your abilities.

Dinner will be served. We will congregate at The Headless Horseman afterwards to continue the conversation and for some holiday cheer.

Due to the nature of this event, we’ll be charging attendees $2.00 for entry. Only registered guests will be permitted to attend.

Reserve a spot while you can.

Talks will include:

Anatomy of an Unsafe Smart Contract Programming Language

This talk dissects Solidity: the most popular smart contract programming language. Various examples of its unsafe behavior are discussed, demonstrating that even an experienced, competent programmer can easily shoot themselves in the foot. These serve as a cautionary tale of how not to create a programming language and toolchain, particularly one that shall be trusted with hundreds of millions of dollars in cryptocurrency. The talk is concluded with a retrospective of how some of these issues could have been avoided, and what we can do to make smart contract development more secure moving forward.

Evan Sultanik is a security engineer from Trail of Bits.

Asset Insecurities: Evaluating Digital Asset Security Fundamentals

Spend a couple minutes learning about digital asset security ecosystem problems as faced at Coinbase scale. This will be a jaunt through insecure supply chain, the difference between a protocol white paper and the actual implementation, and a couple other things that’ll bite you if you’re not paying attention.

Shamiq herds cryptokitties, security engineers and developers at Coinbase as Head of Application Security. In his spare time, he loves to eat cheese and chocolate.

Designing the Gemini dollar: a regulated, upgradeable, transparent stablecoin

A regulated stablecoin requires important design decisions. How can you make your contracts upgradeable when many rely on them? How can you manage keys that protect the underlying assets? And how can you do this all completely transparently? In this talk, we explain the design decisions that went into the Gemini dollar, and compare and contrast with other possible implementations.

Brandon Arvanaghi is a security engineer at Gemini Trust.

Property testing with Echidna and Manticore for secure smart contracts

Property-based testing is an incredibly simple and powerful tool for bug discovery, but despite its efficacy, it’s almost unheard of in the smart contract development community. This talk will introduce the concept of property-based testing, discuss strategies for picking good properties and testing them thoroughly, then go into how to apply these ideas to smart contracts specifically. We’ll discuss the use of both Manticore and Echidna for testing, and look at real bugs these tools can find in production code.

JP Smith is a security engineer from Trail of Bits.

Contract upgrade risks and remediations

A popular trend in smart contract design is to promote the development of upgradable contracts. Existing techniques to upgrade contracts have flaws, increase the complexity of the contract significantly, and ultimately introduce bugs. We will detail our analysis of existing smart contract upgrade strategies, describe the weaknesses we have observed in practice, and provide recommendations for contracts that require upgrades.

Josselin Feist is a security engineer at Trail of Bits.

Failures in On-Chain Privacy

Many, including Satoshi, believed cryptocurrencies provided privacy for payments. In reality, cryptocurrency is Twitter for your bank account. Worse, the current set of decoy transaction–based approaches commonly believed to provide privacy—including coinjoin and cryptonote/Monero—provide fundamentally flawed privacy protections. Where did we go wrong? This talk covers how to critically evaluate the privacy provided by any proposed protocol for payment privacy. Through a series of thought experiments, it outlines three plausible attacks on existing decoy-based schemes. These issues show the unintuitive nature of privacy protections, as well as the need to both evaluate protocols in the context of real world threats, and use approaches with formal and peer reviewed privacy guarantees such as Zcash.

Ian Miers is a post-doctoral associate at Cornell Tech.

Secure Micropayment Protocols

Sending cryptocurrency micropayment transactions that must be confirmed on a blockchain is impractical today due to transaction fees that can exceed the value being sent. Instead, we can use micropayment protocols that only rely on the blockchain for settlement and disputes to minimize on-chain fees. In this talk, we will describe and compare different approaches to constructing secure micropayment protocols on top of Ethereum including probabilistic micropayments and payment channels. Furthermore, we will highlight the difficulties and considerations in implementing these types of protocols given the increased reliance on correct and timely client behavior to prevent the loss of funds.

Yondon Fu is a software engineer and researcher at Livepeer.

How To Buidl an Enterprise-Grade Mainnet Ethereum Client

The byzantine environment of the Ethereum mainnet is fraught with challenges for aspiring hackers seeking to publish a compatible client. This talk highlights the trials and tribulations of implementing a client capable of handily dispatching the adversarial events and actors of the sprawling P2P ecosystem that comprises the Ethereum blockchain’s world-wide compute network. The uniquely modular nature of the Pantheon codebase and its suitability for enterprise applications will be treated in detail. The session will conclude with a brief sketch of the road ahead for Pantheon with an eye towards the Ethereum Enterprise Alliance and the forthcoming updates that comprise the broad strokes of the Ethereum 2.0 specification.

S. Matthew English is a PegaSys protocol engineer and Pantheon core dev.

Simple is hard: Making your awesome security thing usable

If the security assumptions of blockchain systems fail even a little, they provide very little value. They also have a high barrier to entry and are hard to use. But wait, people already don’t use security tools — how isn’t this the worst of all possible worlds? We’ll talk about some precedents from infosec history and how we might be able to avoid “Your elections are fine as long as you use The New PGP on The Blockchain” in favor of helping people build cool things that really do solve longstanding problems in novel ways.

Patrick Nielsen and Amber Baldet are founders of Clovyr.

Like it or not, blockchain voting is here to stay

I’m going to talk about how blockchain voting apps received serious pushback from academics who study voting security, but that West Virginia used the Voatz app for some counties during primaries, used it in almost half the state in the midterm election, and is pleased with how it went. Voatz is already in talks with other states and is hoping for up to 20 states to use it by 2020. And several other countries are testing different blockchain voting apps.

Kevin Collier is the cybersecurity correspondent at BuzzFeed News, where he covers cyberwar, hackers, election security, disinformation efforts, tech companies, and hacking laws. Prior to BuzzFeed, Kevin covered cybersecurity at Vocativ and the Daily Dot, and has written for Politico, Gizmodo, The Daily Beast, and NY Mag. A native of West Virginia, he lives in Brooklyn.

We look forward to seeing you there!

Workshop: Smart-Contract Security Analysis (December 11)

On December 11th, the day prior to Empire Hacking, we’ll be hosting a security training for Ethereum smart contract developers.

In this day-long training, JP Smith will share how we conduct our security reviews; not just our tools or tricks, but the whole approach. In addition to that knowledge, we’ll share our school of thought regarding assessments. Far too often, we encounter the belief that audits deliver a list of bugs and, consequently, the ability to say “Our code has been audited!” (and therefore “Our code is safe!”). That’s just part of the picture. Audits should also deliver an assessment of total project risk, guidance on architectural and development lifecycle, and someone to talk to. That’s the framework attendees will come away with.

Register for the day-long training.

Trail of Bits @ Devcon IV Recap

We wanted to make up for missing the first three Devcons, so we participated in this year’s event through a number of talks, a panel, and two trainings. For those of you who couldn’t join us, we’ve summarized our contributions below. We hope to see you there next year.

Using Manticore and Symbolic Execution to Find Smart Contract Bugs

In this workshop, Josselin Feist showed how to use Manticore, our open-source symbolic execution engine. Manticore enables developers not only to discover bugs in their code immediately, but also to prove that their code works correctly. Josselin led 120 attendees through a variety of exercises with Manticore. Everyone left with hands-on formal methods that will help them ensure that their smart contracts follow their specifications.

Get the workshop’s slides and exercises

Blockchain Autopsies

In this lightning talk, Jay Little recovered and analyzed 30,000 self-destructed contracts, and identified possible attacks hidden among them. Two million contracts have been created on Ethereum’s mainnet, yet few holding any value have been destroyed. These high-signal transactions are difficult to find; many are not available to a fully synchronized Ethereum node. To achieve this feat, Jay created new tools that re-process blockchain ledger data, recreate contracts with state, and analyze suspect transactions using traces and heuristics.

Filtering deployment mistakes, DoS attacks, and spam to identify suspect self-destructs

Get Jay’s slides

Current State of Security

In this panel, Kevin Seagraves facilitated a discussion about Ethereum’s current security posture. What was the biggest change in Ethereum security in the last year? How is securing smart contracts different from traditional systems? How should we think about the utility of bug bounties? Hear what this panel of experts had to say:

Security Training

In this day-long training, JP shared how we conduct our security reviews; not just our tools or tricks, but the whole approach. In addition to that knowledge, we tried to impart our school of thought regarding assessments. Far too often, we encounter the belief that audits deliver a list of bugs and, consequently, the ability to say “Our code has been audited!” (and therefore “Our code is safe!”). That’s just part of the picture. Audits should also deliver an assessment of total project risk, guidance on architectural and development lifecycle, and someone to talk to.

We’re running the training again on December 11th in New York. Reserve yourself a seat.

Devcon Surprise

Instead of going to Devcon, Evan Sultanik stayed home and wrote an Ethereum client fuzzer. Etheno automatically seeks divergences among the world’s Ethereum clients, like the one that surfaced on Ropsten in October. Etheno automatically identified that same bug in two minutes.

We’re glad that we attended Devcon4, and look forward to participating more in future events.