Why I didn’t catch any Pokemon today

tl;dr While the internet went crazy today, we went fact finding. Here are our notes on Pokemon Go’s permissions to your Google account.

Here’s what Jay and I set out to do at around 6pm today:

  • Find what permissions Pokemon Go is actually requesting
  • Investigate what the permissions actually do
  • Replicate the permissions in a test app

Our first instinct was to go straight to the code, so we began by loading up the iOS app on a jailbroken phone. The Pokemon Go app uses jailbreak detection to prevent users with modified devices from accessing the game. As we have commonly found with such protections, they were trivial to bypass and, as a result, afforded no real protection. We recommend that firms contact us about MAST if they need more formidable application protection.

Niantic issues an OAuth request to Google with their scope set to the following (note: “scope” determines the level of access that Niantic has to your account and each requested item is a different class of data):

The OAuthLogin scope stands out in this list. It is mainly used by applications from Google, such as Chrome and the iOS Account Manager, though we were able to find a few Github projects that used it too.
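
To make “scope” concrete, here is a minimal sketch of how an OAuth 2.0 authorization request carries its scopes as a single space-delimited query parameter. The client ID, redirect URI, and the two scopes below are placeholders chosen for illustration; they are not the values Pokemon Go sent. We assume Google’s documented authorization endpoint.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Google's documented OAuth 2.0 authorization endpoint.
AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"

def build_auth_url(client_id, redirect_uri, scopes):
    """Build an authorization URL; scopes are joined with spaces per RFC 6749."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": " ".join(scopes),
    }
    return AUTH_ENDPOINT + "?" + urlencode(params)

def requested_scopes(url):
    """Recover the list of scopes from an authorization URL."""
    query = parse_qs(urlparse(url).query)
    return query["scope"][0].split(" ")

# Hypothetical values, for illustration only.
url = build_auth_url("example-client-id", "https://example.com/callback", ["openid", "email"])
print(requested_scopes(url))
```

Splitting the scope parameter of a captured request this way makes each requested class of data easy to inspect one at a time.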

It’s not possible to use this OAuth scope from Google’s own OAuth Playground. It only gives various “not authorized” error messages. This means that the OAuth Playground, Google’s own service for testing access to their APIs, is unable to exactly replicate the permissions requested by Pokemon Go.

It might be part of the OAuth 1.0 API, which was deprecated by Google in 2012 and shut down in 2015. If so, we’re not sure why Pokemon Go was able to use it. We checked, and accounts that migrate up to the OAuth 2.0 API are no longer able to access the older 1.0 API.

We found changelogs in the source code for Google Chrome that refer to this OAuth scope as the “Uber” token, which is passed with the “IssueUberAuth” GET parameter.

It does not appear possible to create our own app that uses this OAuth scope through normal or documented means. In order to properly test the level of access provided by this OAuth token, we would probably need to hook an app with access to one (e.g., via a Cydia hook).

The Pokemon Go login flow does not describe what permissions are being requested and silently re-enables them after they’ve been revoked. Further, the available documentation fails to adequately describe what token permissions mean to anyone trying to investigate them.

It’s clear that this access is not needed to identify user accounts in Pokemon Go. While we were writing this we expected Niantic to ultimately respond by reducing the privileges they request. By the time we hit publish, they had released a statement confirming they will.

For once, we agree with a lot of comments on Hacker News.

This seems like a massive security fail on Google’s part. There’s no reason the OAuth flow should be able to request admin privileges silently. As a user, I really must get a prompt asking me (and warning me!). — ceejayoz

We were able to query for specific token scopes through Google Apps for Work but we have not found an equivalent for personal accounts. Given that these tokens are nearly equivalent to passwords, it seems prudent to enable greater investigation and transparency about their use on all Google accounts for the next inevitable time that this occurs.

Google Apps for Work lets you query individual token scopes

By the time we got this far, Niantic had released a statement that confirmed they had far more access than needed:

We recently discovered that the Pokémon GO account creation process on iOS erroneously requests full access permission for the user’s Google account. However, Pokémon GO only accesses basic Google profile information (specifically, your User ID and email address) and no other Google account information is or has been accessed or collected. Once we became aware of this error, we began working on a client-side fix to request permission for only basic Google profile information, in line with the data that we actually access. Google has verified that no other information has been received or accessed by Pokémon GO or Niantic. Google will soon reduce Pokémon GO’s permission to only the basic profile data that Pokémon GO needs, and users do not need to take any actions themselves.

After Google and Niantic follow through with the actions described in their statement, this will completely resolve the issue. As best we can tell, Google plans to find the already issued tokens and “demote” them, in tandem with Niantic no longer requesting these permissions for new users.

Thanks for reading and let us know if you have any further details! Please take a second to review what apps you have authorized via the Google Security Checkup, and enable 2FA.

Update 7/12/2016: It looks like we were on the right track with the “UberAuth” tokens. This OAuth scope initially gains access to very little but can be exchanged for new tokens that allow access to all data in your Google account, including Gmail, through a series of undocumented methods. More details: https://gist.github.com/arirubinstein/fd5453537436a8757266f908c3e41538

Update 7/13/2016: The Pokemon Go app has been updated to request only basic permissions now. Niantic’s statement indicated they were going to de-privilege all the erroneously issued tokens themselves, but if you want to jump ahead of them, go to your App Permissions, revoke the Pokemon Go access, sign out of the Pokemon Go app, and then sign back in.

Start using the Secure Enclave Crypto API

tl;dr – Tidas is now open source. Let us know if your company wants help trying it out.

When Apple quietly released the Secure Enclave Crypto API in iOS 9 (kSecAttrTokenIDSecureEnclave), it allowed developers to liberate their users from the annoyance of strong passwords or OAuth.

That is, if the developers could make do without documentation.

The required attribute was entirely undocumented. The key format was incompatible with OpenSSL. Apple didn’t even say what cipher suite was used (it’s secp256r1). It was totally unusable in its original state. The app-developer community was at a loss.

We filled the gap

We approached this as a reverse-engineering challenge. Ryan Stortz applied his considerable skill and our collective knowledge of the iOS platform to figure out how to use this new API.

Once Ryan finished a working set of tools to harness the Secure Enclave, we took the next step. We released a service based on this feature: Tidas.

When your app is installed on a new device, the Tidas SDK generates a unique encryption key identifying the user and registers it with the Tidas web service. This key is stored on the client device in the Secure Enclave and is protected by Touch ID, requiring the user to use their fingerprint to unlock it. Client sign-in generates a digitally-signed session token that your backend can pass to the Tidas web service to verify the user’s identity. The entire authentication process is handled by our easy-to-use SDK and avoids transmitting users’ sensitive data. They retain their privacy. You minimize your liability.

tidas-login

David Schuetz, at NCC Group, assessed Tidas’s protocol in this tidy write-up. David’s graphic on the right accurately describes the Tidas wire protocol.

Tidas’s authentication protocol, combined with secure key storage in the Secure Enclave, provides strong security assurances and prevents attacks like phishing and replays. It significantly lowers the bar to adopting token-only authentication in a mobile-first development environment.

We saw enormous potential for security in enabling applications to use private keys that are safely stored outside of iOS and away from any potential malware. Think of easily unlocking your computer with a press of Touch ID, stronger password managers, and more trustworthy mobile payments.

We thought the benefits were clear, so we put together a website and released this product to the internet.

Today, Tidas becomes open source.

Since its February release, Tidas has raised a lot of eyebrows. The WSJ wrote an article about it. We spoke with a dozen different banks that wanted Tidas for its device-binding properties and potential reduction in fraud. Meanwhile, we courted mobile app developers directly for trial runs.

Months later, none of this potential has resulted in clients.

Authentication routines are the gateway to your application. The developers we spoke with were unwilling to modify them in the slightest if it risked locking out honest paying customers.

Banks liked the technology, but none would consider purchasing a point solution for a single device (iOS).

So, Tidas becomes open source today. All its code is available at https://github.com/tidas. If you want to try using the Secure Enclave on your own, check out our DIY toolkit: https://github.com/trailofbits/SecureEnclaveCrypto. It resolves all the Apple problems we mentioned above by providing an easy-to-use wrapper around the Secure Enclave API. Integration with your app could not be easier.

If your company is interested in trying it out and wants help, contact us.

It’s time to take ownership of our image

Gloves
Goggles
Checkered body suits

The representation of hackers in stock media spans a narrow band of reality between the laughable and the absurd.

It overshadows the fact that lots of hackers are security professionals. They may dress differently, but they serve a critical function in the economy.

It’s easy to satirize the way the media and Hollywood portray hackers. Dorkly and Daniel J. Solove have excellently skewered many of them.

What’s harder, and more productive, would be a repository of stock assets of real-life hackers wearing, yes, hoodies, but also more formal attire. Some scenes may show dark rooms at night. Others will be in daytime offices.

If the media used the repository, maybe it’d change the public’s perception. Maybe it would show aspiring hackers, boys and girls, that we’re just like them, and that if they work hard they could join our ranks.

We’re kicking off this “Hacker Anthology” by contributing stock video footage of our own employees and a hacker typer script that we made last year for fun.

In a few weeks, I’ll be in Las Vegas for Blackhat and Defcon with many of you. If there’s enough interest, I’ll hire a photographer for a few hours to build up our portfolio of stock photos. It should be a fun time. Get in touch with me if you’d be interested in contributing.

—–

I pored over dozens of truly awful and hilarious photos while writing this blog post. Here are some of my favorites that I stumbled upon from around the net:

I have met DAOAttacker and can confirm this is what they look like:

Play a hacker on TV, become a hacker in real life:

One of my favorite novelty Twitter accounts:

In some cases, bad stock photography can be physically harmful:
stock-image-fail-soldering-iron-bob-byron-1

I, too, look intently at screens that are turned off:
stock-photo-88593521-scientist-uses-modern-technology-for-its-research

If I had a nickel for every time I saw this photo used:
depositphotos_11605816-Security-concept-lock-on-digital

Alex Sotirov schooling the kids on cyberpunk style before the Hackers 15th anniversary party:
shot0737

What are your favorite hacker stock photos? Leave a comment below.

2000 cuts with Binary Ninja

Using Vector35’s Binary Ninja, a promising new interactive static analysis and reverse engineering platform, I wrote a script that generated “exploits” for 2,000 unique binaries in this year’s DEFCON CTF qualifying round.

If you’re wondering how to remain competitive in a post-DARPA DEFCON CTF, I highly recommend you take a look at Binary Ninja.

Before I share how I slashed through the three challenges — 334 cuts, 666 cuts, and 1,000 cuts — I have to acknowledge the tool that made my work possible.

Compared to my experience with IDA, which is held together with duct tape and prayers, Binary Ninja’s workflow is a pleasure. It does analysis on its own intermediate language (IL), which is exposed through Python and C++ APIs. It’s comparatively simple to query blocks of code, functions, trace execution flow, query register states, and many other tasks that seem herculean within IDA.

This brought a welcome distraction from the slew of stack-based buffer overflows and unhardened heap exploitation that have come to characterize DEFCON’s CTF.

Since the original point of CTF competitions was to help people improve, I limited my options to what most participants could use. Without Binary Ninja, I would have had to:

  1. Use IDA and IDAPython, a more expensive and unpleasant proposition.
  2. Develop a Cyber Reasoning System, an unrealistic option for most participants.
  3. Reverse the binaries by hand, effectively impossible given the number of binaries.

None of these are nearly as attractive as Binary Ninja.

How Binary Ninja accelerates CTF work

This year’s qualifying challenges were heavily focused on preparing competitors for the Cyber Grand Challenge (CGC). A full third of the challenges were DECREE-based. Several required CGC-style “Proof of Vulnerability” exploits. This year the finals will be based on DECREE so the winning CGC robot can ‘play’ against the human competitors. For the first time in its history, DEFCON CTF is abandoning the attack-defense model.

Challenge #1 : 334 cuts

334 cuts
http://download.quals.shallweplayaga.me/22ffeb97cf4f6ddb1802bf64c03e2aab/334_cuts.tar.bz2
334_cuts_22ffeb97cf4f6ddb1802bf64c03e2aab.quals.shallweplayaga.me:10334

The first challenge, 334 cuts, didn’t offer much in terms of direction. I started by connecting to the challenge service:

$ nc 334_cuts_22ffeb97cf4f6ddb1802bf64c03e2aab.quals.shallweplayaga.me 10334
send your crash string as base64, followed by a newline
easy-prasky-with-buffalo-on-bing

Okay, so it wants us to crash the service. No problem: I already had a crashing input string for that service from a previous challenge.

$ nc 334_cuts_22ffeb97cf4f6ddb1802bf64c03e2aab.quals.shallweplayaga.me 10334
send your crash string as base64, followed by a newline
easy-prasky-with-buffalo-on-bing
YWFhYWFhYWFhYWFhYWFhYWFhYWFsZGR3YWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYQo=
easy-biroldo-with-mayonnaise-on-multigrain

I wasn’t expecting a second challenge name after the first. I’m guessing I’m going to need to crash a few services now. Next I extracted the tarball.

$ tar jxf 334_cuts.tar.bz2
$ ls 334_cuts
334_cuts/easy-alheira-with-bagoong-on-ngome*
334_cuts/easy-cumberland-with-gribiche-on-focaccia*
334_cuts/easy-kielbasa-with-skhug-on-scone*
334_cuts/easy-mustamakkara-with-pickapeppa-on-soda*
334_cuts/easy-alheira-with-garum-on-pikelet*
334_cuts/easy-cumberland-with-khrenovina-on-white*
334_cuts/easy-krakowska-with-franks-on-pikelet*
334_cuts/easy-mustamakkara-with-shottsuru-on-naan*
...
$ ls 334_cuts | wc -l
334

Hmm, there are 334 DECREE challenge binaries, all with food-related names. Well, time to throw them into Binja. Starting with easy-biroldo-with-mayonnaise-on-multigrain. DECREE challenge binaries are secretly ELF binaries (as used on Linux and FreeBSD), so they load just fine with Binja’s ELF loader.

Binary Ninja has a simple and smooth interface

This challenge binary is fairly simple and nearly identical to easy-prasky-with-buffalo-on-bing. Each challenge binary is stripped of symbols, has a static stack buffer, a canary, and a stack-based buffer overflow. The canary is copied to the stack and checked against a hard-coded value. If the canary is overwritten with anything else, the challenge terminates cleanly instead of crashing. Any overflow therefore has to overwrite the canary with the expected value. It turns out all 334 challenges only differ in four ways:

  1. The size of the buffer you overflow
  2. The canary string and its length
  3. The size of the stack buffer in the recvmsg function
  4. The amount of data the writemsg function processes for each iteration of its write loop

Our crashing string has to both overflow the stack buffer exactly and pass the canary check in each of the 334 binaries. It’s best to automate collecting this information. Thankfully Binja can be used as a headless analysis engine from Python!

We start by importing Binja into our Python script and creating a binary view. The binary view is our main interface to Binja’s analysis.

I was initially trying to create a generic solution without looking at the majority of the challenge binaries, so I found the main function programmatically. I did that by starting at the entry point and knowing that it made two calls.

From the entry point, I knew there were two calls with the second being the one I wanted. Similarly, I knew the next function had one call and the call was the one I wanted to follow to main. All my analysis used Binja’s LowLevelIL.
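
The call-following logic can be sketched without Binja at all. In the snippet below, a plain dictionary stands in for the call information that Binja’s LowLevelIL would provide. The function names and the find_main helper are hypothetical; they are not part of the Binary Ninja API or of my original script.

```python
# Toy call graph: each function maps to the targets of its call
# instructions, in order. In the real script this information comes
# from walking Binja's LowLevelIL; the names here are made up.
CALLS = {
    "_start": ["init_runtime", "start_wrapper"],
    "start_wrapper": ["main"],
}

def find_main(calls, entry="_start"):
    """Follow the second call from the entry point, then the only call after it."""
    entry_calls = calls[entry]
    if len(entry_calls) != 2:
        raise ValueError("expected exactly two calls from the entry point")
    wrapper = entry_calls[1]          # the second call is the one we want
    wrapper_calls = calls[wrapper]
    if len(wrapper_calls) != 1:
        raise ValueError("expected exactly one call in the intermediate function")
    return wrapper_calls[0]           # its single call leads to main

print(find_main(CALLS))
```

Failing loudly when the call counts don’t match is a deliberate choice here: with hundreds of binaries, a silent wrong guess about main would be much harder to spot than an exception.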

Once we have our reference to main, the real fun begins.

Binary Ninja in LowLevelIL mode

The first thing we needed to figure out was the canary string. The approach I took was to collect references to all the call instructions:

Then I knew that the first call was to a memcpy, the second was to recvmsg, and the third was to the canary memcmp. Small hiccup here: sometimes the compiler would inline the memcpy. This happened when the canary string was less than 16 bytes long.

This Challenge Binary has an inline memcpy.😦

This was a simple fix, as I now counted the number of calls in the function and adjusted my offsets accordingly:
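
Here is a minimal sketch of that adjustment, again in plain Python rather than against the Binja API. It assumes the call count for a main with an out-of-line memcpy is already known from inspecting one binary. The names pick_calls and full_call_count, and the addresses, are made up for illustration.

```python
def pick_calls(call_sites, full_call_count):
    """Return the (recvmsg, memcmp) call sites from main's ordered call list.

    With an out-of-line memcpy the order is memcpy, recvmsg, memcmp, ...
    When the compiler inlines memcpy there is one call fewer, so every
    later call shifts down by one index.
    """
    missing = full_call_count - len(call_sites)
    if missing not in (0, 1):
        raise ValueError("unexpected number of calls in main")
    recvmsg_idx = 1 - missing
    memcmp_idx = 2 - missing
    return call_sites[recvmsg_idx], call_sites[memcmp_idx]

# Made-up addresses: one binary with a real memcpy call, one with it inlined.
normal = [0x8048100, 0x8048120, 0x8048140, 0x8048160]
inlined = [0x8048120, 0x8048140, 0x8048160]
print(pick_calls(normal, 4) == pick_calls(inlined, 4))
```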

To extract the canary and size of the canary buffer, I used the newly introduced get_parameter_at() function. This function is fantastic: at any caller site, it allows you to query the function parameters with respect to calling convention and system architecture. I used it to query all the parameters for the call to memcmp.

Next I needed to know how big the buffer to overflow was. To do this, I once again used get_parameter_at() to query the first argument for the read_buf call. This points to the stack buffer we’ll overflow. We can calculate its size by subtracting the offset of the canary’s stack buffer.

It turns out the other two variables were inconsequential. These two bits of information were all we needed to craft our crashing string.
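
Putting those two pieces together looks roughly like this. This is a sketch, not my original script: the filler byte, the amount of extra overflow, and the example offsets and canary are assumptions chosen for illustration.

```python
import base64

def buffer_size(buffer_offset, canary_offset):
    """Distance between the overflowed buffer and the canary's stack buffer."""
    return abs(canary_offset - buffer_offset)

def build_crash_string(size, canary, overflow=64):
    """Fill the buffer, rewrite the canary with its expected value, then keep going."""
    payload = b"a" * size        # fill the stack buffer exactly
    payload += canary            # pass the canary check
    payload += b"a" * overflow   # run past the canary to corrupt the stack
    payload += b"\n"
    # The service expects the crash string base64-encoded.
    return base64.b64encode(payload)

# Hypothetical values for a single binary.
size = buffer_size(buffer_offset=-0x40, canary_offset=-0x2c)
print(build_crash_string(size, b"lddw", overflow=8).decode())
```

The caller still has to send the encoded string followed by a newline, as the service prompt asks.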

I glued all this logic together and threw it at the 334 challenge. It prompted me for 10 crashing strings before giving me the flag: baby's first crs cirvyudta.

Challenge #2: 666 cuts

666 cuts
http://download.quals.shallweplayaga.me/e38431570c1b4b397fa1026bb71a0576/666_cuts.tar.bz2
666_cuts_e38431570c1b4b397fa1026bb71a0576.quals.shallweplayaga.me:10666

To start, I once again connected with netcat:

$ nc 666_cuts_e38431570c1b4b397fa1026bb71a0576.quals.shallweplayaga.me 10666
send your crash string as base64, followed by a newline
medium-chorizo-with-chutney-on-bammy

I’m expecting 666 challenge binaries.

$ tar jxf 666_cuts.tar.bz2
$ ls 666_cuts
666_cuts/medium-alheira-with-khrenovina-on-marraqueta*
666_cuts/medium-cumberland-with-hollandaise-on-bannock*
666_cuts/medium-krakowska-with-doubanjiang-on-pita*
666_cuts/medium-newmarket-with-pickapeppa-on-cholermus*
…
$ ls 666_cuts | wc -l
666

Same game as before, I throw a random binary into binja and it’s nearly identical to the set from 334. At this point I wonder if the same script will work for this challenge. I modify it to connect to the new service and run it. The new service provides 10 challenge binary names to crash and my script provides 10 crashing strings, before printing the flag: you think this is the real quaid DeifCokIj.

Challenge #3: 1000 cuts

1000 cuts
http://download.quals.shallweplayaga.me/1bf4f5b0948106ad8102b7cb141182a2/1000_cuts.tar.bz2
1000_cuts_1bf4f5b0948106ad8102b7cb141182a2.quals.shallweplayaga.me:11000

You get the idea, 1000 challenges, same script, flag is: do you want a thousand bandages gruanfir3.

Room For Improvement

Binary Ninja shows a lot of promise, but it still has a ways to go. In future versions I hope to see the addition of SSA and a flexible type system. Once SSA is added to Binary Ninja, it will be easier to identify data flows through the application, tell when types change, and determine when stack slots are reused. It’s also a foundational feature that helps build a decompiler.

Conclusion

From its silky smooth graph view to its intermediate language to its smart integration with Python, Binary Ninja provides a fantastic interface for static binary analysis. With minimal effort, it allowed me to extract data from 2000 binaries quickly and easily.

That’s the bigger story here: It’s possible to enhance our capabilities and combine mechanical efficiency with human intuition. In fact, I’d say it’s preferable. We’re not going to become more secure if we rely on machines entirely. Instead, we should focus on building tools that make us more effective; tools like Binary Ninja.

If you agree, give Binary Ninja a chance. In less than a year of development, it’s already punching above its weight class. Expect more fanboyism from myself and the rest of Trail of Bits — especially as Binary Ninja continues to improve.

My (slightly updated) script is available here. For the sake of history, the original is available here.

Binary Ninja is currently in a private beta and has a public Slack.

Empire Hacking Turns One

In the year since we started this bi-monthly meetup, we’ve been thrilled by the community that it has attracted. We’ve had some excellent presentations on pragmatic security research, shared our aspirations and annoyances with our work, and made some new friends. It’s a wonderful foundation for an even better year two!

To mark the group’s ‘birthday,’ we took a moment to reflect on all that has happened.

By the numbers:

  • 312 – Number of members on meetup.com
  • 75 – Largest turnout for a single event
  • 46 – Times Jay said “there’s a Python module for that”
  • 785 – Beers served
  • 14 – Superb presentations given
  • 154 – Members on Empire Slacking, a Slack organization for our members

Presentations

June 2015

Offense at Scale

  • Chris Rohlf from Yahoo discussed the effects of scale on vulnerability research, fuzzing and real attack campaigns.

Automatically proving program termination (and more!)

  • Dr. Byron Cook, Professor of Computer Science at University College London, shared research advances that have led to practical tools for automatically proving program termination and related properties.

Cellular Baseband Exploitation

  • Nick DePetrillo, one of our security engineers, explored the challenges of reliable, large-scale cellular baseband exploitation.

August 2015

Exploiting the Nintendo 3DS

  • Luke Arntson, a hobbyist security researcher, reverse engineer, and hardware hacker, highlighted the origins of the Nintendo DS Profile exploit, the obfuscated Gateway browser exploit, and the payloads used by both.

Trail of Bits Cyber Grand Challenge (CGC) Demo

  • Ryan Stortz, one of our security engineers, described the high-level architecture of the system we built to fight and destroy insecure software as part of a DARPA competition, how well it worked, and difficulties we overcame during the development process.

OS X Malware

  • Jay Little, another of our security engineers, gave a code review of Hacking Team’s OS X kernel rootkit in just 10 minutes.

October 2015

The PointsTo Use-After-Free Detector

  • Peter Goodman, our very own dynamic binary translator, presented the design of PointsTo, an LLVM-based static analysis system that automatically finds use-after-free vulnerabilities in large codebases.

Protecting Virtual Function Calls in COTS C++ Binaries

  • Aravind Prakash, an assistant professor in the Dept. of Computer Science at Binghamton University, showed how vfGuard protects virtual function calls in C++ from control subversion attacks.

December 2015

Exploiting Out-of-Order Execution for Covert Cross-VM Communication

  • Sophia D’Antoine, one of our security engineers, demonstrated a novel side channel that exploits out-of-order execution to enable cross-VM communication.

Experiments building and visualizing hypergraphs of security data

  • Richard Lethin, President of Reservoir Labs, discussed data structures and algorithms that enable the representation and analysis of big data (such as security logs) as hypergraphs.

February 2016

Reverse Engineering the Tytera MD380 2-way Radio

  • Travis Goodspeed, a neighbor, explained how the handheld digital radio was jailbroken to allow for patching and firmware extraction, as well as the tricks used to patch the firmware for new features, such as promiscuous mode and a secondary application.

The Mobile Application Security Toolkit (MAST)

  • Sophia D’Antoine addressed the design of the Mobile Application Security Toolkit (MAST) which ties together jailbreak detection, anti-debugging, and anti-reversing in LLVM to address these risks.

April 2016

Putting the Hype in Hypervisor

  • Brandon Falk, a software security researcher, operating system developer, and fuzzing enthusiast, presented various ways of gathering code coverage information without binary modification and how to use code coverage to direct fuzzing.

Crypto Challenges and Fails

  • Ben Agre, a computer security consultant, distinguished successful crypto challenges from failures through the lens of challenges offered by RSA, Telegram, and several smaller examples.

Join us on Empire Slacking

Last September, we created a Slack organization for our members. That’s where we discuss meetups, the latest security news, and our open-source projects. Everyone is welcome. Join through our auto-inviter, and feel free to share the link: https://empireslacking.herokuapp.com/

Big thanks to our event partners

WeWork hosted all but one of our meetups. The April 2016 meetup took place at Digital Ocean. We are very grateful for their hosting.

We would also like to thank the New York C++ Developers Group for co-hosting our October 2015 meetup.

With all that momentum, we’re excited for the year ahead.

Speaking of the future…

Next Meetup: June 7 at 6pm

Marcin Wielgoszewski will be speaking about Doorman, an osquery fleet manager. Doorman makes it easy for network administrators to monitor the security of thousands of devices with osquery. Doorman is open-source and under active development.

Following Marcin, Nick Esposito of Trail of Bits will discuss the design of Tidas, a solution for password-free authentication for iOS software developers. Tidas takes advantage of our unique capability to generate and store ECC keys inside the Secure Enclave. Hear all about how we built Tidas at the next Empire Hacking.

Our June event is hosted at Spotify. Beverages and light food will be provided. Space is limited, so please RSVP on the meetup page.

Don’t miss it!

ProtoFuzz: A Protobuf Fuzzer

At Trail of Bits, we occasionally perform source code audits. A recent one targeted an application that used Protocol Buffers extensively.

Google’s Protocol Buffers (protobuf) is a common method of serializing data, typically found in distributed applications. Protobufs simplify the generally error-prone task of parsing binary data by letting a developer define the type of data, and letting a protobuf compiler (protoc) generate all the serialization and deserialization code automatically.

Fuzzing a service expecting protobuf-encoded structures directly is not likely to achieve satisfactory code coverage. First, protobuf deserialization code is fairly mature and has seen scrutiny. Second, we are not typically interested in flaws in the protobuf implementation itself. Our main goal is to target the code behind protobuf decoding. Our aim, then, is to create valid protobuf-encoded structures that are composed of malicious values.

Protobuf is in sufficiently widespread use that we found it worthwhile to create a generic Protobuf message generator to help with assessments. The message generator is a Python 3 library with a simple interface: given a protobuf definition, it creates Python generators for various permutations of all defined messages. We call it ProtoFuzz.

For the data itself, we use the fuzzdb database as the source of generated values, but it’s relatively straightforward to define your own collection of values.

Installation

To install on Ubuntu:

pip install py3-protobuffers
sudo add-apt-repository -y ppa:5-james-t/protobuf-ppa
sudo apt-get -qq update
sudo apt-get -y install protobuf-compiler
git clone --recursive git@github.com:trailofbits/protofuzz.git
cd protofuzz/
python3 setup.py install

Usage

Message generation is handled by ProtobufGenerator instances. Each instance backs a Protobuf-produced class. This class has two functions: create fuzzing strategies and create field dependencies.

A fuzzing strategy defines how fields are permuted. So far just two are defined: linear and permutation. A linear strategy creates a stream of protobuf objects that are the equivalent of Python’s zip() across all values that can be generated. A permutation strategy produces a stream that is a cartesian product of all the values that can be generated. The linear() strategy can be used to get a sense of the kinds of values that will be generated without creating a multitude of values.
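
The difference between the two strategies is easiest to see on plain lists. The sketch below does not use ProtoFuzz itself: the value lists are stand-ins we made up, and linear and permute here are ordinary functions that only mirror the idea.

```python
import itertools

# Stand-in value sources for two fields; ProtoFuzz draws its values from fuzzdb.
house_values = [-1, 0, 256]
street_values = ["!", "!'", "A" * 16]

def linear(*field_values):
    """Pair values position by position, like zip()."""
    return list(zip(*field_values))

def permute(*field_values):
    """Every combination of values: the cartesian product of all lists."""
    return list(itertools.product(*field_values))

print(len(linear(house_values, street_values)))   # 3 messages
print(len(permute(house_values, street_values)))  # 9 messages
```

With only two fields of three values each, the product is already three times larger than the zip. That gap grows multiplicatively with every extra field, which is why a linear pass is a cheap way to preview the values first.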

Field dependencies force the values of some fields to be created from the values of others via any callable object. This is used for fields that probably shouldn’t be fuzzed, like lengths, CRC checksums, magic values, etc.
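
As a rough illustration of the idea (not ProtoFuzz’s implementation), a dependency can be thought of as a (target, source, callable) triple that is applied after the other fields have been fuzzed. The dictionary-based message and the resolve_dependencies helper below are assumptions made for this sketch.

```python
import zlib

def resolve_dependencies(message, dependencies):
    """Overwrite each target field with action(source field value)."""
    resolved = dict(message)
    for target, source, action in dependencies:
        resolved[target] = action(resolved[source])
    return resolved

deps = [
    ("length", "body", len),
    ("checksum", "body", lambda body: zlib.crc32(body.encode("utf-8"))),
]

# The fuzzer picked the body; length and checksum are derived from it.
fuzzed = {"length": 0, "checksum": 0, "body": "!@#$"}
print(resolve_dependencies(fuzzed, deps))
```

Because the derived fields stay consistent with the fuzzed ones, the message is more likely to get past early sanity checks and reach the code we actually want to test.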

The entry point into the library is the `protofuzz.protofuzz` module. It defines three functions:

protofuzz.from_description_string()

Create a dict of ProtobufGenerator objects from a string Protobuf definition.

from protofuzz import protofuzz
message_fuzzers = protofuzz.from_description_string("""
    message Address {
     required int32 house = 1;
     required string street = 2;
    }
""")
for obj in message_fuzzers['Address'].permute():
    print("Generated object: {}".format(obj))
Generated object: house: -1
street: "!"

Generated object: house: 0
street: "!"

Generated object: house: 256
street: "!"

protofuzz.from_file()

Create a dict of ProtobufGenerator objects from a path to a .proto file.

from protofuzz import protofuzz
message_fuzzers = protofuzz.from_file('test.proto')
for obj in message_fuzzers['Person'].permute():
    print("Generated object: {}".format(obj))
Generated object: name: "!"
id: -1
email: "!"
phone {
  number: "!"
  type: MOBILE
}

Generated object: name: "!\'"
id: -1
email: "!"
phone {
  number: "!"
  type: MOBILE
}
...

protofuzz.from_protobuf_class()

Create a ProtobufGenerator from an already-loaded Protobuf class.

Creating Linked Fields

Some fields shouldn’t be fuzzed. For example, fields like magic values, checksums, and lengths should not be mutated. To this end, protofuzz supports resolving selected field values from other fields. To create a linked field, use ProtobufGenerator’s add_dependency method. Dependencies can also be created between nested objects. For example,

fuzzer = protofuzz.from_description_string('''
message Contents {
  required string header = 1;
  required string body = 2;
}
message Payload {
  required int32 length = 1;
  required Contents contents = 2;
}
''')

fuzzer['Payload'].add_dependency('length', 'contents.body', len)
for idx, obj in zip(range(3), fuzzer['Payload'].permute()):
  print("Generated object: {}".format(obj))
Generated object: length: 1
contents {
  header: "!"
  body: "!"
}

Generated object: length: 2
contents {
  header: "!"
  body: "!\'"
}

Generated object: length: 29
contents {
  header: "!"
  body: "!@#$%%^#$%#$@#$%$$@#$%^^**(()"
}
...

Miscellaneous

Although not related to fuzzing directly, ProtoFuzz also includes a simple logging class that’s implemented as a ring buffer to aid in fuzzing campaigns. See protofuzz.log.
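
For readers unfamiliar with the idea, a ring-buffer logger keeps only the most recent messages, so after a crash you can see what was sent last without storing an entire campaign. The class below is a minimal sketch of that concept using the standard library; it is not the class that ships with the library, and the RingLogger name is ours.

```python
from collections import deque

class RingLogger:
    """Keep only the most recent `capacity` messages."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when full.
        self._entries = deque(maxlen=capacity)

    def log(self, message):
        self._entries.append(message)

    def dump(self):
        return list(self._entries)

logger = RingLogger(capacity=3)
for i in range(5):
    logger.log("message {}".format(i))
print(logger.dump())
```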

Conclusion

We created Protofuzz to assist with security assessments. It gave us the ability to quickly test message-handling code with minimal ramp up.

The library itself is implemented with minimal dependencies, making it appropriate for integration with continuous integration (CI) and testing tools.

If you have any questions, please feel free to reach out at yan@trailofbits.com or file an issue.

The DBIR’s ‘Forest’ of Exploit Signatures

If you follow the recommendations in the 2016 Verizon Data Breach Investigations Report (DBIR), you will expose your organization to more risk, not less. The report’s most glaring flaw is the assertion that the TLS FREAK vulnerability is among the ‘Top 10’ most exploited on the Internet. No experienced security practitioner believes that FREAK is widely exploited. Where else did Verizon get it wrong?

This question undermines the rest of the report. The DBIR is a collaborative effort involving 60+ organizations’ proprietary data. It’s the single best source of information for enterprise defenders, which is why it’s a travesty that its section on vulnerabilities used in data breaches contains misleading data, analysis, and recommendations.

Verizon must ‘be better.’ They have to set a higher standard for the data they accept from collaborators. I recommend they base their analysis on documented data breaches, partner with agent-based security vendors, and include a red team in the review process. I’ll elaborate on these points later.

Digging into the vulnerability data

For the rest of this post, I’ll focus on the DBIR’s Vulnerability section (pages 13-16). There, Verizon uses bad data to discuss trends in software exploits used in data breaches. This section was contributed by Kenna Security (formerly Risk I/O), a vulnerability management startup with $10 million in venture funding. Unlike the rest of the report, nothing in this section is based on data breaches.


The Kenna Security website claims they authored the Vulnerabilities section in the 2016 DBIR

It’s easy to criticize the analysis in the Vulnerabilities section. It repeats common tropes long attacked by the security community, like simple counting of known vulnerabilities (Figures 11, 12, and 13). Counting vulnerabilities fails to consider the number of assets, their importance to the business, or their impact. There’s something wrong with the underlying data, too.

Verizon notes in the section’s header that portions of the data come from vulnerability scanners. In footnote 8, they share some of the underlying data, a list of the top 10 exploited vulnerabilities as detected by Kenna. According to the report, these vulnerabilities represent 85% of successful exploit traffic on the Internet.


Footnote 8 lists the vulnerabilities most commonly used in data breaches

Jericho at OSVDB was the first to pick apart this list of CVEs. He noted that the DBIR never explains how successful exploitation is detected (their subsequent clarification doesn’t hold water), nor what successful exploitation means in the context of a vulnerability scanner. Worse, he points out that among the ‘top 10’ are obscure local privilege escalations, denial of service flaws for Windows 95, and seemingly arbitrary CVEs from Oracle CPUs.

Rory McCune at NCC was the second to note discrepancies in the top ten list. Rory zeroed in on the fact that one of Kenna’s top 10 was the FREAK TLS flaw, which requires a network man-in-the-middle position, a vulnerable server, a vulnerable client, and substantial computational power to exploit at scale. Additionally, successful exploitation produces no easily identifiable network signature. In the face of all this evidence against the widespread exploitation of FREAK, Kenna’s extraordinary claims require extraordinary evidence.

When questioned about similar errors in the 2015 DBIR, Kenna’s Chief Data Scientist Michael Roytman explained, “the dataset is based on the correlation of ids exploit signatures with open vulns.” Roytman later noted that disagreements about the data likely stem from differing opinions about the meaning of “successful exploitation.”

These statements show that the vulnerability data is unlike all other data used in the DBIR. Rather than the result of a confirmed data breach, the “successful exploit traffic” of these “mega-vulns” was synthesized by correlating vulnerability scanner output with intrusion detection system (IDS) alerts. The result of this correlation describes neither the frequency nor the tactics of real exploits used in the wild.

Obfuscating with fake science

Faced with a growing chorus of criticism, Verizon and Kenna published a blog post that ignores critics, attempts to obfuscate their analysis with appeals to authority, substitutes jargon for a counterargument, and reiterates dangerous enterprise security policies from the report.


Kenna’s blog post begins with appeals to authority and ad hominem attacks on critics

The first half of the Kenna blog post moves the goalposts. They present a new top ten list that, in many ways, is even more disconnected from data breaches than the original. Four of the ten are now Denial of Service (DoS) flaws which do not permit unauthorized access to data. Two more are FREAK which, if successfully exploited, only permit access to HTTPS traffic. Three are 15-year-old UPnP exploits that only affect Windows XP SP0 and lower. The final exploit is Heartbleed which, despite potentially devastating impact, can be traced to few confirmed data breaches since its discovery.

Kenna’s post does answer critics’ calls for the methodology used to define a ‘successful exploitation’: an “event” where 1) a scanner detects an open vulnerability, 2) an IDS triggers on that vulnerability, and 3) one or more post-exploitation indicators of compromise (IOCs) are triggered, presumably all on the same host. This approach fails to account for the biggest challenge with security products: false positives.


Kenna is using a synthetic benchmark for successful exploitation based on IDS signatures
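To see why false positives matter here, the three-part test can be written down as a predicate. This is only a sketch of the methodology as described in this post, not Kenna's code, and the HostObservation record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HostObservation:
    """Hypothetical per-host record; field names are illustrative only."""
    scanner_flagged: bool   # step 1: a scanner reports the vulnerability as open
    ids_alerted: bool       # step 2: an IDS signature for it fired
    ioc_count: int          # step 3: number of post-exploitation IOCs seen


def counts_as_successful_exploit(obs):
    # All three conditions must hold for the same host.
    return obs.scanner_flagged and obs.ids_alerted and obs.ioc_count >= 1


# A host that was never attacked: the scanner result is a false positive,
# the IDS alert came from a benign scan, and the "IOC" is another noisy alert.
never_attacked = HostObservation(scanner_flagged=True, ids_alerted=True, ioc_count=1)
# A host with a clean scan is excluded even if the IDS fired.
clean_scan = HostObservation(scanner_flagged=False, ids_alerted=True, ioc_count=2)

print(counts_as_successful_exploit(never_attacked))
print(counts_as_successful_exploit(clean_scan))
```

Nothing in the predicate can distinguish a real compromise from three unrelated false positives that happen to land on the same host.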

Flaws in the data

As mentioned earlier, the TLS FREAK vulnerability is the most prominent error in the DBIR’s Vulnerabilities section. FREAK requires special access as a network Man-in-the-Middle (MITM). Successful exploitation only downgrades the protections from TLS. An attacker would then have to factor a 512-bit RSA modulus to decrypt the session data, an attack that cost US$75 for each session around the time the report was in production. After decrypting the result, they’d just have a chat log, with no access to either the client or the server devices. Given all this effort, the low pay-off, and the comparative ease and promise of other exploits, it’s impossible that the TLS FREAK flaw would have been one of the ten most exploited vulnerabilities in 2015.

The rest of the section’s data is based on correlations between intrusion detection systems and vulnerability scanners. This approach yields questionable results.

All available evidence (threat intel reports, the Microsoft SIR, etc.) shows that real attacks occur on the client side: Office, PDF, Flash, Browsers, etc. These vulnerabilities, which figure so prominently in Microsoft data and DFIR reports about APTs, don’t appear in the DBIR. How come exploit kits and APTs are using Flash as a vector, yet Kenna’s top 10 fails to list a single Flash vulnerability? Because, by and large, these sorts of attacks are not visible to IDS or vulnerability scanners. Kenna’s data comes from sources that cannot see the actual attacks.

Intrusion detection systems are designed to inspect traffic and apply a database of known signatures to the specific protocol fields. If a match appears, most products will emit an alert and move on to the next packet. This “first exit” mode helps with performance, but it can lead to attack shadowing, where the first signature to match the traffic generates the only alert. This problem gets worse when the first signature to match is a false positive.

The SNMP vulnerabilities reported by Kenna (CVE-2002-0012, CVE-2002-0013) highlight the problem of relying on IDS data. The IDS signatures for these vulnerabilities are often triggered by benign security scans and network discovery tools. It is highly unlikely that a 14-year-old DoS attack would be one of the most exploited vulnerabilities across corporate networks.

Vulnerability scanners are notorious for false positives. These products often depend on credentials to gather system information, but fall back to less-reliable testing methods as a last resort. The UPnP issues reported by Kenna (CVE-2001-0877, CVE-2001-0876) are false positives from vulnerability scanning data. Similar to the SNMP issues, these vulnerabilities are often flagged on systems that are not Windows 98, ME, or XP, and are considered line noise by those familiar with vulnerability scanner output.

It’s unclear how the final step of Kenna’s three-step algorithm, detection of post-exploitation IOCs, supports correlation. In the republished top ten list, four of the vulnerabilities are DoS flaws and two enable HTTPS downgrades. What is a post-exploitation IOC for a DoS? In all of the cases listed, the target host would crash, stop receiving further traffic, and likely reboot. It’s more accurate to interpret post-exploitation IOCs to mean, “more than one IDS signature was triggered.”

The simplest explanation for Kenna’s results? A serious error in the correlation methodology.

Issues with the methodology

Kenna claims to have 200+ million successful exploit events in their dataset. In nearly all the cases we know about, attackers use very few exploits. Duqu duped Kaspersky with just two exploits. Phineas Phisher hacked Hacking Team with just one exploit. Stuxnet stuck with four exploits. The list goes on. There are not 50+ million breaches in a year. This is a sign of poor data quality. Working back from the three-step algorithm described earlier, I conclude that Kenna counted IDS signatures fired, not successful exploit events.

There are some significant limitations to relying on data collected from scanners and IDS. Of the thousands of companies that employ these devices (and who share the resulting data with Kenna), a marginal number go through the effort of configuring their systems properly. Without this configuration, the resulting data is a useless cacophony of false positives. Aggregating thousands of customers’ noisy datasets is no way to tune into a meaningful signal. But that’s precisely what Kenna asks the DBIR’s readers to accept as the basis for the Vulnerabilities section.

Let’s remember the hundreds of companies, public initiatives, and bots scanning the Internet. Take the University of Michigan’s Scans.io as one example. They scan the entire Internet dozens of times per day. Many of these scans would trigger Kenna’s three-part test to detect a successful exploit. Weighting the results by the number of times an IDS event triggers yields a disproportionate number of events. If the results aren’t normalized for another factor, the large numbers will skew results and insights.


Kenna weighted their results by the number of IDS events
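The effect of weighting by raw event count is easy to reproduce with made-up numbers. In the sketch below, every alert, label, and source name is hypothetical. It only illustrates how a single noisy scanner can dominate a ranking unless the counts are normalized by another factor, here the number of distinct sources.

```python
from collections import Counter, defaultdict

# Hypothetical IDS alerts as (vulnerability label, source address) pairs.
# "VULN-A" stands for a flaw probed by one research scanner many times;
# "VULN-B" stands for a flaw hit once each by a few distinct sources.
alerts = [("VULN-A", "scanner-1")] * 1000 + [
    ("VULN-B", "source-1"),
    ("VULN-B", "source-2"),
    ("VULN-B", "source-3"),
]

# Ranking 1: weight each vulnerability by raw alert count.
raw_counts = Counter(vuln for vuln, _ in alerts)

# Ranking 2: count each distinct source only once per vulnerability.
sources = defaultdict(set)
for vuln, source in alerts:
    sources[vuln].add(source)
distinct_counts = Counter({vuln: len(s) for vuln, s in sources.items()})

top_raw = raw_counts.most_common(1)[0][0]
top_distinct = distinct_counts.most_common(1)[0][0]
print(top_raw, top_distinct)
```

By raw count, the scanner-driven VULN-A leads by a wide margin. Counted by distinct source, VULN-B comes out on top. Distinct sources are only one possible normalization, but the ranking flips either way.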

Finally, there’s the issue of enterprises running honeypots. A honeypot responds positively to any attempt to hack into it. This would also “correlate” with Kenna’s three-part algorithm. There’s no indication that such systems were removed from the DBIR’s dataset.

In the course of performing research, scientists frequently build models of how they think the real world operates, then back-test them with empirical data. High-quality sources of empirical exploit incidence data are available from US-CERT, which coordinates security incidents for all US government agencies, and Microsoft, which has unique data sources like Windows Defender and crash reports from millions of PCs. From their reports, only the Heartbleed vulnerability appears in Kenna’s list. The rest of the data and recommendations from US-CERT and Microsoft match. Neither of them agrees with Kenna.

Ignore the DBIR’s vulnerability recommendations

“This is absolutely indispensable when we defenders are working together against a sentient attacker.” — Kenna Security

Even if you take the DBIR’s vulnerability analysis at face value, there’s no basis for assuming human attackers behave like bots. Scan and IDS data does not correlate to what real attackers would do. The only way to determine what attackers truly do is to study real attacks.


Kenna Security advocates a dangerous patch strategy based on faulty assumptions

Empirical data disagrees with this approach. Whenever new exploits and vulnerabilities come out, attacks spike. This misguided recommendation has the potential to cause real, preventable harm. In fact, the Vulnerabilities section of the DBIR both advocates this position and then refutes it only one page later.


The DBIR presents faulty information on page 13…


… then directly contradicts itself only one page later

Recommendations from this section fall victim to many of the same criticisms of pure vulnerability counting: they fail to consider the number of assets, their criticality, the impact of vulnerabilities, and how they are used by real attackers. Without acknowledging the source of the data, Verizon and Kenna walk the reader down a dangerous path.

Improvements for the 2017 DBIR

“It would be a shame if we lost the forest for the exploit signatures.”
– Michael Roytman, Chief Data Scientist, Kenna

This closing remark from Kenna’s rebuttal encapsulates the issue: exploit signatures were used in lieu of data from real attacks. They skipped important steps while collecting data over the past year, jumped to assumptions based on scanners and IDS devices, and appeared to hope that their conclusions would align with what security professionals see on the ground. Above all, this incident demonstrates the folly of applying data science without sufficient input from practitioners. The resulting analysis and recommendations should not be taken seriously.

Kenna’s 2015 contribution to the DBIR received similar criticism, but they didn’t change for 2016. Instead, Verizon expanded the Vulnerability section and used it as the basis of recommendations. It’s alarming that Verizon and Kenna aren’t applying critical thinking to their own performance. They need to be more ambitious with how they collect and analyze their data.

Here’s how the Verizon 2017 DBIR could improve on its vulnerability reporting:

  1. Collect exploit data from confirmed data breaches. This is the basis for the rest of the DBIR’s data. Their analysis of exploits should be just as rigorous. Contrary to what I was told on Twitter, there is enough data to achieve statistical relevance. With the 2017 report a year away, there’s enough time to correct the processes of collecting and analyzing exploit data. Information about vulnerability scans and IDS signatures doesn’t serve the information security community or their customers.
  2. That said, if Verizon wants to take more time to refine the quality of the data they receive from their partners, why not partner with agent-based security vendors in the meantime? Host-based collection is far closer to exploits than network data. CrowdStrike, FireEye, Bit9, Novetta, Symantec and more all have agents on hosts that can detect successful exploitation based on process execution and memory inspection, which are more reliable factors.
  3. Finally, include a red team in the review process of future reports. It wasn’t until the 2014 DBIR that attackers’ patterns were separated into nine categories, a practice that practitioners had developed years earlier. That technique would have been readily available if the team behind the DBIR had spoken to practitioners who understand how to break and defend systems. Involving a red team in the review process would strengthen the report’s conclusions and recommendations.

Be better

For the 2016 DBIR, Verizon accepted a huge amount of low-quality data from a vendor. They reprinted the analysis verbatim. Clearly, no one who understands vulnerabilities was involved in the review process. The DBIR team tossed in some data-science vocab for credibility, and a few distracting jokes, and asked for readers’ trust.

Worse, Verizon stands behind the report rather than acknowledging and correcting the errors.

Professionals and businesses around the world depend on this report to make important security decisions. It’s up to Verizon to remain the dependable source for our industry.

I’d like to thank HD Moore, Thomas Ptacek, Grugq, Dan Rosenberg, Mike Russell, Kelly Shortridge, Rafael Turner, the entire team at Trail of Bits, and many others that cannot be cited for their contributions and comments on this blog post.

UPDATE 1:

Rory McCune has posted a followup where he notes a huge spike in Kenna’s observed exploitation of FREAK occurs at exactly the same time that the University of Michigan was scanning the entire internet for it. This supports the theory that benign internet-wide scans made it into Kenna’s data set where they were scaled by their frequency of occurrence.

Kenna’s data on FREAK overlaps precisely with internet-wide scans from the University of Michigan

Further, an author of FREAK has publicly disclaimed any notion that it was widely exploited.

UPDATE 2:

Rob Graham has pointed out that typical IDS signatures for FREAK do not detect attacks but rather only detect TLS clients that offer weak cipher suites. This supports the theory that the underlying data was not inspected nor were practitioners consulted prior to using this data in the DBIR.

UPDATE 3:

Justin Kennedy has shared exploit data from five years of penetration tests conducted against his clients and noted that FREAK and Denial of Service attacks never once assisted in compromising a target. This supports the theory that exploitation data in the DBIR distorts the view on the ground.

UPDATE 4:

Threatbutt has immortalized this episode with the release of their Danger Zone Incident Retort (DZIR).

UPDATE 5:

Karim Toubba, Kenna Security CEO, has posted a mea culpa on their blog. He notes that they did not confirm any of their analysis with their own product before delivering it to Verizon for inclusion in the DBIR.

What is the point of Kenna's contribution if it was not backed by their insights?

Kenna’s contribution to the DBIR was not validated by their own product

Further, Karim notes that their platform ranks FREAK as a “25 out of 100”. However, even this ranking is orders of magnitude too high based on the available evidence. This raises the question of whether the problems exposed in Kenna’s analysis from the DBIR extend into their product as well.


Kenna’s product prioritizes FREAK an order of magnitude higher than it likely should be

Finally, I consider the criticisms in this blog post applicable to their entire contribution to the DBIR and not only their “top ten successfully exploited vulnerabilities” list. Attempts to pigeonhole this criticism to the top ten miss the fact that other sections of the report are based on the same or similar data from Kenna.

UPDATE 6:

Verizon published their own mea culpa.

Hacker Handle Bounty

It’s time to close this chapter of our industry’s past. To distance ourselves from the World Wrestling Federation and comic book superheroes.

Hulk Hogan or Terry Bollea?

We’re talking about hacker handles: Dildog, Thomas Dullien, Matt Blaze etc.

When the Internet was young and fancy-free, hacker handles had their place. They afforded anonymity and supported the curious to explore the limits of this new frontier. They felt cool. Mysterious.

No more. When you’re at a security conference how does it feel when you refer to a hacker by her handle? Maybe a little awkward?

What’s more, Google’s Project Zero has shown that handles are dangerous when leaked.

“I retired my hacker handle in 2006. It wasn’t easy. I worried I’d feel exposed at conferences. Instead I felt a lightness almost immediately after going through with it. I was free! From the constraints of an identity that didn’t really fit me any longer. Free from a box that I’d built around myself without realizing it. If I’d known how good it would feel, I would’ve done it much earlier.”
– Alexander “Solar Eclipse” Sotirov, Co-Founder & CTO

Come out of the Shadows

Today, we’re launching a bounty on hacker handles. To participate, you reject your handle in the comments section of this post.

The bounty on offer: an exclusive invitation to an Italian dinner preceding the next Empire Hacking event, to be catered by yours truly. Expect tasty goodness.

Rewards Program

Once you retire your handle, you can earn points in two ways. First, you can post old tweets of yours that turned out to be wrong. The more erroneous, the more points you’ll earn. Second, you can refer your friends. Public outing is encouraged. It’s for the common good.

If, after three months, no one has seen you using your handle, and you’ve earned enough points, you’ll receive a black hat challenge coin.

Please note, if you retire your handle and change to another one later, you’ll owe us money. The fine will correspond to the number of points you’ve accrued so far, and the severity of the offending handle.

We’re calling for the retirement of these handles to help us launch the program:

  • WeldPond
  • Dildog
  • drag0rn
  • Mudge
  • Thomas Dullien
  • Gynvael Coldwind
  • Matt Blaze
  • Redpantz
  • Ian Beer
  • j00ru
  • lcamtuf
  • Simple Nomad
  • Invisigoth
  • Jolly
  • Rattle
  • Decius
  • Space Rogue
  • Solar Designer
  • HDM
  • Dark Tangent
  • Taylor Swift
  • JDuck
  • Travis Normandy

Join our bounty program

Nominate yourself, hacker friends and peers who still use handles. None will be turned away.

The Problem with Dynamic Program Analysis

Developers have access to tools like AddressSanitizer and Valgrind that will tell them when the code that they’re running accesses uninitialized memory, leaks memory, or uses memory after it’s been freed. Despite the availability of these excellent tools, memory bugs still persist, still get shipped to users, and still get exploited in the wild.

Most of today’s bug-finding tools are dynamic: they identify bugs in programs while those programs are running. This is great because all programs have massive test suites that exercise every line of code… right? Wrong. Large test suites are the exception, not the rule. Test suites definitely help find and reduce bugs, but bugs still get through.

Perhaps the solution is to pay to have your code audited by professionals. More eyes on your code is a good thing™, but the underlying issue remains. Analyses run inside the heads of experts are still “dynamic”: thinking through every code path is just not tractable.

So dynamic analyses can miss bugs because they can’t check every possible program path. What can check every possible program path?

Finding use-after-frees in millions of lines of code

We use static analysis to analyze millions of lines of code, without ever running the code. The analysis technique, called data-flow tracking, enables us to analyze and summarize properties about every possible program path. This solves the aforementioned problem of missing bugs that occur when certain program paths are not exercised.

How does an analysis that sees everything actually work? Below we describe the 1-2-3 of an actual whole-program static analysis tool that we developed and regularly use. The tool, PointsTo, finds and reports on potential use-after-free bugs in large codebases.

Step 1: Convert to LLVM bitcode

PointsTo operates on the LLVM bitcode representation of a program. We chose LLVM bitcode because it is a convenient intermediate representation for performing program analyses. Unsurprisingly, the first stage of our analysis pipeline converts a program’s source code into an LLVM bitcode database. We use an internal tool named CompInfo to produce these databases. An alternative, open-source tool for doing something similar is whole-program-llvm.


Step 2: Create the data-flow graph

The key idea behind PointsTo is to analyze how pointers to allocated objects flow through the program. What we care about are assignments to and copies of pointers, pointer dereferences, and frees of pointers. These operations on pointers are represented using a data-flow graph.

Four steps to creating a data-flow graph

The most interesting step in the process is the why and how of transforming allocations and frees into special assignments. The “why” is that this transformation lets us repurpose an existing program analysis to find paths from FREE definitions to pointer dereferences. The how is more subtle: how does PointsTo know that it should change “new A” into an ALLOC and “delete a” into a FREE?

Imagine a hypothetical embedded system where programs are starved for memory and so the natural choice is to use a custom memory allocator called ration_memory. We created a Python modelling language to feed PointsTo information about higher-level function behaviors. Our modelling scripts tell PointsTo that “new A” returns a new object, and so we can use it to say the same thing about ration_memory.
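The idea of a model can be pictured as a lookup table from function names to the role each function plays. The sketch below is not the PointsTo modelling language, which is not shown in this post. It is a hypothetical illustration, and every name in it apart from ration_memory is invented.

```python
# Hypothetical model table: maps a function name to the role it plays.
# This is not PointsTo's modelling language, only an illustration of the idea.
ALLOC, FREE = "ALLOC", "FREE"

models = {
    "malloc": ALLOC,
    "operator new": ALLOC,
    "free": FREE,
    "operator delete": FREE,
}

# Teaching the analysis about a custom allocator is one extra entry each.
# `ration_memory` is the allocator named in the text; `return_memory` is an
# invented name for its matching deallocator.
models["ration_memory"] = ALLOC
models["return_memory"] = FREE


def classify_call(callee):
    """Return ALLOC or FREE for modelled functions, or None for anything else."""
    return models.get(callee)


print(classify_call("ration_memory"), classify_call("strlen"))
```

Once a call is classified this way, the rest of the analysis can treat the custom allocator exactly like the built-in ones.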

Segue: Hidden data flows

The transformation from source code into a data flow graph looked pretty simple, but that was because the source code we started with was simple. It had no function calls, and more importantly, it had no function pointers or method calls! What happens if callback below is a function pointer? What happens if callback frees x?

int *x = malloc(4);
callback(x);
*x += 1;

This is the secret sauce and namesake of PointsTo: we perform a context- and path-sensitive pointer analysis that tells us which function pointers point to which functions and when. Altogether, we can produce an error report that follows x through callback and back again.

Step 3: Dénouement

It’s time to report potential errors for expert analysis. PointsTo searches through the data-flow graph, looking for flows from assignments to FREE down to dereferences. These flows are converted into a program slice of the source code lines, showing the path that execution needs to follow in order to produce a use-after-free. Here’s an example program slice of a real bug:

LightHTTPD Use-After-Free
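The search from FREE definitions down to dereferences can be sketched on a toy graph. The example below hand-builds a data-flow graph for the earlier callback snippet, assuming callback resolves to a function that frees its argument. The node names, graph layout, and find_use_after_free function are all hypothetical, and a plain breadth-first search stands in for the real analysis.

```python
from collections import deque

# Hypothetical data-flow graph for the callback example above.
# Each node is (label, kind); edges follow the pointer value forward.
nodes = {
    "n1": ("x = malloc(4)", "ALLOC"),
    "n2": ("callback(x)", "CALL"),
    "n3": ("free(p) inside callback", "FREE"),
    "n4": ("*x += 1", "DEREF"),
}
edges = {
    "n1": ["n2"],
    "n2": ["n3"],   # known only once the function pointer is resolved
    "n3": ["n4"],   # the freed pointer flows back to the caller
    "n4": [],
}


def find_use_after_free(nodes, edges):
    """Breadth-first search from every FREE node to the first reachable DEREF."""
    for start, (_, kind) in nodes.items():
        if kind != "FREE":
            continue
        queue = deque([[start]])
        seen = {start}
        while queue:
            path = queue.popleft()
            if nodes[path[-1]][1] == "DEREF":
                return [nodes[n][0] for n in path]
            for nxt in edges[path[-1]]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
    return None


report = find_use_after_free(nodes, edges)
print(report)
```

The returned path plays the role of the program slice. Unlike the analysis described in this post, the sketch is neither context- nor path-sensitive, so on real code it would report far more false positives.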

When describing this system to compiler folks, the usual first question is: but what about false-positives? What if we get a report about a use-after-free and it isn’t one? Here is where the priorities of program analysis for compilers and for vulnerabilities diverge.

False-positives in a compiler analysis can introduce bugs, and so compilers are usually conservative. That is, they trade false-positives for false-negatives. They might miss some optimization opportunities because they can’t prove something, but at least the program will be compiled correctly *cough*.

For vulnerability analysis, this is a bad trade. False-positives in a vulnerability analysis are inconvenient, but they’re a drop in the ocean when millions of lines of code need to be looked at. False-negatives, however, are unacceptable. A false-negative is a bug that is missed and might make it to production. A tool that always finds the bug and sometimes warns you about sketchy but correct code is an investment that saves time and money during code audits.

In summary, we conclude

Analyzing programs for bugs is hard. Industry best-practices like having extensive test suites should be followed. Developers should regularly run their programs through dynamic analysis tools to pick the low-hanging fruit. But more importantly, developers should understand that test suites and dynamic analyses are not a panacea. Bugs have a nasty habit of hiding behind rarely executed code paths. That’s why all paths need to be looked at. That’s why we made PointsTo.

PointsTo was a topic of discussion at a recent Empire Hacking, a bi-monthly meetup in NYC. The talk I gave there includes more information about the design and implementation of PointsTo and, for curious readers, the slides and video are reproduced below. We hope to release more videos from Empire Hacking in the future.

PointsTo was originally produced for Cyber Fast Track and we would like to thank DARPA for funding our work. Consultants at Trail of Bits use PointsTo and other internal tools for application security reviews. Contact us if you’re interested in a detailed audit of your code supported by tools like PointsTo and our CRS.

Apple can comply with the FBI court order

Earlier today, a federal judge ordered Apple to comply with the FBI’s request for technical assistance in the recovery of the San Bernardino gunmen’s iPhone 5C. Since then, many have debated whether these requests from the FBI are technically feasible given the support for strong encryption on iOS devices. Based on my initial reading of the request and my knowledge of the iOS platform, I believe all of the FBI’s requests are technically feasible.

The FBI’s Request

In a search after the shooting, the FBI discovered an iPhone belonging to one of the attackers. The iPhone is the property of the San Bernardino County Department of Public Health where the attacker worked and the FBI has permission to search it. However, the FBI has been unable, so far, to guess the passcode to unlock it. In iOS devices, nearly all important files are encrypted with a combination of the phone passcode and a hardware key embedded in the device at manufacture time. If the FBI cannot guess the phone passcode, then they cannot recover any of the messages or photos from the phone.
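To make the role of the hardware key concrete, here is a small sketch of deriving a file key from both a passcode and a device-bound secret. It is not Apple's actual key derivation. PBKDF2 from Python's standard library is used purely as a stand-in, and the function name, iteration count, and device secrets are hypothetical.

```python
import hashlib


def derive_file_key(passcode, hardware_key):
    # PBKDF2 is used here only as a stand-in for "combine passcode with a
    # device-bound secret"; the iteration count is arbitrary.
    return hashlib.pbkdf2_hmac("sha256", passcode.encode(), hardware_key, 10000)


# Hypothetical device secrets: in a real phone this value never leaves the chip.
device_a = bytes(32)
device_b = bytes([1]) * 32

key_on_a = derive_file_key("1234", device_a)
key_on_b = derive_file_key("1234", device_b)
print(key_on_a != key_on_b)
```

The same passcode yields a different key on each device, which is why passcode guesses have to be tested on the phone itself rather than against a copy of its storage.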

There are a number of obstacles that stand in the way of guessing the passcode to an iPhone:

  • iOS may completely wipe the user’s data after too many incorrect PIN entries
  • PINs must be entered by hand on the physical device, one at a time
  • iOS introduces a delay after every incorrect PIN entry

As a result, the FBI has made a request for technical assistance through a court order to Apple. As one might guess, their requests target each one of the above pain points. In their request, they have asked for the following:

  1. [Apple] will bypass or disable the auto-erase function whether or not it has been enabled;
  2. [Apple] will enable the FBI to submit passcodes to the SUBJECT DEVICE for testing electronically via the physical device port, Bluetooth, Wi-Fi, or other protocol available on the SUBJECT DEVICE; and
  3. [Apple] will ensure that when the FBI submits passcodes to the SUBJECT DEVICE, software running on the device will not purposefully introduce any additional delay between passcode attempts beyond what is incurred by Apple hardware.

In plain English, the FBI wants to ensure that it can make an unlimited number of PIN guesses, that it can make them as fast as the hardware will allow, and that they won’t have to pay an intern to hunch over the phone and type PIN codes one at a time for the next 20 years — they want to guess passcodes from an external device like a laptop or other peripheral.

As a remedy, the FBI has asked for Apple to perform the following actions on their behalf:

[Provide] the FBI with a signed iPhone Software file, recovery bundle, or other Software Image File (“SIF”) that can be loaded onto the SUBJECT DEVICE. The SIF will load and run from Random Access Memory (“RAM”) and will not modify the iOS on the actual phone, the user data partition or system partition on the device’s flash memory. The SIF will be coded by Apple with a unique identifier of the phone so that the SIF would only load and execute on the SUBJECT DEVICE. The SIF will be loaded via Device Firmware Upgrade (“DFU”) mode, recovery mode, or other applicable mode available to the FBI. Once active on the SUBJECT DEVICE, the SIF will accomplish the three functions specified in paragraph 2. The SIF will be loaded on the SUBJECT DEVICE at either a government facility, or alternatively, at an Apple facility; if the latter, Apple shall provide the government with remote access to the SUBJECT DEVICE through a computer allowing the government to conduct passcode recovery analysis.

Again in plain English, the FBI wants Apple to create a special version of iOS that only works on the one iPhone they have recovered. This customized version of iOS (*ahem* FBiOS) will ignore passcode entry delays, will not erase the device after any number of incorrect attempts, and will allow the FBI to hook up an external device to facilitate guessing the passcode. The FBI will send Apple the recovered iPhone so that this customized version of iOS never physically leaves the Apple campus.

As many jailbreakers know, firmware can be loaded via Device Firmware Upgrade (DFU) Mode. Once an iPhone enters DFU mode, it will accept a new firmware image over a USB cable. Before any firmware image is loaded by an iPhone, the device first checks whether the firmware has a valid signature from Apple. This signature check is why the FBI cannot load new software onto an iPhone on their own: the FBI does not have the secret keys that Apple uses to sign firmware.
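To make the signature check and the "only this one phone" restriction concrete, here is a minimal sketch. It is not Apple's actual signing scheme, which this post does not describe. It uses toy, unpadded RSA built from two well-known Mersenne primes (far too small to be secure), and the names `sign`, `device_accepts`, and the device IDs are all hypothetical. The only point is that a device accepts an image only if the signature matches both the image and its own identifier, and only the holder of the private exponent can produce that signature.

```python
import hashlib

# Toy RSA key from two known Mersenne primes. Insecure and unpadded;
# it only illustrates the flow. Requires Python 3.8+ for pow(E, -1, m).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                          # public exponent, baked into every device
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, held only by the vendor


def digest(firmware: bytes, device_id: str) -> int:
    """Hash the image together with the one device it is meant for."""
    h = hashlib.sha256(device_id.encode("utf-8") + b"\x00" + firmware).digest()
    return int.from_bytes(h, "big") % N


def sign(firmware: bytes, device_id: str) -> int:
    """Vendor side: needs the private exponent D."""
    return pow(digest(firmware, device_id), D, N)


def device_accepts(firmware: bytes, signature: int, own_device_id: str) -> bool:
    """Device side: needs only the public values E and N."""
    return pow(signature, E, N) == digest(firmware, own_device_id)


image = b"custom firmware image"
sig = sign(image, "DEVICE-A")

print(device_accepts(image, sig, "DEVICE-A"))         # True
print(device_accepts(image, sig, "DEVICE-B"))         # False: wrong phone
print(device_accepts(image + b"!", sig, "DEVICE-A"))  # False: image modified
```

Because the device identifier is part of what gets signed, the same image and signature are rejected by any other phone, which is the property the court order asks for.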

Enter the Secure Enclave

Even with a customized version of iOS, the FBI has another obstacle in their path: the Secure Enclave (SE). The Secure Enclave is a separate computer inside the iPhone that brokers access to encryption keys for services like the Data Protection API (aka file encryption), Apple Pay, Keychain Services, and our Tidas authentication product. All devices with TouchID (or any devices with A7 or later A-series processors) have a Secure Enclave.

When you enter a passcode on your iOS device, this passcode is “tangled” with a key embedded in the SE to unlock the phone. Think of this like the 2-key system used to launch a nuclear weapon: the passcode alone gets you nowhere. Therefore, you must cooperate with the SE to break the encryption. The SE keeps its own counter of incorrect passcode attempts and gets slower and slower at responding with each failed attempt, all the way up to 1 hour between requests. There is nothing that iOS can do about the SE: it is a separate computer outside of the iOS operating system that shares the same hardware enclosure as your phone.
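The "tangling" idea can be sketched in a few lines. This is an illustration under stated assumptions, not Apple's real construction: it uses PBKDF2-HMAC-SHA256 from the Python standard library as a stand-in, feeds a made-up `HARDWARE_KEY` in as the salt, and picks an arbitrary iteration count. On a real device the hardware key can never be read out by software, so the derivation can only happen on the phone itself.

```python
import hashlib
import hmac

# Hypothetical stand-in for the per-device key fused into the hardware.
HARDWARE_KEY = bytes.fromhex("00112233445566778899aabbccddeeff")


def derive_unlock_key(passcode: str, hardware_key: bytes,
                      iterations: int = 10_000) -> bytes:
    """Mix ("tangle") the passcode with the device key to get the unlock key."""
    return hashlib.pbkdf2_hmac(
        "sha256", passcode.encode("utf-8"), hardware_key, iterations
    )


correct = derive_unlock_key("1234", HARDWARE_KEY)
other_device = derive_unlock_key("1234", bytes(16))    # right passcode, wrong key
wrong_passcode = derive_unlock_key("1235", HARDWARE_KEY)

print(hmac.compare_digest(correct, other_device))    # False
print(hmac.compare_digest(correct, wrong_passcode))  # False
```

Both inputs are needed: knowing the passcode without the device key, or holding the device without the passcode, produces a different key.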

The Hardware Key is stored in the Secure Enclave in A7 and newer devices

As a result, even a customized version of iOS cannot influence the behavior of the Secure Enclave. It will delay passcode attempts whether or not that feature is turned on in iOS. Private keys cannot be read out of the Secure Enclave, ever, so the only choice you have is to play by its rules.
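A gatekeeper that keeps its own failure counter might look like the sketch below. The class name and the intermediate steps of the delay schedule are illustrative assumptions; the only figure taken from this post is the one-hour ceiling. The sketch reports the delay rather than actually sleeping.

```python
# Illustrative schedule; only the one-hour ceiling comes from the text above.
ESCALATING_DELAYS = {5: 60, 6: 5 * 60, 7: 15 * 60, 8: 15 * 60}
MAX_DELAY_SECONDS = 60 * 60
FREE_ATTEMPTS = 4


class Gatekeeper:
    """Hypothetical gatekeeper whose counter lives outside the main OS."""

    def __init__(self, correct_passcode: str) -> None:
        self._correct = correct_passcode
        self.failed_attempts = 0

    def required_delay(self) -> int:
        """Seconds the caller must wait before the next attempt is accepted."""
        attempt_number = self.failed_attempts + 1
        if attempt_number <= FREE_ATTEMPTS:
            return 0
        return ESCALATING_DELAYS.get(attempt_number, MAX_DELAY_SECONDS)

    def try_passcode(self, guess: str) -> bool:
        if guess == self._correct:
            self.failed_attempts = 0
            return True
        self.failed_attempts += 1
        return False


gate = Gatekeeper("7291")
delays = []
for guess in range(10):
    delays.append(gate.required_delay())
    gate.try_passcode(f"{guess:04d}")

print(delays)  # [0, 0, 0, 0, 60, 300, 900, 900, 3600, 3600]
```

The caller has no way to reset `failed_attempts` short of supplying the right passcode, which is why replacing the caller (iOS) alone does not remove the delay.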

Passcode delays are enforced by the Secure Enclave in A7 and newer devices

Apple has gone to great lengths to ensure the Secure Enclave remains safe. Many consumers became familiar with these efforts after “Error 53” messages appeared due to 3rd party replacement or tampering with the TouchID sensor. iPhones are restricted to only work with a single TouchID sensor via device-level pairing. This security measure ensures that attackers cannot build a fraudulent TouchID sensor that brute-forces fingerprint authentication to gain access to the Secure Enclave.

For more information about the Secure Enclave and Passcodes, see pages 7 and 12 of the iOS Security Guide.

The Devil is in the Details

“Why not simply update the firmware of the Secure Enclave too?” I initially speculated that the private data stored within the SE was erased on updates, but I now believe this is not true. Apple can update the SE firmware, and the update neither requires the phone passcode nor wipes user data. Apple can therefore disable the passcode delay and auto erase with a firmware update to the SE. After all, Apple has updated the SE with increased delays between passcode attempts and no phones were wiped.

If the device lacks a Secure Enclave, then a single firmware update to iOS will be sufficient to disable passcode delays and auto erase. If the device does contain a Secure Enclave, then two firmware updates, one to iOS and one to the Secure Enclave, are required to disable these security features. The end result in either case is the same. After modification, the device is able to guess passcodes at the fastest speed the hardware supports.

The recovered iPhone is a model 5C. The iPhone 5C lacks TouchID and, therefore, lacks a Secure Enclave. The Secure Enclave is not a concern. Nearly all of the passcode protections are implemented in software by the iOS operating system and are replaceable by a single firmware update.

The End Result

There are still caveats in these older devices and a customized version of iOS will not immediately yield access to the phone passcode. Devices with A6 processors, such as the iPhone 5C, also contain a hardware key that cannot ever be read. This key is also “tangled” with the phone passcode to create the encryption key. However, there is nothing that stops iOS from querying this hardware key as fast as it can. Without the Secure Enclave to play gatekeeper, this means iOS can guess one passcode every 80ms.

Passcodes can only be guessed once every 80ms with or without the Secure Enclave

Even though this 80ms limit is not ideal, it is a massive improvement from guessing only one passcode per hour with unmodified software. After the elimination of passcode delays, it will take at most about 13 minutes to try every 4-digit PIN, under a day to try every 6-digit PIN, or years to recover a 6-character alphanumeric password. It has not been reported whether the recovered iPhone uses a 4-digit PIN or a longer, more complicated alphanumeric passcode.
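The arithmetic behind these estimates is easy to check. The sketch below computes the worst case (trying every possible passcode) at exactly 80ms per guess, ignoring any other overhead. The alphabet chosen for the alphanumeric case, lowercase letters plus digits, is an assumption; a larger alphabet takes far longer.

```python
MS_PER_GUESS = 80  # from the 80ms figure above


def worst_case_seconds(alphabet_size: int, length: int) -> float:
    """Time to try every passcode of the given length, one guess per 80ms."""
    return alphabet_size ** length * MS_PER_GUESS / 1000


four_digit = worst_case_seconds(10, 4)
six_digit = worst_case_seconds(10, 6)
six_alnum = worst_case_seconds(36, 6)  # assumed: a-z plus 0-9

print(f"4-digit PIN:            {four_digit / 60:.1f} minutes")
print(f"6-digit PIN:            {six_digit / 3600:.1f} hours")
print(f"6-char lowercase+digit: {six_alnum / (365.25 * 24 * 3600):.1f} years")
```

On average the correct passcode turns up after half of these times, since a brute-force search expects to hit it midway through the keyspace.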

Festina Lente

Apple has allegedly cooperated with law enforcement in the past by using a custom firmware image that bypassed the passcode lock screen. This simple UI hack was sufficient in earlier versions of iOS since most files were unencrypted. However, since iOS 8, it has become the default for nearly all applications to encrypt their data with a combination of the phone passcode and the hardware key. This change necessitates guessing the passcode and has led directly to this request for technical assistance from the FBI.

I believe it is technically feasible for Apple to comply with all of the FBI’s requests in this case. On the iPhone 5C, the passcode delay and device erasure are implemented in software and Apple can add support for peripheral devices that facilitate PIN code entry. In order to limit the risk of abuse, Apple can lock the customized version of iOS to only work on the specific recovered iPhone and perform all recovery on their own, without sharing the firmware image with the FBI.


For more information, please listen to my interview with the Risky Business podcast.

  • Update 1: Apple has issued a public response to the court order.
  • Update 2: Software updates to the Secure Enclave are unlikely to erase user data. Please see the Secure Enclave section for further details.
  • Update 3: Reframed “The Devil is in the Details” section and noted that Apple can equally subvert the security measures of the iPhone 5C and later devices that include the Secure Enclave via software updates.