Windows network security now easier with osquery

Today, Facebook announced the successful completion of our work: osquery for Windows.

“Today, we’re excited to announce the availability of an osquery developer kit for Windows so security teams can build customized solutions for their Windows networks… This port of osquery to Windows gives you the ability to unify endpoint defense and participate in an active open source community ready to share experiences and stories.”
Introducing osquery for Windows

osquery for Windows works with Doorman

The Windows version of osquery can talk to existing osquery fleet management tools, such as Doorman. osquery for Windows has full support for TLS remote endpoints and certificate validation, just like the Unix version. In this screenshot, we are using an existing Doorman instance to find all running processes on a Windows machine.

How we ported osquery to Windows

This port presented several technical challenges, which we always enjoy. Some of the problems were general POSIX to Windows porting issues, while others were unique to osquery.

Let’s start with the obvious POSIX to Windows differences:

  • Paths are different — no more ‘/’ as the path separator.
  • There are no signals.
  • Unix domain sockets are now named pipes.
  • There’s no glob() — we had to approximate the functionality.
  • Windows doesn’t fork() — the process model is fundamentally different. osquery forks worker processes; we worked around this by abstracting the worker-process functionality (see the sketch after this list).
  • There are no simple integer uid or gid values anymore — instead you have SIDs, ACLs, and DACLs.
  • And you can forget about the octal file permissions model — or use the approximation we created.
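
Python’s standard library wrestles with the same fork() gap and shows the shape of the workaround. Below is a minimal sketch of a worker-process abstraction (our own illustration, not osquery’s actual C++ code): the launcher hides whether the child comes from fork() or from CreateProcess().

    import multiprocessing

    def worker_main(config):
        # Entry point for the worker. State arrives explicitly as arguments,
        # since a spawned child does not inherit the parent's address space
        # the way a fork()ed one would.
        print("worker running with config:", config)

    def launch_worker(config):
        # "spawn" uses CreateProcess() on Windows (and works everywhere);
        # abstracting the launch hides the fork()/CreateProcess() difference.
        ctx = multiprocessing.get_context("spawn")
        proc = ctx.Process(target=worker_main, args=(config,))
        proc.start()
        return proc

    if __name__ == "__main__":  # required on Windows: the child re-imports this module
        p = launch_worker({"verbose": True})
        p.join()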

Then, the less-obvious problems: osquery is a daemon. In Windows, daemons are services, which expect a special interface and are launched by the service control manager. We added service functionality to osquery, and provided a script to register and remove the service. The parent-child process relationship is different — there is no getppid() equivalent, but osquery worker processes needed to know if their parent stopped working, or if a shutdown event was triggered in the parent process.
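
One workable pattern on Windows, sketched below with Python’s ctypes for brevity (this shows the general technique, not osquery’s exact code), is for the worker to open a handle to its parent at startup and block on it; the handle stays valid even if the parent crashes.

    import ctypes

    SYNCHRONIZE = 0x00100000  # access right needed to wait on a process handle
    INFINITE = 0xFFFFFFFF

    def wait_for_parent_exit(parent_pid):
        # The parent passes its PID to the worker (e.g., on the command line,
        # since there is no getppid() to recover it later).
        kernel32 = ctypes.windll.kernel32
        handle = kernel32.OpenProcess(SYNCHRONIZE, False, parent_pid)
        if not handle:
            raise ctypes.WinError()
        # Blocks until the parent terminates, by clean shutdown or by crash.
        kernel32.WaitForSingleObject(handle, INFINITE)
        kernel32.CloseHandle(handle)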

Deeper still, we found some unexpected challenges:

  • Some code that builds on clang/gcc just won’t build on Visual Studio.
  • Certain function attributes, like __attribute__((constructor)), have no supported Visual Studio equivalent. The functionality had to be re-created.
  • Certain standard library functions have implementation-defined behavior — for instance, fopen will open a directory for reading on Unix-based systems, but will fail on Windows (see the sketch below).
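
The same divergence leaks into anything built on the C runtime. Here is a small sketch in Python (which wraps the platform’s open()) of the portable guard: check for a directory explicitly instead of relying on the open call to fail consistently.

    import os

    def read_file_portably(path):
        # On Unix, opening a directory for reading can succeed at the C level;
        # on Windows it fails outright. Checking up front yields one behavior
        # on every platform.
        if os.path.isdir(path):
            raise IsADirectoryError(path)
        with open(path, "rb") as f:
            return f.read()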

Along the way, we also had to ensure that every library that osquery depends on worked on Windows, too. This required fixing some bugs and making substitutions, like using linenoise-ng instead of GNU readline. There were still additional complexities: the build system had to accommodate a new OS, use Windows libraries, paths, compiler options, appropriate C runtime, etc.

This was just the effort to get the osquery core running. The osquery tables — the code that retrieves information from the local machine — present their own unique challenges. For instance, the processes table needed to be re-implemented on Windows. This table retrieves information about processes currently running on the system. It is a requirement for the osquery daemon to function. To implement this table, we created a generic abstraction to the Windows Management Instrumentation (WMI), and used existing WMI functionality to retrieve the list of running processes. We hope that this approach will support the creation of many more tables to tap into the vast wealth of system instrumentation data that WMI offers.
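
For a feel of the WMI route, here is a sketch using the third-party Python wmi package (our illustration; osquery’s abstraction is C++). Win32_Process is the WMI class backing a process listing like the one the processes table returns.

    import wmi  # third-party package, Windows only: pip install wmi

    def list_processes():
        # WMI exposes system instrumentation as queryable classes;
        # each Win32_Process instance is one running process.
        conn = wmi.WMI()
        for proc in conn.Win32_Process():
            yield {"pid": proc.ProcessId, "name": proc.Name, "cmdline": proc.CommandLine}

    for row in list_processes():
        print(row)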

osqueryi works on Windows too!

osqueryi, the interactive osquery shell, also works on Windows. In this screenshot we are using osquery to query the list of running processes and the cryptographic hash of a file.
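
If you’d rather script osqueryi than type at it, the shell can emit JSON. A short sketch (assuming osqueryi is on your PATH; the notepad.exe path is just an example target):

    import json
    import subprocess

    def osquery(sql):
        # osqueryi --json runs one query and prints the rows as a JSON array.
        result = subprocess.run(["osqueryi", "--json", sql],
                                capture_output=True, check=True, text=True)
        return json.loads(result.stdout)

    processes = osquery("SELECT pid, name, path FROM processes;")
    digest = osquery("SELECT path, md5, sha256 FROM hash "
                     "WHERE path = 'C:\\Windows\\notepad.exe';")
    print(len(processes), "processes;", digest)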

The port was worth the effort

Facebook sparked a lot of excitement when it released osquery in 2014.

The open source endpoint security tool allows an organization to treat its infrastructure as a database, turning operating system information into a format that can be queried using SQL-like statements. This functionality is invaluable for performing incident response, diagnosing systems operations problems, ensuring baseline security settings, and more.

It fundamentally changed security for environments running Linux distributions such as Ubuntu or CentOS, or for deployments of Mac OS X machines.

But if you were running a Windows environment, you were out of luck.

To gather similar information, you’d have to cobble together a manual solution or pay for a commercial product, which would be expensive, force vendor reliance, and lock your organization into using a proprietary (and potentially buggy) agent. Since most of these services are cloud-based, you’d also risk exposing potentially sensitive data.

Today, that’s no longer the case.

Disruption for the endpoint security market?

Because osquery runs on all three major desktop/server platforms, the open-source community can supplant proprietary, closed, commercial security and monitoring systems with free, community-supported alternatives. (Just one more example of how Facebook’s security team accounts for broader business challenges.)

We’re excited about the potential:

  • Since osquery is cross-platform, network administrators will be able to monitor complex operating system states across their entire infrastructure. Those already running an osquery deployment will be able to integrate their Windows machines seamlessly, making their work far more efficient.
  • We envision startups launching without first needing to develop agents that collect this rich set of data, as Kolide.co has already done. We’re excited to see what’s built from here.
  • More vulnerable organizations (groups that can’t afford the ‘Apple premium,’ or don’t use Linux) will be able to secure their systems to a degree that wasn’t possible before.

Get started with osquery

osquery for Windows is distributed only as source code; you must build it yourself. To do that, please see the official Building osquery for Windows guide.

Currently, osquery builds only on Windows 10, which is the sole prerequisite. All other dependencies and build tools will be installed automatically as part of the provisioning and build process.

There is an open issue to create an osquery Chocolatey package, which would allow for simple package management-style installation of osquery.

If you want our help modifying osquery’s code base for your organization, contact us.

Learn more about porting applications to Windows

We will be writing about the techniques we applied to port osquery to Windows soon. Follow us on Twitter and subscribe to our blog with your favorite RSS reader for more updates.

Plug into New York’s Infosec Community

Between the city’s size and the wide spectrum of the security industry, it’s easy to feel lost. Where are ‘your people’? How can you find talks that interest you? You want to spend your time meeting and networking, not researching your options.

So, we put together a directory of all of the infosec gatherings, companies, and university programs in NYC that we know of at nyc-infosec.com.

Why’d we create this site?

We’re better than this. Today, when investors think ‘east coast infosec,’ they think ‘Boston.’ We believe that NYC’s infosec community deserves more recognition on the national and international stages. That will come as we engage with one another to generate more interesting work.

We need breaks from routine. It’s easy to stay uptown or downtown, or only go to forensics or software security meetings. If you don’t know what’s out there, you don’t know what you’re missing out on.

We all benefit from new ideas. That’s why we started Empire Hacking. We want to help more people learn about topics that excite and inspire action.

We want to coax academics off campus. A lot of exciting research takes place in this city. We want researchers to find the groups that will be the most interested in their work. Conversely, industry professionals have much to learn from emerging academic innovations and we hope to bring them together.

Check out a new group this month

Find infosec events, companies, and universities in the city on nyc-infosec.com. If you’re not sure where to start, we recommend:

Empire Hacking (new website!)
Information security professionals gather at this semi-monthly meetup to discuss pragmatic security research and new discoveries in attack and defense over drinks and light food.

New York Enterprise Information Security Group
Don’t be fooled by the word ‘enterprise.’ This is a great place for innovative start-ups to get their ideas in front of prospective early clients. David Raviv has created a great space to connect directly with technical people working at smart, young companies.

SummerCon
Ah, SummerCon. High-quality, entertaining talks. Inexpensive tickets. Bountiful booze. Somehow, they manage to pull together an excellent line-up of speakers each year. This attracts a great crowd, ranging from “hackers to feds to convicted felons and concerned parents.”

O’Reilly Security
Until now, New York didn’t really have a technical, pragmatic security conference. This newcomer has the potential to fill that gap. It looks like O’Reilly has put a lot of resources behind it. If it turns out well for them (fingers crossed), we hope that they’ll plan more events just like it.

What’d we miss?

If you know of an event that should be on the list, please let us know on the Empire Hacking Slack.

Work For Us: Fall and Winter Internship Opportunities

If you’re studying in a degree program and you thrive at the intersection of software development and cyber security, you should apply to our fall or winter internship programs. It’s a great way to add paid experience (and a publication) to your resume, and to get a taste of what it’s like to work in a commercial infosec setting.

You’d work remotely through the fall semester or over winter break on a meaningful problem, producing or improving tools that we (Trail of Bits and the InfoSec community) need to make security better. Your work won’t culminate in a flash-in-the-pan report for an isolated problem. It will contribute to a measurable impact on modern security problems.

Two Ex-Interns Share Their Experiences

  • Sophia D’Antoine (now one of our security engineers) spent her internship working on part of what would later become MAST.
  • Evan Jensen accepted a job at MIT Lincoln Labs as a security researcher before interning for us, and still credits the experience as formative.

Why did you take this internship over others?

SD: I wasn’t determined to take a winter internship until I heard about the type of work that I could do at Trail of Bits. I’d get my own project, not just a slice of someone else’s. The chance to take responsibility for something that could have a measurable impact was very appealing. It didn’t hurt that ToB’s reputation would add some weight to my resumé.

EJ: I saw this as a chance to extend the class I took from Dan: “Penetration Testing and Vulnerability Analysis.” Coincidentally, I lined up a summer internship in the office while Dan was there. As soon as he suggested I tell my interviewer what I was working on in class, the interview ended with an offer for the position.

What did you work on during your internship?

SD: MAST’s obfuscating passes that transform the code. This wasn’t anywhere near the focus of my degree; I was studying electrical engineering. But I was playing CTFs for fun, and [ToB] liked that I was willing to teach myself. I didn’t want a project that could just be researched with Google.

EJ: I actually did two winternships at ToB. During my first, I analyzed malware that the “APT1” group used in its intrusion campaigns. During my second, I worked on generating training material for a CTF-related DARPA grant; it eventually became the material in the CTF Field Guide.

What was your experience like?

SD: It was great. I spent my entire break working on my project, and loved it. I like to have an end-goal and parameters, and the independence to research and execute. The only documentation I could find for my project was the LLVM compiler’s source code. There was no tutorial online to build an obfuscator like MAST. Beyond the technical stuff, I learned about myself, the conditions where I work best, and the types of projects that interest me most.

EJ: Working at ToB was definitely enlightening. It was the first time I actually got to use a licensed copy of IDA Pro. It was great working with other established hackers. They answered every question I could think of. I learned a lot about how to describe the challenges reverse engineers face and I picked up a few analysis tricks, too.

Why would you recommend this internship to students?

SD: So many reasons. It wasn’t a lot of little tasks. You own one big project. You start it. You finish it. You create something valuable. It’s cool paid work. It’s intellectually rewarding. You learn a lot. ToB is one of the best companies to have on your resume; it’s great networking.

EJ: People will never stop asking you about Trail of Bits.

Here’s What You Might Work On

We always have a variety of projects going on, and tools that could be honed. Your project will be an offshoot of our work, such as:

  • Our Cyber Reasoning System (CRS), which we developed for the Cyber Grand Challenge and currently use for paid engagements, has the potential to do a lot more. This is a really complicated distributed system with a multitude of open-source components at play, including symbolic executors, x86 lifting, dynamic binary translation, and more.
  • PointsTo, an LLVM-based static analysis that discovers object life-cycle (e.g. use-after-free) vulnerabilities in large software projects such as web browsers and network servers. Learn more.
  • McSema, an open-source framework that performs static translation of x86 and x86-64 binaries to the LLVM intermediate representation. McSema enables existing LLVM-based program analysis tools to operate on binary code. See the code.
  • MAST, a collection of several whole-program transformations using the LLVM compiler infrastructure as a platform for iOS software obfuscation and protection.

In general, you’ll make a meaningful contribution to the development of reverse engineering and software analysis tools. Not many places promise that kind of work to interns, nor pay for it.

Interested?

You must have experience with software development. We want you to help us refine tools that find and fix problems for good, not just for a one-time report. Show us code that you’ve written, examples on GitHub, or CTF write-ups you’ve published.

You must be motivated. You’ll start with a clear project and an identified goal. How you get there is up to you. Apart from in-person kick-off and debrief meetings in our Manhattan offices, you will work remotely.

But you won’t be alone. We take advantage of all the latest technology to get work done, including platforms like Slack, GitHub, Trello, and Hangouts. You’ll find considerable expertise available to you. We’ll do our best to organize everything we can up front so that you’re positioned for success. We will make good use of your time and effort.

If you’re headed into the public sector (maybe you took a Scholarship for Service), you may be wondering what it’s like to work at a commercial firm. If you want some industry experience before getting absorbed into a government agency, intern with us.

A fuzzer and a symbolic executor walk into a cloud

Finding bugs in programs is hard. Automating the process is even harder. We tackled the harder problem and produced two production-quality bug-finding systems: GRR, a high-throughput fuzzer, and PySymEmu (PSE), a binary symbolic executor with support for concrete inputs.

From afar, fuzzing is a dumb, brute-force method that works surprisingly well, and symbolic execution is a sophisticated approach, involving theorem provers that decide whether or not a program is “correct.” Through this lens, GRR is the brawn while PSE is the brains. There isn’t a dichotomy though — these tools are complementary, and we use PSE to seed GRR and vice versa.

Let’s dive in and see the challenges we faced when designing and building GRR and PSE.

GRR, the fastest fuzzer around

GRR is a high speed, full-system emulator that we use to fuzz program binaries. A fuzzing “campaign” involves executing a program thousands or millions of times, each time with a different input. The hope is that spamming a program with an overwhelming number of inputs will result in triggering a bug that crashes the program.
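
Stripped of GRR’s emulation and throughput tricks, a campaign is just a mutate-execute-observe loop. A toy sketch (our illustration; GRR’s mutators and execution engine are far more involved):

    import random
    import subprocess

    def mutate(seed):
        # Flip a handful of random bytes: a crude stand-in for real mutators.
        buf = bytearray(seed)
        for _ in range(random.randint(1, 8)):
            buf[random.randrange(len(buf))] = random.randrange(256)
        return bytes(buf)

    def campaign(target, seed, iterations):
        # 'target' is any binary that reads stdin. On POSIX, a negative
        # return code means the process died to a signal, i.e. it crashed.
        crashes = []
        for _ in range(iterations):
            variant = mutate(seed)
            result = subprocess.run([target], input=variant, capture_output=True)
            if result.returncode < 0:
                crashes.append(variant)
        return crashes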

Note: GRR is pronounced with two fists held in the air

During DARPA’s Cyber Grand Challenge, we went web-scale and performed tens of billions of input mutations and program executions — in only 24 hours! Below are the challenges we faced when making this fuzzer, and how we solved those problems.

  1. Throughput. Typically, program fuzzing is split into discrete steps. A sample input is given to an input “mutator,” which produces input variants. In turn, each variant is separately tested against the program in the hopes that the program will crash or execute new code. GRR internalizes these steps, and in doing so completely eliminates disk I/O and program analysis ramp-up times, which account for a significant portion of the time spent in a fuzzing campaign with other common tools.
  2. Transparency. Transparency requires that the program being fuzzed cannot observe or interfere with GRR. GRR achieves transparency via perfect isolation. GRR can “host” multiple 32-bit x86 processes in memory within its 64-bit address space. The instructions of each hosted process are dynamically rewritten as they execute, guaranteeing safety while maintaining operational and behavioral transparency.
  3. Reproducibility. GRR emulates both the CPU architecture and the operating system, thereby eliminating sources of non-determinism. GRR records program executions, enabling any execution to be faithfully replayed. GRR’s strong determinism and isolation guarantees let us combine the strengths of GRR with the sophistication of PSE. GRR can snapshot a running program, enabling PSE to jump-start symbolic execution from deep within a given program execution.

PySymEmu, the PhD of binary symbolic execution

Symbolic execution as a subject is hard to penetrate. Symbolic executors “reason about” every path through a program, there’s a theorem prover in there somewhere, and something something… bugs fall out the other end.

At a high level, PySymEmu (PSE) is a special kind of CPU emulator: it has a software implementation for almost every hardware instruction. When PSE symbolically executes a binary, what it really does is perform all the ins-and-outs that the hardware would perform if the CPU itself were executing the code.

PSE explores the relationship between the life and death of programs in an unorthodox scientific experiment

CPU instructions operate on registers and memory. Registers are names for super-fast but small data storage units. Typically, registers hold four to eight bytes of data. Memory on the other hand can be huge; for a 32-bit program, up to 4 GiB of memory can be addressed. PSE’s instruction simulators operate on registers and memory too, but they can do more than just store “raw” bytes — they can store expressions.

A program that consumes some input will generally do the same thing every time it executes. This happens because that “concrete” input will trigger the same conditions in the code, and cause the same loops to merry-go-round. PSE operates on symbolic input bytes: free variables that can initially take on any value. A fully symbolic input can be any input and therefore represents all inputs. As PSE emulates the CPU, if-then-else conditions impose constraints on the originally unconstrained input symbols. An if-then-else condition that asks “is input byte B less than 10” will constrain the symbol for B to be in the range [0, 10) along the true path, and to be in the range [10, 256) along the false path.

If-then-elses are like forks in the road when executing a program. At each such fork, PSE will ask its theorem prover: “if I follow the path down one of the prongs of the fork, then are there still inputs that satisfy the additional constraints imposed by that path?” PSE will follow each yay path separately, and ignore the nays.
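
You can reproduce that fork-and-ask pattern with an off-the-shelf solver. A toy sketch using Z3’s Python bindings (an illustration only; we are not describing PSE’s internal solver interface):

    from z3 import BitVec, Solver, ULT, sat

    b = BitVec("b", 8)  # one fully symbolic input byte

    def feasible(constraints):
        # "If I follow this prong of the fork, does some input still satisfy
        # every constraint collected along the path?"
        s = Solver()
        s.add(*constraints)
        return s.check() == sat

    true_path = [ULT(b, 10)]                   # took the "B < 10" branch
    print(feasible(true_path))                 # True: e.g., b = 3 works
    print(feasible(true_path + [ULT(15, b)]))  # False: no byte is < 10 and > 15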

So, what challenges did we face when creating and extending PSE?

  1. Comprehensiveness. Arbitrary program binaries can exercise any of the thousands of instructions available to x86 CPUs. PSE implements simulation functions for hundreds of x86 instructions. PSE falls back on a custom, single-instruction “micro-executor” in those cases where an instruction emulation is not or cannot be provided. In practice, this setup enables PSE to comprehensively emulate the entire CPU.
  2. Scale. Symbolic executors try to follow all feasible paths through a program by forking at every if-then-else condition, and constraining the symbols one way or another along each path. In practice, there are an exponential number of possible paths through a program. PSE handles the scalability problem by selecting the best path to execute for the given execution goal, and by distributing the program state space exploration process across multiple machines.
  3. Memory. Symbolic execution produces expressions representing simple operations, like adding two symbolic numbers together, or constraining the possible values of a symbol down one path of an if-then-else code block. PSE gracefully handles the case where addresses pointing into memory are symbolic. Memory accessed via a symbolic address can potentially point anywhere — even to “good” and “bad” (i.e., unmapped) memory (see the sketch after this list).
  4. Extensibility. PSE is written using the Python programming language, which makes it easy to hack on. However, modifying a symbolic executor can be challenging — it can be hard to know where to make a change, and how to get the right visibility into the data that will make the change a success. PSE includes smart extension points that we’ve successfully used for supporting concolic execution and exploit generation.
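
For the memory item above, SMT arrays make the problem concrete. A toy model in Z3’s Python bindings (again our illustration, not PSE’s actual memory representation): the solver decides whether a read through a symbolic pointer can land in unmapped territory.

    from z3 import Array, BitVec, BitVecSort, Select, Solver, ULT, sat

    # Model a 32-bit address space as an array from 32-bit addresses to bytes.
    mem = Array("mem", BitVecSort(32), BitVecSort(8))
    addr = BitVec("addr", 32)  # a pointer whose value depends on program input

    s = Solver()
    s.add(Select(mem, addr) == 0x41)  # the program reads 'A' through the pointer
    s.add(ULT(addr, 0x1000))          # ...and the read lands below the first mapped page
    if s.check() == sat:
        print("a symbolic address can reach unmapped memory:", s.model()[addr])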

Measuring excellence

So how do GRR and PSE compare to the best publicly available tools?

GRR

GRR is both a dynamic binary translator and fuzzer, and so it’s apt to compare it to AFLPIN, a hybrid of the AFL fuzzer and Intel’s PIN dynamic binary translator. During the Cyber Grand Challenge, DARPA helpfully provided a tutorial on how to use PIN with DECREE binaries. At the time, we benchmarked the two and found that GRR, before we even started optimizing it, was already twice as fast as PIN!

The more important comparison metric is bug-finding. AFL’s mutation engine is smart and effective, especially in how it chooses the next input to mutate. GRR internalizes Radamsa, another smart mutation engine, as one of its many input mutators. Eventually we may also integrate AFL’s mutators. During the qualifying event, GRR went face-to-face with AFL, which was integrated into the Driller bug-finding system. Our combination of GRR+PSE found more bugs. Beyond this one data point, a head-to-head comparison would be challenging and time-consuming.

PySymEmu

PSE can be most readily compared with KLEE, a symbolic executor of LLVM bitcode, or the angr binary analysis platform. LLVM bitcode is a far cry from x86 instructions, so it’s an apples-to-oranges comparison. Luckily we have McSema, our open-source and actively maintained x86-to-LLVM bitcode translator. Our experiences with KLEE have been mostly negative; it’s hard to use, hard to hack on, and it only works well on bitcode produced by the Clang compiler.

Angr uses a customized version of Valgrind’s VEX intermediate representation, which enables it to work on many different platforms and architectures. Many of the angr examples involve reverse engineering CTF challenges instead of exploitation challenges. These RE problems often require manual intervention or state knowledge to proceed. PSE, by contrast, is designed to try to crash the program at every possible emulated instruction. For example, PSE will use its knowledge of symbolic memory to probe for invalid array-like memory accesses instead of just trying to solve for reaching unconstrained paths. During the qualifying event, angr went face-to-face with GRR+PSE, and we found more bugs. Since then, we have improved PSE to support user interaction, concrete and concolic execution, and taint tracking.

I’ll be back!

Automating the discovery of bugs in real programs is hard. We tackled this challenge by developing two production-quality bug-finding tools: GRR and PySymEmu.

GRR and PySymEmu have been a topic of discussion in recent presentations about our CRS, and we suspect that these tools may be seen again in the near future.

Your tool works better than mine? Prove it.

No doubt, DARPA’s Cyber Grand Challenge (CGC) will go down in history for advancing the state of the art in a variety of fields: symbolic execution, binary translation, and dynamic instrumentation, to name a few. But there is one contribution that we believe has been overlooked so far, and that may prove to be the most useful of them all: the dataset of challenge binaries.

Until now, if you wanted to ‘play along at home,’ you would have had to install DECREE, a custom Linux-derived operating system that has no signals, no shared memory, no threads, and only seven system calls. Sound like a hassle? We thought so.

One metric for all tools

Competitors in the Cyber Grand Challenge identify vulnerabilities in challenge binaries (CBs) written for DECREE on the 32-bit Intel x86 architecture. Since 2014, DARPA has released the source code for over 100 of these vulnerable programs. These programs were specifically designed with vulnerabilities that represent a wide variety of software flaws. They are more than simple test cases; they approximate real software with enough complexity to stress both manual and automated vulnerability discovery.

If the CBs become widely adopted as benchmarks, they could change the way we solve security problems. This mirrors the rapid evolution of the SAT and ML communities once standardized benchmarks and regular competitions were established. The challenge binaries, valid test inputs, and sample vulnerabilities create an industry standard benchmark suite for evaluating:

  • Bug-finding tools
  • Program-analysis tools (e.g. automated test coverage generation, value range analysis)
  • Patching strategies
  • Exploit mitigations

The CBs are a more robust set of tests than previous approaches to measuring the quality of software analysis tools (e.g. SAMATE tests, NSA Juliet tests, or the STONESOUP test cases). First, the CBs are complex programs like games, content management systems, and image processors, instead of just snippets of vulnerable code. After all, to be effective, analysis tools must process real software with a fairly low bug density, not isolated snippets of vulnerable code. Second, unlike open source projects with added bugs, we have very high confidence that all the bugs in the CBs have been found, so analysis tools can be compared to an objective standard. Finally, the CBs also come with extensive functionality tests, triggers for introduced bugs, patches, and performance monitoring tools, enabling benchmarking of patching tools and bug mitigation strategies.

Creating an industry standard benchmarking set will solve several problems that hamper development of future program analysis tools:

First, the absence of standardized benchmarks prevents an objective determination of which tools are “best.” Real applications don’t come with triggers for complex bugs, nor an exhaustive list of those bugs. The CBs provide metrics for comparison, such as:

  • Number of bugs found
  • Number of bugs found per unit of time or memory
  • Categories of bugs found and missed
  • Variances in performance from configuration options

Next, which mitigations are most effective? CBs come with inputs that stress original program functionality, inputs that check for the presence of known bugs, and performance measuring tools. These allow us to explore questions like:

  • What is the potential effectiveness and performance impact of various bug mitigation strategies (e.g. Control Flow Integrity, Code Pointer Integrity, Stack Cookies, etc)?
  • How much slower does the resulting program run?
  • How good is a mitigation compared to a real patch?

Play Along At Home

The teams competing in the CGC have had years to hone and adapt their bug-finding tools to the peculiarities of DECREE. But the real world doesn’t run on DECREE; it runs on Windows, Mac OS X, and Linux. We believe that research should be guided by real-world challenges and parameters. So, we decided to port* the challenge binaries to run in those environments.

It took us several attempts to find a porting approach that minimized code changes while preserving as much original code as possible between platforms. The eventual solution was fairly straightforward: build each compilation unit without standard include files (as all CBs are statically linked), implement CGC system calls using their native equivalents, and make various minor fixes so the code is compatible with more compilers and standard libraries.

We’re excited about the potential of multi-platform CBs on several fronts:

  • Since there’s no need to set up a virtual machine just for DECREE, you can run the CBs on the machine you already have.
  • With that hurdle out of the way, we all now have an industry benchmark to evaluate program analysis tools. We can make comparisons such as:
    • How do the CGC tools compare to existing program analysis and bug-finding tools?
    • When a new tool is released, how does it stack up against the current best?
    • Do static analysis tools that work with source code find more bugs than dynamic analysis tools that work with binaries?
    • Are tools written for Mac OS X better than tools written for Linux, and are they better than tools written for Windows?
  • When researchers open source their code, we can evaluate how well their findings work for a particular OS or compiler.

Before you watch the competitors’ CRSs duke it out, explore the challenges that the robots will attempt to solve in an environment you’re familiar with.

Get the CGC’s Challenge Binaries for the most common operating systems.

* Big thanks to our interns, Kareem El-Faramawi and Loren Maggiore, for doing the porting, and to Artem, Peter, and Ryan for their support.

Why I didn’t catch any Pokemon today

tl;dr While the internet went crazy today, we went fact-finding. Here are our notes on Pokemon Go’s permissions to your Google account.

Here’s what Jay and I set out to do at around 6pm today:

  • Find what permissions Pokemon Go is actually requesting
  • Investigate what the permissions actually do
  • Replicate the permissions in a test app

Our first instinct was to go straight to the code, so we began by loading up the iOS app in a jailbroken phone. The Pokemon Go app uses jailbreak detection to prevent users with modified devices from accessing the game. As we have commonly found with such protections, they were trivial to bypass and, as a result, afforded no real protection. We recommend that firms contact us about MAST if they need more formidable application protection.

Niantic issues an OAuth request to Google with their scope set to the following (note: “scope” determines the level of access that Niantic has to your account and each requested item is a different class of data):

The OAuthLogin scope stands out in this list. It is mainly used by applications from Google, such as Chrome and the iOS Account Manager, though we were able to find a few Github projects that used it too.

It’s not possible to use this OAuth scope from Google’s own OAuth Playground; it only returns various “not authorized” error messages. This means that the OAuth Playground, Google’s own service for testing access to their APIs, is unable to exactly replicate the permissions requested by Pokemon Go.

It might be part of the OAuth 1.0 API, which was deprecated by Google in 2012 and shut down in 2015. If so, we’re not sure why Pokemon Go was able to use it. We checked, and accounts that migrate up to the OAuth 2.0 API are no longer able to access the older 1.0 API.

We found changelogs in the source code for Google Chrome that refer to this OAuth scope as the “Uber” token where it is passed with the “IssueUberAuth” GET parameter.

It does not appear possible to create our own app that uses this OAuth scope through normal or documented means. In order to properly test the level of access provided by this OAuth token, we would probably need to hook an app with access to one (e.g., via a Cydia hook).

The Pokemon Go login flow does not describe what permissions are being requested and silently re-enables them after they’ve been revoked. Further, the available documentation fails to adequately describe what token permissions mean to anyone trying to investigate them.

It’s clear that this access is not needed to identify user accounts in Pokemon Go. While we were writing this we expected Niantic to ultimately respond by reducing the privileges they request. By the time we hit publish, they released a statement confirming they will.

For once, we agree with a lot of comments on Hacker News.

This seems like a massive security fail on Google’s part. There’s no reason the OAuth flow should be able to request admin privileges silently. As a user, I really must get a prompt asking me (and warning me!). — ceejayoz

We were able to query for specific token scopes through Google Apps for Work but we have not found an equivalent for personal accounts. Given that these tokens are nearly equivalent to passwords, it seems prudent to enable greater investigation and transparency about their use on all Google accounts for the next inevitable time that this occurs.

Google Apps for Work lets you query individual token scopes

By the time we got this far, Niantic released a statement that confirmed they had far more access than needed:

We recently discovered that the Pokémon GO account creation process on iOS erroneously requests full access permission for the user’s Google account. However, Pokémon GO only accesses basic Google profile information (specifically, your User ID and email address) and no other Google account information is or has been accessed or collected. Once we became aware of this error, we began working on a client-side fix to request permission for only basic Google profile information, in line with the data that we actually access. Google has verified that no other information has been received or accessed by Pokémon GO or Niantic. Google will soon reduce Pokémon GO’s permission to only the basic profile data that Pokémon GO needs, and users do not need to take any actions themselves.

After Google and Niantic follow through with the actions described in their statement, this will completely resolve the issue. As best we can tell, Google plans to find the already issued tokens and “demote” them, in tandem with Niantic no longer requesting these permissions for new users.

Thanks for reading and let us know if you have any further details! Please take a second to review what apps you have authorized via the Google Security Checkup, and enable 2FA.

Update 7/12/2016: It looks like we were on the right track with the “UberAuth” tokens. This OAuth scope initially gains access to very little but can be exchanged for new tokens that allow access to all data in your Google account, including Gmail, through a series of undocumented methods. More details: https://gist.github.com/arirubinstein/fd5453537436a8757266f908c3e41538

Update 7/13/2016: The Pokemon Go app has been updated to request only basic permissions. Niantic’s statement indicated they were going to de-privilege all the erroneously issued tokens themselves, but if you want to jump ahead of them, go to your App Permissions, revoke the Pokemon Go access, sign out of the Pokemon Go app, and then sign back in.

Start using the Secure Enclave Crypto API

tl;dr – Tidas is now open source. Let us know if your company wants help trying it out.

When Apple quietly released the Secure Enclave Crypto API in iOS 9 (kSecAttrTokenIDSecureEnclave), it allowed developers to liberate their users from the annoyance of strong passwords or OAuth.

That is, if the developers could make do without documentation.

The required attribute was entirely undocumented. The key format was incompatible with OpenSSL. Apple didn’t even say which elliptic curve was used (it’s secp256r1). It was totally unusable in its original state. The app-developer community was at a loss.

We filled the gap

We approached this as a reverse-engineering challenge. Ryan Stortz applied his considerable skill and our collective knowledge of the iOS platform to figure out how to use this new API.

Once Ryan finished a working set of tools to harness the Secure Enclave, we took the next step. We released a service based on this feature: Tidas.

When your app is installed on a new device, the Tidas SDK generates a unique encryption key identifying the user and registers it with the Tidas web service. This key is stored on the client device in the Secure Enclave and is protected by Touch ID, requiring the user to use their fingerprint to unlock it. Client sign-in generates a digitally-signed session token that your backend can pass to the Tidas web service to verify the user’s identity. The entire authentication process is handled by our easy-to-use SDK and avoids transmitting users’ sensitive data. They retain their privacy. You minimize your liability.
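
At its core, the verification step is an ECDSA check over the curve Apple uses (secp256r1). A sketch with the Python cryptography package (our illustration of the pattern; the real Tidas wire format is defined by the SDK):

    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import ec

    def verify_login(public_key, challenge, signature):
        # public_key is the secp256r1 public key the device registered at
        # enrollment; the private half never leaves the Secure Enclave.
        try:
            public_key.verify(signature, challenge, ec.ECDSA(hashes.SHA256()))
            return True
        except InvalidSignature:
            return False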

David Schuetz, at NCC Group, assessed Tidas’s protocol in this tidy write-up. David’s graphic on the right accurately describes the Tidas wire protocol.

Tidas’s authentication protocol, combined with secure key storage in the Secure Enclave, provides strong security assurances and prevents attacks like phishing and replays. It significantly lowers the bar to adopting token-only authentication in a mobile-first development environment.

We saw enormous security potential in enabling applications to use private keys that are safely stored outside of iOS and away from any potential malware: easily unlocking your computer with a press of Touch ID, stronger password managers, more trustworthy mobile payments, and more.

We thought the benefits were clear, so we put together a website and released this product to the internet.

Today, Tidas becomes open source.

Since its February release, Tidas has raised a lot of eyebrows. The WSJ wrote an article about it. We spoke with a dozen different banks that wanted Tidas for its device-binding properties and potential reduction to fraud. Meanwhile, we courted mobile app developers directly for trial runs.

Months later, none of this potential has resulted in clients.

Authentication routines are the gateway to your application. The developers we spoke with were unwilling to modify them in the slightest if it risked locking out honest paying customers.

Banks liked the technology, but none would consider purchasing a point solution for a single platform (iOS).

So, Tidas becomes open source today. All its code is available at https://github.com/tidas. If you want to try using the Secure Enclave on your own, check out our DIY toolkit: https://github.com/trailofbits/SecureEnclaveCrypto. It resolves all the Apple problems we mentioned above by providing an easy-to-use wrapper around the Secure Enclave API. Integration with your app could not be easier.

If your company is interested in trying it out and wants help, contact us.