Why we give so much to CSAW

In just a couple of weeks, tens of thousands of students and professionals from all over the world will tune in to cheer on their favorite teams in six competitions. If you’ve been following our blog for some time, you’ll know just what we’re referring to: Cyber Security Awareness Week (CSAW), the nation’s largest student-run cyber security event. Regardless of how busy we get, we always make time to contribute to the event’s success.

CSAW holds a special place in our hearts.

We are proud of our roots in academia and research, and we believe it’s important to promote cyber security education for all students. We’ve been involved in CSAW since its inception. Dan and Yan competed as students and went on to play a central role in the event’s early years. Since then, our employees have contributed to its events, particularly the CTF challenges, our favorite flavor of CSAW. (Special kudos to Ryan and Sophia for all the time and effort they’ve contributed.) In fact, several of our staff competed as students before joining our team. Here’s looking at you, Sophia and Sam. Finally, we feel fortunate to have met our most recent intern, Loren, through the affiliated CSAW Summer Program for Women.

Part of what makes the CTF so great is that it incorporates diverse contributions by an array of collaborators. The resulting depth of expertise is hard to match.

This year, we contributed five CTF challenges for the qualifying round.


Participants start with an obfuscated Linux binary that asks for input when run (aka a crackme). Heavy obfuscation, using varying degrees of false predicate insertion, code diffusion, and basic block splitting (all possible through LLVM), would make this a leviathan of a static-reversing challenge. Instead, participants had to pursue a dynamic approach, using program analysis tools to brute force the flag. In the process, they learned how to leak which path the program takes by monitoring changes in instruction counts, and how to use tools such as PIN, Angr, or AFL.
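
A minimal sketch of that dynamic approach, assuming a hypothetical ./crackme that reads the flag from stdin and bails out early on a wrong prefix: single-step the program under ptrace and treat a higher instruction count as evidence that one more character of the candidate flag was accepted. (A PIN tool or hardware performance counters are far faster in practice; the flag{ prefix and the “more instructions means more progress” heuristic are assumptions for illustration.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run ./crackme with `input` on stdin and count executed instructions by
 * single-stepping the process under ptrace. */
static long count_steps(const char *input) {
    int in[2];
    pipe(in);
    pid_t pid = fork();
    if (pid == 0) {
        dup2(in[0], STDIN_FILENO);
        close(in[0]); close(in[1]);
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("./crackme", "./crackme", (char *)NULL);   /* assumed target */
        _exit(1);
    }
    write(in[1], input, strlen(input));
    write(in[1], "\n", 1);
    close(in[1]); close(in[0]);

    long steps = 0;
    int status;
    waitpid(pid, &status, 0);                 /* child stops at exec */
    while (WIFSTOPPED(status)) {
        ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL);
        waitpid(pid, &status, 0);
        steps++;
    }
    return steps;
}

int main(void) {
    char flag[64] = "flag{";                  /* assumed flag format */
    size_t len = strlen(flag);
    while (len < sizeof(flag) - 2) {
        long best = -1;
        char best_c = '?';
        for (int c = 0x20; c < 0x7f; c++) {   /* try every printable byte */
            flag[len] = (char)c;
            flag[len + 1] = '\0';
            long n = count_steps(flag);
            if (n > best) { best = n; best_c = (char)c; }
        }
        flag[len++] = best_c;
        flag[len] = '\0';
        printf("so far: %s\n", flag);
        if (best_c == '}')                    /* crude stop condition */
            break;
    }
    return 0;
}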

bricks of gold

Solution to Bricks of Gold

This challenge began with a note of international mystery: “We’ve captured an encrypted file being smuggled into the country. All we know is that they rolled their own custom CBC mode algorithm – it’s probably terrible.” Participants had to decrypt the file’s custom XOR-CBC encryption. That led them to seek the algorithm, the key, and the IV. Doing so required knowledge of file headers, cryptography, and brute force. Participants also learned how to examine an encrypted file for low entropy, unencrypted strings, and CBC mode block patterns.
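
A minimal sketch of the decryption once the pieces are recovered, assuming a 16-byte block, a repeating XOR key, and a known IV (all three had to be recovered in the challenge itself):

#include <stddef.h>
#include <string.h>

#define BLOCK 16   /* assumed block size */

/* XOR "block cipher" run in CBC mode: during encryption each plaintext block
 * was XORed with the previous ciphertext block and then with the key, so
 * decryption is P[i] = C[i] ^ key ^ C[i-1], with C[-1] = IV. */
static void xor_cbc_decrypt(unsigned char *buf, size_t len,
                            const unsigned char key[BLOCK],
                            const unsigned char iv[BLOCK]) {
    unsigned char prev[BLOCK], cur[BLOCK];
    memcpy(prev, iv, BLOCK);
    for (size_t off = 0; off + BLOCK <= len; off += BLOCK) {
        memcpy(cur, buf + off, BLOCK);          /* keep ciphertext for chaining */
        for (size_t i = 0; i < BLOCK; i++)
            buf[off + i] ^= key[i] ^ prev[i];   /* undo key, then chaining */
        memcpy(prev, cur, BLOCK);
    }
}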


Participants receive an archive of a broken Git repository. They need to fix the corruption and read the files. In fact, there are three corruptions: each is a single flipped bit, and all are contained in individual source code files. (This actually happened to Trail of Bits.) Once repaired, the source code files compile into a binary with the answer embedded inside. Participants learn how Git blobs contain versions of repository files that have been prepended with a header and zlib compressed. Git’s versioning provides enough information to rebuild the broken commits. They must dig into the lower-level details of how Git is implemented to write a recovery program.
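
A minimal sketch of the kind of recovery tooling this pushes participants toward, assuming zlib and OpenSSL are on hand: inflate a loose object and compare the SHA-1 of its contents against the object name from its path. A mismatch (or an inflate error) marks the corrupted blob, and brute-forcing single-bit flips of the compressed file until the hash matches repairs it.

#include <openssl/sha.h>
#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s <loose-object-file> <expected-sha1-hex>\n", argv[0]);
        return 1;
    }

    /* Read the zlib-compressed loose object (.git/objects/ab/cdef...). */
    static unsigned char comp[1 << 20], raw[1 << 22];
    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }
    size_t clen = fread(comp, 1, sizeof(comp), f);
    fclose(f);

    /* Inflate: the result is "blob <size>\0<file contents>". */
    uLongf rlen = sizeof(raw);
    if (uncompress(raw, &rlen, comp, (uLong)clen) != Z_OK) {
        fprintf(stderr, "zlib inflate failed: the corruption is in this object\n");
        return 1;
    }

    /* Git object names are the SHA-1 of the inflated header plus contents. */
    unsigned char digest[SHA_DIGEST_LENGTH];
    SHA1(raw, rlen, digest);
    char hex[41];
    for (int i = 0; i < SHA_DIGEST_LENGTH; i++)
        sprintf(hex + 2 * i, "%02x", digest[i]);

    printf("%s %s\n", hex, strcmp(hex, argv[2]) == 0 ? "OK" : "CORRUPT");
    return 0;
}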


The story opens with three binary blobs taken from IBM System/360 punch cards, and their encrypted data. These cards were encrypted with technology and techniques from 1965, requiring participants to research how security worked in that era. They also encounter ciphers like the KW-26, which generated long streams of bits and XOR’d them against the plaintext, and IBM’s use of EBCDIC (not ASCII) for encoding. The same stream of bits was used to encrypt each blob, and this cryptographic key reuse has a known attack. Participants attack the cipher with “cribs” in a process known as “crib dragging.”
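
A minimal sketch of crib dragging, assuming printable-ASCII plaintext for readability (the challenge’s plaintext was EBCDIC, so the “looks like text” test has to change accordingly):

#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* With two ciphertexts made with the same key stream, c1 ^ c2 == p1 ^ p2 and
 * the key drops out. Drag a guessed word (the crib) across c1 ^ c2: wherever
 * the result reads as text, the crib sits in one plaintext and the output is
 * the corresponding fragment of the other. */
void crib_drag(const unsigned char *c1, const unsigned char *c2, size_t len,
               const char *crib) {
    size_t clen = strlen(crib);
    char out[64];
    if (clen == 0 || clen >= sizeof(out))
        return;
    for (size_t off = 0; off + clen <= len; off++) {
        int printable = 1;
        for (size_t i = 0; i < clen; i++) {
            out[i] = (char)(c1[off + i] ^ c2[off + i] ^ (unsigned char)crib[i]);
            if (!isprint((unsigned char)out[i]))
                printable = 0;
        }
        out[clen] = '\0';
        if (printable)
            printf("offset %zu: \"%s\"\n", off, out);  /* candidate fragment */
    }
}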

“math aside, we’re all blackhats now”

Participants must identify the security industry consultant working for the TV show ‘Silicon Valley.’ During its first two seasons, discerning viewers noticed all kinds of props, name dropping, and references to the CTF community, with notable accuracy in its security-related plot elements. There is no way the show’s producers could have learned all these references on their own. Someone had to be feeding them inside information. Who could it be?

1,367 teams scored at least one point, which already makes the event a resounding success in our book. We’re looking forward to watching the CTF finalists duke it out in New York. If you missed the deadlines, you can always find our old CTF challenges on GitHub.

T-shirt bounty for writeups

For a few bribable teams willing to share their thought processes, we’re passing out these snazzy t-shirts for posting helpful writeups. We think it’s pretty cool to send these shirts all over the world, including England, Canada, Australia, and Singapore!

Thanks and kudos to:




Shaped the Policy competition

Wassenaar shone a spotlight on an array of issues we’ve been tackling for years now. We’re big supporters of the Coalition for Responsible Cybersecurity’s mission to ensure that U.S. export control regulations don’t negatively impact U.S. cybersecurity effectiveness.

So, it seemed only natural that we’d assist CSAW with its policy competition. We love the idea of the US Government hosting a bug bounty. We, as a country, could buy a lot of bugs for the billions wasted on junk security. Our topic challenged students to explore this idea and present a workable solution. We were delighted to see an exploration of this topic in the Army’s Cyber Defense Review recently.

Submissions were judged by a panel of experts in the field representing all sides of this contentious question. The top five teams will present their proposals in-person at CSAW. The top three teams will receive cash prizes and some serious attention from industry experts.


After three years of running THREADS, we’ve decided to refocus our contribution to CSAW on the competitions. We hope you’ll join us in helping motivate and educate students of every academic level. (If you’re out of your school years and in New York, you might be interested in coming to our Empire Hacking meetup.)

May the best teams win.

Hardware Side Channels in the Cloud

At REcon 2015, I demonstrated a new hardware side channel which targets co-located virtual machines in the cloud. This attack exploits the CPU’s pipeline, as opposed to the cache tiers that are often used in side channel attacks. When designing or looking for hardware-based side channels – specifically in the cloud – I analyzed a few universal properties that define the ‘right’ kind of vulnerable system, as well as unique ones tailored to the hardware medium.

Slides and full research paper found here.

The relevance of side channel attacks will only increase, especially attacks that target vulnerabilities inherent to systems which share hardware resources – such as cloud platforms.


Figure 1: virtualization of physical resources


Any meaningful information that you can leak from the environment running the target application (or, in this case, the victim virtual machine) counts as a side channel. However, some information is better than others. In this case, a process (the attacker) must be able to repeatedly record an environment ‘artifact’ from inside one virtual machine.

In the cloud, these environment artifacts are the shared physical resources used by the virtual machines. The hypervisor dynamically partitions each resource, and each virtual machine then sees its partition as a private resource. The side channel model (Figure 2) illustrates this.

Knowing this, the attacker can affect that resource partition in a recordable way, such as by flushing a line in the cache tier, waiting until the victim process uses it for an operation, then requesting that address again – recording what values are now there.

Figure 2: side channel model
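
As a concrete illustration of the flush-and-reload pattern described above, here is a minimal sketch in C, assuming the attacker and victim share the probed address (for example via a shared library page) and that a cycle threshold separating cache hits from misses has already been calibrated:

#include <stdint.h>
#include <x86intrin.h>

/* Time a single read of the address with serializing timestamp reads. */
static inline uint64_t time_reload(volatile uint8_t *addr) {
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*addr;                               /* reload the cache line */
    uint64_t end = __rdtscp(&aux);
    return end - start;
}

/* One probe: flush, give the victim a chance to run, then time the reload.
 * A fast reload means the victim brought the line back into the cache. */
int probe(volatile uint8_t *shared_addr, uint64_t threshold_cycles) {
    _mm_clflush((const void *)shared_addr);    /* evict the line */
    /* ... wait for the victim's operation here ... */
    return time_reload(shared_addr) < threshold_cycles;
}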


Great! So we can record things from our victim’s environment – but now what? Depending on what the victim’s process is doing, we can actually employ several different types of attacks.

1. crypto key theft

Crypto keys are great, private crypto keys are even better. Using this hardware side channel, it’s possible to leak the bytes of the private key used by a co-located process. In one scenario, two virtual machines are allocated the same space on the L3 cache at different times. The attacker flushes a certain cache address, waits for the victim to use that address, then queries it again – recording the new values that are there [1].

2. process monitoring ( what applications is the victim running? )

This is possible when you record enough of the target’s behavior, i.e. CPU or pipeline usage or values stored in memory. A mapping from the recording to a specific running process can be constructed with a varying degree of certainty. Warning: this does rely on at least a rudimentary knowledge of machine learning.

3. environment  keying ( great for proving co-location! )

Using the environment recordings taken from a specific hardware resource, you can also uniquely identify one server from another in the cloud. This is useful for proving that two virtual machines you control are co-resident on the same physical server. Alternatively, if you know the behavior signature of a server your target is on, you can repeatedly create virtual machines, recording the behavior on each system until you find a match [2].

4. broadcast signal ( receive messages without the internet :0 )

If a colluding process purposefully generates behavior on a pre-arranged hardware resource, such as filling a cache line with 0’s and 1’s, the attacker (your process) can record this behavior in the same way it would record a victim’s behavior. You can then translate the recorded values into pre-agreed messages. Recording from different hardware mediums results in channels with different bandwidths [3].

The Cache is Easy, the Pipeline is Harder

Now, all of the above examples used the cache to record the environment shared by both victim and attacker processes. The cache is the most widely used resource for constructing side channels, both in the literature and in practice, and it is the easiest to record artifacts from. Basically, everyone loves cache.

The cache isn’t the only shared resource: co-located virtual machines also share the CPU execution pipeline. In order to use the CPU pipeline, we must be able to record a value from it. However, there is no easy way for any process to query the state of the pipeline over time – it is effectively a black box. The only things a process can know are the order of the instructions it submits for execution on the pipeline and the results the pipeline returns.

out-of-order execution

( the pipeline’s artifact )

We can exploit this pipeline optimization as a means to record the state of the pipeline. A known input instruction order will result in two different return values – one is the expected result, the other is the result if the pipeline executes the instructions out of order.

Figure 3: foreign processes can share the same pipeline

strong memory ordering

Our target, cloud processors, can be assumed to be x86/64 architecture – implying a usually strongly-ordered memory model [4]. This is important because the pipeline will optimize the execution of instructions, but attempt to maintain the correct order of stores to memory and loads from memory…

…HOWEVER, the stores and loads from different threads may be reordered by out-of-order-execution. Now this reordering is observable if we’re clever.

recording instruction reorder ( or how to be clever )

In order for the attacker to record the “reordering” artifact from the pipeline, we must record two things for each of our two threads:

  • input instruction order
  • return value

Additionally, the instructions in each thread must contain a STORE to memory and a LOAD from memory. The LOAD from memory must reference the location stored to by the opposite thread. This setup ensures the possibility of the four cases illustrated below. The last is the artifact we record – doing so several thousand times gives us averages over time.

Figure 4: the attacker can record when its instructions are reordered
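
A minimal sketch of the measurement, with two co-scheduled threads standing in for the two virtual CPUs (trial counts and synchronization details are illustrative): each thread stores to its own flag and loads the other’s, and a trial where both loads return 0 means a store was delayed past a load, i.e. the pipeline executed them out of order.

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

volatile int X, Y, r1, r2;
sem_t start1, start2, done;

static void *worker1(void *arg) {
    (void)arg;
    for (;;) {
        sem_wait(&start1);
        X = 1;              /* STORE */
        r1 = Y;             /* LOAD of the other thread's flag */
        sem_post(&done);
    }
}

static void *worker2(void *arg) {
    (void)arg;
    for (;;) {
        sem_wait(&start2);
        Y = 1;              /* STORE */
        r2 = X;             /* LOAD of the other thread's flag */
        sem_post(&done);
    }
}

int main(void) {
    sem_init(&start1, 0, 0);
    sem_init(&start2, 0, 0);
    sem_init(&done, 0, 0);
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker1, NULL);
    pthread_create(&t2, NULL, worker2, NULL);

    long reorders = 0, trials = 100000;
    for (long i = 0; i < trials; i++) {
        X = Y = 0;                          /* reset before each trial */
        sem_post(&start1);
        sem_post(&start2);
        sem_wait(&done);
        sem_wait(&done);
        if (r1 == 0 && r2 == 0)             /* both loads beat both stores */
            reorders++;
    }
    printf("observed reorders: %ld / %ld\n", reorders, trials);
    return 0;
}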

sending a message

To make our attacks more interesting, we want to be able to force the number of recorded out-of-order executions. This ability is useful for other attacks, such as constructing covert communication channels.

In order to do this, we need to alter how the pipeline’s optimization works – either by increasing or by decreasing the probability that it will reorder our two threads. The easiest approach is to enforce a strong memory order and guarantee that the attacker will observe fewer out-of-order executions.

memory barriers

In the x86 instruction set, there are specific barrier instructions that stop the processor from reordering the four possible combinations of STOREs and LOADs. What we’re interested in is forcing a strong order when the processor encounters an instruction sequence with a STORE followed by a LOAD.

The instruction mfence does exactly this.

By having the colluding process inject these memory barriers into the pipeline, the attacker’s instructions will not be reordered, forcing a noticeable decrease in the recorded averages. Doing this in distinct time frames allows us to send a binary message.

Figure 5: mfence ensures the strong memory order on pipeline
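
A minimal sketch of the sender side of such a channel, assuming fixed time slots and co-residency on the same pipeline (the slot length, and that the receiver runs the reorder measurement from the previous section, are assumptions for illustration):

#include <stddef.h>
#include <time.h>

#define SLOT_NS 10000000LL   /* 10 ms per bit, an assumed framing */

static volatile int scratch;

static long long now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* During one time slot, transmit a 1 by hammering mfence (serializing the
 * shared pipeline so the receiver observes far fewer reorders) and a 0 by
 * running ordinary stores and loads that leave reordering alone. */
void send_bit(int bit) {
    long long end = now_ns() + SLOT_NS;
    while (now_ns() < end) {
        if (bit)
            __asm__ __volatile__("mfence" ::: "memory");  /* force strong order */
        else {
            scratch = 1;          /* ordinary STORE ... */
            (void)scratch;        /* ... and LOAD */
        }
    }
}

void send_message(const unsigned char *bits, size_t n) {
    for (size_t i = 0; i < n; i++)
        send_bit(bits[i]);
}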


The takeaway is that even with virtualization separating your virtual machine from the hundreds of other alien virtual machines, the pipeline can’t distinguish your process’s instructions from all the other ones and we can use that to our advantage. :0

If you would like to learn more about this side channel technique, please read the full paper here.

  1. https://eprint.iacr.org/2013/448.pdf
  2. http://www.ieee-security.org/TC/SP2011/PAPERS/2011/paper020.pdf
  3. https://www.cs.unc.edu/~reiter/papers/2014/CCS1.pdf
  4. http://preshing.com/20120930/weak-vs-strong-memory-models/

Empire Hacking, a New Meetup in NYC

Today we are launching Empire Hacking, a bi-monthly meetup that focuses on pragmatic security research and new discoveries in attack and defense.


It’s basically a security poetry jam

Empire Hacking is technical. We aim to bridge the gap between weekend projects and funded research. There won’t be any product pitches here. Come prepared with your best ideas.

Empire Hacking is exclusive. Talks are by invitation-only and are under Chatham House Rule. We will discuss ongoing research and internal projects you won’t hear about anywhere else.

Empire Hacking is engaging. Talk about subjects you find interesting, face to face, with a community of experts from across the industry.

Each meetup will consist of short talks from three expert speakers and run from 6-9pm at Trail of Bits HQ. Tentative schedule: Even months, on Patch Tuesday (the 2nd Tuesday). Beverages and light food will be provided. Space is limited. Please apply on our Meetup page.

Our inaugural meetup will feature talks from Chris Rohlf, Dr. Byron Cook, and Nick DePetrillo on Tuesday, June 9th.

Offense at Scale

Chris will discuss the effects of scale on vulnerability research, fuzzing and real attack campaigns.

Chris Rohlf runs the penetration testing team at Yahoo in NYC. Before Yahoo he was the founder of Leaf Security Research, a highly-specialized security consultancy with expertise in vulnerability discovery, reversing and exploit development.

Automatically proving program termination (and more!)

Byron will discuss research advances that have led to practical tools for automatically proving program termination and related properties.

Dr. Byron Cook is professor of computer science at University College London.

Cellular Baseband Exploitation

Baseband exploitation has been a topic of interest for many; however, few have described the effort required to make such attacks practical. In this talk, we explore the challenges of reliable, large-scale cellular baseband exploitation.

Nick DePetrillo is a principal security engineer at Trail of Bits with expertise in cellular hardware and infrastructure security.

Keep up with Empire Hacking by following us on Twitter. See you at a meetup!

Frequently Asked Questions

Why is Empire Hacking a membership-based group?

To cultivate a tight-knit community. This should be a place where members feel free to discuss private or exclusive research and data, knowing that it will remain within the group. Furthermore, we believe that a membership process increases motivation to make a high-quality contribution.

To protect against abuse. Everyone is expected to treat his or her fellow members with respect and decency. Violators lose membership and all access to the group, including membership lists, meeting locations, and our discussion board.

To follow the crowd. Not really. But seriously, we are hardly the first private meetup or group in security. Consider that NCC Open Forum “is by invite only and is limited to engineers and technical managers”, NY Information Security Meetup charges $5 to attend, and Ops-T “does not accept applications for membership.”

Why does Empire Hacking use Chatham House Rules?

We welcome everyone to apply to Empire Hacking, even journalists. But we don’t want participants to worry that their personal thoughts will be relayed to outsiders, or used against them or their employers. We enforce Chatham House Rules to preserve the balance between candor and discretion.

How can I attend a meetup?

Please apply on our meetup.com page. If you have any trouble, feel free to reach out to any of the Trail of Bits staff.

The Foundation of 2015: 2014 in Review

We need to do more to protect ourselves. 2014 overflowed with front-page proof: Apple, Target, JPMorgan Chase. Et cetera. Et cetera.

The current, vulnerable status quo begs for radical change, an influx of talented people, and substantially better tools. As we look ahead to driving that change in 2015, we’re proud to highlight a selection of our 2014 accomplishments that will underpin that work.

1. Open-source framework to transform binaries to LLVM bitcode

Our framework for analyzing and transforming machine-code programs to LLVM bitcode became a new tool in the program analysis and reverse engineering communities. McSema connects the world of LLVM program analysis and manipulation tools to binary executables. Currently it supports translation of x86 programs, including subsets of the integer arithmetic, floating point, and vector instruction sets.

2. Shaped smarter public policy

The spate of national-scale computer security incidents spurred anxious conversation and action. To pre-empt poorly conceived laws from poorly informed lawmakers, we worked extensively with influential think tanks to help educate our policy makers on the finer points of computer security. The Center for a New American Security’s report “Surviving on a Diet of Poisoned Fruit” was just one result of this effort.

3. More opportunities for women

As part of our ongoing collaboration with NYU-Poly, Trail of Bits put its support behind the CSAW Program for High School Women and Career Discovery in Cyber Security Symposium. These events are intended to help guide talented and interested women into careers in computer security. We want to create an environment where women have the resources to contribute and excel in this industry.

4. Empirical data on secure development practices

In contrast with traditional security contests, Build-it, Break-it, Fix-it rewards secure software development under the same pressures that lead to bugs: tight deadlines, performance requirements, competition, and the allure of money. We were invited to share insights from the event at Microsoft’s Bluehat v14.

5. Three separate Cyber Fast Track projects

Under DARPA’s Program Manager Peiter ‘Mudge’ Zatko, we completed three distinct projects in the revolutionary Cyber Fast Track program: CodeReason, MAST, and PointsTo. Five of our employees went to the Pentagon to demonstrate our creations to select members of the Department of Defense. We’re happy to have participated and been recognized for our work. We’re now planning on giving back; CodeReason will be making an open-source release in 2015!

6. Taught machines to find Heartbleed

Heartbleed, the infamous OpenSSL vulnerability, went undetected for so long because it’s hard for static analyzers to detect. So, Andrew Ruef took on the challenge and wrote a checker for clang-analyzer that can find Heartbleed and other bugs like it automatically. We released the code for others to learn from.

7. A resource for students of computer security

One of the most fun and effective ways to learn computer security is by competing in Capture the Flag events. But many fledgling students don’t know where to get started. So we wrote the Capture the Flag Field Guide to help them get involved and encourage them to take the first steps down this career path.

8. The iCloud Hack spurs our two-factor authentication guide

Adding two-factor authentication is always a good idea. Just ask anyone whose account has been compromised. If you store any sensitive information with Google, Apple ID or Dropbox, you’ll want to know about our guide to adding an extra layer of protection to your accounts.

9. Accepted into DARPA’s Cyber Grand Challenge

The prize: $2 million. The challenge: Build a robot that can repair insecure software without human input. If successful, this program will have a profound impact on the way companies secure their data in the future. We were selected as one of seven funded teams to compete.

10. THREADS 2014: How to automate security

Our CEO Dan Guido chaired THREADS, a research and development conference that takes place at NYU-Poly’s Cyber Security Awareness Week (CSAW). This year’s theme focused on scaling security — ensuring that security is an integral and automated part of software development and deployment models. We believe that the success of automated security is essential to our ever more internetworked society and devices. See talks and slides from the event.

Looking ahead.

This year, we’re excited to develop and share more code, including: improvements to McSema (i.e. support for LLVM 3.5, lots more SSE and FPU instruction support, and a new control flow recovery module based on JakStab), a private videochat service, and an open-source release of CodeReason. We’re also excited about Ghost in the Shellcode (GitS) — a capture the flag competition at ShmooCon in Washington DC in January that three of our employees are involved in running. And don’t forget about DARPA’s Cyber Grand Challenge qualifying event in June.

For now, we hope you’ll connect with us on Twitter or subscribe to our newsletter.

Speaker Lineup for THREADS ’14: Scaling Security

For every security engineer you train, there are 20 or more developers writing code with potential vulnerabilities. There’s no human way to keep up. We need to be more effective with fewer resources. It’s time to make security a fully integrated part of modern software development and operations.

It’s time to automate.

This year’s THREADS will focus exclusively on automating security. In this single forum, a selection of the industry’s best experts will present previously unseen in-house innovations deployed at major technology firms, and share leading research advances available in the future.

Buy tickets for THREADS now to get the early-bird special (expires 10/13).

DARPA Returns – Exclusive

If you attended THREADS’13, you know that our showcase of DARPA’s Cyber Fast Track was not-to-be-missed. Good news, folks. DARPA’s coming back with a brief of another exciting project, the Integrated Cyber Analysis System (ICAS). ICAS enables streamlined detection of targeted attacks on large and diverse corporate networks. (Think Target, Home Depot, and JPMorgan Chase.)

We’ll hear from the three players DARPA invited to tackle the problem: Invincea Labs, Raytheon BBN, and Digital Operatives. Each group attempted to meet the project goals in a unique way, and will share their experiences and insights.

Learn about it at THREADS’14 first.

World-Class Speakers at THREADS’14


Robert Joyce, Chief, Tailored Access Operations (TAO), NSA

As the Chief of TAO, Rob leads an organization that provides unique, highly valued capabilities to the Intelligence Community and the Nation’s leadership.  His organization is the NSA mission element charged with providing tools and expertise in computer network exploitation to deliver foreign intelligence. Prior to becoming the Chief of TAO, Rob served as the Deputy Director of the Information Assurance Directorate (IAD) at NSA, where he led efforts to harden, protect and defend the Nation’s most critical National Security systems and improve cybersecurity for the nation.

Michael Tiffany, CEO, White Ops

Michael Tiffany is the co-founder and CEO of White Ops, a security company founded in 2013 to break the profit models of cybercriminals. By making botnet schemes like ad fraud unprofitable, White Ops disrupts the criminal incentive to break into millions of computers. Previously, Tiffany was the co-founder of Mission Assurance Corporation, a pioneer in space-based computing that is now a part of Recursion Ventures. He is a Technical Fellow of Critical Assets Labs, a DARPA-funded cyber-security research lab. He is a Subject Matter Advisor for the Signal Media Project, a nonprofit promoting the accurate portrayal of science, technology and history in popular media. He is also a Ninja.


Smten and the Art of Satisfiability-based Search
Nirav Dave, SRI

Reverse All the Things with PANDA
Brendan Dolan-Gavitt, Columbia University

Code-Pointer Integrity
Laszlo Szekeres, Stony Brook University

Static Translation of X86 Instruction Semantics to LLVM with McSema
Artem Dinaburg & Andrew Ruef, Trail of Bits

Transparent ROP Detection using CPU Performance Counters
Xiaoning Li, Intel & Michael Crouse, Harvard University

Improving Scalable, Automated Baremetal Malware Analysis
Adam Allred & Paul Royal, Georgia Tech Information Security Center (GTISC)

Integrated Cyber Analysis System (ICAS) Program Brief
Richard Guidorizzi, DARPA

TAPIO: Targeted Attack Premonition using Integrated Operational Data Sources
Invincea Labs

Gestalt: Integrated Cyber Analysis System
Raytheon BBN

Federated Understanding of Security Information Over Networks (FUSION)
Digital Operatives


Building Your Own DFIR Sidekick
Scott J Roberts, Github

Operating system analytics and host intrusion detection at scale
Mike Arpaia, Facebook

Reasoning about Optimal Solutions to Automation Problems
Jared Carlson & Andrew Reiter, Veracode

Augmenting Binary Analysis with Python and Pin
Omar Ahmed, Etsy & Tyler Bohan, NYU-Poly

Are attackers using automation more efficiently than defenders?
Marc-Etienne M.Léveillé, ESET

Making Sense of Content Security Policy (CSP) Reports @ Scale
Ivan Leichtling, Yelp

Automatic Application Security @twitter
Neil Matatall, Twitter

Cleaning Up the Internet with Scumblr and Sketchy
Andy Hoernecke, Netflix

CRITs: Collaborative Research Into Threats
Michael Goffin, Wesley Shields, MITRE

GitHub AppSec: Keeping up with 111 prolific engineers
Ben Toews, GitHub

Don’t miss out. Buy tickets for THREADS now to get the early-bird special (expires 10/13). You won’t find a more comprehensive treatment of scaling security anywhere else.


We’re Sponsoring the NYU-Poly Women’s Cybersecurity Symposium


Cyber security is an increasingly complex and vibrant field that requires brilliant and driven people to work on diverse teams. Unfortunately, women are severely underrepresented and we want to change that. Career Discovery in Cyber Security is an NYU-Poly event, created in collaboration with influential men and women in the industry. This annual symposium helps guide talented and interested women into careers in cyber security. We know that there are challenges for female professionals in male-dominated fields, which is why we want to create an environment where women have the resources they need to excel.

The goal of this symposium is to showcase the variety of industries and career paths in which cyber security professionals can make their mark. Keynote talks, interactive learning sessions, and technical workshops will prepare participants to identify security challenges and acquire the skills to meet them. A mentoring roundtable, female executive panel Q&A session, and networking opportunities allow participants to interact with accomplished women in the field in meaningful ways. These activities will give an extensive, well-rounded look into possible career paths.

Trail of Bits is a strong advocate for women in the cyber security world at all stages of their careers. In the past, we were participants in the CSAW Summer Program for Women, which introduced high school women to the world of cyber security. We are proud of our involvement in this women’s symposium from its earliest planning stages, continue to offer financial support via named scholarships for attendees, and will take part in the post-event mentoring program.

This year’s symposium is Friday and Saturday, October 17-18 in Brooklyn, New York. For more details and registration, visit the website. Follow the symposium on Twitter or Facebook for news and updates.

McSema is Officially Open Source!

We are proud to announce that McSema is now open source! McSema is a framework for analyzing and transforming machine-code programs to LLVM bitcode. It supports translation of x86 machine code, including integer, floating point, and SSE instructions. We previously covered some features of McSema in an earlier blog post and in our talk at REcon 2014.

Our talk at REcon, where we first described McSema

Build instructions and demos are available in the repository and we encourage you to try them on your own. We have created a mailing list, mcsema-dev@googlegroups.com, dedicated to McSema development and usage. Questions about licensing or integrating McSema into your commercial project may be directed to opensource@trailofbits.com.

McSema is permissively licensed under a three-clause BSD license. Some code and utilities we incorporate (e.g. Intel PIN for semantics testing) have their own licenses and need to be downloaded separately.

Finally, we would like to thank DARPA for their sponsorship of McSema development and their continued support. This project would not have been possible without them.

A Preview of McSema

On June 28th Artem Dinaburg and Andrew Ruef will be speaking at REcon 2014 about a project named McSema. McSema is a framework for translating x86 binaries into LLVM bitcode. This translation is the opposite of what happens inside a compiler. A compiler translates LLVM bitcode to x86 machine code. McSema translates x86 machine code into LLVM bitcode.

Why would we do such a crazy thing?

Because we wanted to analyze existing binary applications, and reasoning about LLVM bitcode is much easier than reasoning about x86 instructions. Not only is it easier to reason about LLVM bitcode, but it is easier to manipulate and re-target bitcode to a different architecture. There are many program analysis tools (e.g. KLEE, PAGAI, LLBMC) written to work on LLVM bitcode that can now be used on existing applications. Additionally it becomes much simpler to transform applications in complex ways while maintaining original application functionality.

McSema brings the world of LLVM program analysis and manipulation tools to binary executables. There are other x86 to LLVM bitcode translators, but McSema has several advantages:

  • McSema separates control flow recovery from translation, permitting the use of custom control flow recovery front-ends.
  • McSema supports FPU instructions.
  • McSema is open source and licensed under a permissive license.
  • McSema is documented, works, and will be available soon after our REcon talk.

This blog post will be a preview of McSema and will examine the challenges of translating a simple function that uses floating point arithmetic from x86 instructions to LLVM bitcode. The function we will translate is called timespi. It takes one argument, k, and returns the value of k * PI. Source code for timespi is below.

long double timespi(long double k) {
    long double pi = 3.14159265358979323846;
    return k*pi;
}

When compiled with Microsoft Visual Studio 2010, the assembly looks like the IDA Pro screenshot below.


This is what the original timespi function looks like in IDA.

After translating to LLVM bitcode with McSema and then re-emitting the bitcode as an x86 binary, the assembly looks much different.


How timespi looks after translation to LLVM and re-emission back as an x86 binary. The new code is considerably larger. Below, we explain why.

You may be saying to yourself: “Wow, that much code bloat for such a small function? What are these guys doing?”

We specifically wanted to use this example because it shows floating point support, functionality that is unique to McSema, and because it showcases difficulties inherent in x86 to LLVM bitcode translation.

Translation Background

McSema models x86 instructions as operations on a register context. That is, there is a register context structure that contains all registers and flags, and instruction semantics are expressed as modifications of structure members. This concept is easiest to understand with a simplified pseudocode example. An operation such as ADD EAX, EBX would be translated to context[EAX] += context[EBX].
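
As a rough illustration of that idea (field names and flag handling are invented for the example, not McSema’s actual layout), the translated semantics of ADD EAX, EBX amount to updates of a context structure:

#include <stdint.h>

/* An illustrative register context: every architectural register and flag
 * lives in one structure that translated code reads and writes. */
struct RegState {
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp, esp;
    uint8_t  cf, zf, sf;      /* a few of the arithmetic flags */
};

/* Semantics of "ADD EAX, EBX" expressed as operations on the context. */
static void sem_add_eax_ebx(struct RegState *ctx) {
    uint64_t wide = (uint64_t)ctx->eax + (uint64_t)ctx->ebx;
    ctx->cf  = wide > 0xFFFFFFFFu;          /* carry out of bit 31 */
    ctx->eax = (uint32_t)wide;
    ctx->zf  = (ctx->eax == 0);
    ctx->sf  = (ctx->eax >> 31) & 1;
    /* OF, AF, and PF updates omitted for brevity. */
}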

Translation Difficulties

Now let’s examine why a small function like timespi presents serious translation challenges.

The value of PI is read from the data section.

Control flow recovery must detect that the first FLD instruction references data and correctly identify the data size. McSema separates control flow recovery from translation, and hence can leverage IDA’s excellent CFG recovery via an IDAPython script.

The translation needs to support x86 FPU registers, FPU flags, and control bits.

The FPU registers aren’t like integer registers. Integer registers (EAX, ECX, EBX, etc.) are named and independent. Instructions referencing EAX will always refer to the same place in a register context.

FPU registers are a stack of 8 data registers (ST(0) through ST(7)), indexed by the TOP flag. Instructions referencing ST(i) actually refer to st_registers[(TOP + i) % 8] in a register context.
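
In other words, translated FPU accesses have to resolve through the TOP field every time. A small sketch (with invented field names) of how ST(i) maps onto the register context:

#include <stdint.h>

struct FPUState {
    long double st_registers[8];
    uint8_t     top;             /* TOP field of the FPU status word */
};

/* ST(i) is not a fixed slot: it resolves through TOP on every access. */
static long double read_st(const struct FPUState *fpu, unsigned i) {
    return fpu->st_registers[(fpu->top + i) % 8];
}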


This is Figure 8-2 from the Intel IA-32 Software Development Manual. It very nicely depicts the FPU data registers and how they are implicitly referenced via the TOP flag.

Integer registers are defined solely by register contents. FPU registers are partially defined by register contents and partially by the FPU tag word. The FPU tag word is a bitmap that defines whether the contents of a floating point register are:

  • Valid (that is, a normal floating point value)
  • The value zero
  • A special value such as NaN or Infinity
  • Empty (the register is unused)

To determine the value of an FPU register, one must consult both the FPU tag word and the register contents.

The translation needs to support at least the FLD, FSTP, and FMUL instructions.

The actual instruction operations, such as loads, stores, and multiplication, are fairly straightforward to support. The difficult part is implementing FPU execution semantics.

For instance, the FPU stores state about FPU instructions, like:

  • Last Instruction Pointer: the location of the last executed FPU instruction
  • Last Data Pointer: the address of the latest memory operand to an FPU instruction
  • Opcode: The opcode of the last executed FPU instruction

Some of these concepts are easier to translate to LLVM bitcode than others. Storing the address of the last memory operand translates very well: if the translated instruction references memory, store the memory address in the last data pointer field of the register context. Other concepts simply don’t translate. As an example, what does the “last instruction pointer” mean when a single FPU instruction is translated into multiple LLVM operations?

Self-referencing state isn’t the end of translation difficulties. FPU flags like the precision control and rounding control flags affect instruction operation. The precision control flag affects arithmetic operation, not the precision of stored registers. So one can load double extended precision values into ST(0) and ST(1) via FLD, but FMUL may store a single precision result in ST(0).

Translation Steps

Now that we’ve explored the difficulties of translation, let’s look at the steps needed to translate just the core of timespi, the FMUL instruction. The IA-32 Software Development Manual defines this instance of FMUL as “Multiply ST(0) by m64fp and store result in ST(0).” Below are just some of the steps required to translate FMUL to LLVM bitcode; a simplified sketch of these semantics follows the list.

  • Check the FPU tag word for ST(0); make sure it’s not empty.
  • Read the TOP flag.
  • Read the value from st_registers[TOP], unless the FPU tag word says the value is zero, in which case just read a zero.
  • Load the value pointed to by m64fp.
  • Do the multiplication.
  • Check the precision control flag. Adjust the precision of the result as needed.
  • Write the adjusted result into st_registers[TOP].
  • Update the FPU tag word for ST(0) to match the result. Maybe we multiplied by zero?
  • Update FPU status flags in the register context. For FMUL, this is just the C1 flag.
  • Update the last FPU opcode field
  • Did our instruction reference data? Sure did! Update the last FPU data field to m64fp.
  • Skip updating the last FPU instruction field since it doesn’t really map to LLVM bitcode… for now
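
A simplified sketch of those steps, written as C operations on an invented FPU context rather than McSema’s real translation (tag, status, and precision handling are heavily abbreviated):

#include <math.h>
#include <stdint.h>

enum { TAG_VALID, TAG_ZERO, TAG_SPECIAL, TAG_EMPTY };

struct FPUContext {
    long double st_registers[8];
    uint8_t     tag[8];          /* FPU tag word, one entry per register */
    uint8_t     top;             /* TOP field of the status word */
    uint8_t     c1;              /* status flag FMUL may update */
    uint16_t    last_opcode;
    uint64_t    last_data_ptr;
};

/* "FMUL m64fp": multiply ST(0) by a 64-bit float in memory, store in ST(0). */
static void sem_fmul_m64(struct FPUContext *fpu, const double *m64fp,
                         uint16_t opcode) {
    unsigned st0 = fpu->top % 8;

    /* Check the tag word and read ST(0). */
    long double a = (fpu->tag[st0] == TAG_ZERO) ? 0.0L
                                                : fpu->st_registers[st0];

    /* Load the memory operand and multiply (precision control elided). */
    long double r = a * (long double)*m64fp;

    /* Write back and retag the result. */
    fpu->st_registers[st0] = r;
    fpu->tag[st0] = (r == 0.0L) ? TAG_ZERO
                  : (isnan(r) || isinf(r)) ? TAG_SPECIAL
                  : TAG_VALID;

    /* Bookkeeping the translation also has to model. */
    fpu->c1 = 0;                                  /* simplified C1 handling */
    fpu->last_opcode = opcode;
    fpu->last_data_ptr = (uint64_t)(uintptr_t)m64fp;
}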

That’s a lot of work for a single instruction, and the list isn’t even complete. In addition to the work of translating raw instructions, there are additional steps that must be taken on function entry and exit points, for external calls and for functions that have their address taken. Those additional details will be covered during the REcon talk.


Translating floating point operations is a tricky, difficult business. Seemingly simple floating point instructions hide numerous operations and translate to a large amount of LLVM bitcode. The translated code is large because McSema exposes the hidden complexity of floating point operations. Considering that there have been no attempts to optimize instruction translation, we think the current output is pretty good.

For a more detailed look at McSema, attend Artem and Andrew’s talk at REcon and keep following the Trail of Bits blog for more announcements.

EDIT: McSema is now open-source. See our announcement for more information.

iOS 4 Security Evaluation

This year’s BlackHat USA was the 12th year in a row that I’ve attended and the 6th year in a row that I’ve participated as a presenter, trainer, and/or co-organizer/host of the Pwnie Awards. And I made this year my busiest yet by delivering four days of training, giving a presentation, hosting the Pwnie Awards, and participating on a panel. Not only does that mean that I slip into a coma after BlackHat, it also means that I win at conference bingo.

Reading my excuses for the delay in posting my slides and whitepaper, however, is not why you are reading this blog post. It is to find the link to download said slides and whitepaper:

Attacker Math 101

At SOURCE Boston this year, I gave my first conference keynote presentation. I really appreciate the opportunity that Stacy Thayer and the rest of the SOURCE crew gave me. The presentation was filmed by AT&T and you can watch it on the AT&T Tech Channel. Another thanks goes out to Ryan Naraine for inviting me to give an encore presentation of it for Kaspersky’s SAS conference in Malaga, Spain.

If slides are more your style, you can check out the more recent version from Kaspersky’s SAS 2011: Attacker Math 101.

