Education Initiative Spotlight: Build it Break It

Build it Break it is a first-of-its-kind security challenge run by UMD

We’re proud to be a sponsor of the first Build it Break it programming contest, run by the University of Maryland (UMD) and supported by one of our own employees, Andrew Ruef, a PhD student at the university. Build it Break it is a “flipped CTF” where contestants both implement secure software and identify vulnerabilities in the creations of others. Points are awarded for the secure construction of software and for identifying security flaws.

The build-it, break-it, fix-it contest was conceived as a way to acquire useful scientific evidence, while at the same time engaging the student population and the wider community in a mentality of building security in rather than adding it after the fact. – Michael Hicks

At Trail of Bits, we think Build it Break it is a necessary addition to the suite of available competitions in the security community. There is a wealth of opportunities for students to learn to break software (many of which we support); however, there are relatively few that challenge them to build it right. In this unique contest, there is something for both builders and breakers, since it rewards both activities.

It also presents an opportunity for language evangelists to demonstrate the merits of their approach – if their language is “more secure” than others, it should come out on top in the contest and more implementations built with it will remain standing. Contestants can use any programming language or framework to write their software, so by observing the contest, the community gathers empirical evidence about the security or insecurity of available tools.

Any undergraduate or graduate student at a US-based university is eligible to compete for cash prizes in Build it Break it. Be warned, though, that Trail of Bits engineers will be on hand to assist as a “break it” team. For more information about the motivations behind this contest, see this blog post and slide deck from Michael Hicks.

Education Initiative Spotlight: CSAW Summer Program for Women

At Trail of Bits we are proud of our roots in academia and research, and we believe it is important to promote cyber security education for students of every academic level. We recently sponsored a High School Capture the Flag (CTF) event, we released a CTF Field Guide, and we are a regular part of Cyber Security Awareness Week (CSAW) at NYU-Poly.

CSAW Summer Program for High School Women

Recent graduates of the CSAW Summer Program for High School Women

As part of our ongoing collaboration with NYU-Poly, Trail of Bits is proud to be a part of the CSAW Summer Program for High School Women. This program is a unique opportunity for teenage women from the New York City area to learn the fundamentals of computer science and cyber security. Qualifying young women from local high schools participate in two weeks of lectures and hands-on training, breaking codes, hacking databases, and unraveling a digital forensics mystery.

“The Introduction to Computer Science and Cyber Security Summer Program aims to introduce young women to computer science in a safe and encouraging learning environment. Exposing young women to female role models and mentors in computer science allows the women to view the field as a viable career option.” – Summer of STEM

The CSAW Summer Program for High School Women is led by Linda Sellie, an Industry Professor of Computer Science and Engineering at NYU-Poly. The complete program covers a variety of topics in computer science, including cryptography, databases, basics of computation, networking, and more. Trail of Bits’ Dan Guido, along with NYU-Poly Computer Science and Engineering Department Head Nasir Memon, taught an overview of computers and career opportunities in the field of cyber security. Following this course module, students tackled the first level of Matasano’s Microcorruption challenge to further explore low-level software security issues. Graduates of the program are prepared to compete in other challenges, such as CSAW’s High School Forensics and Capture the Flag competitions this fall.

The first summer program ran from July 7 to 18; a second session begins today and runs through August 8. For more information about the CSAW Summer Program for High School Women, see the press release from the NYU Engineering department.

Trail of Bits Adds Mobile Security Researcher Nicholas DePetrillo to Growing Team

New York, NY (July 15th, 2014)—Veteran computer security researcher Nicholas DePetrillo has joined Trail of Bits, the New York-based security company, as Principal Security Researcher. Trail of Bits Co-founder and CEO Dan Guido announced the hire today. DePetrillo brings the headcount of the firm, which was founded by a team of three in 2012, to 13 employees.

DePetrillo brings more than ten years of security research experience to Trail of Bits, most notably in the area of mobile security. He is widely regarded as one of the industry’s foremost experts in the field, attracting considerable attention for his discovery of significant security flaws that impact the privacy of millions of smartphone users and wireless network customers. He has worked on research throughout the entire mobile phone technology stack, from cell phone network infrastructure and baseband radio security to the application and operating system level.

“Having Nick on our team opens up tremendous new opportunities for us and for the companies we work with,” said Guido. “Mobile security is one of the most important issues our industry is facing, and Nick’s work provides great insight into how mobile attackers think and work, which is key to developing solutions to protect these devices.”

Among DePetrillo’s first projects with Trail of Bits is the development of secure mobile technologies for use in next-generation smartphones.

Prior to joining Trail of Bits, DePetrillo was an independent consultant specializing in mobile security research services. He was also a Senior Security Researcher at Harris Corporation, focusing on mobile and wireless platforms. DePetrillo is a frequent lecturer, and has presented his work at Black Hat and other security conferences around the world.

About Trail of Bits
Founded in 2012, Trail of Bits enables enterprises to make better strategic security decisions with its world-class experience in security research, red teaming and incident response. The Trail of Bits management team is comprised of some of the most recognized researchers in the security industry, renowned for their expertise in reverse engineering, novel exploit techniques and mobile security. Trail of Bits has collaborated extensively with DARPA on the agency’s acclaimed Cyber Fast Track, Cyber Grand Challenge and Cyber Stakes programs. In 2014, the company launched its first enterprise product, Javelin, which simulates attacks to help companies measure and refine their security posture.

Learn more at www.trailofbits.com.

A Preview of McSema

On June 28th Artem Dinaburg and Andrew Ruef will be speaking at REcon 2014 about a project named McSema.

McSema is a framework for translating x86 binaries into LLVM bitcode. This translation is the opposite of what happens inside a compiler. A compiler translates LLVM bitcode to x86 machine code. McSema translates x86 machine code into LLVM bitcode.

Why would we do such a crazy thing?

Because we wanted to analyze existing binary applications, and reasoning about LLVM bitcode is much easier than reasoning about x86 instructions.

Not only is it easier to reason about LLVM bitcode, but it is easier to manipulate and re-target bitcode to a different architecture. There are many program analysis tools (e.g. KLEE, PAGAI, LLBMC) written to work on LLVM bitcode that can now be used on existing applications. Additionally it becomes much simpler to transform applications in complex ways while maintaining original application functionality.

McSema brings the world of LLVM program analysis and manipulation tools to binary executables. There are other x86 to LLVM bitcode translators, but McSema has several advantages:

  • McSema separates control flow recovery from translation, permitting the use of custom control flow recovery front-ends.
  • McSema supports FPU instructions.
  • McSema is open source and licensed under a permissive license.
  • McSema is documented, works, and will be available soon after our REcon talk.

This blog post is a preview of McSema that examines the challenges of translating a simple function that uses floating point arithmetic from x86 instructions to LLVM bitcode. The function we will translate is called timespi. It takes one argument, k, and returns the value of k * PI. Source code for timespi is below.

long double timespi(long double k) {
    long double pi = 3.14159265358979323846;
    return k*pi;
}

When compiled with Microsoft Visual Studio 2010, the assembly looks like the IDA Pro screenshot below.

This is what the original timespi function looks like in IDA.

After translating to LLVM bitcode with McSema and then re-emitting the bitcode as an x86 binary, the assembly looks much different.

How timespi looks after translation to LLVM and re-emission back as an x86 binary. The new code is considerably larger. Below, we explain why.

You may be saying to yourself: “Wow, that much code bloat for such a small function? What are these guys doing?”

We specifically chose this example because it shows floating point support — functionality that is unique to McSema — and because it showcases the difficulties inherent in x86 to LLVM bitcode translation.

Translation Background

McSema models x86 instructions as operations on a register context. That is, there is a register context structure that contains all registers and flags, and instruction semantics are expressed as modifications of structure members. This concept is easiest to understand with a simplified pseudocode example: an operation such as ADD EAX, EBX would be translated to context[EAX] += context[EBX].
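
To make the model concrete, here is a minimal sketch in C++ (the structure and field names are simplified assumptions, not McSema’s actual layout):

#include <cstdint>

// A simplified register context: the real one also carries EFLAGS bits,
// FPU registers, the TOP flag, the FPU tag word, and more.
struct RegState {
    uint32_t EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP;
};

// "ADD EAX, EBX" becomes an operation on the context. A full translation
// would also update the CF, OF, SF, ZF, AF, and PF flags.
void add_eax_ebx(RegState &ctx) {
    ctx.EAX += ctx.EBX;
}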

Translation Difficulties

Now let’s examine why a small function like timespi presents serious translation challenges.

The value of PI is read from the data section.

Control flow recovery must detect that the first FLD instruction references data and correctly identify the data size. McSema separates control flow recovery from translation, and hence can leverage IDA’s excellent CFG recovery via an IDAPython script.

The translation needs to support x86 FPU registers, FPU flags, and control bits.

The FPU registers aren’t like integer registers. Integer registers (EAX, ECX, EBX, etc.) are named and independent. Instructions referencing EAX will always refer to the same place in a register context.

FPU registers are a stack of 8 data registers (ST(0) through ST(7)), indexed by the TOP flag. Instructions referencing ST(i) actually refer to st_registers[(TOP + i) % 8] in a register context.

This is Figure 8-2 from the Intel IA-32 Software Developer’s Manual. It very nicely depicts the FPU data registers and how they are implicitly referenced via the TOP flag.

Integer registers are defined solely by register contents. FPU registers are partially defined by register contents and partially by the FPU tag word. The FPU tag word is a bitmap that defines whether the contents of a floating point register are:

  • Valid (that is, a normal floating point value)
  • The value zero
  • A special value such as NaN or Infinity
  • Empty (the register is unused)

To determine the value of an FPU register, one must consult both the FPU tag word and the register contents.
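
As a rough sketch (with assumed names and layout, modeling the indexing and tag lookup described above rather than McSema’s actual code), reading ST(i) might look like this:

struct FpuState {
    long double st_registers[8];
    int         tag[8];   // FPU tag word: 0 = valid, 1 = zero, 2 = special, 3 = empty
    unsigned    TOP;      // index of the current top of the stack
};

// Reading ST(i) must consult both the TOP flag and the FPU tag word.
long double read_st(const FpuState &fpu, unsigned i) {
    unsigned phys = (fpu.TOP + i) % 8;
    if (fpu.tag[phys] == 1)
        return 0.0L;      // the tag word, not the register, defines the value
    // Tag 3 (empty) would be a stack fault; valid and special values come
    // from the register contents.
    return fpu.st_registers[phys];
}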

The translation needs to support at least the FLD, FSTP, and FMUL instructions.

The actual instruction operations, such as loads, stores, and multiplication, are fairly straightforward to support. The difficult part is implementing FPU execution semantics.

For instance, the FPU stores state about FPU instructions, like:

  • Last Instruction Pointer: the location of the last executed FPU instruction
  • Last Data Pointer: the address of the latest memory operand to an FPU instruction
  • Opcode: The opcode of the last executed FPU instruction

Some of these concepts are easier to translate to LLVM bitcode than others. Storing the address of the last memory operand translates very well: if the translated instruction references memory, store the memory address in the last data pointer field of the register context. Other concepts simply don’t translate. As an example, what does the “last instruction pointer” mean when a single FPU instruction is translated into multiple LLVM operations?

Self-referencing state isn’t the end of the translation difficulties. FPU flags like the precision control and rounding control flags affect instruction operation. The precision control flag affects arithmetic operations, not the precision of stored registers. So one can load double extended precision values into ST(0) and ST(1) via FLD, but FMUL may store a single precision result in ST(0).

Translation Steps

Now that we’ve explored the difficulties of translation, let’s look at the steps needed to translate just the core of timespi, the FMUL instruction. The Intel Software Developer’s Manual defines this instance of FMUL as “Multiply ST(0) by m64fp and store result in ST(0).” Below are just some of the steps required to translate FMUL to LLVM bitcode; a rough code sketch follows the list.

  • Check the FPU tag word for ST(0); make sure it’s not empty.
  • Read the TOP flag.
  • Read the value from st_registers[TOP]. Unless the FPU tag word said the value is zero, in which case just read a zero.
  • Load the value pointed to by m64fp.
  • Do the multiplication.
  • Check the precision control flag. Adjust the precision of the result as needed.
  • Write the adjusted result into st_registers[TOP].
  • Update the FPU tag word for ST(0) to match the result. Maybe we multiplied by zero?
  • Update FPU status flags in the register context. For FMUL, this is just the C1 flag.
  • Update the last FPU opcode field.
  • Did our instruction reference data? Sure did! Update the last FPU data field to m64fp.
  • Skip updating the last FPU instruction field since it doesn’t really map to LLVM bitcode… for now
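
Pulling these together, here is a loose C++-flavored sketch of the FMUL translation (invented names and fields, with NaN/Infinity handling and exception flags glossed over; this illustrates the steps above and is not McSema’s actual code):

#include <cstdint>

struct FpuState {
    long double st_registers[8];
    int         tag[8];      // 0 = valid, 1 = zero, 2 = special, 3 = empty
    unsigned    TOP;         // top-of-stack index
    int         precision;   // precision control: 0 = single, 2 = double, 3 = extended
    uint16_t    last_opcode;
    uint64_t    last_data_ptr;
};

void translate_fmul_m64(FpuState &f, const double *m64fp, uint64_t ea) {
    unsigned top = f.TOP;
    // Tag check and tag-aware read of ST(0); empty would be a stack fault.
    long double st0 = (f.tag[top] == 1) ? 0.0L : f.st_registers[top];
    // Load the m64fp operand and multiply.
    long double result = st0 * (long double)*m64fp;
    // Honor the precision control flag: round to single or double if set.
    if (f.precision == 0)      result = (float)result;
    else if (f.precision == 2) result = (double)result;
    // Write back the result and update the tag word to match it.
    f.st_registers[top] = result;
    f.tag[top] = (result == 0.0L) ? 1 : 0;   // NaN/Infinity tagging elided
    // C1 status flag update elided.
    // Bookkeeping: last opcode and last data pointer.
    f.last_opcode   = 0xDC08;  // illustrative value, not the real encoding
    f.last_data_ptr = ea;
    // The last instruction pointer is skipped, as noted above.
}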

That’s a lot of work for a single instruction, and the list isn’t even complete. In addition to the work of translating raw instructions, there are additional steps that must be taken for function entry and exit points, for external calls, and for functions that have their address taken. Those additional details will be covered during the REcon talk.

Conclusion

Translating floating point operations is a tricky, difficult business. Seemingly simple floating point instructions hide numerous operations and translate to a large amount of LLVM bitcode. The translated code is large because McSema exposes the hidden complexity of floating point operations. Considering that there have been no attempts to optimize instruction translation, we think the current output is pretty good.

For a more detailed look at McSema, attend Artem and Andrew’s talk at REcon and keep following the Trail of Bits blog for more announcements.

We’ve Moved!

Trail of Bits headquarters has moved! Located in the heart of the financial district, our new office features a unique design, cool modern decor, and an open layout that makes us feel right at home.

With fast internet, well-appointed conference rooms, and comfortable work stations, we feel that this is a great place to grow our business.

We are also loving our new commute options. We have easy access to several main subway lines, and for those of us who bike, there is indoor bicycle storage and a Citibike located right outside our building. Oh yeah, there’s also this view:

We’re hiring and we encourage you to apply if you’re interested in joining us!

Dear DARPA: Challenge Accepted.

We are proud to have one of only seven accepted funded-track proposals to DARPA’s Cyber Grand Challenge.

Computer security experts from academia, industry, and the larger security community have organized themselves into more than 30 teams to compete in DARPA’s Cyber Grand Challenge, a first-of-its-kind tournament designed to speed the development of automated security systems able to defend against cyberattacks as fast as they are launched. DARPA also announced today that it has reached an agreement to hold the 2016 Cyber Grand Challenge final competition in conjunction with DEF CON, one of the largest computer security conferences in the world.

Our participation in this program aligns with our development of Javelin, an automated system for simulating attacks against enterprise networks. We have assembled a world-class team of experts in software security, capture the flag, and program analysis to compete in this challenge. As much as we wish the other teams luck in this competition, Trail of Bits is playing to win. Game on!

Trail of Bits Releases Capture the Flag Field Guide

Free Online Coursework Allows Students, Professionals to Build Essential Offensive Security Skills

New York, NY (May 20, 2014)–Security researchers at Trail of Bits today introduced the CTF (Capture the Flag) Field Guide, a freely available, self-guided online course designed to help university and high school students hone the skills needed to succeed in the fast-paced, offensive competitions known as Capture the Flag.

Capture the Flag events consist of many small challenges that require participants to exercise skills across the spectrum of computer security, from exploit creation and vulnerability discovery to forensics. Participation in such games is widely viewed as a critical step in building computer security expertise, especially for high school and college students considering a career in the field.

Despite the value of CTF events, few high schools and colleges have the resources to mentor students interested in computer security, and often the expertise needed to create and train CTF teams is lacking. The CTF Field Guide will help students build the skills to compete and succeed in these competitions, supplementing their existing coursework in computer security and providing motivated students with the structure and guidance to form their own CTF teams.

The CTF Field Guide is based on course content created by Dan Guido, co-founder and CEO of Trail of Bits and Hacker in Residence at NYU Polytechnic School of Engineering, one of the first universities to offer a cybersecurity program.

Guido is among the few instructors in the country to teach offensive security tactics, and his Penetration Testing and Vulnerability Analysis course is a mainstay of the cybersecurity programs at NYU Engineering. The CTF Field Guide combines elements of Guido’s classes, along with material Trail of Bits developed in collaboration with the Defense Advanced Research Projects Agency (DARPA) to train military academy students and reference material from leading security researchers around the world.

“Capture the Flag events can test and improve almost every skill that computer security professionals rely on, but one of the most valuable is mastering offensive maneuvers: learning to think like attackers,” said Guido. “We created the CTF Field Guide to allow anyone interested in boosting their skills, from high school students to working professionals, to benefit from some of the best teaching in the world, free of charge and at their own pace.”

The CTF Field Guide is housed on GitHub, which allows users to contribute to and improve the course material over time. It is also available as a downloadable GitBook that can be viewed as a PDF or e-book. While courses on similar topics have been offered online before, the CTF Field Guide is the first to be freely available and to allow ongoing collaboration and updates based on real-world attack trends.

Participation in CTF competitions has skyrocketed in recent years. Some of the largest events—DEF CON’s CTF and the NYU Engineering CSAW CTF among them—attract tens of thousands of entrants, and many events now include challenges specifically tailored for young student teams.

Trail of Bits is a sponsor of the High School CTF (HSCTF), the first CTF event designed for high school students by their peers, which included more than 1,000 competitors. Guido believes there’s no better time to launch the CTF Field Guide. “Students who competed in these recent games—or who plan to do so in the future—can start the course right now, and there’s no question they’ll be better prepared to succeed next year.”

About Trail of Bits

Founded in 2012, Trail of Bits enables enterprises to make better strategic security decisions with its world-class experience in security research, red teaming and incident response. The Trail of Bits management team is comprised of some of the most recognized researchers in the security industry, renowned for their expertise in reverse engineering, novel exploit techniques and mobile security. Trail of Bits has collaborated extensively with DARPA on the agency’s acclaimed Cyber Fast Track, Cyber Grand Challenge and Cyber Stakes programs. In 2014, the company launched its first enterprise product, Javelin, which simulates attacks to help companies measure and refine their security posture.

Learn more at www.trailofbits.com.

Using Static Analysis and Clang To Find Heartbleed

Background

Friday night I sat down with a glass of Macallan 15 and decided to write a static checker that would find the Heartbleed bug. I decided that I would write it as an out-of-tree clang analyzer plugin and evaluate it on a few very small functions that had the spirit of the Heartbleed bug in them, and then finally on the vulnerable OpenSSL code-base itself.

The Clang project ships an analysis infrastructure with its compiler; it’s invoked via scan-build. scan-build hooks into whatever existing make system you have to interpose the clang analyzer into the build process, and the analyzer is invoked with the same arguments as the compiler. This way, the analyzer can ‘visit’ every compilation unit in the program that compiles under clang. There are some limitations to the clang analyzer that I’ll touch on in the discussion section.

This exercise added to my list of things that I can only do while drinking: I have the best success with first-order logic while drinking beer, and I have the best success with clang analyzer while drinking scotch.

Strategy

One approach to identifying Heartbleed statically was proposed by Coverity recently: taint the return values of calls to ntohl and ntohs as input data. One problem with doing static analysis on a big state machine like OpenSSL is that your analysis either has to know the state machine to be able to track which values are attacker-influenced across the whole program, or it has to have some kind of annotation in the program that tells the analysis where input data is used.

I like this observation because it is pretty actionable. You mark ntohl calls as producing tainted data, which is a heuristic, but a pretty good one because programmers probably won’t htonl their own data.

What our clang analyzer plugin should do is identify locations in the program where variables are written using ntohl, taint them, and then alert when those tainted values are used as the size parameter to memcpy. Except that isn’t quite right: the use could be safe. We’ll also check the constraints of the tainted values at the location of the call: if the tainted value hasn’t been constrained in some way by the program logic, and it’s used as an argument to memcpy, alert on a bug. This could also miss some bugs, but I’m writing this over a 24h period with some Scotch, so increasing precision can come later.

Clang analyzer details

The clang analyzer implements a type of symbolic execution to analyze C/C++ programs. Plugging into this framework as an analyzer requires bending your mind around the clang analyzer’s view of program state. This is where I consumed the most scotch.

The analyzer, under the hood, performs a symbolic/abstract exploration of program state. This exploration is flow and path sensitive, so it is different from traditional compiler data flow analysis. The analysis maintains a “state” object for each path through the program, and in this state object are constraints and facts about the program’s execution on that path. This state object can be queried by your analyzer, and, your analyzer can change the state to include information produced by your analysis.

This was one of my biggest hurdles when writing the analyzer – once I have a “symbolic variable” in a particular state, how do I query the range of that symbolic variable? Say there is a program fragment that looks like this:

int data = ntohl(pkt_data);
if(data >= 0 && data < sizeof(global_arr)) {
  // CASE A
  ...
} else {
  // CASE B
  ...
}

When looking at this program from the analyzer’s point of view, the state “splits” at the if into two different states A and B. In state A, there is a constraint that data is between certain bounds, and in state B there is a constraint that data is NOT within certain bounds. How do you access this information from your checker?

If your checker calls the “dump” method on its given “state” object, data like the following will be printed out:

Ranges of symbol values:
 conj_$2{int} : { [-2147483648, -2], [0, 2147483647] }
 conj_$9{uint32_t} : { [0, 6] }

In this example, conj_$9{uint32_t} is our ‘data’ value above and the state is in the A state. We have a range on ‘data’ that places it between 0 and 6. How can we, as the checker, observe that there’s a difference between this range and an unconstrained range of, say, [-2147483648, 2147483647]?

The answer is, we create a formula that tests the symbolic value of ‘data’ against some conditions that we enforce, and then we ask the state what program states exist when this formula is true and when it is false. If a new formula contradicts an existing formula, the state is infeasible and no state is generated. So we create a formula that says, roughly, “data > 500” to ask if data could ever be greater than 500. When we ask the state for new states where this is true and where it is false, it will only give us a state where it is false.

This is the kind of idiom used inside the clang analyzer to answer questions about constraints on state. The array bounds checkers use this trick to identify states where the size of an array is not used as a constraint on indexes into the array.

Implementation

Your analyzer is implemented as a C++ class. You define different “check” functions that you want to be notified of when the analyzer is exploring program state. For example, if your analyzer wants to consider the arguments to a function call before the function is called, you create a member method with a signature that looks like this:

void checkPreCall(const CallEvent &Call, CheckerContext &C) const;

Your analyzer can then match on the function about to be (symbolically) invoked. So our implementation works in three stages:

  1. Identify calls to ntohl/ntohs
  2. Taint the return value of those calls
  3. Identify unconstrained uses of tainted data

We accomplish the first and second with a checkPostCall visitor that roughly does this:

void NetworkTaintChecker::checkPostCall(const CallEvent &Call,
                                        CheckerContext &C) const {
  const IdentifierInfo *ID = Call.getCalleeIdentifier();

  if(ID == NULL) {
    return;
  }

  if(ID->getName() == "ntohl" || ID->getName() == "ntohs") {
    ProgramStateRef State = C.getState();
    SymbolRef       Sym = Call.getReturnValue().getAsSymbol();

    if(Sym) {
      ProgramStateRef newState = State->addTaint(Sym);
      C.addTransition(newState);
    }
  }
}

Pretty straightforward: we get the return value, if present, taint it, and add the state with the tainted return value as an output of our visit via ‘addTransition’.

For the third goal, we have a checkPreCall visitor that considers a function call’s parameters like so:

void NetworkTaintChecker::checkPreCall(const CallEvent &Call,
                                       CheckerContext &C) const {
  ProgramStateRef State = C.getState();
  const IdentifierInfo *ID = Call.getCalleeIdentifier();

  if(ID == NULL) {
    return;
  }

  if(ID->getName() == "memcpy") {
    SVal SizeArg = Call.getArgSVal(2);

    if(State->isTainted(SizeArg)) {
      SValBuilder      &svalBuilder = C.getSValBuilder();
      Optional<NonLoc> SizeArgNL = SizeArg.getAs<NonLoc>();

      if(this->isArgUnConstrained(SizeArgNL, svalBuilder, State) == true) {
        ExplodedNode *loc = C.generateSink();
        if(loc) {
          BugReport *bug = new BugReport(*this->BT,
            "Tainted, unconstrained value used in memcpy size", loc);
          C.emitReport(bug);
        }
      }
    }
  }
}

Also relatively straightforward: our logic to check whether a value is unconstrained is hidden in ‘isArgUnConstrained’, so if a tainted, symbolic value has insufficient constraints on it in our current path, we report a bug.
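
For reference, here is a plausible reconstruction of that helper using the assume-a-formula idiom from earlier (the 500 threshold is the illustrative value from above, and the exact logic is an assumption; the real version lives in the posted checker code):

bool NetworkTaintChecker::isArgUnConstrained(Optional<NonLoc> Arg,
                                             SValBuilder &builder,
                                             ProgramStateRef state) const {
  if(!Arg) {
    return false;
  }

  // Build the formula "arg > 500" and ask the state whether a successor
  // state satisfying it is feasible.
  NonLoc large = builder.makeIntVal(500, false);
  SVal cond = builder.evalBinOpNN(state, BO_GT, *Arg, large,
                                  builder.getConditionType());

  Optional<DefinedOrUnknownSVal> dv = cond.getAs<DefinedOrUnknownSVal>();
  if(!dv) {
    return false;
  }

  // If a state exists where the value exceeds 500, the program logic has
  // not constrained it enough to be a safe memcpy size.
  ProgramStateRef canBeLarge = state->assume(*dv, true);
  if(canBeLarge) {
    return true;
  }
  return false;
}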

Some implementation pitfalls

It turns out that OpenSSL doesn’t use ntohs/ntohl; it has n2s/n2l macros that re-implement the byte-swapping logic. If this were in LLVM IR, it would be tractable to write a “byte-swapping recognizer” that uses a small amount of logic to prove when a piece of code approximates the semantics of a byte swap.

There is also some behavior that I have not figured out in clang’s creation of the AST for OpenSSL, where calls to ntohs are replaced with __builtin_pre(__x), which has no IdentifierInfo and thus no name. To work around this, I replaced the n2s macro with a call to a function named xyzzy. This results in linking failures, which don’t matter for static analysis, and let me adapt my function check from above to look for a function named xyzzy. This worked well enough to identify the Heartbleed bug.

Solution output with demo programs and OpenSSL

First let’s look at some little toy programs. Here is one toy example with output:

$ cat demo2.c

...

int data_array[] = { 0, 18, 21, 95, 43, 32, 51};

int main(int argc, char *argv[]) {
  int   fd;
  char  buf[512] = {0};

  fd = open("dtin", O_RDONLY);

  if(fd != -1) {
    int size;
    int res;

    res = read(fd, &size, sizeof(int));

    if(res == sizeof(int)) {
      size = ntohl(size);

      if(size < sizeof(data_array)) {
        memcpy(buf, data_array, size);
      }

      memcpy(buf, data_array, size);
    }

    close(fd);
  }

  return 0;
}

$ ../docheck.sh
scan-build: Using '/usr/bin/clang' for static analysis
/usr/bin/ccc-analyzer -o demo2 demo2.c
demo2.c:30:7: warning: Tainted, unconstrained value used in memcpy size
      memcpy(buf, data_array, size);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
scan-build: 1 bugs found.
scan-build: Run 'scan-view /tmp/scan-build-2014-04-26-223755-8651-1' to
examine bug reports.

And finally, to see it catching Heartbleed in both locations it was present in OpenSSL, see the following:

[Screenshots: the checker reporting the tainted, unconstrained memcpy size at both locations where Heartbleed was present in OpenSSL]

Discussion

The approach needs some improvement: we reason about whether a tainted value is “appropriately” constrained in a very coarse-grained way. Sometimes that’s the best you can do, though – if your analysis doesn’t know how large a particular buffer is, perhaps it’s enough to show an analyst “hey, this value could be larger than 5000 and it is used as a parameter to memcpy, is that okay?”

I really don’t like the clang analyzer’s limitation of operating on ASTs. I spent a lot of time fighting with the clang AST representation of ntohs and I still don’t understand what the source of the problem was. I kind of just want to consider a program’s semantics in a virtual machine with very simple semantics, so LLVM IR seems ideal to me. This might just be my PL roots showing, though.

I really do like the clang analyzer’s interface to path constraints. That interface is pretty powerful, and once you get your head around asking states whether new states satisfying your constraints are feasible, it’s pretty straightforward to write new analyses.

Edit: Code Post

I’ve posted the code for the checker to GitHub.

Introducing Javelin

Javelin shows you how modern attackers would approach and exploit your enterprise. By simulating real-time, real-world attack techniques, Javelin identifies which employees are most likely to be targets of spearphishing campaigns, uncovers security infrastructure weaknesses, and compares overall vulnerability against industry competitors. Javelin benchmarks the efficacy of defensive strategies, and provides customized recommendations for improving security and accelerating threat detection. Highly automated, low touch, and designed for easy adoption, Javelin will harden your existing security and information technology infrastructure.

Read more about Javelin on the Javelin Blog.

Semantic Analysis of Native Programs, introducing CodeReason

Introduction

Have you ever wanted to make a query into a native mode program asking about program locations that write a specific value to a register? Have you ever wanted to automatically deobfuscate obfuscated strings?

Reverse engineering a native program involves understanding its semantics at a low level until a high-level picture of functionality emerges. One challenge facing a principled understanding of a native mode program is that this understanding must extend to every instruction used by the program. Your analysis must know which instructions have what effects on memory and registers.

We’d like to introduce CodeReason, a machine code analysis framework we produced for DARPA Cyber Fast Track. CodeReason provides a framework for analyzing the semantics of native x86 and ARM code. We like CodeReason because it provides us a platform to make queries about the effects that native code has on overall program state. CodeReason does this by having a deep semantic understanding of native instructions.

Building this semantic understanding is time-consuming and expensive. There are existing systems, but they have high barriers to entry, don’t do precisely what we want, or don’t apply simplifications and optimizations to their semantics. We want those simplifications because they can reduce otherwise hairy instruction sequences to simple expressions that are easy to understand. To motivate this, we’ll give an example of a time we used CodeReason.

Simplifying Flame

Around the time the Flame malware was revealed, some of its binaries were posted on malware.lu. Flame’s overall scheme is to store each obfuscated string in a structure in global data. The structure looks something like this:


struct ObfuscatedString {
    char  padding[7];
    char  hasDeobfuscated;
    short stringLen;
    char  string[];
};

Each structure has variable-length data at the end, with 7 bytes of data that were apparently unused.

There are two fun things here. First, I used CodeReason to write a string deobfuscator in C. The original program logic performs string deobfuscation in three steps.

The first function checks the hasDeobfuscated field and, if it is zero, returns a pointer to the first element of the string. If the field is not zero, it calls the second function to decrypt the string in place, and then sets hasDeobfuscated to zero.
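
Reconstructed in C from that description (the function name is invented; this uses the ObfuscatedString struct above and the second function shown next, and it is not Flame’s actual code):

/* Note the check-then-act pattern: this is the race condition discussed
   later in this post. */
char *get_deobfuscated_string(struct ObfuscatedString *s) {
    if (s->hasDeobfuscated == 0)
        return s->string;    /* already plaintext */

    inplace_buffer_decrypt((unsigned char *)s->string, s->stringLen);
    s->hasDeobfuscated = 0;
    return s->string;
}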

The second function will iterate over every character in the ‘string’ array. At each character, it will call a third function and then subtract the value returned by the third function from the character in the string array, writing the result back into the array. So it looks something like:


void inplace_buffer_decrypt(unsigned char *buf, int len) {
    int counted = 0;
    while( counted < len ) {
        unsigned char *cur = buf + counted;
        unsigned char newChar = get_decrypt_modifier_f(counted);
        *cur -= newChar;
        ++counted;
    }
    return;
}

What about the third function, ‘get_decrypt_modifier’? This function is one basic block long and looks like this:


lea ecx, [eax+11h]
add eax, 0Bh
imul ecx, eax
mov edx, ecx
shr edx, 8
mov eax, edx
xor eax, ecx
shr eax, 10h
xor eax, edx
xor eax, ecx
retn

An advantage of having a native code semantics understanding system is that I could capture this block and feed it to CodeReason and have it tell me what the equation of ‘eax’ looks like. This would tell me what this block ‘returns’ to its caller, and would let me capture the semantics of what get_decrypt_modifier does in my deobfuscator.

It would also be possible to decompile this snippet to C; however, what I’m really concerned with is the effect of the code on ‘eax’, not something as high-level as what the code “looks like” in a C decompiler’s view of the world. C decompilers also use a semantics translator, but then proxy the results of that translation through an attempt at translating to C. CodeReason lets us skip the last step and consider just the semantics, which sometimes can be more powerful.

Using CodeReason

Getting this from CodeReason looks like this:


$ ./bin/VEEShell -a X86 -f ../tests/testSkyWipe.bin
blockLen: 28
r
...
EAX = Xor32[ Xor32[ Shr32[ Xor32[ Shr32[ Mul32[ Add32[ REGREAD(EAX), I:U32(0xb) ], Add32[ REGREAD(EAX), I:U32(0x11) ] ], I:U8(0x8) ], Mul32[ Add32[ REGREAD(EAX), I:U32(0xb) ], Add32[ REGREAD(EAX), I:U32(0x11) ] ] ], I:U8(0x10) ], Shr32[ Mul32[ Add32[ REGREAD(EAX), I:U32(0xb) ], Add32[ REGREAD(EAX), I:U32(0x11) ] ], I:U8(0x8) ] ], Mul32[ Add32[ REGREAD(EAX), I:U32(0xb) ], Add32[ REGREAD(EAX), I:U32(0x11) ] ] ]
...
EIP = REGREAD(ESP)

This is cool, because if I implement functions for Xor32, Mul32, Add32, and Shr32, I have this function in C, like so:


unsigned char get_decrypt_modifier_f(unsigned int a) {
    return Xor32(
        Xor32(
            Shr32(
                Xor32(
                    Shr32(
                        Mul32(
                            Add32(a, 0xb),
                            Add32(a, 0x11)),
                        0x8),
                    Mul32(
                        Add32(a, 0xb),
                        Add32(a, 0x11))),
                0x10),
            Shr32(
                Mul32(
                    Add32(a, 0xb),
                    Add32(a, 0x11)),
                0x8)),
        Mul32(
            Add32(a, 0xb),
            Add32(a, 0x11)));
}
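
For this to work, the helpers just need to map CodeReason’s operators onto C arithmetic. Presumably something like this (our assumed mapping, relying on unsigned int being 32 bits):

/* 32-bit wrapping arithmetic and a logical shift right. */
static unsigned int Xor32(unsigned int a, unsigned int b) { return a ^ b; }
static unsigned int Mul32(unsigned int a, unsigned int b) { return a * b; }
static unsigned int Add32(unsigned int a, unsigned int b) { return a + b; }
static unsigned int Shr32(unsigned int a, unsigned int n) { return a >> n; }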

And this is also cool because it works:


C:\code\tmp>skywiper_string_decrypt.exe
CreateToolhelp32Snapshot

We’re extending CodeReason into an IDA plugin that allows us to make these queries directly from IDA, which should be really cool!

The second fun thing here is that this string deobfuscator has a race condition. If two threads try to deobfuscate the same string at the same time, they will corrupt the string forever. This could be bad if you were trying to do something important with an obfuscated string, as it would result in passing bad data to a system service or something, which could have very bad effects.

I’ve used CodeReason to attack string obfuscations that were implemented like this:


xor eax, eax
push eax
sub eax, 0x21ece84
push eax

Where the sequence of native instructions would turn non-string immediate values into string values (through a clever use of the semantics of two’s complement arithmetic) and then push them in the correct order onto the stack, thereby building a string dynamically each time the deobfuscation code ran. CodeReason was able to look at this and, using a very simple peephole optimizer, convert the code into a sequence of memory writes of string immediate values, like:


MEMWRITE[esp] = '.dll'
MEMWRITE[esp-4] = 'nlan'
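
To see why the two’s complement trick works, note that subtracting a constant from zero wraps around modulo 2^32 to produce the desired bytes. A tiny standalone example (the constant here is ours, chosen so the result spells ‘nlan’; it is not taken from the actual sample):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t eax = 0;      /* xor eax, eax */
    eax -= 0x919E9392u;    /* sub eax, imm  ->  eax = 0x6E616C6E */
    /* On a little-endian machine, the four bytes of eax spell "nlan". */
    printf("%.4s\n", (const char *)&eax);
    return 0;
}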

Conclusions

Having machine code in a form where it can be optimized and understood can be kind of powerful, especially when that form is available from a programmatic library. Using CodeReason, we were able to extract the semantics of string obfuscation functions and automatically implement a string deobfuscator. Further, we were able to simplify obfuscating code into a form that expressed the deobfuscated string values on their own. We plan to cover additional uses and capabilities of CodeReason in future blog posts.
