Fuzzing In The Year 2000

It is time for the second installment of our efforts to reproduce original fuzzing research on modern systems. If you haven’t yet, please read the first part. This time we tackle fuzzing on Windows by reproducing the results of “An Empirical Study of the Robustness of Windows NT Applications Using Random Testing” (aka ‘the NT Fuzz Report’) by Justin E. Forrester and Barton P. Miller, published in 2000.

The NT Fuzz Report tested 33 applications on Windows NT and an early release copy of Windows 2000 for susceptibility to malformed window messages and randomly generated mouse and keyboard events. Since Dr. Miller published the fuzzer code, we used the exact same tools as the original authors to find bugs in modern Windows applications.

The results were nearly identical: 19 years ago, 100% of tested applications crashed or froze when fuzzed with malformed window messages. Today, 93% of tested applications crash or freeze when confronted with the same fuzzer. Among the applications that did not crash was our old friend, Calculator (Figure 1). We also found a bug (but not a security issue) in Windows.

Figure 1: Bruised but not beaten. The recently open-sourced Windows Calculator was one of two tested applications that didn’t freeze or crash after facing off against the window message fuzzer from the year 2000. Calculator was resized after fuzzing to showcase artifacts of the fuzzing session.

A Quick Introduction to Windows

So what are window messages and why do they crash programs?

Windows applications that display a GUI are driven by events: a mouse move, a button click, a key press, etc. An event-driven application doesn’t do anything until it is notified of an event. Once an event is received, the application takes action based on the event, and then waits for more events. If this sounds familiar, it’s because the architecture is making a comeback in platforms like node.js.

Window messages are the event notification method in Windows. Each window message has a numeric code associated with a particular event. Each message has one or more parameters, by convention called lParam and wParam, that specify more details about the event. Examples of such details include the coordinates of mouse movement, what key was pressed, or what text to draw in a window. These messages can be sent by the program itself, by the operating system, or by other programs. They can arrive at any time and in any order, and must be handled by the receiving application.
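To make the message structure concrete, here is a minimal, portable sketch of how a handler might dispatch on the message code and unpack its parameters. It does not use the real Windows headers: the Message struct, the handle_message and make_mouse_lparam functions, and the kWm* constant names are hypothetical stand-ins, although the two numeric codes match the documented values of WM_KEYDOWN and WM_MOUSEMOVE.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Simplified stand-ins for the Win32 WPARAM/LPARAM types (pointer-sized integers).
using WParam = std::uintptr_t;
using LParam = std::intptr_t;

// Numeric codes of two well-known window messages.
constexpr std::uint32_t kWmKeyDown   = 0x0100;  // WM_KEYDOWN
constexpr std::uint32_t kWmMouseMove = 0x0200;  // WM_MOUSEMOVE

// Hypothetical model of a window message: a code plus two parameters.
struct Message {
    std::uint32_t code;
    WParam wParam;
    LParam lParam;
};

// Packs mouse coordinates the way mouse messages carry them:
// x in the low 16 bits of lParam, y in the next 16 bits.
LParam make_mouse_lparam(int x, int y) {
    assert(x >= 0 && x <= 0xFFFF && y >= 0 && y <= 0xFFFF);
    return static_cast<LParam>((static_cast<std::uint32_t>(y) << 16) |
                               static_cast<std::uint32_t>(x));
}

// Hypothetical handler: returns a description of what the application would do.
std::string handle_message(const Message& msg) {
    switch (msg.code) {
    case kWmMouseMove: {
        // Unpack the coordinates from lParam (treated as unsigned here for simplicity).
        const std::uintptr_t bits = static_cast<std::uintptr_t>(msg.lParam);
        const unsigned x = static_cast<unsigned>(bits & 0xFFFF);
        const unsigned y = static_cast<unsigned>((bits >> 16) & 0xFFFF);
        return "mouse at " + std::to_string(x) + "," + std::to_string(y);
    }
    case kWmKeyDown:
        // For key presses, wParam carries the virtual-key code.
        return "key " + std::to_string(msg.wParam);
    default:
        // Messages can arrive at any time with any parameters,
        // so unknown codes must be tolerated rather than trusted.
        return "ignored";
    }
}
```

A fuzzer exercises exactly the paths this sketch glosses over: handlers that interpret wParam or lParam as something richer than a number, such as a pointer or an index, without validating it first.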

Security Implications

Prior to Windows Vista it was possible for a low-privilege process to send messages to a high-privilege process. Using the right combination of messages, it was possible to gain code execution in the high-privilege process. These “shatter attacks” have been largely mitigated since Vista with User Interface Privilege Isolation (UIPI) and by isolating system services in a separate session.

Mishandling of window messages is unlikely to have a security impact on modern Windows systems for two reasons. First, window messages can’t be sent over the network. Second, crashing or gaining code execution at the same privilege level as you already have is not useful. This was likely apparent to the authors of the NT Fuzz Report. They do not make security claims, but correctly point out that crashes during window message handling imply a lack of rigorous testing.

There are some domains where same-privilege code execution may violate a real security boundary. Some applications combine various security primitives to create an artificial privilege level not natively present in the operating system. The prime example is a browser’s renderer sandbox. Browser vendors are well aware of these issues and take steps to mitigate them. Another example is antivirus products. Their control panel runs with normal user privileges but is protected against inspection and tampering by other parts of the product.

Testing Methodology

We used the same core fuzzing code and methodology described in the original NT Fuzz Report to fuzz all applications in our test set. Specifically, in both SendMessage and PostMessage modes, the fuzzer used three iterations of 500,000 messages with the seed 42 and three iterations of 500,000 messages using the seed 1,337. We saw results after executing just one iteration of each method.
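The shape of one such run can be sketched as follows. This is not the original fuzzer's code: FuzzMessage, Sender, and run_campaign are hypothetical names, the sender callback stands in for a call to SendMessage or PostMessage, and both the single seeding per run and the 16-bit message code range are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <functional>

// Hypothetical representation of one fuzzed window message.
struct FuzzMessage {
    std::uint32_t code;
    std::uintptr_t wParam;
    std::intptr_t lParam;
};

// Stand-in for SendMessage or PostMessage against a target window.
using Sender = std::function<void(const FuzzMessage&)>;

// Runs `iterations` rounds of `messages_per_iteration` random messages.
// Returns the total number of messages handed to `send`.
std::size_t run_campaign(unsigned seed, int iterations,
                         std::size_t messages_per_iteration,
                         const Sender& send) {
    assert(static_cast<bool>(send));
    std::srand(seed);  // assumption: the generator is seeded once per run
    std::size_t sent = 0;
    for (int it = 0; it < iterations; ++it) {
        for (std::size_t i = 0; i < messages_per_iteration; ++i) {
            FuzzMessage msg;
            // Assumption: message codes are drawn from a 16-bit range.
            msg.code = static_cast<std::uint32_t>(std::rand()) & 0xFFFF;
            msg.wParam = static_cast<std::uintptr_t>(std::rand());
            msg.lParam = static_cast<std::intptr_t>(std::rand());
            send(msg);
            ++sent;
        }
    }
    return sent;
}
```

Under these assumptions, the configuration described above corresponds to calling run_campaign(42, 3, 500000, sender) and run_campaign(1337, 3, 500000, sender), once with a sender that wraps SendMessage and once with one that wraps PostMessage. Fixing the seed makes a run repeatable, which is what allows a crash to be reproduced.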

Fuzzing using the “random mouse and keyboard input” method was omitted due to time constraints and the desire to focus purely on window messages. We encourage you to replicate those results as well.

Caveats

Two minor changes were necessary to use the fuzzer on Windows 10. First was a tiny change to build the fuzzer on 64-bit Windows. The second change was enabling the fuzzer to target a specific window handle via a command line argument. Fuzzing a specific handle was a quick solution to the problem of fuzzing Universal Windows Platform (UWP) applications. The window message fuzzer is oriented to fuzzing windows belonging to a specific process, but UWP applications all display their UI via the same process (Figure 2). This meant that the fuzzer could not target the main window of UWP applications.

Figure 2: UWP application windows all belong to the same process (ApplicationFrameHost.exe). To fuzz these applications, the original NT fuzzer was modified to allow fuzzing of a user-specified window handle.
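The handle-targeting change is not shown in this post, so the following is only a sketch of how a window handle might be read from a command line argument. The function name parse_window_handle and the accepted formats are assumptions; on Windows the resulting integer would then be cast to an HWND.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: parses a window handle given as text, for example
// "0x000A04F2" (hex) or "656626" (decimal). Returns false on malformed input.
bool parse_window_handle(const char* text, std::uintptr_t* out) {
    assert(out != nullptr);
    // Require a leading digit so that whitespace and signs are rejected.
    if (text == nullptr || !std::isdigit(static_cast<unsigned char>(text[0]))) {
        return false;
    }
    errno = 0;
    char* end = nullptr;
    // Base 0 accepts "0x" hex and plain decimal (a leading "0" means octal).
    const unsigned long long value = std::strtoull(text, &end, 0);
    if (errno != 0 || end == text || *end != '\0') {
        return false;  // overflow, no digits, or trailing garbage
    }
    if (value > UINTPTR_MAX) {
        return false;  // does not fit in a pointer-sized handle
    }
    *out = static_cast<std::uintptr_t>(value);
    return true;
}
```

A handle value for a running application can be looked up with a window inspection tool such as Spy++, which displays handles in hexadecimal.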

While modifying the fuzzer, a serious flaw was identified: the values selected for the two primary sources of randomized input, the lParam and wParam arguments to SendMessage and PostMessage, are limited to 16-bit integers. Both of the arguments are 32-bit on 32-bit Windows, and 64-bit on 64-bit Windows. The problem occurs in Fuzz.cpp, where the lParam and wParam values are set:

     wParam = (UINT) rand();
     lParam = (LONG) rand();

The rand() function returns a number in the range [0, RAND_MAX]. With the Microsoft C runtime, RAND_MAX is 32,767 (2^15 - 1), so every generated value fits in 16 bits, greatly limiting the set of tested values. This bug was purposely preserved during evaluation, to ensure results were accurately comparable against the original work.
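For readers who want to explore the wider input space, one possible way to cover all 64 bits is to stitch several narrow draws together. The sketch below is not part of the original fuzzer and was not used for the results in this post; wide_random and wide_random_from_rand are hypothetical helpers, and the code assumes only that each draw provides at least 15 usable random bits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Builds a 64-bit value from five 15-bit draws. Five draws yield 75 bits,
// so the oldest 11 bits are shifted out and every result bit comes from a draw.
template <typename Draw>
std::uint64_t wide_random(Draw draw) {
    std::uint64_t value = 0;
    for (int i = 0; i < 5; ++i) {
        const int chunk = draw();
        assert(chunk >= 0);  // rand()-style generators never return negatives
        // Keep only the low 15 bits, the most a 32,767-max generator provides.
        value = (value << 15) | (static_cast<std::uint64_t>(chunk) & 0x7FFF);
    }
    return value;
}

// Convenience wrapper that uses the C runtime generator as the source.
inline std::uint64_t wide_random_from_rand() {
    return wide_random([] { return std::rand(); });
}
```

The result can be cast to the pointer-sized parameter types, so that on 64-bit Windows the full range of wParam and lParam values becomes reachable. Doing so changes the sequence of generated messages, which is why the comparison in this post keeps the original behavior.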

Tested Applications

The NT Fuzz Report tested 33 programs. This reproduction tests just 28 because only one version of each program is used for testing. The Windows software ecosystem has changed substantially since 2000, but there is also a surprising amount of conservation. The Microsoft Office suite features the same programs as the original tests. Netscape Communicator evolved into what is now Firefox. Adobe Acrobat was renamed to Adobe Reader, but is still going strong. Even Winamp made a new release in 2018, allowing for a fair comparison with the original NT Fuzz Report. However, some legacy software has gone the way of the last millennium. The list below shows what changed and why:

  • CD Player ⇨ Windows Media Player: The Windows Media Player has subsumed CD Player functionality.
  • Eudora ⇨ Windows Mail: Qualcomm now makes basebands, not email clients. Because Eudora is no longer around, the default Windows email client was tested instead.
  • Command AntiVirus ⇨ Avast Free Edition: The Command product is no longer available. It was replaced with Avast, the most popular third-party antivirus vendor.
  • GSView ⇨ Photos: The GSView application is no longer maintained. It was replaced with Photos, the default Windows photo viewer.
  • JavaWorkshop ⇨ NetBeans IDE: The JavaWorkshop IDE is no longer maintained. NetBeans seemed like a good free alternative that fits the spirit of what should be tested.
  • Secure CRT ⇨ BitVise SSH: Secure CRT is still around, but required a very long web form to download a trial version. BitVise SSH offered a quick download.
  • Telnet ⇨ Putty: The telnet application still exists on Windows, but now it is a console application. To fuzz a GUI application, we replaced telnet with Putty, a popular open-source terminal emulator for Windows.
  • Freecell & Solitaire were run from the Microsoft Solitaire Collection application in the Windows App Store.

The specific application version appears in the results table. All fuzzing was done on a 64-bit installation of Windows 10 Pro, version 1809 (OS Build 17763.253).

Results

As mentioned in the NT Fuzz Report, the results should not be treated as security vulnerabilities, but instead a measure of software robustness and quality.

“Finally, our results form a quantitative starting point from which to judge the relative improvement in software robustness.”

From “An Empirical Study of the Robustness of Windows NT Applications Using Random Testing” by Justin E. Forrester and Barton P. Miller

The numbers are not particularly encouraging, although the situation is improving. In the original NT Fuzz Report, every application either crashed or froze when fuzzed. Now, two programs, Calculator and Avast Antivirus, survive the window message fuzzer with no ill effects. Our praise goes to the Avast and Windows Calculator teams for thinking about erroneous window messages. The Calculator team gets additional kudos for open sourcing Calculator and showing everyone how a high-quality UWP application is built. See Table 1 for all of our fuzzing results, along with the specific version of the software used.

Program               Version              SendMessage  PostMessage
Microsoft Access      1901                 crash        crash
Adobe Reader DC       2019.010.20098       crash        ok
Calculator            10.1812.10048.0      ok           ok
Windows Media Player  12.0.17763.292       crash        crash
Visual Studio Code    1.30.2               crash        ok
Avast Free            19.2.2364            ok           ok
Windows Mail          16005.11231.20182.0  crash        crash
Excel                 1901                 crash        ok
Adobe FrameMaker      15.0.2.503           crash        crash
Freecell              4.3.2112.0           crash        crash
GhostScript           9.26                 crash        ok
Photos                2019.18114.17710.0   crash        crash
GNU Emacs             26.1                 crash        crash
IE Edge               44.17763.1.0         crash        crash
NetBeans              10                   crash        crash
Firefox               64.0.2               crash        crash
Notepad               1809                 crash        ok
Paint                 1809                 crash        crash
Paint Shop Pro 2019   21.1                 crash        crash
Powerpoint            1901                 crash        ok
Bitvise SSH           8.23                 crash        crash
Solitaire             4.3.2112.0           crash        crash
Putty                 0.70                 freeze       freeze
VS Community 2017     15.9.5               crash        crash
WinAmp 5.8            5.8 Build 3660       crash        ok
Word                  1901                 crash        ok
Wordpad               1809                 crash        crash
WS_FTP                12.7.0.1903          crash        crash

Table 1: The results of replicating the original NT Fuzz Report on Windows 10. After 19 years, very few applications properly handle malformed window messages.
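The headline percentage follows directly from the table: 28 programs were tested and only Calculator and Avast show "ok" in the SendMessage column, leaving 26 failures. A small sketch of the arithmetic, using a hypothetical helper named failure_percent:

```cpp
#include <cassert>
#include <cmath>

// Percentage of tested programs that crashed or froze,
// rounded to the nearest whole percent.
int failure_percent(int failed, int tested) {
    assert(tested > 0 && failed >= 0 && failed <= tested);
    return static_cast<int>(std::lround(100.0 * failed / tested));
}
```

With the counts from Table 1, failure_percent(26, 28) gives 93 for SendMessage. In the PostMessage column 10 programs show "ok", so 18 of 28 failed, which failure_percent(18, 28) rounds to 64.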

A Bug in Windows?

We had not planned to debug individual crashes, but our curiosity got the better of us and we made one exception. One common problem seemed to plague multiple unrelated applications. Some debugging showed the responsible message was WM_DEVICECHANGE. When the fuzzer sent that message, it would even crash the simplest application possible: the official Windows API HelloWorld Sample (Figure 3).

Figure 3: A 32-bit HelloWorld.exe crashes when faced with the window message fuzzer. This shouldn’t happen since the program is so simple. The implication is that the issue is somewhere in Windows.

Using the HelloWorld sample we quickly realized that the problem only affects 32-bit applications, not 64-bit applications. Some rapid debugging revealed that the crash is in wow64win.dll, the 32-to-64-bit compatibility layer. My quick (and possibly wrong) analysis of the problem shows that the wow64win.dll!whcbfnINDEVICECHANGE function will treat wParam as a pointer to a DEV_BROADCAST_HANDLE64 structure in the target program. The function converts that structure to a DEV_BROADCAST_HANDLE32 structure for compatibility with 32-bit applications. The crash happens because the wParam value generated by the fuzzer points to invalid memory.

Treating wParam as a local pointer is a bad idea, although it was probably an intentional design decision to make sure removable device notifications work with legacy 32-bit Windows applications. Regardless, it certainly feels wrong that it is possible to crash another application without explicitly debugging it. We reported the issue to MSRC, even though no security boundary was being crossed. They confirmed the bug is not a security issue. We hope to see a fix for this admittedly obscure problem in a future version of Windows.

Conclusion

Window messages are an under-appreciated and often ignored source of untrusted input to Windows programs. Even 19 years after the first open-source window message fuzzer was deployed, 93% of tested applications still freeze or crash when run against the very same fuzzer. The fact that some applications gracefully handle these malformed inputs is an encouraging sign: it means frameworks and institutional knowledge to avoid these errors exist in some organizations.

There is also much room for improvement in window message fuzzing — the simplest method possible crashes 93% of applications. There may even be examples where window messages travel across a real security boundary. If you explore this area further, we hope you’ll share what you find.
