Marshal madness: A brief history of Ruby deserialization exploits

Documenting the evolution of exploitation techniques serves a crucial purpose in security engineering: it helps us understand not just individual vulnerabilities but the systemic patterns that resist conventional fixes. The story of deserialization exploits in Ruby’s Marshal module offers a uniquely well-documented case study of this phenomenon: a decade-long cycle of patches and bypasses that reveals the futility of addressing symptoms rather than root causes.

This history matters because it demonstrates why certain classes of vulnerabilities persist despite our best efforts. By tracing how we got here, we can better understand why fundamental changes to the Ruby ecosystem are necessary, rather than continued reliance on the patch-and-hope approach that has thus far failed to solve the problem.

It’s worth noting that Trail of Bits has been documenting Ruby deserialization bugs since at least 2015, although those bugs involve the JSON and YAML data formats. Hal Brodigan, a Trail of Bits employee from 2012 to 2015, documented even earlier Ruby examples. Additionally, Java has CVEs for deserialization of untrusted data going back to 2011, and PHP to 2007. For this reason, I’ve decided to focus primarily on Ruby Marshal deserialization in this post and avoid the long tail of all serialization formats and programming languages. And with that, here is a brief look at the evolution of Marshal deserialization exploits:

A timeline of Ruby deserialization exploits

Understanding Marshal deserialization vulnerabilities

It’s December 12, 2024. Somewhere on the Ruby language’s CI servers, version 3.4.0-rc1 has just been released. Unbeknownst to Ruby developers, it contains a subtle bug that allows for Marshal deserialization exploitation. This code had not been touched for 16 years, so there’s no reason to suspect a thing. A few weeks prior, however, Luke Jahnke had published a new Marshal exploitation technique, and word had not yet gotten around. Just before Christmas Day, a patch is merged, and Ruby 3.4.0 is eventually released without the vulnerable code. Crisis averted.

Had the unpatched code been released, a Rails controller like the following would have been vulnerable:

class UserRestoreController < ApplicationController
  def show
    # A user data restoration controller somewhere...
    user_data = params[:data]
    if user_data.present?
      # VULNERABLE: attacker-controlled params[:data] reaches Marshal.load.
      deserialized_user = Marshal.load(Base64.decode64(user_data))
      user_object = UserObject.new(data: deserialized_user)
      user_object.save!
      render plain: "User data saved successfully: #{deserialized_user.inspect}"
    else
      render plain: "No data provided", status: :bad_request
    end
  end
end
Figure 1: A Rails controller performing Marshal deserialization

Generally, you will not see code as contrived as the snippet above, but the Marshal format often makes its way into back-end systems such as caching layers, or into code that stores Ruby objects on the filesystem. Simply put, passing untrusted input to Marshal.load should be considered an arbitrary code execution vulnerability. Building exploits for this type of vulnerability is beyond the scope of this post and has been exhaustively covered in the sources linked below (see Jahnke 2024). In general, though, exploitation requires sending a Marshal-formatted sequence of bytes like the following to the Rails controller in figure 1:

"\x04\b[\ac\x15Gem::SpecFetcherU:\x11Gem::Version[\x06o:\x1EGem::RequestSet::Lockfile\n:\t@seto:\x14Gem::RequestSet\x06:\x15@sorted_requests[\ao:%Gem::Resolver::SpecSpecification\x06:\n@speco:$Gem::Resolver::GitSpecification\a:\f@sourceo:\x15Gem::Source::Git\n:\t@gitI\"\bzip\x06:\x06ET:\x0F@referenceI\"\x10/etc/passwd\x06;\x10T:\x0E@root_dirI\"\t/tmp\x06;\x10T:\x10@repositoryI\"\bany\x06;\x10T:\n@nameI\"\bany\x06;\x10T;\vo:!Gem::Resolver::Specification\a;\x14I\"\bany\x06;\x10T:\x12@dependencies[\x00o;\n\x06;\vo;\f\a;\ro;\x0E\n;\x0FI\"\bzip\x06;\x10T;\x11I\"*-TmTT=\"$(id>/tmp/marshal-poc)\"any.zip\x06;\x10T;\x12I\"\t/tmp\x06;\x10T;\x13I\"\bany\x06;\x10T;\x14I\"\bany\x06;\x10T;\vo;\x15\a;\x14I\"\bany\x06;\x10T;\x16[\x00;\x16[\x00:\x13@gem_deps_fileI\"\x11/private/tmp\x06;\x10T:\x12@gem_deps_dirI\"\r/private\x06;\x10T:\x0F@platforms[\x00"
Figure 2: An example Marshal deserialization exploit payload
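
To make the danger concrete without reproducing a real gadget chain, here is a minimal, harmless sketch. The AuditLog class is hypothetical and stands in for any class already loaded in the target process. It illustrates that Marshal.load picks the class named in the byte stream and calls that class's marshal_load hook, without the calling code ever naming the class.

```ruby
# Hypothetical class standing in for any class already loaded in the process.
class AuditLog
  @events = []

  class << self
    attr_reader :events
  end

  # Marshal.dump stores whatever this returns alongside the class name.
  def marshal_dump
    "attacker-chosen data"
  end

  # Marshal.load allocates an AuditLog and calls this hook on its own.
  def marshal_load(data)
    self.class.events << "marshal_load ran with #{data.inspect}"
  end
end

# An attacker builds the bytes offline; the class name is embedded in them.
payload = Marshal.dump(AuditLog.new)

# The receiving side never mentions AuditLog, yet its hook still runs.
restored = Marshal.load(payload)

puts restored.class
puts AuditLog.events
```

Real gadget chains follow the same mechanism, but chain together hooks that already exist in the standard library or common gems instead of a purpose-built class.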

Thankfully the new exploitation vector mentioned earlier was patched before Ruby 3.4.0 was released, but, as we’ll see, patching eventually becomes an exercise in futility. To understand how we got here, first we must understand where we’ve been.

The beginning: An unassuming bug tracker issue

In the beginning, Charlie Somerville (now Hailey) created a Ruby bug tracker issue on January 31, 2013. It discusses the dangers of Marshal.load in Ruby version 2.0.0. There may have been some private correspondence between Hailey and the Ruby team on the security mailing list before this issue was created, but I’ll call this issue the beginning of the Marshal deserialization exploitation lineage.

Before running off to find earlier examples, consider the point I’m trying to make here: that there is a direct link between Hailey’s issue and modern Ruby deserialization exploit development. This post is not about finding the earliest reference to Marshal deserialization bugs—it’s instead about tracing an evolution of thought and development of exploits.

From here, the story picks back up on May 6, 2016 in Phrack #69. Ahh Phrack, our old friend. It’s a good thing this Phile made it into #69 because #70 would come out over five years later. The concept of Marshal deserialization exploitation finds its way into a subsection of a section of a Phile of a Phrack Issue. It targeted Marshal deserialization in Ruby on Rails versions 3 and 4. The Phile’s author, joernchen, directly credits Hailey Somerville and then immediately states that the technique is patched in Rails 4.1 unless you modify the default behavior. The life and death of an exploit. This marks the end of the beginning. From here on out, Ruby deserialization exploitation will explode in both popularity and creativity.

The explosion: Security researchers riffing

On November 8, 2018, Luke Jahnke published a blog post titled “Ruby 2.x Universal RCE Deserialization Gadget Chain.” This gadget chain targets Marshal deserialization in Ruby 2.x. For the first time in this story, we see the programmatic hunting of exploit gadgets—that is, using rudimentary program analysis to search the Ruby standard library and other common libraries for exploit gadgets. Exploit gadgets are snippets of code that can be chained together to create a payload enabling malicious actions like arbitrary code execution or arbitrary file download. These payloads are what we typically think of as an “exploit.” So, Luke published a technique for generating Marshal deserialization exploitation payloads. He also made sure to credit the Phrack article and Hailey Somerville.
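
As a rough illustration of what programmatic gadget hunting can look like, the sketch below lists every loaded class or module that directly defines a hook Marshal.load will call on its own. This is not Luke's tooling: the helper name marshal_entry_points and the choice of hooks (marshal_load and _load) are assumptions made for this example, and real research goes much further by following what those hooks call.

```ruby
# Hooks that Marshal.load invokes by itself while rebuilding objects.
INSTANCE_HOOKS  = %i[marshal_load].freeze # called on a freshly allocated instance
SINGLETON_HOOKS = %i[_load].freeze        # called on the class itself

# Ask Module for the name directly, in case a class overrides .name.
MODULE_NAME = Module.instance_method(:name)

# Returns sorted strings such as "SomeClass#marshal_load" for every loaded
# class or module that defines one of the hooks directly (not via inheritance).
def marshal_entry_points
  hits = []
  ObjectSpace.each_object(Module) do |mod|
    name = MODULE_NAME.bind_call(mod)
    next if name.nil? # skip anonymous and singleton classes

    defined_here = mod.instance_methods(false) + mod.private_instance_methods(false)
    INSTANCE_HOOKS.each do |hook|
      hits << "#{name}##{hook}" if defined_here.include?(hook)
    end

    singleton_here = mod.singleton_methods(false)
    SINGLETON_HOOKS.each do |hook|
      hits << "#{name}.#{hook}" if singleton_here.include?(hook)
    end
  end
  hits.sort
end

puts marshal_entry_points
```

The output is only a list of starting points. Turning one of them into a working chain still requires tracing which attacker-controlled instance variables each hook touches.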

Now the fun begins. In relatively quick succession, the following content is published:

| Date | Content | References | Patch |
| --- | --- | --- | --- |
| January 2, 2019 (publicly disclosed March 19, 2019) | ooooooo_q opens a HackerOne report for a Rails 5.2 Marshal deserialization bug. This bug receives CVE-2019-5420. | | Rails 5.2.2.1 and backported to other supported versions |
| March 2, 2019 | Etienne Stalmans publishes “Universal RCE with Ruby YAML.load,” targeting YAML deserialization in Ruby 2.x. | Luke’s work from 2018 | Ruby 2.7.2 and Rails 6.1 |
| June 20, 2019 | Zero Day Initiative publishes a blog post. | ooooooo_q’s work from 2019 | |
| January 7, 2021 | William Bowling publishes “Universal Deserialisation Gadget for Ruby 2.x-3.x,” targeting Ruby 2.x and 3.x. | Luke’s work from 2018 | Ruby 3.1.0 (per 2b17d2f) |
| January 9, 2021 | Etienne Stalmans publishes “Universal RCE with Ruby YAML.load (versions > 2.7),” which includes a full YAML payload for exploiting the chain. | Luke’s work from 2018 and William’s work from 2021 | |
| April 4, 2022 | William Bowling publishes “Round Two: An Updated Universal Deserialisation Gadget for Ruby 2.x-3.x,” targeting Ruby 2.x and 3.x with the latest patches. | Luke’s work from 2018 and William’s work from 2021 | Ruby 3.2.0 |

The flurry of activity was so great that ooooooo_q even wrote a book called Deserialization on Rails. However, from there things go quiet for a bit. Two things happen: Ruby 3.1.0 and Psych (YAML) 4.0.0 make safe YAML loading the default, and Ruby 3.2.0 patches Marshal deserialization gadgets. It appears that deserialization exploitation is on the decline, but hackers will always find a way. I consider the next era to be the “modern era.” It continues to blend bug hunting and program analysis to push the state of the art forward.

The modern era: Robust gadget discovery

One of the defining characteristics of what I consider to be the modern era of Ruby deserialization exploitation is the industrialized, clinical approach. No longer are individuals hacking away in their spare time. The modern era sees professionals using “industrial-grade” tooling to overpower defenders. In many ways this mirrors the security industry as a whole. Vulnerability research and exploit development are not what they were 10 to 15 years ago. The cat and mouse game remains the same, but the tools and techniques are significantly more advanced. And there are many more organizations willing to pay for it.

The modern era opens with a blog post published on March 13, 2024, by Alex Leahu from Include Security titled “Discovering Deserialization Gadget Chains in Rubyland.” He references Luke’s and William’s blog posts and uses grep to search for exploit gadgets. He ultimately uses Rails libraries to create an exploit chain. The downside is that this chain will not work outside of Rails environments. However, shortly after this post, another is published that takes everything to another level.

On June 20, 2024, Peter Stöckli and GitHub Security Lab publish a blog post titled “Execute commands by sending JSON? Learn how unsafe deserialization vulnerabilities work in Ruby projects.” This post describes their research on Ruby deserialization exploits for the JSON, XML, YAML, and, yes, of course, Marshal data formats. It provides CodeQL queries that perform interprocedural analysis to determine whether your serialization usage is vulnerable to exploitation, along with proof-of-concept payloads for exploiting deserialization bugs in all of those formats. In short, it’s quite a departure from Hailey Somerville’s Ruby bug tracker issue in 2013. It does, of course, still reference William’s universal deserialization gadget post.

This is where my investigation into the modern era was supposed to end. I’ve been referencing the resources mentioned here for a number of years as I audit Ruby code, and every so often I’m gifted with another arrow for my quiver. And lo and behold, just as I go to write this post, another gadget gets posted for Ruby 3.4 on October 16, 2024, by Leonardo Giovannini from Doyensec. Then, as mentioned at the beginning of the story, Luke Jahnke re-enters the scene on November 24, 2024, with “Ruby 3.4 Universal RCE Deserialization Gadget Chain,” and again on December 3 with “Gem::SafeMarshal escape.” Both of these techniques were eventually patched. The wheel turns.

The future: Ending Marshal madness

So where do we go from here? Unfortunately, the cycle continues today. During our recent security audit of RubyGems.org for Ruby Central, we discovered two Marshal-related vulnerabilities within code that thousands of developers rely on daily. Both of these vulnerabilities were rated as informational severity due to their difficulty of exploitation, but, as we’ve seen in this post, they are still a ticking time bomb. One of them enables the exact Gem module gadget chain we’ve seen time and time again. The persistence of these issues in a security-conscious codebase like RubyGems.org speaks volumes about how deeply entrenched the use of Marshal remains in the Ruby ecosystem.

To improve the situation, we make the following recommendations to the Ruby ecosystem.

Here’s what Ruby developers should do:

  1. Audit your codebase for Marshal usage. Search for Marshal.load, marshal_load, and similar methods.
  2. Replace Marshal with safer alternatives:
    • Use YAML’s safe_load with explicitly permitted classes.
    • Use JSON with manual object construction.
    • Use properly typed database columns, not opaque binary blobs.
    • Consider other serialization formats such as MessagePack or Protocol Buffers.
  3. Add deserialization to your security review checklist for both code and dependency reviews.
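
As a minimal sketch of the first two alternatives above, the following example rebuilds an object manually from JSON and loads YAML through safe_load with an explicit allowlist. UserRecord, user_from_json, and settings_from_yaml are hypothetical names invented for this illustration, and the field names are assumptions.

```ruby
require "json"
require "yaml"
require "date"

# Hypothetical record type used only for this illustration.
UserRecord = Struct.new(:name, :signup_date, keyword_init: true)

# JSON.parse yields only hashes, arrays, strings, numbers, booleans, and nil.
# The application decides which object to build and converts each field itself.
def user_from_json(text)
  raw = JSON.parse(text)
  UserRecord.new(
    name: String(raw.fetch("name")),
    signup_date: Date.iso8601(raw.fetch("signup_date"))
  )
end

# safe_load raises Psych::DisallowedClass for any class not explicitly permitted.
def settings_from_yaml(text)
  YAML.safe_load(text, permitted_classes: [Date])
end

user = user_from_json('{"name": "alice", "signup_date": "2024-12-12"}')
puts user.name

begin
  settings_from_yaml("--- !ruby/object:Gem::Requirement\nrequirements: []\n")
rescue Psych::DisallowedClass => e
  puts "rejected: #{e.message}"
end
```

In both cases the set of classes that can be instantiated is decided by the application, not by the bytes it receives.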

For the Ruby core team and community, more fundamental changes are needed. We recommend slowly deprecating and eventually removing the Marshal module in stages:

  1. Introduce a Marshal.safe_load method similar to YAML’s that only deserializes primitives by default. This method should take a permitted_classes keyword argument that allows additional classes to be (de)serialized.
  2. Add runtime warnings when Marshal.load is called.
  3. In future Ruby versions, make Marshal.load behave like safe_load by default, and add a Marshal.unsafe_load method that points to the original load behavior.
  4. Finally, after several versions, fully deprecate and remove the unsafe behavior.
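
Until something like this exists in Ruby itself, an application can approximate the warning stage on its own. The sketch below is an assumption-laden illustration, not a proposed core patch: MarshalLoadWarning is a hypothetical module name, and the shim only warns. It does not make Marshal.load any safer.

```ruby
# Hypothetical application-level shim; not part of Ruby itself.
module MarshalLoadWarning
  # Accepts the same arguments as Marshal.load and passes them through unchanged.
  def load(source, proc = nil, **options)
    # uplevel: 1 attributes the warning to the caller's file and line.
    warn("Marshal.load is unsafe on untrusted input", uplevel: 1)
    super
  end
end

# Prepending to the singleton class wraps the Marshal.load module function.
Marshal.singleton_class.prepend(MarshalLoadWarning)

# Behaves as before, but now prints a warning to standard error.
Marshal.load(Marshal.dump([1, "two", :three]))
```

A real safe_load would have to go further and reject disallowed class names in the byte stream before any object is allocated, which a wrapper around the existing load cannot do.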

Languages like Python, Java, and Ruby have been plagued by these types of bugs for decades. Indeed, there are other possibilities for deserialization bugs in formats such as JSON and YAML, but we have to start somewhere. Marshal is simply too dangerous to use in 2025. The same argument can be made for Python and pickle, but we’ll spare the AI industry for another day. There’s a reason Go and Rust do not have these types of bugs. If we remove Marshal and unsafe variants of these serialization formats, then these bugs go away. If they’re too ergonomic and too tempting to use, then we will continue to see these bugs. It’s as simple as that.

If you’d like to read more about our Ruby work, then check out our blog post “Introducing Ruzzy, a coverage-guided Ruby fuzzer,” our Semgrep Ruby rules, and our Ruby Security Field Guide.

Contact us if you’re interested in a Ruby audit for issues like deserialization bugs and others!