[HN Gopher] Practical HTTP Header Smuggling: Sneaking Past Rever...
___________________________________________________________________

Practical HTTP Header Smuggling: Sneaking Past Reverse Proxies to
Attack AWS

Author : MalacodaV
Score  : 90 points
Date   : 2021-11-11 15:49 UTC (7 hours ago)

(HTM) web link (www.intruder.io)
(TXT) w3m dump (www.intruder.io)

| bigdubs wrote:
| This was mentioned in Go's net/http by this CL:
| https://go-review.googlesource.com/c/go/+/17980/
| It's an interesting point that the spec allows this.
|
| missblit wrote:
| Note that the spec is stricter for field-name than for
| field-value. Field names are ASCII, while field values are latin1
| (or mime encoded but no one cares about mime encoding).
|
| And yes I have seen bytes in both names and values in the wild
| (where bytes in names are invalid but need to be handled
| gracefully, while bytes in values are effectively valid latin1
| if only for legacy reasons).
|
| Looking at the bug you linked to, looks like this almost bit
| them too. Here's the final field-value behavior they landed on:
| https://go-review.googlesource.com/c/go/+/18375/
|
| Matthias247 wrote:
| With HTTP/2 you can theoretically even transmit bytes with all
| binary values inside them in both names and values, since the
| values are length-delimited.
|
| Based on that, some implementations seem to restrict allowed
| values to the rules that you describe, while others don't.
|
| missblit wrote:
| Yeah, and really the moral of the story shouldn't be (just) to
| get good at parsing, but to assume that any two parsers may
| disagree on how to parse a piece of data as part of your
| security model.
|
| Sohcahtoa82 wrote:
| I think the real problem here is that people are writing web
| servers that don't enforce the spec.
|
| As soon as a space in a header name is found, a 400 Bad Request
| needs to be thrown. "Content-Length abcd: 0" is invalid and
| should never be accepted.
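
The strict behavior described above can be sketched in a few lines of
Python. This is only an illustration, not how any particular server
does it: the names TOKEN_RE, BadRequest and parse_header_line are
made up for the example, and it assumes the request has already been
split into individual header lines. The character class is the
"token" set that RFC 7230 allows in a field name. RFC 7230 section
3.2.4 also forbids whitespace between the field name and the colon,
so the sketch rejects that as well.

```python
import re
from typing import Tuple

# Hypothetical helper names; the character class is the RFC 7230 "tchar" set,
# i.e. the only characters permitted in a header field name.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


class BadRequest(Exception):
    """Raised where a real server would answer 400 Bad Request."""


def parse_header_line(line: str) -> Tuple[str, str]:
    """Split one header line into (lowercased name, trimmed value).

    Assumes `line` is a single header line with the trailing CRLF removed.
    """
    name, sep, value = line.partition(":")
    if not sep:
        raise BadRequest("missing colon")
    # Reject instead of "fixing up": any space, tab or other non-token
    # character in the name (including a space before the colon) is an error.
    if TOKEN_RE.fullmatch(name) is None:
        raise BadRequest("invalid field name: %r" % name)
    # Only optional spaces/tabs around the value are trimmed.
    return name.lower(), value.strip(" \t")
```

With this sketch, parse_header_line("Content-Length: 0") returns a
normal name/value pair, while "Content-Length abcd: 0" raises
BadRequest rather than being quietly treated as Content-Length.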
| legulere wrote:
| Like so many web technologies, HTTP headers are a big,
| complicated, ad-hoc mess (some headers are specified in a way
| that's not standard-compliant), so it's to be expected that
| things like this happen.
|
| thephyber wrote:
| There is something more than that. HTTP Content smuggling has
| already been identified as a significant issue, and the largest
| cloud providers and reverse proxy server software should have
| already fixed these issues.
|
| I started a GitHub repo to run integration tests for popular
| combinations of reverse proxies and popular language web servers
| to identify these gaps in expectations (how duplicates,
| capitalization, whitespace, etc. affect HTTP headers in
| different servers).
|
| rini17 wrote:
| More interesting is why "Content-Length abcd:" is treated the
| same as "Content-Length:" at all. Did someone overoptimize the
| header lookup? Then perhaps other kinds of extensions like
| "Content-Length-abcd" are possible, not only with a space?
|
| kingcharles wrote:
| I'm guessing they just check what each line starts with. Then
| they probably split the line on the ":" to get the value. That
| would produce the results seen.
|
| It shows just how careful you have to be when writing code that
| is Internet-facing, especially at the scale of AWS, where you
| have half the world's hackers trying to find exploits.
|
| I'm not even looking for exploits and I find them every day. For
| instance, I wanted to read some magazines the other day but they
| were behind a paywall. Just to see what was behind the wall, I
| checked for a sitemap file. The 35MB sitemap.xml contains direct
| links to the full downloads of every item, with no auth needed.
|
| thephyber wrote:
| > It shows just how careful you have to be when writing code
| > that is Internet-facing
|
| All code. "Internet facing" is not the only relevant
| qualification.
|
| Any code where user-generated input is parsed should be
| carefully written, tested, and documented.
| Edge cases should be identified and described in specs.
| Non-compliant software should be identified and shamed (or
| preferably PRed).
|
| I know that AWS already patched some HTTP smuggling attacks
| maybe 3 years ago, but I don't remember if it was the same AWS
| feature (the previous one might have been CloudFront), and the
| parsing error might have been a little different.
|
| Sohcahtoa82 wrote:
| More likely, they're stopping on the first space OR colon to
| parse the header name since "Content-Length : 0" is valid.
|
| Personally, if I were writing an HTTP request parser while being
| lazy about enforcing the spec, I'd split ONLY on the colon, then
| just strip the whitespace on either side of both the header name
| and value. In Python:
|
|     header, value = line.split(':', maxsplit=1)
|     header = header.strip().lower()
|     value = value.strip()
|
| After that, `header` should ALWAYS be checked via equality, and
| never `.startswith(...)`.
|
| thephyber wrote:
| Note that parsing is likely more complicated than your code
| because you have assumed that your "line" has already been
| identified before parsing the line. AFAIK there is an escape
| sequence for the header delimiter (\r\n).
|
| Also, your code doesn't fix the issue where a header name
| containing whitespace is accepted (which may violate
| expectations, depending on the server).
|
| Your pseudocode also doesn't handle edge cases where 2 headers
| which normalize to the same stripped text collide. One HTTP
| smuggling vector is the front server keeping a different header
| value than the back server when 2 header names collide.

___________________________________________________________________
(page generated 2021-11-11 23:01 UTC)