[HN Gopher] Practical HTTP Header Smuggling: Sneaking Past Rever...
       ___________________________________________________________________
        
       Practical HTTP Header Smuggling: Sneaking Past Reverse Proxies to
       Attack AWS
        
       Author : MalacodaV
       Score  : 90 points
       Date   : 2021-11-11 15:49 UTC (7 hours ago)
        
 (HTM) web link (www.intruder.io)
 (TXT) w3m dump (www.intruder.io)
        
       | bigdubs wrote:
       | This was mentioned in Go's net/http by this CL:
       | https://go-review.googlesource.com/c/go/+/17980/
       | It's an interesting point that the spec allows this.
        
         | missblit wrote:
         | Note that the spec is stricter for field-name than for
         | field-value. Field names are ASCII, while field values are
         | latin1 (or MIME-encoded, but no one cares about MIME encoding).
         | 
         | And yes, I have seen non-ASCII bytes in both names and values
         | in the wild (where such bytes in names are invalid but need to
         | be handled gracefully, while in values they are effectively
         | valid latin1, if only for legacy reasons).
         | 
         | Looking at the bug you linked to, looks like this almost bit
         | them too. Here's the final field-value behavior they landed on:
         | https://go-review.googlesource.com/c/go/+/18375/
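
A minimal Python sketch of the name/value asymmetry described above, assuming the raw name and value arrive as bytes. `decode_header` is a hypothetical helper and is not taken from Go's net/http; it only illustrates one way to be strict about names while staying lenient about values.

```python
def decode_header(raw_name: bytes, raw_value: bytes):
    """Decode a header pair: strict ASCII for the name, latin1 for the value."""
    # Non-ASCII bytes in a name raise UnicodeDecodeError; a server could
    # catch that and answer with 400 instead of crashing.
    name = raw_name.decode("ascii")
    # latin-1 maps every byte 0x00-0xFF to a code point, so this never fails.
    value = raw_value.decode("latin-1")
    return name, value


print(decode_header(b"X-Name", b"caf\xe9"))
```

Whether to reject or merely drop a header with a bad name is a policy choice; the sketch rejects, since silently dropping is one more place where two parsers can disagree.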
        
           | Matthias247 wrote:
           | With HTTP/2 you can theoretically even transmit arbitrary
           | byte values in both names and values, since both are
           | length-delimited.
           | 
           | Based on that, some implementations seem to restrict allowed
           | values to the rules that you describe, while others don't.
        
             | missblit wrote:
             | Yeah, and really the moral of the story shouldn't be (just)
             | to get good at parsing, but to assume that any two parsers
             | may disagree on how to parse a piece of data as part of
             | your security model.
        
       | Sohcahtoa82 wrote:
       | I think the real problem here is that people are writing web
       | servers that don't enforce spec.
       | 
       | As soon as a space in a header name is found, a 400 Bad Request
       | needs to be thrown. "Content-Length abcd: 0" is invalid and
       | should never be accepted.
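
A rough Python sketch of the strict behavior argued for above: accept a field name only if it matches the RFC 7230 `token` grammar, and fail otherwise. `BadRequest` and `parse_header_line` are hypothetical names; a real server would translate the exception into a 400 response.

```python
import re
from typing import Tuple

# RFC 7230 "token": one or more tchar characters (no spaces, no colons).
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


class BadRequest(Exception):
    """Hypothetical exception that a server would map to 400 Bad Request."""


def parse_header_line(line: str) -> Tuple[str, str]:
    """Split one header line, rejecting any field name that is not a token."""
    name, sep, value = line.partition(":")
    if not sep:
        raise BadRequest("missing colon")
    if not TOKEN_RE.fullmatch(name):
        raise BadRequest(f"invalid field name: {name!r}")
    # Optional whitespace around the value is allowed and stripped.
    return name.lower(), value.strip(" \t")
```

With this check, "Content-Length: 0" parses normally, while "Content-Length abcd: 0" and "Content-Length : 0" both raise `BadRequest`, because a space is not a valid token character.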
        
         | legulere wrote:
         | Like so many web technologies, HTTP headers are a big
         | complicated ad-hoc mess (some headers are specified in a way
         | that's not standard compliant), so it's to be expected that
         | there are things like this happening.
        
         | thephyber wrote:
         | There is something more than that. HTTP Content smuggling has
         | already been identified as a significant issue and the largest
         | cloud providers and Reverse Proxy server software should have
         | already fixed these issues.
         | 
         | I started a GitHub repo to run integration tests for popular
         | combinations of reverse proxy to popular language web servers
         | to identify these gaps in expectations (how duplicates,
         | capitalization, white space, etc affect HTTP headers in
         | different servers)
        
         | rini17 wrote:
         | More interesting is why Content-Length abcd: is treated the
         | same as Content-Length: at all. Someone overoptimized the
         | header lookup? Then perhaps other kinds of extensions like
         | Content-Length-abcd are possible, not only with space?
        
           | kingcharles wrote:
           | I'm guessing they just check what each line starts with. Then
           | they probably split the line on the : to get the value. That
           | would produce the results seen.
           | 
           | It shows just how careful you have to be when writing code
           | that is Internet-facing, and especially on the scale of AWS
           | where you have half the world's hackers trying to find
           | exploits.
           | 
           | I'm not even looking for exploits and I find them every day.
           | For instance, I wanted to read some magazines the other day
           | but they were behind a paywall. Just to see what was behind
           | the wall I checked for a sitemap file. 35MB sitemap.xml
           | contains direct links to the full downloads of every item
           | with no auth needed.
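
The "check what each line starts with, then split on the colon" guess from the start of this comment can be sketched in Python as follows. This is purely an illustration of the guess, not the actual code of AWS or any other server; `naive_lookup` is a hypothetical name.

```python
def naive_lookup(raw_headers: str, wanted: str):
    """Hypothetical lenient lookup: match on the line prefix, then split on ':'."""
    for line in raw_headers.split("\r\n"):
        if line.lower().startswith(wanted.lower()):
            # Everything between the matched prefix and the colon is ignored.
            _, _, value = line.partition(":")
            return value.strip()
    return None


raw = "Host: example.com\r\nContent-Length abcd: 0"
print(naive_lookup(raw, "Content-Length"))  # the mangled header still matches
```

Under this guess, a name like "Content-Length-abcd" would match as well, since only the prefix is compared.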
        
             | thephyber wrote:
             | > It shows just how careful you have to be when writing
             | code that is Internet-facing
             | 
             | All code. "Internet facing" is not the only relevant
             | qualification.
             | 
             | Any code where user-generated input is parsed should be
             | carefully written, tested, and documented. Edge cases
             | should be identified and described in specs. Non-compliant
             | software should be identified and shamed (or preferably
             | PRed).
             | 
             | I know that AWS has already patched some HTTP Smuggling
             | attacks maybe 3 years ago, but I don't remember if it was
             | the same AWS feature (the previous one might have been
             | CloudFront) and the parsing error might have been a little
             | different.
        
           | Sohcahtoa82 wrote:
           | More likely, they're stopping on the first space OR colon to
           | parse the header name, since lenient parsers accept
           | "Content-Length : 0" (RFC 7230 itself forbids whitespace
           | between the field name and the colon).
           | 
           | Personally, if I were writing a HTTP request parser while
           | being lazy about enforcing spec, I'd split ONLY on the colon,
           | then just strip the white space on either side of both the
           | header name and value. In Python:
           | 
           |     header, value = line.split(':', maxsplit=1)
           |     header = header.strip().lower()
           |     value = value.strip()
           | 
           | After that, `header` should ALWAYS be checked via equality,
           | and never `.startswith(...)`.
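
To make the equality-versus-prefix point concrete, here is the snippet above wrapped in two small functions. `split_header` and `is_content_length` are hypothetical names, and the sketch assumes the line contains at least one colon.

```python
def split_header(line: str):
    """Split only on the first colon, then strip both sides and lowercase the name."""
    header, value = line.split(":", maxsplit=1)
    return header.strip().lower(), value.strip()


def is_content_length(line: str) -> bool:
    """Compare the normalized name by equality, never by prefix."""
    name, _ = split_header(line)
    return name == "content-length"
```

With equality, "Content-Length abcd: 0" normalizes to the name "content-length abcd" and is simply an unknown header, whereas a `.startswith("content-length")` test would have matched it.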
        
             | thephyber wrote:
             | Note that parsing is likely more complicated than your code
             | because you have assumed that your "line" has already been
             | identified before parsing it. AFAIK a header value can
             | continue past the line delimiter (\r\n) via line folding
             | (obs-fold, deprecated in RFC 7230).
             | 
             | Also, your code doesn't fix the issue where a header name
             | with a white space is accepted (which may violate
             | expectations, depending on the server).
             | 
             | Your pseudo code also doesn't handle edge cases where 2
             | headers which normalize to the same stripped text collide.
             | One HTTP smuggling vector is the front server keeping a
             | different header value than the back server when 2 header
             | names collide.
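
The collision case in the last paragraph can be sketched like this, assuming a front server that keeps the first value for a normalized name and a back server that keeps the last. All function names are hypothetical, and neither policy is attributed to any particular product.

```python
def normalize(line):
    """Lenient normalization: strip whitespace and lowercase the name."""
    name, _, value = line.partition(":")
    return name.strip().lower(), value.strip()


def parse_first_wins(lines):
    headers = {}
    for line in lines:
        name, value = normalize(line)
        headers.setdefault(name, value)  # keep the first value seen
    return headers


def parse_last_wins(lines):
    headers = {}
    for line in lines:
        name, value = normalize(line)
        headers[name] = value  # later values overwrite earlier ones
    return headers


def find_collisions(lines):
    """Return normalized names that appear more than once."""
    seen, dupes = set(), set()
    for line in lines:
        name, _ = normalize(line)
        if name in seen:
            dupes.add(name)
        seen.add(name)
    return dupes


lines = ["Content-Length: 5", "content-length : 0"]
front = parse_first_wins(lines)
back = parse_last_wins(lines)
print(front["content-length"], back["content-length"])  # prints: 5 0
```

A stricter parser could call something like `find_collisions` and reject the request outright when a security-relevant name such as Content-Length shows up twice, rather than picking a winner.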
        
       ___________________________________________________________________
       (page generated 2021-11-11 23:01 UTC)