[HN Gopher] The perils of the "real" client IP ___________________________________________________________________ The perils of the "real" client IP Author : zdw Score : 106 points Date : 2022-03-05 18:05 UTC (4 hours ago) (HTM) web link (adam-p.ca) (TXT) w3m dump (adam-p.ca) | terom wrote: | :+1: for the effort to document this, and coordinating the | disclosure with the vendors. This mainly talks about rate- | limiting bypass/DoS, but if XFF is also used for audit trail | logging of IP addresses and/or IP-based access lists, then the | security implications can be even more severe, with falsified | audit logs and bypassed security controls. | | Setting up an application server behind a reverse proxy to use | the "real" client IP is unfortunately very typically just a | trial-and-error based process, with very little room for this | kind of nuanced security-consciousness, because the configuration | and exact behavior is all so non-standardized across different | implementations of reverse-proxies and application servers... | Typically users will just try different configuration settings | until they find a combination that seems to work, and you would | actually need to dig in with curl and tshark to understand the | edge cases, because the documentation of the application-specific | implementation is typically just one brief sentence... | | Getting XFF working correctly through a complicated HTTP stack | with multiple layers of nginx/haproxy/apache proxies (yes, they | have different non-overlapping feature sets), custom backends | implementing custom XFF handling/forwarding, and jetty/spring | backends upgraded across a major version bump that changed the | implementation and configuration properties related to XFF | handling was insanely difficult. And of course it broke when | migrating from an F5 LB to an AWS ALB, because it behaved | differently for that one edge-case for an important customer... 
| highly recommended to just override the entire XFF header with a | single value at the appropriate point in your stack, if at all | possible. | | If just the naive leftmost-first vs rightmost-ish-with- | configurable-list-of-trusted-upstream-proxies wasn't enough, then | yeah, HAProxy does the thing where it adds a new 100% standards- | compliant header continuation line [1] that maybe 1% of backend | application developers have ever tested with. And trying to | configure HAProxy to interpret the incoming XFF headers for | logging/access-control ~is~/was even more weird [2]. | | [1] https://github.com/haproxy/haproxy/issues/44 [2] | https://github.com/haproxy/haproxy/issues/90 | adam-p wrote: | > Setting up an application server behind a reverse proxy to | use the "real" client IP is unfortunately very typically just a | trial-and-error based process | | This is very true, and I narrowly missed getting burned myself | by thinking, "looks good to me!" after seeing my own IP. | | However... I take a dig in the conclusions at the implementers | of security-related libraries. I don't think it's okay for them | to stop at "seems to work". They should be taking the time to | fully understand the problem space. | | > highly recommended to just override the entire XFF header | with a single value at the appropriate point in your stack, if | at all possible | | I agree, and probably should have emphasized that approach. | (And maybe will, and will definitely add a note at the end.) I | didn't really give any/enough attention to configuring your | first proxy with a custom single-IP header. (Partly because I | was writing more for people who are trying to use what's | available.) | userbinator wrote: | For many years, a very prominent computer science journal used | XFF for guarding access --- if you set it to an IP of some well- | known universities, you'd be able to download all you want. 
| | I feel comfortable about disclosing this now since we have things | like SciHub (which may have used this trick at one point), and | they fixed it a few years ago. | hedora wrote: | I'd argue that this whole concept is inherently flawed. My home | internet is behind a carrier grade NAT. Many services have rate | limited the IP. It's to the point where I'm strongly considering | paying a "streaming friendly" VPN to black hat subvert these | reputation schemes. | | The biggest offenders are internet of things backends for | registered devices I "own" and streaming services I'm paying for. | | Edit: Here's a concrete example: I looked up my CGNAT IP and saw | that it was flagged as a malicious actor because someone ran a | port scan from it a month ago. | | Now that this offence has started to age out, a few services | started working again. My entire ISP can be trivially DOS'ed with | a raspberry pi and NMAP! | | Of course, other services seem to just do per-IP rate limiting, | so they run at << 1MB/s during peak hours. Fast.com and | Speedtest.net claim the connection is healthy. | xorcist wrote: | To be fair, this is a real problem that you sometimes can't | ignore. There are plenty of situations where you need to filter | out a bad actor by ip address. | dspillett wrote: | _> There are plenty of situations where you need to filter | out a bad actor by ip address._ | | It isn't so much that you need to do it that way, but that | there is no more practical way despite the inherent problems. | Which has effectively the same end result, but thinking that | way highlights the fact that CGNAT and other IPv4 limit | "solutions" cause as many problems as they solve. | kingforaday wrote: | Do you not have the ability to lease a static IP from the ISP | for a nominal amount? From my experience this is always an | option even on consumer lines. | hedora wrote: | No. It's a local rural ISP. I don't think they're | particularly competent. 
| c0wb0yc0d3r wrote: | It may be an option, but as a consumer, why would this be the | optimal answer? All this does is hide costs for you. | | This is a problem with service. If the service doesn't want | to fix it, then that sounds like a potential market | opportunity, no? I'm new to thinking about the business side, | but it seems like I want to remove any obstacle I can | preventing a customer from using my service. | convolvatron wrote: | it takes a little getting used to. the key here is that it's | a lot easier to convince your customers that any flaws in | your service are just 'the way it is' rather than even | attempt to do anything about them. | anderspitman wrote: | This is an interesting point. As more users get pigeonholed | into sharing the same IPv4 addresses, IP-based rate limiting | will essentially become useless. I wonder if this will help | drive IPv6 adoption. | | > The biggest offenders are internet of things backends for | registered devices I "own" and streaming services I'm paying | for. | | Wait, those are the services that are rate-limiting you? They're | not smart enough to rate-limit based on your account | credentials? | sodality2 wrote: | > They're not smart enough to rate-limit based on your | account credentials? | | This works until a VPN node starts having tens of thousands | of users from one IP address. | judge2020 wrote: | > I wonder if this will help drive IPv6 adoption. | | Only if users complain enough. Often they complain to the | reverse proxy host[0] or website itself[1], when it would be | a solved problem if IPv6 were properly deployed further. | | 0: top result on google for 'Cloudflare blocked me' with 38k | views https://community.cloudflare.com/t/cloudflare-is- | blocking-me... | | 1: https://www.coursera.support/s/question/0D51U00003BlYiVSAV | /y... | crtasm wrote: | Assuming you raised this with the services, what responses did | you get - if any? 
| [deleted] | 3np wrote: | > I'm strongly considering paying a "streaming friendly" VPN to | black hat subvert these reputation schemes | | Nothing "black hat" with that, at all. | | ...Unless you're referring to how those "streaming-friendly | VPNs" are acquiring residential IPs in sketchy ways from | sketchy providers, such as compromised smart devices and other | customers. | c0l0 wrote: | When I worked in (quasi-)SRE at a somewhat busy and popular EU- | based website, we "invented" our own HTTP header namespace (think | like the "X-"-prefix that you see in the wild for "non-standard" | HTTP headers, that often end up as ossified de-facto standards | without much of an RFC to specify them) for stuff that our | internal infrastructure components, like the HTTP- and HTTPS- | terminating reverse proxies, would add. | | The systems involved took proper care that those headers and | their values really came from _us_ , and not some outside system. | Of course, we also had some header akin to _Our-Site-Prefix-Peer- | IP-Addr_ , to know which TCP peer the outermost system was | actually talking to. In combination with evaluating other | headers, like the _X-Forwarded-For_ that this article handles in | delightful detail, there was a lot of interesting detective work | to be had in terms of which clients tried to spoonfeed us which | kind of would-be-spoofed data. | | These days, Carrier-grade NAT and the slow death of the End-to- | end principle make these techniques woefully inadequate to | discern between "legitimate" use, and clients you'd rather keep | out. Sadly. | scottlamb wrote: | This is a good point. I should have known better, but I just | checked a setup I made a while back (nginx "proxy_set_header" [1] | + Rust http::HeaderMap::get [2]), and I was doing it wrong. The | nginx command appends, and get returns the first. Oops. 
At least | I'm not doing IP-based auth, but I am looking for this to detect | if the client is using TLS (up to the reverse proxy server, the | backhaul is plaintext right now, like [3]) as well as logging the | IP. | | Another problem that I suspect is common: connections that | sometimes go through a reverse proxy, sometimes not. In my setup, | the "not" ones in theory are trusted (on the LAN or localhost) | but still it doesn't feel right to not really know if the header | came from the proxy or not. I could add a shared secret or | something to the header, but given that a LAN-based attacker | could sniff everything (not just this but also the actual user | credentials/traffic) via ARP spoofing or something, it probably | doesn't make as much sense to bother before getting rid of the | plaintext backhaul. | | Speaking of which, it'd be nice to make the TLS end-to-end: from | the user through a proxy that doesn't decrypt all the way to the | application server. But I'm not sure what the state of the art | there is. It used to be possible to dispatch based on the SNI | traffic and then proxy at the TCP level, but I know TLS 1.3 added | encrypted SNI. Not sure if the proxy can force clients to fall back | to non-encrypted SNI. I could imagine the spec authors making a | point of not allowing this so a man-in-the-middle can't find the | intended host. Maybe the two legs just have to be encrypted | separately now. | | [1] | http://nginx.org/en/docs/http/ngx_http_proxy_module.html#pro... | | [2] | https://docs.rs/http/0.2.6/http/header/struct.HeaderMap.html... | | [3] https://blog.encrypt.me/2013/11/05/ssl-added-and-removed- | her... | toast0 wrote: | > I know TLS 1.3 added encrypted SNI. Not sure if the proxy can | force clients fall back to non-encrypted SNI. | | Encrypted SNI is not the default. You've got to do a fair bit | of work to do it. You'd probably want your proxy to be able to | decode it to direct the traffic anyway. 
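The append-versus-first mismatch scottlamb describes above, and the "rightmost-ish with a configurable list of trusted upstream proxies" approach terom mentions, can be illustrated with a minimal Python sketch. The function name and the example addresses are hypothetical, and the sketch assumes the direct TCP peer has already been confirmed to be one of your own proxies; it is not how any particular proxy or framework implements this.

```python
import ipaddress


def client_ip_from_xff(header_values, trusted_proxies):
    """Return the rightmost X-Forwarded-For hop that is not a trusted proxy.

    header_values: every X-Forwarded-For header line received, in order
                   (hypothetical input shape; frameworks expose this differently).
    trusted_proxies: CIDR strings for proxies you operate (an assumption).
    Returns None when no usable address is found.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]
    # Several header lines are equivalent to one comma-joined list,
    # so flatten them all instead of reading only the first line.
    hops = [hop.strip() for value in header_values for hop in value.split(",")]
    # Walk from the right: entries appended by our own proxies come last,
    # while anything further left may have been supplied by the client.
    for hop in reversed(hops):
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            return None  # unparseable entry before an untrusted hop: give up
        if not any(addr in net for net in networks):
            return str(addr)
    return None
```

With the header lines `6.6.6.6, 203.0.113.7` and `10.0.0.2`, and `10.0.0.0/8` trusted, this picks 203.0.113.7, whereas taking the first value would return the client-supplied 6.6.6.6.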
| anderspitman wrote: | > Speaking of which, it'd be nice to make the TLS end-to-end: | from the user through a proxy that doesn't decrypt all the way | to the application server. But I'm not sure what the state of | the art there is. It used to be possible to dispatch based on | the SNI traffic and then proxy at the TCP level, but I know TLS | 1.3 added encrypted SNI. Not sure if the proxy can force | clients fall back to non-encrypted SNI. I could imagine the | spec authors making a point of not allowing this so a man-in- | the-middle can't find the intended host. Maybe the two legs | just have to be encrypted separately now. | | My boringproxy[0] project has a mode for doing SNI routing at | the reverse proxy, and tunneling it all the way back to a local | client machine which handles the actual TLS termination. The | client also gets certs automatically from Let's Encrypt. So you | end up with automated end-to-end encryption where the | boringproxy server/VPS can't decrypt any of the traffic. | | Early versions of boringproxy acted as a more traditional | reverse proxy, terminating the TLS at the server then making a | new HTTP request upstream. Eventually I added the ability to | move the HTTP proxy into the client to enable e2ee. Most | recently I implemented raw TLS all the way. There are | tradeoffs: | | Pros: | | * End-to-end encryption. | | * Simplicity. You just need to peek at the SNI and look up what | TCP tunnel to pipe into. | | * Things like WebSockets and other hop-by-hop requests don't | need to be implemented by the proxy. | | Cons: | | * You lose the ability to do compression/caching/CDN/etc at the | server. | | * You can't support new protocols like HTTP/2, HTTP/3, etc at | your server because it only understands TCP wrapped in TLS. | | In practice, I think these tradeoffs are totally worth it for | self-hosting. I've found performance to be great for my | purposes. 
| | As for encrypted SNI (ESNI, now wrapped into encrypted client | hello, ECH), I'm pretty sure it will be implemented in a tiered | approach like you surmise. So in my case the boringproxy server | will have the keys to decrypt the client hello, but the origin | servers will still control the actual TLS decryption. | | [0]: https://boringproxy.io | anderspitman wrote: | TL;DR don't trust the values of headers you don't control. If | you're not sure whether you control them or not, read the rest of | this article. | rhizome wrote: | Right? "Don't depend on anything fakeable." | freedomben wrote: | I think a part of the problem though is that "control" is not | always well defined. I've seen headers used that were | "controlled" because a proxy sat in front and did filtering. | But then the app got moved behind a different proxy, and what | they controlled was no longer such. | | But, if you don't trust any headers at all, it's tough to do | much with HTTP. | nickjj wrote: | That was one of the most comprehensive posts I read on this | topic, well done. | | One addition I'd suggest is to add going into the implications of | enabling "client port preservation" with an AWS ALB. This can be | a death sentence if you decide to turn it on without intimate | knowledge on how all connected apps are trying to figure out the | real IP. It appends the client port to the IP with a colon in | X-Forwarded-For and if you happen to read IPs without expecting | this then it can potentially cause IP values to appear invalid. | It's super app dependent on what would happen but it has a lot of | side effect potential. | kingforaday wrote: | I also would like to express my appreciation for the post and | hope the Author will see this comment. Thanks! | adam-p wrote: | I didn't know about that option. Here's a link for anyone who | wants details: | https://docs.aws.amazon.com/elasticloadbalancing/latest/appl... | | That's pretty bad. It makes the already perilous header even | worse. 
I'll add a note or addendum about it in the post when I get | a chance. | | (@kingforaday: You're welcome!) | iancarroll wrote: | Parsing this header is such a nightmare. HAProxy had a CVE a | while ago where they stopped parsing the header if they hit a | quote in the middle, which allowed you to forge the right-most | IP. | | As a result I had a long conversation with AWS where I told | them it was ridiculous that ALBs allowed garbage to be inserted | in the only header that contains this important information... | suffice to say they did not care. | metanonsense wrote: | Making decisions based on HTTP headers is always an opportunity | to be surprised. A few years back we had implemented rudimentary | access control on a server based on X-Forwarded-For. In front of | that server, we had two chained instances of haproxy, one version | 1.x, the other 2.x. Little did we know that haproxy 1 and 2 | handled headers completely differently wrt case-sensitivity. So we | ended up not only not correctly removing untrusted headers, we | also got two different headers with every request (mixed-case and | lower-case) that our server handled differently (which was a | bug). | laurent123456 wrote: | I think one issue is that we expect to rely on a standard here, | while only a proprietary solution, specific to the infrastructure | seems to make sense. | | Basically the first device in the infrastructure, whether it is a | load balancer, a cache, etc. should set (or overwrite, if it's | been spoofed) a custom proprietary header with the client IP. And | that's what you can rely on - either it's set and you have the IP | or it's not and you can't really trust anything else. | | He mentions Azure and a few others doing this and that sounds | like the only correct approach. 
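The set-or-overwrite behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the header name `X-Example-Client-IP` and the function name are made up, headers are modelled as a list of (name, value) pairs, and `peer_ip` stands for the address of the TCP connection as seen by the first device in the infrastructure.

```python
# Hypothetical header name; pick one that is specific to your own infrastructure.
CLIENT_IP_HEADER = "X-Example-Client-IP"


def overwrite_client_ip(incoming_headers, peer_ip):
    """Drop every client-supplied copy of the header, then set it once.

    incoming_headers: list of (name, value) pairs as received from the client.
    peer_ip: address of the TCP peer observed by the edge device.
    Returns a new list; the input list is not modified.
    """
    wanted = CLIENT_IP_HEADER.lower()
    # Header names are case-insensitive, so compare lowercased names;
    # otherwise a mixed-case spoofed copy could survive next to ours.
    kept = [(name, value) for name, value in incoming_headers
            if name.lower() != wanted]
    kept.append((CLIENT_IP_HEADER, peer_ip))
    return kept
```

Removing by lowercased name rather than exact match is a deliberate choice here, since the haproxy story earlier in the thread shows how a case mismatch can leave two differently-cased copies of the same header in a request.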
| | (Although there's still an issue when the infrastructure is | changed, and the application code is not updated - in that case | the old header might no longer be set, and could now be spoofed) | marcosdumay wrote: | The standard is not really the problem. The standard existing | allows you to use off the shelf middleware and widespread | libraries to solve your problem. | | The problem here is that your border box should remove any such | header that comes from an untrusted network, and not append to | them. | adam-p wrote: | I agree. However, you need to be _really_ careful about using | the IP provided by your CDN, etc. For example, Akamai sets | `True-Client-IP`... but doesn't overwrite it if it's present. | By default, Fastly does the same with `Fastly-Client-IP`. (Yes, | Azure is better, but make sure you pick the _right_ special | header.) | | Minefields within minefields. | jrockway wrote: | It is all ridiculously complicated. Envoy's documentation makes | it pretty clear what concessions are made: | https://www.envoyproxy.io/docs/envoy/latest/configuration/ht... I | link the documentation not for the informative aspect, but just | for you to gawk at the length and how worried the author is about | it all ;) | | I personally prefer that the load balancer speak "proxy protocol" | to the front proxy, and then the front proxy simply make a | decision as to what the external IP is and relay that to upstream | applications with a proprietary header (x-envoy-external- | address). Upstream applications that really want to be careful | about the external address then verify the signature of envoy's | TLS certificate when the connection comes in. (I use some mTLS | for my homelab stuff, but don't have any software that cares to | check this. I use Envoy's rate limiter rather than application | specific rate limiting, and use Envoy's logs to get the IP | address and the x-b3-trace-id header to correlate access logs | with application logs.) 
| | Proxy protocol is regrettably problematic from time to time, | however. By default, Kubernetes tries to be "smart" about | routing. If you have a LoadBalancer with internal IP | 10.123.234.56 and external IP 54.43.32.21, traffic originating in | the cluster with a destination address 54.43.32.21 does not | transit the cloud provider's load balancer, it just gets | rewritten to the internal IP address 10.123.234.56. This means | that it doesn't use the proxy protocol, and the front proxy | rejects the connection. This is generally not a problem, but will | be annoying if you host an internal container registry in- | cluster, because containers will be named for the external | container registry and will thus resolve to the external IP | address. The connection will skip your cloud provider's load | balancer which adds the proxy header, and your front proxy will | reject the connection, causing failing pulls. Most people will | never hit this, because their cloud provider just provides a | container registry (when I set up all my container stuff, my | cloud provider of choice didn't offer this; they do now). But if | you ever think internal applications will intend to use the | public Internet to connect to other internal applications, watch | out. This can come up if you do OIDC things internally, for | example. (Apps will want to grab the .well-known configuration | keys and JWKS key material from the external address that it will | eventually redirect clients to. You will be confused when the | configuration fails to load.) | | In conclusion, what a mess. | adam-p wrote: | That Envoy doc is... impressive. I'll find a place to link to | it in the post. | askura wrote: | Good find OP. That was a real breakdown there worth reading. 
| beardog wrote: | Years ago, I found that a disposable email service Guerrilla Mail | was adding IPs from x-forwarded-for to outgoing emails blindly, | which could have been abused to make it look like any IP you | wanted had sent an email | | https://voidnet.tech/chaos/blog/do-not-trust-x-forwarded-for... ___________________________________________________________________ (page generated 2022-03-05 23:00 UTC)