[HN Gopher] How CDNs Generate Certificates
___________________________________________________________________
How CDNs Generate Certificates

Author : ordiblah
Score  : 77 points
Date   : 2020-06-25 20:04 UTC (2 hours ago)

(HTM) web link (fly.io)
(TXT) w3m dump (fly.io)

| ancarda wrote:
| Is anyone else feeling quite sad reading this article? ALPN being used because only 80/443 are realistic these days, middleboxes causing the TLS handshake to have padding so it's not misinterpreted as an ancient protocol (SSLv2).
|
| It feels like the Internet is so fragile.
| [deleted]
| profmonocle wrote:
| ALPN would make sense for something like HTTP2 even if you didn't have the problem of ports being blocked. If HTTP2 had its own port, clients would have to make multiple TCP connection attempts for each host they connect to.
| mrkurt wrote:
| I have the opposite feeling: the clever "hacks" people use to build very useful stuff that bypasses most problems with legacy infrastructure are pretty exciting. It's very much like watching a complex organism evolve into something you never really could have imagined 8000 iterations ago.
| SahAssar wrote:
| Most of this could have been avoided by using DoH and SRV records for HTTP/HTTPS. I still don't understand why SRV records are not supported for HTTP/HTTPS in browsers.
| ancarda wrote:
| I remember looking into why A/AAAA is still used over SRV, and it would seem performance is one of the big concerns; browsers do not want to make more DNS lookups than necessary.
|
| I think they'd end up with 4 lookups: A, AAAA, SRV (_http2._tls), and SRV (_http._tls).
|
| Though perhaps you are suggesting DoH could mean the resolver also returns SRV records if you request A or AAAA? i.e. proactively point out there's an HTTP server?
| SahAssar wrote:
| IIRC Chrome is now racing QUIC and HTTP(2) connections instead of doing negotiation/upgrade to detect QUIC support.
| So this argument (if true) has fallen apart just a few years later.
| jiggawatts wrote:
| This debate comes up a lot, and it's hilarious how misguided it was.
|
| I regularly work with load-balancers such as Citrix ADC (NetScaler) or F5 BIG IP. These do DNS-based load-balancing, dynamically returning "A" records to the browsers so that they can get the "single working IP address" they're expecting. The browsers don't try very hard to fail over to secondary IPs because this is the established standard architecture, but they don't need to because of this common setup.
|
| Sounds like an optimal solution, right? It does at first glance anyway, as long as you ignore the eye-watering price tag on those load balancer boxes.
|
| The subtle but critical issue is that by returning "A" records, the load balancers have to use a short time to live (TTL)! This is because there's a trade-off: You can have fast failover, OR long-lived DNS caching. _With A records you can't have both!_
|
| Typical response TTL times are 5-30 seconds, 5 minutes tops if you hate your users. This means that many browsers will be forced to repeatedly re-query the DNS servers on _every page load_ for typical end-user workflows. It also means that for all but the biggest, most popular sites, the ISP DNS cache does practically nothing for these records.
|
| Meanwhile with SRV records the TTL times can be much higher, hours even. This is how Active Directory works, for example: all of the Domain Controllers add themselves to various SRV records so that if you query "_ldap._tcp.dc._msdcs.test.com" you get back all the DCs. These records include priorities and weightings, so you can pull tricks like incrementally demoting a DC or prioritising the shiny new one.
|
| If you watch the AD connection traffic in WireShark, it's incredible.
| It very quickly steps through alternate services and then reorders the successful hits in front of the failures so that subsequent queries are lightning fast. It is astonishingly tolerant of partial networking failures, yet still fast to connect despite that!
|
| The key mistake made by the original DNS design working groups was having SRV records return a list of host names instead of a list of IP addresses.
| mrkurt wrote:
| We actually run into problems that are similar to what you'd have with SRV records.
|
| Fly.io apps can define different service "handlers" (like TLS and HTTP). If you want to, you can accept TCP connections and bypass our logic. Which is great and flexible.
|
| The problem is, when someone is deploying a new version of their app where they _change_ one of those things, we have to be really careful about how we (a) load balance and (b) decide to do things like TLS. If we're not careful we can end up sending the wrong type of connection to a new VM that's expecting something else.
|
| SRV records sound like they'd have it worse. If you do a DNS lookup to detect something like http2, the IP you connect to _can't_ do anything else. It's much simpler / safer to negotiate stuff like that at connection time.
| SahAssar wrote:
| That all assumes that you use the same port for all those things, right? Which is one of the points of SRV, to not have to squeeze everything into one canonical port. With a SRV record you could route https to whatever port suited you and rotate that out when rolling out changes.
| lomkju wrote:
| Can you tell me why I should choose Fly instead of AWS?
|
| micro-2x shared 512MB $0.000003044 $8
| VS
| t3a.nano 2 Variable 0.5 GiB EBS Only $0.0031 per Hour
|
| Am I missing something? Because seeing the pricing I still feel AWS is cheaper.
| mrkurt wrote:
| It's probably better to compare Fly with Lambda or Fargate.
| It's not really meant to be cheaper than AWS, though; the real value is being able to run app servers all over the world without spending time maintaining servers or wrangling AWS.
| lomkju wrote:
| Makes sense. Compared with AWS Lambda pricing, fly.io is way cheaper. Will give it a try :)
| mholt wrote:
| Anyone looking to automate certificate management at any sort of scale should read this: https://docs.https.dev/acme-ops
|
| ... and use Caddy to do the heavy lifting. (I'm biased, yes. But the linked doc is multi-authored and applies to every sysadmin or developer who needs to manage certs, regardless of your software choice.)
| awinter-py wrote:
| woo, hadn't heard about firecracker
| mrkurt wrote:
| It's seriously the bomb.
| tptacek wrote:
| Firecracker is f'ing awesome. I have a lot of notes to write up about it. I know this isn't how products actually succeed in the real world, but I'll be honest and say that Kurt had me at Fly with "WireGuard and Firecracker".
|
| (For the unfamiliar reader: Firecracker is a micro-vm system that sits sort of in between a fully virtualized host, like an EC2 instance, and a container like Docker; you get the security isolation of a hypervisor but the speed/simplicity of Docker. It's the engine that powers AWS Lambda and Fargate. The Usenix paper is a pretty great read, and the code [it's all in Rust] is simple and easy to follow.)
|
| https://www.usenix.org/system/files/nsdi20-paper-agache.pdf
| AlphaSite wrote:
| It's fairly similar in concept to: https://vmware.github.io/vic/ for vSphere
|
| Disclaimer: interned with the team
| tptacek wrote:
| Say more, if you can! I'm not at all familiar with that project. Thanks!
| [deleted]
| tialaramex wrote:
| It would be interesting to see stats from the CAs about which of the Blessed Methods is most popular. (This article is about Let's Encrypt using tls-alpn-01 which is an implementation of 3.2.2.4.10 "TLS Using a Random Number").
| Doubtless Fly aren't the only people doing tls-alpn-01 in bulk but we don't have a good overview as far as I'm aware.
|
| In principle they can all generate those statistics because they (are supposed to) log enough information to identify what went wrong when, inevitably, something is misissued. Logically that also includes at least which method was used to verify domain authorization or control.
|
| One of the things wrong at Symantec is that it turns out some of the records were notionally kept at CrossCert, a separate Korean company. CrossCert simply did not keep any records (or if it did they were in such disarray that it seemed less likely to attract retribution by refusing to disclose them) and Symantec had seemingly never checked.
|
| Knowing which methods are popular with Subscribers, and whether that varies considerably between CAs, would be valuable in trying to figure out how more of the worst Blessed Methods can be deprecated or improved, and who we need to be talking to about that.
|
| For example, if Let's Encrypt is doing almost all the 3.2.2.4.19 ("Agreed Upon Change to Website - ACME"), then there's no point ragging on other CAs for the shortcomings of relying on plaintext HTTP in this method. Or maybe DigiCert are doing a lot of 3.2.2.4.15 ("Phone Contact with Domain Contact") so they are the people to talk through any proposed improvements around stuff like leaving a Voice mail.
| tptacek wrote:
| Part of the last few weeks involved me learning Rust and using it in anger (if hooking nfqueue up to tokio counts as "in anger") so if you'd like to irritate the hell out of 'pcwalton, feel free to ask me Rust questions.
| dchest wrote:
| Can you really read Rust code after learning it or does it still look like a bunch of squiggles?
| NetOpWibby wrote:
| > Obviously, to do stuff like this, you need to generate certificates. The reasonable way to do that in 2020 is with LetsEncrypt.
| > We do that for our users automatically, but "it just works" makes for a pretty boring writeup, so let's see how complicated and meandering I can make this.
|
| This delighted me.
| dochtman wrote:
| Exciting! Are you doing this in your role as Latacora helping out startups with security challenges? (Update: apparently not https://twitter.com/tqbf/status/1276212163582070785)
|
| How is the Fly proxy implemented? Are you using rustls and/or any of the available ACME crates?
|
| I've been wanting to implement tls-alpn-01 support for rustls (although it might be possible to do this just by mutating the ServerConfig over time).
|
| Also interested to hear your general impressions of Rust so far (I think I read some Twitter grumbling...).
| tptacek wrote:
| I'm full-time at Fly. I'll let Jerome answer the fly-proxy question, since it's his code and I wouldn't want to inadvertently take credit.
|
| I think I came across as grumbling about Rust when my real perspective was much more subtle. My take on Rust so far is that it has been, for me, a vindication of a lot of decisions the Go team made, because I've been directly exposed to some of the downsides of the opposite decisions. But, while that sounds like a critique of Rust, it's not! Rust is the way it is for real reasons: zero-cost abstractions and no runtime GC, which are, right now, requirements for some application domains.
|
| For me, right now, writing in Rust feels almost identical to how writing in C++ felt 15 years ago. But I'll keep writing in it, and it'll get faster for me. We're a Rust-on-the-data-plane shop!
| JoshTriplett wrote:
| If you run into issues in Rust that you believe might be signs of a need for language improvements, please feel free to raise them. I'm happy to help.
| dochtman wrote:
| What I perceived as grumbling:
|
| "I absolutely understand what y'all like so much about Rust, but I have to say that as an auditor, my blood pressure drops and my shoulders relax the moment I switch from reading a Rust project to reading a Go project."
|
| https://twitter.com/tqbf/status/1260678152084480008
| tptacek wrote:
| As long as we're clear that I'm not saying "Rust is less secure than Go", which is not _at all_ what I meant. I just meant that it's much easier for me to read Go code.
|
| (I will however miss match expressions when I return to my home planet.)
| dochtman wrote:
| I'd be very curious to hear if there are specific bits about the Rust language that you think make it harder to audit or whether (so far) it's just the lack of experience.
| jeromegn wrote:
| Hey there, Fly co-founder here!
|
| Fly's proxy uses a mix of tokio, hyper and rustls. We don't need to use a crate that handles ACME because we're processing all the validation and certificate authorizations from a centralized, boring, Rails application.
|
| We've had to submit a PR to the rustls project a few months ago to handle different ALPNs. Instead of resolving a certificate only from an SNI, the crate now provides the full ClientHello which contains negotiable ALPNs. With that information you can respond to the tls-alpn-01 challenge.
___________________________________________________________________
(page generated 2020-06-25 23:00 UTC)
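
Editor's sketch of the last point in jeromegn's comment: choosing a certificate from both the SNI and the offered ALPN protocols, so that a tls-alpn-01 validation connection gets the challenge certificate while ordinary traffic gets the regular one. This is not Fly's code and not the rustls API. `ClientHello`, `CertResolver`, and the string "certificates" are hypothetical stand-ins written in plain Python; the only detail taken from the standard is the `acme-tls/1` protocol ID, which a tls-alpn-01 validator offers as its sole ALPN entry (RFC 8737).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# ALPN protocol ID that a tls-alpn-01 validator offers (RFC 8737).
ACME_TLS_ALPN = "acme-tls/1"


@dataclass
class ClientHello:
    """Hypothetical stand-in for the ClientHello fields a resolver sees."""
    server_name: Optional[str]
    alpn_protocols: List[str] = field(default_factory=list)


@dataclass
class CertResolver:
    """Hypothetical resolver; certificates are plain strings for illustration."""
    regular_certs: Dict[str, str] = field(default_factory=dict)
    challenge_certs: Dict[str, str] = field(default_factory=dict)

    def resolve(self, hello: ClientHello) -> Optional[str]:
        # Without SNI there is no way to tell which hostname is wanted.
        if hello.server_name is None:
            return None
        name = hello.server_name.lower()
        # A validation connection offers exactly one protocol: acme-tls/1.
        if hello.alpn_protocols == [ACME_TLS_ALPN]:
            return self.challenge_certs.get(name)
        # Anything else (h2, http/1.1, no ALPN) is ordinary traffic.
        return self.regular_certs.get(name)


resolver = CertResolver(
    regular_certs={"example.com": "regular-cert"},
    challenge_certs={"example.com": "challenge-cert"},
)
browser = resolver.resolve(ClientHello("example.com", ["h2", "http/1.1"]))
validator = resolver.resolve(ClientHello("Example.com", [ACME_TLS_ALPN]))
unknown = resolver.resolve(ClientHello("other.test", [ACME_TLS_ALPN]))
print(browser, validator, unknown)
```

A resolver that only received the SNI could not tell the browser and the validator apart, which seems to be why the comment describes passing the full ClientHello to the resolver.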