[HN Gopher] A New Life for Certificate Revocation Lists ___________________________________________________________________ A New Life for Certificate Revocation Lists Author : grappler Score : 77 points Date : 2022-09-07 18:07 UTC (4 hours ago) (HTM) web link (letsencrypt.org) (TXT) w3m dump (letsencrypt.org) | OrvalWintermute wrote: | Lots of this information is completely bogus. | | > But because OCSP infrastructure has to be running constantly | and can suffer downtime just like any other web service, most | browsers treat getting no response at all as equivalent to | getting a "not revoked" response. This means that attackers can | prevent you from discovering that a certificate has been revoked | simply by blocking all of your requests for OCSP information. | | This is false. Non-nonce OCSP is inherently cacheable and | replayable. That means you can have your own HA setups with HA | OCSP clients talking to HA OCSP servers (repeaters & responders) | backed up by caching in commercial CDNs, and local caching | servers like Blue Coats. | | Likewise, OCSP stapling helps remove much of the performance and | privacy issues, pushing it to the serving webserver. | | Beyond this, you can just use squid or localized HA OCSP | services, and do some DNS rewriting to make it even more highly | available. | | Nonced OCSP is the rare beast that needs to be online, but there | are HA OCSP setups with smart OCSP clients. | | > To help reduce load on a CA's OCSP services, OCSP responses are | valid and can be cached for about a week. But this means that | clients don't retrieve updates very frequently, and often | continue to trust certificates for a week after they're revoked. | | Trust Stores are inherently manageable. The lag around revocation | completely depends on CRL/OCSP publishing, and client update | requests.
| | > And perhaps worst of all: because your browser makes an OCSP | request for every website you visit, a malicious (or legally | compelled) CA could track your browsing behavior by keeping track | of what sites you request OCSP for. | | This is why we advocate OCSP Stapling and use of CDNs for OCSP & | CRL cache hits. Furthermore, localized OCSP mentioned above | decentralizes this even further. | | > So both of the existing solutions don't really work: CRLs are | so inefficient that most browsers don't check them, and OCSP is | so unreliable that most browsers don't check it. We need | something better. | | CRLs & OCSP work pretty well when actually supported. | | When DigiNotar happened I polled every single publicly available | commercial CA - strangely, a ton of them were not producing any | CRL/OCSP at all, putting clients into a fail-open mode. | | Lesson of the story: don't blame a protocol for lazy CAs, bad | implementations, or the lack of operational excellence from many | vendors. | phlip9 wrote: | I can't wait for some big site certs to false positive in a CRL | bloom filter and cause a big outage : ) | coffee-- wrote: | CRLite builds a cascade of Bloom filters to ensure no false | positives. | | For Firefox end users, a certificate only gets tested against | the filter cascade if it is known to have been included in its | creation (by examining the embedded SCT timestamps). If it's | not definite that the certificate was used to generate the | filter, then Firefox reverts to OCSP. | | (I'm one of the authors of CRLite in Firefox: | https://insufficient.coffee/2020/12/01/crlite-part-4-infrast... | ) | agwa wrote: | J.C. has already answered for Firefox; for Apple's system | (valid.apple.com), if there's a bloom filter hit, the client | double checks via OCSP before failing the connection.
| | Source: a WWDC 2017 talk which unfortunately I can't find | online anymore | xoa wrote: | I was kind of surprised that OCSP stapling didn't get any mention | at all. I thought that was a major improvement in both resource | cost and privacy? Since the time stamped response is proxied by | the site operator rather than going directly to the CA, the load | is almost entirely switched to the site itself, and the site | itself is the only one who knows a given IP is asking for it, | which is fine because obviously the site knows a given browser is | connecting to it anyway. It's decentralized again. I vaguely | remember at one point there was a major limitation of only | supporting a single OCSP response at a time but I thought that | was dealt with via a later RFC and then entirely obviated as an | issue by TLS 1.3. | | Did something happen there or some other significant issue get | discovered? I'm curious why the move back to CRLs (albeit | improved) vs must-staple. It seemed like a reasonably elegant and | straightforward solution that fit the web pretty well. | mcpherrinm wrote: | must-staple requires additional effort from server operators, | which is very challenging to roll out. This new scheme can be | implemented by the browsers and CAs together, which is vastly | simpler than having every webserver in the world turn on OCSP | stapling. | | There isn't any particular force driving OCSP stapling adoption: | Browsers can't turn it on until it is ubiquitous. CAs don't | want to enforce must-staple because their users will need an | enormous amount of help rolling it out.
Site operators don't | care if browsers are fetching OCSP (especially since Chrome and | Edge don't) | champtar wrote: | I also love OCSP stapling but there are some limitations: - | the webserver needs to implement it - the admin needs to enable it | - the webserver needs internet access | | Maybe they saw with some telemetry that very few websites | actually enable OCSP stapling and decided to implement a fix | that covers all certs and can really be deployed | mholt wrote: | Yes, I noticed this too and asked the author about it [0]. | | IMO, OCSP stapling _is the best overall solution_ until | certificate lifetimes are shorter ( < 7 days). | | As a server developer I'm worried that the focus on independent | CRLs will make it difficult to automate certificates in the | face of revocation. Currently, Caddy staples OCSP for all | certificates by default, caches the staples, and refreshes them | halfway through their lifetime. Works great. Every server | should do this. And when an OCSP response is discovered to be | Revoked, Caddy automatically replaces the certificate. Works | great. Every server should do this. | | If every browser is independently going to decide which | certificates to distrust, now I am not sure of a good, | authoritative way to determine "revoked" and then replace | certificates automatically. I'm worried this will hurt the TLS | ecosystem unless we answer those questions first. | | Actually, let's just shorten certificate lifetimes and be done | with it already. | | Main blockers to short cert lifetimes: | | - CA's uptime determines Web's uptime. | | Main solution: | | - Multiple redundant ACME CAs. If one goes down, try another. | (This is what Caddy already does.) | | [0]: https://twitter.com/mholt6/status/1567559325949763588 | xoa wrote: | Replying to this as it's on top, and first thanks for the | reply (and sibling replies as well).
The challenges of | getting servers to update and implement being a practical | roadblock makes sense, I was more surprised just to not see | it mentioned at all. I'd have been fully prepared for a few | sentences along the lines of "this would be ideal if doing it | from scratch but it'd be hard to get everyone to go along | now, the reverse perils of decentralization". What you write | is also interesting. | OrvalWintermute wrote: | I too found this troubling. | | Mainly because we are not talking about validation | performance, and OCSP stapling is an excellent performance | fix. | | Not to mention all the privacy issues that OCSP stapling | really fixes. | GauntletWizard wrote: | OCSP Stapling is incredibly stupid. OCSP Stapling is either | ignored because it's not present, or it's effectively just an | override for the NotBefore and NotAfter fields. There's never a | reason as a server operator to present an invalid NotAfter, and | supporting it adds substantial surface area to client libraries, | in the area that's already most prone to | mistakes (the recursive descent to find a certificate chain) | | It's much saner to issue a new, shorter-lived cert. Reduce | certificate lifetimes to match whatever you'd use OCSP stapling | for. Continue to use ACME. Figure out rotation lifetimes such | that I can sleep at night (i.e. they need to be at least 2 | days, so that if the rotation fails I've still got time to wake | up and start working). Work on building client and server | tooling to make it easy to accept rotated certificates (i.e. | reload your client certificates on SIGHUP).
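The "reload your client certificates on SIGHUP" pattern suggested above can be sketched roughly as follows. This is an illustrative pattern, not any particular server's implementation; the `load` callable (which in a real server would build e.g. an `ssl.SSLContext` from hypothetical `fullchain.pem`/`privkey.pem` paths) is an assumption for the example.

```python
import signal
import threading

class HotReloader:
    """Swap in fresh TLS material on SIGHUP without restarting the server.

    `load` is any zero-argument callable that reads cert/key material
    from disk and returns whatever object the server should use (for a
    real server, something like a configured ssl.SSLContext).
    """

    def __init__(self, load):
        self._load = load
        self._lock = threading.Lock()
        self._current = load()    # initial load at startup
        self.reloads = 0

    def on_sighup(self, signum=None, frame=None):
        fresh = self._load()      # build the new object first...
        with self._lock:
            self._current = fresh # ...then swap it in atomically
            self.reloads += 1

    def install(self):
        # Register the handler; `kill -HUP <pid>` now triggers a reload.
        signal.signal(signal.SIGHUP, self.on_sighup)

    @property
    def current(self):
        with self._lock:
            return self._current
```

A server would call `install()` once and read `current` per connection; rotation tooling then only has to write the new files and send SIGHUP.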
| OrvalWintermute wrote: | > The advantage of CRLs isn't that they can be used offline, | it's that it allows your security team to burn certificates | rather than waiting for the CA to do so. You should always | subscribe to your CA's CRL too, but you should have a CRL for | your own internal use, too. | | This is incorrect conventionally | | CRLs are signed by the Originating CA in the conventional | trust model. | | Most CAs won't give you a certificate with the ability to | generate your own CRL sharing the same trust, because it can | be weaponized to cause a denial of service. | | This then brings up non-conventional trust models, and | Validation Authority as Co-equal to CA situations for | validity which can be significantly more challenging, and | requiring VA certificate insertion into every Relying Party, | just to name a few things, not to mention all the security | concerns. | GauntletWizard wrote: | I'm aware. That trust model is inherently broken, and it's | not followed anyway. In practice, any CA you trust can sign | a revocation for any certificate (the serial numbers are | all grouped together) - And that's the way it should be. | You should not wait for the issuing CA to get its act | together, you should burn a certificate as soon as anyone | distrusts it, and if you have a rogue CA that starts | burning Google.com and other important sites, it's better | that they burn them (and take you offline) than that they | issue false certs (and leak all your data); It's also far | more obvious. | | https://www.imperialviolet.org/2014/04/19/revchecking.html | | Revocation checking is useful for your security team to | blacklist sites. That's the only useful use. | aaomidi wrote: | OCSP stapling increases how large the first response from the | server is, increasing the time to first useful byte. | tinus_hn wrote: | Is certificate revocation really so common that a CRL from a | normal authority is gigabyte-sized?
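For scale, the back-of-envelope arithmetic using figures that come up in this thread (~41 bytes per revoked serial observed in one CRL, ~8 million revocations per year, and Let's Encrypt's ~200 million active certificates) works out as follows. A sketch only; real per-entry sizes vary with CRL entry extensions.

```python
# Rough CRL payload sizes from per-entry size x number of revocations.
BYTES_PER_ENTRY = 41  # observed in one CRL; varies with entry extensions

def crl_size_mb(revoked_certs, bytes_per_entry=BYTES_PER_ENTRY):
    """Approximate CRL payload size in megabytes."""
    return revoked_certs * bytes_per_entry / 1e6

print(f"{crl_size_mb(8_000_000):.0f} MB")           # a year of routine revocations
print(f"{crl_size_mb(200_000_000) / 1000:.1f} GB")  # mass-revocation worst case
```

That lands on ~328 MB for a year of routine revocations and ~8.2 GB for revoking every active cert, consistent with the 300+ MB and "over 8 gigabytes" figures discussed in this thread.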
| michaelt wrote: | According to [1] "On a typical Monday, we would expect to see a | total of around 22,000-30,000 SSL certificates being revoked | over the course of the day." i.e. ~8 million per year. [3] | meanwhile says "1.8 million certificates are revoked per year" | | Looking at a random CRL [2] it's 41 bytes per revoked | certificate. | | 8 million records at 41 bytes per record would be 300+ | Megabytes. And a cautious CA might keep revoked certificates in | their CRL for more than a year. | | So if an event like Heartbleed happened again and uncommonly | large numbers of certificates needed to be revoked, the | gigabyte range is within the bounds of possibility. | | [1] https://news.netcraft.com/archives/2014/04/11/heartbleed- | cer... [2] http://crl3.digicert.com/Omniroot2025.crl [3] | https://www.grc.com/revocation/crlsets.htm | WorldMaker wrote: | > And a cautious CA might keep revoked certificates in their | CRL for more than a year. | | A CA does need to keep a revoked certificate in the CRL | until at least the natural expiration date on the | certificate, so CAs that issue certificates with expiration | dates 5 or more years out may need to keep entries in their | CRLs for much longer than a year. | roblabla wrote: | The problem is, you need to design for the worst-case scenario | here, because when the worst case does happen, the last thing | you need is for your revocation system to not work because it | doesn't scale. | | So, no, it's not common. But it's necessary. | er4hn wrote: | The DoD uses x509, and CRLs, in those Common Access Cards (CAC) | everyone in the org has. Since this covers most of the armed | forces, that's fairly large. As of 2012[1] this was around 200 | MB of CRLs and was only expected to get larger over time. | | [1] https://dl.dod.cyber.mil/wp-content/uploads/pki- | pke/pdf/uncl...
- Pg 7, under Local Cache | OrvalWintermute wrote: | As mentioned above, DoD uses smart strategies around | certificate validation. | | CDNs + Localized OCSP + Tactical OCSP + Smart OCSP Clients + | Network caching + OCSP & CRLs on the filesystem, just to name | a few (not including delta CRLs and other solutions). | | The DoD OCSP Responders are configured to share hash sets | with downstream OCSP Responders & Repeaters, which makes | promulgation particularly easy. | schoen wrote: | No, not at all. | | The bad case would be if Let's Encrypt discovers a problem | (like a security flaw or implementation error in a validation | method, as happened with the TLS-ALPN-01 method before) and | concludes that it has to mass-revoke a very large number of | affected certificates. | luhn wrote: | > If we had an incident where we needed to revoke every single | one of those certificates at the same time, the resulting CRL | would be over 8 gigabytes. | | I don't know much about this stuff, so apologies if this is a | silly question: | | If you needed to revoke all the certificates, couldn't you just | revoke the handful of intermediate certificates and call it a | day? I assume you'd want to revoke them anyways if there's a | situation severe enough that merits revoking 200 million | certificates. | mcpherrinm wrote: | Yes, it is likely we'd revoke an intermediate if a significant | fraction of all issued certs had to be revoked. But we do want | our revocation infrastructure to support revoking all certs if | needed. | | We have a set of backup intermediates that can be activated if | we had to revoke the active ones for any reason, so the | disruption wouldn't be too high hopefully. | | (I work at Let's Encrypt, but this is my own opinion and not | that of my employer) | mholt wrote: | It's a good question, if I read you rightly.
| | There is no "handful" of intermediate certificates -- there are | precisely 4 (for Let's Encrypt [0]) and they are essentially | online root certificates. And if those certificates aren't | even compromised, revoking them would only harm the ecosystem. | | [0]: https://letsencrypt.org/certificates/ | shallichange wrote: | According to the link you posted, there are intermediate CAs. | Those could be revoked and effectively revoke all the end | entity certificates. | jiripospisil wrote: | > This means that they're often very large - easily the size of a | whole movie. | | Couldn't they just use actual units? This says absolutely | nothing. | WorldMaker wrote: | The article does later state Let's Encrypt's own expected worst | case is 8 GBs in one CRL file in the hopefully unlikely | scenario that every unexpired certificate they manage was | revoked. | OrvalWintermute wrote: | Merkle Hash Trees are the well-known solution for this, | whenever they decide to update the protocols (was software-patent | encumbered until 2017) | TheSpiceIsLife wrote: | One Olympic swimming football bus tree worth of data. | | When I was reading the thread the other day about a tree's worth | of oxygen from the MOXIE experiment, I couldn't help thinking: | _why not just use a term everyone is familiar with, litres per | minute of air_. | | Enough oxygen to sustain an adult at rest for x minutes. | | I'm beginning to suspect there's an in-joke with science / tech | writers about strained analogies. | | And I'm not _in_. | JohnFen wrote: | > I'm beginning to suspect there's an in-joke with science / | tech writers about strained analogies. | | Which has been going on for longer than I've been alive. The | idea (I assume) is to take a large number that's hard to | conceive and turn it into something everyone can relate to. | | But inevitably, they choose things that few can actually | relate to, or things that are so vague/variable as to be | meaningless. It just adds more confusion all around.
| | It has to be an intentional joke. | ok_dad wrote: | Browsers and CAs still deciding for the user what's best with yet | another centralized database that we have to "trust" is complete | and implemented correctly. I don't see how this is so hard: just | let us download the CRLs. Maybe add them on a torrent-like system | so they can be shared (and validated) peer-to-peer, or at least | have hundreds or thousands of mirrors that can provide that data. | All this "it's too big and the user would have to download it" | bullshit is just pushing us more and more into "computing as a | service"; aka: we OWN your digital life. | | Edit: I was partly wrong, this is a good thing because you CAN | download the CRLs now (see comments below here for info), whereas | previously you couldn't. Your browser still probably won't | support a full CRL download, but I could be pleasantly surprised. | agwa wrote: | > I don't see how this is so hard: just let us download the | CRLs | | Previously, you couldn't do this, because not all CAs published | CRLs. | | Beginning October 1, you _will_ be able to just download the | CRLs, because Apple and Mozilla are requiring it. | | It's therefore unclear what your beef is. | ok_dad wrote: | > Beginning October 1, you will be able to just download the | CRLs | | Correction: Apple and Mozilla will be able to just download | the CRLs. Not me. The link in the post SPECIFICALLY says us | common plebes don't get that right. | agwa wrote: | Where does the post say that? | | If you think it's because the URLs will be disclosed in the | CCADB, note that the contents of the CCADB are published | here: https://www.ccadb.org/resources | | Specifically, the CRL URLs can be found in this CSV file: | http://ccadb- | public.secure.force.com/ccadb/AllCertificateRec...
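Pulling CRL URLs out of that CSV needs nothing more than a CSV parser. A minimal offline sketch follows; the sample rows and the column name used here ("Full CRL Issued By This CA") are illustrative assumptions, so check the header of the real CCADB export before relying on it.

```python
import csv
import io

# Sample rows standing in for a CCADB-style CSV export; the column name
# below is an assumption for illustration, not a guaranteed schema.
SAMPLE = """\
Certificate Name,Full CRL Issued By This CA
Example Intermediate R3,http://example.invalid/r3.crl
Example Intermediate E1,http://example.invalid/e1.crl
Example Root (no CRL),
"""

def crl_urls(csv_text, column="Full CRL Issued By This CA"):
    """Return the non-empty CRL URLs from one column of the CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[column] for row in reader if row.get(column)]

print(crl_urls(SAMPLE))
```

A daily cron job, as described in the comment below this, would fetch the real CSV and feed its text through the same parsing step.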
| ok_dad wrote: | I was pretty sure this section meant what I said but | maybe you can get them from that database without being a | BigCo?: | | "Our new CRL URLs will be disclosed only in CCADB, so | that the Apple and Mozilla root programs can consume them | without exposing them to potentially large download | traffic from the rest of the internet at large." | chrisfosterelli wrote: | I assumed what they meant is that the database is | publicly available but that browser implementations won't | be directly pulling CRLs. Instead the browser providers | pull the CRLs and create a compressed version that their | browser users download. | | In the same way that you can technically query the DNS | root servers yourself but you don't tend to do that | because your computer will query a more downstream DNS | server. | agwa wrote: | Yes, that's exactly what it means. | agwa wrote: | I have a cron job that pulls that CSV file once a day. I | assure you I am not a BigCo. | ok_dad wrote: | I was wrong, thanks for correcting me :) | mholt wrote: | "The connection has timed out. An error occurred during a | connection to ccadb-public.secure.force.com." | agwa wrote: | Works for me, though the time to first byte is currently | rather long. ___________________________________________________________________ (page generated 2022-09-07 23:00 UTC)