[HN Gopher] A New Life for Certificate Revocation Lists
       ___________________________________________________________________
        
       A New Life for Certificate Revocation Lists
        
       Author : grappler
       Score  : 77 points
       Date   : 2022-09-07 18:07 UTC (4 hours ago)
        
 (HTM) web link (letsencrypt.org)
 (TXT) w3m dump (letsencrypt.org)
        
       | OrvalWintermute wrote:
       | Lots of this information is completely bogus.
       | 
       | > But because OCSP infrastructure has to be running constantly
       | and can suffer downtime just like any other web service, most
       | browsers treat getting no response at all as equivalent to
       | getting a "not revoked" response. This means that attackers can
       | prevent you from discovering that a certificate has been revoked
       | simply by blocking all of your requests for OCSP information.
       | 
       | This is false. Non-nonce OCSP is inherently cachable, and
       | replayable. That means you can have your own HA setups with HA
       | OCSP clients talking to HA OCSP servers (repeaters & responders)
       | backed up by caching in commercial CDNs, and local caching
       | servers like bluecoats.
       | 
       | Likewise, OCSP stapling helps remove much of the performance and
       | privacy issues, pushing it to the serving webserver.
       | 
       | Beyond this, you can just use squid or localized HA OCSP
       | services, and do some DNS rewriting to make it even more HA.
       | 
       | Nonced OCSP is the rare beast that needs to be online, but there
       | are HA OCSP setups with smart OCSP clients.
       | 
       | > To help reduce load on a CA's OCSP services, OCSP responses are
       | valid and can be cached for about a week. But this means that
       | clients don't retrieve updates very frequently, and often
       | continue to trust certificates for a week after they're revoked.
       | 
       | Trust Stores are inherently manageable. The lag around revocation
       | completely depends on CRL/OCSP publishing, and client update
       | requests.
       | 
       | > And perhaps worst of all: because your browser makes an OCSP
       | request for every website you visit, a malicious (or legally
       | compelled) CA could track your browsing behavior by keeping track
       | of what sites you request OCSP for.
       | 
       | This is why we advocate OCSP Stapling and use of CDNs for OCSP &
       | CRL cache hits. Furthermore, localized OCSP mentioned above
       | decentralizes this even further.
       | 
       | > So both of the existing solutions don't really work: CRLs are
       | so inefficient that most browsers don't check them, and OCSP is
       | so unreliable that most browsers don't check it. We need
       | something better.
       | 
       | CRLs & OCSP work pretty well when actually supported.
       | 
       | When Diginotar happened I polled every single publicly available
       | commercial CA - strangely, a ton of them were not producing any
       | CRL/OCSP at all, putting clients into a fail-open mode.
       | 
       | Lesson of the story: don't blame a protocol for lazy CAs, bad
       | implementations, or the lack of operational excellence from many
       | vendors.
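
The policy the quoted article describes is usually called "soft-fail": getting no OCSP answer is treated like a "not revoked" answer, which is the fail-open mode mentioned above. A minimal sketch of that decision, assuming hypothetical names (`OcspStatus`, `accept_certificate`, `hard_fail`) that are not taken from any browser or library:

```python
from enum import Enum
from typing import Optional


class OcspStatus(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


def accept_certificate(status: Optional[OcspStatus], hard_fail: bool) -> bool:
    """Decide whether to proceed with a connection.

    `status` is None when no OCSP response was obtained at all
    (responder down, request blocked, timeout).
    """
    if status is OcspStatus.REVOKED:
        return False  # an explicit "revoked" answer is always fatal
    if status is OcspStatus.GOOD:
        return True
    # No answer, or an "unknown" answer: the policy decides.
    # Soft-fail (hard_fail=False) accepts, hard-fail rejects.
    return not hard_fail
```

With `hard_fail=False`, an attacker who can block the OCSP request gets the same result as a "good" response. A cached or stapled response sidesteps the problem only when one is actually available to the client.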
        
       | phlip9 wrote:
       | I can't wait for some big site certs to false positive in a CRL
       | bloom filter and cause a big outage : )
        
         | coffee-- wrote:
         | CRLite builds a cascade of Bloom filters to ensure no false
         | positives.
         | 
         | For Firefox end users, a certificate only gets tested against
         | the filter cascade if it is known to have been included in its
         | creation (by examining the embedded SCT timestamps). If it's
         | not definite that the certificate was used to generate the
         | filter, then Firefox reverts to OCSP.
         | 
         | (I'm one of the authors of CRLite in Firefox:
         | https://insufficient.coffee/2020/12/01/crlite-part-4-infrast...
         | )
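
The cascade idea can be sketched in a few lines of Python. This is an illustration of the general technique only, not the CRLite code or its file format: the class names, the SHA-256 based hashing, and the sizing parameters are all assumptions made for the example. Each level stores the false positives of the previous level, alternating between the revoked and the valid set, until a level produces no false positives.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; sizing and hashing are illustrative choices."""

    def __init__(self, items, salt, bits_per_item=10, num_hashes=3):
        self.salt = salt
        self.num_hashes = num_hashes
        self.size = max(8, bits_per_item * len(items))
        self.bits = bytearray((self.size + 7) // 8)
        for item in items:
            for pos in self._positions(item):
                self.bits[pos // 8] |= 1 << (pos % 8)

    def _positions(self, item):
        # A different salt per level gives each level independent hashes.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([self.salt, i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class FilterCascade:
    """Exact for every item in `revoked` or `valid`; undefined for others."""

    def __init__(self, revoked, valid):
        assert not (revoked & valid), "sets must be disjoint"
        self.levels = []
        include, exclude = set(revoked), set(valid)
        while include:
            level = BloomFilter(include, salt=len(self.levels))
            self.levels.append(level)
            false_positives = {x for x in exclude if x in level}
            # The next level encodes this level's false positives and
            # is checked against the set this level just encoded.
            include, exclude = false_positives, include

    def is_revoked(self, item):
        for depth, level in enumerate(self.levels):
            if item not in level:
                # First miss at an odd depth means "revoked".
                return depth % 2 == 1
        return len(self.levels) % 2 == 1
```

The guarantee only covers certificates that were in one of the two input sets when the cascade was built, which is why a certificate not known to have been included has to fall back to OCSP.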
        
         | agwa wrote:
         | J.C. has already answered for Firefox; for Apple's system
         | (valid.apple.com), if there's a bloom filter hit, the client
         | double checks via OCSP before failing the connection.
         | 
         | Source: a WWDC 2017 talk which unfortunately I can't find
         | online anymore
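
The double-check flow described for Apple's system can be sketched as below. It relies on the fact that a Bloom filter has no false negatives, so a miss is definitive and only a hit needs confirmation. The function and parameter names are hypothetical, and what the client does when the OCSP lookup itself fails is not covered by the comment above, so it is left out here.

```python
from typing import Callable


def is_revoked(serial: bytes,
               filter_may_contain: Callable[[bytes], bool],
               ocsp_says_revoked: Callable[[bytes], bool]) -> bool:
    """Return True only when a filter hit is confirmed by OCSP."""
    if not filter_may_contain(serial):
        # Bloom filters never miss an inserted item: definitely not revoked.
        return False
    # A hit may be a false positive, so ask the authoritative source.
    return ocsp_says_revoked(serial)
```

Only the (rare) filter hits trigger a network request, so most connections involve no OCSP traffic at all.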
        
       | xoa wrote:
       | I was kind of surprised that OCSP stapling didn't get any mention
       | at all. I thought that was a major improvement in both resource
       | cost and privacy? Since the time stamped response is proxied by
       | the site operator rather than going directly to the CA, the load
       | is almost entirely switched to the site itself, and the site
       | itself is the only one who knows a given IP is asking for it,
       | which is fine because obviously the site knows a given browser is
       | connecting to it anyway. It's decentralized again. I vaguely
       | remember at one point there was a major limitation of only
       | supporting a single OCSP response at a time but I thought that
       | was dealt with via a later RFC and then entirely obviated as an
       | issue by TLS 1.3.
       | 
       | Did something happen there or some other significant issue get
       | discovered? I'm curious why the move back to CRLs (albeit
       | improved) vs must-staple. It seemed like a reasonably elegant and
       | straightforward solution that fit the web pretty well.
        
         | mcpherrinm wrote:
         | must-staple requires additional effort from server operators,
         | which is very challenging to roll out. This new scheme can be
         | implemented by the browsers and CAs together, which is vastly
         | simpler than having every webserver in the world turn on OCSP
         | stapling.
         | 
         | There isn't any particular force driving for OCSP stapling:
         | Browsers can't turn it on until it is ubiquitous. CAs don't
          | want to enforce must-staple because their users will need an
          | enormous amount of help rolling it out. Site operators don't
         | care if browsers are fetching OCSP (especially since Chrome and
         | Edge don't)
        
         | champtar wrote:
          | I also love OCSP stapling but there are some limitations:
          | 
          | - the webserver needs to implement it
          | - the admin needs to enable it
          | - the webserver needs internet access
          | 
          | Maybe they saw with some telemetry that very few websites
          | actually enable OCSP stapling and decided to implement a fix
          | that covers all certs and can really be deployed.
        
         | mholt wrote:
         | Yes, I noticed this too and asked the author about it [0].
         | 
         | IMO, OCSP stapling _is the best overall solution_ until
         | certificate lifetimes are shorter ( < 7 days).
         | 
         | As a server developer I'm worried that the focus on independent
         | CRLs will make it difficult to automate certificates in the
         | face of revocation. Currently, Caddy staples OCSP for all
         | certificates by default, caches the staples, and refreshes them
         | halfway through their lifetime. Works great. Every server
         | should do this. And when an OCSP response is discovered to be
         | Revoked, Caddy automatically replaces the certificate. Works
         | great. Every server should do this.
         | 
         | If every browser is independently going to decide which
         | certificates to distrust, now I am not sure of a good,
         | authoritative way to determine "revoked" and then replace
         | certificates automatically. I'm worried this will hurt the TLS
         | ecosystem unless we answer those questions first.
         | 
         | Actually, let's just shorten certificate lifetimes and be done
         | with it already.
         | 
         | Main blockers to short cert lifetimes:
         | 
         | - CA's uptime determines Web's uptime.
         | 
         | Main solution:
         | 
         | - Multiple redundant ACME CAs. If one goes down, try another.
         | (This is what Caddy already does.)
         | 
         | [0]: https://twitter.com/mholt6/status/1567559325949763588
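
The "refresh halfway through the lifetime" rule is simple to state in code. The sketch below is not Caddy's implementation; the function names are made up, and it assumes the staple's validity window is given by the `thisUpdate` and `nextUpdate` timestamps of the OCSP response.

```python
from datetime import datetime, timedelta, timezone


def staple_refresh_time(this_update: datetime, next_update: datetime) -> datetime:
    """Midpoint of the OCSP response's validity window."""
    return this_update + (next_update - this_update) / 2


def needs_refresh(now: datetime, this_update: datetime,
                  next_update: datetime) -> bool:
    """True once the cached staple is past the midpoint of its lifetime."""
    return now >= staple_refresh_time(this_update, next_update)
```

For a response valid for one week, this starts refresh attempts three and a half days in, leaving the other half of the window as slack if the CA's responder is temporarily unreachable.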
        
           | xoa wrote:
           | Replying to this as it's on top, and first thanks for the
           | reply (and sibling replies as well). The challenges of
           | getting servers to update and implement being a practical
           | roadblock makes sense, I was more surprised just to not see
           | it mentioned at all. I'd have been fully prepared for a few
           | sentences along the lines of "this would be ideal if doing it
           | from scratch but it'd be hard to get everyone to go along
           | now, the reverse perils of decentralization". What you write
           | is also interesting.
        
           | OrvalWintermute wrote:
           | I too found this troubling.
           | 
           | Mainly because we are not talking about validation
           | performance, and OCSP stapling is an excellent performance
           | fix.
           | 
           | Not to mention all the privacy issues that OCSP stapling
           | really fixes.
        
         | GauntletWizard wrote:
          | OCSP Stapling is incredibly stupid. OCSP Stapling is either
         | ignored because it's not present, or it's effectively just an
         | override for the NotBefore and NotAfter fields. There's never a
         | reason as a server operator to present an invalid NotAfter, and
         | it represents a substantial surface area for client libraries
         | to respect them, in the area that's already most prone to
         | mistakes (the recursive descent to find a certificate chain)
         | 
         | It's much saner to issue a new, shorter-lived cert. Reduce
         | certificate lifetimes to match whatever you'd use OCSP stapling
         | for. Continue to use ACME. Figure out rotation lifetimes such
         | that I can sleep at night (i.e. they need to be at least 2
         | days, so that if the rotation fails I've still got time to wake
         | up and start working). Work on building client and server
         | tooling to make it easy to accept rotated certificates (i.e.
         | reload your client certificates on SIGHUP).
         | 
         | The advantage of CRLs isn't that they can be used offline, it's
         | that it allows your security team to burn certificates rather
         | than waiting for the CA to do so. You should always subscribe
         | to your CA's CRL too, but you should have a CRL for your own
         | internal use, too.
        
           | OrvalWintermute wrote:
           | > The advantage of CRLs isn't that they can be used offline,
           | it's that it allows your security team to burn certificates
           | rather than waiting for the CA to do so. You should always
           | subscribe to your CA's CRL too, but you should have a CRL for
           | your own internal use, too.
           | 
            | This is incorrect, conventionally.
           | 
           | CRLs are signed by the Originating CA in the conventional
           | trust model.
           | 
           | Most CAs won't give you a certificate with the ability to
           | generate your own CRL sharing the same trust, because it can
           | be weaponized to cause a denial of service.
           | 
           | This then brings up non-conventional trust models, and
           | Validation Authority as Co-equal to CA situations for
           | validity which can be significantly more challenging, and
           | requiring VA certificate insertion into every Relying Party,
           | just to name a few things, not to mention all the security
           | concerns.
        
             | GauntletWizard wrote:
             | I'm aware. That trust model is inherently broken, and it's
             | not followed anyway. In practice, any CA you trust can sign
             | a revocation for any certificate (the serial numbers are
             | all grouped together) - And that's the way it should be.
              | You should not wait for the issuing CA to get its act
             | together, you should burn a certificate as soon as anyone
             | distrusts it, and if you have a rogue CA that starts
             | burning Google.com and other important sites, it's better
             | that they burn them (and take you offline) than that they
             | issue false certs (and leak all your data); It's also far
             | more obvious.
             | 
             | https://www.imperialviolet.org/2014/04/19/revchecking.html
             | 
              | Revocation checking is useful for your security team to
              | blacklist sites. That's the only useful use.
        
         | aaomidi wrote:
         | OCSP stapling increases how large the first response from the
         | server is, increasing the time to first useful byte.
        
       | tinus_hn wrote:
       | Is certificate revocation really so common a crl of a normal
       | authority is gigabyte sized?
        
         | michaelt wrote:
         | According to [1] "On a typical Monday, we would expect to see a
         | total of around 22,000-30,000 SSL certificates being revoked
         | over the course of the day." i.e. ~8 million per year. [3]
         | meanwhile says "1.8 million certificates are revoked per year"
         | 
         | Looking at a random CRL [2] it's 41 bytes per revoked
         | certificate.
         | 
         | 8 million records at 41 bytes per record would be 300+
         | Megabytes. And a cautious CA might keep revoked certificates in
         | their CRL for more than a year.
         | 
         | So if an event like heartbleed happened again and uncommonly
         | large numbers of certificates needed to be revoked, the
         | gigabyte range is within the bounds of possibility.
         | 
          | [1] https://news.netcraft.com/archives/2014/04/11/heartbleed-cer...
          | [2] http://crl3.digicert.com/Omniroot2025.crl
          | [3] https://www.grc.com/revocation/crlsets.htm
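
The arithmetic above is easy to reproduce. The sketch below uses only figures from this thread (41 bytes per entry, 22,000 revocations per day, and the 200 million certificates mentioned in another comment); the function name is made up, and the estimate ignores the CRL's header and signature, which are small by comparison.

```python
ENTRY_BYTES = 41  # per-entry size observed in the sample CRL above


def crl_size_bytes(revoked_certs: int, entry_bytes: int = ENTRY_BYTES) -> int:
    """Rough CRL size: entries only, no header or signature."""
    return revoked_certs * entry_bytes


per_year = 22_000 * 365                    # 8,030,000 revocations
yearly_crl = crl_size_bytes(per_year)      # 329,230,000 bytes, about 329 MB
full_revoke = crl_size_bytes(200_000_000)  # 8,200,000,000 bytes, about 8.2 GB
```

The second figure lines up with the "over 8 gigabytes" worst case quoted from the article.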
        
           | WorldMaker wrote:
           | > And a cautious CA might keep revoked certificates in their
           | CRL for more than a year.
           | 
           | A CA does need to keep the revoked certificate in the CRL
           | until at least the natural expiration date on the
            | certificate, so CAs that issue certificates with expiration
            | dates five or more years out may need to keep entries in
            | their CRLs for much longer than a year, simply because of
            | those expiration dates.
        
         | roblabla wrote:
          | The problem is, you need to design for the worst case scenario
          | here, because when the worst case does happen, the last thing
         | you need is for your revocation system to not work because it
         | doesn't scale.
         | 
         | So, no, it's not common. But it's necessary.
        
         | er4hn wrote:
         | The DoD uses x509, and CRLs, in those Common Access Cards (CAC)
         | everyone in the org has. Since this covers most of the armed
         | forces, that's fairly large. As of 2012[1] this was around 200
         | MB of CRLs and was only expected to get larger over time.
         | 
          | [1] https://dl.dod.cyber.mil/wp-content/uploads/pki-pke/pdf/uncl... - Pg 7,
          | under Local Cache
        
           | OrvalWintermute wrote:
           | As mentioned above, DoD uses smart strategies around
           | certificate validation.
           | 
           | CDNs + Localized OCSP + Tactical OCSP + Smart OCSP Clients +
           | Network caching + OCSP & CRLs on the filesystem, just to name
           | a few (not including delta CRLs and other solutions).
           | 
           | The DoD OCSP Responders are configured to share hash sets
           | with downstream OCSP Responders & Repeaters, which makes
           | promulgation particularly easy.
        
         | schoen wrote:
         | No, not at all.
         | 
         | The bad case would be if Let's Encrypt discovers a problem
         | (like a security flaw or implementation error in a validation
         | method, as happened with the TLS-ALPN-01 method before) and
         | concludes that it has to mass-revoke a very large number of
         | affected certificates.
        
       | luhn wrote:
       | > If we had an incident where we needed to revoke every single
       | one of those certificates at the same time, the resulting CRL
       | would be over 8 gigabytes.
       | 
       | I don't know much about this stuff, so apologies if this is a
       | silly question:
       | 
       | If you needed to revoke all the certificates, couldn't you just
       | revoke the handful of intermediary certificates and call it a
       | day? I assume you'd want to revoke them anyways if there's a
       | situation severe enough that merits revoking 200 million
       | certificates.
        
         | mcpherrinm wrote:
         | Yes, it is likely we'd revoke an intermediate if a significant
         | fraction of all issued certs had to be revoked. But we do want
         | our revocation infrastructure to support revoking all certs if
         | needed.
         | 
         | We have a set of backup intermediates that can be activated if
         | we had to revoke the active ones for any reason, so the
         | disruption wouldn't be too high hopefully.
         | 
         | (I work at Let's Encrypt, but this is my own opinion and not
         | that of my employer)
        
         | mholt wrote:
         | It's a good question, if I read you rightly.
         | 
         | There is no "handful" of intermediate certificates -- there are
         | precisely 4 (for Let's Encrypt [0]) and they are essentially
         | on-line root certificates. And if those certificates aren't
         | even compromised, revoking them would only harm the ecosystem.
         | 
         | [0]: https://letsencrypt.org/certificates/
        
           | shallichange wrote:
           | According to the link you posted, there are intermediate CAs.
           | Those could be revoked and effectively revoke all the end
           | entity certificates.
        
       | jiripospisil wrote:
       | > This means that they're often very large - easily the size of a
       | whole movie.
       | 
       | Couldn't they just use actual units? This says absolutely
       | nothing.
        
         | WorldMaker wrote:
          | The article does later state Let's Encrypt's own expected worst
          | case is 8 GB in one CRL file in the hopefully unlikely
         | scenario that every unexpired certificate they manage was
         | revoked.
        
           | OrvalWintermute wrote:
           | Merkle Hash Trees are the well-known solution for this,
           | whenever they decide to update the protocols (was software
           | patent encumbered til 2017)
        
         | TheSpiceIsLife wrote:
         | One Olympic swimming football bus tree worth of data.
         | 
          | When I was reading the thread the other day about a tree's worth
         | of oxygen from the MOXIE experiment, I couldn't help thinking:
         | _why not just use a term everyone is familiar with, litres per
         | minute air_.
         | 
         | Enough oxygen to sustain an adult at rest for x minutes.
         | 
         | I'm beginning to suspect there's an in-joke with science / tech
         | writers about strained analogies.
         | 
         | And I'm not _in_.
        
           | JohnFen wrote:
           | > I'm beginning to suspect there's an in-joke with science /
           | tech writers about strained analogies.
           | 
           | Which has been going on for longer than I've been alive. The
           | idea (I assume) is to take a large number that's hard to
           | conceive and turn it into something everyone can relate to.
           | 
           | But inevitably, they choose things that few can actually
            | relate to, or things that are so vague/variable as to be
            | meaningless. It just adds more confusion all around.
           | 
           | It has to be an intentional joke.
        
       | ok_dad wrote:
       | Browsers and CAs still deciding for the user what's best with yet
       | another centralized database that we have to "trust" is complete
       | and implemented correctly. I don't see how this is so hard: just
       | let us download the CRLs. Maybe add them on a torrent-like system
       | so they can be shared (and validated) peer-to-peer, or at least
       | have hundreds or thousands of mirrors that can provide that data.
       | All this "it's too big and the user would have to download it"
       | bullshit is just pushing us more and more into "computing as a
       | service"; aka: we OWN your digital life.
       | 
       | Edit: I was partly wrong, this is a good thing because you CAN
       | download the CRLs now (see comments below here for info), whereas
       | previously you couldn't. Your browser still probably won't
       | support a full CRL download, but I could be pleasantly surprised.
        
         | agwa wrote:
         | > I don't see how this is so hard: just let us download the
         | CRLs
         | 
         | Previously, you couldn't do this, because not all CAs published
         | CRLs.
         | 
         | Beginning October 1, you _will_ be able to just download the
         | CRLs, because Apple and Mozilla are requiring it.
         | 
          | It's therefore unclear what your beef is.
        
           | ok_dad wrote:
           | > Beginning October 1, you will be able to just download the
           | CRLs
           | 
           | Correction: Apple and Mozilla will be able to just download
           | the CRLs. Not me. The link in the post SPECIFICALLY says us
           | common plebes don't get that right.
        
             | agwa wrote:
             | Where does the post say that?
             | 
             | If you think it's because the URLs will be disclosed in the
             | CCADB, note that the contents of the CCADB are published
             | here: https://www.ccadb.org/resources
             | 
             | Specifically, the CRL URLs can be found in this CSV file:
              | http://ccadb-public.secure.force.com/ccadb/AllCertificateRec...
        
               | ok_dad wrote:
               | I was pretty sure this section meant what I said but
               | maybe you can get them from that database without being a
               | BigCo?:
               | 
               | "Our new CRL URLs will be disclosed only in CCADB, so
               | that the Apple and Mozilla root programs can consume them
               | without exposing them to potentially large download
               | traffic from the rest of the internet at large."
        
               | chrisfosterelli wrote:
               | I assumed what they meant is that the database is
               | publicly available but that browser implementations won't
               | be directly pulling CRLs. Instead the browser providers
               | pull the CRLs and create a compressed version that their
               | browser users download.
               | 
               | In the same way that you can technically query the DNS
               | root servers yourself but you don't tend to do that
               | because your computer will query a more downstream DNS
               | server.
        
               | agwa wrote:
               | Yes, that's exactly what it means.
        
               | agwa wrote:
               | I have a cron job that pulls that CSV file once a day. I
               | assure you I am not a BigCo.
        
               | ok_dad wrote:
               | I was wrong, thanks for correcting me :)
        
               | mholt wrote:
               | "The connection has timed out. An error occurred during a
               | connection to ccadb-public.secure.force.com."
        
               | agwa wrote:
               | Works for me, though the time to first byte is currently
               | rather long.
        
       ___________________________________________________________________
       (page generated 2022-09-07 23:00 UTC)