[HN Gopher] Please stop using CDNs for external JavaScript libra...
       ___________________________________________________________________
        
       Please stop using CDNs for external JavaScript libraries
        
       Author : edent
       Score  : 405 points
       Date   : 2020-10-11 11:58 UTC (11 hours ago)
        
 (HTM) web link (shkspr.mobi)
 (TXT) w3m dump (shkspr.mobi)
        
       | withinboredom wrote:
       | I was surprised to see that the cost of another connection wasn't
       | thrown in as a downside. With http2, that cost goes away if the
       | resource is stored at the same domain as the site.
        
         | politelemon wrote:
          | That's a really good point now - previously an advantage of
          | using a CDN was that a browser would only open a limited
          | number of concurrent connections per domain, so spreading the
          | static assets across domains sped up that initial page load.
          | As HTTP/2 spreads, the cost is indeed reduced.
        
       | davidmurdoch wrote:
       | Loading common libraries from a CDN will no longer bring any
       | shared cache benefits, at least in most major browsers. Here's
       | Chrome's intent to ship:
       | https://chromestatus.com/feature/5730772021411840 Safari already
       | does this, and I think Firefox will, or is already, as well.
        
         | dndvr wrote:
         | Will this cause performance issues for sites that use static
         | cookieless domains for js, images etc
         | 
         | Google themselves do this with gstatic.net and ytimg.com etc
        
           | babuskov wrote:
           | > Will this cause performance issues for sites that use
           | static cookieless domains for js, images etc
           | 
           | > Google themselves do this with gstatic.net and ytimg.com
           | etc
           | 
           | Most probably not. The point of cookieless domains is that
           | you can use a very simple web server to serve content (no
            | need to handle user sessions, files are pre-compressed and
           | cached, etc.) and it lowers incoming bandwidth a lot. If you
           | have a lot of requests (images, css, js) the cookie
           | information adds up quickly.
           | 
           | Opening video thumbnails from ytimg.com will still be cached
           | for youtube.com as before. The only thing that will change is
            | for embedded videos on 3rd party websites, as those won't be
            | able to use cached ytimg.com thumbnails from elsewhere.
        
             | pferde wrote:
             | Couldn't the same thing be achieved by routing e.g.
             | google.com/static/ to a separate simple webserver, instead
             | of using another domain? Or use a subdomain, e.g.
             | static.google.com.
             | 
             | The current way seems like needless DNS spam to me...
        
         | edent wrote:
         | Oh, that's interesting. I guess it makes sense from a security
         | and privacy perspective.
        
         | dheera wrote:
         | I wonder if an nginx plugin could be made to auto-cache CDN
            | javascript/css files and edit the HTML on the fly to serve
            | them locally.
        
           | oefrha wrote:
           | You're probably thinking about a caching proxy like squid
           | cache.
        
           | daveFNbuck wrote:
           | You can set up a path that does a cached proxy to the CDN and
           | just edit the HTML yourself. It's a bit annoying to get the
           | cache settings to work properly, but editing the HTML is
           | easy.
        
           | ex_amazon_sde wrote:
           | https://decentraleyes.org/
        
         | thorum wrote:
         | On the other hand, the URL for a common library hosted on cdnjs
         | (or one of the other big JavaScript CDNs) and included on many
         | different websites is much more likely to already be cached on
         | edge servers close to your users than if you host the file
         | yourself.
        
           | arghwhat wrote:
           | The time to connect to the CDN hostname will negate any
           | benefit, especially if push can be used.
        
           | Matthias247 wrote:
           | You can mitigate this by getting your website itself on a
            | CDN. If this is cached, then its assets (incl. javascript)
            | would be too.
           | 
           | And by going that route you make sure that all pieces of your
           | website have the same availability guarantees, the same
           | performance profile, and the same security guarantees that
           | the content was not manipulated by a 3rd party.
        
             | GeneralTspoon wrote:
             | > And by going that route you make sure that all pieces of
             | your website have the same availability guarantees, the
             | same performance profile, and the same security guarantees
             | that the content was not manipulated by a 3rd party.
             | 
             | You can already guarantee the security of the file by using
             | the integrity attribute on the <script> tag. And the
             | performance of your CDN is probably worse than the Google
             | CDN (not to mention that you lose out on the shared cache).
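              | 
              | For reference, a minimal sketch of how an integrity value
              | can be generated with Node's crypto module (file names,
              | the library version and the URL are illustrative):
              | 
              | // Compute a Subresource Integrity hash for a downloaded
              | // copy of the library.
              | const crypto = require("crypto");
              | const fs = require("fs");
              | 
              | const file = fs.readFileSync("jquery-3.5.1.min.js");
              | const hash = crypto.createHash("sha384")
              |   .update(file)
              |   .digest("base64");
              | console.log(`sha384-${hash}`);
              | 
              | // The value is then pinned in the tag, e.g.:
              | // <script src="https://code.jquery.com/jquery-3.5.1.min.js"
              | //         integrity="sha384-..."
              | //         crossorigin="anonymous"></script>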
        
         | GordonS wrote:
         | I'm not sure I understand the threat here. Say I visit SiteA
         | which references jQuery from CDNJS, then later visit SiteB
         | which references exactly the same jQuery from CDNJS - what's
         | the problem?
        
           | dathinab wrote:
           | It can be used to track users across domains to some degree.
        
             | mmcwilliams wrote:
             | Not to be dense but wasn't that always the purpose of
             | running a CDN service for common scripts and libraries?
        
               | daveFNbuck wrote:
               | The script wouldn't have to be from a CDN to track people
               | using the browser cache. I could infer whether you've
               | visited a site that doesn't use CDNs or trackers by
               | asking you to load something from that site and inferring
                | whether you have that resource cached from the time it
                | took you to load it.
        
               | mmcwilliams wrote:
               | This is true, but if you're running a CDN you have access
               | to cross-domain user information just based on the
               | headers, no?
        
               | daveoc64 wrote:
                | The CDN is not the place you have to worry about.
               | 
               | If Site A loads a specific JavaScript file for users with
               | an administrator account, Site B can check to see if the
               | JavaScript file is in your cache, and infer that you must
               | have an administrator account if the file is there.
               | 
               | The attack can happen with different types of resources
               | (such as images).
        
             | GordonS wrote:
             | Thanks, I understand the issue now - I haven't thought
             | about CDNs from a privacy perspective before.
             | 
             | I suppose with HTTP2 some of the benefits of serving JS
             | through CDNs are gone anyway, so I guess it's time to stop
             | using them.
        
             | amelius wrote:
             | How serious is this type of threat? Compared to all the
             | info about us that is already shared by data brokers?
        
               | gpvos wrote:
               | It's being actively used by the ad networks to do user
               | fingerprinting instead of cookies, since the latter are
               | more and more blocked.
        
               | ev1 wrote:
               | There are several data-broker-esque "services" that
               | actually do this already with FB, Google, etc assets
               | (favicon.ico and similar, loggedIn urls, ...) to check
               | whether you have visited those pages, or whether you are
               | logged in to those services by trying to request a URL
               | that might return a large image if logged in, or fail
               | rapidly if logged out. -- This has been a thing for a
               | long time: https://news.ycombinator.com/item?id=15499917
               | 
               | If you don't use any of those sites, you're considered
               | higher risk/fraudulent user/bot.
               | 
               | Here's an example of a very short and easy way to see if
               | someone is probably gay:
               | https://pastebin.com/raw/CFaTet0K
               | 
               | On chrome, I consistently get 1-5 back after it's been
               | cached, and 100+ on a clean visit. On Firefox with
               | resistFingerprinting, I get 0 always.
        
               | amelius wrote:
               | Thank you, that was insightful.
               | 
               | > Here's an example of a very short and easy way to see
               | if someone is probably gay
               | 
               | Ok, but now the resource is in my cache, so from now on
               | they will think I'm gay?
        
               | tzot wrote:
               | You could have run this on a private window of the
               | browser (and in that case, they would surely think you're
               | a closeted gay).
        
               | ev1 wrote:
               | > Ok, but now the resource is in my cache, so from now on
               | they will think I'm gay?
               | 
               | This resource is just generic, so probably not, but if
               | you actually visited grindr's site without adblocking
               | heavily, they load googletagmanager and a significant
               | number of other tracking services, which will almost
               | certainly associate your advertising profile and
               | identifiers as 'gay'
               | 
               | I also can't believe they send/sell your information to 3
               | pages worth of third party monetization providers/adtech
               | companies for something that is this critically
               | sensitive.
        
           | kami8845 wrote:
           | maybe not with CDNJS, but perhaps you don't want every
           | website to know you have AshleyMadison.com assets cached.
        
             | Uehreka wrote:
             | Can websites even tell what is cached and what's pulled
             | fresh?
        
               | EE84M3i wrote:
               | Yes, using timing
        
               | evilduck wrote:
               | Wouldn't the act of timing a download mean that I
               | download and pollute my cache with new assets from the
                | site trying to find where else I've been? Does this only
               | work for the first site that tries to fingerprint a
               | browser in this way?
        
               | curryst wrote:
               | Is there a noCache option? Or can JS remove entries from
               | the cache to reset it?
               | 
               | Someone below mentioned doing requests for a large image
               | that requires authentication. Short response time means
               | the user isn't logged in (they got a 403), long response
               | time means they downloaded the image and are logged in.
        
               | amelius wrote:
               | Not if the javascript starts running only after all
               | resources have loaded.
        
               | darepublic wrote:
               | No there could still be timing attacks after. Just
               | dynamically request a cross domain asset
        
               | amelius wrote:
               | Then those requests should not be cached?
        
               | [deleted]
        
               | tylerhou wrote:
                | const start = window.performance.now();
                | const t = await fetch(
                |   "https://example.com/asset_that_may_be_cached.jpg");
                | const end = window.performance.now();
                | if (end - start < 10 /*ms*/) {
                |   console.log("cached");
                | } else {
                |   console.log("not cached");
                | }
        
               | amelius wrote:
               | In that case, the browser would always load the asset (it
               | is not cached). So the rule would be that only stuff that
               | is directly in the <head> may be cached (or stuff that is
               | on the same domain).
        
               | tylerhou wrote:
               | To be clear, the context of the thread is "why do we need
               | to partition the HTTP cache per domain." My example code
               | works under the (soon-to-be-false) assumption that the
               | cache is NOT partitioned (i.e. there is a global HTTP
               | cache).
               | 
               | > In that case, the browser would always load the asset
               | (it is not cached).
               | 
               | Agreed, if the cache is partitioned per domain AND the
               | current domain has not requested the resource on a prior
               | load. If the cache is global, then the asset will be
               | loaded from cache if it is present:
               | https://developer.mozilla.org/en-
               | US/docs/Web/API/Request/cac...
               | 
               | > So the rule would be that only stuff that is directly
               | in the <head> may be cached (or stuff that is on the same
               | domain).
               | 
               | You could be more precise here: with a domain-partitioned
               | cache, all resources _regardless of domain_ loaded by any
               | previous request on the same domain could be cached. So
               | if I load HN twice and HN uses
               | https://example.com/image.jpg on both pages, then the
               | second request will use the cached asset.
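                | 
                | The Request cache modes in that MDN link can also make
                | the check explicit. A rough sketch (note this particular
                | mode is restricted to same-origin requests):
                | 
                | // "only-if-cached" answers from the HTTP cache and never
                | // hits the network; a cache miss yields a 504 response.
                | const res = await fetch("/image.jpg", {
                |   cache: "only-if-cached",
                |   mode: "same-origin",
                | }).catch(() => null);
                | console.log(res && res.ok ? "was cached" : "not cached");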
        
               | amelius wrote:
               | > To be clear, the context of the thread is "why do we
               | need to partition the HTTP cache per domain."
               | 
               | Ah right, the thread is becoming long :)
               | 
               | > So if I load HN twice and HN uses
               | https://example.com/image.jpg on both pages, then the
               | second request will use the cached asset.
               | 
               | Good point!
        
           | Phemist wrote:
           | I'm guessing many websites are identifiable by which patterns
           | of libs and specific versions they will force you to cache.
           | One SiteA would then be able to tell that a user visited a
           | SiteB (which, depending on the website, may or may not be
           | problematic)
        
             | franga2000 wrote:
             | I'm sure some sites would be identifiable by their cached
             | libs, but the cache is shared, so any overlapping
             | dependencies would decrease the accuracy to unusable
             | levels. The best you could do is know someone did not visit
             | a site in the last ${cache_time}.
             | 
             | There are, of course, other vectors to consider, but I
             | can't think of any that could be abused by third parties.
             | If anything, isolating caches would make it easier for the
             | CDN themselves to carry out the attack you mentioned, as
             | they would be receiving all the requests in one batch.
        
               | notsuoh wrote:
               | Like a discount Bloom Filter?
        
           | madeofpalk wrote:
           | Try and load assets from another domain and observe if it was
           | probably cached or not, and you can know that they visited
           | the site
        
             | convery wrote:
              | I guess, but that disadvantage seems massively outweighed by
             | the benefits. Can always use something like [1] to check if
             | a client is active on interesting sites.
             | 
             | [1] - https://www.webdigi.co.uk/demos/how-to-detect-
             | visitors-logge...
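              | 
              | The trick in [1] is roughly the following (the probe URL is
              | purely illustrative; real checks point at endpoints that
              | redirect to an image only when you're logged in):
              | 
              | // Probe a third-party endpoint that serves an image only
              | // to logged-in users; onload/onerror leaks the state.
              | function checkLoggedIn(probeUrl, onResult) {
              |   const img = new Image();
              |   img.onload = () => onResult(true);   // got an image
              |   img.onerror = () => onResult(false); // error or HTML
              |   img.src = probeUrl;
              | }
              | 
              | checkLoggedIn("https://example.com/account/avatar", (x) =>
              |   console.log(x ? "probably logged in" : "probably not"));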
        
         | jakub_g wrote:
         | For info, this shipped in Chrome 86 just last week:
         | 
         | https://developers.google.com/web/updates/2020/10/http-cache...
        
         | rebelde wrote:
         | Wouldn't a better, but partial, solution be for browsers to
         | preload the top x common libraries? All other libraries would
         | probably have to follow this new rule.
        
           | ValueNull wrote:
           | Isn't this essentially what Decentraleyes does?
           | 
           | https://decentraleyes.org/
        
             | Forbo wrote:
              | I've been using LocalCDN, it seems to be more actively
             | maintained and has a better selection of libraries.
             | 
             | https://www.localcdn.org/
        
           | codegladiator wrote:
           | Please no, don't create such barriers.
        
             | admax88q wrote:
             | Every other language runtime has a standard library, it's
             | always been a shortcoming of the web IMO
        
           | lkschubert8 wrote:
           | At that point wouldn't it make more sense to just have the
           | browsers include that functionality?
        
           | tracker1 wrote:
           | What version(s) of those libraries? I mean, I don't deal with
           | this anymore... but I've seen sites literally loading half a
           | dozen different copies of jQuery in the past (still makes me
           | cringe).
        
         | arendtio wrote:
          | Does anyone know why they don't also partition the cookie
          | storage by the top-level origin?
         | 
         | I mean, wouldn't that take care of a whole class of attack
         | vectors and make cross-origin requests possible without having
         | to worry about CSRF?
        
           | tgsovlerkhgsel wrote:
           | One of the problems is that it breaks use cases like logging
           | into stackoverflow.com and then visiting serverfault.com, or
           | (if you do it by top-level origin) even en.wikipedia.org and
           | then visiting de.wikipedia.org. [1]
           | 
           | While privacy sensitive users may consider this a feature in
           | case of e.g. google.com and youtube.com, the average user is
           | more likely to consider it an annoyance, and worse, it is
           | likely to break some obscure portal somewhere that is never
           | going to be updated, so if one browser does it and another
           | doesn't, the solution will be a hastily hacked note "this
           | doesn't work in X, use Y instead" added to the portal. And no
           | browser vendor wants to be X.
           | 
           | [1] The workaround of using the public suffix list for such
           | purposes is being discouraged by the public suffix list
           | maintainers themselves IIRC, so the "right" thing to do would
           | be breaking Wikipedia.
           | 
           | Edit: If done naively on an origin basis right now, it would
           | break the Internet. You couldn't use _any_ site/app that has
           | login/account management on a separate host name. You
           | couldn't log into your Google account with such a browser
           | anymore (because accounts.google.com != mail.google.com).
           | _Countless_ web sites that require logins would fail, both
           | company-internal portals and public sites.
        
             | singron wrote:
             | It's possible to get around this with a redirect staple.
             | E.g. if Google wants you to be logged in on youtube.com and
             | google.com simultaneously:
             | 
              | 1) User logs in at google.com/login and sets google.com
              |    cookies.
              | 2) Server generates a nonce and redirects to
              |    youtube.com/login?auth=$NONCE
              | 3) youtube.com checks the $NONCE and sets youtube.com
              |    cookies.
              | 4) youtube.com redirects back to google.com.
             | 
             | Firefox's container tabs can maintain isolation despite
             | this since even this redirect will stay within a container.
             | However there is a usability penalty since the user has to
             | open links for sites in the right container (and
             | automatically opening certain sites in certain containers
             | will enable cross-container stapling again).
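              | 
              | A very rough sketch of steps 1-4 with Express-style
              | handlers (the domains, the authenticate/createSession
              | helpers and the shared nonce store are all hypothetical
              | simplifications):
              | 
              | const express = require("express");
              | const crypto = require("crypto");
              | 
              | // Hypothetical helpers standing in for real auth logic.
              | const authenticate = (req) => "user-123";
              | const createSession = (userId) => "session-for-" + userId;
              | 
              | // In reality the two services would share this through a
              | // backend store, and entries would expire quickly.
              | const nonces = new Map();
              | 
              | // Served on google.com (illustrative)
              | const siteA = express();
              | siteA.post("/login", (req, res) => {
              |   const userId = authenticate(req);
              |   res.cookie("session", createSession(userId));
              |   const nonce = crypto.randomBytes(16).toString("hex");
              |   nonces.set(nonce, userId);
              |   res.redirect("https://youtube.com/login?auth=" + nonce);
              | });
              | 
              | // Served on youtube.com (illustrative)
              | const siteB = express();
              | siteB.get("/login", (req, res) => {
              |   const userId = nonces.get(req.query.auth);
              |   if (userId) {
              |     res.cookie("session", createSession(userId));
              |   }
              |   res.redirect("https://google.com/");
              | });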
        
           | fastest963 wrote:
           | This is being worked on https://github.com/privacycg/storage-
           | partitioning.
        
       | alibarber wrote:
       | I started my career, and have spent most of it, working in places
       | where the production network was (almost) airgapped from the
       | entire internet (MPAA accredited facilities). I would say that
          | the general quality of software and robustness when it comes
       | to dependencies is so much greater in these places. If you want
       | to use some library, it's up to you to get it, and its
       | dependencies, check the versions are compatible and package it up
       | and build it all internally. Yup, it's work... Do you need this
       | library? Is it actually any good? Is the license compatible with
       | our usage? This is all basic code quality stuff that's often
       | completely overlooked when people can just pull in whatever junk
       | from whatever trendy repo is the hotness nowadays. And then when
       | that goes down/bankrupt - it's up to you to fix something you've
       | no control over.
        
         | bob1029 wrote:
         | We provide software that runs within very secure financial
         | networks, and have some extreme constraints regarding what
         | sorts of 3rd party code we can pull in. We are having to do a
         | vendor swap on some document parsing code because one of our
         | clients scanned a dependency and found it could be vulnerable
         | in a context that it would never be exposed to in our
         | application. These types of things make it really risky to go
         | out and build your enterprise on top of someone else's idea of
         | a good time.
         | 
         | Virtually everything we do is in-house on top of first party
          | platform libraries - i.e. `System.*`, `Microsoft.*`, etc. We
         | exclusively use SQLite for persistence in order to reduce
         | attack surface. Our deliverables to our customers consist of a
         | single binary package that is signed and traceable all the way
         | through our devops tool chain, which is also developed in-house
         | for this express purpose of tightly enforcing software release
         | processes.
         | 
         | This approach is _certainly_ slower than vendoring out
         | everything to the 7 winds, but there are many other advantages.
         | Every developer knows how everything works up and down the
          | entire vertical since it's all sitting inside one happy solution
         | just an F12 away. Being able to see a true, enterprise-wide
         | reference count above a particular property or method is like a
         | drug to me at this point. We are definitely over the hill and
         | reaping dividends for building our own stack. It did take 3-4
         | years though. Most organizations cannot afford to do what we
         | did.
        
           | postpawl wrote:
            | Couldn't this end up being a disaster with a large codebase and
           | even a small amount of turnover? Are you really advocating
           | that someone should write all their dependencies themselves
           | even if they can afford it?
        
             | cs02rm0 wrote:
             | Even without turnover, I've no idea how you're supposed to
             | compete with the quality of top open source software. Maybe
             | no one finds the bugs in your software, but they're
             | definitely there.
        
               | bob1029 wrote:
               | We don't try to compete with the quality of top open
               | source software. Our stack fundamentally consists of:
               | 
               | C# 8.0 / .NET Core 3.x / AspNetCore / Blazor / SQLite
               | 
               | Of these, SQLite is arguably the most stellar example of
               | what open source software can provide to the world.
               | 
               | Everything else in our stack consists of in-house
               | primitives built upon these foundational components.
        
               | Godel_unicode wrote:
               | It's funny to me the number of developers who have
               | effectively forgotten that Microsoft exists, and that
               | it's possible to have your entire stack be provided by
                | one company that directly sells its software for profit.
        
               | steverb wrote:
               | That's really interesting. My team's stack is the same
                | except we use Azure SQL Server instead of SQLite.
               | 
               | I'd love to understand why you chose that.
               | 
                | Feel free to hit me up at the address in my profile if
               | you don't want to talk here.
        
             | alibarber wrote:
             | It depends on the context. Yeah it'd be lovely to have your
             | system connected to the internet - but no, our clients
             | wouldn't give us money if you do that, and we need money.
             | So, maybe yes in this case, start writing some quality,
             | well thought out maintainable libraries (that can include
             | audited third party code), and just bill it. In my case,
             | the cost to the client of a team of devs working on that
             | was less than the cost to them of the risk of a film
             | leaking...
             | 
             | [Edit] - But what I have found through experience is that
             | the code that was written under these constraints seemed to
             | be better, more secure and robust, than without them. YMMV.
        
             | jrumbut wrote:
             | I assume if you have the money to be that thorough you have
             | the money to offer some inducements to stick around.
             | 
             | Plus if all you know is this custom stack, where are you
             | gonna go?
        
               | Godel_unicode wrote:
               | Anywhere that hires good software developers, since if
               | you learned this stack it's presumably not hard to get a
               | job somewhere else? Exooglers have a pretty easy time
               | getting hired.
        
             | jasonkester wrote:
              | I find it amusing that we've reached the point where
             | developers can no longer imagine a shop that writes its own
             | software.
        
               | postpawl wrote:
               | The software community has built a lot of extraordinary
               | tools that have been through a lot of battle testing.
               | Pretending those lessons aren't worth something and
               | thinking you can do it all yourself is a mistake a lot of
               | the time.
        
               | dahfizz wrote:
                | The community has produced 1000x as many tools that are
               | garbage. It can sometimes be hard to tell the difference
               | (if the developer cares to look at all).
        
               | ratww wrote:
                | If you're talking about foundational dependencies like
               | OpenSSL, Linux, LLVM, or even jQuery and React, sure.
                | Most stdlibs and DBs, like the ones GP said he uses, also
                | fall into that category.
               | 
               | Dependencies in general, like 95% (or more) of the kind
               | we see in modern package managers? No, they're mostly
               | untested liabilities and the majority of them could be
               | rewritten in an afternoon.
               | 
               | This whole discussion is a bit strange. GP clearly uses
               | dependencies, just not as much as everyone else today. I
               | don't understand why the fixation with polarizing the
               | discussion into "use lots of dependencies" vs "write
               | everything from scratch".
        
               | bob1029 wrote:
               | I think the problem is mostly cognitive. The codebase
               | should be viewed as the most important investment that a
               | software company could ever hope to possess.
               | 
               | Within the codebase, the infrastructure and tooling is by
               | far the most important aspect in terms of productivity
               | and stability of daily work process.
               | 
               | If you take the time to position yourself accordingly,
               | you can make the leverage (i.e. software that builds the
               | software) work in virtually any way you'd like for it to.
               | If it doesn't feel like you are cheating at the game, you
               | probably didn't dream big enough on the tooling and
               | process. Jenkins, Docker, Kubernetes, GitHub Actions, et.
               | al. are not the pinnacle of software engineering by a
               | long shot.
        
               | codebje wrote:
               | A company's codebase is a liability, not an asset - it
               | needs to be maintained, and as you point out, it needs
               | money spent on tooling and infrastructure to be most
               | effective.
               | 
               | Unless you happen to be one of the very rare companies
               | that sells source code and not built artefacts, your
               | asset is the built artefact and your code is the expense
               | you take on to get it.
               | 
               | Having less code to get the business outcome only makes
               | sense when you see the code as a cost, not a thing of
               | value itself.
        
               | dragonwriter wrote:
               | > A company's codebase is a liability, not an asset
               | 
                | It's an asset. Like many (virtually all, other than pure
               | financial) assets, it has associated expenses;
               | maintenance, depreciation, and similar expenses are the
               | norm for non-financial assets.
        
               | zepearl wrote:
                | > _A company's codebase is a liability, not an asset..._
               | 
               | > _your asset is the built artefact_
               | 
               | Therefore, summarized, you mean that the sourcecode
               | needed to generate the resulting <app/service/whatever>
               | is a liability, but that the result can be an asset (if
               | it does generate external revenue, or internally lowers
                | costs, etc.)?
               | 
               | I personally never thought about this kind of separation
               | - interesting.
        
               | cloudhead wrote:
               | This is misleading and pretty much wrong. It's like
               | saying the hen is a liability, the only asset is the egg.
               | Or like saying your team is a liability and the only
               | asset is the work they produce..
        
               | iainmerrick wrote:
                | _A company's codebase is a liability, not an asset_
               | 
               | Why not just throw it away, then?
        
             | pwdisswordfish4 wrote:
             | Original commenter advocates for writing your own,
             | presumably "from scratch", and mentions high-risk targets.
             | Even if you don't take those for granted, though... Let's
             | assume lower risk than a studio, and you relax the
              | conditions from developed-in-house to _maintained_-in-
             | house (e.g., a library exists, so you go grab it, and by
             | the power of open source, internally it's now "yours"--a
             | fork that you take full responsibility for and have total
             | control over, just like if you _had_ developed in-house,
              | except you're shortcutting the process by cribbing from
             | code that already exists.)
             | 
             | Here's an unrecognized truth:
             | 
             | The cost of forking a third-party library and maintaining
             | it in-house solely for your own use is no higher than the
             | cost of relying on the third-party's unforked version.
             | Depending on specifics, it can actually be lower.
             | 
              | Note that this is a _truth_; the only real variable is
             | whether it's acknowledged to be true or not. Anyone who
             | disputes it has their thumb on the scale in one way or
             | another, consciously or unconsciously.
        
               | jefftk wrote:
               | You're really claiming this is a universal truth? Do you
               | think it applies to OpenSSL? Chromium?
        
               | ntauthority wrote:
                | For cryptography one should just be able to rely on their
                | OS's library. And depending on a full browser with high
                | amounts of code churn, no compatibility for
                | implementation code, and a large dependency graph of its
                | own is not really seen as a good thing in this context at
                | all.
        
               | jefftk wrote:
               | Let's be more specific: do you think Brave would be
               | better off if they hard forked Chromium?
        
               | Godel_unicode wrote:
               | Chromium ships with Windows and is maintained by
               | Microsoft. Use the OS crypto library.
        
               | jefftk wrote:
               | libjpeg?
               | 
               | Specifically, I think hard forking is a bad idea for any
               | sort of library that needs to be regularly updated for
               | compatibility or security reasons.
        
               | Godel_unicode wrote:
               | That's possibly true if you don't have headcount for
               | doing that maintenance. If you have appropriately planned
               | for it however, it's just more software that you're
               | writing to do the work you need done.
               | 
               | If you're depending on some random person on the internet
               | to update software which underlies your whole stack, then
               | when the next imagetragick drops you can't update until
               | they get around to fixing it. Since you won't have
               | developers familiar with the code, fixing it won't likely
               | be feasible for you. That's a lot of risk.
        
         | rapind wrote:
         | Are you saying that when you need to add a space character to
         | the beginning of your strings you write it yourself?!
        
           | alibarber wrote:
           | Yes, but after getting approval from the board I was able to
           | get a dispensation to use https://isevenapi.xyz/ for some of
           | our calculation services.
        
         | jrsj wrote:
         | This is probably better from a quality perspective and also
         | probably not worth the time it would take in 90% of projects.
         | 
         | That being said I'm amazed how much production software depends
         | on multiple libraries that are developed and maintained by a
         | single person as a hobby.
        
           | madeofpalk wrote:
           | Yeah like as a frontend/web developer, most of what we do is
           | make what are essentially Wordpress themes for some company
            | that really doesn't matter or do much of importance.
        
           | ClikeX wrote:
            | That one person (sometimes) puts more effort into that
            | single library than I've seen some agencies put into a client
           | project.
        
             | wwweston wrote:
             | Intrinsic vs extrinsic motivation.
             | 
             | It's pretty common for someone building a library to do so
             | for the utility, scratching a real itch. Not universal
             | (people build libraries to be someone who built a library,
             | too), but common.
             | 
             | It's very common for agencies to be building client
             | projects in order to bill for it. Sometimes there's a
             | special alignment where the client is also primarily
             | interested in spending an allotted budget so long as
             | there's a plausibly adequate deliverable.
        
             | typon wrote:
             | It's hard for me to think of open source libraries that are
             | developed and maintained by mega corporations that I
             | actually prefer to use over libraries made by small indie
             | developers. The only one that I can think of is pytorch,
              | but that's kind of unfair since it was acquired by Facebook,
             | not developed from scratch by them.
        
               | tracker1 wrote:
               | For me, the biggest is React... aside from that, not much
               | really.
        
           | alibarber wrote:
           | Sure, everything's a tradeoff - but sometimes I see things
           | like that as frontloading the pain of when bndlrrr.io or
           | whatever goes down in the middle of the night and your client
           | is angry at you. But yeah, like everything in software, it's
           | a spectrum and 'best practice' and the 'right way' are highly
           | dependent on the context.
        
         | cs02rm0 wrote:
         | I've spent much of my career on similarly, perhaps a little
         | more, restricted networks and found the opposite.
         | 
         | Dependencies are painful to pull in and only pulled in when a
         | dev needs a specific version. The internal repos end up being a
         | missing version nightmare where people cobble together whatever
         | works with what's available. Where feature and security
         | upgrades go ignored, left to rot like the brains of the devs
         | who struggle to keep up with what's available in the real
         | world.
         | 
         | Many of those networks I've worked on are becoming more
         | permeable at the edges because the cost of the air gap
         | outweighs any benefits.
        
         | tyldum wrote:
         | Even Cisco's web-based firewall management interface uses
          | Google Analytics. Granted, it will work just as badly regardless
         | of reachability, so there's that.
        
         | ex_amazon_sde wrote:
         | Ex-Amazon SDE here. The same happens in FAANGS: tons of
         | software is written and served internally.
         | 
         | It's not NIH syndrome, usually. It's about having control over
         | the whole software supply chain for security, reliability,
         | licensing compliance and general quality.
        
         | dvdkon wrote:
         | I have to ask, why does the MPAA accredit airgapped facilities?
         | I know movies are a big business, but that seems a little
         | extreme.
        
           | mikeryan wrote:
           | It's not airgapping per se. The large studios want to ensure
           | they can send out their content pre-release to the various
           | third-party vendors working on a project. Ensuring the
           | vendors meet MPAA guidelines is a mechanism they use to
           | ensure this. It's not technically an accreditation but it's
           | usually contractually enforced if you want to work with a
           | large studio.
           | 
           | You can actually read the whole thing:
           | https://www.motionpictures.org/wp-
           | content/uploads/2020/07/MP...
        
       | martini333 wrote:
       | A DNS lookup will also affect speed
        
         | kijin wrote:
         | Not to mention another TCP handshake, TLS handshake,
         | certificate validity checks, etc.
         | 
         | HTTP/2 changed everything. Now the fastest way to serve your
         | assets is to push them down the pipe that has already been set
         | up.
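          | 
          | For illustration, a bare-bones sketch of that with Node's
          | http2 module (certificate paths and file names are
          | placeholders):
          | 
          | // Serve index.html and push app.js over the same HTTP/2
          | // connection, before the browser even asks for the script.
          | const http2 = require("http2");
          | const fs = require("fs");
          | 
          | const server = http2.createSecureServer({
          |   key: fs.readFileSync("server.key"),
          |   cert: fs.readFileSync("server.crt"),
          | });
          | 
          | server.on("stream", (stream, headers) => {
          |   if (headers[":path"] !== "/") {
          |     stream.respond({ ":status": 404 });
          |     stream.end();
          |     return;
          |   }
          | 
          |   stream.pushStream({ ":path": "/app.js" }, (err, push) => {
          |     if (err) return;
          |     push.respond({ ":status": 200,
          |       "content-type": "application/javascript" });
          |     push.end(fs.readFileSync("app.js"));
          |   });
          | 
          |   stream.respond({ ":status": 200, "content-type": "text/html" });
          |   stream.end(fs.readFileSync("index.html"));
          | });
          | 
          | server.listen(8443);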
        
       | kizer wrote:
       | $$$
        
       | xthestreams wrote:
       | Mostly opinions without data, as usual. Why can't we as a
        | community switch to a more serious scientific/engineering
       | approach, at least in those areas where it is easy to do so?
       | 
       | Why should I trust this blog post on the "Caching" point, for
       | example? It's got no data and no references.
       | 
       | "What are the chances that your user has visited a site which
       | uses the exact same CDN as your site?" ... hey, you can _measure_
       | that.
        
       | lolc wrote:
        | Weirdly, the piece fails to mention my number one reason for
       | loading libraries from a CDN: economy. Least setup cost and
       | externalized hosting costs mean that unless I have good reasons
       | otherwise, I have to use them from an economic perspective!
        
       | whyagaindavid wrote:
        | This depends on the CDN. Let's say you are using the Google or
        | Cloudflare CDN. They have more engineers and better security
        | processes - 24x7, 365-day continuous monitoring - than you
        | remembering to download the latest jQuery update. What if you
        | are on holiday (post/pre COVID era)? BA is not the best
       | example... Rename your article to: Please stop using 'unknown
       | CDNs'...
        
       | Dyaz17 wrote:
       | Regarding the security aspect, I created GuardScript to help
       | catch malicious 3rd party (or 1st party) javascript changes:
       | https://www.guardscript.com/ . You should use if you own a SaaS
       | service that require your clients to include your Javascript
       | library in their page, and when you can't use Subresource
       | Integrity.
        
       | ThePhysicist wrote:
       | > Latest software versions automatically when the CDN updates.
       | 
       | I'd also be very careful including a JS library from a CDN that
        | gets auto-updated to the latest version, as this might break your
       | site without you noticing.
       | 
       | We did this with our own JS library (Klaro -
       | https://github.com/kiprotect/klaro), for which we offered a CDN
       | version hosted at https://cdn.kiprotect/klaro/latest/klaro.js. We
       | stopped doing that with version 0.7 as we realized we could not
        | introduce any breaking changes without risking breaking the
        | websites of all users that were using the "latest" version. So
        | what we do now is that we only automatically update minor
        | releases, i.e. we have a "0.7" tag in the CDN that will receive
        | patch releases (0.7.1, 0.7.2, ...) and which users can safely
       | embed in their websites.
       | 
        | That said, we always recommend users to self-host their scripts
        | and to use integrity tags whenever possible; it's usually
        | better from a security and privacy perspective as well.
       | 
       | Like most JS libraries Klaro is quite small (45 kB compressed),
       | so the loading time is dominated by the connection roundtrip
       | time, which again is dominated by the TLS handshake time. For our
       | main servers in Germany we have a roundtrip ping of around 20 ms
       | (for most connections from Germany), and with the TLS negotiation
       | it takes about 60-100 ms to serve the script file from our
       | server. If the server connection is already open that time
       | reduces to 20-40 ms. So the benefit of hosting small scripts with
       | your main website content is that your browser doesn't have to do
       | another TLS handshake with the CDN, which for small scripts can
       | dominate the transfer time. If you then use modern HTTP & TLS
       | versions you can reduce the transfer time quite a bit.
        
         | laurent92 wrote:
          | The CDN can't swap the files out from under us if we activate
          | CSP (content security policies), which are activated once per
          | domain. They require <script> tags to carry the "integrity="
          | attribute holding the hash of the file.
        
           | ThePhysicist wrote:
           | That will still break your site if the CDN updates the file
           | because it won't load anymore. The point the author made was
           | that people use libraries distributed by CDNs to get auto-
           | updates and fixes, sadly that won't work in combination with
           | CSP as you'll still have to update the integrity tag
           | manually.
           | 
            | We deliver integrity tags for our CDN files as well btw
            | (https://github.com/kiprotect/klaro/blob/master/releases.yml),
            | that doesn't solve the auto-updating problem though.
        
       | quickthrower2 wrote:
       | What if you are using hotjar, google analytics etc. Maybe those
       | companies should provide the scripts for you to serve up from
       | your site.
        
       | anm89 wrote:
       | > So what? This isn't the biggest issue on the web. And I'm
       | certainly guilty of misusing CDNs like this.
       | 
       | I just love this person's attitude. So many blog posts try to
       | beat you over the head with some opinion that doesn't feel like
       | such a big deal. The disclaimer on this actually gave the author
       | a lot of credibility in my mind.
       | 
       | Great post, and I will consider my opinion changed on this topic.
        
       | seanwilson wrote:
       | The pattern I don't like is the one where people load JS like
       | jQuery from a CDN with a local fallback if that doesn't work:
       | 
       | - realistically, nobody is going to be testing if the fallback
       | functionality works as the site grows so you're going to get
       | weird breakages if the fallback misbehaves (e.g. your local file
       | is a different version).
       | 
       | - big CDNs go down so rarely it's not worth thinking about (and
        | if it is, serve it from your own site)
       | 
       | - it's easy nowadays to put your whole site behind a CDN so
       | there's little point (the article debunks the caching advantages)
       | 
       | I think people are just copy/pasting snippets that do this from
       | somewhere without thinking.
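        | 
        | For context, the kind of copy/pasted snippet in question usually
        | boils down to something like this, placed in an inline script
        | right after the CDN <script> tag (version and local path are
        | illustrative):
        | 
        | // If the CDN copy failed to load, the global won't exist, so
        | // fall back to a locally hosted copy via document.write.
        | if (!window.jQuery) {
        |   document.write(
        |     '<script src="/js/jquery-3.5.1.min.js"><\/script>');
        | }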
        
       | jabart wrote:
       | CDNs are misunderstood these days. Caching at the browser across
        | sites is not that important; it's caching at a point of presence
        | (POP) that matters. This POP being so much closer to your end
        | users brings performance gains because TCP is terrible over
        | distances. QUIC may fix this by its shift to UDP. I haven't
        | seen a benchmark yet.
       | 
       | Security is a concern, use SRI.
       | 
        | Reliability can be mitigated with failover logic to a backup.
       | 
       | The part missed is bandwidth. Using a CDN means your web server
        | doesn't have to serve out static files that you are paying per
        | GB to serve. For small sites it's not much, but it does add up.
        | It's a Content Delivery Network, not a Cache Delivery Network.
        
         | Matthias247 wrote:
         | > This POP being so much closer to your end users brings
         | performance gains because TCP is terrible over distances. QUIC
         | may fix this by it's shift to UDP. I haven't seen a benchmark
         | yet.
         | 
          | Quic can't defeat physics. Performance will still linearly
         | degrade with distance to (edge) servers, and therefore CDNs
         | will stay important.
         | 
         | What Quic however will do is reduce the time-to-first-byte on
          | an initial connection by 1 RTT due to one less handshake - which
         | can be e.g. a 30ms win. After the connection is established it
         | aims to yield more consistent performance than e.g. HTTP/2 over
         | TCP. But packets will still require the same time to go from
         | the browser to an edge location, and therefore the minimum
         | latency for a certain distance is the same.
        
         | baskire wrote:
          | My understanding is that it's not TCP alone but the TCP AND TLS
          | handshake overhead combined. Whereas QUIC combines both
          | handshakes at the protocol level.
        
           | jabart wrote:
            | TCP has the concept of a TCP window: a buffer of data for
            | which the sender waits on an ACK packet from the other side
            | before sending more. The window defaults to 64 KB to start.
            | On your local LAN (which TCP was built for), no big deal, but
            | it hurts going across a distance. Then add in one lost packet
            | or one out-of-order packet and TCP has to ask for it again
            | and delay the whole thing. It's why HTTP/2 has higher latency
            | on spotty 4G networks. The TLS handshake suffers from the
            | same distance issue ACK packets have, which is why TLS 1.3
            | added 0-RTT, folding the handshake into the first packets.
           | 
            | QUIC puts everything in UDP, so theoretically it's a never-
            | ending firehose of data for a download with the occasional
            | "hey, I'm missing packets 3, 12, 18, please resend" -
            | mimicking TCP but putting the app in control versus the
            | kernel.
        
             | baskire wrote:
             | Quic also has a window size.
             | 
             | > QUIC congestion control has been written based on many
             | years of TCP experience, so it is little surprise that the
             | two have mechanisms that bear resemblance. It's based on
             | the CWND (congestion window, the limit of how many bytes
             | you can send to the network) and the SSTHRESH (slow start
             | threshold, sets a limit when slow start will stop).
             | 
             | https://blog.cloudflare.com/cubic-and-hystart-support-in-
             | qui...
        
               | Matthias247 wrote:
               | Quic kind of has 3 windows:
               | 
               | Per stream and per connection flow control windows, which
               | kind of indicate how much data the peer may send on a
               | given connection before it gets a window update. Those
               | windows also indicate how much the server is willing to
               | store in its receive buffers, since the updates are
               | likely sent when those buffers are drained.
               | 
               | A congestion window, which indicates how many low-level
               | packets and data in them can be in-flight without being
               | acknowledged. Those also account for retransmissions, and
               | packets which do not necessarily contain stream data.
        
         | tasogare wrote:
          | One won't have loading time problems if one doesn't ship
          | websites with kilotons of JS crap. Also, paying per downloaded
          | content is dumb, as it's easy for an attacker to attack you
          | financially, and a lot of hosting companies (like OVH) offer
          | "unlimited" bandwidth.
        
         | adrianmonk wrote:
         | How would failover actually be implemented? If you have a
         | script tag with integrity enabled, and the cryptographic hash
         | doesn't match, what happens next?
         | 
         | From some quick research, it doesn't seem like the script tag
         | has built-in support for this. One could imagine something like
         | multiple src attributes (used as a search order for the first
         | valid file), but that doesn't seem to exist. So it seems like
         | the web page has to do it manually.
         | 
         | Which I guess means you have to have some javascript (probably
          | inline, so you know _it's_ loaded and for performance?) to
         | check and fix the loading of your other javascript.
         | 
         | If it's really that manual, it sounds like it adds cost to
          | implementing this correctly. In other words, it might be one of
         | those scenarios where correctness is achievable, but it's a
         | whole lot simpler to just not do it that way.
        
           | jabart wrote:
            | Since script tags are blocking, you can do an undefined check
            | and, if that fails, inject a new script tag pointing at either
            | a local copy or a secondary CDN.
           | 
           | Link for reference. .Net Core has this built in as a tag
           | helper too! https://www.hanselman.com/blog/cdns-fail-but-
           | your-scripts-do...
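            | 
            | Roughly like this (the library, global name and local path
            | are illustrative):
            | 
            | // After the blocking CDN <script> tag has run, check the
            | // global it should have defined and fall back if needed.
            | if (typeof window.jQuery === "undefined") {
            |   const fallback = document.createElement("script");
            |   fallback.src = "/js/jquery-3.5.1.min.js"; // or 2nd CDN
            |   document.head.appendChild(fallback);
            | }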
        
             | adrianmonk wrote:
             | Thanks. So it seems like it's not really that bad.
             | Particularly if you are already using some loader tools
             | (and don't have to add them to your build just to get
             | this).
        
         | zamadatix wrote:
         | The post is "Please stop using CDNs for external JavaScript
         | Libraries" not "Please stop using CDNs". If a CDN is critical
         | to your site's performance you should put your site on it not
         | internal libraries here and external libraries there. The page
         | mentions this as well:
         | 
         | "Speed:
         | 
         | You probably shouldn't be using multi-megabyte libraries. Have
         | some respect for your users' download limits. But if you are
         | truly worried about speed, surely your whole site should be
         | behind a CDN - not just a few JS libraries?"
         | 
          | Also, even before QUIC, HTTP/2 fixed a lot of the problems with
          | distance, as you no longer need to wait for separate handshakes
         | for multiple files to be streamed. QUIC will still give a few
         | advantages but again those advantages would be good to have on
         | your whole site not just a few libraries.
        
           | MrStonedOne wrote:
            | But wouldn't a site be faster if cacheable requests go to a
            | CDN, and un-cacheable requests go directly to origin with
            | no forwarding at the CDN layer?
        
         | EE84M3i wrote:
         | Wait, what is the misunderstanding? Aren't these the well known
         | benefits of CDNs?
        
           | dathinab wrote:
           | CDNs are just a service to handle delivery of static content
           | for you. Their main points (unordered) are:
           | 
           | - reliability
           | 
           | - delivery speed through closeness to user (having nodes all
           | around the world)
           | 
           | - cost
           | 
           | - ease of use
           | 
           | - handling of high loads for you / making static content less
            | affected by accidental or intentional DoS situations
           | 
           | That multiple domains might use the same url and might share
           | the cache was always just a lucky bonus. Given that the other
           | side needs to use the exact same version of the exact same
           | library with the exact same build options accessed through
            | the exact same url to profit from cache sharing, it never was
           | reliable at all.
           | 
           | I mean how fast does the JS landscape change?
           | 
           | Given how cross domain caching can be used to track users
           | across domains safari and Firefox disabled it a while ago as
           | far as I know, and chrome will do so soone.
        
             | m463 wrote:
             | go to a webpage that uses a cdn and do view source.
             | 
             | it all looks like
             | https://cdn.example.com/foo/bar.js?v=129a1d14ad3
        
           | [deleted]
        
       | ffpip wrote:
       | Extensions that act like a local CDN. Page loads faster, more
       | privacy, etc.
       | 
       | https://decentraleyes.org/
       | 
       | https://www.localcdn.org/ (fork of decentraleyes, with many more
       | resources)
        
         | sippingjippers wrote:
         | Interesting, I was already using DecentralEyes. Do you know why
         | the fork happened? What policy does DecentralEyes keep that
         | LocalCDN extends or violates?
         | 
         | edit: will be sticking with the original, looks like the fork
         | maintainer made no effort to work upstream first, which is a
         | very bad look for what is essentially a piece of security
         | software. https://gitlab.com/nobody42/localcdn/-/issues/5
        
           | smichel17 wrote:
           | I found out about localCDN recently, commented on this
           | subject, and got this response from the author:
           | 
           | https://codeberg.org/nobody/LocalCDN/issues/51#issuecomment-.
           | ..
           | 
           | I haven't gone searching for a PR yet and didn't think to do
           | so beforehand (all made more complicated by both projects'
           | repos having moved locations at least once recently).
        
           | ffpip wrote:
           | localCDN has more resources than Decentraleyes. It also has
           | very important resources like Google Fonts, some Cloudflare
           | resources, etc., none of which were present in Decentraleyes
           | (the last time I checked)
           | 
           | > will be sticking with the original
           | 
           | It's your choice. The fork is better. The maintainer seems a
           | bit more active (more updates) and extremely pro-privacy (I
           | concluded this from his home page and extension settings)
        
             | nXqd wrote:
             | localCDN looks super interesting. It works partially with
             | Chrome and fully with Firefox. I'm curious, is there any
             | good native tool to replace localCDN, uBlock, uMatrix
             | (resource concerns)? Btw, thanks for pointing out localCDN.
        
               | ffpip wrote:
               | I don't know any replacements, but localCDN can generate
               | rulesets for uBlock Origin and uMatrix if you have
               | configured it strictly (hard mode, etc.)
               | 
               | You must enable the rulesets. It's very easy and a one
               | time job. To generate them, go to LocalCDN settings and
               | select your adblocker.
        
             | mimimi31 wrote:
             | I was thinking about switching from decentraleyes too, but
             | I'm hesitant to install a "can access all sites" extension
             | that hasn't been vetted ("recommended") by Mozilla.
        
               | ffpip wrote:
               | LocalCDN is the most offline extension I have seen.
               | 
               | It even opens donation pages locally, instead of opening
               | the author's website. He says ''I think it is better if
               | your public IP address is rarely listed in any server log
               | files.''
        
           | m463 wrote:
           | I use Decentraleyes and like it, but while it has a ton of
           | content, much of it was outdated and there were no updates. It
           | would also be nice to be able to add resources yourself.
        
       | MattSteelblade wrote:
       | As a user, I recommend using Decentraleyes for Firefox[1] and
       | Chrome[2].
       | 
       | [1] https://addons.mozilla.org/en-
       | US/firefox/addon/decentraleyes... [2]
       | https://chrome.google.com/webstore/detail/decentraleyes/ldpo...
        
         | [deleted]
        
       | arielm wrote:
       | Without real examples this just gives more reasons to load big
       | libraries, such as jquery, from popular CDNs.
       | 
       | Let's take jquery for example;
       | 
       | Speed - if everyone loaded jquery from the official CDN,
       | considering the usage of jquery is high, you're very likely to
       | get a speed improvement.
       | 
       | Faster website = happier customer (and happier search engine,
       | which could mean more customers)
       | 
       | That to me is enough.
       | 
       | But... I do agree that if everyone uses that CDN and it gets
       | compromised everyone is in trouble.
       | 
       | That to me is enough reason not to use it. The article doesn't do
       | this one any justice IMO + it's a personal decision because major
       | CDNs don't get compromised every day, so convenience might win
       | here.
        
       | ocdtrekkie wrote:
       | When packaging apps for Sandstorm, eradicating these sorts of
       | external dependencies is one of my top irritations. Not only do
       | projects themselves rely on them, but components they drag in
       | also bring in their own CDN dependencies. FontAwesome is the top
       | offender today, calling out to a third-party server just to avoid
       | hosting your own toolbar icons.
       | 
       | Recently, Sandstorm added the ability to block apps from loading
       | external client-side scripts, and I hope more app platforms and
       | hosting platforms adopt similar so that doing it goes out of
       | style.
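       | 
       | For sites you control, a Content-Security-Policy header gives a
       | similar guarantee (this is the general web mechanism, not
       | necessarily how Sandstorm implements its blocking):
       | 
       | Content-Security-Policy: script-src 'self'
       | 
       | With that header the browser refuses to load any script that
       | isn't served from your own origin, CDN-hosted libraries included.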
        
       | [deleted]
        
       | paulgb wrote:
       | There are some good reasons in here (especially privacy), but I'm
       | not convinced by the security point. It seems like the example
       | linked about British Airways was JS under the britishairways.com
       | domain being changed, not a third party CDN.
       | 
       | Incidentally, a few years ago when people were loading third
       | party scripts over HTTP, I demoed a fun hack where, if you
       | control a user's DNS, you could redirect queries to popular CDNs
       | to a proxy that injects keylogger code _and tells the browser to
       | cache it indefinitely_. Because at the time almost every site
       | included either jQuery or Google Analytics, you'd have a
       | persistent keylogger even after the user switched to a more
       | secure connection. How far we've come!
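       | 
       | The "cache it indefinitely" part is just a far-future caching
       | header on the poisoned response, something along the lines of:
       | 
       | Cache-Control: public, max-age=31536000, immutable
       | 
       | after which the browser treats the injected copy as fresh for a
       | year and keeps serving it without revalidating against the CDN.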
        
       | pkz wrote:
       | Fun case from Sweden: the central government website for
       | healthcare included js from a third party with no integrity hash.
       | Third party got hacked and they changed the js to mine
       | cryptocurrency over the weekend. Thousands of people participated
       | in mining...
        
         | quickthrower2 wrote:
         | This is where browsers need to be locked down a bit more. The
         | web API and available local compute resources should be
         | governed by permissions. Yeah it'll break a few sites at first!
        
         | edent wrote:
         | Ooh! Do you have a source on that? I couldn't find anything in
         | English language media.
        
       | anticristi wrote:
       | Should you also stop serving "analytics.js" from Google's website
       | and serve it from your own website?
        
         | Doctor_Fegg wrote:
         | Yes.
         | 
         | (Matomo is your friend.)
        
       | sergeykish wrote:
       | Page contains 3rd party script without subresource integrity
       | 
       | <script type='text/javascript'
       | src='https://stats.wp.com/e-202041.js' async='async'
       | defer='defer'></script>
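       | 
       | With subresource integrity, the same tag would look something
       | like this (the hash below is a placeholder, not the real digest
       | of that file):
       | 
       | <script src='https://stats.wp.com/e-202041.js'
       |   integrity='sha384-BASE64_DIGEST_OF_THE_FILE'
       |   crossorigin='anonymous' async='async' defer='defer'></script>
       | 
       | The digest can be generated with, for example:
       | 
       | openssl dgst -sha384 -binary e-202041.js | openssl base64 -A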
        
       | alkonaut wrote:
       | Can't js build tools pack together my code with the third party
       | libraries and then also tree-shake out all the bits I'm not
       | using?
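       | 
       | I'm picturing something like this, where a bundler (webpack,
       | Rollup, etc.) inlines the dependency and only the parts I
       | actually import end up in the bundle (lodash-es is just an
       | example):
       | 
       | // app.js
       | import { debounce } from 'lodash-es';
       | 
       | // Only debounce (and what it needs) gets bundled; the rest of
       | // lodash-es is tree-shaken away.
       | export const onResize = debounce(() => {
       |   console.log('resized');
       | }, 250);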
        
       | duxup wrote:
       | >But if you are truly worried about speed, surely your whole site
       | should be behind a CDN - not just a few JS libraries?
       | 
       | Does it have to be all one or the other?
       | 
       | I feel like this kinda hand waves away an advantage.
        
       | maple3142 wrote:
       | Just wondering: can "integrity" be used instead of the URL to
       | provide better caching? It is based on a hash, which should be
       | more reliable than just the URL.
        
         | RL_Quine wrote:
         | That would be a super-cookie though.
        
           | lucb1e wrote:
           | How would specifying what content the library should have (to
           | preserve integrity) lead to tracking?
        
             | im3w1l wrote:
             | Caching across sites at all means you can see if it loads
             | quickly or slowly to make inferences. Non-obvious tradeoff
             | imo.
        
               | XCSme wrote:
               | It's not about how fast it loads, it's about whether it
               | actually requests the file from the server or not.
               | 
               | See my comment above, if you have "x.js" in your cache
               | and it's your first time on this site, it means that you
               | previously visited another site that contains "x.js".
        
             | shakna wrote:
             | A number of problems surface [0]. Starting with timing
             | leaks.
             | 
             | > The first difficulty of implementing cross-origin,
             | content addressable caching on the Web platform is that it
             | may leak information about the user's past browsing. For
             | example, site "A" could load a script with a given hash,
             | then later when the user visits site "B", it could attempt
             | to observe the time it takes to load that same resource and
             | infer whether the user had previously been at "A".
             | 
             | [0] https://hillbrad.github.io/sri-addressable-caching/sri-
             | addre...
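             | 
             | A rough sketch of such a probe (the URL and the threshold
             | are invented for illustration):
             | 
             | // On site "B": load the exact CDN URL that site "A" uses
             | // and guess from the load time whether it was cached.
             | const probe = document.createElement('script');
             | const start = performance.now();
             | probe.onload = () => {
             |   const ms = performance.now() - start;
             |   // A few milliseconds suggests a cache hit, i.e. the
             |   // user has probably visited a site that loaded this
             |   // exact resource before.
             |   console.log(ms < 5 ? 'likely cached' : 'not cached');
             | };
             | probe.src = 'https://cdn.example.com/popular-library.js';
             | document.head.appendChild(probe);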
        
         | XCSme wrote:
         | Not sure if you are referring to a domain-local cache or to a
         | cache shared between domains: https://github.com/w3c/webappsec-
         | subresource-integrity/issue...
        
           | XCSme wrote:
           | A privacy issue example if a shared cache is implemented:
           | jQuery v2 is installed and shared between sites A,B,C. If
           | user goes to site B for the first time and the file is in
           | cache (never requested from it), then it means the user has
           | also visited A or C. If more files like this are shared,
           | then, based on the requested files you can somewhat estimate
           | which websites a user visited before.
        
         | lucb1e wrote:
         | Might subresource integrity be what you are looking for? Not
         | sure if you meant that, or if you are proposing such a system
         | (which is a good idea!) without realizing that this already
         | exists:
         | 
         | https://developer.mozilla.org/en-US/docs/Web/Security/Subres...
        
           | XCSme wrote:
           | He is proposing to use the integrity hash for caching
           | purposes, not for security purposes.
           | 
           | This has been discussed before and it is unlikely it will be
           | implemented as it is almost impossible to eliminate finger-
           | printing and privacy issues if this cache is implemented plus
           | it might lead to security problems:
           | 
           | https://github.com/w3c/webappsec-subresource-
           | integrity/issue...
        
             | lucb1e wrote:
             | > for caching purposes, not for security purposes
             | 
             | Whether it's for caching or security purposes, a hash of
             | the contents is a hash of the contents.
             | 
             | The link you shared is about tracking whether a given
             | document is already cached. That's the same problem as with
             | normal caching. At least the top N libraries could be
             | downloaded with a browser update and be equal for everyone.
        
           | [deleted]
        
       | NikolaeVarius wrote:
       | Just block JS entirely. 99.99% of sites are worthless anyway; it's
       | not hard to enable JS on sites you want to use.
        
         | eeZah7Ux wrote:
         | Plus, you save CPU, memory, bandwidth and electricity.
        
         | nulbyte wrote:
         | I use NoScript in Firefox. It floors me how many folks use a
         | slew of third-party assets hosted on CDNs I've never heard of,
         | many of which aren't even critical to the site's functionality.
        
       | jedberg wrote:
       | The best point in this article is that your site should already
       | be on a fast CDN. Focus on that and then none of the other
       | advantages of JS on a CDN matter (and you don't have to worry
       | about the downsides).
        
       | lazyjeff wrote:
       | Some of the time, web developers use CDNs because they want to
       | save the five minutes it takes to download the library and host it
       | on their own servers, or because they don't want to add it to
       | their package.json. I've noticed this with novice web developers,
       | who just want to get something working as fast as possible.
       | This practice of linking to a CDN or external URL for a js
       | library used to be heavily discouraged, called "hotlinking", but
       | it seems like many javascript libraries now are fine with it
       | (even encouraging it).
       | 
       | I see a similar problem with Google Analytics or Google Fonts,
       | where you're sacrificing user privacy and agency in exchange for
       | developer convenience. In a slightly more privacy-centric era
       | (perhaps in a few years), I think people will consider this
       | practice unethical, as the web developer is sending their
       | visitors off to fetch and run javascript code from a third
       | party's computer. The security, usability, and privacy problems
       | noted by the article are not worth the few minutes saved by the
       | developer.
        
       | PaulHoule wrote:
       | This browser extension undoes this phenomenon as much as it can:
       | 
       | https://decentraleyes.org/
        
       | uniqueid wrote:
       | Stop using externally-hosted fonts, too! It's 2020. I don't visit
       | websites to marvel at their beauty; I just want to read the
       | article. So I'm not going to compromise my privacy just to read
       | seven or eight paragraphs of text in Myriad instead of Helvetica.
        
       | thomasfromcdnjs wrote:
       | After starting cdnjs nearly 10 years ago, I still use it just for
       | convenience's sake if I am building some rough site.
       | 
       | I use webpack etc for big projects.
        
       | GeneralTspoon wrote:
       | Almost every point in this post is incorrect and/or exaggerated
       | (except for the Privacy one):
       | 
       | > Cacheing. I get the superficial appeal of this. But there are
       | dozens of popular Javascript CDNs available. What are the chances
       | that your user has visited a site which uses the exact same CDN
       | as your site?
       | 
       | Quite high - if you're using the Google CDN I'd imagine.
       | 
       | > Speed. You probably shouldn't be using multi-megabyte
       | libraries. Have some respect for your users' download limits. But
       | if you are truly worried about speed, surely your whole site
       | should be behind a CDN - not just a few JS libraries?
       | 
       | (a) Some websites/web apps do actually need to load a lot of JS
       | to function properly (e.g. Google Maps). (b) Putting the HTML of
       | a website into a CDN introduces extra complexity around making
       | the html stateless (cookies, user data, etc.) + the issues that
       | come from having to cache-bust the CDN every time some dynamic
       | content on the page is updated.
       | 
       | > Versioning. There are some CDN's which let you include the
       | latest version of a library. But then you have to deal with
       | breaking changes with little warning. So most people only include
       | a specific version of the JS they want. And, of course, if you're
       | using v1.2 and another site is using v1.2.1 the browser can't
       | take advantage of cacheing.
       | 
       | Yeah, don't use library/latest.js. But then this basically boils
       | down to the same argument as point 1 ("Caching").
       | 
       | > Reliability. Is your CDN reliable? You hope so! But if a user's
       | network blocks a CDN or interrupts the download, you're now
       | serving your site without Javascript. That isn't necessarily a
       | bad thing - you do progressive enhancement, right? But it isn't
       | ideal.
       | 
       | Fair enough - depends on your CDN. The Google CDN is probably
       | more reliable than whichever one you're going to pay for instead
       | (not every site can/should be put on Cloudflare for free-ish).
       | 
       | > Privacy
       | 
       | Yup - this is the real valid issue IMO
       | 
       | > Security. British Airways' payments page was hacked by
       | compromised 3rd party Javascript. A malicious user changed the
       | code on site which wasn't in BA's control - then BA served it up
       | to its customers.
       | 
       | If OP had actually read their own link, they would have seen that
       | the British Airways JS in question was loaded _from their own
       | CMS_ - not an external CDN. Ironically, in this case - it
       | probably would have been harder to hack the CDN.
       | 
       | Plus, always use SRI.
        
       | vaccinator wrote:
       | Stop using Google for fonts while you're at it...
        
       | MrStonedOne wrote:
       | > But if you are truly worried about speed, surely your whole
       | site should be behind a CDN - not just a few JS libraries?
       | 
       | There is an argument for being more efficient with CDN redirects.
       | The ideal CDN'ified site ensures that all requests that are
       | always served from origin go directly to origin, and all
       | requests that are cacheable go through a CDN; this means separate
       | domains or subdomains.
        
       | sp332 wrote:
       | I want to discuss a (minor) antipattern that I think is
       | (slightly) harmful.
       | 
       | It's caching, not cacheing.
        
         | scambier wrote:
         | Non-native speaker here. How is that an "antipattern" and not
         | simply a spelling mistake?
        
           | Biganon wrote:
           | That + the word "harmful": it's probably a parody of HN-
           | speak. I hope.
        
           | sp332 wrote:
           | I'm just copying the language from the beginning of the
           | article.
        
       | szundi wrote:
       | A free CDN does not cost anything compared to other solutions
       | owned/paid for by me.
        
       | sneak wrote:
       | Additionally: Please stop using Javascript as an essential
       | component in the rendering of non-interactive pages.
       | 
       | It's trendy these days to require client JS for every old
       | webpage, even ones that work just like webpages did in the pre-
       | JS-all-the-things days.
       | 
       | This breaks graceful degradation, uses more power and bandwidth,
       | takes longer to load, and is a security risk, for little or no
       | benefit.
       | 
       | Please stop doing it.
        
       | tracker1 wrote:
       | I understand the sentiment, and disagree on a few points... if
       | you're doing more ad hoc enhancements using jQuery, lodash or some
       | chart/graph library... there's a decent chance that there's
       | caching/reuse in play.
       | 
       | If you're building a specific application, then I'm less likely
       | to rely on a CDN... I think applications should largely deliver
       | all of their own resources.
       | 
       | If you're in financial or security (military) applications, there
       | may be legal, resource, and other requirements that supersede what
       | you can do.
       | 
       | It's not a one size fits all... that said, modern web application
       | development centers around resources via npm, and largely
       | delivered _in_ your application bundle, so it's less of a concern
       | in the common case these days imho.
        
       | A7med wrote:
       | no
        
       | 734129837261 wrote:
       | Let us say that you work for or own a website with 1m unique
       | visitors per month. Let us assume your average JS library size
       | that can be cached and outsourced totals 400 kilobytes (that's on
       | the low end).
       | 
       | 1 million * 400 kilobytes = 400 gigabytes
       | 
       | For AWS the entry level tier per gigabyte is $0.15/GB.
       | 
       | That is a whopping $60 USD per MONTH that you can save by running
       | your scripts on a CDN.
       | 
       | So, yeah.
        
         | sergeykish wrote:
         | And someone is paying that...
        
           | eeZah7Ux wrote:
           | The users are paying for it with their privacy.
           | 
           | CDN providers are not charities: they sell traffic metadata.
        
             | sergeykish wrote:
             | that's what I meant, a million data points cost $60
        
         | heartbeats wrote:
         | Why are you serving 400kb of JavaScript? Why are you paying
         | $0.15/GB?
         | 
         | These numbers are certainly not on the low end.
        
           | quickthrower2 wrote:
           | I assume that it's "for the sake of argument"
           | 
           | https://www.ldoceonline.com/dictionary/for-the-sake-of-
           | argum...
           | 
           | Also it's a bit like a proof. By assuming worse numbers, and
           | arriving at $60, your better numbers will only prove the
           | point even more.
        
         | quickthrower2 wrote:
         | I don't think it is done to save money. I reckon it's done
         | because the MVP tutorial they used when they got started used
         | the CDN. Or they are using Wordpress and the theme decided to
         | use it (... I just checked mine as I recently changed the
         | theme, luckily it doesn't do this). It might also be done
         | because people don't care. If it ain't broke don't fix it.
        
       | jaimex2 wrote:
       | npm, bower etc more or less killed this pattern I thought?
        
         | user_501238901 wrote:
         | You'd be surprised how many new websites out there are still
         | made by importing JS from a random CDN, no version control of
         | course, code is uploaded manually through FTP
        
           | jopsen wrote:
           | I think we should be amazed at how far amateurs can go :)
           | 
           | In a world where all websites have to be perfect, we would
           | have far fewer websites.
        
         | quickthrower2 wrote:
         | We need another new competing module system to screw up all the
         | CDNS!
        
         | mistrial9 wrote:
         | What does Wordpress do ?
        
           | XCSme wrote:
           | I think this is like asking "What would Jesus do?", but with
           | a worse role-model.
           | 
           | The problem with WordPress is a lot bigger with the
           | Themes/Plugins than with the platform itself.
        
         | a_humean wrote:
         | During code review we just discovered that a somewhat popular
         | npm library used a CDN for its assets (images of flags
         | corresponding to country ISO codes), forcing us to reimplement
         | without the library.
        
       | basilgohar wrote:
       | Am I the only one that feels using CDNs comes with very little
       | benefit compared to just hosting all resources locally? Almost
       | all security issues go away with locally hosted resources (i.e.,
       | same domain or another domain controlled by the website), plus you
       | avoid an extra DNS lookup, and you still get caching on the same
       | site.
       | 
       | I realize there are benefits, but are the benefits so extreme
       | that they merit all the hype around CDNs? So many developers talk
       | about them like the web would grind to a halt if they were
       | stopped, but I think that they have their own slippery-slope of
       | problems that has resulted in folks just hand-waving away the
       | expense of web assets since it's a problem hidden from the
       | developer. I doubt actual bytes transferred and latency are
       | affected as significantly as folks who promote CDNs claim.
       | 
       | It'd be nice to see actual comparisons in a real world scenario.
       | Keep in mind, web site responsiveness is not just linked to
       | download time/size, but also asset processing. If your page is
       | blocking because the JS is still being parsed, the time you saved
       | downloading it is moot.
        
         | x86_64Ubuntu wrote:
         | As a scrub in this domain: when I use a CDN, I get a network
         | that is cheaper per-GB at delivering content, and better at it.
         | Far more resources in a CDN are configured to deliver content
         | faster and cheaper than whatever rinky-dink EC2 instance I've
         | got serving the website.
        
           | heartbeats wrote:
           | Just how much JavaScript do you have to use for it to make it
           | a significant difference? If you serve 100k of JS (please
           | don't), at 1 million hits per month you're looking at 100GB.
           | Even at EC2's obscene pricing, this is still just $10/month.
           | 
           | If you're that stingy, why aren't you just putting the entire
           | website behind CloudFlare and calling it a day?
        
             | MrStonedOne wrote:
             | I have a js file that is over 900kb (nearing a MB).
             | 
             | Granted, it's for use in embedded webviews of a video game,
             | and it powers like 50 different atomic interfaces via
             | react.
        
             | [deleted]
        
           | alynn wrote:
           | I wasn't aware that CDNs were much cheaper, and am genuinely
           | curious what service you are using at what prices.
           | 
           | When I look at AWS pricing, us-east-1 at < 10TB, I see: EC2
           | data transfer out to the internet is 9¢/GB, and CloudFront
           | is 8.5¢/GB for the lowest price class. That's a slight
           | savings, but at 6% I can't justify the effort to switch over
           | on cost alone.
           | 
           | Should I be looking at a different CDN service?
        
             | coder543 wrote:
             | Cloudflare famously charges $0/GB, with some arbitrary
             | restrictions on the way you use their service, and I've
             | heard rumors of soft limits on the total amount of
             | bandwidth you can use before they email you to upgrade to a
             | higher plan.
             | 
             | I've never used BunnyCDN, but they charge a flat $0.01/GB
             | for North American traffic, and I've heard some good things
             | about them.
             | 
             | DigitalOcean, Vultr, Linode, and some other cloud providers
             | charge $0.01/GB without a CDN, just using their regular
             | servers, but obviously a CDN is more than just a way to
             | save money -- it's a way to lower latency and improve user
             | experience.
             | 
             | The mega clouds (AWS, Azure, and GCP) seem to significantly
             | over-charge for egress bandwidth as a nice profit
             | mechanism, just because they can.
             | 
             | My unpopular opinion is that mega clouds are overrated.
             | They're fine, but they have a lot of weird gotchas that
             | most people have just accepted as "how the cloud works."
        
         | dehrmann wrote:
         | Remember that the JS CDN thing started 10+ years ago when
         | internet connections were a lot slower and jQuery was _the_ JS
         | framework and only released one new version per year. That
         | means there's a really good chance another website also used
         | it from the Google CDN, and your browser cached it forever-
         | ish, no DNS lookup needed.
         | 
         | The other thing to remember is that old browsers used to cap
         | the number of connections per domain, and HTTP/1.1 only
         | supports serial requests per connection, so there were benefits
         | to hosting on multiple domains.
         | 
         | Even today, the big benefit of a CDN domain is that CDNs can
         | host static resources faster and cheaper than your webservers.
         | Yes, you can forward requests from the CDN to your webservers,
         | but it's also one more point of failure. What's interesting is
         | that with a modern, JS-only site, the split becomes API and
         | static JS.
        
         | old-gregg wrote:
         | Yes, the benefits are substantial. I used to have a static web
         | site served out of a Dallas data center, with a great network
         | connection, from a powerful bare metal machine that did nothing
         | else. Sitting in Austin, it felt instant, equal to localhost
         | speed, until I tried accessing it while traveling, especially
         | overseas.
         | 
         | It's not just geographic latency you're addressing with a CDN,
         | you're also reducing the number of network hops. It's not
         | uncommon to experience higher latency going from SF to San Jose
         | datacenter just because you're on a "wrong" ISP. A good CDN
         | usually has a POP on the same network as you.
        
           | esperent wrote:
           | I live in Vietnam and frequently give tech advice to people
           | regarding websites they are developing. Most sites on the web
           | work fine, but I can tell straight away when someone has
           | hosted their site like you have described - because it will
           | take >5 minutes to load! And they'll be telling me that it's
           | fine, and I have to explain, yes, it's fine if you only want
           | people in the same country as you to access the site. If you
           | want it to work worldwide, you'll have to do better.
           | 
           | Fortunately, these days that just means creating a free
           | Cloudflare account.
        
       | pdevr wrote:
       | I believe a site should load all JavaScript files from its own
       | domain if it can. There are problems, but they have solutions
       | too. As an example:
       | 
       | Problem: You can use a home-grown tracking solution, but, if the
       | site has to be sold, how do you provide independent third party
       | verification of your traffic details? Having Google Analytics
       | helps you to do that.
       | 
       | Solution: Install and configure AWStats or Webalizer. Many lower
       | tier hosts have these built into their offerings. Provide those
       | statistics to interested buyers.
        
       | softwaredoug wrote:
       | I recall being regularly surprised at how infrequently CDN-hosted
       | JavaScript libraries were cached locally. In my testing, even
       | something popular like jQuery or Bootstrap has enough different
       | versions and CDN hosts that it wouldn't be cached ~50% of the
       | time.
       | 
       | Perhaps what's really needed is some kind of browser based
       | package versioning and management system? Where your CDN's jQuery
       | 1.10 is seen as a mirror of mine, with perhaps even some signing
       | and authority of libraries not directly hosted on the host
       | server? (I'm probably talking about something that already exists
       | or has been discussed I'm sure...)
        
       | ignoramous wrote:
       | > _British Airways' payments page was hacked by compromised 3rd
       | party Javascript. A malicious user changed the code on site which
       | wasn't in BA's control - then BA served it up to its customers._
       | 
       | That's the only reason you should ever need to not load any 3p
       | javascript including google-analytics.
       | 
       | If you're like me and want to avoid loading popular javascript
       | libraries from CDNs but want those webpages to work, get the
       | Decentraleyes plugin: https://en.wikipedia.org/wiki/Decentraleyes
        
         | Spivak wrote:
         | Or at least do subresource integrity so you know you're getting
         | the exact same file every time.
        
           | rgbrenner wrote:
           | That can mitigate the damage, but any user with a browser
           | that doesn't support it will still be compromised.
           | 
           | If it's an important part of the site, it might make the
           | failure more obvious in newer browsers.. but small libraries
           | used on only some pages might not be noticed quickly... so
           | you'd probably also want to test all of your resources
           | regularly.. and even then the time between those test runs
           | may allow some users to be compromised.
           | 
           | So using it is a good idea but it's not a fix for the actual
           | problem.
        
             | maple3142 wrote:
             | https://caniuse.com/subresource-integrity
             | 
             | It is basically available in every browser except IE and
             | Opera Mini, so I think it is the user's problem if they use
             | an old browser that doesn't support a widely supported
             | security feature.
        
               | emn13 wrote:
               | Might as well simply block IE users outright at this
               | point; it's just not worth the risk (and classic edge is
               | close). It's probably better user experience to be
               | upfront about issues than pretend you actually test and
               | support all those old versions (unless you do... but
               | why?)
        
               | rgbrenner wrote:
               | Your own link says 94.79% coverage. So 1 in 20 users
               | would be compromised.. on a large site that could be
               | millions of users.
               | 
               | And your response is: _" that's their problem"_ ??
               | 
               | I hope you're not in charge of any important or large
               | sites or anything that handles financial data (ecommerce,
               | etc)... because this isn't a good attitude when it comes
               | to security.
        
               | [deleted]
        
         | hn_throwaway_99 wrote:
         | Except this article and that quote are wrong. If you look at
         | the linked report on the British Airways hack,
         | https://www.riskiq.com/blog/labs/magecart-british-airways-
         | br..., you'll see that the compromised script was hosted on
         | www.britishairways.com. It wasn't a third party CDN hack, their
         | own CMS was simply compromised.
         | 
         | I kept reading this article looking for an actual decent reason
         | to not use a third party CDN, and I never found one. In fact,
         | the _right_ answer is really that you should always set up your
         | CSP and subresource integrity rules to completely prevent these
         | kinds of attacks, whether from an unintended script injection
         | from your own domain or a 3rd party.
        
       ___________________________________________________________________
       (page generated 2020-10-11 23:00 UTC)