hngopher.com

       [HN Gopher] How Websites Die
       ___________________________________________________________________
        
       How Websites Die
        
       Author : herbertl
       Score  : 75 points
       Date   : 2022-06-27 08:00 UTC (2 days ago)
        
 (HTM) web link (notebook.wesleyac.com)
 (TXT) w3m dump (notebook.wesleyac.com)
        
       | sparcpile wrote:
       | There have been some attempts to capture sites that became
       | ghosts or disappeared completely. From the dotcom bubble
       | through 2008, Steve Baldwin (co-author of Netslaves) had that as
       | a side project.
       | 
       | https://www.disobey.com/ghostsites/
       | 
       | Our Incredible Journey still does this, with more snark and
       | humor.
       | 
       | https://ourincrediblejourney.tumblr.com/
        
       | nonrandomstring wrote:
       | A very nice read with some poignant thoughts and quotes on
       | digital impermanence. Life on the Internet is nasty, brutish and
       | short.
        
       | paulgb wrote:
       | > you may attempt to archive it, but should you wish to avoid
       | sadness down the line, you should accept now in your heart that
       | all archives will eventually succumb to the sands of time.
       | 
       | Enjoyed this.
        
         | bombcar wrote:
         | You could etch your data and website onto copper plates and
         | launch them into the depths of space; this will likely last
         | nearly until the heat-death of the universe!
        
           | asddubs wrote:
           | and simultaneously it will be lost immediately
        
             | saagarjha wrote:
             | And here we see the difference between available and
             | accessible :)
        
       | [deleted]
        
       | doodles33 wrote:
       | Unstoppable domains solve this problem by making domain purchases
       | permanent and stored on a decentralized blockchain - although
       | that brings some problems of its own - and IPFS solves that by
       | not requiring that a central server stays online to serve that
       | content, although this does require that _someone_ be interested
       | in serving the website at all.
        
         | rchaud wrote:
         | This article isn't about websites going offline. It's about
         | websites dying of 'natural causes', meaning the author or site
         | manager lost interest in updating and maintaining the site.
         | 
         | There's much more to the soul of a website than the datacenter
         | where it's hosted. I doubt if the handful of mirror.xyz
         | 'decentralized' sites I see popping up will still exist in 5
         | years' time.
        
       | lazyjeff wrote:
       | I've been working on this problem for a while. Website upkeep is
       | hard to quantify, but basically every disk fails and every
       | operating system eventually needs a serious upgrade. The
       | timeframe that a system can run continuously is not that long
       | compared to the timeframe that information is relevant. So the
       | most lightweight way to keep something up and running is to make
       | it trivial to port to many hosting configurations by simplifying
       | the toolchain needed to rehost it. (Note that humans are part of
       | that workflow, if it's a company)
       | 
       | I've written a manifesto about making a commitment to keep
       | websites online and maintained for 10-30 years, for people who
       | are maintaining web content:
       | https://jeffhuang.com/designed_to_last/
       | 
       | And on the flipside (from a user's point of view), I've also been
       | working on a background process that automatically captures full-
       | resolution screenshots of every website you visit, creating your
       | own searchable personal web archive: https://irchiver.com/
       | 
       | I've personally been trying to make a commitment to keep my web
       | projects and writing online for 30 years. My original internal
       | goal when I started thinking about this, was to outlast all the
       | content on Twitter, Google+, and facebook.com. One of those has
       | already been met, kind of sadly.
        
         | 10000truths wrote:
         | It's not uncommon to hear of rack servers with several years of
         | continuous uptime. I wouldn't be surprised if you could keep a
         | website online for a decade without touching anything by using
         | an LTS distro, enabling unattended upgrades, and running
         | something like nginx.
        
         | EddieDante wrote:
         | > I've also been working on a background process that
         | automatically captures full-resolution screenshots of every
         | website you visit, creating your own searchable personal web
         | archive:
         | 
         | How are screenshots searchable? They aren't plain text. You
         | can't grep them.
        
           | anonymoushn wrote:
           | The tool captures screenshots in addition to text.
        
             | doodles33 wrote:
             | That seems awfully similar to archive.org's Wayback Machine.
             | I do like to see all these archival projects though, they
             | are certainly worthwhile.
        
               | lazyjeff wrote:
               | irchiver captures text on the page, and separately OCRs
               | the screenshots (specifically, the screenshot from your
               | viewport). So you can search just what was shown on the
               | page, or what was in the page. Both techniques have pros
               | and cons.
               | 
               | While archive.org is fantastic, it can only capture pages
               | that are both 1) publicly accessible (i.e. no social
               | media content) that it happens to crawl, and 2) static
               | content (you're out of luck if the content you want is
               | loaded dynamically, or changes depending on user input).
        
           | Jaxan wrote:
           | Why not, though? On macOS you can select text in bitmaps
           | nowadays. So the tech is there to make a grep for pictures.
        
         | SoftTalker wrote:
         | I think it's less a technological problem and more just that
         | everyone who used to care about that site or its content no
         | longer does. Or the company behind it has gone out of business
         | -- who is going to maintain a website for a defunct
         | organization, and why would anyone want to?
         | 
         | There's no obligation for a person to maintain anything longer
         | than he wants to do it. Putting a blog online is not a lifetime
         | commitment. Interests change, or you simply realize nobody
         | much cares about your online musings, and you move on to other
         | interests.
        
           | gowld wrote:
           | But what can you do when you _do_ care, to make your website
           | as durable as a printed book?
        
             | hypertele-Xii wrote:
             | You could literally print it in a book. With links suffixed
             | by a page number [915].
        
             | jl6 wrote:
             | Point archive.org at it (and make a donation).
        
             | bombcar wrote:
             | Depends on if you want it to survive _without_ you
             | maintaining it. If so, something like a bog-standard
             | Wordpress blog _hosted by them_ might work until they
             | decide they don't want to bother anymore.
             | 
             | Otherwise, some setup using S3-as-a-website or GitHub pages
             | may work, but those also depend on the company maintaining
             | that service.
             | 
             | If your entire website can be dropped into a ZIP file and
             | served anywhere, you have a greater chance in it surviving,
             | especially if the internet archive got a copy at some
             | point.
             | 
             | But if you die and _nobody cares_ about the content,
             | eventually it will disappear.
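
A minimal sketch of the "drop the whole site into a ZIP" idea, assuming the site is a directory of static files. The function name `pack_site` and both paths are hypothetical, and the archive is assumed to live outside the site directory so it does not end up inside itself.

```python
import zipfile
from pathlib import Path


def pack_site(site_dir: str, archive_path: str) -> list[str]:
    """Write every file under site_dir into a ZIP, keeping relative paths.

    Returns the archive member names in the order they were written.
    Assumes archive_path is outside site_dir.
    """
    root = Path(site_dir)
    # Collect the files first so the listing is fixed before the archive exists.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    names = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            # Forward-slash relative names keep the archive portable.
            name = path.relative_to(root).as_posix()
            zf.write(path, arcname=name)
            names.append(name)
    return names
```

Unpacking the result into the document root of any static web server should be enough to rehost the site, provided it has no server-side code.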
        
               | ElectricalUnion wrote:
               | > If your entire website can be dropped into a ZIP file
               | and served anywhere, you have a greater chance in it
               | surviving, especially if the internet archive got a copy
               | at some point.
               | 
               | Well, your entire website can be a zip:
               | https://redbean.dev/
        
             | SoftTalker wrote:
             | Keep it as simple as possible. Static HTML that can be
             | dropped on any web server.
             | 
             | HTML/HTTP may some day be obsolete, but will likely be
             | around longer than anything built on complicated javascript
             | frameworks or tightly tied to current web browser
             | technologies.
        
           | dybber wrote:
           | When a company goes out of business you will typically also
           | try to sell off all its assets, and a good domain name might
           | be such an asset.
        
         | anonymoushn wrote:
         | irchiver seems incredible. I hope there's a comparable product
         | for other OSes one day.
        
           | ryanfox wrote:
           | I've been working on a very similar thing which runs on
           | Windows, Mac, and Linux: https://apse.io
        
       | jaytaylor wrote:
       | > I've often thought about getting together with some friends to
       | pay into a fund to house our websites after we die. I don't think
       | setting that up would be too hard -- the math around insurance
       | policies of this sort is quite simple -- I mostly haven't tried
       | to set something like this up just since it's a pretty morbid
       | ask. But, if you'd be interested, maybe reach out to me?
       | 
       | > Our ghosts could live forever, if we help each other.
       | 
       | I love this idea and would gladly assist in the effort, let's set
       | it up :)
        
       | fleddr wrote:
       | I'd like to call out the somewhat related problem of website rot.
       | Meaning, the website is online, it once worked perfectly, but
       | becomes increasingly dysfunctional due to technical deprecations.
       | 
       | The soft obligation to use HTTPS these days has deranked old
       | HTTP-only websites in search, making them hard to find. These
       | websites are also "defaced" with browser warnings or some
       | subresources may not load at all.
       | 
       | Embedded maps no longer work, since Google regularly breaks their
       | API.
       | 
       | Facebook login or other FB plugins no longer work, since it needs
       | a yearly checkup of your account and there's the new requirement
       | of needing to have a privacy policy.
       | 
       | Those are just some examples of websites partially breaking
       | through no fault of their creators, if you'd agree that the web
       | should be backwards-compatible.
        
         | asddubs wrote:
         | also, even if those older http sites get a certificate, any
         | embedded scripts that point at http URLs, even if those URLs
         | are also available on https, will not load and break.
         | Especially hard to fix if you have user generated HTML on
         | there.
         | 
         | Same for http download links from https websites, will not work
         | anymore.
         | 
         | anything that used cookies on embedded resources will also be
         | broken because of the missing samesite header (same for third
         | party cookies obviously)
         | 
         | if google really goes through with removing alert/prompt/etc
         | from their browser like they said they were planning to a while
         | back, so so many things will break
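
A rough sketch of one way to repair the embedded-http problem in stored HTML, assuming every referenced host serves the same path over https (not true in general, so results would need checking). The function name `upgrade_embedded_urls` is hypothetical, and a regex is a blunt tool for HTML: it also rewrites ordinary `href` links, not only scripts and images.

```python
import re

# Matches src="http://..." or href="http://..." with single or double quotes.
_HTTP_ATTR = re.compile(r"""\b(src|href)\s*=\s*(["'])http://""", re.IGNORECASE)


def upgrade_embedded_urls(html: str) -> str:
    """Rewrite http:// to https:// inside src and href attributes only.

    Assumption: the target hosts answer on https at the same path.
    Plain-text mentions of http:// URLs are left untouched.
    """
    return _HTTP_ATTR.sub(
        lambda m: f"{m.group(1)}={m.group(2)}https://", html
    )
```

For user-generated HTML this would have to run over every stored post, which is part of why the fix is hard in practice.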
        
           | EamonnMR wrote:
           | Notification prompt is 10x more annoying than alert ever was.
        
             | [deleted]
        
           | fleddr wrote:
           | I'm uniquely bothered by this problem as I visit many such
           | dysfunctional sites.
           | 
           | I'm active in the (hobbyist) field of documenting species.
           | There's thousands upon thousands of websites created by
           | amateurs containing unique niche content. For example,
           | somebody might have made it their lifelong hobby to document
           | every species of bee in their territory.
           | 
           | It's a fragmented mess of incompetently produced websites,
           | but I find it incredibly charming and in the spirit of the
           | original web. Above all, it is their content that has lasting
           | value.
           | 
           | The people behind it are good, generous. That's why it makes
           | me so angry when their work is cast aside like this. Things
           | not just technically breaking, but in many cases simply
           | disappearing altogether from search results.
        
       | sshine wrote:
       | My first domain "expired" in an unexpected way after 18 years. I
       | got a .eu.org because I believed that eu.org would be more stable
       | than a commercial provider. I used the same not-for-profit DNS
       | provider until they were commercially acquired and the parent
       | company shut down the old nameservers.
       | 
       | Now I'm locked out: eu.org does not respond to inquiries, and my
       | account predates the auth system. While my phone number is the
       | same, auth reset does not work with phone.
       | 
       | It would have been fun to retain the same domain forever, but
       | stuff breaks, people die, and things crumble.
        
       | spc476 wrote:
       | It comes down to the person running the website _has_ to care.
       | That's it. It doesn't matter how simple it is if the person
       | doesn't care.
       | 
       | In my own case, I've been running my own website for 24 years now
       | [1]. The URLs I started out with have remained the same (although
       | some have gone, and yes, I return 410 for those) and the
       | technology hasn't changed much either (it was Apache 24 years
       | ago, it's still Apache today; my blog engine [2] was a C-based
       | CGI program, and it's still a C-based CGI program). The rest of
       | the site is static, and there's no Javascript (except for one
       | page). I can see it lasting at least six more years, and probably
       | more. But I care.
       | 
       | [1] Started out on a physical server (an AMD 586) and a few years
       | later on a virtual server.
       | 
       | [2] https://github.com/spc476/mod_blog
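
A small sketch of the "return 410 for removed URLs" practice mentioned above, written as a status lookup rather than as Apache configuration. This is not the commenter's setup: the paths in `GONE_PATHS` and the function `status_for` are hypothetical, chosen only to show the distinction between 410 Gone (deliberately removed) and 404 Not Found (never existed or unknown).

```python
from http import HTTPStatus

# Hypothetical list of URLs that were removed on purpose.
GONE_PATHS = {"/old-guestbook.html", "/1999/links.html"}


def status_for(path: str, existing: set[str]) -> HTTPStatus:
    """Pick the HTTP status for a request path.

    410 tells clients and crawlers the page was removed intentionally;
    404 is kept for paths the site knows nothing about.
    """
    if path in GONE_PATHS:
        return HTTPStatus.GONE
    if path in existing:
        return HTTPStatus.OK
    return HTTPStatus.NOT_FOUND
```

Keeping the removed-URL list in one place means it can move with the site when the server software changes.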
        
         | remus wrote:
         | > It comes down to the person running the website has to care.
         | 
         | Personally I think it is a little more nuanced, in particular I
         | think the relationship between how much someone cares and how
         | much effort is required to keep the website online is what
         | matters.
         | 
         | If your website is super simple you don't need to care about it
         | very much (though you do need to care at least a little bit).
         | On the other hand, if everyone who works at google suddenly
         | quit tomorrow because they didn't care, the house of cards would
         | fall over very quickly because it's a lot of work maintaining
         | millions of servers.
        
       ___________________________________________________________________
       (page generated 2022-06-29 23:00 UTC)