[HN Gopher] How Websites Die
___________________________________________________________________

How Websites Die

Author : herbertl
Score : 75 points
Date : 2022-06-27 08:00 UTC (2 days ago)

(HTM) web link (notebook.wesleyac.com)
(TXT) w3m dump (notebook.wesleyac.com)

| sparcpile wrote:
| There have been some attempts to capture sites that became
| ghosts or disappeared completely. During the dotcom bubble
| through 2008, Steve Baldwin (co-author of Netslaves) had that as a
| side project.
|
| https://www.disobey.com/ghostsites/
|
| Our Incredible Journey still does this, with more snark and
| humor.
|
| https://ourincrediblejourney.tumblr.com/
| nonrandomstring wrote:
| A very nice read with some poignant thoughts and quotes on
| digital impermanence. Life on the Internet is nasty, brutish and
| short.
| paulgb wrote:
| > you may attempt to archive it, but should you wish to avoid
| sadness down the line, you should accept now in your heart that
| all archives will eventually succumb to the sands of time.
|
| Enjoyed this.
| bombcar wrote:
| You could etch your data and website onto copper plates and
| launch them into the depths of space; this will likely last
| near until the heat-death of the universe!
| asddubs wrote:
| and simultaneously it will be lost immediately
| saagarjha wrote:
| And here we see the difference between available and
| accessible :)
| [deleted]
| doodles33 wrote:
| Unstoppable domains solve this problem by making domain purchases
| permanent and stored on a decentralized blockchain - although
| that brings some problems of its own - and IPFS solves that by
| not requiring that a central server stays online to serve that
| content, although this does require that _someone_ be interested
| in serving the website at all.
| rchaud wrote:
| This article isn't about websites going offline. It's about
| websites dying of 'natural causes', meaning the author or site
| manager lost interest in updating and maintaining the site.
|
| There's much more to the soul of a website than the datacenter
| where it's hosted. I doubt if the handful of mirror.xyz
| 'decentralized' sites I see popping up will still exist in 5
| years' time.
| lazyjeff wrote:
| I've been working on this problem for a while. Website upkeep is
| hard to quantify, but basically every disk fails and every
| operating system eventually needs a serious upgrade. The
| timeframe that a system can run continuously is not that long
| compared to the timeframe that information is relevant. So the
| most lightweight way to keep something up and running is to make
| it trivial to port to many hosting configurations by simplifying
| the toolchain needed to rehost it. (Note that humans are part of
| that workflow, if it's a company)
|
| I've written a manifesto about making a commitment to keep
| websites online and maintained for 10-30 years, for people who
| are maintaining web content:
| https://jeffhuang.com/designed_to_last/
|
| And on the flipside (from a user's point of view), I've also been
| working on a background process that automatically captures
| full-resolution screenshots of every website you visit, creating your
| own searchable personal web archive: https://irchiver.com/
|
| I've personally been trying to make a commitment to keep my web
| projects and writing online for 30 years. My original internal
| goal when I started thinking about this was to outlast all the
| content on Twitter, Google+, and facebook.com. One of those has
| already been met, kind of sadly.
| 10000truths wrote:
| It's not uncommon to hear of rack servers with several years of
| continuous uptime. I wouldn't be surprised if you could keep a
| website online for a decade without touching anything by using
| an LTS distro, enabling unattended upgrades, and running
| something like nginx.
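The point above about keeping the rehosting toolchain small can be illustrated with a minimal sketch. It assumes the site is a plain directory of static files and uses only the Python standard library; the function name `make_server`, its parameters, and the `./public` directory are hypothetical choices for illustration, not anything named in the thread. The standard-library server is meant for local checks rather than production hosting, where something like nginx would normally sit in front.

```python
"""Serve a directory of static files using only the Python standard library."""
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(site_dir: str, host: str = "127.0.0.1", port: int = 8000) -> ThreadingHTTPServer:
    """Build (but do not start) a server that serves files from site_dir.

    site_dir, host and port are illustrative parameters, not taken from the thread.
    """
    # Bind the handler to the site directory instead of the current working directory.
    handler = partial(SimpleHTTPRequestHandler, directory=site_dir)
    return ThreadingHTTPServer((host, port), handler)


if __name__ == "__main__":
    # "./public" is a hypothetical directory holding the exported site.
    server = make_server("./public")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        server.server_close()
```

If a site can be served this way, moving it to another host is mostly a matter of copying the directory.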
| EddieDante wrote:
| > I've also been working on a background process that
| automatically captures full-resolution screenshots of every
| website you visit, creating your own searchable personal web
| archive:
|
| How are screenshots searchable? They aren't plain text. You
| can't grep them.
| anonymoushn wrote:
| The tool captures screenshots in addition to text.
| doodles33 wrote:
| That seems awfully similar to archive.org's Wayback Machine.
| I do like to see all these archival projects though, they
| are certainly worthwhile.
| lazyjeff wrote:
| irchiver captures text on the page, and separately OCRs
| the screenshots (specifically, the screenshot from your
| viewport). So you can search just what was shown on the
| page, or what was in the page. Both techniques have pros
| and cons.
|
| While archive.org is fantastic, it can only capture pages
| that are both 1) publicly accessible (i.e. no social
| media content) that it happens to crawl, and 2) static
| content (you're out of luck if the content you want is
| loaded dynamically, or changes depending on user input).
| Jaxan wrote:
| Why not, though? On macOS you can select text in bitmaps
| nowadays. So the tech is there to make a grep for pictures.
| SoftTalker wrote:
| I think it's less a technological problem and more just that
| everyone who used to care about that site or its content no
| longer does. Or the company behind it has gone out of business
| -- who is going to maintain a website for a defunct
| organization, and why would anyone want to?
|
| There's no obligation for a person to maintain anything longer
| than he wants to do it. Putting a blog online is not a lifetime
| commitment. Interests change, or you simply realize nobody
| much cares about your online musings, and you move on to other
| interests.
| gowld wrote:
| But what can you do when you _do_ care, to make your website
| as durable as a printed book?
| hypertele-Xii wrote:
| You could literally print it in a book.
| With links suffixed by a page number [915].
| jl6 wrote:
| Point archive.org at it (and make a donation).
| bombcar wrote:
| Depends on if you want it to survive _without_ you
| maintaining it. If so, something like a bog-standard
| WordPress blog _hosted by them_ might work until they
| decide they don't want to bother anymore.
|
| Otherwise, some setup using S3-as-a-website or GitHub Pages
| may work, but those also depend on the company maintaining
| that service.
|
| If your entire website can be dropped into a ZIP file and
| served anywhere, you have a greater chance of it surviving,
| especially if the Internet Archive got a copy at some
| point.
|
| But if you die and _nobody cares_ about the content,
| eventually it will disappear.
| ElectricalUnion wrote:
| > If your entire website can be dropped into a ZIP file
| and served anywhere, you have a greater chance of it
| surviving, especially if the Internet Archive got a copy
| at some point.
|
| Well, your entire website can be a zip:
| https://redbean.dev/
| SoftTalker wrote:
| Keep it as simple as possible. Static HTML that can be
| dropped on any web server.
|
| HTML/HTTP may some day be obsolete, but will likely be
| around longer than anything built on complicated JavaScript
| frameworks or tightly tied to current web browser
| technologies.
| dybber wrote:
| When a company goes out of business you will typically also
| try to sell off all its assets, and a good domain name might
| be such an asset.
| anonymoushn wrote:
| irchiver seems incredible. I hope there's a comparable product
| for other OSes one day.
| ryanfox wrote:
| I've been working on a very similar thing which runs on
| Windows, Mac, and Linux: https://apse.io
| jaytaylor wrote:
| > I've often thought about getting together with some friends to
| pay into a fund to house our websites after we die.
| I don't think setting that up would be too hard -- the math around
| insurance policies of this sort is quite simple -- I mostly haven't tried
| to set something like this up just since it's a pretty morbid
| ask. But, if you'd be interested, maybe reach out to me?
|
| > Our ghosts could live forever, if we help each other.
|
| I love this idea and would gladly assist in the effort, let's set
| it up :)
| fleddr wrote:
| I'd like to call out the somewhat related problem of website rot.
| Meaning, the website is online, it once worked perfectly, but
| becomes increasingly dysfunctional due to technical deprecations.
|
| The soft obligation to use HTTPS these days has deranked old
| HTTP-only websites in search, making them hard to find. These
| websites are also "defaced" with browser warnings or some
| subresources may not load at all.
|
| Embedded maps no longer work, since Google regularly breaks their
| API.
|
| Facebook login or other FB plugins no longer work, since it needs
| a yearly checkup of your account and there's the new requirement
| of needing to have a privacy policy.
|
| Those are just some examples of websites partially breaking
| through no fault of their creators, if you'd agree that the web
| should be backwards-compatible.
| asddubs wrote:
| also, even if those older http sites get a certificate, any
| embedded scripts that point at http URLs, even if those URLs
| are also available on https, will not load, so the page breaks.
| Especially hard to fix if you have user generated HTML on
| there.
|
| Same for http download links from https websites: they will not
| work anymore.
|
| anything that used cookies on embedded resources will also be
| broken because of the missing SameSite attribute (same for third
| party cookies obviously)
|
| if google really goes through with removing alert/prompt/etc
| from their browser like they said they were planning to a while
| back, so so many things will break
| EamonnMR wrote:
| Notification prompt is 10x more annoying than alert ever was.
| [deleted]
| fleddr wrote:
| I'm uniquely bothered by this problem as I visit many such
| dysfunctional sites.
|
| I'm active in the (hobbyist) field of documenting species.
| There are thousands upon thousands of websites created by
| amateurs containing unique niche content. For example,
| somebody might have made it their lifelong hobby to document
| every species of bee in their territory.
|
| It's a fragmented mess of incompetently produced websites,
| but I find it incredibly charming and in the spirit of the
| original web. Above all, it is their content that has lasting
| value.
|
| The people behind it are good, generous. That's why it makes
| me so angry when their work is cast aside like this. Things
| not just technically breaking, in many cases simply
| disappearing altogether from search results.
| sshine wrote:
| My first domain "expired" in an unexpected way after 18 years. I
| got a .eu.org because I believed that eu.org would be more stable
| than a commercial provider. I used the same not-for-profit DNS
| provider until they were commercially acquired and the parent
| company shut down the old nameservers.
|
| Now I'm locked out: eu.org does not respond to inquiries, and my
| account predates the auth system. While my phone number is the
| same, auth reset does not work with phone.
|
| It would have been fun to retain the same domain forever, but
| stuff breaks, people die, and things crumble.
| spc476 wrote:
| It comes down to the person running the website _has_ to care.
| That's it.
| It doesn't matter how simple it is if the person
| doesn't care.
|
| In my own case, I've been running my own website for 24 years now
| [1]. The URLs I started out with have remained the same (although
| some have gone, and yes, I return 410 for those) and the
| technology hasn't changed much either (it was Apache 24 years
| ago, it's still Apache today; my blog engine [2] was a C-based
| CGI program, and it's still a C-based CGI program). The rest of
| the site is static, and there's no JavaScript (except for one
| page). I can see it lasting at least six more years, and probably
| more. But I care.
|
| [1] Started out on a physical server (an AMD 586) and a few years
| later on a virtual server.
|
| [2] https://github.com/spc476/mod_blog
| remus wrote:
| > It comes down to the person running the website has to care.
|
| Personally I think it is a little more nuanced; in particular I
| think the relationship between how much someone cares and how
| much effort is required to keep the website online is what
| matters.
|
| If your website is super simple you don't need to care about it
| very much (though you do need to care at least a little bit).
| On the other hand, if everyone who works at Google suddenly
| quit tomorrow because they didn't care, the stack of cards would
| fall over very quickly because it's a lot of work maintaining
| millions of servers.
___________________________________________________________________
(page generated 2022-06-29 23:00 UTC)