[HN Gopher] Shining a light on the digital dark age
___________________________________________________________________
Shining a light on the digital dark age
Author : weird_science
Score : 119 points
Date : 2023-09-01 18:18 UTC (4 hours ago)
(HTM) web link (longnow.org)
(TXT) w3m dump (longnow.org)
| rrherr wrote:
| https://web.archive.org/web/20230901181858/https://longnow.o...
| Slava_Propanei wrote:
| [dead]
| izzydata wrote:
| I think there is something to be said about what is worth
| archiving. I don't know what it is that should be said though. It
| seems weird to me that as a society we might be saving things
| such as a 10 hour video of white noise for another 100 years
| rather than some personal blogs. What is and isn't worth saving?
| dzhiurgis wrote:
| IMO social media got a little better once temporary stories
| came out.
|
| Slack's limited search history is a feature too - forces you to
| document using appropriate tools and not endless email
| threads...
| NoZebra120vClip wrote:
| > Slack's limited search history
|
| I'm sure that I don't know what you mean. My employer is on a
| paid plan for Slack, and searches cover everything, as far
| back as I wish to go. Are you thinking of the limitations on
| the free license?
| esafak wrote:
| In my experience it makes people ask the same questions over
| and over.
| rolobio wrote:
| It will probably end up being the most popular things, the most
| viewed or read. More copies of it, more likely to be archived.
|
| This reminds me of books. I'm sure the majority of books from
| over a hundred years ago are lost because they weren't popular.
| We haven't really noticed their absence...
| AlbertCory wrote:
| > I'm sure the majority of books from over a hundred years
| ago are lost
|
| If they were in one of the university libraries that Google
| scanned, they're not "lost." But you're right; you can't read
| them.
| Congress should mandate that the Library of Congress,
| at least, get a copy to preserve them for the ages.
|
| Read the Atlantic article
|
| https://www.theatlantic.com/technology/archive/2017/04/the-t...
|
| for the sad story.
| _jal wrote:
| What's interesting there is how many authors of works we
| consider classics now only became popular well after their
| deaths.
|
| Kierkegaard, Thoreau, Dickinson and Melville, for instance.
|
| If their works had been lost, "we" probably wouldn't have
| noticed any of those absences either.
| aziaziazi wrote:
| When you publish a book or magazine in France you're required
| to give 2 copies to the national library for archival purposes.
| Doesn't something like that exist in other countries?
| debugnik wrote:
| It certainly does in Spain. We even extended it to
| videogames, although I don't know how much that achieves
| when so many games are barely playable before the first few
| patches, have much of their content released in future
| updates and many are unplayable after the servers close.
| grotorea wrote:
| Does that apply to all books, even if you have 10 copies
| printed and distribute them privately?
| Pxtl wrote:
| > It will probably end up being the most popular things, the
| most viewed or read. More copies of it, more likely to be
| archived.
|
| So, 10 hours of white noise yes, some person's personal blog
| where they poured their heart out, no.
|
| Beautiful.
| cj wrote:
| > I'm sure the majority of books from over a hundred years
| ago are lost
|
| Especially if you include independently published books that
| weren't widely circulated. I wonder what percentage of total
| books this is.
|
| My grandfather published a book before he passed away. It was
| never sold online or in any big retail stores. Once the last
| hard copy is lost, it's gone forever.
| colinsane wrote:
| > Once the last hard copy is lost, it's gone forever.
|
| i believe the Library of Congress will archive that book
| for you if you mail them a hard copy. assuming it has an
| ISBN, you're in the US, etc.
| cj wrote:
| That's a very good idea. Thanks for the tip.
| imtringued wrote:
| Absence? Technically we haven't even noticed their presence!
| rolobio wrote:
| Absolutely. I wonder what percent of tweets or blog posts
| are even seen by one human?
| kamel3d wrote:
| It would be very interesting for us to learn about something in
| ancient Egypt that is equivalent to white noise today. I think
| the issue of storage should be solved and made abundant. We
| should not be worrying about what to save, but rather what
| happens if we cannot save.
| ModernMech wrote:
| Depends on how many whales are left in 100 years.
| Almondsetat wrote:
| I think there is something to be said about what is worth
| archiving. I don't know what it is that should be said though.
| It seems weird to me that as a monastery we might be saving
| things such as a 10 volume satirical piece about a forgotten
| Greek tyrant for another 100 years rather than some personal
| thoughts of an Egyptian philosopher. What is and isn't worth
| saving?
| CuriouslyC wrote:
| AI is going to give a lot of that data that would have
| otherwise died eternal life. As the tech evolves, businesses
| will be able to monetize by selling data (for walled gardens)
| or their pages will all be scraped, cleaned up and resold by
| multiple orgs (for stuff on the open web).
| kbrannigan wrote:
| Most paper, pen and pencil will outlive most digital media. Most
| printed photos will outlive digital photos. Most CD, DVD, VHS
| will outlive cloud stored videos.
| CatWChainsaw wrote:
| And yet I suspect that digital information involving
| identification and tracking for the purpose of serving ads won't
| be lost, because there's money to be made and control to be
| exerted.
|
| Oh please. You'd have to be naive to think otherwise.
| jawns wrote:
| One area where I expect this digital rot to have significant
| effects is with obituaries. It will likely frustrate the next few
| generations of genealogists hunting for records of early 21st
| century ancestors.
|
| Obituaries that appeared in print newspapers during the 20th
| century were easily disseminated and decentrally archived
| (typically by loved ones and libraries), making them relatively
| rot-tolerant.
|
| Distribution isn't a problem for digital obituaries, and in many
| ways the web is better than print in this respect.
|
| But when it comes to preservation, there are many factors that
| make digital obits in their current state particularly
| susceptible to rot. They tend to be centrally archived and often
| behind paywalls, making them susceptible to digital rot and
| difficult for organizations acting in the public interest to
| archive.
|
| The for-profit company Legacy.com controls a strikingly large
| share of the market for digital obituaries. It partners with
| funeral homes and newspapers, and in many cases when a visitor
| browses obituaries on the website of a local newspaper or funeral
| home, they're actually redirected to Legacy.com, which hosts the
| content.
|
| Unfortunately, the newspapers and funeral homes themselves often
| don't maintain their own copies of the obituaries. What happens
| if Legacy.com or one of the smaller memorial sites goes out of
| business or experiences some sort of data loss? Because of the
| centralized nature of how these digital obituaries are stored,
| it's possible that very few other organizations will have
| archived copies of the content.
| numtel wrote:
| For these kinds of things that don't take a ton of data, a
| blockchain is a great place to store them for a long time for
| one upfront fee.
| causality0 wrote:
| What's preserved is also getting more and more sterile. Material
| that's now unfashionably rude or just plain unprofitable has a
| tendency to just be erased.
| I invite you to look through your
| Liked YouTube videos and witness how much of what you enjoyed has
| been taken from you forever.
|
| Consider that everything you witness has an expiration date and
| if you don't save it maybe nobody else will.
| grglburgp wrote:
| Attenuation of signal is how reality works.
|
| Generational churn means eventually no one who knows why the
| nested dolls were nested as they are will be left. Humans will
| create a new set of nested dolls they can grok.
|
| Paraphrasing Thomas Jefferson; clearly the dead do not rule the
| living.
|
| Edit; forgot this point... Maybe he said that; it's unverifiable
| for us. Maintenance of hallucination is all it ends up being.
|
| Reality we see, smell, hear, and touch is what we get. There's no
| violating physics. Let it be lost. It's going to happen anyway.
| HappMacDonald wrote:
| > Paraphrasing Thomas Jefferson; clearly the dead do not rule
| the living.
|
| Yeah but don't listen to him, he's dead.
|
| Hence we can safely conclude that the dead _do_ rule the
| living.
|
| Checkmate, Epimenides!
| chrisco255 wrote:
| The only reason you can paraphrase Thomas Jefferson is because
| his words were archived a long time ago, and the archives have
| survived all this time.
| grglburgp wrote:
| Sorry; I did not make my point clear. Were they his words or
| is the association made up?
|
| Without direct observation it's hearsay.
|
| What's the point of maintaining associations we can't verify?
| The truth that Jefferson stated such is only hallucination
| for us.
|
| The value is in the awareness of reality's mechanisms,
| not the association with Jefferson.
| kepano wrote:
| I like the idea of accepting transience, but if we lose
| important knowledge we return to superstition.
| xwdv wrote:
| We should think of digital information more as graffiti. It won't
| be around forever, I've seen most of the old web that I
| experienced as a child simply disappear, nothing but a memory
| now.
| Enjoy information while you have it, eventually it will be
| lost, like tears in the rain.
| duck wrote:
| Why is there a leading zero on all the years listed like 01980s?
| hollerith wrote:
| To encourage the reader to take the long view. That's the idea
| anyways.
| maverick2007 wrote:
| I've been noticing this a lot lately in one of my little hobbies
| of Halloween events. There's so much history from events in the
| early 2000s that has been lost. So many links in ancient web
| forums that 404 and aren't in the Internet Archive. Info from the
| 2010s is much better but still missing a lot of media. As an
| individual, I certainly don't have the resources of some of the
| organizations listed in this article but I've been trying to do
| what I can and mirror important sites. I dread seeing what the
| future looks like as more and more of the communities in my niche
| move onto Discord and away from more traditional web forums.
|
| ETA: The issue I face is two-fold. More pressingly, there are
| files just missing. Nothing I can do about that barring someone
| magically having them downloaded to an old PC. But also
| interesting is working with old formats. The 2000s/early 2010s
| Halloween Horror Nights websites were all written in Flash. They
| have tons of little Easter eggs and information for the event
| obsessives like me. Between fan backups and the Internet Archive,
| the files are pretty complete thankfully. But since Flash has
| been dead for a while, I have to rely on the Ruffle Flash
| emulator to get it running on the web. But that doesn't work
| super well. On my list of things to do is to contribute to the
| project to try to get some of the files working.
|
| In case anyone is curious about my backup of the event or if
| you're also a fan and have any of the files listed to share and
| archive, my site is hhncrypt.com!
| usea wrote:
| Just yesterday I went searching for a memorial page for someone
| who died in 2005.
| It's been maintained ever since then,
| and I was grateful to read through it again.
|
| Even a single-file website with 2 photos can be important to
| somebody. Thanks.
| myth_drannon wrote:
| On the other hand, most public text files from the '80s and '90s
| are still here.
| theragra wrote:
| My story is simple: I supported an important historical site, but
| due to war and personal issues I was not able to transfer money
| to the .ru zone registrar. I have the data, but the domain was lost.
| DoingIsLearning wrote:
| If it has historic value, you can upload the data for free to
| archive.org to avoid data loss.
|
| .
| alentred wrote:
| This reminds me of the "Life After People" television series
| [1]. In several episodes it presents possible or imagined
| scenarios in case of a sudden human removal, in various areas:
| cities, animal life, oceans, etc. Every single bit of modern life
| requires maintenance. They could have added an episode about the
| digital information :)
|
| [1] https://en.wikipedia.org/wiki/Life_After_People
| figassis wrote:
| Would it be a good idea to create something like the arctic vault
| for other info? What are the logistics of that?
| ChrisArchitect wrote:
| The way things are and have been progressing for a while now, this
| line sticks out in my head often:
|
| _If it happened before the internet, or during an early stage of
| it, or there isn't an archive of it somewhere online, did it
| really happen??_
| kepano wrote:
| I love to see more interest in the topic of digital continuity.
|
| My philosophy around this is "File over app" -- if you want to
| create digital artifacts that last, they must be files you can
| control, in formats that are easy to retrieve and read. Use tools
| that give you this freedom.
|
| In the fullness of time, the files you create are more important
| than the tools you use to create them. Apps are ephemeral, but
| your files have a chance to last.
|
| https://stephanango.com/file-over-app
| reidjs wrote:
| I agree with "files over apps." I used Google Docs for my
| personal notes until one day I had shoddy internet and realized
| how awful it is to lose access to something important that you
| wrote down.
|
| I now use iA Writer which allows you to save your notes in
| markdown from iPhone, syncs via iCloud, and then I can continue
| from my laptop. Admittedly the native Notes app has some better
| functionality around sharing and searching. However, Notes
| saves into a SQLite db, which would be fine, but it's not
| trivial to view the tables/schema in there.
| cobertos wrote:
| Just ran into this. Many companies I request my data from have
| retention periods and would rather be rid of it. Rare are a
| couple that do seem to keep _everything_ though
| mesozoic wrote:
| Couple hundred years and cataclysmic disasters and only L. Ron
| Hubbard writings will be left.
| imtringued wrote:
| Oh I know. There is an easy solution to this. Just sell the
| information for Bitcoin and use Bitcoin as a store of value and
| then in the future buy the information back using the store of
| value feature of Bitcoin.
|
| (Cough, someone still has to store the information.)
| pro-kythera wrote:
| Proof of Archive
| MrMattWright wrote:
| The site's serving me up a 500 internal server error - which you
| know - I guess fits the title :)
| rollcat wrote:
| Tangentially relevant: http://collapseos.org
| dappermanneke wrote:
| i don't think people realise just how much of data degradation in
| the age of everything being documented online is a good thing.
| nobody wants to see your cringe from 20 years ago. the internet
| is a conduit for culture and much of it has to be erased and
| rebuilt with every generation, such is its nature
| cardboard9926 wrote:
| Interesting how much data is being collected/stored nowadays, yet
| how fragile the storage for that data is.
| bobsmooth wrote:
| Imgur recently purged an untold number of pictures from their
| servers. How much knowledge has simply been discarded because it
| was too expensive to keep it?
| WesolyKubeczek wrote:
| Won't no one remember imageshack.us and, dare I say,
| RapidShare?
| ajsnigrutin wrote:
| I live in a small country with a weird language and with a lot of
| our local pop music...
|
| The new music is fine, everything is on YouTube.... for now! ...
| but the older songs are disappearing.
|
| Years ago, we had a bunch of "mp3 sites" (websites where you
| could download pirated mp3s), that disappeared, a huge local
| torrent tracker, focusing also on local stuff, that lost most of
| the data in the OVH datacenter fire and slowly died, and some
| stuff was put on YouTube 15 years ago, and then got copyright
| claimed 5 years ago, and was removed. The music groups don't
| exist anymore, so the CDs aren't published anymore, second hand
| shops are very rare and cater mostly to LPs and tourists, and the
| groups stopped existing before they could sign deals with the
| publisher for streaming services.
|
| So yeah, it's not just "those semi-personal photos taken at a
| party and kept by maybe one or two people", but also pop-music
| that used to be on the radio all the time in late 80s, early 90s,
| and is just..gone!
|
| I'm sure that there are data hoarders somewhere, that have mp3s
| of all those songs somewhere, but unless we get some p2p type of
| service like gnutella/ed2k/kazaa working again, I won't be able
| to find them anymore.
| espe wrote:
| not sure it helps but - soulseek is pretty alive
| OfSanguineFire wrote:
| I recently logged into Soulseek for the first time in about a
| decade, and I was unable to find lots of stuff that was
| commonly shared in the early millennium.
| It's no secret that
| the generation most interested in audio filesharing is
| graying, and as many people raise families and have less and
| less time for obsessive music collecting, they fall away from
| the scene.
| version_five wrote:
| When I was in university late 90s early 00s, my friends and I
| had all kinds of music and tv clips downloaded from p2p
| sharing, that I have since lost and can't find. Clips from
| shows, remixes of songs. Like you say, somebody must have them,
| but they're not findable anymore.
| RetroTechie wrote:
| _"I'm sure that there are data hoarders somewhere, that have
| mp3s of all those songs somewhere, but unless we get some p2p
| type of service like gnutella/ed2k/kazaa working again, I won't
| be able to find them anymore."_
|
| Such archives would need to be kept online though. If only
| archived (or hoarded), it's just data in a box that no-one can
| access.
|
| Current internet is really poor at handling this long tail. E.g.
| a movie can be streamed & torrented, millions see it, many have
| it in personal archives, but 1 year later the torrent swarm has
| died out, and those personal archives of the downloaders aren't
| online. And then the copyright holder pulls it from their streaming
| service.
|
| Result: nowhere to be found. Even though popular not long ago.
|
| Copies in personal archives don't count for much if others
| can't access that data.
| [deleted]
| noarchy wrote:
| Looks like the article in question is getting the hug of death.
| But information is being lost in droves even today. Entire
| YouTube channels can vanish overnight, for various reasons,
| taking years of content with them. Without any backups floating
| around out there this stuff is gone forever, and this is assuming
| we even know about any backups.
| nomel wrote:
| > Without any backups floating around out there this stuff is
| gone forever
|
| I think part of the problem is that most backup efforts result
| in lawsuits.
| Related, I wonder how much the data stores, like
| those that OpenAI used, have _preserved_, and I wonder how
| much they will purge from their servers, to be lost forever, as
| the lawsuits increase.
|
| For YouTube, it also has compression rot. It's _not_ a storage
| solution, as I've learned. All the videos I uploaded have
| slowly reduced in resolution, bitrate, and quality, over the
| years. Those that are more than a decade old have become a
| blurry mess that I can barely see, at a fraction of the
| resolution. I can't blame them. They don't really have views,
| so they're a money sink.
| RcouF1uZ4gsC wrote:
| I think the Library of Congress should be given a well funded
| mandate to create and maintain a digital archive of the entire
| publicly accessible internet.
|
| The government is probably already doing this with the NSA but we
| normal people can't access it.
| seydor wrote:
| Good. Every time we forget the old world, we reinvent it better
| gadflyinyoureye wrote:
| That is so true. Used to be that slaves knew they were slaves.
| We're much better at hiding that.
| izzydata wrote:
| "Those who forget history are condemned to repeat it" - George
| Santayana
| seydor wrote:
| Those who remember it keep repeating it alright
| chriswait wrote:
| If this wasn't true, we couldn't know it.
| discussDev wrote:
| Or we think we do, giving ourselves a much needed boost to ego
| and letting the cycle continue and keep everyone happy. Always
| assuming that we have made things "simple" because we know our
| way and not someone else's etc. That being said there's a lot
| of progress in the end but we take a lot of steps backwards to
| get there.
| wwweston wrote:
| Having forgotten the old world, how would we know?
| nemo wrote:
| >Every time we forget the old world, we reinvent it better
|
| Historically that has not been the case for most of human
| history.
| As odd as it might seem, in general the rediscovery of
| the accomplishments of the ancient world has been a great driver
| towards progress. The periods when the accomplishments of the
| past were lost and fully forgotten were the sorts of times
| people call Dark Ages.
| eep_social wrote:
| Are we currently in a Digital Dark Age because the pace of
| content creation has far outstripped our ability to preserve
| that content for posterity?
|
| Obviously we can't be sure how the current era will be viewed
| from the far future, but your comment made me realize that
| the current situation has similarities to that dark age.
| justrealist wrote:
| Actually the cities that are the most modern and
| progress-oriented today are the ones that felt comfortable
| bulldozing all relics of the past.
|
| (or the ones where the "old city" was destroyed in brutal
| urban warfare)
| nemo wrote:
| Responding to a statement about "for most of human history"
| with a reference to a very recent event isn't something you
| should begin with "actually" since it's not a reply to what
| I was saying, it's just your own tangent. Also, actually,
| even in the early modern period people were still looking
| to the past for inspiration even when they were reusing
| land - there's far more to the past than mere buildings.
| TeMPOraL wrote:
| You mean the cities that are most expensive and worst to
| live in, because they didn't have relics of the past
| putting brakes on the greed of real estate owners and
| developers?
| justrealist wrote:
| This is word salad.
| jrh3 wrote:
| Each generation thinks they are so much smarter than the
| previous.
| nomel wrote:
| It's not reinvented, it's iterated, knowingly or not. And, you
| can only move towards optimization by knowing your derivative.
| RecycledEle wrote:
| We have made a lot of progress in this field.
|
| Part of storing data is the cost per byte per year. That cost is
| ridiculously low on my home 52 TB workstation.
| (2 TB NVMe, 14 TB
| and 2x 18 TB HDDs)
|
| A backup copy is stored on 3x 18 TB HDDs in USB enclosures.
| forgotmypw17 wrote:
| I think it is important to focus on portability and archivability
| of our online spaces.
|
| Using today's technology, it is possible to allow exporting all
| content as basic file formats such as txt + zip.
|
| In addition, PKI (public key infrastructure) allows us to
| decouple a user's private and public identities, meaning the
| public identity can now be portable between servers.
|
| What does all this mean for the average user or community
| operator?
|
| It means your community can be completely transparent, auditable,
| and PORTABLE, allowing any user to archive the whole thing and
| clone it to another server -- right away or years later.
|
| I've been writing a framework for this type of system for several
| years now, and if you're curious, you know where to look.
|
| Thank you for coming to my TED talk.
| bottlepalm wrote:
| With AI we have a real chance of 'losing the past' as in we won't
| be able to tell fact from fiction. The bar to modify images,
| video, and text from the past will be so low that everyone will
| do it. And on top of that, coming autonomous agents will be
| creating and modifying information at rates we just won't be able
| to keep up with.
|
| I'd say we should sign everything we can now, anything created
| from 2023 onward is already suspected of being created by AI. The
| past will be 'erased' as well if there's no way to verify our
| historical information.
|
| For example, I create an AI photo of Frank Sinatra in an LA diner
| eating a sandwich and post it online - tell me how one can verify
| today that picture is legit or not. Who's the arbiter of all
| Frank Sinatra photos? How much time, effort, and money would it
| take to do that verification? Now extrapolate this example to
| everything. The past becomes only myth and legend.
| [deleted]
| nighthawk454 wrote:
| is that really any different than the rest of history? maybe
| blindly trusting recordings is a bubble.
| bottlepalm wrote:
| AI is going to give us the power to extrapolate information
| from the past like never before. With a few lines of text I
| may be able to create a feature film on Abraham Lincoln, how
| much of that will be 'frog DNA' spliced in to keep the story
| going? Which may be used as source data for something else,
| and so on. The past is already a game of telephone into the
| future. With the ability to fill in the blanks provided by
| AI, the signal to noise goes way down. Figuring out the
| actual real source information becomes a lot more valuable,
| but without locking down that source information today we may
| lose the ability to verify many pieces of information as
| sources versus AI creations in the future.
| figassis wrote:
| There is a difference. Humanity lies, but until now it could
| not retroactively mass fabricate truth so convincing that
| even current-day skeptics had trouble seeing through it.
| When I say current day, I mean in the past, a genocide was
| being seen as such by at least part of the human population,
| but then that truth was erased by storytelling. Today, events
| are fabricated into existence by AI. You go to Google to find
| alternative resources and you can't trust them either. You
| can't verify because it is too costly to verify every single
| thing you see/hear/read. It's a problem.
|
| I can imagine a couple million years from now, some alien
| species shows up, we're all gone and they think maybe we had
| wings, some of us were born with blue hair and others were
| half robots. I get they can study some of our remains, but so
| much of us is mutable digital info now.
| TeMPOraL wrote:
| So is prosperity and civilization.
| tough wrote:
| It's good to remember that history is always written by the
| winning side by necessity.
|
| Also they usually like to burn any -history- or -culture- of
| the losing side and adapt their customs to their new ones
| and call it a day, erasing history pretty much, as much of it
| as they can at least.
|
| YMMV
| TeMPOraL wrote:
| This is different, because:
|
| - You usually can attribute history written by a victor to
| said victor;
|
| - There's only so much control a victor has over what's
| being kept by the monks, librarians, museum curators and
| individuals, and what of it will resurface once they're
| gone.
|
| With AI, we're not talking about alternative history, but
| rather about infinite, arbitrary alternative histories that
| can't be told apart from the real one.
| jay_kyburz wrote:
| It's kind of a fun thought, but the past will become more like
| the future: we can guess what probably happened, but we'll never
| really know for sure. Just like I probably know what will
| happen tomorrow, but I'll never be sure.
___________________________________________________________________
(page generated 2023-09-01 23:00 UTC)