[HN Gopher] The long road to recover Frogger 2 source from tape ... ___________________________________________________________________ The long road to recover Frogger 2 source from tape drives Author : WhiteDawn Score : 249 points Date : 2023-05-24 17:41 UTC (5 hours ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | FearNotDaniel wrote: | > the ADR-50e drive was advertised as compatible, but there was a | cave-at | | I'm assuming the use of "cave-at" means the author has inferred | an etymology of "caveat" being made up of "cave" and "at", as in: | this guarantee has a limit beyond which we cannot keep our | promises, if we ever find ourselves AT that point then we're | going to CAVE. (As in cave in, meaning give up.) I can't think of | any other explanation of the odd punctuation. Really quite | charming, I'm sure I've made similar inferences in the past and | ended up spelling or pronouncing a word completely wrong until I | found out where it really comes from. There's an introverted | cosiness to this kind of usage, like someone who has gained a | whole load of knowledge and vocabulary from quietly reading books | without having someone else around to speak things out loud. | nocoiner wrote: | I thought it might have been a transcription error of "carve | out," but your theory is more logical. | huehehue wrote: | Fascinating read that unlocked some childhood memories. | | I'm secondhand pissed at the recovery company, I have a couple of | ancient SD cards laying around and this just reinforces my fear | that if I send them away for recovery they'll be destroyed (the | cards aren't recognized/readable by the readers built into | MacBooks, at least) | ryanjshaw wrote: | Painful lesson I've learned myself the hard way - don't rush | something that doesn't need to be rushed. | ogurechny wrote: | Modern backup would simply state "API keys and settings are | here:", and a link to collaboration platform closed after 3 years | of existence. 
| jandrese wrote: | Hey, it's the cloud. Backups are "someone else's problem". That | is until they are your problem, then you're up a creek. | tivert wrote: | > Hey, it's the cloud. Backups are "someone else's problem". | That is until they are your problem, then you're up a creek. | | The FSF used to sell these wonderful stickers that said | "There is no cloud. It's just someone else's computer." | isaidthis wrote: | The sticker: | https://static.fsf.org/nosvn/stickers/thereisnocloud.svg | | "Stickers from various FSF campaigns - Print out copies of | our stickers for your own uses, local conferences and | more." https://www.fsf.org/resources/stickers | ilyt wrote: | Honestly backup space is weirdly sparse for anything on | enterprise scale. | | For anything more than a few machines there is bacula/bareos | (that pretends everything is tape with mostly miserable | results), backuppc (that pretends tapes are not a thing, with | miserable results), and that's about it, everything else seems | to be point-to-point backups only with no real central | management. | dllthomas wrote: | On the topic of Froggers, I enjoyed | https://www.youtube.com/watch?v=FCnjMWhCOcA | smokel wrote: | Heh, I remember playing .mp3 files directly from QIC-80 tapes, | somewhere around 1996. One tape could store about 120 MB, which | is equal to about two compact discs' worth of audio. The noise of | the tape drive was slightly annoying, though. And it made me | appreciate what the 't' in 'tar' stands for. | mjaniczek wrote: | Did you mean 1200 MB? That would make sense wrt. 2x CD | capacity. | smokel wrote: | No, it was really only 120 MB. I was referring to the length | of an audio compact disc, not the capacity of a CD-ROM. At | 128 kbps, you'd get about 2 hours of play time. | | Of course it didn't really make sense to use digital tapes | for that use case, even back then.
It was just for fun, and | the article sparked some nostalgic joy, which felt worth | sharing :) | [deleted] | [deleted] | stewarts wrote: | They reference MP3, and a CD ripped down to MP3 probably fits | in the 50-100MB envelope for size. It has been a very long | time since I last ripped an album, but that size jibes with | my memory. | crazygringo wrote: | Wow, this part makes my blood boil, emphasis mine: | | > This issue doesn't affect tapes written with the ADR-50 drive, | but all the tapes I have tested written with the OnStream SC-50 | do NOT restore from tape _unless the PC which wrote the tape is | the PC which restores the tape._ This is because the PC which | writes the tape stores a catalog of tape information such as tape | file listing locally, which the ARCserve is supposed to be able | to restore without the catalog because it's something which only | the PC which wrote the backup has, _defeating the purpose of a | backup._ | | Holy crap. A tape backup solution that doesn't allow the tape to | be read by any other PC? That's madness. | | Companies do shitty things and programmers write bad code, but | this one really takes the prize. I can only imagine someone | inexperienced wrote the code, nobody ever did code review, and | then the company only ever tested reading tapes from the same | computer that wrote them, because it never occurred to them to do | otherwise? | | But _yikes_. | throw0101b wrote: | > _Holy crap. A tape backup solution that doesn't allow the | tape to be read by any other PC? That's madness._ | | What is needed is the backup catalog. This is fairly standard | on a lot of tape-related software, even open source; see for | example "Bacula Tape Restore Without Database": | | * http://www.dayaro.com/?p=122 | | When I was still doing tape backups the (commercial) backup | software we were using would e-mail us the bootstrap | information daily in case we had to do a from-scratch data | centre restore.
| | The first step would get a base OS going, then install the | backup software, then import the catalog. From there you can | restore everything else. (The software in question allowed | restores even without a license (key?), so that even if you | lost that, you could still get going.) | ilyt wrote: | Right, the on-PC database acts as an index to data on the tape. | That's pretty standard. | | But having a format where you can't recreate the index from | data easily is just abhorrently bad coding... | tinus_hn wrote: | Obviously to know what to restore, you need to index the data | on the tapes. Tape is not a random access medium, there is no | way around this. | | This is only for a complete disaster scenario, if you're | restoring one PC or one file, you would still have the backup | server and the database. But if you don't, you need to run | the command to reconstruct the database. | ShadowBanThis01 wrote: | There is a way around this: You allocate enough space at | the beginning (or the end, or both) of the tape for a | catalog. There are gigabytes on these tapes; they could | have reserved enough space to store millions of filenames | and indices. | IshKebab wrote: | Wouldn't it make sense to _also_ write the backup catalog to | the tape though? Seems like a very obvious thing to do to me. | fsckboy wrote: | you'd have to put the catalog at the end of the tape, but | in that case you might as well rebuild the catalog by | simply reading the tape on your way to the end (yeah, if | the tape is partially unreadable blah blah backup of your | backup...) | Nextgrid wrote: | I'd like to believe maybe that's why the company went out of | business but that's just wishful thinking - a lot of | incompetence is often ignored if not outright rewarded in | business nowadays. Regardless, it's at least somewhat of a | consolation those idiots did go out of business in the end, | even if that wasn't the root cause.
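The catalog subthread above (ilyt, ShadowBanThis01, IshKebab, fsckboy) comes down to one design property: if every record on the medium describes itself, the index can always be rebuilt by reading from start to end. Below is a minimal sketch of that idea in Python. The record layout, the `REC1` marker, and every function name are hypothetical and chosen only for illustration; this is not ARCserve's format, nor Bacula's, and an in-memory `io.BytesIO` stands in for the tape.

```python
import io
import struct

MAGIC = b"REC1"                  # hypothetical record marker
HEADER = struct.Struct(">4sHI")  # marker, name length, data length


def write_backup(stream, files):
    """Write each file as a self-describing record; return the catalog."""
    catalog = {}
    for name, data in files.items():
        encoded = name.encode("utf-8")
        catalog[name] = stream.tell()  # offset where this record starts
        stream.write(HEADER.pack(MAGIC, len(encoded), len(data)))
        stream.write(encoded)
        stream.write(data)
    return catalog


def rebuild_catalog(stream):
    """Recreate the catalog with one sequential pass over the stream."""
    stream.seek(0)
    catalog = {}
    while True:
        offset = stream.tell()
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # reached the end of the stream
        magic, name_len, data_len = HEADER.unpack(header)
        if magic != MAGIC:
            raise ValueError(f"bad record marker at offset {offset}")
        name = stream.read(name_len).decode("utf-8")
        stream.seek(data_len, io.SEEK_CUR)  # skip over the payload
        catalog[name] = offset
    return catalog


def read_file(stream, offset):
    """Return the payload of the record that starts at the given offset."""
    stream.seek(offset)
    _, name_len, data_len = HEADER.unpack(stream.read(HEADER.size))
    stream.seek(name_len, io.SEEK_CUR)
    return stream.read(data_len)
```

In this sketch the catalog returned by `write_backup` is only a cache: losing it costs one full read of the medium, which is the trade-off fsckboy describes. A format where the names and lengths live only in the external catalog has no such fallback.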
| Neil44 wrote: | I'm familiar with needing to re-index a backup if it's accessed | from a 'foreign' machine and sometimes the procedure is non- | obvious but just not having that option seems pretty bad. | bluedino wrote: | I worked for an MSP a million years ago and we had a customer | that thought they had lost everything. They had backup tapes | but the backup server itself had died, after showing them the | 'catalog tape' operation, and keeping their fingers crossed | for a few hours, they bought me many beers. | EvanAnderson wrote: | I always had the Customer keep a written log of which tapes | were used on which days. It helped for accountability but | also prevented the "Oh, shit, we have to catalog all the | tapes because the log of which tapes were used on which day | are on the now-failed server." | winrid wrote: | It's basically an index stored on faster media. You would have | redundancy on that media, too. | readyplayernull wrote: | A few months ago I was looking for an external backup drive and | thought that SSD would be great because it's fast and shock | resistant. Years ago I killed a Macbook Pro HD by throwing it on | my bed from few inches high. Then I read a comment on Amazon | about SSD losing information when unpowered for a long time. I | couldn't find any quick confirmation in the product page, took me | a few hours of research to find some paper about this phenomenon. | If I remember correctly it takes a few weeks for the stored SSD | to start losing its data. So I bought a mechanical HD. | | Another tech tip is not buying 2 backup devices from the same | batch or even the same model. Chances being these will fail in | the same way. | vidarh wrote: | To the last bit, I've seen this first hand. Had a whole RAID | array of the infamous IBM DeathStar drives fail one after the | other while we frantically copied data off. | | Last time I ever had the same model drives in an array. | jimbob45 wrote: | F2 was a really neat game. 
It almost invented Crypt of the | Necrodancer's genre decades early. | | It's a little sad that it took such a monumental effort to bring | the source code back from the brink of loss. It's times like that | that should inspire lawmakers to void copyright in the case that | the copyright holders can't produce the thing they're claiming | copyright over. | LeoPanthera wrote: | I really wish they would name the data recovery company so that I | can never darken their door with my business. | bluedino wrote: | > Over the span of about a month, I received very infrequent | and vague communications from the company despite me providing | extremely detailed technical information and questions. | | Ahh the business model of "just tell them to send us the tape | and we'll buy the drive on eBay" | Nextgrid wrote: | To be honest as long as they are very careful about not doing | any damage to the original media then it might work and be a | win-win for both sides in a "no fix no fee" model where the | customer only pays if the data is successfully recovered. | | Their cardinal sin was that they irreparably damaged the tape | without prior customer approval. | nickt wrote: | It's not too hard to find with the following search, "we can | recover data from tape formats including onstream" | stepupmakeup wrote: | The OP explicitly didn't name them (despite many people | recommending to, even preservationists in this field on | Reddit and Discord) but it's easy to find just by googling | the text on the screenshots | ddtaylor wrote: | Name them and we can set up a thread or site to publicly | shame them | stepupmakeup wrote: | the comment I replied to edited the link out | https://www.datarecovery.net/tape-data-recovery.aspx | omoikane wrote: | Reddit thread: | https://www.reddit.com/r/DataHoarder/comments/13q1pv7/playst...
| a1369209993 wrote: | https://news.ycombinator.com/item?id=36063114 claims it's | https://www.datarecovery.net/tape-data-recovery.aspx (and that | https://news.ycombinator.com/item?id=36062785 had been edited | to censor the information, so I'm duplicating it here). Caveat | that I don't know if that's actually correct, since efforts to | suppress it are only circumstantial evidence in favor. | omnibrain wrote: | Is anyone else calling it "froggering/to frogger" if they have to | cross a bigger street by foot without a dedicated crossing? | hlandau wrote: | Absolutely amazing story. Fantastic! | | I've actually long been stunned by the propensity of proprietary | backup software to use undocumented, proprietary formats. I've | always found this quite stunning, in fact. It seems to me like | the first thing one should make sure to solve when designing a | backup format is to ensure it can be read in the future even if | all copies of the backup software are lost. | | I may be wrong but I think some open source tape backup software | (Amanda, I think?) does the right thing and actually starts its | backup format with emergency restoration instructions in ASCII. I | really like this kind of "Dear future civilization, if you are | reading this..." approach. | | Frankly nobody should agree to use a backup system which | generates output in a proprietary and undocumented format, but | also I want a pony... | | It's interesting to note that the suitability of file formats for | archiving is also a specialised field of consideration. I recall | some article by someone investigating this very issue who argued | formats like .xz or similar weren't very suited to archiving. | Relevant concerns include, how screwed you are if the archive is | partly corrupted, for example.
The more sophisticated your | compression algorithm (and thus the more state it records from | longer before a given block), the more a single bit flip can | result in massive amounts of run-on data corruption, so better | compression essentially makes things worse if you assume some | amount of data might be damaged. You also have the option of | adding parity data to allow for some recovery from damage, of | course. Though as this article shows, it seems like all of this | is nothing compared to the challenge of ensuring you'll even be | able to read the media at all in the future. | | At some point the design lifespan of the proprietary ASICs in | these tape drives will presumably just expire(?). I don't know | what will happen then. Maybe people will start using advanced | FPGAs to reverse engineer the tape format and read the signals | off, but the amount of effort to do that would be astronomical, | far more even than the amazing effort the author here went to. | hlandau wrote: | To add, thinking a bit more about it: Designing formats to be | understandable by future civilizations actually reduces to a | surprising degree to the same set of problems which METI has to | face. As in, sending signals designed to be intelligible to | extraterrestrials - Carl Sagan's Contact, etc. | | Even if you write an ASCII message directly to a tape, that | data is obviously going to be encoded before being written to | the tape, and you have no idea if anyone will be able to figure | out that encoding in future. Trouble. | | What makes this particularly pernicious is the fact that LTO | nowadays is a proprietary format(!!). I believe the spec for | the first generation or two of LTO might be available, but last | I checked, it's been proprietary for some time. The spec is | only available to the (very small) consortium of companies | which make the drives and media. And the number of companies | which make the drives is now... two, I think? (They're often | rebadged.) 
Wouldn't surprise me to see it drop to one in the | future. | | This seems to make LTO a very untrustworthy format for | archiving, which is deeply unfortunate. | rootsudo wrote: | Name and shame the company, you had a personal experience, you | have proof. Name and shame. It helps nobody if you don't | publicize it. Let them defend it, let them say whatever excuse, | but your review will stand. | phkahler wrote: | >> The tape was the only backup for those things, and it | completes Frogger 2's development archives, which will be | released publicly. | | In cases like this can imagine some company yelling "copyright | infringement" even though they don't possess a copy themselves. | It's a really odd situation. | chrisstanchak wrote: | I've been suffering through something similar with a DLT IV tape | from 1999. Luckily I didn't send out to the data recovery | company. But still unsuccessful. | db48x wrote: | Wow, that backup software sounds like garbage. Why not just use | tar? Why would anyone reinvent that wheel? | robotnikman wrote: | The company that made it probably was hoping for vendor lock-in | cosmotic wrote: | Vendor lock in for backup and archival products is so | ridiculous. It increases R&D to ensure the lock-in, and the | company won't exist by the time the lock-in takes effect. | fifteen1506 wrote: | Well yes, but the boss probably is willing to invest more | money (meaning higher salaries, more people, better tools) | expecting a future return than when using reasonable | formats. | giantrobot wrote: | IIRC tar has some Unixisms that don't necessarily work for | Windows/NTFS. Not saying reinventing tar is appropriate but | there's Windows/NTFS that a Windows based tape backup need to | support. | cosmotic wrote: | Most of what makes NTFS different than FAT probably doesn't | need to be backed up. Complex ACLs, alternative data streams, | shadow copies, etc, are largely irrelevant when it comes to | making a backup. 
Just a simple warning "The data being backed | up includes alternative data streams. These aren't supported | and won't be included in the backup" would suffice. | jandrese wrote: | All of that stuff matters when you're using the backup for | its intended purpose: to restore a system after hardware | failure. | | Unix tar is obviously not the right solution, but a Windows | tar seems like it shouldn't be that hard to do and yet we | are in the situation we are today. I've been using | dump/restore for decades now on Unix, including to actually | recover from loss, but I admit that it's not that pleasant | to use. I like that it is very simple and reliable however, | unlike the mess that is Time Machine (recovering from a | hardware loss on a Mac is a roll of the dice, and I've | gotten snakes) or worse Deja Dup. I'm not sure I've ever | successfully recovered a system from a Deja Dup backup. | a1369209993 wrote: | > using the backup for its intended purpose: to restore a | system after hardware failure. | | No. The intended purpose of a backup is to restore the | _data_ (such as the Frogger 2 source code) after a | hardware failure. If it has the side effect of also | producing a working system, that's _good_, but it's not | the point. After all, the hardware necessary to build a | working system may not exist any more; one (only-probably | not the last) instance of said hardware just broke, after | all. | cosmotic wrote: | I think the use case for disaster recovery is a bit | different than long-term archival. | nycdotnet wrote: | If you're backing up a db or something sure, but for a file | server this can be just as important as the data itself | (ex: now everyone can read HR's personnel files which had | strict permissions before) | ilyt wrote: | The format is extensible enough that it could be added | bombcar wrote: | The world of tape backup was (is?) absolutely filled with all | sorts of vendor-lock in projects and tools. It's a complete | mess.
| | And even various versions of tar aren't compatible, and that's | not even starting with star and friends. | stepupmakeup wrote: | It's not just limited to tape, most archiving and backup | software is proprietary. It's impossible to open Acronis or | Macrium Reflect images without their Windows software. In | Acronis's case they even make it impossible to use offline or | on a server OS without paying for a license. NTBackup is | awfully slow and doesn't work past Vista, and it's not even | part of XP POSReady for whatever reason, so I had to rip the | exe from a XP ISO and unpack it (NTBACKUP._EX... I forgot | microsoft's term for that) because the Vista version | available on Microsoft's site specifically checks for | longhorn or vista. | | Then there's slightly more obscure formats that didn't take | off in the western world, and the physical mediums too. Not | many people had the pleasure of having to extract hundreds of | "GCA" files off of MO disks using obscure Japanese freeware | from 2002. The English version of the software even has a | bunch of flags on virustotal that the standard one doesn't. | And there's obscure LZH compression algorithms that no tool | available now can handle. | | I've found myself setting up one-time Windows 2000/XP VMs | just to access backups made after 2000. | jandrese wrote: | I have at various times considered a tape backup solution for | my home, but always give up when it seems every tape vendor | is only interested in business clients. It was a race to stay | ahead of hard drives and oftentimes they seemed to be losing. | The price points were clearly aimed at business customers, | especially on the larger capacity tapes. In the end I do | backup to hard drives instead because it's much cheaper and | faster. | bombcar wrote: | The only way to do tape at home is with used equipment and | Linux/BSD. You can do quite a bit with tar and mt (iirc) - | even controlling auto loaders. 
| | What's fun are the hard drive based systems designed to | perfectly imitate a tape autoloader so you don't have to | buy new backup software (virtual tape libraries). | stepupmakeup wrote: | Tape absolutely isn't viable for the consumer at all, but | definitely worth exploring for the novelty. Even if you | manage to get a pretty good deal on a legacy LTO system | (other formats don't even come close to the tb/$ of 10+ | year old LTO and drives are still fairly cheap), the drives | aren't being made any more and aren't getting any cheaper. | Backwards compatibility may be in your favor depending on | your choice of tape generation at least, I think there's at | least two generations guaranteed. Optical will probably | remain king though the pricing is worse than HDDs, there's | no shortage of DVD or BD readers, but you might run into | issues with quad layer 128 BD as they only hit the market | fairly recently. | ilyt wrote: | Tape drive and Bareos/Bacula "just works" | | Absolutely not worth it tho. Drives are hideously expensive | which means they only start making sense where you have at | least dozens of tapes. | | There is an advantage of tapes not being electrically | connected most of the time so lightning strike will not | burn your archives, I have pondered making a separate box | with a bunch of hard drives that boots once a month and | just copies last months of backups on hard drives, powered | from solar or something just to separate from the network | EvanAnderson wrote: | ARCServe was a Computer Associates product. That's all you need | to know. | | It had a great reputation on Novell Netware but the Windows | product was a mess. I never had a piece of backup management | software cause blue screens (e.g. kernel panics) before an | unfortunate Customer introduced me to ARCServe on Windows.
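On db48x's "why not just use tar?" and bombcar's note about doing tape at home with tar and mt: part of tar's appeal is that each member carries its own header, so an archive can be listed and extracted in a single forward pass with no separate catalog. The sketch below uses Python's standard `tarfile` module in stream mode (`r|`), which does not seek backwards. The helper names are made up for illustration, and an in-memory buffer stands in for a tape device.

```python
import io
import tarfile


def make_archive(files):
    """Build an uncompressed tar archive in memory from a name -> bytes mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)  # "rewind the tape" before reading
    return buf


def list_and_read(stream):
    """Read every member in one forward pass, without any external index."""
    contents = {}
    # "r|" treats the source as a non-seekable stream of blocks.
    with tarfile.open(fileobj=stream, mode="r|") as tar:
        for member in tar:
            contents[member.name] = tar.extractfile(member).read()
    return contents
```

This says nothing about the Windows-specific metadata (ACLs, alternate data streams) raised elsewhere in the thread; it only illustrates the self-describing, sequential layout.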
| nycdotnet wrote: | My favorite ArcServe bug which they released a patch for (and | which didn't actually fix the issue, as I recall) had a KB | article called something along the lines of "The Open | Database Backup Agent for Lotus Notes Cannot Backup Open | Databases". | h2odragon wrote: | Truly noble effort. Hopefully the writeup and the tools will save | others much heartbreak. | bsder wrote: | Is there a way to read magnetic tapes like these in such a way as | to get the raw magnetic flux at high resolution? | | It seems like it would be easier to process old magnetic tapes by | imaging them and then applying signal processing rather than | finding working tape drives with functioning rollers. Most of the | time, you're not worried about tape speed since you're just doing | recovery read rather than read/write operations. So, a slow but | accurate operation seems like it would be a boon for these kinds | of things. | fifteen1506 wrote: | You still need to know where to look, the format, and using | specialized equipment whose cost wasn't driven down by mass | manufacturing, so, in theory yes, in practice not. | | (Completely guessing here with absolutely no knowledge of the | real state of things) | EvanAnderson wrote: | For anybody who is into this, this is a good excuse to share a | presentation from Vintage Computer Fest West 2020 re: magnetic | tape restoration: https://www.youtube.com/watch?v=sKvwjYwvN2U | | The presentation explores using software-defined signal | processing to analyze a digitized version of the analog signal | generated from the flux transitions. It's basically moving the | digital portion of the tape drive into software (a lot like | software-defined radio). This is also very similar to efforts | in floppy disk preservation. Floppies are amazingly like tape | drives, just with tiny circular tapes. | bombcar wrote: | Yes.
There's some guy on YouTube who does stuff like that (he | reverse engineered the audio recordings from a 747 tape array) | but it can be quite complicated. | Nextgrid wrote: | Would you have a link by any chance? Thanks! | iforgotpassword wrote: | Sounds like at least in this case that ASIC in the drive was | doing some (non trivial) signal processing. Would be | interesting to know how hard it would be to get from the flux | pattern back to zeros and ones. I guess with a working drive | you can at least write as many test patterns as you want until | you maybe figure it out. | jandrese wrote: | At the very least the drive needs to be able to lock onto the | signal. It's probably encoded in a helix on the drive and if | the head isn't synchronized properly you won't get anything | useful, even with a high sampling rate. | dpratt wrote: | At the very least, and the cost for this perhaps would be | prohibitive, but some mechanism to duplicate the raw flux off | the tape onto another tape in an identical format, a backup of | the backup. This would allow for attempts to read the data that | may be potentially destructive to the media (for example, | breaking the tape accidentally) and not lose the original | signal. | tombert wrote: | This is giving me some anxiety about my tape backups. | | I have backed up my blu-ray collection to a dozen or so LTO-6 | tapes, and it's worked great, but I have no idea how long the | drives are going to last for, and how easy it will be to repair | them either. | | Granted, the LTO format is probably one of the more popular | formats, but articles like this still keep me up at night. | bombcar wrote: | Do test restores. LTO is very good but without verification | some will fail at some point. | | But your original bluray disk are _also_ a backup. | EvanAnderson wrote: | The only surefire method to keep the bits readable is to | continue moving them onto new media every few years. Data has a | built-in recurring cost. 
I'd love to see a solution to that | problem but I think it's unlikely. It's at least possible, | though, that we'll come up with a storage medium with | sufficient density and durability that it'll be good enough. | | I don't even want to think about the hairy issues associated | with keeping the bits able to be interpreted. That's a human | behavior problem more than a technology problem. | wazoox wrote: | LTO-7 drives read LTO-6, and will be available for quite a | while. | | In 2016 I've used an LTO-3 drive to restore a bunch (150 or | 200) of LTO-1/2 tapes from 2000-2003, and almost all but one or | two worked fine. | robotnikman wrote: | I've always admired the tenacity of people who reverse engineer | stuff. To be able to spend multiple months figuring out barely | documented technologies with no promise of success takes a lot of | willpower and discipline. It's something I wish I could improve | more in myself. | detrites wrote: | I think you could. In some sense "easily". It may be about | finding _that thing_ you're naturally so interested in or | otherwise drawn to, that the months figuring out become a type | of driven joy, and so the willpower kinda automatic. | | And if you find it, don't judge what it is or worry what others | might think - or even necessarily tell anyone. Sometimes the | most motivating things are highly personal, as with the OP; a | significant part of their childhood. | masto wrote: | This brings back (unpleasant) memories. I remember trying to get | those tape drives working with FreeBSD back in 1999, and it going | nowhere. | ilamont wrote: | In The Singularity Is Near (2005) Ray Kurzweil discussed an idea | for the "Document Image and Storage Invention", or DAISI for | short, but concluded it wouldn't work out.
I interviewed him a | few years later about this and here's what he said: | | _The big challenge, which I think is actually important almost | philosophical challenge -- it might sound like a dull issue, like | how do you format a database, so you can retrieve information, | that sounds pretty technical. The real key issue is that software | formats are constantly changing. | | People say, "well, gee, if we could backup our brains," and I | talk about how that will be feasible some decades from now. Then | the digital version of you could be immortal, but software | doesn't live forever, in fact it doesn't live very long at all if | you don't care about it if you don't continually update it to new | formats. | | Try going back 20 years to some old formats, some old programming | language. Try resuscitating some information on some PDP1 | magnetic tapes. I mean even if you could get the hardware to | work, the software formats are completely alien and [using] a | different operating system and nobody is there to support these | formats anymore. And that continues. There is this continual | change in how that information is formatted. | | I think this is actually fundamentally a philosophical issue. I | don't think there's any technical solution to it. Information | actually will die if you don't continually update it. Which | means, it will die if you don't care about it. ... | | We do use standard formats, and the standard formats are | continually changed, and the formats are not always backwards | compatible. It's a nice goal, but it actually doesn't work. | | I have in fact electronic information that in fact goes back | through many different computer systems. Some of it now I cannot | access. In theory I could, or with enough effort, find people to | decipher it, but it's not readily accessible. The more backwards | you go, the more of a challenge it becomes. 
| | And despite the goal of maintaining standards, or maintaining | forward compatibility, or backwards compatibility, it doesn't | really work out that way. Maybe we will improve that. Hard | documents are actually the easiest to access. Fairly crude | technologies like microfilm or microfiche which basically has | documents are very easy to access. | | So ironically, the most primitive formats are the ones that are | easiest._ | ChuckMcM wrote: | This is very very true. I have archived a number of books and | magazines that were scanned and converted into "simplified" | PDF, and archived on DVD disks with C source code. | | There are external dependencies but one hopes that the | descriptions are sufficient to figure out how to make those | work. | magpi3 wrote: | One of the claimed benefits of the JVM (and obviously later | VMs) was that it would solve this issue: Java programs written | in 2000 should still be able to run in 2100. And as far as I | know the JVM has continued to fulfill this promise. | | An honest question: If you are writing a program that you want | to survive for 100+ years, shouldn't you specifically target a | well-maintained and well-documented VM that has backward | compatibility as a top priority? What other options are there? | wongarsu wrote: | In 2005 the computing world was much more in flux than it is | now. | | PNG is 26 years old and basically unchanged since then. Same | with 30 year old JPEG, or for those with more advanced needs | the 36 year old TIFF (though there is a newer 21 year old | revision). All three have stood the test of time against | countless technologically superior formats by virtue of their | ubiquity and the value of interoperability. The same could be | said about 34 year old zip or 30 year old gzip. For executable | code, the wine-supported subset of PE/WIN32 seems to be with us | for the foreseeable future, even as Windows slowly drops | compatibility.
| | The latest Office365 Word version still supports opening Word97 | files as well as the slightly older WordPerfect 5 files, not to | mention 36 year old RTF files. HTML1.0 is 30 years old and is | still supported by modern browsers. PDF has also got constant | updates, but I suspect 29 year old PDF files would still | display fine. | | In 2005 you could look back 15 years and see a completely | different computing landscape with different file formats. Look | back 15 years today and not that much changed. Lots of exciting | new competitors as always (webp, avif, zstd) but only time will | tell whether they will earn a place among the others or go the | way of JPEG2000 and RAR. But if you store something today in a | format that's survived the last 25 years, you have good chances | to still be able to open it in common software 50 years down | the line. | forgotmypw17 wrote: | There is something called Lindy Effect, which states that a | format's longevity is proportional to its current age. | | I try to take advantage of this by only using older, open, | and free things (or the most stable subsets of them) in my | "stack". | | For example, I stick to HTML that works across 20+ years of | mainstream browsers. | orbital-decay wrote: | This is too shortsighted by the archival standards. Even Word | itself doesn't offer full compatibility. VB? 3rd party active | components? Other Office software integration? It's a mess. | HTML and other web formats are only readable by the virtue of | being constantly evolved while keeping the backwards | compatibility, which is nowhere near complete and is | hardware-dependent (e.g. aspect ratios, colors, pixel | densities). The standards _will_ be pruned sooner or later, | due to the tech debt or being sidestepped by something else. | And I'm pretty sure there are plenty of obscure PDF features | that will prevent many documents from being readable in mere | half a century. I'm not even starting on the code and | binaries.
And cloud storage is simply extremely volatile by | nature. | | Even 50 years (laughable for a clay tablet) is still pretty | darn long in the tech world. We'll still probably see the | entire computing landscape, including the underlying | hardware, changing fundamentally in 50 years. | | Future-proofing anything is a completely different dimension. | You have to provide the independent way to bootstrap, without | relying on the unbroken chain of software standards, | business/legal entities, and the public demand in certain | hardware platforms/architectures. This is unfeasible for the | vast majority of knowledge/artifacts, so you also have to | have a good mechanism to separate signal from noise and to | transform volatile formats like JPEG or machine-executable | code into more or less future proof representations, at least | basic descriptions of what the notable thing did and what | impact it had. | ilyt wrote: | >Future-proofing anything is a completely different | dimension. You have to provide the independent way to | bootstrap, without relying on the unbroken chain of | software standards, business/legal entities, and the public | demand in certain hardware platforms/architectures. This is | unfeasible for the vast majority of knowledge/artifacts, so | you also have to have a good mechanism to separate signal | from noise and to transform volatile formats like JPEG or | machine-executable code into more or less future proof | representations, at least basic descriptions of what the | notable thing did and what impact it had. | | I'd argue that best way would be to not do that but to make | sure format is ubiquitous enough that the knowledge will | never be lost in the first place. | moron4hire wrote: | While it's true that these standards are X years old, the | software that encoded those formats yesteryear is very | different from the software that decodes it today. It's a | Ship of Theseus problem. 
They can claim an unbroken lineage | since the distant future, the year 2000, but encoders and | decoders had defects and opinions that were relied on--both | intentionally and unintentionally--that are different from | the defects and opinions of today. | | I have JPEGs and MP3s from 20 years ago that don't open | today. | matja wrote: | Are they really JPEGs and MP3s, or just bitrot? | | I've found https://github.com/ImpulseAdventure/JPEGsnoop | useful to fix corruption but I haven't come across a non-standard | JFIF JPEG unless it was intentionally designed to | accommodate non-standard features (alpha channel etc). | orbital-decay wrote: | I personally never encountered JPEGs or MP3s which were | totally unreadable due to being encoded by ancient | software versions, but the metadata in common media | formats is a total mess. Cameras and encoders are writing | all sorts of obscure proprietary tags, or even things | like X-Ray (STALKER Shadow of Chernobyl game engine) | keeping gameplay-relevant binary metadata in OGG Vorbis | comments. Which is even technically compliant with the | standard I think, but that won't help you much. | ilyt wrote: | Actually I'd argue it's wrong precisely because we _do_ manage | to retrieve even such old artifacts. Only problem is that | nobody cared for 30 years so the process was harder than it | should be but in the end it was possible. | | Sure, there is a risk that at some point, for example, any | version of every PNG or H.264 decoder gets lost and so re-creating | a decoder for that would be significantly more | complicated, but chances for that are pretty slim, and looking | at `ffmpeg -codecs` I'm not really worried about that ever | happening.
| krapp wrote: | I'm certain that 100 years from now, when the collapse really | gets rolling, we'll still have cuneiform clay tablets | complaining about Ea-Nassir's shitty copper but most of the | digital information and culture we've created and tried to | archive will be lost forever. Eventually, we're going to lose | the infrastructure and knowledge base we need to keep updating | everything, people will be too busy just trying to find food | and fighting off mutants from the badlands to care. | jakeinspace wrote: | Well, almost all early tablets are destroyed or otherwise | lost now. Do you think we will lose virtually all digital age | information within a century? Maybe from a massive CME, I | suppose. | jcranmer wrote: | Clay tablets were usually used for temporary records, as | you could erase it simply by smearing the clay a little bit | (a lot easier than writing on papyrus). The tablets we | have exist because of something that causes the clay to be | baked into ceramic, which is generally some sort of | catastrophic fire that caused the records to accidentally | be preserved for much longer. | krapp wrote: | I can see it happening. Not as a single catastrophic event | but, like Rome falling bit by bit, our technological | civilization fails and degenerates as climate change (in | the worst possible scenario) wreaks havoc on everything. | 0xdeadbeefbabe wrote: | > Hard documents are actually the easiest to access. Fairly | crude technologies like microfilm or microfiche which basically | has documents are very easy to access. | | Maybe it isn't crude after all if it wins. | hello_computer wrote: | I was able to backup/restore an old COBOL system via cpio | between modern GNU cpio (man page last updated June 2018), and | SCO's cpio (c. 1989). This is neither to affirm nor contradict | Kurzweil, but rather to praise the GNU userland for its solid | legacy support.
| crazygringo wrote: | But he seems to have written this before virtual machines | became widespread. | | I think the concern is becoming increasingly irrelevant now, | because if I really need to access a file I created in Word 4.0 | for the Mac back in 1990, it's not too hard to fire up System 6 | with that version of Word and read my file. In fact it's much | _easier_ now than it was in 2005 when he was writing. Sure it | might take half an hour to get it all working, but that's | really not too bad. | | Most of this is probably technically illegal and will sometimes | even have to rely on cracked versions, but also nobody cares. | All the OS's and programs are still around and easy to | find on the internet. | | Not to mention that while file formats changed all the time | early on, these days they're remarkably long-lived -- used for | decades, not years. | | The outdated hardware concern _was_ more of a concern (as the | original post illustrates), but so much of everything important | we create today is in the cloud. It's ultimately being saved | in redundant copies on something like S3 or Dropbox or Drive or | similar, that are kept up to date. As older hardware dies, the | bits are moved to newer hardware without the user even knowing. | | So the problem Kurzweil talked about has basically become | _less_ of an issue as time has marched on, not _more_. Which is | kind of nice! | ilyt wrote: | >I think the concern is becoming increasingly irrelevant now, | because if I really need to access a file I created in Word | 4.0 for the Mac back in 1990, it's not too hard to fire up | System 6 with that version of Word and read my file. In fact | it's much easier now than it was in 2005 when he was writing. | Sure it might take half an hour to get it all working, but | that's really not too bad. | | And that was easy years ago.
| | Now you can WASM it and run it in a browser | xigency wrote: | As a kid, I got this game as a gift and really, really wanted to | play it. But after beating the second level, the game would | always crash on my computer with an Illegal Operation exception. | I remember sending a crash report to the developer, and even | updating the computer, but I never got it working. | jakeinspace wrote: | I adored this game as a kid, and I think I do have a faint | memory of some stability issues, but I believe I was able to | beat the game. | bluedino wrote: | This will be fun in 20 years, trying to recover 'cloud' backups from | servers found in some warehouse. | ilyt wrote: | Nah it will be very simple: | | ....What do you mean "nobody paid for the bucket for last 5 | years"? | | There is some chance someone might stash old hard drive or tape | with backup somewhere in the closet. There is no chance there | will be anything left when someone stops paying for cloud. | PicassoCTs wrote: | The author has fantastic endurance, what a marathon to get the | files off the tape. ___________________________________________________________________ (page generated 2023-05-24 23:00 UTC)