[HN Gopher] LTO Tape data storage for Linux nerds
___________________________________________________________________
 
LTO Tape data storage for Linux nerds
 
Author : detaro
Score  : 240 points
Date   : 2022-01-27 12:10 UTC (10 hours ago)
 
(HTM) web link (blog.benjojo.co.uk)
(TXT) w3m dump (blog.benjojo.co.uk)
 
| CharleFKane wrote:
| I would like to thank the author for bringing back memories. Not
| all of which are good...
|
| (I used to work for a four-letter computer corporation doing
| enterprise technical support, mostly on tape-based products.)
| cassepipe wrote:
| "Unlike most block devices these are devices that do not enjoy
| seeking of any kind. So you generally end up writing streaming
| file formats to tape; unsurprisingly this is exactly what the
| Tape ARchive (.tar) is actually for."
|
| Haha! moment
| magicalhippo wrote:
| Nice calculator. The crossover point here for LTO-8 seems to be
| around 250TB. I think I'll stick with my HDDs for now.
| paulmd wrote:
| I picked up an LTO-5 drive a couple of years ago, and one thing I
| found (this is probably a good place to bring it up!) is that the
| software documentation for tape utilities, and high-level
| overviews of the strategies employed to build and manage
| libraries of data on this medium, are pretty thin on the ground
| at this point. Completely understandable given how few people
| have tapes these days, but it also makes it a little tougher to
| pick up from scratch.
|
| (And in particular the high-level overviews are important because
| tapes are wear items: you only have on the order of a hundred or
| two (don't remember the exact figures) full tape reads before the
| tape wears out, so this is something you want to go into knowing
| a strategy, not making it up as you go!)
|
| Since it's complementary to this discussion I'll link a few:
|
| https://www.cyberciti.biz/hardware/unix-linux-basic-tape-man...
|
| https://databasetutorialpoint.wordpress.com/to-know-more/how...
|
| https://sites.google.com/site/linuxscooter/linux/backups/tap...
|
| https://access.redhat.com/documentation/en-us/red_hat_enterp...
|
| https://access.redhat.com/solutions/68115
|
| That is, unfortunately, essentially the apex of LTO tape
| documentation in 2022, as far as I can tell.
|
| Do note that in terms of tape standards, LTO-5 is an important
| threshold, because that's where LTFS support got added, and
| that's the closest thing to a "normal" filesystem abstraction
| that's available for tape (sort of like packet-formatted CD-RWs,
| I guess, in the sense of presenting an abstraction over the raw
| seekable block device). There is also very little documentation
| on the init, care, and feeding of LTFS iirc - and again, it would
| be nice to know any pitfalls that might cause shoeshining and
| tape death. Although I suppose in practice it's mostly going to
| get used in a "multi-session" scenario where you mostly aren't
| deleting files: you write till it's full and then maybe wipe the
| whole tape at once, and it's just a nice way to present "files"
| rather than sequential records (tape archives/TARs, in fact!)
| along an opaque track with no contextualization.
| MayeulC wrote:
| The tech doesn't seem to be too complex; is there an open
| hardware project?
|
| It seems like one could go quite far in terms of performance with
| just some basic HW and an FPGA. Is there a significant difference
| between multiple generations of the tapes themselves, or is it
| just the data encoding patterns that change?
|
| More specifically, I was a bit appalled by the "magnetic erasing"
| bit. Seems like DRM to me, on a medium that is conceptually
| extremely simple.
|
| One could probably take a VHS drive and convert it to a data
| drive, unless I'm being naively optimistic about it?
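A quick back-of-the-envelope check of the crossover point magicalhippo mentions earlier in the thread (a sketch only, using rough figures quoted elsewhere in the thread: ~PS3,000 for an LTO-8 drive, ~PS7.40/TB for tape, ~PS18.00/TB for 12TB SATA disks):

```python
# Back-of-the-envelope: at what capacity does an LTO-8 drive plus
# tapes undercut plain hard drives? Figures are the thread's rough
# numbers, not authoritative prices.
DRIVE_COST = 3000.0   # LTO-8 drive, GBP (figure quoted in the thread)
TAPE_PER_TB = 7.40    # GBP per TB of LTO-8 media
HDD_PER_TB = 18.00    # GBP per TB of 12TB SATA disk

def breakeven_tb(drive_cost, tape_per_tb, hdd_per_tb):
    """Capacity at which drive + tapes becomes cheaper than disks."""
    return drive_cost / (hdd_per_tb - tape_per_tb)

print(f"break-even: {breakeven_tb(DRIVE_COST, TAPE_PER_TB, HDD_PER_TB):.0f} TB")
```

With these numbers the crossover lands near 283 TB, the same ballpark as the ~250 TB figure above; a real calculator presumably also weighs spare drives, shipping and compression.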
| justsomehnguy wrote:
| > One could probably take a VHS drive and convert it to a data
| drive
|
| https://en.wikipedia.org/wiki/ArVid
|
| > More specifically, I was a bit appalled by the "magnetic
| erasing"
|
| Nobody laments that there is no 'low level format' for HDDs
| anymore.
| zaarn wrote:
| A VHS drive uses a different encoding pattern; the head of the
| VHS player is physically incapable of moving like the head of
| an LTO drive. Additionally it lacks precision, as an LTO tape is
| much more densely packed. Lastly, LTO drives use different
| magnetic materials and signalling, so in all likelihood the VHS
| head is only going to pick up noise.
| tssva wrote:
| Back in the day there were a few backup products available
| which connected to standard VHS VCRs.
| dark-star wrote:
| PSA: Don't do a cleaning run unless your tape drive tells you to
| (there are SCSI sense codes for that). The drives can assess the
| need (or not) for cleaning pretty well, and excessive cleaning
| can negatively affect the lifetime of the r/w head (the cleaning
| tapes are abrasive).
| wolfgang42 wrote:
| The mt(1) manpage describes seeking on files, records, and file
| marks, but doesn't explain what any of them are. What's the
| difference between these options? (It sounds like file marks are
| stored on the tape on a special track or something, but I can't
| seem to find any discussion of the others.)
| StillBored wrote:
| So, all from a few years' old memory, and it's a complex
| interwoven mess.
|
| Let's start with this: tape has two types of head positioning
| commands, locate and space. Locate is absolute (and mt calls it
| seek), and space is relative. Mt generally uses space
| (although one can read the current position with tell, then do a
| relative space) for all the commands that aren't "seek". Hence
| the mt commands are things like "fsf", which is forward space
| file (mark), or "bsf" for back space file (mark).
| At some point in the past someone thought that each "file" would
| fit in a tape block, but then reality hit, because there are
| limits on how large the blocks can actually be (in Linux it's
| generally the number of scatter-gather entries that can fit in a
| page reliably). So there are filemarks, which are like "special"
| tape blocks without any data in them. Instead, if you attempt to
| read over a filemark, the drive returns a soft error telling you
| that you just tried to read a filemark. There is also "fsr",
| forward space record, for the individual blocks forming a "file".
|
| So back to seeking. If you man st, you will notice that each
| tape drive gets a bunch of /dev/st* aliases, which control the
| close behavior/etc, as well as some ioctls that match the mt
| commands. The two important close behaviors to remember are
| that if the tape is at EOD due to the last command being a
| write, it will write a filemark and then rewind the tape, unless
| a /dev/stXn device is being used, in which case it will leave the
| head position just past the FM (this is actually a bit more
| complex too, because IIRC there may be two filemarks at EOD, and
| the tape position gets left between them).
|
| This allows one to do something like `for x in *.txt; do cat
| "$x" >> /dev/st0n; done` and write a bunch of files separated by
| filemarks (at the default blocking size, which will be slow
| (probably 10k); replace the cat with tar to control
| blocking/etc). Or, if you want to read the previous file, `mt -f
| /dev/st0n bsf 2` to back space 2 filemarks.
|
| Now, the actual data format on tape is going to be dictated by
| the backup utility used to write it.
| Some never use filemarks; some do, but only as volume separators
| (e.g. tar); old ones actually put FMs between files, but that
| tends to be slow because it kills read performance: it takes the
| drive out of streaming mode whenever you read over a filemark
| (note the part in man st about reading a filemark).
|
| Now you can pick which file to read via "mt -f /dev/st0 rewind;
| mt -f /dev/st0n fsf X; cat /dev/st0n > restore.file"
|
| There are also tape partition control commands, tape set marks,
| and various other options which may or may not apply to a given
| type of tape. Notably, there are also density flags on the
| special file (some unixes) and via mt. LTO, for example, doesn't
| have settable densities, because density is fixed by the physical
| tape in the drive. Some drives (STK T10K/IBM TS11X0/3592) can
| upgrade the tape density/capacity when used in a newer drive.
|
| That got long...
| kortex wrote:
| Is there a unix-style streaming tool, like tar/zstd/age, that
| does forward error correction? I'd love to stick some ECC in that
| pipeline, data>zstd>age>ecc>tape, cause I'm paranoid about
| bitrot. I search for such a thing every few months and haven't
| scratched the itch.
|
| The closest is things like inFECtious, which is more of just a
| library.
|
| I would prefer something in go/rust, since these languages have
| shown really high backwards compatibility over time. The last
| thing you want is to find out, 10 years later, that you can't
| build your recovery tool. I will also accept some dusty C util
| with a toolpath that hasn't changed in decades.
|
| https://github.com/vivint/infectious
|
| Ok, I just dug up blkar; this looks promising, but the more the
| merrier.
|
| https://github.com/darrenldl/blockyarchive
| StillBored wrote:
| So, while others have pointed out that the media blocks are ECC
| protected/etc, I think what you are really looking for is
| application/fs control.
| LTO supports "Logical Block Protection",
| which is metadata (CRCs) that is tracked/checked alongside
| the transport-level ECC/etc on Fibre Channel and in the drive
| itself.
|
| Check out section 4.9 in
| https://www.ibm.com/support/pages/system/files/inline-files/....
|
| To be clear, this is a "user"-level function that basically
| says "here is a CRC I want the drive to check and store
| alongside the data I'm giving it". It needs to be supported by
| the backup application stack/etc if one isn't writing the drive
| with SCSI passthrough or similar. It's sorta similar to adding a
| few bytes to a 4k HD sector (something some FC/SCSI HDs can do
| too), turning it into a 4K+X-byte sector on the media that
| gets checked by the drive along the way, vs. just running in
| variable block mode and adding a few bytes to the beginning/end
| of the block being written (something that's possible too, since
| tape drives can support blocks of basically any size).
|
| The problem with these methods is that one should really be
| encoding a "block id" which describes which/where the block is
| as well, since it's entirely possible to get a file with the
| right ECC/protection information that is nevertheless the wrong
| (version of the) file.
|
| So, while people talk about "bitrot", no modern piece of HW
| (except Intel desktops/laptops without ECC RAM) is actually
| going to return a piece of data that is partially wrong, because
| there are multiple layers of ECC protecting the data. If the
| media bit-rots and the ECC cannot correct it, then you get read
| errors.
| eternityforest wrote:
| There's gotta be an API to get the raw data even if it's
| wrong, right?
| StillBored wrote:
| Not usually; it's the same with HDs. You can't get the raw
| signal data from the drive unless you have special
| firmware, or find a hidden read command somewhere.
|
| The drive can't necessarily even pick "wrong" data to send
| you, because there are a lot more failure cases than "I got
| a sector but the ECC/CRC doesn't match". Embedded servo
| errors can mean it can't even find the right place, and then
| there are likely head positioning and amp tuning parameters
| which generally get dynamically adjusted on the fly. This,
| AFAIK, is a large part of why reading a "bad" sector can
| take so long: it's repeatedly rereading it, trying to
| adjust/bias those tuning parameters in order to get a clean
| read. And there are multiple layers of signal
| conditioning/coding/etc, usually in a feedback loop. The
| data has to get really trashed before it's not recoverable,
| but when that happens it's good and done. (Think about even
| CDs, which can get massively scratched/damaged before they
| stop playing.)
| dmitrybrant wrote:
| If I'm not mistaken, the tape drive automatically adds ECC to
| each written block, and then uses it to verify the block the next
| time you read it. So if there's bit rot on the tape (i.e. too
| much for ECC to fix), it will just be reported as a bad block
| with no data, and there wouldn't be any point in adding
| "second-order" ECC from the user end.
| metabagel wrote:
| You're exactly right. There is substantial ECC in the LTO
| format. If the drive can recover the data, then it's valid.
| BenjiWiebe wrote:
| There might be a point if you interleaved data and/or had a
| much higher amount of EC, such that you could recover from
| isolated bad blocks.
| c0l0 wrote:
| It may not _exactly_ be what you are looking for, but if you
| want to protect a stable data set from bit-rot after it's been
| created, make sure to take a look at Parchive/par2:
|
| https://en.wikipedia.org/wiki/Parchive
|
| https://github.com/Parchive/par2cmdline/
| genewitch wrote:
| Parity archives used to be extremely popular back when dialup
| was king.
| I've often wondered if there's a filesystem that
| has that sort of granular control over how much parity there
| is. I'd use it, for sure.
| uniqueuid wrote:
| ZFS is probably closest to what you want.
|
| It allows you to choose the amount of parity at the disk
| level (as in: 1, 2, or 3 disk parity in raidz1, raidz2 and
| raidz3). You can also keep multiple copies of data around
| with copies=N (but note that when the entire pool fails,
| those copies are gone - this just protects you by storing
| multiple copies in different places, potentially on the
| same disk).
|
| [edit] To add another neat feature that allows for
| granularity: ZFS can set attributes (compression, record
| size, encryption, hash algorithm, copies etc.) at the level
| of logical datasets. So you can have arbitrarily many data
| stores on a single pool with different settings. Sadly,
| parity is not one of those attributes - that's set per
| pool, not per dataset.
| Notanothertoo wrote:
| ZFS is king imo. Btrfs is the more liberally licensed OSS
| competitor and ReFS is the M$ solution.
| JustFinishedBSG wrote:
| Still extremely popular (as in _the norm_ ) on Usenet
| dmitrygr wrote:
| man par2
| amelius wrote:
| With these prices for drives the market seems ripe for
| disruption.
| dsr_ wrote:
| It already is, by spinning disks. Cheaper at the low end,
| faster the whole way through, and random access beats linear
| access for end-user expectations.
| zozbot234 wrote:
| SMR spinning disks are also being widely repurposed as
| "archival", somewhat tape-like media, since they turned out to
| be quite low-performance for the most common use scenarios
| (which means they were getting dropped from soft-RAID arrays,
| etc.).
| amelius wrote:
| You are too focused on read speed. I just want to write
| huge amounts of data at a low cost, and don't mind waiting a
| day for retrieval. I.e., how one normally uses backups.
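As an aside, the interleave-plus-parity idea BenjiWiebe and the par2 commenters describe above can be illustrated with a toy XOR scheme (a minimal sketch only; real tools such as par2 and blkar use Reed-Solomon codes and can tolerate more than one lost block per stripe):

```python
# Toy parity scheme: one XOR parity block per stripe of k data
# blocks. Any single missing block in a stripe can be rebuilt from
# the survivors plus the parity block.

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def encode(data: bytes, k: int = 4, blksz: int = 8):
    """Split data into stripes of k blocks, each with a parity block."""
    data = data.ljust(-(-len(data) // blksz) * blksz, b"\0")  # pad
    blocks = [data[i:i + blksz] for i in range(0, len(data), blksz)]
    stripes = [blocks[i:i + k] for i in range(0, len(blocks), k)]
    return [(s, xor_blocks(s)) for s in stripes]

def recover(stripe, parity, lost_index):
    """Rebuild one missing block from the survivors plus parity."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors + [parity])
```

Losing any one block per stripe is recoverable; losing two is not, which is why real FEC schemes carry several parity blocks per stripe.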
| lazide wrote:
| Most of the time, when people think backups they need faster
| than 24-hour turnaround to restore - because it usually
| takes about that long to figure out they even need a
| backup, and most people don't plan ahead enough for a 2-day
| recovery time to be useful for most use cases nowadays.
|
| If their local snapshots are dead too, or they look for
| something and realize they can't find a copy of what they
| thought they had, it's often because they needed that data
| right away and it wasn't there when they went to get it.
| Hence 'user expectations'.
|
| That's not the catastrophic case (which rarely happens);
| that's the 'bob just realized he deleted the folder
| containing the key customer presentation last Friday' or
| 'mary just tried to open the contract copy she needed and
| it's corrupted' case.
|
| If it's a once-in-10-or-100-year event, a 1-2 day
| turnaround is not unexpected, and everything else is
| probably broken too. A deleted file or something getting
| screwed up happens more often, and slow response there
| grinds things to a halt - and causes a lot of stress,
| knowing it's not 'solved'.
| amelius wrote:
| I bet most companies who are confronted with ransom
| demands would die for tape backup, even if restoration
| took a week (which is the amount of time they need anyway
| to get the whole mess sorted out).
| TheCondor wrote:
| And durability. I've had a portable USB hard drive fall
| over on my desk, and it had major problems after that. Solid
| state fixes that, but it's expensive, and I've heard they can
| lose data if not plugged in with regularity.
| lazide wrote:
| Yeah, SSDs are not good for long-term storage (like a copy
| of your tax documents from last year that you might need in 5
| years). The expense per size also makes it infeasible to
| keep ongoing roll-up copies of everything, which is one
| way of solving that.
| KaiserPro wrote:
| kinda but not.
| The problem with spinny disks is that you have to
| allocate space for them. You can't quickly swap out drives to
| take offsite.
|
| What's grand about tape is that it's still faster to dump to
| your library, eject the magazines and store them off site.
|
| Whilst you can do that with HDDs (think snowballs but bigger)
| it's a lot more expensive and error prone.
|
| Tape serves a purpose, but it's pretty niche by today's
| standards.
| wglb wrote:
| It would appear that Google backs up the internet on tape:
| https://www.youtube.com/watch?v=eNliOm9NtCM
|
| Or at least did at one time.
| fishnchips wrote:
| It probably still does. I was on the gTape SRE team until 2014,
| and we had lots and lots of tapes and tape libraries back then,
| most of them giant beasts with 8 robots each. With the capacity
| of new LTO generations constantly growing and the existing
| investment in hardware and software it would be unusual to
| discard that.
| cassepipe wrote:
| Apart from archiving huuuuuge amounts of data, does it make sense
| for any business to invest in these when you add up the
| qualified work time they require against the halved price they
| provide? Plus the constant reinvestment in hardware. Plus the
| fact that to get the data you actually need a human to fetch it
| for you and operate a machine.
|
| Who uses this?
| motoboi wrote:
| Everyone.
|
| It's much easier to store tapes in a fireproof and water-
| resistant safe than to find fire- and water-resistant storage.
|
| So you can keep your backups on disk, but last-resort disaster
| recoveries should be on tape somewhere.
|
| Gmail has tapes[1]. And they saved their asses at least
| once. This can give you a hint of how important tapes are and
| how much use they get.
|
| 1 -
| https://www.datacenterknowledge.com/archives/2011/03/01/goog...
| madduci wrote:
| A lot of companies, trust me.
| archi42 wrote:
| Something not mentioned by the author, but which I was told here
| on Hacker News some years ago: if your drive has too much wear
| (or misalignment of the drive head?) you might end up with tapes
| that you can only read with exactly your drive.
| detaro wrote:
| That's something I've seen mentioned too, but I never could
| verify whether it's actually true with modern tape
| standards or not. (i.e. last I asked on HN I was told it wasn't
| a concern anymore.) If the drive needs to adjust to get precise
| enough positioning anyhow, misalignment seems way less likely.
| StillBored wrote:
| That was true before embedded servo tracks (which is why the
| author mentions you can't bulk erase LTO tapes); it's not been
| true for ~20 years unless one was using DLT, DAT, etc.
| op00to wrote:
| It's absolutely true. There is a LOT more to tape storage
| than meets the eye.
|
| Let's say you're using LTO tapes as an archive. Did you know
| LTO tape itself is abrasive, and that abrasive is meant to
| wear over time with the intended use of the cartridges, which
| was backups?
|
| If you use new tapes a single time, the abrasive doesn't wear,
| and it destroys the tape heads. You will go through a drive head
| a month, running the drives 24/7. I had a library used as a
| genomic storage archive with 8 drives (always write, almost
| never read), and two were constantly out of service, as we
| averaged two head replacements from IBM a week.
|
| This is much less of a factor on used tapes that have been run
| through a drive a few times.
| detaro wrote:
| But that's different from "the drive will produce tapes that it
| can read, but other drives don't"? Because sure, drives can
| fail and need service/replacement, but that's less
| insidious than a drive producing tapes that are silently
| unusable in other drives.
| KaiserPro wrote:
| It used to be true with DAT tapes.
|
| I've not seen it on LTO.
| Where I work we had very
| large tape libraries, with 25+ drives in them. We didn't have
| drive affinity, so if that happened I would get an alert.
|
| The other team used to import bulk data by receiving tapes
| from all over London and beyond; there must have been
| thousands of drives writing and reading that data. Plus we
| didn't buy fresh tapes, and they were dropped, thrown, left
| in the cold/sun, all sorts.
|
| I think LTO is pretty solid.
| eternityforest wrote:
| I wonder why the head has to touch the tape at all? Does
| the hard drive thing, where you float a few nm away, not
| apply?
| metabagel wrote:
| I worked for an LTO tape drive manufacturer for 20 years,
| and I never heard about this. I think something else was at
| play here, although I could be wrong. The drives are often
| used just as you did, although perhaps not always as
| intensively. Data is written to tapes, and they are shipped
| offsite. Basically, WORN (write once, read never). The
| backups are for an absolute emergency, such as a 9/11-type
| event where a whole building comes down or a data center
| burns to the ground.
|
| A few factors which may have influenced what you
| experienced:
|
| * The quality of the tapes could be variable. In my
| experience, some branded tapes were significantly inferior
| to others.
|
| * If the drive ran hot, then that may have contributed.
| IIRC, IBM's LTO-3 drive ran very hot.
|
| * If you don't write data to the tape fast enough, it won't
| stream. It'll shoe-shine back and forth, as it runs out of
| data, repositions backwards on the tape, and resumes
| writing. I think this might affect tape head life.
| op00to wrote:
| These were IBM drives in a QualStar XLS connected to
| systems running FileTek StorEdge. I don't remember if
| these were Fuji or Sony tapes, but I think Fuji, branded
| Fuji.
|
| We did have shoeshining issues in testing, but increasing
| the amount of caching fixed that.
| Never heard of any
| throughput issues in production, but... .edu, so you know
| how well we monitored. That was a software issue anyway.
|
| I think it was the LTO-5 era, but I don't rightly remember.
|
| The IBM dude who handled all the hardware support would
| take a look at everything, nod, and replace the drive. I
| took him out for beer once, and that's when he told me
| about the issues with the tapes. I left for greener
| pastures before that was solved, but it was going on for
| a good year.
|
| Maybe he liked the food trucks outside the building, or
| maybe it was cheaper for them to replace the drives than to
| actually help us fix the problem. Anyway, thanks for the
| insight! Glad I don't work on hardware anymore.
| MayeulC wrote:
| I'm wondering what would be the best way to store archival data.
|
| A disk image plus compressed, encrypted, then forward-corrected
| `btrfs-send` snapshots sounds quite efficient to me. Take your
| hourly, etc. snapshots to a regular disk, write monthly ones to
| the tape until it fills up, then take another tape and repeat.
| The downside is that you need to replay multiple diffs.
|
| Or would it be a good idea to make more frequent writes? I'm not
| sure what best practices are when it comes to tape and backup.
| einpoklum wrote:
| > LTO Tape is ... much cheaper than hard drives ... a 12TB SATA
| drive costs around PS18.00 per TB ... a LTO-8 tape that has the
| same capacity costs around PS7.40 per TB ... That's a significant
| price difference.
|
| Actually, it isn't very significant: a price factor of 2.5. I had
| thought tape storage was cheaper than that. And then there are
| the drives: a drive to write (3,000 GBP for LTO-8), and at least
| a couple more drives for reading tapes.
|
| At this price ratio, I would say that ease of use and
| safety/robustness of the backed-up material are the more
| important considerations.
| shellac wrote:
| Yes, this doesn't sound quite right to me, but it may be an
| economies-of-scale thing.
| I work on an HPC system, and we budget
| an order of magnitude less for tape storage; that has held
| for quite a few years.
| AshamedCaptain wrote:
| I am also worried about the long term. If there are new
| generations so frequently, and backwards compatibility is
| limited or not guaranteed, I wonder whether you'd be able to find
| a working-condition tape reader for your 20-year-old tape...
|
| At least it's likely I can find a USB port 20 years from now,
| or a DVD reader (they are still being manufactured today, when
| even more than 20 years have passed since their introduction,
| and they are even compatible with much older CDs...).
| ktpsns wrote:
| What was actually ignored in that comparison is energy costs,
| which can really add up if you have all your disks
| running 24/7 and do not use power-saving functions (which are
| frequently turned off in server contexts). Costs are in the
| ballpark of 5W per drive; given a contemporary 16TB drive this
| means 0.3W/TB, and with 0.25EUR/kWh (a typical consumer price in
| Germany), this is roughly 0.6 EUR per TB per year. However,
| probably the replacement costs for these always-on disk drives
| will be even higher.
| q3k wrote:
| Another consideration related to this is that tapes, being
| usually offline, are much more secure against accidental (or
| malicious!) erasure when compared to always-on hard drives.
|
| Also related is that tapes can easily be transported
| around/offsite, literally thrown in the back of a truck as
| they are. Try doing that to hard drives and see how many
| start throwing bad sectors after a round trip.
| einpoklum wrote:
| HDDs can be taken offline. But beyond that - if you're
| using HDDs as backup, you'll probably be using an HDD
| drawer, e.g. something like one of these:
|
| https://www.newegg.com/global/p/pl?d=hot+swap+hard+drive+bay
|
| ... and the actual disks will usually be stored offline.
| So, no accidental erasure.
| But I agree that tapes are
| probably less sensitive to transportation.
| piaste wrote:
| If you want tape-like offline storage on HDDs, you can use a
| SATA docking station. Keep the 'active' backup drives plugged
| in, and store full drives wherever you like.
|
| As a bonus, they can generally be used to offline-clone
| drives.
| archi42 wrote:
| Tapes are offline and even require manual loading, so I think
| it's feasible to mitigate this by just powering down the
| backup system. At least that's what I do (with my primary
| NAS). But yeah, disk idle usage should not be underestimated.
|
| Also, some nit-picking: energy prices in Germany are
| currently MUCH higher than that. We moved and had to get a
| new contract: close to 40c/kWh. This makes your point a bit
| stronger.
|
| //edit: Also2, when doing the math I realized I should first
| transcode suitable content to h265 (per TB saved, the
| necessary power is cheaper than a new disk), and as a second
| step replace my four or five remaining 1 TB HDDs with a
| single bigger drive to reduce the idle power draw (the NAS is
| on a btrfs mixed-size RAID1).
| pessimizer wrote:
| > and at least a couple more drives for reading tapes.
|
| Why?
| op00to wrote:
| Even in an "enterprise+++ class", multi-petabyte, multi-drive,
| totally integrated from top to bottom tape archive for
| scientific data, there would be all kinds of errors found by
| our data validation process that would have failed an archive
| restore. It's not just cache overruns; sometimes the tapes
| or drives just screwed up silently.
| benjojo12 wrote:
| If you have 500TB of tape, the chances are that you are
| reading at least one tape while also needing to write stuff.
|
| I've personally never experienced that scale; I'm sure the
| industry has some recommended ratio of drives to tapes.
| op00to wrote:
| When I evaluated this, it was all about read and write
| access patterns.
| So much data coming in over so much
| time, that needs so much validation, and will be
| restored so many times in the next few years, etc. It's
| pretty easy if you know your data flows, but when it's a
| big question mark, you just kind of throw hardware at it
| and fix the bottlenecks when they come up. We usually wrote
| more than we read, but we absolutely needed to keep read
| capacity open.
| dale_glass wrote:
| Backups are there so that they can be restored. If your only
| drive is dedicated to writing, then you may never bother
| reading anything, and that's bad, because you should verify
| your backups.
|
| Also, tape is slow. The MB/s is pretty nice on the latest
| tech, but a tape is pretty big, so if you have a lot of stuff
| it'll take a good while. Google says it takes 9.25 hours to
| write a full LTO8 12TB tape. Which means that if you have a
| sizable backup, in case of needing a full restore you might
| well spend a whole week reading tapes.
|
| And that's not accounting for the chance that something might
| suddenly break - and the time when that becomes important is
| right when you need something restored urgently.
| connorgutman wrote:
| I recently purchased an LTO-5 drive for my Gentoo-based NAS and
| have a few key takeaways for those who are interested. Don't buy
| an HP tape drive if you want to use LTFS on Linux! HPE Library &
| Tape Tools is pretty much dead on modern Linux. Official support
| is only for RHEL 7.x and a few versions of SUSE. Building from
| source is a dependency nightmare that will leave you pulling your
| hair out. IBM drives have much better Linux support thanks to
| https://github.com/LinearTapeFileSystem/ltfs. That being said,
| IMO, you should consider ditching LTFS for good ol' TAR! It's
| been battle-tested since 1979 and can be installed on basically
| anything. TAR is easy to use, well documented, and makes way more
| sense for linear storage.
| While drag & drop is nice and all, it
| really does not make sense for linear storage.
| smackeyacky wrote:
| Upvote for tar! LTFS seems like an overly complex solution to a
| relatively simple problem that tar already solved. Treating
| tapes like disks and trying to run a file system on them
| ignores the way they work.
| wazoox wrote:
| Hum, that reminds me that I've written a somewhat more complete
| user guide for LTO tapes, but in French:
| https://blogs.intellique.com/tech/2021/08/20#BandesCLI
|
| Let me know if you'd like an English version :)
| wazoox wrote:
| I did it anyway:
| http://blogs.intellique.com/tech/2022/01/27#TapeCLI
| cbm-vic-20 wrote:
| > Unlike most block devices these are devices that do not enjoy
| seeking of any kind.
|
| Old-school DECtapes were actually random-access, seekable block
| devices! They held 578 blocks of data, each block being 512 bytes
| (or, to be more period-correct, 256 16-bit words), so about 144K
| words in total. They could be read/written in both directions.
| When mounted on a tape drive, the OS (like DEC RT-11) would treat
| it just like how a PC DOS computer treats a floppy: you could get
| a directory listing, work with files, etc. The random-access
| nature caused the tape to move quickly back and forth across the
| tape head, a process known as "shoe shining".
|
| https://youtu.be/ZGBS8mBAfYo?t=579
| rbanffy wrote:
| I've seen AIX being installed from a DDS tape, after booting
| from said tape.
|
| Fun times.
| StillBored wrote:
| Tape can do random seeks, but it's generally append-only. LTO,
| though, supports partitioning, which is utilized by LTFS
| (https://www.lto.org/linear-tape-file-system/) to provide a
| mountable filesystem abstraction. It works just like any other
| filesystem, but one has to remember that seeks are much slower
| than on HDs, and that overwriting/updating a file is basically
| like a versioned FS where the old data is still being stored.
| | Edit: Also, tape formats tend to come in two scan methods, since | the tape is generally wider than the tape heads (which frequently | are actually multiple heads): helical scan (think VHS/DAT) and | serpentine. LTO is serpentine, which means it writes a track | from beginning to end, then writes the next track in "reverse" | from end to beginning, then the next track again from beginning | to end. Back and forth until it hits its track limit. | | So basically just about every modern drive reads and writes in | both forward and reverse. | | Although shoe shining (backing up to start the next read/write) | is still a thing despite variable-speed drives, which try to | speed-match the data rate the host is reading/writing at. | EvanAnderson wrote: | This makes me think about the Stringy Floppy: | https://en.wikipedia.org/wiki/Exatron_Stringy_Floppy | tssva wrote: | The Coleco Adam home computer also had tape drives which were | random-access seekable block devices. 2 tracks with 128 1k | blocks per track for a total capacity of 256k. Coleco called | their tapes digital data packs. They were standard compact | cassette tapes with some additional holes. If you drilled the | appropriate holes you could use standard tapes instead of | paying the Coleco premium. | | CP/M required booting from a block device and as far as I know | the Coleco Adam was the only computer which could boot CP/M | from a tape. Once booted to CP/M the tape drives were treated | just as floppies. | tlamponi wrote: | Interesting read, as with most of ben's blog. And yeah, | buffering is definitely required to get acceptable speed out of | tape tech.
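tlamponi's buffering point can be sketched like this: interpose a large, fixed block size between tar and the device so the drive sees a steady stream instead of bursty writes. This is a dry run against a regular file (`out.tar` stands in for /dev/nst0); dedicated tools such as mbuffer with a few GB of RAM are a common choice on real hardware:

```shell
# Generate some data, then stream it through dd with a big block size.
# Bursty, small writes are what cause stop-rewind-restart
# "shoe-shining" on a real tape.
mkdir -p data
dd if=/dev/zero of=data/blob bs=1M count=4 2>/dev/null
tar -cf - data | dd of=out.tar bs=512k 2>/dev/null
tar -tf out.tar    # verify the stream landed intact
```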
| | If you want an LTO tape solution with more bells and whistles you | could check out Proxmox Backup Server's tape support: | | https://pbs.proxmox.com/docs/tape-backup.html | | We also rewrote mt and mtx (for robots/changers) in Rust, well, | the relevant parts: | | https://pbs.proxmox.com/docs/command-syntax.html#pmt | | https://pbs.proxmox.com/docs/command-syntax.html#pmtx | | The introduction/main feature section of the docs contains more | info, if you're interested: | https://pbs.proxmox.com/docs/introduction.html If you have your | non-Linux workload contained in VMs and maybe even already use | Proxmox VE for that, it really covers safe and painless self-hosted | backup needs. | | Disclaimer: I work there, but our projects are 100% open source, | available under the AGPLv3: https://git.proxmox.com/ | azalemeth wrote: | Do you run a service where I can give you data and reasonable | money, and you store it on tapes for me? Low-cost cloud storage | prices seem very distant from this, because presumably it's | usually spinning rust and not tapes that are doing the storage. | I'd be into a cheaper, larger storage service where this was | offered. | tlamponi wrote: | No, we don't provide hosting services - only the software: | Proxmox VE for the hypervisor side (VMs and Linux containers, | clustering, hyper-converged storage with Ceph and ZFS integrated | directly, and most Linux storage somewhat too); then Proxmox | Backup Server with PVE integration, which can do deduplicated, | incremental sending of backups and save them to any Linux FS | or, well, LTO tapes; and lastly (at least currently, we've got more | in the pipeline) there's Proxmox Mail Gateway, the oldest | project and a bit of a niche, but there's not much else like | it available today anymore. | | > and you store it on tapes for me?
| | I mean, we can do client-side encryption and efficient remote | syncs, so such a service would be possible to pull off with | PBS, but no, we don't have the bunker or dungeon to shelve all | those LTO tapes at the moment :-) | Johnny555 wrote: | What is reasonable money? AWS Glacier Deep Archive is around | $1/TB/month. Since it includes Multi-AZ replication for | "free", you'd have to store multiple tapes in multiple | facilities to get the same durability with tapes. | | Retrieval costs are additional of course, and depend on how | quickly you need access to the data, but if you just want to | store data long term in case of disaster, $1/TB for multi-AZ | replicated data seems like pretty reasonable pricing. | | LTO-6 tapes hold 2.5TB of data (uncompressed); assuming you | store 2 for redundancy, you'd need to find a place that will | store them for $1.25/tape/month to break even, plus you're | paying $25 for the tape itself, so over 3 years, that's | almost another $1/month/tape. Plus the tape drive itself is | around $1500. | | You can use newer tape technology for better economies of | scale, but your buy-in cost is higher due to the higher price | of the tape drive, so you'd need a pretty high volume of data | to break even. | terafo wrote: | Glacier cost in the cheapest region is $3.60/TB/month. Plus | at least $50 to download that terabyte once it's needed (if | the hardware that you're backing up is not in AWS), and I don't | even factor in retrieval costs. You can get HDD storage | cheaper than this (twice as cheap with some providers) if | you are willing to use dedicated servers. And they come | with unlimited traffic. And you can use the hardware there for | something. Glacier is expensive AF. | Johnny555 wrote: | _S3 Glacier Deep Archive_ is the closest equivalent to | offsite tape storage.
| | From their pricing page: | | S3 Glacier Deep Archive - For long-term data archiving | that is accessed once or twice in a year and can be | restored within 12 hours - us-east-2 (Ohio) | | All Storage / Month $0.00099 per GB | | https://aws.amazon.com/s3/pricing/ | terafo wrote: | Sorry, I confused it with the Archive access tier. Still, you | need to spend at least $50 to download it from AWS. | Johnny555 wrote: | This is deep-archive offsite tape storage, not something | you'd need to restore often. | | When I last managed offsite tape backups, I never planned | on really needing to retrieve the data -- I had the data | on disk and on the most recent tapes. (I did do periodic | restore tests.) | | If I had to restore the data, I wouldn't care how much it | costs (within reason). | [deleted] | aperrien wrote: | That is really impressive! Are the Proxmox tape utilities | separate from Proxmox itself? I have a Synology NAS that I'd | like to back up to tape. I actually have a tape library, but I | haven't seen anything that looks like a simple solution for | this until now. | tlamponi wrote: | Well, the CLI tools are not really coupled to Proxmox Backup | Server and could be built for most somewhat modern Linux | distros, quite possibly also other *nix-like systems. | | The whole tape management is in the common PBS API, so that'd | be a bit harder to port, but not impossible. For example, I | made some effort to get it all to compile on AArch64 (ARM) and | while we do not officially support that currently, there are | some community members that run it just fine. | | So, maybe, but it could require a bit more of a hands-on approach. If | you run into trouble you could post in the community forum | (<https://forum.proxmox.com>). | epilys wrote: | What I never see explained is exactly which PCIe cards I should | get to make a full-sized SAS drive work with my desktop PC?
| Because looking at server component stores, I see three-digit | prices for SAS controllers, and the author mentions they are | cheap. | c0l0 wrote: | My advice: On eBay (or any other platform that makes it easy to | buy used hardware components), go look for "sas2008 4e", and | check out the offers. You should be able to get a decent HBA | driven by mpt2sas/mpt3sas for around 40 to 60 US$. | albertzeyer wrote: | I'm interested specifically in long-term archiving. So these | tapes claim 30 years. I have read that some types of CD, DVD or | Blu-ray can last much longer. | | https://superuser.com/a/71239/37009 | | For example the M-DISC (https://en.wikipedia.org/wiki/M-DISC). | | > Millenniata claims that properly stored M-DISC DVD recordings | will last 1000 years. | buttonpusher wrote: | Yes, but storing many TBs on several low-volume discs is a PITA | unless you can invest in a robotic library. | | I wonder if Sony's ODA format could ever become more popular in | the consumer market. I've never heard anybody mention it | before. | | Alternatively, I wonder if there could even be a "prosumer" | robotic library system for common optical disks, something like | a desktop archival data jukebox... | albertzeyer wrote: | There are many examples of cheap self-built robotic systems | (basically robotic CD changers). E.g.: | | https://hackaday.com/tag/cd-changer/ | | http://hackalizer.com/jack-the-ripper-is-an-automated-diy- | di... | | http://hackedgadgets.com/2006/06/07/cd-changing-lego-robot/ | | Yes, you definitely want something like that. And further extend | it. | londons_explore wrote: | If I had a large amount of data I needed to archive long term | and cost effectively, I would archive it to 12 different media | with a 4+8 erasure code, such that if any 4 of the 12 media | types are readable, then I can recover the data. I'd choose | media like a few types of hard disk (different vendors), DVDs, | SD cards, USB memory sticks, tapes.
| | I would then store those bits of media geographically and | politically distributed. And I'd store it with paper documents | describing the encoding, the file formats, the compression, any | encryption, etc. I'd also include a few physical computers (e.g. | a Raspberry Pi or laptop) that have all the necessary software to | read, decode, and display the data. Set it up to be usable by a | non-expert - in 1000 years' time, there may be nobody who knows | how to use a shell or open a file! | | And I'd have a 2nd copy of the whole lot on hard drives | connected to the internet for day-to-day serving of the data to | people who need to see it. All the stuff above is only needed | in case of organisational failure, war, civilisation collapse, | etc. | albertzeyer wrote: | That all sounds nice... but do you actually do that? I'm sure | you have some amount of data (maybe not so large) that you | want to back up long-term? As most of us do? | | If you do that, I would really love to read some more details | on how you actually organize that. | raron wrote: | Maybe GitHub's "code vault" would be interesting for you: | https://github.com/github/archive- | program/blob/master/GUIDE.... | https://archiveprogram.github.com/ | londons_explore wrote: | My personal data I have no need to keep beyond my own | lifespan, and I don't have much of it, so it's easy. | | The above is what I have set up for some organisations who | want to keep data for thousands of years. | | There are other bits to the process, like every 10-30 | years, repeat the process with the new data _and_ the old | data. This time, the 'old' data will be much smaller | compared to the storage mediums, so keep that data | uncompressed, preferably unencrypted, and un-erasure-coded | in every geographic location. That removes many barriers to | accessing the data, and increases the chances that someone who | finds it in 200 years bothers to recover the data.
| | Sadly in the future world there is a high chance some of | the data is copyrighted, illegal knowledge or GDPR-impacted | and all records need to be erased. There isn't really a | good solution to that. It's almost impossible to protect | against future humans _wanting_ your data gone. | c0balt wrote: | IIRC the cost per TB, compared to tape, made discs unviable for | most backup/archival applications. | paulmd wrote: | Depends on who's asking. Amazon Glacier never formally | disclosed their storage medium (at least as of a few years | ago) and one of the theories on what it might be was actually | a robotic optical disc changer library based on BD-XL, and | the cost/capacity actually does math out. Yeah, discs might | be $15 a pop (for a quad-layer/128GB disc) for you as a | consumer, but when you're Amazon and you'll be buying the | complete output of at least one optical disc factory, the | economies of scale kick in. It's just expensive because | there's no market for 128GB media for consumers (and honestly | these days hardly any market for WORM media at all as a | consumer); it's not inherently that expensive to make the | discs. | | (I believe the final consensus pointed to arrays of HDDs | where most of them are powered off, and the number of "live" | drives per rack is bounded to allow high density/low cost, | hence the need for access time/service level bounds, but the | BD-XL idea is still intriguing!) | | With the consumer discs, even considering cost per GB, the | amount of effort required to handle a large library of | low-capacity discs is just too great even if the cost is a little | bit better. 128GB discs would have been very usable 5 years | ago but again, those discs were never affordable to | consumers, and the 25GB ones were still some effort at that time. | Today even 128GB is not all that much, as data has grown.
As far as I know there is nothing realistic on the horizon to | replace Blu-ray with higher capacity either; if movie content | started being released in 8K it probably would be something | like BD-XL with AV1 encoding (or maybe H265 again), not a | fundamentally new iteration like DVD->BD. | | The future for consumer storage seems to be SSDs and hard | drives for fast and slow/bulk storage, and cloud for nearline | storage. Tape is still relevant for enterprises though, | especially in automatic libraries. | c0balt wrote: | Interesting, I didn't know about Glacier. | | The theory of hard drives being shut off/powered on | dynamically in a rack sounds intriguing. Sounds simple and | yet difficult because of the rare use case, i.e. no | commodity hardware available. Maybe something to test out | for colo backups to keep power usage down and prolong disk | health. | numpad0 wrote: | Capacity per disc too. Blu-ray discs top out at 128GB and | there's no cheap and easy way to automate disc handling to | work around that. | at_a_remove wrote: | I am also interested in some long-term archiving: in | particular, .ISOs of various Blu-ray, DVD, and CD media | releases. | | Still, aside from it being prohibitively expensive (LTO-8 seems | like something of a floor given the size of Blu-rays), tape | backups seem to be a hard area to get into. I did some crappy | little DLTs in the 1990s but nothing since, so "what software?" | and the like questions are all new to me. And this would be | with just a single drive, not even a library. | Robotbeat wrote: | Magneto-optical disks using glass media (instead of plastic) | have a rated stable media lifetime of at least 50 years and can | probably last a century or longer. Glass DVDs are a thing and | often can be read in regular DVD drives. | dehrmann wrote: | > Magneto-optical disks using glass media | | Are there any commercial products that use this technology?
| eternityforest wrote: | Not magneto-optical, but M-DISC makes a Blu-ray that supposedly | lasts 1,000 years. Some people don't trust it though. | EvanAnderson wrote: | I believe mag-op has fallen out of fashion. I worked with | HP-branded mag-op "platters" and drives back in the late | 2000's. Plasmon and Sony both had offerings in that space | too. | c0l0 wrote: | I recently started looking into using an LTO-7 tape drive that I | got handed down, along with a few dozen pristine LTO-6 tapes, | for archiving purposes. I got to play around a bit with SAS HBAs, | and was kinda shocked how much of a difference that can make in | the user (or shall I say sysadmin?) experience: LTO-6 tapes are | spec'd at transfer rates of around 150MB/s, so well within the | reach of even the first SAS gear generation. However, the | very first SAS HBA with external SFF-8088 connector I managed to | get my hands on (an LSI SAS1068e) topped out at a disappointing | 80MiB/s, no matter what I tried in terms of blocking and | buffering. Switching to a more modern (but still old) LSI | SAS2008-based HBA got me close to the theoretical maximum. | | Then there's the (to me, still open) question of how to best use | the actual tape storage capacity... Since my hardware is newer | than LTO-5, LTFS (https://github.com/LinearTapeFileSystem/ltfs) | is an option for convenient access, especially listing tape | contents, but that could make it hard for other people down the | line to restore data from the tapes I create. | | It's probably safest to assume that tar will always be there, at | least wherever there's tape, too. GNU tar also handles | multi-volume/-tape archives, which seems like a necessity if you need | to back up amounts of data that exceed a single tape's capacity.
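GNU tar's multi-volume mode can be dry-run with small files standing in for cartridges (sizes here are made up; with real tapes you drop `-L` and swap cartridges when tar prompts for the next volume):

```shell
# Create ~150 KiB of data, then split it across two 100 KiB "volumes".
# -M enables multi-volume mode; repeated -f options name the volumes
# in order; -L is the per-volume size in units of 1024 bytes.
mkdir -p big
dd if=/dev/zero of=big/blob bs=1k count=150 2>/dev/null
tar -cM -L 100 -f vol1.tar -f vol2.tar big
tar -tM -f vol1.tar -f vol2.tar           # list across both volumes
```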
| Then again, if you want to use encryption with actual tar | (important for the kind of data I need to archive), your only | option seems to be piping the whole archive through something to | encrypt the stream, which will make accessing individual records | in the archive opaque to the drive itself... and you can't just | dispose of individual keys to make select parts of the archived | data go away for good, either. | | Also, I would like to conserve as much tape as (conveniently) | possible in my archiving adventure. There are "projects" (i.e., | top-level directories of directory trees) that consume more than | one tape of their own, and then there are smaller projects that you | can bin-pack together onto tapes that can fit more than one such | project. | | I've started implementing a small Python wrapper around GNU tar | to solve a number of these problems by bin-packing projects into | "tape slots" and also keeping track of tape-to-file mappings in a | small SQLite database, but a workable solution for the encryption | problem(s) is not something I've managed to come up with yet... If | someone has an idea (or better yet, a complete and free | implementation of what I am trying to hack together :)), please | be so kind and let me know! | TheCondor wrote: | LTFS has been reasonably well supported and it's fairly open. | (I think it's totally open and published, but I haven't drilled | deep into it.) I haven't manually restored files but I have | switched vendors and it was transparent. It makes tape almost | shockingly good; if you can identify by name what you want to | recover, you can recover it quite quickly. | | I had previously used Blu-ray for backups; I think they are | fairly durable if you have a dry, cool place to store them, but | if you have to find data spread over 20 discs, it's quite a | pain. Now it would feel better if Red Hat or SUSE or somebody | cooked LTFS into their products as a first-class thing.
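The bin-packing c0l0 describes a few comments up can be sketched with first-fit decreasing (project names and sizes are invented; ~2500 GB approximates an uncompressed LTO-6 cartridge):

```shell
# Sort projects by size, descending, then place each on the first
# "tape" with enough room left, opening a new tape when none fits.
printf '%s\n' 'projA 1800' 'projB 900' 'projC 700' 'projD 600' \
  | sort -k2 -n -r \
  | awk -v cap=2500 '{
      placed = 0
      for (t = 1; t <= n; t++)
        if (used[t] + $2 <= cap) {
          used[t] += $2; names[t] = names[t] " " $1; placed = 1; break
        }
      if (!placed) { n++; used[n] = $2; names[n] = " " $1 }
    }
    END {
      for (t = 1; t <= n; t++)
        printf "tape %d: %d GB:%s\n", t, used[t], names[t]
    }'
```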
I think the catastrophic recovery process would involve building | enough of a system to download and install LTFS to access the | tapes. I could also create a "recovery system" and then just tar | that onto a tape too. | | My strategy has been to keep things relatively warm, and | when LTFS starts to feel like a liability I'm going to | move the whole archive to something else. Fortunately it's not | hundreds of BD discs, it's tens of tapes, so it will take some hours, | but it's mostly waiting on data to stream. | [deleted] | ndespres wrote: | I think what you are describing is a feature of the Amanda | backup system, which might be worth a look. It supports writing | to a library of "virtual tapes" which can then be backed by | real tapes, tape libraries, hard disks, etc. and will handle | the splitting/overflow problem that you are dealing with. | | https://www.zmanda.com/downloads/ | abbbi wrote: | if one wants to play with virtual tape libraries, quadstorvtl is | a nice solution for that: | | https://quadstor.com/ | | unfortunately they don't seem to have an open VCS for the | source... (other than really old versions on github) | | other than that there is mhvtl: | | http://www.mhvtl.com | Synaesthesia wrote: | That's a lot of storage! Can't really think of a use for this | (200TB plus) personally but it is appealing. | throw0101a wrote: | Tape makes more sense the larger you go, as it helps amortize | the fixed/upfront costs. The incremental cost of buying more | tapes (that are re-usable) isn't that much at scale. It's often | relatively cheap insurance against data loss for many | organizations. | | A lot of 'enterprise' backup software is also now coming with | hooks into cloud storage (e.g., S3 APIs), but then you have to | worry about bandwidth and the time it takes to get the bits | offsite at "x" bits/second.
| | Of course you also have to worry about retrieving the data in | case of disaster, per the Recovery Time Objective: | | * | https://en.wikipedia.org/wiki/Disaster_recovery#Recovery_Tim... | | Also: a backup has not happened until you try, and succeed at, | your recovery process. | metabagel wrote: | > but then you have to worry about bandwidth and the time it | takes to get the bits offsite at "x" bits/second. | | Reminds me of the saying that the fastest throughput is | achieved by a 747 full of hard drives. | | > Also: a backup has not happened until you try, and succeed | at, your recovery process. | | A thousand times this. | simcop2387 wrote: | For me it's a lot about just being a data hoarder and never | _having_ to delete something because I'm low on storage. About | half of my system though is taken up by system backups and | virtual machines. I should do a cleanup of those, but the | freedom of just being able to spin up something new or put a | new backup on there without ever going, "do I have enough space | for this?" is rather nice. | organsnyder wrote: | I also rarely/never delete anything, but my ~2TB NAS still | has plenty of room. I guess it makes a difference that the | only media I store is my own photos and videos. | Spooky23 wrote: | Backups are a really interesting business. I helped out a | colleague a few years ago with a project in a big data center and | it was like a whole world that nobody knew existed. | | Because of the RTOs and backup windows, the supporting | infrastructure was _fast_. The caching-layer stuff was the | fastest disk in the data center by far, and the team was a small, | tight group of people who basically honed their craft by meeting | auditor and other requirements. The management left them alone | and they did their thing. | | That was about a decade ago now; those guys have all moved on to | really big things.
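The back-of-envelope arithmetic behind these RTO and bandwidth worries is quick to check with awk (figures illustrative: one 12 TB LTO-8 cartridge at a sustained ~360 MB/s, and pushing the same 12 TB offsite over a 1 Gb/s link; this reproduces the ~9.25-hour figure quoted earlier in the thread):

```shell
# Hours to stream one full cartridge at the drive's native rate:
awk 'BEGIN { printf "tape: %.2f h\n", 12e12 / 360e6 / 3600 }'
# Hours to move the same data over a 1 Gb/s WAN link (note bits vs bytes):
awk 'BEGIN { printf "wire: %.1f h\n", 12e12 * 8 / 1e9 / 3600 }'
```

Neither number accounts for tape changes, seeks, retries, or protocol overhead, so real restores run longer.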
| StillBored wrote: | It's still that way. The Netflix guys get a lot of press for | their bandwidth numbers, but plenty of backup systems were | getting similar (or greater) bandwidth numbers years ago, since | many of the caching stacks are basically PCIe- or memory-bandwidth | limited. The 300MB/sec number the author lists is really slow, | and likely appropriate for LTO3/4 (IIRC, the Wikipedia numbers | are understated); LTO7+ can peak at > 1GB/sec, with the modern | drives going even faster if the compression is left enabled. | So, given a library with a few dozen drives, the bandwidth gets | insane. (ex: SL8500) | trasz wrote: | Someone should ask the Spectra Logic folks for their numbers :-) | | (Spectra Logic's tape libraries run FreeBSD too.) | monocasa wrote: | SpectraLogic's code isn't in the data plane; you hook up to | the drives directly, and the drives can forward changer | requests to the internals of the library. So it's however | fast the drives are (which are all third party). | | Also, last I checked FreeBSD was used for their disk | product, not tape. | johnklos wrote: | LTO has been around for more than twenty years, true, but not | quite thirty, so we can't test the claim of thirty years of shelf | life, but DLT, which is surprisingly similar, came out in 1984, | and lots of thirty-year-old and older DLT media has been shown to | be readable. | | The tape drives themselves are much more of an issue than the | tapes. It's a shame, because it necessitates moving data on older | tapes to newer-generation tapes after a few generations (which | reminds me I have to do that with some LTO-3 tapes). | wheybags wrote: | My one experience of digital magnetic tape is MiniDV | cassettes. I recently ripped a bunch of old home videos from | some cassettes from the 2000s, and quite a few were fairly | damaged. Compared to the VHSes from the same time and even | older, they were way worse.
| jgrahamc wrote: | Speaking of tape lifetimes, my old cassette CrO2 tapes seem to | have survived my parents' house: | https://blog.jgc.org/2009/08/in-which-i-switch-on-30-year-ol... | grapescheesee wrote: | Many clients I have seen using tapes for archive or onsite backup | keep them in a humidity- and temperature-controlled device (looks | like a mini fridge). Seems the emphasis is on humidity for the | onsite backup rotations. | watersb wrote: | Everyone who cares about backups chooses a backup system design. | | Anyone who cares about their stuff needs to practice a full | emergency RESTORE. | | I have met very few people who actually do that. For most systems | I've seen, the first full test of the restore process is a very | scary first production usage of the restore process. | | Which is very exciting, sure. I don't want excitement in my data | management life. | | (I actually see weekly tests of onsite backup power at the local | banks, and at some large commercial kitchens. Those diesel | generators are very loud. I've never seen systematic tests of UPS | or generators in a front-office environment.) | smackeyacky wrote: | There is one recommendation there I find a bit questionable, and | that's encryption. If you are out of options and restoring from | tape, it might be better to have it uncompressed and not encrypted. | It's possible, after some physical disaster, that you are on | somebody else's infrastructure, and having some encryption on your | data doubles the problems you might have. | | I use an ancient LTO2 drive for last-resort backups that are off | cloud and off premises. It's more peace of mind than practical on | a daily basis, but I did find myself restoring a few files a | couple of weeks ago as I had fat-fingered an rm command. It was | quicker than getting them from S3 Glacier. | El_RIDO wrote: | I'd like to suggest two arguments that made me use software | encryption on my tapes instead: 1.
You don't have to trust the | hardware and can use a tool I trust and have the sources for. 2. | If you encrypt yourself you can combine it with something like | par2 to generate error detection and recovery data, letting you | restore the encrypted file off a damaged tape. | | A downside of encrypting yourself is that you can't benefit | from the hardware compression either, hence the article's | suggestion to do that in software before encrypting as well. | | Personally, my tape-writing workflow is: dar (per-file | compression, skips uncompressable MIME types + encryption) | followed by par2cmdline with 30% redundancy. For comparison: | CD-ROMs have 33% redundancy information (8 bits per 24 bits, | CIRC encoding). | op00to wrote: | The tapes compress themselves. There's no real need for file | compression. | benjojo12 wrote: | I agree somewhat. Encryption is more critical on tape because | there is no easy path to wiping a tape, and in a company | situation if you need to erase something in your backups too | (think GDPR erasure), then encryption is reasonably critical | unless you want to go through all of your cold backups. | | For my archival use (the reason why I got into this in the | first place) I do not encrypt or compress the data going to | tape. For server/desktop backups, they are compressed and | encrypted. | rowanG077 wrote: | It's trivial to wipe data on a tape with a degausser. You | destroy the tape in the process since it also wipes out | factory-written servo tracks. | kortex wrote: | Is there a way to restore the servo tracks? This sounds | like the kind of hack a dedicated nerd could pull off with | an Arduino and duct tape. | ansible wrote: | Without looking into the specs, at the very least, you'd | need to modify the LTO drive firmware. The drive itself | isn't designed to operate without the servo tracks. Those | are written to newly-manufactured tapes with special | equipment at the factory.
| | So, it would take a very dedicated nerd indeed. | rowanG077 wrote: | Not that I know of. But the positioning on recent-gen | LTOs is pretty tight. I don't think it's out of the realm | of possibility for a dedicated nerd but it won't be | trivial. | throw0101a wrote: | > [...] _nor compress the data going to tape._ | | Just to note that tape drives have built-in compression that | generally is done transparently in the background. So while | using something like _zstd_ (per the article) may get more | bits on a given tape, there is some compression that one gets | "for free" without doing anything at all. | | * https://en.wikipedia.org/wiki/Linear_Tape- | Open#Optional_tech... | | * https://en.wikipedia.org/wiki/Magnetic_tape_data_storage#Da | t... | benjojo12 wrote: | I mention this in the post itself | lights0123 wrote: | You mention that they're advertised in the amount of | compressed data that can be stored, not that they | actually compress data themselves. I thought you meant | that they assume you use a compression algorithm | yourself. | benjojo12 wrote: | Ah, ok fair enough! I should have pointed that out more | clearly! | throw0101a wrote: | You wrote: | | > _Drives above LTO-4 have built-in hardware encryption, | however I would steer away from using it and instead just | encrypt data yourself (possibly with the tool I helped | make called age!). Like most things, you should also | consider compressing your data before encrypting and | writing it to tape. LTO tape capacities are often quoted | in their "compressed capacity" which is a little cheeky | since it assumes basically over a 50% compression ratio, | this is not at all likely to be true if you are writing | video or other lossy mediums like images etc to the tape. | I generally run my data through zstd to compress and then | age to encrypt.
Zstd and age are quite fast and I've not | found them to impede performance noticeably._ | | If someone is not familiar with tape drives, I think it | would be easy not to realize that the compression is | built into drives like the explicitly called-out "built-in | hardware encryption". | lostapathy wrote: | > Encryption is more critical on tape because there is no | easy path to wiping a tape. | | I used to work for a government agency. We ran backup tapes | that rotated out through a degaussing machine that spun them | around for like 10 minutes to wipe them. It's not common to | have, but it's definitely easy. | amelius wrote: | Would love to see an article of someone taking a drive apart, and | hooking an oscilloscope to the read head of a tape drive. | dmitrybrant wrote: | Funny, I just recently did a similar thing: found an LTO-4 tape | drive on eBay for $40, and a few used cartridges (2TB each) for | $20. | | But before writing my backup to the cartridges, I tried reading | their contents, and found that they actually came from a major | film studio, with backups of raw animated film content on them! | paulmd wrote: | one thing to emphasize is that the quoted LTO capacity numbers | usually include transparent device compression - if your | data is not compressible, such as ZIP/RAR files or compressed | audio/video, that's not the number you will get! | | Home users will really want to think in terms of the "raw | capacity" imo. This is normally half of the advertised capacity | for the older standards (I believe the newer ones have stronger | compression that squeezes a bit more). LTO-5 tapes are 1.5TB | raw, for example. | | Maybe you'll get a little bit out of it, but a lot of the | things you'd want to back up (and especially the bulkier stuff | that really eats space) are already compressed. Family photo | library, audio/video storage? JPGs are compressed, H264/H265 or | MP3/FLAC/etc are already compressed. System images?
A lot of | application files are already compressed. Home user scenarios | are not outlook mailboxes and database backups like the | "official" scenarios. | nybble41 wrote: | > Home users will really want to think in terms of the "raw | capacity" imo. | | _Everyone_ would be better off thinking in terms of the raw | capacity. "Compressed capacity" is nothing but a marketing | gimmick. Even in enterprise use cases the compression ratios | will vary, and the drive's transparent compression is | unlikely to offer the most savings. If your data is at all | compressible you should compress the backup yourself before | sending it to the drive. | dark-star wrote: | It actually works pretty well. Compression in the tape | drives is certainly worse than what you could achieve by | zipping before, but at least it works at line speed (which | is a couple hundred megabytes per second). Factor in the | fact that you often write out multiple streams in parallel | from a single server to multiple tapes, and it'll become | rather tricky to find a compression algorithm that keeps up | AND compresses better than the drive. | | And most enterprises don't really care if their monthly | backup requires 10 or 15 tapes. And zipping it all up | beforehand requires even more space on the primary storage | which is even more expensive than a couple dozen tapes | nybble41 wrote: | It's still misleading to market the tapes based on a | compression factor which will depend in practice on the | data being stored. The _tape 's_ capacity is one thing; | the effectiveness of the _drive 's_ hardware-accelerated | compression algorithm on any given dataset is something | else entirely. The two should not be mixed. | NavinF wrote: | I was looking into the same thing recently. The price is right | ($10/TB tape vs $13/TB HDD) and it'd be nice to have fewer HBAs | and SAS cables, but having to swap the tapes manually every 2TB | (every 6 hours?) kinda ruins it for me. 
An automatic tape | library would be ideal, but I couldn't find any in the 100TB | range that are cheaper than spinning rust. | numpad0 wrote: | I have a 2U sized LTO2 robot that might have collapsed from | stuff on top by now, but it seemed to have a standard 5" bay | drive inside with a passthrough adapter marshaling the drive | and the loader mechanism. I wonder if a more recent drive can | just be dropped into those libraries or if they need firmware | support. | ChuckNorris89 wrote: | Damn, that's cool. I wish the second hand market in my country | was abundant with cheap exotic hardware. Then again, maybe not, | because I'd probably fill my small apartment from hoarding | stuff like this. | | Still, did you try to recover any of the material and watch it? | dragontamer wrote: | I've come to the understanding that tape-drives are for people | who need to "build a custom-sized storage solution", especially | if you need capacity but not necessarily read/write speeds. | | A tape-drive is your read-heads. The tape is like a platter. The | tape-library / jukebox is just a robotic mechanism for switching | tapes into and/or out of the read-head. | | ---------- | | If you need a Petabyte of uncompressed storage, you can reach it | with a tape-library consisting of 84 LTO8 tapes (12TB each). If | 400MB/s read/write is enough, one tape drive is | sufficient. If you need faster access speeds, you buy a 2nd, 3rd, | or 4th tape drive. | | So let's say you need 2GB/s read/write speed and a petabyte of | storage. You simply get 5x LTO8 drives (400MB/s each), 84 LTO8 | tapes, and stick them into a tape library of some kind. | | You then buy a certain amount of SSDs + HDDs sufficient for | caching, so that you can read/write to this tape library at | sufficient speeds (especially since it could be many minutes | before a specific byte is accessed). | kragen wrote: | Hey uh | | is that a DECTape?
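The library-sizing arithmetic a few comments up can be sketched as a quick back-of-the-envelope calculation. This helper is purely illustrative; the 12TB-per-tape and 400MB/s-per-drive figures are taken from that comment and refer to native (uncompressed) numbers:

```python
import math

def size_tape_library(capacity_tb, throughput_mb_s,
                      tape_tb=12, drive_mb_s=400):
    """Back-of-the-envelope tape library sizing.

    Defaults follow the comment above: ~12TB native per LTO-8
    cartridge and ~400MB/s native per drive. Returns the number
    of cartridges and drives needed.
    """
    tapes = math.ceil(capacity_tb / tape_tb)
    drives = math.ceil(throughput_mb_s / drive_mb_s)
    return tapes, drives

# One petabyte (1000TB) at 2GB/s (2000MB/s):
tapes, drives = size_tape_library(1000, 2000)
print(tapes, drives)  # 84 tapes, 5 drives
```

Note that capacity and throughput scale independently: tapes buy you bytes, drives buy you bandwidth, and the library robot just moves cartridges between the two.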
| benjojo12 wrote: | The header image is the insides of an LTO-5 tape | watersb wrote: | I use a cloud storage provider to back up via Arq | https://arqbackup.com | | But I don't expect to restore more than a few gigabytes at a time | from that. | | It would take me a week or more to download a terabyte of data. I | have very little power over internet connection speed, and there | are very few alternatives here. I believe there are two different | vendors providing connectivity to our town, and you can pick | between four retail resellers. | | With those limitations, I have tested a full restore process | exactly once. That's not good enough. | | Data at rest on LTO or offline hard disk is something I can | control. Distributed offsite storage, too. Restore within 12 | hours, I can do that. | | The downside to tape or cold disk is more in the management of | hourly/daily/weekly backups: you have to provision a media | rotation schedule, whereas that's sort of built into an online | cloud storage service. | robohoe wrote: | I cut my sysadmin teeth doing tape work in the early 2000s. It was | quite fun but I don't miss changing tapes and ensuring that the | FC tape loader library properly labeled them. | PaulHoule wrote: | I notice that he talks a lot about dealing with malfunctioning | drives and malfunctioning tapes. | | That is my experience too. There was that time I got kicked out of | the computer lab as an undergraduate because I'd created a number | of newsgroups and they 'wrote' all my files... to what turned out | to be an empty SunTape. That time I tried to recover a | configuration file from an IBM tape robot and it took 14 hours. | When I was successful with tape I always did a lot of practicing | and testing. A sysadmin who taught me a lot (esp. how to get | things done in a place where 'social engineering' was required) | told me "you don't have a backup plan until you've | tested it" and many people learned that the hard way.
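On the "you don't have a backup plan until you've tested it" point: one way to make restore testing routine is to keep a checksum manifest from backup time, then re-hash everything after each test restore and compare. A minimal sketch (the function names and approach are illustrative, not from any tool discussed in the thread):

```python
import hashlib
from pathlib import Path

def manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(Path(root).rglob("*"))
        if p.is_file()
    }

def verify_restore(original_root, restored_root):
    """A restore only counts as tested when every file hashes
    identically and no file is missing or extra."""
    return manifest(original_root) == manifest(restored_root)
```

In practice you would store the original manifest alongside (or separately from) the tape, since the source data may no longer exist when you restore.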
| ansible wrote: | > _"you don't have a backup plan until you've tested it"_ | | Yep. Though that's what makes small-shop disk-to-disk backups | easy, depending on the backup software used. | | We use rsnapshot, which uses rsync and "cp -l" to make backups. | So restoring is as easy as using cd to go into the appropriate | directory and copying out the files. No special utilities | needed. Yes, we encrypt the backup drives using cryptfs / LUKS. ___________________________________________________________________ (page generated 2022-01-27 23:00 UTC)
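The rsnapshot scheme in the last comment works because "cp -l" makes each new snapshot a tree of hard links to the previous one, which rsync then breaks only for files that changed, so every snapshot browses like a full copy but costs only the delta. A rough Python illustration of that mechanism (simplified: rsnapshot itself drives cp and rsync, and this sketch detects change only by mtime and size):

```python
import os
import shutil
from pathlib import Path

def snapshot(source, prev, new):
    """Create snapshot `new` of `source`, hard-linking any file
    that appears unchanged since snapshot `prev`."""
    for src in Path(source).rglob("*"):
        rel = src.relative_to(source)
        dst = Path(new) / rel
        if src.is_dir():
            dst.mkdir(parents=True, exist_ok=True)
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        old = Path(prev) / rel
        if (old.is_file()
                and old.stat().st_mtime == src.stat().st_mtime
                and old.stat().st_size == src.stat().st_size):
            os.link(old, dst)       # unchanged: hard link, no extra space
        else:
            shutil.copy2(src, dst)  # changed or new: real copy
```

Because unchanged files share an inode across snapshots, deleting old snapshot directories reclaims only the space of blocks no newer snapshot still links to.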