[HN Gopher] ZFS fans, rejoice - RAIDz expansion will be a thing ...
___________________________________________________________________

ZFS fans, rejoice - RAIDz expansion will be a thing soon

Author : rodrigo975
Score  : 173 points
Date   : 2021-06-17 07:33 UTC (1 day ago)

(HTM) web link (arstechnica.com)
(TXT) w3m dump (arstechnica.com)

| milofeynman wrote:
| My resizing consists of buying 8 more hard drives that are 2x the previous 8 and moving data over every few years (:
| garmaine wrote:
| FYI you don't have to move data. You can just replace each disk one at a time, and after the last replacement you magically have a bigger zpool.
| curtis3389 wrote:
| Does anyone know if this also means a draid can be expanded?
| bearjaws wrote:
| Just upgraded my home NAS, had to swap all 8 drives, took 7 days... Not to mention it doubled the size of the array; I would have been much happier with an incremental increase.
| dsr_ wrote:
| With RAID10, one could swap out 2 drives to get a size increase.
|
| With two 4-disk vdevs, one could swap out 4 drives for a size increase.
|
| So I'm assuming you have a single 8-disk vdev, and no spare places to put disks.
| xanaxagoras wrote:
| I did that once, and the experience was a big part of why I use unraid now.
| louwrentius wrote:
| > Data newly written to the ten-disk RAIDz2 has a nominal storage efficiency of 80 percent--eight of every ten sectors are data--but the old expanded data is still written in six-wide stripes, so it still has the old 67 percent storage efficiency.
|
| This makes this feature quite 'meh'. The whole goal is capacity expansion, and you won't be able to use the new capacity unless you rewrite all existing data, as I understand it.
|
| This feature is mostly relevant for home enthusiasts, and I think it doesn't really bring the behavior this user group wants and needs.
|
| > Undergoing a live reshaping can be pretty painful, especially on nearly full arrays; it's entirely possible that such a task might require a week or more, with array performance limited to a quarter or less of normal the entire time.
|
| Not an issue for home users, as they often don't have large workloads, so this process is fast and convenient even if it were to take two days.
| uniqueuid wrote:
| The article is a great example of all the somewhat surprising peculiarities in ZFS. For example, the conversion will keep the stripe width and block size, meaning your throughput of existing data won't improve. So it's not quite a full re-balance.
|
| Other fun things are the flexible block sizes and their relation to the size you're writing and compression ... Chris Siebenmann has written quite a bit about it (https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSLogicalV...).
|
| One thing I'm particularly interested in is to see if this new patch offers a way to decrease fragmentation on existing and loaded pools (allocation changes if they are too full, and this patch will for the first time allow us to avoid building a completely new pool).
|
| [edit] The PR is here: https://github.com/openzfs/zfs/pull/12225
|
| I also recommend reading the discussions in the ZFS repository - they are quite interesting and reveal a lot of the reasoning behind the filesystem. Recommended even to people who don't write filesystems for a living.
| chungy wrote:
| > The article is a great example of all the somewhat surprising peculiarities in ZFS. For example, the conversion will keep the stripe width and block size, meaning your throughput of existing data won't improve. So it's not quite a full re-balance.
|
| This is generally in line with other ZFS operations. For example, changing compression policies will not rewrite existing data; only new data is affected.
|
| It simplifies some code paths and keeps performance good no matter what. You don't get a surprising reduction in performance.
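| For illustration, a minimal sketch of that behavior on the command line, assuming a hypothetical dataset named tank/data:
|
|     # only blocks written from now on get lz4-compressed;
|     # existing blocks keep their old on-disk form until rewritten
|     zfs set compression=lz4 tank/data
|     zfs get compressratio tank/data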
| [deleted]
| nwmcsween wrote:
| I'm starting to get concerned about the ZFS issue list; there are a ton of gotchas hiding in OpenZFS that will cause data loss:
|
| * Swap on ZVOL (data loss)
|
| * Hardlocking when removing ZIL (this has caused data loss for us)
| nimbius wrote:
| this might sound like a troll comment but it's coming from someone with almost zero experience with raid. What is the purpose of ZFS in 2021 if we have hardware RAID and Linux software RAID? BTRFS does RAID too. Why would people choose ZFS in 2021 when there are two competing ZFS implementations, Oracle's and the open source one? Are they interoperable?
| rektide wrote:
| No matter what happens, people will seemingly forever declare BTRFS is not as stable and not as safe. There's a status page that details what BTRFS thinks of itself[1], and I doubt any of the many people docking BTRFS have read or know or care what that page says. There is one issue still being worked out to completion, a "write hole" problem, involving two separate failures (an unplanned/power-loss shutdown followed by a second disk failure), which can result in some data being lost[2] in RAID5/6 scenarios.
|
| Other than that one extreme double-failure scenario being worked out, BTRFS has proven remarkably stable for a while now. A decade ago it wasn't quite as bulletproof, but today the situation is much different. Personally, it feels to me like there is a persistent & vocal small group of people who seemingly either have some agenda that makes them not wish to consider BTRFS, or are unwilling to review & reconsider how things might have changed in the last decade. Not to belabor the point, but it's quite frustrating, and it feels a bit odd that BTRFS is such a persistent target of slander & assault. Few other file systems seem to face anywhere near as much criticism, and never so casually; honestly, in the end, it just seems like there's some contingent of ZFS folks with some strange need to make themselves feel better by putting others down.
|
| One big sign of trust: Fedora 35 Cloud looks likely to switch to BTRFS as default[3], following Fedora 33 desktop making the move last year. A number of big names use BTRFS, including Facebook. I have yet to see any hyperscalers interested in ZFS.
|
| I'm excited to see ZFS start to get some competent expandability. Expanding ZFS used to be a nightmare. I'll continue running BTRFS for now, but I'm excited to see file systems flourish. Things I wouldn't do? Hardware RAID. Controllers are persnickety, weird devices, each with their own invisible sets of constraints & specific firmware issues. If at all possible, I'd prefer the kernel figure out how to make effective use of multiple disks. BTRFS, and now it seems ZFS too, do a magical job of making that easy, effective, & fast, in a safe way.
|
| Edit: the current widely-adopted write hole fix is to use RAID1 or RAID1c3 or RAID1c4 (3-copy RAID1, 4-copy RAID1) for metadata, and RAID5/6 for data.
|
| [1] https://btrfs.wiki.kernel.org/index.php/Status
|
| [2] https://btrfs.wiki.kernel.org/index.php/RAID56
|
| [3] https://www.phoronix.com/scan.php?page=news_item&px=Fedora-C...
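| For reference, a hedged sketch of the mkfs invocation for the layout described above, with hypothetical devices (the raid1c3/raid1c4 profiles require kernel 5.5 or newer):
|
|     # metadata kept in 3 copies, data striped with single parity
|     mkfs.btrfs -m raid1c3 -d raid5 /dev/sda /dev/sdb /dev/sdc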
| Datagenerator wrote:
| Netflix has been using ZFS in production for many years now. Unnamed research companies are using ZFS, moving PBs of data. NetApp is FreeBSD-based and was at the forefront of what we now call ZFS. I'm totally biased; I've designed many production-critical systems with ZFS at their core in one way or another. The power of ZFS's send and receive functions is tremendous, to say the least; it beats any file-based synchronizing method.
| webmobdev wrote:
| One guess I can make for the "hate" BTRFS gets is probably that everyone loves their data and doesn't expect to "fight" with a file system to get access to it.
|
| E.g. Sailfish OS is perhaps the only mobile OS I know that uses / used BTRFS in _production_ (and they adopted it nearly 6-7 years ago!). And some of its users have had issues with BTRFS in the earlier versions - https://together.jolla.com/questions/scope:all/sort:activity... ... in fact, I too remember that once or twice we had to manually run the btrfs balancer before doing an OS update. For Sailfish OS on the Tablet, Jolla even experimented with LVM and ext4, and perhaps even considered dropping BTRFS. (I don't know what it uses for newer versions of Sailfish OS now - I think it allows the user to choose between BTRFS or LVM / EXT4.)
|
| Most users consider a file system (be it ZFS or BTRFS) to be really low-level system software with which they only wish to interact transparently (even I got anxious when I had to run the btrfs balancer on Sailfish OS the first time, worrying what would happen if there was not enough free space to do the operation and hoping I wouldn't lose my data). Even on older systems, everybody was frustrated by the need to run a defragmenter.
|
| Perhaps because of improper expectations or configurations, some of the early adopters of BTRFS got burnt with it after possibly even losing their precious data. It's hard to forget that kind of experience, and thus perhaps the "continuing hate" you see for BTRFS - a PR issue that BTRFS's proponents need to fix.
|
| (It's interesting to see the progress BTRFS has made. Thanks to your post, I may consider it for future Linux installations over EXT4. Except for the hands-on tinkering it required once or twice, I remember it as being rock-solid on my Sailfish mobile.)
| chasil wrote:
| SUSE uses btrfs in production for the root filesystem, and they have done so for years.
| sz4kerto wrote:
| I don't want to be trolling either, but a simple Google search gives you really detailed answers. Or just look at Wikipedia: https://en.wikipedia.org/wiki/ZFS
|
| Some highlights: hierarchical checksumming, CoW snapshots, deduplication, more efficient rebuilds, extremely configurable, tiered storage, various caching strategies, etc.
| magicalhippo wrote:
| > What is the purpose of ZFS in 2021 if we have hardware RAID and Linux software RAID?
|
| Others have touched on the main points; I just wanted to stress that an important distinction between ZFS and hardware RAID or Linux software RAID (by which I assume you mean MD) is that the latter two present themselves as block devices. One has to put a file system on top to make use of them.
|
| In contrast, ZFS does away with this traditional split and provides a filesystem as well as support for virtual block devices. By unifying the full stack from the filesystem down to the actual devices, it can be smarter and more resilient.
|
| The first few minutes of this[1] presentation do a good job of explaining why ZFS was built this way and how it improves on the traditional RAID solutions.
|
| [1]: https://www.youtube.com/watch?v=MsY-BafQgj4
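| To make the layering point concrete, a minimal sketch with hypothetical devices; the traditional stack builds a block device first and then a separate, mutually blind filesystem on top, while ZFS does the whole thing in one step:
|
|     # md + ext4: RAID layer and filesystem know nothing about each other
|     mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc
|     mkfs.ext4 /dev/md0
|
|     # ZFS: pool, redundancy and a mounted filesystem in a single command
|     zpool create tank raidz /dev/sda /dev/sdb /dev/sdc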
| wyager wrote:
| ZFS RAID is the best RAID implementation in many respects. Hardware RAID is bad at actually fixing errors on disk (as opposed to just transparently correcting them) and at surfacing errors to the user.
|
| BTRFS is frequently not considered stable enough for production usage.
|
| ZFS has dozens of useful features besides RAID: transparent compression, instant atomic snapshots, incremental snapshot sync, instant cloning of file systems, etc.
|
| Yes, different ZFS implementations are mostly compatible in my experience, and they should become totally compatible as everyone moves to OpenZFS. FreeBSD 13 and Linux currently have ZFS feature parity, I believe.
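| For illustration, a minimal sketch of those operations, assuming a hypothetical dataset tank/home and a remote host named backup:
|
|     zfs snapshot tank/home@monday            # instant, atomic snapshot
|     zfs clone tank/home@monday tank/scratch  # writable clone; no data copied up front
|     # incremental sync: sends only the blocks changed between the two
|     # snapshots (assumes @monday was already received on the other side)
|     zfs send -i tank/home@monday tank/home@tuesday | ssh backup zfs recv pool/home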
| tehbeard wrote:
| I can't speak with much experience, but what I have gleaned is:
|
| - You generally want to avoid hardware RAID; if the card dies you'll likely need to source a compatible replacement, vs. grabbing another SATA/SAS expander and reconstructing the array.
|
| - ZFS handles the stack all the way from the drives to the filesystem, allowing them to work together (i.e. filesystem usage info can better dictate what gets moved around tiered storage, or enable smarter RAID recovery).
| LambdaComplex wrote:
| My understanding is that hardware RAID is mainly a thing in the Windows world, because apparently its software RAID implementation is garbage
| nickik wrote:
| > What is the purpose of ZFS in 2021 if we have hardware RAID
|
| Hardware RAID is actually older than ZFS-style software RAID. ZFS was specifically designed to fix the issues with hardware RAID.
|
| The problem with hardware RAID is that it has no idea what's going on on top of it, and even worse, it's mostly a bunch of closed-source firmware from a vendor. And they cost money.
|
| You can find lots of terrible stories about those.
|
| ZFS is open-source and battle-tested.
|
| > Linux software RAID
|
| Not sure what you are referring to.
|
| > BTRFS does RAID too.
|
| BTRFS basically copied many of the features of ZFS. BTRFS has a history of being far less stable. ZFS is far more battle-tested. They say it's stable now, but they have said that many times. It ate my data twice, so I haven't followed the project anymore. A file system, in my opinion, gets exactly 1 chance with me.
|
| They each have some features the other doesn't, but broadly speaking they are similar technology.
|
| The new bcachefs is also coming up and adding some interesting features.
|
| > Why would people choose ZFS in 2021 when there are two competing ZFS implementations, Oracle's and the open source one?
|
| Not sure what that has to do with anything. Oracle is an evil company; they tried to take all these great open source technologies away from people, and the community fought against it. Most of the ZFS team left after the merger.
|
| The open-source version is arguably better, and has far more of the original designers working on it. The two code bases have diverged a lot since then.
|
| At the end of the day ZFS is incredibly battle-tested and works incredibly well at what it does. And it has had an incredible reputation for stability basically since it came out. The question, in my opinion, is "why not ZFS" rather than "why ZFS".
| _tom_ wrote:
| > ZFS is far more battle-tested. They say it's stable now, but they have said that many times. It ate my data twice
|
| Did you mean "it ate my data" to apply to ZFS? Or did you mean BTRFS?
| znpy wrote:
| It was probably BTRFS.
|
| I never fell for the BTRFS meme, but many friends of mine did, and many of them ended up with a corrupted filesystem (and lost data).
| garmaine wrote:
| He was referring to mdadm RAID.
| Quekid5 wrote:
| > hardware RAID
|
| That's just the worst of all worlds: usually proprietary, _and_ you get the extreme aversion to improvement (or any change, really) of hardware vendors.
|
| This ZFS change is going to come, and it may end up being complex for users... but it's happening. At the risk of being hyperbolic: something like this would never be possible with a HW RAID system unless it had explicitly been designed for it from the start.
|
| Also: ZFS does much more than any hardware RAID ever did.
| boomboomsubban wrote:
| They are not interoperable, but they're barely competing, as Solaris is dead. Does Oracle Linux even offer Oracle ZFS? I assume they stick to btrfs, considering they are the original developers.
|
| RAID does not feature the data protection offered by a copy-on-write filesystem, and OpenZFS is the most stable and portable option.
| nix23 wrote:
| It's pretty easy. Having lots of experience with HW RAID and SW RAID, software is the way to go because:
|
| 1. Do you trust firmware? I don't. I can tell you stories about SANs freaking out... never had that with Solaris or FreeBSD and ZFS.
|
| 2. Why have an additional abstraction layer: HW RAID caching vs FS caching, no transparency for error correction, no smart RAID rebuild, etc.
|
| The list can go on and on, but HW RAID is a thing of the past (exceptions are specialized SANs etc.)
| usefulcat wrote:
| The last time I had to use HW RAID it was horrible. The software for managing the RAID array was a poorly documented, difficult-to-use proprietary blob. I used it for years and the experience never improved. And this is a thing where, if you make a mistake, you can destroy the very data that you've gone to such lengths to protect. Having switched to ZFS several years ago, I lack the words to express how much I don't miss having to deal with that.
| nickik wrote:
| I prefer just to have mirrors, but it's cool that it's slowly coming; some people seem to really want this feature.
|
| ZFS has been amazing to me, I have zero complaints.
|
| I just wish it hadn't taken so long to come to /root on Linux. Even today you have to do a lot of work unless you want to use the new support in Ubuntu.
|
| This license snafu is so terrible, open-source licenses excluding each other. Crazy. The world would have been a better place if Linux had incorporated ZFS long ago. (And no, we don't need yet another legal discussion; my point is just that it's sad.)
| 1980phipsi wrote:
| This will be very useful!
|
| TIL FreeNAS is now TrueNAS.
| znpy wrote:
| Actually there's more!
|
| A new version of TrueNAS is in the works; it's called TrueNAS SCALE and it's going to be Linux-based (no more FreeBSD).
|
| I'm frankly happy, because TrueNAS is great as a NAS operating system, but I really wanted to run containers where my storage is, and having to run a VM adds really unnecessary overhead (plus, it's another machine to manage).
| d33lio wrote:
| I'll believe it when I see it. Why anyone uses BTRFS (UnRaid or any other form of software raid that _isn't_ ZFS) is still beyond me. At least when we're not talking SSDs ;)
|
| ZFS is incredible; curious to mess around with these new features!
| mixedCase wrote:
| I just put two 8TB drives into btrfs because it's a home server; I can't provision things up front. One day I may put in a third 8TB drive and turn this RAID1 into RAID5. btrfs lets me do that, zfs doesn't, simple as.
|
| One day I may switch the whole thing to bcachefs, which I've donated to and am looking forward to. For the moment, btrfs will have to do.
|
| EDIT: downvoted by... the filesystem brigade?
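| A sketch of the conversion path described above, with hypothetical device and mount names; note the RAID56 stability caveat raised in the replies below:
|
|     btrfs device add /dev/sdc /mnt
|     # move data to raid5 while keeping metadata on raid1
|     btrfs balance start -dconvert=raid5 -mconvert=raid1 /mnt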
| edgyquant wrote:
| There is a large group of people who really dislike BTRFS. I think they were probably burned by it at some point, but I've never had trouble, and I've been using it since it became the default on Fedora.
| nix23 wrote:
| > RAID5
|
| I wish you lots of fun with that on btrfs :)
|
| Edit:
|
| https://btrfs.wiki.kernel.org/index.php/Status
|
| RAID56    Unstable    n/a    write hole still exists
|
| > treated as if I'm storing business data or precious memories without backups, guess I'm just dumb
|
| No you're not, but don't use unstable features in a filesystem
| mixedCase wrote:
| Well, that's the idea! This is a low-I/O media server where all the important stuff (<5G of photos) has 2+ redundancy, once remotely and on every workstation I sync, with the rest of the data being able to crash and burn without much repercussion.
|
| The whole point of me using RAID1 (and maybe later RAID5) is that if a disk goes bust, odds are I can still watch a movie from it until I can get another disk. What's more, if I ever fill the RAID1 and I don't feel like breaking the piggy bank for another disk, I can go JBOD as far as my use case is concerned.
|
| But hey, if the orange website tells me all servers are supposed to be treated as if I'm storing business data or precious memories without backups, guess I'm just dumb. On that note: donations welcome; each 8TB disk costs close to 500 USD here in Uruguay, so if anyone's first-world opinion can buy me a couple so I can use the Right Filesystem(tm), I'd appreciate it!
| jbverschoor wrote:
| Licensing. Similarly, otherwise it would've been included in macOS a long time ago (as the default FS, according to some..)
| tw04 wrote:
| The reason it didn't end up in macOS is that NetApp sued Sun for patent infringement. Apple wanted nothing to do with that lawsuit and quickly abandoned the project.
|
| As others have stated, DTrace has the exact same license and has been in macOS for years.
| jen20 wrote:
| The licensing has nothing to do with it on OS X - indeed DTrace (also under the CDDL) has been shipping in it for years.
| bsder wrote:
| I do believe that the license was fine for macOS, but when Oracle bought Sun, that killed it cold.
|
| Jobs _never_ liked anybody other than himself holding all the cards. Having Ellison and Oracle holding the keys to ZFS was just never going to fly.
| spullara wrote:
| I had ZFS on a Mac from Apple for a short amount of time during one of the betas :( I think Time Machine was going to be based on it, but they pulled out.
| codetrotter wrote:
| FYI there is a third-party effort to make OpenZFS usable on macOS.
|
| https://openzfsonosx.org/
|
| I used it for a while, but unfortunately, since there are not many people working on this and they are not working on it full time, it can take them a good while from when a new version of macOS is released until OpenZFS is usable with that version. This was certainly the case a while ago, and it is why I stopped using OpenZFS on macOS and went back to only using ZFS on FreeBSD and Linux instead of additionally using it on macOS. So with my Mac computers I only use APFS.
| qaq wrote:
| Jobs and Ellison were really close friends
| jamiek88 wrote:
| And also cold-hearted, clear-eyed businessmen unlikely to allow friendship to affect their corporations.
|
| I'd love to be a fly on the wall for some of those conversations.
| tw04 wrote:
| That makes absolutely no sense. Jobs and Ellison were best friends. Oracle acquiring Sun would have made it MORE attractive, not less.
|
| https://www.cnet.com/news/larry-ellison-talks-about-his-best...
| ghaff wrote:
| It's a combination of the license and the fact that it's Oracle, of all entities, that owns the copyright. Perhaps either one by itself wouldn't be a dealbreaker, but the combination is. And, of course, Oracle could have changed the license at any time after buying Sun.
|
| (Of course, Jobs may have just decided he didn't want to depend on someone else for the macOS filesystem in any case.)
|
| ADDED: And as others noted, there were also some storage patent-related issues with Sun. So just a lot of potential complications.
| ghaff wrote:
| And it's arguably an even bigger issue on Linux distros.
| mnd999 wrote:
| It's a moderate pain on Linux, and even then only really if you're running something bleeding-edge like Arch. Otherwise it's just a kernel module like any other.
| ghaff wrote:
| But it doesn't ship with either Red Hat or SUSE distros, which is an issue for supported commercial use.
| justaguy88 wrote:
| What's Oracle's play here? Do they somehow make money out of ZFS which makes them reluctant to re-license it?
| sneak wrote:
| Is there a CLA for OpenZFS/ZoL? I don't believe there is, so I don't think Oracle can unilaterally relicense it.
| Dylan16807 wrote:
| > Why anyone uses BTRFS (UnRaid or any other form of software raid that _isn't_ ZFS) is still beyond me.
|
| BTRFS can do after-the-fact deduplication (with much better performance than ZFS dedup) and copy-on-write files. And you can turn snapshots into editable file systems.
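| For illustration, hedged examples of those two capabilities (paths hypothetical; duperemove is one commonly used offline dedup tool, not the only option):
|
|     # reflink copy: instant, shares extents until either file is modified
|     cp --reflink=always big.img clone.img
|     # after-the-fact deduplication of files already on disk
|     duperemove -dr /data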
| eptcyka wrote:
| I've had 3 catastrophic BTRFS failures. In two cases, the root filesystem just ran out of space and there was no way to repair the partition. The last time, the partition was just rendered unmountable after a reboot. All data was lost. No such thing has ever happened with ZFS for me.
| Dylan16807 wrote:
| I've had some annoying failures too. But I wasn't listing pros and cons; I was explaining that there _are_ some very notable features that ZFS lacks.
| eptcyka wrote:
| That's fair. However, when listing notable features for the sake of comparing software, I think it's important to also list other characteristics of a given piece of software. If we were to compare software by feature sets alone, one might argue that Windows has the most features, so Windows must be the best OS.
| tux1968 wrote:
| A recent Fedora install here came with a new default of BTRFS rather than ext4. So I'm curious about your experience: were any of those catastrophic failures recent? Do you know of any patches entering the kernel that purport to fix the issues you experienced?
| benlivengood wrote:
| I think cloning a zfs snapshot into a writeable filesystem matches at least the functionality of btrfs writeable snapshots, but I could be ignorant about some use cases.
| Dylan16807 wrote:
| Let's say you want to clear out part of a snapshot of /home, but keep the rest.
|
| So you clone it and delete some files. All good so far, but the snapshot is still wasting space and needs to be deleted.
|
| But to make this happen, your clone has to stop being copy-on-write. All the data that exists in both /home and the clone will now be duplicated.
|
| And you could say "plan ahead more", but even if you split up your drive into many filesystems, now you have the problem that you can't move files between these different directories without making extra copies.
| auxym wrote:
| RAM?
|
| Every time I looked into setting up a freenas box, every hardware guide insisted that ungodly amounts of absolutely-has-to-be-ECC RAM were essential, and I just gave up at that point.
| colechristensen wrote:
| ZFS likes RAM and uses it to get better performance (and don't think about using dedup without huge RAM), but you don't need it and can change the defaults.
|
| ECC tends to attract zealots after a perfect error-free existence, which ECC tends toward but doesn't deliver; it just reduces errors. I personally don't care about a tiny amount of bit rot (zfs will prevent most of this) and rebooting my storage machine now and then.
|
| You can run ZFS/freenas on a crappy old machine and you'll be just fine, as long as you aren't hosting storage for dozens of people and you aren't a digital archivist trying to keep everything for centuries.
|
| Real advice:
|
| * Mirrored vdevs perform way better than raidz; I don't think the storage gain is worth it until you have dozens of drives
|
| * Dedup isn't worth it
|
| * Enable lz4 compression everywhere
|
| * Have a hot spare
|
| * You can increase performance by adding a vdev set and by adding RAM
|
| * Use drives with the same capacity
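| A minimal sketch of that advice in command form, with hypothetical disk names:
|
|     # two mirrored vdevs plus a hot spare
|     zpool create tank mirror sda sdb mirror sdc sdd spare sde
|     # lz4 everywhere
|     zfs set compression=lz4 tank
|     # grow later by adding another mirror vdev
|     zpool add tank mirror sdf sdg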
| InvaderFizz wrote:
| > Dedup isn't worth it
|
| To add to that, ZFS dedup is a lie, and you should forget its existence unless you have the very specific scenario of being a SAN with a massive amount of RAM, and even then, you had better be damn sure.
|
| I really wish ZFS had either an option to store the dedup table on an NVMe device like Optane, or an offline deduplication job.
| hpfr wrote:
| Does rebooting help with soft errors in non-ECC RAM? I would have thought bit flips would be transient in nature, but I'm not really familiar.
| AdrianB1 wrote:
| Running ZFS (FreeNAS/TrueNAS) on 2 home-made NAS devices for years and years, I can say it is rock solid, without ever using ECC RAM due to lack of choices. I can bet there were many soft errors in all these years, but so far I never had problems that could not be recovered. The biggest issue ever was destroying the boot USB storage in months, but that was partially solved later: I moved to fixed drives as the boot drive, and later I moved to virtualization for the boot disk and OS, so the problem completely went away.
| livueta wrote:
| > Enable lz4 compression everywhere
|
| Is the perf penalty low enough now that it just doesn't matter? I've always disabled compression on datasets I know are going to store only high-entropy data, like encoded video, that has a poor compression ratio.
|
| I second the hot spare recommendation many times over. It can save your bacon.
| simcop2387 wrote:
| It's generally the other way around, actually, aside from storing already highly compressed datasets (e.g. video). The compression from lz4 will get you better effective performance because of the lower amount of I/O that has to be done, both in throughput and latency, on zfs. This is because your CPU can usually do lz4 at hundreds of GB/s, compared to the dozen you might get from your spinning-rust disks.
| livueta wrote:
| Neat! Makes sense.
| n8ta wrote:
| The freenas hardware requirements themselves say "8 GB RAM (ECC recommended but not required)"
|
| https://www.freenas.org/hardware-requirements/
|
| I myself use freenas with 16GB of non-ECC RAM.
|
| Of course it is possible to have a bit flip in memory that is then dutifully stored incorrectly by ZFS to disk, but this was a possibility without ZFS as well.
|
| I've actually been waiting for this feature since I first set up my pool. It seemed theoretically possible; we were just waiting for an implementation.
| dsr_ wrote:
| Neither quantity nor ECC is essential.
|
| ZFS defaults to assuming it is the primary reason for your box to exist, but it only takes two lines to define more reasonable RAM usage: zfs_arc_min and zfs_arc_max. On a NAS-type server, I would think setting the max to half of your RAM is reasonable. Maybe 3/4 if you never do anything except storage.
|
| ECC is not recommended because ZFS has some kind of special vulnerability without it; ECC is recommended because ZFS has taken care of all the more likely chances of undetectable corruption, so that's the next step.
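| On Linux, a sketch of those two lines as OpenZFS module parameters; the values are in bytes, and the 2 GiB floor / 8 GiB cap here are arbitrary examples:
|
|     # /etc/modprobe.d/zfs.conf
|     options zfs zfs_arc_min=2147483648 zfs_arc_max=8589934592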
| fpoling wrote:
| It is not that simple regarding ECC. Since ZFS uses more memory, the probability of hitting a memory bug is simply higher with it.
| amarshall wrote:
| But it doesn't really use more memory. The ARC gives the impression of high memory usage because it's different from the OS page cache, usually called out explicitly, and not ignored in many monitoring tools like the OS cache is. Linux--without ZFS--will happily consume nearly all RAM with _any_ filesystem if enough data is read and written.
| dsr_ wrote:
| This is correct. Any filesystem using the kernel's filesystem cache will do this, too.
|
| For a long-running, non-idle system, a good rule of thumb is that all RAM not being actively used is being used by evictable caching.
| zerd wrote:
| A colleague who was used to other UNIXes was transitioning to Linux for a database. He saw in free that "used" was at more than 90%, so he added more RAM. But to his surprise it was still using 90%! He kept adding RAM. I told him that he had to subtract the buffer and cached values (this was before free had the Available column).
| mark-wagner wrote:
| Before the Available column there was the -/+ buffers/cache line that provided the same information. Maybe it was too confusing.
|
|                  total       used       free     shared    buffers     cached
|     Mem:      12286456   11715372     571084          0      81912    6545228
|     -/+ buffers/cache:    5088232    7198224
|     Swap:     24571408      54528   24516880
| ahofmann wrote:
| Others have said good things (ECC is good by itself, it has not much to do with ZFS), and it is actually quite easy to check if you need much RAM for ZFS. Start a (Linux) VM with a few hundred megabytes of RAM and run ZFS on it. Of course, it will not be as performant as having a lot of RAM. But it will not crash, or hang, or be unusable in one way or another.
|
| Sources:
|
| - https://www.reddit.com/r/DataHoarder/comments/3s7vrd/so_you_...
|
| - https://www.reddit.com/r/homelab/comments/8s6r2r/what_exactl...
|
| - My own tests with around 8 TB of ZFS data in a Linux VM with 256 MB RAM.
| IgorPartola wrote:
| Heh, so you have that backwards. All RAM should be ECC if you care about what's stored in it. It's not a ZFS requirement; it's just that ZFS specifically cares about data integrity, so it advises you to use ECC RAM. But it's not like any other file system is immune from random RAM corruption: it's not, it just won't tell you about it.
| handrous wrote:
| The "you need at least 32GB of memory and it _has to be_ ECC, or don't even bother trying to use ZFS" crowd has done some serious harm to ZFS adoption. Sure, that's what you need if you want _excellent_ data integrity guarantees and to use _all_ of ZFS' advanced features. If you're fine with merely way-better-than-most-other-filesystems data integrity guarantees and using only _most_ of ZFS' advanced features, you don't need those.
| tombert wrote:
| I really don't know where the "You gotta have ECC RAM!" thing started. I've been running a ZFS RAID on Nvidia Jetson Nanos for years now and haven't had any issues at all with data integrity.
|
| I don't see why ZFS would be more prone to data integrity issues spawning from a lack of ECC than any other filesystem.
| kurlberg wrote:
| Years ago I saw it at:
|
| https://www.truenas.com/community/threads/ecc-vs-non-ecc-ram...
|
| (the gist of the scary story is that faulty RAM while scrubbing might kill "everything".) However, in the end ECC appears to NOT be so important; e.g., see
|
| https://news.ycombinator.com/item?id=23687895
| radiowave wrote:
| Relevant quote from one of ZFS's primary designers, Matt Ahrens: "There's nothing special about ZFS that requires/encourages the use of ECC RAM more so than any other filesystem. ... I would simply say: if you love your data, use ECC RAM. Additionally, use a filesystem that checksums your data, such as ZFS."
| tombert wrote:
| Yeah, I remember reading that a few years ago.
|
| If I were running a server farm or something, then yeah, I'd probably use ECC memory, but I think if you're running a home server, then the argument that ZFS necessitates ECC more than Ext4 or Btrfs or XFS or whatever doesn't really seem to be accurate.
| oarsinsync wrote:
| > the argument that ZFS necessitates ECC more than Ext4 or Btrfs or XFS or whatever doesn't really seem to be accurate
|
| Agreed.
|
| > If I were running a server farm or something, then yeah, I'd probably use ECC memory, but I think if you're running a home server
|
| Then you should still use ECC RAM, regardless of what filesystem you're using.
|
| No, really. ECC matters (https://news.ycombinator.com/item?id=25622322) generally.
| tombert wrote:
| Fair enough, though AFAIK none of the SBC systems out there have ECC, and I generally use SBCs due to the low power consumption.
| simcop2387 wrote:
| You really only end up needing that if, and only if, you're also going to do live deduplication of large amounts of data. Very few people actually need that; just using compression with lz4 or zstd, depending on your needs, will suffice for just about everyone and perform better.
|
| The ECC argument is probably about a 50/50 kind of thing. You can get away without it, and ZFS will do its best to detect and prevent issues, but if the data was flipped before it was given to ZFS then there's nothing anyone can do. You might get some false positives when reading data back if you have some flaky RAM, but as long as you have parity or redundancy on the disks, things should still get read correctly even if a false problem is detected. That might mean you want to run a scrub (essentially ZFS's version of fsck) more often to look for potential issues, but it shouldn't fundamentally be a big deal. If you want 24/7 highly available storage that won't blip out occasionally, you'll probably really want the ECC RAM; but if you're fine with having to reboot occasionally, or tell it to repair problems that it thinks were there (but weren't, because the disk was fine but the RAM wasn't), then you should be fine. The extra checksums and data that ZFS can use for all this can make it really robust even on bad hardware. I had a BIOS update cause some massive PCIe bus issues that I didn't realize were going on for a bit, and ZFS kept all my data in good condition even though writes were sometimes just never happening because of ASPM causing issues with my controller card.
| UI_at_80x24 wrote:
| As always, it depends on your use case.
|
| I have several file servers that all use ZFS exclusively, and 10x that number of servers using ZFS as the system FS.
|
| Rule of thumb that I like: 1GB RAM / TB of storage. This seems to give me the best bang for the buck.
|
| For a small (under 20) number of office users, doing general 'office' stuff, using Samba, it's overkill.
|
| For large media shares with heavy editor access, and heavy strain on the network, it's a minimum.
|
| Depends on what the server is serving.
|
| Dedup is a different story. The RAM is used to store the frequently accessed data. If you are using dedup, you fill the motherboard with as much RAM as will fit. NO EXCEPTIONS! This may have been the line of thinking that scared you away from it.
|
| I have a 100TB server that is just used for writing data to and is never read from (sequential file backups before they're moved to "long term storage"). It has 8GB of RAM and is barely touched.
|
| I also have a 20TB server with 2TB of RAM that keeps the RAM maxed out with dedup usage.
|
| ECC: It's insurance, and it's worth it.
| the8472 wrote:
| btrfs does have some advantages over zfs:
|
| - no data duplicated between page cache and ARC
| - no upgrade problems on rolling distros
| - balance allows restructuring the array
| - offline dedup, no need for huge dedup tables
| - ability to turn off checksumming for specific files
| - O_DIRECT support
| - reflink copy
| - fiemap
| - easy to resize
| chasil wrote:
| - defragmentation
| nix23 wrote:
| But the main job of an FS is to preserve your files... btrfs can't even check that most important box.
| edgyquant wrote:
| I see this a lot, but I have never had problems with BTRFS, and I've used it both on my larger disks (2+ TB) and my root (250GB SSD) across multiple computers for the last four years.
| the8472 wrote:
| The checksumming helps to spot faulty hardware; that's a step above most other filesystems, and often above SMART info too.
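| Both filesystems surface that checking through an explicit scrub; a minimal sketch, assuming a hypothetical pool named tank and a btrfs mount at /mnt:
|
|     zpool scrub tank && zpool status -v tank
|     btrfs scrub start /mnt && btrfs scrub status /mnt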
| akvadrako wrote:
| Checksums don't help against bugs. You are much less likely to lose your whole disk with ext4 or ZFS than with BTRFS.
| baaym wrote:
| And even included in the kernel
| donmcronald wrote:
| BTRFS was useful for me. When those (RAID5) parity patches got rejected many, many years ago for non-technical reasons, like not matching a business case/goal or similar, it changed my view of open source.
|
| That was the day I realized that some open source participants and supporters are interested in having open source projects that are good enough to act as a barrier to entry, but not good enough to compete with their commercial offerings.
|
| Judge the world from that perspective for a while, and it can help to explain why so much open source feels 80% done and never gets the last 20% of the polish needed to make it great.
| imiric wrote:
| Simplicity. There's a lot of complexity in ZFS I'd rather not depend on, and because it does so many things, it's a big investment and liability to switch to.
|
| While I understand why it would be useful in a corporate setting, for personal use I've found the combination of LUKS+LVM+SnapRAID to work well and don't see the benefit of switching to ZFS. Two of those are core Linux features, and SnapRAID has been rock solid, though thankfully I haven't had to test its recovery process; it seems straightforward from the documentation. Sure, I don't have the real-time error correction of ZFS and other fancy features, but most of those aren't requirements for a personal NAS.
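| For reference, a minimal sketch of the SnapRAID side of such a setup, with hypothetical paths:
|
|     # /etc/snapraid.conf
|     parity  /mnt/parity1/snapraid.parity
|     content /var/snapraid/snapraid.content
|     data d1 /mnt/disk1
|     data d2 /mnt/disk2
|
|     # then, periodically:
|     snapraid sync    # update parity after files change
|     snapraid scrub   # verify data against stored checksums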
| [deleted]
| nix23 wrote:
| > LUKS+LVM+SnapRAID
|
| + your fs
|
| Yeah, that sounds like a lot less complexity
| imiric wrote:
| ZFS has all of these features and more. If I don't need those extra features, by definition it's a less complex system.
|
| Using composable tools is also better from a maintenance standpoint. If tomorrow SnapRAID stops working, I can replace just that component with something else without affecting the rest of the system.
| TimWolla wrote:
| > If tomorrow SnapRAID stops working, I can replace just that component with something else without affecting the rest of the system.
|
| Can you actually? If some layer of that storage stack stops working, then you can no longer access your existing data, because all these layers need to work correctly to correctly reassemble the data read from disk.
| imiric wrote:
| It's a hypothetical scenario :) In reality, if there's a project shutdown there would be enough time to migrate to a different setup. Of course it would be annoying to do, but at least it's possible. With a system like ZFS I'm risking having to change the filesystem, volume manager, storage array, encryption, and whatever other feature I depended on. It's a lot to buy into.
| nix23 wrote:
| Since all those tools are from different devs, the system gets more complex. But hey, if you really think that ZFS is too complex to hold 55 petabytes because it has too many potential bugs, you should tell them:
|
| https://computing.llnl.gov/projects/zfs-lustre
| imiric wrote:
| Thankfully I don't have to manage 55 petabytes of data, but good luck to them.
|
| Did you miss the part where I mentioned "for personal use"?
|
| > Since all those tools are from different devs, the system gets more complex.
|
| I fail to see the connection there. Whether software is developed by a single entity or multiple developers has no relation to how complex the end-user system will be.
|
| But many small tools focused on just the functionality I need allow me to build a simpler system overall.
| funcDropShadow wrote:
| > Whether software is developed by a single entity or multiple developers has no relation to how complex the end-user system will be.
|
| The first part of this sentence is probably true, as far as I see, but the complexity of a system perceived by the user depends primarily on the "surface" of the system. That surface includes the UI, the documentation, and the important concepts you have to understand for effective usage of the system. And in that regard, ZFS wins hands down against LUKS + LVM + SnapRAID + your FS of choice. Some questions a user of that LVM stack has to answer aren't even asked of a ZFS user, e.g. how to split the space between volumes, or how to change the size of volumes.
| j1elo wrote:
| What about if you were just starting today, with 0 knowledge about basically anything related to storage and how to do it right?
|
| That's my case. I'm learning before setting up a cheap home lab and a NAS, and I'm wondering if biting into ZFS is just the best option I have, given today's ecosystem.
| imiric wrote:
| I would still go with a collection of composable tools rather than something monolithic like ZFS, and avoid the learning curve. But again, for personal use. If you're planning to use ZFS in a professional setting, it might be good to experiment with it at home.
| j1elo wrote:
| As mentioned in the sibling comment, one thing I like is having systems that don't require me to supervise them, fix things, etc. In part that's why I've always been a user of ext4; it just works.
|
| But I've recently found bitrot in some of my data files, and now that I happened to be learning about how to build a NAS, I wanted to make the jump to some FS that helps me with that task.
|
| Could you mention which tools you would use to replace ZFS? Think of checksumming, snapshotting, and to a lesser degree, replication/RAID.
| throw0101a wrote:
| > _That's my case. I'm learning before setting up a cheap home lab and a NAS, and I'm wondering if biting into ZFS is just the best option I have, given today's ecosystem._
|
| ZFS is the simplest stack that you can learn, IMHO. But if you want to learn all the moving parts of an operating system for (e.g.) professional development, then more complex may be more useful.
|
| If you want to create a mirrored pair of disks in ZFS, you do: _sudo zpool create mydata mirror /dev/sda /dev/sdb_
|
| In the old-school fashion, you first partition with _gdisk_, then you use _mdadm_ to create the mirroring, then (optionally) LVM to create volume management, then _mkfs_.
| cogman10 wrote:
| I dove into ZFS for my home lab as a relative novice.
|
| It's not terrible, but there are a few new concepts to come to grips with. Once you have them down, it's not terrible.
|
| If you don't plan on raiding, IMO, ZFS is overkill. The checksumming is nice, but you can get that from other filesystems.
|
| Maintenance is fairly straightforward. I've even done a disk swap without too much fuss.
|
| The biggest issue I had was that setting up RAIDz on root with Ubuntu was a PITA (at the time, at least: March of this year). I ended up switching over to Debian instead. Once set up, things have been pretty smooth.
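| A sketch of the swap-one-disk-at-a-time growth path mentioned upthread, with hypothetical pool and device names; with autoexpand on, the extra capacity appears once the last, larger disk has resilvered:
|
|     zpool set autoexpand=on tank
|     zpool replace tank sda sdf   # repeat for each disk, waiting for resilver in between
|     zpool status tank            # watch resilver progress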
| j1elo wrote:
| A few things I like from it, as per what I've read so far:
|
| * Checksumming
|
| * As you mention, easy maintenance
|
| * Snapshots and how useful they are for backups
|
| In the end, what I value is stuff that works reliably, doesn't get in the way, and requires minimal supervision. And in the particular case of the FS, I'd like to adopt a system that helps avoid bitrot in my data.
|
| Could you drop some names that you would consider good alternatives to ZFS?
| gregmac wrote:
| For my big media volume, which has existed for around 10 years, I use snapraid.
|
| Because of several things:
|
| * I can mix disk sizes
|
| * I can add new disks over time as needed
|
| * If something dies, up to the entire server, I can just stick any data disk in another system and read it
|
| I didn't want to become a zfs expert (and the learning curve seems steep!), and I didn't want to spend thousands of dollars on new gear (a dedicated NAS box and a bunch of matched-size disks).
|
| I repurposed my old workstation into a server, spent a few hours getting it set up, and it works. I've had two disks fail (one data, one parity) and recovered from both. Every time I've added a new disk, it's been 50-100% larger than my existing disks.
|
| I've also migrated the entire setup to a new system (a newer old retired workstation) running proxmox, and was pleasantly surprised it only took about an hour to get that volume back up (incidentally, that server runs zfs as well... I just don't use it for my large media storage volume).
| joshstrange wrote:
| UnRaid and Synology user here, and I completely agree with all your points. The knowledge that at worst I will lose the data on just 1 disk (or 2, if one fails during a rebuild) is very calming. If not for UnRaid there is no way I could manage the size of the media volume I maintain (from a time, energy, and money perspective). I mean, if you know ZFS well and trust yourself, then more power to you, but UnRaid and friends fill a real gap.
| atmosx wrote:
| The learning curve of ZFS compared to every alternative out there is significantly lower, IMO. The interface is easier and the guides online are great.
|
| There are drawbacks, like the one discussed here, but as a Linux user who doesn't want to mess with the FS and uses ZFS for the backup server, the experience has been great so far.
___________________________________________________________________
(page generated 2021-06-18 23:00 UTC)