[HN Gopher] ZFS fans, rejoice - RAIDz expansion will be a thing ...
       ___________________________________________________________________
        
       ZFS fans, rejoice - RAIDz expansion will be a thing soon
        
       Author : rodrigo975
       Score  : 173 points
       Date   : 2021-06-17 07:33 UTC (1 days ago)
        
 (HTM) web link (arstechnica.com)
 (TXT) w3m dump (arstechnica.com)
        
       | milofeynman wrote:
       | My resizing consists of buying 8 more hard drives that are 2x the
       | previous 8 and moving data over every few years (:
        
         | garmaine wrote:
         | FYI you don't have to move data. You can just replace each disk
         | one at a time, and after the last replacement you magically
          | have a bigger zpool.
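          | 
          | Roughly, a sketch (assuming a pool named "tank" and placeholder
          | device names; wait for each resilver to finish before swapping
          | the next disk):
          | 
          |   zpool set autoexpand=on tank
          |   zpool replace tank sda sdx   # repeat for each disk in turn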
        
       | curtis3389 wrote:
       | Does anyone know if this also means a draid can be expanded?
        
       | bearjaws wrote:
       | Just upgraded my home NAS, had to swap all 8 drives, took 7
       | days... Not to mention it doubled the size of the array, I would
       | have been much happier with an incremental increase.
        
         | dsr_ wrote:
         | With RAID10, one could swap out 2 drives to get a size
         | increase.
         | 
         | With 2 4 disk vdevs, one could swap out 4 drives for a size
         | increase.
         | 
         | So I'm assuming you have a single 8 disk vdev, and no spare
         | places to put disks.
        
         | xanaxagoras wrote:
         | I did that once, and the experience was a big part of why I use
         | unraid now.
        
       | louwrentius wrote:
       | > Data newly written to the ten-disk RAIDz2 has a nominal storage
       | efficiency of 80 percent--eight of every ten sectors are data--
       | but the old expanded data is still written in six-wide stripes,
       | so it still has the old 67 percent storage efficiency.
       | 
       | This makes this feature quite 'meh'. The whole goal is capacity
       | expansion and you won't be able to use the new capacity unless
       | you rewrite all existing data, as I understand it.
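        | 
        | (Back-of-the-envelope, assuming the six-wide RAIDz2 from the
        | article, i.e. 4 data + 2 parity sectors per stripe: old stripes
        | are 4/6 ~ 67% usable, while new ten-wide stripes are 8/10 = 80%
        | usable.)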
       | 
        | This feature is mostly relevant for home enthusiasts, and I don't
        | think it really delivers the behavior this user group wants and
        | needs.
       | 
       | > Undergoing a live reshaping can be pretty painful, especially
       | on nearly full arrays; it's entirely possible that such a task
       | might require a week or more, with array performance limited to a
       | quarter or less of normal the entire time.
       | 
        | Not an issue for home users, as they often don't have heavy
        | workloads, so the process stays tolerable and convenient even if
        | it were to take two days.
        
       | uniqueuid wrote:
       | The article is a great example of all the somewhat surprising
       | peculiarities in ZFS. For example, the conversion will keep the
       | stripe width and block size, meaning your throughput of existing
       | data won't improve. So it's not quite a full re-balance.
       | 
       | Other fun things are the flexible block sizes and their relation
       | to the size you're writing and compression ... Chris Siebenmann
       | has written quite a bit about it (https://utcc.utoronto.ca/~cks/s
       | pace/blog/solaris/ZFSLogicalV...).
       | 
       | One thing I'm particularly interested in is to see if this new
       | patch offers a way to decrease fragmentation on existing and
       | loaded pools (allocation changes if they are too full, and this
       | patch will for the first time allow us to avoid building a
       | completely new pool).
       | 
       | [edit] The PR is here: https://github.com/openzfs/zfs/pull/12225
       | 
       | I also recommend reading the discussions in the ZFS repository -
       | they are quite interesting and reveal a lot of the reasoning
       | behind the filesystem. Recommended even to people who don't write
        | filesystems for a living.
        
         | chungy wrote:
         | > The article is a great example of all the somewhat surprising
         | peculiarities in ZFS. For example, the conversion will keep the
         | stripe width and block size, meaning your throughput of
         | existing data won't improve. So it's not quite a full re-
         | balance.
         | 
         | This is generally in-line with other ZFS operations. For
         | example, changing compression policies will not rewrite
         | existing data and only new data is affected.
         | 
         | It simplifies some code paths and keeps performance good no
          | matter what. You don't get a surprising reduction in
         | performance.
        
           | [deleted]
        
       | nwmcsween wrote:
       | I'm starting to get concerned about the ZFS issue list, there are
       | a ton of gotchas hiding in using OpenZFS that will cause data
       | loss:
       | 
       | * Swap on ZVOL (data loss)
       | 
        | * Hardlocking when removing the ZIL (this has caused data loss
        | for us)
        
       | nimbius wrote:
        | this might sound like a troll comment but it's coming from
        | someone with almost zero experience with RAID. What is the
        | purpose of ZFS in 2021 if we have hardware RAID and Linux
        | software RAID? BTRFS does RAID too. Why would people choose ZFS
        | in 2021 if both Oracle and open source users have two competing
        | ZFS implementations? Are they interoperable?
        
         | rektide wrote:
         | No matter what happens, people will seemingly forever declare
         | BTRFS is not as stable and not as safe. There's a status page
         | that details what BTRFS thinks of itself[1], and I doubt any of
         | the many people docking BTRFS have read or know or care what
         | that page says. There is one issue still being worked out to
         | completion, a "write hole" problem, involving two separate
         | failures, an unplanned/power-loss shut-down, followed by a
         | second disk failure, which can result in some data being
         | lost[2] in RAID5/6 scenarios.
         | 
         | Other than that one extreme double-failure scenario being
         | worked out, BTRFS has proven remarkably stable for a while now.
         | A decade ago that wasn't quite as absolutely bulletproof, but
         | today the situation is much different. Personally, it feels to
         | me like there is a persistent & vocal small group of people who
         | seemingly either have some agenda that makes them not wish to
         | consider BTRFS, or they are unwilling to review & reconsider
         | how things might have changed in the last decade. Not to
         | belabor the point but it's quite frustrating, and it feels a
         | bit odd that BTRFS is such a persistent target of slander &
         | assault. Few other file systems seem to face anywhere near as
         | much criticism, never so out of hand/casually, and honestly, in
          | the end, it just seems like there's some contingent of ZFS folks
         | with some strange need to make themselves feel better by
         | putting others down.
         | 
         | One big sign of trust: Fedora 35 Cloud looks likely to switch
          | to BTRFS as default[3], following Fedora 33 desktop last year
         | making the move. A number of big names use BTRFS, including
         | Facebook. I have yet to see any hyperscalers interested in ZFS.
         | 
         | I'm excited to see ZFS start to get some competent
         | expandability. Expanding ZFS used to be a nightmare. I'll
         | continue running BTRFS for now, but I'm excited to see file
         | systems flourish. Things I wouldn't do? Hardware RAID.
         | Controllers are persnickety weird devices, each with their own
         | invisible sets of constraints & specific firmware issues. If at
         | all possible, I'd prefer the kernel figure out how to make
         | effective use out of multiple disks. BTRFS, and now it seems
         | ZFS perhaps too, do a magical job of making that easy,
         | effective, & fast, in a safe way.
         | 
         | Edit: the current widely-adopted write hole fix is to use RAID1
         | or RAID1c3 or RAID1c4 (3 copy RAID1, 4 copy RAID1) for meta-
         | data, RAID5/6 for data.
         | 
         | [1] https://btrfs.wiki.kernel.org/index.php/Status
         | 
         | [2] https://btrfs.wiki.kernel.org/index.php/RAID56
         | 
         | [3]
         | https://www.phoronix.com/scan.php?page=news_item&px=Fedora-C...
        
           | Datagenerator wrote:
           | Netflix has been using ZFS in production for many years now.
            | Unnamed research companies are using ZFS to move PBs of data.
            | NetApp is FreeBSD based and was at the forefront of what we
            | now call ZFS. I'm totally biased; I've designed many
            | production-critical systems with ZFS at their core in one way
            | or another. The power of the ZFS send and receive functions
            | is tremendous, to say the least; it beats any file-based
            | synchronizing method.
        
           | webmobdev wrote:
            | One guess I can make for the "hate" BTRFS gets: it's probably
            | because everyone loves their data and doesn't expect to have
            | to "fight" with a file system to get access to it.
           | 
           | E.g. Sailfish OS is perhaps the only mobile OS I know that
           | uses / used BTRFS in _production_ (and they adopted it nearly
           | 6-7 years ago!). And some of its users have had issues with
           | BTRFS in the earlier versions - https://together.jolla.com/qu
           | estions/scope:all/sort:activity... ... in fact, I too
           | remember that once or twice, we had to manually run the btrfs
           | balancer before doing an OS update. For Sailfish OS on Tablet
           | Jolla even experimented with LVM and ext4, and perhaps even
           | considered dropping BTRFS. (I don't know what it uses for
           | newer versions of Sailfish OS now - I think it allows the
           | user to choose between BTRFS or LVM / EXT4).
           | 
           | Most users consider a file system (be it ZFS or BTRFS) to be
           | a really low-level system software with which they only wish
           | to interact transparently (even I got anxious when I had to
           | run btrfs balancer on Sailfish OS the first time worrying
           | what would happen if there was not enough free space to do
           | the operation and hoping I wouldn't lose my data). Even on
            | older systems, everybody was frustrated by the need to run a
           | defragmenter.
           | 
           | Perhaps because of improper expectations or configurations,
           | some of the early adopters of BTRFS got burnt with it after
           | possibly even losing their precious data. It's hard to forget
           | that kind of experience and thus perhaps the "continuing
           | hate" you see for BTRFS - a PR issue that BTRFS' proponents
            | need to fix.
           | 
           | (It's interesting to see the progress BTRFS has made. Thanks
           | to your post, I may consider it for future Linux
           | installations over EXT4. Except for the hands-on tinkering it
           | required once or twice, I remember it as being rock-solid on
           | my Sailfish mobile.)
        
             | chasil wrote:
             | Suse uses btrfs in production for the root filesystem, and
             | they have done so for years.
        
         | sz4kerto wrote:
         | I don't want to be trolling either, but a simple Google search
         | gives you really detailed answers. Or just look at Wikipedia:
         | https://en.wikipedia.org/wiki/ZFS
         | 
         | Some highlights: hierarchical checksumming, CoW snapshots,
         | deduplication, more efficient rebuilds, extremely configurable,
         | tiered storage, various caching strategies, etc.
        
         | magicalhippo wrote:
         | > What is the purpose of ZFS in 2021 if we have hardware RAID
         | and linux software RAID?
         | 
         | Others have touched on the main points, I just wanted to stress
         | that an important distinction between ZFS and hardware RAID and
         | linux software RAID (by which I assume you mean MD) is that the
         | latter two present themselves as block devices. One has to put
         | a file system on top to make use of them.
         | 
         | In contrast, ZFS does away with this traditional split, and
         | provides a filesystem as well as support for a virtual block
         | device. By unifying the full stack from the filesystem down to
         | the actual devices, it can be smarter and more resilient.
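          | 
          | As a rough sketch of what that unification looks like in
          | practice (pool and dataset names are made up):
          | 
          |   zfs create tank/photos           # a filesystem dataset
          |   zfs create -V 50G tank/vm-disk   # a zvol, i.e. a virtual
          |                                    # block device for a VM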
         | 
         | The first few minutes of this[1] presentation does a good job
         | of explaining why ZFS was built this way and how it improves on
         | the traditional RAID solutions.
         | 
         | [1]: https://www.youtube.com/watch?v=MsY-BafQgj4
        
         | wyager wrote:
         | ZFS RAID is the best RAID implementation in many respects.
         | Hardware RAID is bad at actually fixing errors on disk (as
         | opposed to just transparently correcting) and surfacing errors
         | to the user.
         | 
         | BTRFS is frequently not considered stable enough for production
         | usage.
         | 
         | ZFS has dozens of useful features besides RAID. Transparent
         | compression, instant atomic snapshots, incremental snapshot
         | sync, instant cloning of file systems, etc etc.
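          | 
          | A few of those, sketched with made-up dataset names:
          | 
          |   zfs set compression=lz4 tank/data     # transparent compression
          |   zfs snapshot tank/data@tuesday        # instant atomic snapshot
          |   zfs clone tank/data@tuesday tank/try  # instant writable clone
          |   zfs send -i @monday tank/data@tuesday | \
          |     ssh backup zfs receive pool/data    # incremental sync of the
          |                                         # delta since @monday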
         | 
         | Yes, different ZFS implementations are mostly compatible in my
         | experience, and they should become totally compatible as
         | everyone moves to OpenZFS. FreeBSD 13 and Linux currently have
         | ZFS feature parity I believe.
        
         | tehbeard wrote:
          | I can't speak with much experience, but what I have gleaned is:
         | 
         | - You generally want to avoid hardware raid, if the card dies
         | you'll likely need to source a compatible replacement vs.
          | grabbing another SATA/SAS expander and reconstructing the
         | array.
         | 
          | - zfs handles the stack all the way from drives to filesystem,
          | allowing the layers to work together (i.e. filesystem usage
          | info can better dictate what gets moved around tiered storage,
          | or enable better raid recovery).
        
           | LambdaComplex wrote:
           | My understanding is that hardware RAID is mainly a thing in
           | the Windows world, because apparently its software RAID
           | implementation is garbage
        
         | nickik wrote:
         | > What is the purpose of ZFS in 2021 if we have hardware RAID
         | 
          | Hardware RAID is actually older than ZFS-style software RAID.
          | ZFS was specifically designed to fix the issues with hardware
          | RAID.
          | 
          | The problem with hardware RAID is that it has no idea what's
          | going on on top of it, and even worse, it's mostly a bunch of
          | closed-source firmware from a vendor. And the cards cost money.
         | 
          | You can find lots of terrible stories about those.
         | 
         | ZFS is open-source and battle tested.
         | 
         | > linux software RAID
         | 
          | Not sure what you are referring to.
         | 
         | > BTRFS does RAID too.
         | 
          | BTRFS basically copied many of the features done in ZFS.
          | BTRFS has a history of being far less stable. ZFS is far more
          | battle tested. They say it's stable now, but they have said
          | that many times before. It ate my data twice, so I haven't
          | followed the project since. A file system, in my opinion, gets
          | exactly one chance with me.
         | 
         | They each have some features the other doesn't but broadly
         | speaking they are similar technology.
         | 
          | The new bcachefs is also coming along and adds some interesting
          | features.
         | 
         | > Why would people choose ZFS in 2021 if both Oracle and Open
         | Source users have 2 competing ZFS?
         | 
          | Not sure what that has to do with anything. Oracle is an evil
          | company: they tried to take all these great open source
          | technologies away from people, and the community fought against
          | it. Most of the ZFS team left after the merger.
          | 
          | The open-source version is arguably better, and has far more of
          | the original designers working on it. The two code bases have
          | diverged a lot since then.
         | 
          | At the end of the day, ZFS is incredibly battle tested and
          | works incredibly well at what it does. It has had an incredible
          | reputation for stability basically since it came out. The
          | question, in my opinion, is "why not ZFS?" rather than "why
          | ZFS?".
        
           | _tom_ wrote:
            | > ZFS is far more battle tested. They say it's stable now,
            | but they have said that many times before. It ate my data
            | twice
           | 
           | Did you mean "it ate my data" to apply to ZFS? Or did you
           | mean BTRFS?
        
             | znpy wrote:
             | It was probably BTRFS.
             | 
             | I never fell for the BTRFS meme but many friends of mine
             | did, and many of them ended up with a corrupted filesystem
             | (and lost data).
        
           | garmaine wrote:
           | He was referring to mdadm RAID.
        
         | Quekid5 wrote:
         | > hardware RAID
         | 
         | That's just the worst of all worlds: Usually proprietary _and_
         | you get the extreme aversion to improvement (or any change
         | really) of hardware vendors.
         | 
          | This ZFS change is coming, and it may end up being complex for
          | users to make use of... but it's happening. At the risk of
          | being hyperbolic: something like this would never be possible
          | with a HW RAID system unless it had explicitly been designed
          | for it from the start.
         | 
         | Also: ZFS does much more than any hardware RAID ever did.
        
         | boomboomsubban wrote:
         | They are not interoperable but they're barely competing as
         | Solaris is dead. Does Oracle Linux even offer Oracle ZFS? I
         | assume they stick to btrfs considering they are the original
         | developers.
         | 
         | RAID does not feature the data protection offered by a copy on
         | write filesystem, and OpenZFS is the most stable and portable
         | option.
        
         | nix23 wrote:
          | It's pretty easy. Having lots of experience with HW RAID and
          | SW RAID, software is the way to go because:
          | 
          | 1. Do you trust firmware? I don't. I can tell you stories about
          | SANs freaking out... never had that with Solaris or FreeBSD and
          | ZFS.
          | 
          | 2. Why have an additional abstraction layer? HW RAID caching
          | vs FS caching, no transparency for error correction, no smart
          | RAID rebuild, etc.
          | 
          | The list can go on and on, but HW RAID is a thing of the past
          | (exceptions are specialized SANs, etc.)
        
         | usefulcat wrote:
         | The last time I had to use HW raid it was horrible. The
         | software for managing the RAID array was a poorly documented,
         | difficult to use proprietary blob. I used it for years and the
         | experience never improved. And this is a thing where if you
         | make a mistake you can destroy the very data that you've gone
         | to such lengths to protect. Having switched to ZFS several
        | years ago, I lack the words to express how much I don't miss
         | having to deal with that.
        
       | nickik wrote:
       | I prefer just to have mirrors but its cool that it slowly coming,
       | some people seem to really want this feature.
       | 
       | ZFS has been amazing to me, I have zero complaints.
       | 
        | I just wish it hadn't taken so long to come to root (/) on
        | Linux. Even today you have to do a lot of work unless you want
        | to use the new support in Ubuntu.
       | 
        | The license snafu is so terrible: open-source licenses excluding
        | each other. Crazy. The world would have been a better place if
        | Linux had incorporated ZFS long ago. (And no, we don't need yet
        | another legal discussion; my point is just that it's sad.)
        
       | 1980phipsi wrote:
       | This will be very useful!
       | 
       | TIL FreeNAS is now TrueNAS.
        
         | znpy wrote:
         | Actually there's more!
         | 
          | A new version of TrueNAS is in the works; it's called TrueNAS
          | SCALE and it's going to be Linux-based (no more FreeBSD).
          | 
          | I'm frankly happy, because TrueNAS is great as a NAS operating
          | system, but I really wanted to run containers where my storage
          | is, and having to run a VM adds really unnecessary overhead
          | (plus, it's another machine to manage).
        
       | d33lio wrote:
        | I'll believe it when I see it. Why anyone uses BTRFS (UnRaid or
        | any other form of software raid that _isn't_ ZFS) is still
        | beyond me. At least when we're not talking SSDs ;)
       | 
       | ZFS is incredible, curious to mess around with these new
       | features!
        
         | mixedCase wrote:
         | I just put two 8TB drives into btrfs because it's a home
          | server and I can't provision things up front. One day I may put a
         | third 8TB drive and turn this RAID1 into RAID5. btrfs lets me
         | do that, zfs doesn't, simple as.
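          | 
          | For reference, that conversion is roughly (device and mount
          | point are placeholders):
          | 
          |   btrfs device add /dev/sdX /mnt
          |   btrfs balance start -dconvert=raid5 -mconvert=raid1 /mnt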
         | 
          | One day I may switch the whole thing to bcachefs, which I've
          | donated to and am looking forward to. For the moment, btrfs will
         | have to do.
         | 
         | EDIT: downvoted by... the filesystem brigade?
        
           | edgyquant wrote:
            | There is a large group of people who really dislike BTRFS. I
            | think they were probably burned by it at some point, but I've
            | never had trouble, and I've been using it since it became the
            | default on Fedora.
        
           | nix23 wrote:
           | >RAID5
           | 
           | I wish you lots of fun with that on btrfs :)
           | 
           | Edit:
           | 
           | https://btrfs.wiki.kernel.org/index.php/Status
           | 
           | RAID56 Unstable n/a write hole still exists
           | 
           | > treated as if I'm storing business data or precious
           | memories without backups, guess I'm just dumb
           | 
            | No you're not, but don't use unstable features in a filesystem
        
             | mixedCase wrote:
              | Well, that's the idea! This is a low-I/O media server where
              | all the important stuff (<5G of photos) has 2+ copies, one
              | remote and one on every workstation I sync to, with the
              | rest of the data being able to crash and burn without much
              | repercussion.
             | 
             | The whole point of me using RAID1 (and maybe later RAID5)
             | is that if a disk goes bust, odds are I can still watch a
             | movie from it until I can get another disk. What's more, if
             | I ever fill the RAID1 and I don't feel like breaking the
             | piggy bank for another disk, I can go JBOD as far as my
             | usecase is concerned.
             | 
             | But hey, if the orange website tells me all servers are
             | supposed to be treated as if I'm storing business data or
             | precious memories without backups, guess I'm just dumb. On
             | that note: donations welcome, each 8TB disk costs close to
             | 500 USD here in Uruguay, so if anyone's first world opinion
             | can buy me a couple so I can use the Right Filesystem(tm),
             | I'd appreciate it!
        
         | jbverschoor wrote:
         | Licensing. Similarly, otherwise it would've been included in
         | macOS a long time ago (as the default fs according to some..)
        
           | tw04 wrote:
           | The reason it didn't end up in macOS is because NetApp sued
           | Sun for patent infringement. Apple wanted nothing to do with
           | that lawsuit and quickly abandoned the project.
           | 
           | As others have stated, dtrace has the exact same license and
           | has been in MacOS for years.
        
           | jen20 wrote:
           | The licensing is nothing to do with it on OSX - indeed DTrace
           | (also under the CDDL) has been shipping in it for years.
        
           | bsder wrote:
           | I do believe that the license was fine for macOS but when
           | Oracle bought Sun that killed it cold.
           | 
           | Jobs _never_ liked anybody other than himself holding all the
           | cards. Having Ellison and Oracle holding the keys to ZFS was
           | just never going to fly.
        
             | spullara wrote:
             | I had ZFS on a Mac from Apple for a short amount of time
             | during one of the betas :( I think TimeMachine was going to
             | be based on it but they pulled out.
        
               | codetrotter wrote:
               | FYI there is a third-party effort for making OpenZFS
               | usable on macOS.
               | 
               | https://openzfsonosx.org/
               | 
                | I used it for a while, but unfortunately, since there are
                | not many people working on this and they are not working
                | on it full time, it can take them a good while from when
                | a new version of macOS is released until OpenZFS is
                | usable with that version of macOS. This was certainly the
                | case a while ago, and it is why I stopped using OpenZFS
                | on macOS and
               | went back to only using ZFS on FreeBSD and Linux instead
               | of additionally using it on macOS. So with my Mac
               | computers I only use APFS.
        
             | qaq wrote:
             | Jobs and Ellison were really close friends
        
               | jamiek88 wrote:
               | And also cold hearted clear eyed businessmen unlikely to
               | allow friendship to affect their corporations.
               | 
               | I'd love to be a fly on the wall for some of those
               | conversations.
        
             | tw04 wrote:
             | That makes absolutely no sense. Jobs and Ellison were best
             | friends. Oracle acquiring Sun would have made it MORE
             | attractive, not less.
             | 
             | https://www.cnet.com/news/larry-ellison-talks-about-his-
             | best...
        
             | ghaff wrote:
             | It's a combination of the license and the fact that it's
             | Oracle, of all entities, that owns the copyright. Perhaps
             | either one by itself wouldn't be a dealbreaker but the
             | combination is. And, of course, Oracle could have changed
             | the license at any time after buying Sun.
             | 
             | (Of course, Jobs may have just decided he didn't want to
             | depend on someone else for the MacOS filesystem in any
             | case.)
             | 
             | ADDED: And as others noted, there were also some storage
             | patent-related issues with Sun. So just a lot of potential
             | complications.
        
           | ghaff wrote:
           | And it's arguably even a bigger issue on Linux distros.
        
             | mnd999 wrote:
              | It's a moderate pain on Linux, and really only if you're
              | running something bleeding-edge like Arch.
             | Otherwise it's just a kernel module like any other.
        
               | ghaff wrote:
               | But it doesn't ship with either Red Hat or SUSE distros,
               | which is an issue for supported commercial use.
        
           | justaguy88 wrote:
            | What's Oracle's play here? Do they somehow make money out of
            | ZFS, which makes them reluctant to re-license it?
        
             | sneak wrote:
             | Is there a CLA for OpenZFS/ZoL? I don't believe there is,
             | so I don't think Oracle can unilaterally relicense it.
        
         | Dylan16807 wrote:
         | > why anyone uses BTRFs (UnRaid or any other form of software
         | raid that isn't ZFS) is still beyond me.
         | 
         | BTRFS can do after-the-fact deduplication (with much better
         | performance than ZFS dedup) and copy-on-write files. And you
         | can turn snapshots into editable file systems.
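          | 
          | Concretely, that's things like (sketch; paths are made up, and
          | duperemove is a third-party tool):
          | 
          |   cp --reflink=always big.img copy.img    # CoW file copy
          |   duperemove -dr /data                    # after-the-fact dedup
          |   btrfs property set /snaps/home ro false # snapshot -> writable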
        
           | eptcyka wrote:
           | I've had 3 catastrophic BTRFS failures. In two cases, the
           | root filesystem just ran out of space and there was no way to
           | repair the partition. Last time, the partition was just
            | rendered unmountable after a reboot. All data was lost. No
            | such thing has ever happened with ZFS for me.
        
             | Dylan16807 wrote:
             | I've had some annoying failures too. But I wasn't listing
             | pros and cons, I was explaining that there _are_ some very
             | notable features that ZFS lacks.
        
               | eptcyka wrote:
               | That's fair. However, when listing notable features for
               | the sake of comparing software, I think it's important to
               | also list other characteristics of a given piece of
               | software. If we were to compare software by feature sets
               | alone, one might argue that Windows has the most
                | features, so Windows must be the best OS.
        
             | tux1968 wrote:
             | A recent Fedora install here came with a new default of
              | BTRFS use rather than ext4. So I'm curious about your
              | experience: were any of those catastrophic failures recent?
             | Do you know of any patches entering the kernel that purport
             | to fix the issues you experienced?
        
           | benlivengood wrote:
           | I think cloning a zfs snapshot into a writeable filesystem
           | matches at least the functionality of btrfs writeable
           | snapshots, but I could be ignorant about some use-cases.
        
             | Dylan16807 wrote:
             | Let's say you want to clear out part of a snapshot of
             | /home, but keep the rest.
             | 
             | So you clone it and delete some files. All good so far, but
             | the snapshot is still wasting space and needs to be
             | deleted.
             | 
             | But to make this happen, your clone has to stop being copy-
             | on-write. All the data that exists in both /home and the
             | clone will now be duplicated.
             | 
             | And you could say "plan ahead more", but even if you split
             | up your drive into many filesystems, now you have the
             | problem that you can't move files between these different
             | directories without making extra copies.
        
         | auxym wrote:
         | RAM?
         | 
          | Every time I looked into setting up a FreeNAS box, every
         | hardware guide insisted that ungodly amounts of absolutely-has-
         | to-be-ECC RAM was essential, and I just gave up at that point.
        
           | colechristensen wrote:
           | ZFS likes RAM and uses it to get better performance (and
           | don't think about using dedup without huge ram), but you
           | don't need it and can change the defaults.
           | 
            | ECC tends to attract zealots chasing a perfect error-free
            | existence, which ECC tends towards but doesn't deliver; it
            | just reduces errors. I personally don't mind a tiny
            | amount of bit rot (zfs will prevent most of this) or
            | rebooting my storage machine now and then.
           | 
           | You can run ZFS/freenas on a crappy old machine and you'll be
           | just fine as long as you aren't hosting storage for dozens of
           | people and you aren't a digital archivist trying to keep
           | everything for centuries.
           | 
           | Real advice:
           | 
           | * Mirrored vdevs perform way better than raidz, I don't think
           | the storage gain is worth it until you have dozens of drives
           | 
           | * Dedup isn't worth it
           | 
           | * Enable lz4 compression everywhere
           | 
           | * Have a hot spare
           | 
           | * You can increase performance by adding a vdev set and by
           | adding RAM
           | 
           | * Use drives with the same capacity
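            | 
            | Putting a few of those together, a pool of two mirrored vdevs
            | with a hot spare and lz4 could look like this (sketch; device
            | names are placeholders):
            | 
            |   zpool create -O compression=lz4 tank \
            |     mirror /dev/sda /dev/sdb \
            |     mirror /dev/sdc /dev/sdd \
            |     spare /dev/sde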
        
             | InvaderFizz wrote:
             | > Dedup isn't worth it
             | 
             | To add to that, ZFS dedup is a lie and you should forget
             | its existence unless you have a very specific scenario of
             | being a SAN with a massive amount of RAM, and even then,
             | you had better be damn sure.
             | 
             | I really wish ZFS had either an option to store the Dedup
              | Table on an NVMe device like Optane, or to do an offline
             | deduplication job.
        
             | hpfr wrote:
             | Does rebooting help with soft errors in non-ECC RAM? I
             | would have thought bit flips would be transient in nature,
             | but I'm not really familiar.
        
               | AdrianB1 wrote:
                | Running ZFS (FreeNAS/TrueNAS) on 2 home-made NAS devices
                | for years and years, I can say it is rock solid without
                | ever using ECC RAM (due to lack of choices). I can bet
                | there were many soft errors in all these years, but so
                | far I never had problems that could not be recovered. The
                | biggest issue ever was destroying the boot USB storage
                | within months, but that was eventually solved: I moved to
                | fixed drives as the boot drive, and later I moved to
                | virtualization for the boot disk and OS, so the problem
                | completely went away.
        
             | livueta wrote:
             | > Enable lz4 compression everywhere
             | 
             | Is the perf penalty low enough now that it just doesn't
             | matter? I've always disabled compression on datasets I know
             | are going to store only high-entropy data, like encoded
             | video, that has a poor compression ratio.
             | 
             | I second the hot spare recommendation many times over. It
             | can save your bacon.
        
               | simcop2387 wrote:
               | It's generally the other way around actually, aside from
               | storing already highly compressed datasets (e.g. video).
               | The compression from lz4 will get you better effective
               | performance because of the lower amount of io that has to
               | be done, both in throughput and latency on zfs. This is
                | because your CPU can usually do lz4 at multiple GB/s per
                | core, compared to the couple hundred MB/s you might get
                | from your spinning rust disks.
        
               | livueta wrote:
               | Neat! Makes sense.
        
           | n8ta wrote:
           | The freenas hardware requirements themselves say "8 GB RAM
           | (ECC recommended but not required)"
           | 
           | https://www.freenas.org/hardware-requirements/
           | 
           | I myself use freenas with 16GB of non-ECC ram.
           | 
           | Of course it is possible to have a bit flip in memory that is
           | then dutifully stored incorrectly by ZFS to disk, but this
           | was a possibility without ZFS as well.
           | 
            | I've actually been waiting for this feature since I first set
            | up my pool. It seemed theoretically possible; we were just
            | waiting for an implementation.
        
           | dsr_ wrote:
           | Neither quantity nor ECC is essential.
           | 
           | ZFS defaults to assuming it is the primary reason for your
           | box to exist, but it only takes two lines to define more
           | reasonable RAM usage: zfs_arc_min and zfs_arc_max. On a NAS
           | type server, I would think setting the max to half of your
           | RAM is reasonable. Maybe 3/4 if you never do anything except
           | storage.
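            | 
            | On Linux those are module parameters; e.g. to cap the ARC at
            | 8 GiB (value in bytes, a sketch to adjust for your own RAM):
            | 
            |   # /etc/modprobe.d/zfs.conf
            |   options zfs zfs_arc_max=8589934592
            | 
            | or at runtime:
            | 
            |   echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max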
           | 
           | ECC is not recommended because ZFS has some kind of special
           | vulnerability without it; ECC is recommended because ZFS has
           | taken care of all the more likely chances of undetectable
           | corruption, so that's the next step.
        
             | fpoling wrote:
              | It is not that simple regarding ECC. Since ZFS uses more
              | memory, the probability of hitting a memory error is simply
              | higher with it.
        
               | amarshall wrote:
               | But it doesn't really use more memory. The ARC gives the
               | impression of high memory usage because it's different
               | than the OS page cache and usually called out explicitly
               | and not ignored in many monitoring tools like the OS
               | cache is. Linux--without ZFS--will happily consume nearly
               | all RAM with _any_ filesystem if enough data is read and
               | written.
        
               | dsr_ wrote:
               | This is correct. Any filesystem using the kernel's
               | filesystem cache will do this, too.
               | 
               | For a long running, non-idle system, a good rule of thumb
               | is that all RAM not being actively used is being used by
               | evictable caching.
        
               | zerd wrote:
               | A colleague who was used to other UNIXes was
               | transitioning to Linux for a database. He saw in free
                | that "used" was at more than 90%, so he added more
                | RAM. But to his surprise it was still using 90%! He kept
                | adding RAM. I told him that he had to subtract the buffers
               | and cached values (this was before free had the Available
               | column).
        
               | mark-wagner wrote:
               | Before the Available column there was the -/+
               | buffers/cache line that provided the same information.
                | Maybe it was too confusing.
                | 
                |                total       used       free     shared    buffers     cached
                |   Mem:      12286456   11715372     571084          0      81912    6545228
                |   -/+ buffers/cache:    5088232    7198224
                |   Swap:     24571408      54528   24516880
        
           | ahofmann wrote:
           | Others have said good things (ECC is good by itself, has not
           | much to do with ZFS) and it is actually quite easy to check
           | if you need much RAM for ZFS. Start a (Linux) VM with a few
           | hundred megabytes of RAM and run ZFS an on it. Of course, it
           | will not be as performant as having a lot of RAM. But it will
           | not crash, or hang or be unusable in one way or another.
           | 
           | Sources: - https://www.reddit.com/r/DataHoarder/comments/3s7v
           | rd/so_you_... - https://www.reddit.com/r/homelab/comments/8s6
           | r2r/what_exactl... - My own tests with around 8 TB ZFS data
           | in a Linux vm with 256 MB RAM.
        
           | IgorPartola wrote:
           | Heh so you have that backwards. All RAM should be ECC if you
           | care about what's stored in it. It's not a ZFS requirement,
           | it's just that ZFS specifically cares about data integrity so
           | it advises you to use ECC RAM. But it's not like any other
           | file system is immune from random RAM corruption: it's not,
           | it just won't tell you about it.
        
           | handrous wrote:
           | The "you need at least 32GB of memory and it _has to be_ ECC,
           | or don 't even bother trying to use ZFS" crowd has done some
           | serious harm to ZFS adoption. Sure, that's what you need if
           | you want _excellent_ data integrity guarantees and to use
           | _all_ of ZFS ' advanced features. If you're fine with merely
           | way-better-than-most-other-filesystems data integrity
           | guarantees and using only _most_ of ZFS ' advanced features,
           | you don't need those.
        
             | tombert wrote:
             | I really don't know where the "You gotta have ECC RAM!"
             | thing started. I've been running a ZFS RAID on Nvidia
             | Jetson Nanos for years now and haven't had any issues at
             | all with data integrity.
             | 
             | I don't see why ZFS would be more prone to data integrity
             | issues spawning from a lack of ECC than any other
             | filesystem.
        
               | kurlberg wrote:
               | Years ago I saw it at:
               | 
               | https://www.truenas.com/community/threads/ecc-vs-non-ecc-
               | ram...
               | 
               | (the gist of the scary story is that faulty ram while
               | scrubbing might kill "everything".) However, in the end
               | ECC appears to NOT be so important, e.g., see
               | 
               | https://news.ycombinator.com/item?id=23687895
        
               | radiowave wrote:
               | Relevant quote from one of ZFS's primary designers, Matt
               | Ahrens: "There's nothing special about ZFS that
               | requires/encourages the use of ECC RAM more so than any
               | other filesystem. ... I would simply say: if you love
               | your data, use ECC RAM. Additionally, use a filesystem
               | that checksums your data, such as ZFS."
        
               | tombert wrote:
               | Yeah, I remember reading that a few years ago.
               | 
               | If I were running a server farm or something, then yeah,
               | I'd probably use ECC memory, but I think if you're
               | running a home server, then the argument that ZFS
               | necessitates ECC more than Ext4 or Btrfs or XFS or
               | whatever doesn't really seem to be accurate.
        
               | oarsinsync wrote:
               | > the argument that ZFS necessitates ECC more than Ext4
               | or Btrfs or XFS or whatever doesn't really seem to be
               | accurate
               | 
               | Agreed.
               | 
               | > If I were running a server farm or something, then
               | yeah, I'd probably use ECC memory, but I think if you're
               | running a home server
               | 
               | Then you should still use ECC RAM, regardless of what
               | filesystem you're using.
               | 
               | No, really. ECC matters
               | (https://news.ycombinator.com/item?id=25622322)
               | generally.
        
               | tombert wrote:
               | Fair enough, though AFAIK none of the SBC systems out
               | there have ECC, and I generally use SBCs due to the low
               | power consumption.
        
           | simcop2387 wrote:
            | You really only end up needing that if you're also going to
            | do live deduplication of large amounts of data. Very few
            | people actually need that; just using compression with lz4 or
            | zstd, depending on your needs, will suffice for just about
            | everyone and perform better.
            | 
            | The ECC argument is probably about a 50/50 kind of thing. You
            | can get away without it, and ZFS will do its best to detect
            | and prevent issues, but if the data was flipped before it was
            | given to ZFS then there's nothing anyone can do. You might
            | get some false positives when reading data back if you have
            | some flaky RAM, but as long as you have parity or redundancy
            | on the disks, things should still get read correctly even if
            | a false problem is detected. That might mean you want to run
            | a scrub (essentially ZFS's version of fsck) more often to
            | look for potential issues, but it shouldn't fundamentally be
            | a big deal.
            | 
            | If you want 24/7 highly available storage that won't blip out
            | occasionally, you'll probably really want the ECC RAM. But if
            | you're fine with having to reboot it occasionally, or tell it
            | to repair problems that it thinks were there (but weren't,
            | because the disk is fine but the RAM wasn't), then you should
            | be fine.
            | 
            | The extra checksums and data that ZFS can use for all this
            | can make it really robust even on bad hardware. I had a BIOS
            | update cause some massive PCIe bus issues that I didn't
            | realize were going on for a bit, and ZFS kept all my data in
            | good condition even though writes were sometimes just never
            | happening because of ASPM causing issues with my controller
            | card.
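            | 
            | For reference, a scrub is just (pool name assumed to be
            | "tank"):
            | 
            |   zpool scrub tank        # start the scrub
            |   zpool status -v tank    # progress and any repaired errors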
        
           | UI_at_80x24 wrote:
           | As always, it depends on your use-case.
           | 
            | I have several file servers that all use ZFS exclusively, and
            | 10x that number of servers using ZFS as the system FS.
           | 
           | Rule of thumb that I like: 1GB RAM/TB of storage. This seems
           | to give me the best bang-for-our-buck.
           | 
           | For a small (under 20) number of office users, doing general
           | 'office' stuff, using Samba, it's overkill.
           | 
           | For large media shares with heavy editor access, and heavy
           | strains on the network, it's a minimum.
           | 
           | Depends on what the server is serving.
           | 
           | DeDUP is a different story. The RAM is used to store the
           | frequently accessed data. If you are using DeDUP you fill the
           | motherboard with as much RAM as will fit. NO EXCEPTIONS! This
           | may have been the line of thinking that scared you away from
           | it.
           | 
           | I have a 100TB server that is just used for writing data to
           | and is never read from (sequential file back-ups before it's
           | moved to "long term storage"). It has 8GB of RAM, and is
           | barely touched.
           | 
           | I also have a 20TB server with 2TB of RAM, that keeps the RAM
           | maxed out with DeDUP usage.
           | 
           | ECC: It's insurance, and it's worth it.
        
         | the8472 wrote:
          | btrfs does have some advantages over zfs:
          | 
          |   - no data duplicated between page cache and arc
          |   - no upgrade problems on rolling distros
          |   - balance allows restructuring the array
          |   - offline dedup, no need for huge dedup tables
          |   - ability to turn off checksumming for specific files
          |   - O_DIRECT support
          |   - reflink copy
          |   - fiemap
          |   - easy to resize
        
           | chasil wrote:
           | - defragmentation
        
           | nix23 wrote:
            | But the main job of an fs is to preserve your files... btrfs
            | can't even deliver on that most important point.
        
             | edgyquant wrote:
             | I see this a lot but have never had problems with BTRFS and
             | I've used it both on my larger disks (2+tb) and my root
             | (250gb ssd) across multiple computers for the last four
             | years.
        
             | the8472 wrote:
              | The checksumming helps to spot faulty hardware; that's a
              | step above most other filesystems, and often above SMART
              | info too.
        
               | akvadrako wrote:
               | Checksums don't help against bugs. You are much less
               | likely to lose your whole disk with ext4 or ZFS than
               | BTRFS.
        
           | baaym wrote:
           | And even included in the kernel
        
         | donmcronald wrote:
         | BTRFS was useful for me. When those (RAID5) parity patches got
         | rejected many, many years ago for non-technical reasons like
         | not matching a business case/goal or similar, it changed my
         | view of open source.
         | 
         | That was the day I realized that some open source participants
         | and supporters are interested in having open source projects
         | that are good enough to act as a barrier to entry, but not good
         | enough to compete with their commercial offerings.
         | 
         | Judge the world from that perspective for a while and it can
         | help to explain why so much open source feels 80% done and
         | never gets the last 20% of the polish needed to make it great.
        
         | imiric wrote:
         | Simplicity. There's a lot of complexity in ZFS I'd rather not
         | depend on, and because it does so many things it's a big
         | investment and liability to switch to.
         | 
         | While I understand why it would be useful in a corporate
         | setting, for personal use I've found the combination of
         | LUKS+LVM+SnapRAID to work well and don't see the benefit of
         | switching to ZFS. Two of those are core Linux features, and
          | SnapRAID has been rock solid, though admittedly I haven't had
          | to test its recovery process; it seems straightforward from
          | the documentation. Sure, I don't have the real-time error
         | correction of ZFS and other fancy features, but most of those
         | aren't requirements for a personal NAS.
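          | 
          | The core-Linux part of that stack is roughly (a sketch; device
          | and volume names are placeholders):
          | 
          |   cryptsetup luksFormat /dev/sdb
          |   cryptsetup open /dev/sdb data
          |   pvcreate /dev/mapper/data
          |   vgcreate vg0 /dev/mapper/data
          |   lvcreate -n media -l 100%FREE vg0
          |   mkfs.ext4 /dev/vg0/media
          | 
          | with SnapRAID layered on top via its config file and periodic
          | "snapraid sync" runs.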
        
           | [deleted]
        
           | nix23 wrote:
           | > LUKS+LVM+SnapRAID
           | 
           | + your fs
           | 
           | Yeah that sounds like a lot less complexity
        
             | imiric wrote:
             | ZFS has all of these features and more. If I don't need
             | those extra features by definition it's a less complex
             | system.
             | 
             | Using composable tools is also better from a maintenance
             | standpoint. If tomorrow SnapRAID stops working, I can
             | replace just that component with something else without
             | affecting the rest of the system.
        
               | TimWolla wrote:
               | > If tomorrow SnapRAID stops working, I can replace just
               | that component with something else without affecting the
               | rest of the system.
               | 
               | Can you actually? If some layer of that storage stack
               | stops working then you can no longer access your existing
               | data, because all these layers need to work correctly to
               | correctly reassemble the data read from disk.
        
               | imiric wrote:
               | It's a hypothetical scenario :) In reality if there's a
               | project shutdown there would be enough time to migrate to
               | a different setup. Of course it would be annoying to do,
               | but at least it's possible. With a system like ZFS I'm
               | risking having to change the filesystem, volume manager,
               | storage array, encryption and whatever other feature I
               | depended on. It's a lot to buy into.
        
               | nix23 wrote:
                | Since all those tools are from different devs, the system
                | gets more complex. But hey, if you really think that ZFS
                | is too complex to hold 55 petabytes because it has too
                | many potential bugs, you should tell them:
               | 
               | https://computing.llnl.gov/projects/zfs-lustre
        
               | imiric wrote:
               | Thankfully I don't have to manage 55 petabytes of data,
               | but good luck to them.
               | 
               | Did you miss the part where I mentioned "for personal
               | use"?
               | 
               | > Since all those tools are from different dev's the
               | system gets more complex.
               | 
               | I fail to see the connection there. Whether software is
               | developed by a single entity or multiple developers has
               | no relation to how complex the end user system will be.
               | 
               | But many small tools focused on just the functionality I
               | need allows me to build a simpler system overall.
        
               | funcDropShadow wrote:
               | > Whether software is developed by a single entity or
               | multiple developers has no relation to how complex the
               | end user system will be.
               | 
               | The first part of this sentence is probably true, as far
               | as I see, but the complexity of a system perceived by the
               | user depends primarily on the "surface" of the system.
               | That surface includes the UI, the documentation and
               | important concepts you have to understand for effective
               | usage of the system. And in that regard, ZFS wins hands
               | down against LUKS + LVM + SnapRaid + your FS of choice.
               | Some questions a user of that LVM stack has to answer,
               | aren't even asked of a ZFS user. E.g. the question how to
               | split the space between volumes or how to change the size
               | of volumes.
        
           | j1elo wrote:
           | What about if you were just starting today, with 0 knowledge
           | about basically anything related to storage and how to do it
           | right?
           | 
           | That's my case, I'm learning before setting up a cheap home
           | lab and a NAS, and I'm wondering if biting into ZFS is just
           | the best option that I have given today's ecosystem.
        
             | imiric wrote:
             | I would still go with a collection of composable tools
             | rather than something monolithic as ZFS, and to avoid the
             | learning curve. But again, for personal use. If you're
             | planning to use ZFS in a professional setting it might be
             | good to experiment with it at home.
        
               | j1elo wrote:
               | As mentioned in the sibling comment, one thing I like is
               | having systems that don't require me to supervise, fix
                | things, etc. In part that's why I've always been a user
                | of ext4; it just works.
                | 
                | But I've recently found bitrot in some of my data files
               | and now that I happened to be learning about how to build
               | a NAS, I wanted to make the jump to some FS that helps me
               | with that task.
               | 
               | Could you mention which tools you would use to replace
               | ZFS? Think of checksumming, snapshotting, and to a lesser
               | degree, replication/RAID.
        
             | throw0101a wrote:
             | > _That 's my case, I'm learning before setting up a cheap
             | home lab and a NAS, and I'm wondering if biting into ZFS is
             | just the best option that I have given today's ecosystem._
             | 
             | ZFS is the simplest stack that you can learn IMHO. But if
             | you want to learn all the moving parts of an operating
             | system for (e.g.) professional development, then more
             | complex may be more useful.
             | 
              | If you want to create a mirrored pair of disks in ZFS, you
             | do: _sudo zpool create mydata mirror /dev/sda /dev/sdb_
             | 
             | In the old school fashion, you first partition with _gdisk_
             | , then you use _mdadm_ to create the mirroring, then
             | (optionally) LVM to create volume management, then _mkfs_.
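              | 
              | Roughly, that traditional stack looks like (a sketch;
              | partitions and names are placeholders):
              | 
              |   mdadm --create /dev/md0 --level=1 --raid-devices=2 \
              |     /dev/sda1 /dev/sdb1
              |   pvcreate /dev/md0
              |   vgcreate vg0 /dev/md0
              |   lvcreate -n mydata -l 100%FREE vg0
              |   mkfs.ext4 /dev/vg0/mydata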
        
             | cogman10 wrote:
             | I dove into ZFS for my home lab as a relative novice.
             | 
             | It's not terrible, but there are a few new concepts to come
             | to grips with. Once you have them down, it's not terrible.
             | 
             | If you don't plan on raiding, IMO, ZFS is overkill. The
             | check-summing is nice, but you can get that from other
             | filesystems.
             | 
             | Maintenance is fairly straight forward. I've even done a
             | disk swap without too much fuss.
             | 
             | The biggest issue I had was setting up raid z on root with
             | ubuntu was a PITA (at the time at least, March of this
             | year). I ended up switching over to debian instead. Once
             | setup, things have been pretty smooth.
        
               | j1elo wrote:
                | A few things I like about it, from what I've read so far:
               | 
               | * Checksumming
               | 
               | * As you mention, easy maintenance
               | 
               | * Snapshots and how useful they are for backups
               | 
               | In the end what I value is stuff that works reliably,
               | doesn't get in the way, and requiring minimal
               | supervision. And in the particular case of FS, I'd like
               | to adopt a system that helps avoid bitrot in my data.
               | 
               | Could you drop some names that you would consider as good
               | alternatives of ZFS?
        
         | gregmac wrote:
         | For my big media volume, which had existed for around 10 years,
         | I use snapraid.
         | 
         | Because of several things:
         | 
         | * I can mix disk sizes
         | 
         | * I can add new disks over time as needed
         | 
         | * If something dies, up to the entire server, I can just stick
         | any data disk in another system and read it
         | 
         | I didn't want to become a zfs expert (and the learning curve
         | seems steep!), and I didn't want to spend thousands of dollars
         | on new gear (dedicated NAS box and a bunch of matched-size
         | disks).
         | 
         | I repurposed my old workstation into a server, spent a few
         | hours getting it set up, and it works. I've had two disks fail
         | (one data, one parity, and recovered from both). Every time
         | I've added a new disk, it's been 50-100% larger than my
         | existing disks.
         | 
         | I've also migrated the entire setup to a new system (newer old
         | retired workstation), running proxmox, and was pleasantly
         | surprised it only took about an hour to get that volume back up
         | (incidentally, that server runs zfs as well.. I just don't use
         | it for my large media storage volume).
        
           | joshstrange wrote:
           | UnRaid and Synology user here and I completely agree with all
           | your points. The knowledge that at worst I will lose the data
            | on just 1 disk (or 2 if another fails during a rebuild) is very
           | calming. If not for UnRaid there is no way I could manage the
           | size of the media volume I maintain (from a time, energy, and
           | money perspective). I mean if you know ZFS well and trust
           | yourself then more power to you but UnRaid and friends fill a
           | real gap.
        
           | atmosx wrote:
           | The learning curve of ZFS compared to every alternative out
           | there is significantly lower IMO. The interface is easier and
           | the guides online are great.
           | 
            | There are drawbacks, such as the one discussed here, but as a
            | Linux user who doesn't want to mess with the FS and uses ZFS for
           | the backup server, the experience has been great so far.
        
       ___________________________________________________________________
       (page generated 2021-06-18 23:00 UTC)