Title: My BTRFS cheatsheet
       Author: Solène
       Date: 29 August 2022
       Tags: btrfs linux
       Description: 
       
       # Introduction
       
       I recently switched my home "NAS" (single disk!) to BTRFS.  It's a
       different ecosystem with many features and commands, so I had to write
       a bit about it to remember the various possibilities...
       
       BTRFS is an advanced file-system supported by Linux, somewhat
       comparable to ZFS.
       
       # Layout
       
       A BTRFS file-system can span multiple disks, aggregated either as a
       mirror or "concatenated".  It can be split into subvolumes, which may
       each have specific settings.
       
       Snapshots and quotas apply to subvolumes, so it's important to think
       ahead when creating them.  In most cases, one may want a dedicated
       subvolume for /home and another for /var.
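
       To make this concrete, here is a minimal sketch, assuming a BTRFS
       file-system already mounted on /mnt/pool (a made-up path) and a root
       shell.  The create_layout function is only a wrapper for illustration,
       so nothing runs until you call it.

```shell
# Create one subvolume per area that needs its own snapshots or quotas.
# Assumption: $1 is the mount point of an existing BTRFS file-system.
create_layout() {
    pool="$1"
    btrfs subvolume create "$pool/home" &&
    btrfs subvolume create "$pool/var" &&
    btrfs subvolume list "$pool"
}

# Usage (as root): create_layout /mnt/pool
```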
       
       # Snapshots / Clones
       
       It's possible to take an instant snapshot of a subvolume, which can be
       used as a backup.  Snapshots can be browsed like any other directory.
       They exist in two flavors: read-only and writable.  ZFS users will
       recognize writable snapshots as "clones" and read-only ones as regular
       ZFS snapshots.
       
       Snapshots are an effective way to make a backup and to roll back
       changes in a second.
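
       Both flavors come from the same command, as this small sketch shows.
       The function names, the date-based naming scheme and the paths in the
       usage line are hypothetical choices, only "btrfs subvolume snapshot"
       and its -r flag are real.

```shell
# Read-only snapshot, named after the subvolume and the current date.
# Assumptions: $1 is a subvolume, $2 an existing directory on the same
# BTRFS file-system where snapshots are kept.
snapshot_ro() {
    btrfs subvolume snapshot -r "$1" "$2/$(basename "$1")-$(date +%Y-%m-%d)"
}

# Writable snapshot (a "clone" in ZFS terms): same command without -r.
snapshot_rw() {
    btrfs subvolume snapshot "$1" "$2"
}

# Usage (as root): snapshot_ro /mnt/pool/home /mnt/pool/.snapshots
```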
       
       # Send / Receive
       
       A read-only snapshot can be sent / received as a raw stream over the
       network (or anything supporting a pipe), and only the differences
       with a previous snapshot need to be transferred.  This is a very
       effective way to do incremental backups without having to scan the
       entire file-system each time you run your backup.
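
       A rough sketch of such a pipeline over ssh could look like this.  It
       assumes read-only snapshots, a remote user allowed to run btrfs, and a
       destination directory on a BTRFS file-system.  The host name and
       paths in the usage lines are invented for the example.

```shell
# First run: send a complete read-only snapshot.
# $1 = snapshot, $2 = ssh host, $3 = destination directory on the host
send_full() {
    btrfs send "$1" | ssh "$2" btrfs receive "$3"
}

# Later runs: only send what changed between the parent snapshot ($1)
# and the new one ($2).  The parent must still exist on both sides.
# $3 = ssh host, $4 = destination directory on the host
send_incremental() {
    btrfs send -p "$1" "$2" | ssh "$3" btrfs receive "$4"
}

# Usage (as root):
#   send_full /snap/monday backuphost /backup
#   send_incremental /snap/monday /snap/tuesday backuphost /backup
```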
       
       # Deduplication
       
       I covered deduplication with bees, but one can also use the program
       "duperemove" (works on XFS too!).  They work a bit differently, but in
       the end they serve the same purpose.  Bees operates on the whole BTRFS
       file-system while duperemove operates on the files you give it, so
       they fit different use cases.
       
 (HTM) duperemove GitHub project page
 (HTM) Bees GitHub project page
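
       For the duperemove side, a typical invocation might look like the
       sketch below.  The dedup_dir wrapper and the paths in the usage line
       are assumptions for illustration, the flags are duperemove's own.

```shell
# Deduplicate identical extents among the files under a directory.
# -r recurses into the directory, -d really submits the duplicates for
# deduplication (without -d, duperemove only reports what it found).
# --hashfile stores the computed hashes so the next run is faster.
# $1 = directory to process, $2 = path of the hash file
dedup_dir() {
    duperemove -dr --hashfile="$2" "$1"
}

# Usage: dedup_dir /mnt/pool/home /var/tmp/home.hash
```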
       
       # Compression
       
       BTRFS supports on-the-fly compression per subvolume, meaning the
       content of each file is stored compressed, and decompressed on demand.
       Depending on the files, this can improve performance, because less
       data is read from and written to the disk so you are less likely to
       be I/O bound, and it also improves storage efficiency.  This is really
       content dependent: already compressed files like pictures/videos/music
       won't shrink any further, but if you have a lot of text and source
       files, you can achieve great ratios.
       
       From my experience, compression is always helpful for a regular user
       workload, and newer algorithms are smart enough to not compress binary
       data that wouldn't yield any benefit.
       
       There is a program named compsize that reports compression statistics
       for a file/directory.  It's very handy to know if the compression is
       beneficial and to what extent.
       
 (HTM) compsize GitHub project page
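
       As an illustration, compression can be requested on a path through a
       BTRFS property and then measured with compsize.  This is a sketch:
       the wrapper names and the choice of zstd are assumptions, and the
       property only affects data written afterwards.

```shell
# Ask BTRFS to compress new writes under a subvolume or directory.
# Alternative for a whole file-system: mount it with -o compress=zstd
enable_compression() {
    btrfs property set "$1" compression zstd
}

# Report how much disk space the (possibly compressed) data really uses.
compression_report() {
    compsize "$1"
}

# Usage (as root):
#   enable_compression /mnt/pool/home
#   compression_report /mnt/pool/home
```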
       
       # Defragmentation
       
       Fragmentation is a real thing and not specific to Windows.  It matters
       a lot for mechanical hard drives, but not really for SSDs.
       
       Fragmentation happens when you create files on your file-system, and
       delete them: this happens very often due to cache directories, updates
       and regular operations on a live file-system.
       
       When you delete a file, this creates a "hole" of free space.  After
       some time, you may want to gather all these small parts of free space
       into big chunks.  This matters for mechanical disks, as the physical
       location of data is tied to the raw performance.  The defragmentation
       process simply reorganizes data physically, to order file chunks and
       free space into contiguous blocks.
       
       Defragmentation can also be used to force compression in a subvolume,
       for instance if you want to change the compression algorithm, or if
       you enabled compression after saving the files.
       
       The command line is: btrfs filesystem defragment
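
       Spelled out with options, the two uses could be sketched as follows.
       The function names and the zstd choice are only examples.

```shell
# Defragment a directory tree: -r recurses, -v lists the files processed.
defrag_tree() {
    btrfs filesystem defragment -r -v "$1"
}

# Same, but rewrite the data compressed with zstd (-czstd).  This is the
# trick to compress files written before compression was enabled.
recompress_tree() {
    btrfs filesystem defragment -r -v -czstd "$1"
}

# Usage (as root): recompress_tree /mnt/pool/home
```

       Be careful on a file-system with snapshots: defragmenting can break
       the data sharing between a snapshot and its origin, and so use more
       disk space.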
       
       # Scrubbing
       
       The scrubbing feature is one of the most valuable features provided by
       BTRFS and ZFS.  Each block of data in these file-systems is associated
       with a checksum in some metadata index, which means you can actually
       check the integrity of each file by comparing its current content with
       the checksums known in the index.
       
       Scrubbing costs a lot of I/O and CPU because you need to compute the
       checksum of everything stored, but it validates all the stored data.
       In case of a corrupted file, if the file-system is composed of
       multiple disks (raid1 / raid5), it can be repaired from the redundant
       data.  This should work most of the time, because such corruption is
       often related to the drive itself, thus other drives shouldn't be
       affected.
       
       Scrubbing can be started / paused / resumed, which is handy if you
       need to run heavy I/O and you don't want the scrubbing process to slow
       it down.  While the scrub commands can take a device or a path, the
       path parameter is only used to find the related file-system, it won't
       scrub only the files in that directory.
       
       The command line is: btrfs scrub
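
       The subcommands map to the start / pause / resume cycle roughly like
       this.  The wrapper names are made up, and "pause" is done with the
       cancel subcommand, which saves the progress so that resume can
       continue from there.

```shell
# $1 is a device or any path inside the file-system to scrub.
scrub_start()  { btrfs scrub start "$1"; }
scrub_status() { btrfs scrub status "$1"; }
scrub_pause()  { btrfs scrub cancel "$1"; }
scrub_resume() { btrfs scrub resume "$1"; }

# Usage (as root): scrub_start /mnt/pool && scrub_status /mnt/pool
```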
       
       # Rebalancing
       
       When you aggregate multiple disks into one BTRFS file-system, some
       files are written to one disk and other files to another.  After a
       while, a disk may contain more data than the others.
       
       The purpose of rebalancing is to redistribute data across the disks
       more evenly.
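
       A possible workflow, sketched under the assumption of a multi-disk
       file-system mounted on a path passed as $1, is to look at the
       per-disk usage first and then balance.  The function names and the 50
       percent threshold are arbitrary examples.

```shell
# Show how much is allocated on each device of the file-system.
show_usage() {
    btrfs filesystem usage "$1"
}

# Rewrite everything: this can take a long time on a large file-system.
balance_full() {
    btrfs balance start --full-balance "$1"
}

# Lighter variant: only rewrite data chunks that are at most 50% full.
balance_light() {
    btrfs balance start -dusage=50 "$1"
}

# Check on a balance that is running.
balance_status() {
    btrfs balance status "$1"
}
```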
       
       # Swap file
       
       You can't create a swap file on a BTRFS disk without a tweak.  You
       must create the file in a directory carrying the special attribute
       "no COW", set with "chattr +C /tmp/some_directory".  You can then move
       the file anywhere on the same file-system, as it keeps the inherited
       "no COW" flag.
       
       If you try to use a swap file with COW enabled on it, swapon will
       report a weird error, but you get more details in the dmesg output.
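
       Here is a sketch of a variant of this tweak, following the steps
       given in the btrfs manual page: instead of using a "no COW"
       directory, the flag is set directly on the file while it is still
       empty.  The make_swapfile name, the path and the size in the usage
       line are placeholders.

```shell
# Create and enable a swap file on BTRFS.
# $1 = path of the swap file, $2 = size (for example 2G)
make_swapfile() {
    file="$1"
    size="$2"
    truncate -s 0 "$file" &&          # start from an empty file
    chattr +C "$file" &&              # disable COW while it is empty
    fallocate -l "$size" "$file" &&   # allocate the space
    chmod 0600 "$file" &&             # swap must not be readable by users
    mkswap "$file" &&
    swapon "$file"
}

# Usage (as root): make_swapfile /swap/swapfile 2G
```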
       
       # Converting
       
       It's possible to convert an ext2/3/4 file-system into BTRFS, obviously
       it must not be currently in use.  The conversion can be rolled back,
       but only up to a certain point: not after defragmenting or
       rebalancing, for instance.
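
       The conversion is done with the btrfs-convert tool.  A minimal
       sketch, assuming an unmounted ext4 partition passed as $1 (the device
       name in the usage line is a placeholder) and a backup made
       beforehand, might be:

```shell
# Check the ext file-system first, then convert it in place.
convert_to_btrfs() {
    e2fsck -f "$1" && btrfs-convert "$1"
}

# Go back to the original ext file-system (-r means rollback).
rollback_conversion() {
    btrfs-convert -r "$1"
}

# Usage (as root, partition unmounted): convert_to_btrfs /dev/sdX1
```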