Title: Sharing some statistics about BTRFS compression
       Author: Solène
       Date: 21 September 2022
       Tags: btrfs filesystem
       Description: 
       
       # Introduction
       
       As I'm moving to Linux more and more, I took the opportunity to
       explore the BTRFS file system, which was mostly unknown to me.
       
       Let me share some data about compression ratio with BTRFS (ZFS should
       give similar results).
       
       # Work laptop
       
       ## First data
       
       This is my work computer, with a big Nix store, software builds
       involving a lot of cache files, and many git repositories.
       
       ```
       Processed 3570629 files, 894690 regular extents (1836135 refs), 2366783 inline.
       Type       Perc     Disk Usage   Uncompressed Referenced
       TOTAL       61%       55G          90G         155G
       none       100%       35G          35G          52G
       zlib        37%       20G          54G         102G
       prealloc   100%      138M         138M          67M
       ```
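
       For reference, this output format matches the `compsize` utility
       (from the btrfs-compsize package); assuming that is the tool used
       here, the same report can be generated with:

       ```
       # compsize walks the given path and reports per-algorithm disk
       # usage versus uncompressed size (it needs root to read extents)
       sudo compsize /
       ```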
       
       The output shows that real disk usage is 61% of the logical data,
       so compression saves 39% of the space.  The report is broken down
       per compression algorithm: `none` represents uncompressed data,
       and `zlib` the files compressed with that algorithm.
       
       Files compressed with zlib are down to 37% of their original
       size, which is not bad.  However, I made a mistake when creating
       the BTRFS mount point: I used the zlib compression algorithm,
       which is quite obsolete nowadays.  For the record, zlib is the
       library providing the "deflate" compression algorithm found in
       zip or gzip.
       
       Let's switch the compression to the zstd algorithm instead.  This
       can be done with the command `btrfs filesystem defrag -czstd -r /`:
       all files are scanned, and those that can be compressed with zstd
       are rewritten on the disk with the new algorithm.
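
       Note that defragmenting only rewrites existing files; for new
       writes to use zstd as well, the compression mount option has to
       be set.  A minimal sketch of an `/etc/fstab` entry, with a
       placeholder device path:

       ```
       # /etc/fstab: the device and mount point are hypothetical
       /dev/nvme0n1p2  /  btrfs  defaults,compress=zstd  0 0
       ```

       The option can also be applied to a running system with
       `mount -o remount,compress=zstd /`.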
       
       ## Data after switching to zstd
       
       After 37 minutes of recompressing everything, the results are
       surprising: it didn't change much!
       
       ```
       Processed 3570427 files, 928646 regular extents (1869080 refs), 2364661 inline.
       Type       Perc     Disk Usage   Uncompressed Referenced
       TOTAL       60%       54G          90G         155G
       none       100%       33G          33G          51G
       zstd        37%       21G          56G         104G
       prealloc   100%      138M         138M          67M
       ```
       
       Real disk usage is now 60% instead of 61% with zlib.  That's not
       much of an improvement; I'd have expected zstd to perform a lot
       better.
       
       However, I didn't measure compression and decompression times.
       zstd should perform a lot better in this area, so I'll stick with
       it.
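
       Out of curiosity, the raw algorithms can be compared outside of
       BTRFS with zstd's built-in benchmark mode (the file name below is
       just a placeholder):

       ```
       # benchmark zstd compression levels 1 to 19 on a sample file,
       # printing ratio, compression and decompression speed per level
       zstd -b1 -e19 sample.tar
       ```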
       
       LinuxReviews: comparison of compression algorithms
       
       # Personal computer
       
       My own laptop has a huge Nix store, a lot of binary files (music,
       pictures), and a few hundred gigabytes of video games.  I suppose
       it's quite a realistic and balanced environment.
       
       ```
       Processed 1804099 files, 755845 regular extents (1295281 refs), 980697 inline.
       Type       Perc     Disk Usage   Uncompressed Referenced
       TOTAL       93%      429G         459G         392G
       none       100%      414G         414G         332G
       zstd        34%       15G          45G          59G
       prealloc   100%       92M          92M          91M
       ```
       
       The saving due to compression is 30 GB (459 GB of data stored in
       429 GB on disk), but this only accounts for 7% of the whole file
       system.  That's not impressive compared to the other computer,
       but having an extra 30 GB for free is clearly something I enjoy.