[HN Gopher] Zstandard v1.5.0
       ___________________________________________________________________
        
       Zstandard v1.5.0
        
       Author : ascom
       Score  : 92 points
       Date   : 2021-05-14 16:11 UTC (6 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | aidenn0 wrote:
        | Zstd is so much better than the commonly-used alternatives
        | that I get mildly annoyed when given a .tar.{gz,xz,bz2}. It's
        | not like it's a huge deal, but a much smaller file (compared
        | to gz), or a similarly sized one with much faster
        | decompression (compared to xz, bz2), just makes me a tiny
        | bit happier.
        
         | xxpor wrote:
          | The only problem I have with it is that when I first heard
          | about it, I thought it was another name for the old-school
          | .Z/compress algorithm.
        
         | infogulch wrote:
         | Your comment made me curious what a Zstandard-compressed tar
         | file's extension would be, and apparently it's .tar.zst
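          | 
          | For reference, a minimal sketch of round-tripping one
          | (assuming GNU tar 1.31+ for the built-in --zstd flag;
          | older tars can pipe through the zstd CLI instead):
          | 
          |   tar --zstd -cf archive.tar.zst some-dir/   # create
          |   tar --zstd -xf archive.tar.zst             # extract
          |   # equivalent pipe form for older tar versions:
          |   tar -cf - some-dir/ | zstd > archive.tar.zst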
        
         | apendleton wrote:
         | I agree with the general premise that there's no reason to ever
         | use gzip anymore (unless you're in an environment where you
         | can't install stuff), but interestingly my experience with the
         | tradeoffs is apparently not the same as yours. I tend to find
         | that zstd and gzip give pretty similar compression ratios for
         | the things I tend to work with, but that zstd is way faster,
         | and that xz offers better compression ratios than either, but
         | is slow. So like, my personal decision matrix is "if I really
         | care about compression, use xz; if I want pretty good
         | compression and great speed -- that is, if before I would have
         | used gzip -- use zstd; and if I really want the fastest
         | possible speed and can give up some compression, use lz4."
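          | 
          | Concretely, one command per cell of that matrix (the file
          | name and levels are just illustrative defaults, not tuned
          | recommendations):
          | 
          |   xz -9 -k data.tar   # best ratio, slowest
          |   zstd -3 data.tar    # zstd's default level: good ratio, fast
          |   lz4 data.tar        # fastest, lowest ratio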
        
           | jopsen wrote:
           | Most of the time you also care about ease of use and
           | compatibility.
        
             | apendleton wrote:
             | Maybe in a generic-you sense ("one also cares"), but if by
             | "you" you mean me, no, most of my compression needs are in
             | situations where I control both the compression and
             | decompression sides of the interaction, e.g., deciding how
             | to store business data at rest on s3, and debating the
             | tradeoffs between cost of space, download time, and
             | decompression time/CPU use. We migrated a bunch of
             | workflows at my last job from gzip to either lz4 or zstd to
             | take advantage of better tradeoffs there, and if I were
             | building a similar pipeline from scratch now, gzip would
             | not be a contender. Adding an extra dependency to my
             | application is pretty trivial, in exchange for shaving ten
             | minutes' worth of download and decompression time off of
             | every CI run.
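              | 
              | A hypothetical sketch of that kind of pipeline swap
              | (bucket and paths made up; zstd -T0 uses all cores,
              | and aws s3 cp accepts "-" to stream stdin/stdout):
              | 
              |   tar -cf - dataset/ | zstd -T0 -10 \
              |     | aws s3 cp - s3://my-bucket/dataset.tar.zst
              |   aws s3 cp s3://my-bucket/dataset.tar.zst - \
              |     | zstd -d | tar -xf -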
        
           | aidenn0 wrote:
           | A few comments:
           | 
            | 1. There are two speeds: compression and decompression;
            | lz4 only beats zstd when decompressing ("zstd -1" will
            | compress faster than lz4, and you can crank the level up
            | several notches and still beat lz4_hc on compression).
            | bzip2 is actually fairly competitive at compression for
            | the ratios it achieves, but loses badly at decompression.
           | 
           | 2. "zstd --ultra -22" is nearly identical compression to xz
           | on a corpus I just tested (an old gentoo distfiles snapshot)
           | while decompressing much faster (I didn't compare compression
           | speeds because the files were already xz compressed).
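            | 
            | For anyone who wants to check both points on their own
            | data, the zstd and lz4 CLIs ship benchmark modes (file
            | name and level range here are placeholders):
            | 
            |   zstd -b1 -e19 sample.tar   # compress + decompress
            |                              # speed at each level
            |   lz4 -b1 sample.tar         # lz4's equivalent
            |   zstd --ultra -22 sample.tar && xz -9 -k sample.tar
            |   ls -l sample.tar.zst sample.tar.xz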
           | 
           | [edit]
           | 
            | Arch Linux (which likely tested a larger corpus than I
            | did) reported a 0.8% regression in size when switching
            | from xz to zstd at compression level 20. This supports
            | your assertion that xz will beat zstd in compression
            | ratio.
           | 
           | [edit2]
           | 
            | bzip2 accidentally[1] handily outperforms every other
            | compression algorithm I've tried on large files that are
            | all zeros; for example, 1GB of zeros compressed with "dd
            | if=/dev/zero bs=$((1024*1024)) count=1024 | bzip2 -9 >
            | foo.bz2" yields a file of only 785 bytes, versus 33K for
            | zstd and 153K for xz. Of course, my non-code-golfed
            | script for generating 1GB of zeros is only 38 bytes...
           | 
           | 1: There was a bug in the original BWT implementation that
           | had degenerate performance on long strings of identical
           | bytes, so bzip2 includes an RLE pass before the BWT.
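            | 
            | To reproduce the all-zeros comparison (the sizes in the
            | comments are the ones quoted above; the zstd and xz
            | levels weren't stated, so defaults are assumed):
            | 
            |   dd if=/dev/zero bs=$((1024*1024)) count=1024 > zeros
            |   bzip2 -9 -k zeros   # ~785 bytes
            |   zstd zeros          # ~33K
            |   xz -k zeros         # ~153K
            |   ls -l zeros.*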
        
       | markdog12 wrote:
       | Good blog post on zstd:
       | https://gregoryszorc.com/blog/2017/03/07/better-compression-...
        
       | greatgoat420 wrote:
        | > Single file Libs
        | 
        | > This move reflects a commitment on our part to support
        | this tool and this pattern of using zstd going forward.
       | 
       | I love that they are moving toward supporting an amalgamation
       | build. I and many others reach for SQLite because of this
       | feature, and I think this will really increase the adoption of
       | Zstd.
        
         | felixhandte wrote:
         | Glad to hear it! It's a pretty hefty single file, so it
         | probably won't be qualifying for
         | https://github.com/nothings/single_file_libs anytime soon...
         | but hopefully people find it useful nonetheless.
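          | 
          | Rough usage sketch, assuming the v1.5.0 repo layout (the
          | amalgamation scripts live under build/single_file_libs):
          | 
          |   cd build/single_file_libs
          |   ./create_single_file_library.sh   # emits zstd.c
          |   cc -c zstd.c    # compiles standalone, sqlite3.c-style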
        
       | rektide wrote:
        | When can we bring this to the web? Zstd, aka RFC 8478[1], is
        | so good. That it can continue to improve at all feels almost
        | unbelievable, but @Cyan4973 et al. keep making it faster
        | somehow.
       | 
        | Especially on mobile, with large assets, I feel like zstd's
        | lightning-fast decompression could be a huge win. Brotli
        | used to be the obvious choice for high compression, but that
        | doesn't feel so clear-cut to me now. Here's one random
        | head-to-head between the two[2].
       | 
       | The other obvious use case is if there is large-ish dynamic-ish
       | data, where the cost of doing Brotli compression each time might
       | be too high.
       | 
       | [1] https://datatracker.ietf.org/doc/html/rfc8478
       | 
       | [2] https://peazip.github.io/fast-compression-benchmark-
       | brotli-z...
        
         | ac29 wrote:
         | Caddy supports Zstd encoding:
         | https://caddyserver.com/docs/caddyfile/directives/encode
         | 
         | On the client end, curl does:
         | https://curl.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
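          | 
          | For example (the Caddyfile line is the documented encode
          | directive; the curl invocation assumes a build with zstd
          | enabled):
          | 
          |   # Caddyfile: prefer zstd, fall back to gzip
          |   #   encode zstd gzip
          |   # curl --compressed offers every codec it was built
          |   # with (zstd included) and decodes transparently:
          |   curl --compressed -sv https://example.com/ -o /dev/null \
          |     2>&1 | grep -i encoding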
        
         | meltedcapacitor wrote:
          | Brotli is such an ugly hack (a hardcoded dictionary with a
          | snapshot of the world as it looked from Mountain View on
          | some random day...); the quicker it dies, the better.
        
           | nvllsvm wrote:
           | It's also impossible to identify whether an arbitrary file is
           | compressed with brotli. It lacks a magic number.
           | https://github.com/google/brotli/issues/298
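            | 
            | zstd, by contrast, starts every frame with the 4-byte
            | magic 0xFD2FB528 (stored little-endian), so a quick
            | header check identifies it:
            | 
            |   printf hello | zstd | head -c4 | xxd
            |   # 00000000: 28b5 2ffd ...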
        
       | rubyist5eva wrote:
       | Is this the same "zstd" compression used in Fedora's btrfs
       | transparent block level compression? I have been thoroughly
       | impressed with it in Fedora 34. If that's true, I had no idea
       | that it was a Facebook project. Color me shocked.
        
         | gliptic wrote:
         | It wasn't originally. Facebook hired Yann Collet well after
         | zstd was a working thing.
        
         | terrelln wrote:
         | Yeah, it is.
         | 
         | The Linux kernel is currently using zstd-1.3.1, and I'm working
         | on getting it updated to the latest zstd version.
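          | 
          | For anyone wanting to try it, a sketch of how btrfs
          | exposes this (zstd support needs kernel 4.14+; per-level
          | selection like zstd:3 needs 5.1+):
          | 
          |   mount -o compress=zstd:3 /dev/sdX /mnt
          |   # or on an already-mounted filesystem:
          |   mount -o remount,compress=zstd:3 /mnt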
        
           | post-factum wrote:
           | Looking forward to having modern zstd in-kernel! Thanks for
           | your efforts.
        
         | ipsum2 wrote:
          | According to https://en.wikipedia.org/wiki/Btrfs the core
          | developers of btrfs work at Facebook.
         | 
         | > In June 2012, Chris Mason left Oracle for Fusion-io, which he
         | left a year later with Josef Bacik to join Facebook. While at
         | both companies, Mason continued his work on Btrfs.[27][17]
        
         | greatgoat420 wrote:
         | Facebook actually does a decent amount of work on Fedora, and
         | were even part of the force behind using btrfs as the default.
        
       | sudeepj wrote:
        | zstandard continues to amaze me. Compared to zlib (level=4,
        | I think) it seems to offer the best of both worlds: good
        | speed and a comparable compression ratio.
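        | 
        | A quick way to sanity-check that on your own data (the
        | zlib level 4 guess maps to gzip -4 here; adjust to taste):
        | 
        |   time gzip -4 -c data > data.gz
        |   time zstd -3 -c data > data.zst
        |   ls -l data data.gz data.zst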
        
       ___________________________________________________________________
       (page generated 2021-05-14 23:00 UTC)