[HN Gopher] Zstandard v1.5.0
___________________________________________________________________
Zstandard v1.5.0
Author : ascom
Score : 92 points
Date : 2021-05-14 16:11 UTC (6 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| aidenn0 wrote:
| Zstd is so much better than the commonly-used alternatives that
| I get mildly annoyed when given a .tar.{gz,xz,bz2}. It's not
| like it's a huge deal, but a much smaller file (compared to gz)
| or a similarly sized one with much faster decompression
| (compared to xz, bz2) just makes me a tiny bit happier.
| xxpor wrote:
| The only problem I have with it is that when I first heard
| about it I thought it was another name for the old-school
| .Z/compress algorithm.
| infogulch wrote:
| Your comment made me curious what a Zstandard-compressed tar
| file's extension would be, and apparently it's .tar.zst
| apendleton wrote:
| I agree with the general premise that there's no reason to ever
| use gzip anymore (unless you're in an environment where you
| can't install stuff), but interestingly my experience with the
| tradeoffs is apparently not the same as yours. I tend to find
| that zstd and gzip give pretty similar compression ratios for
| the things I work with, but that zstd is way faster, and that
| xz offers better compression ratios than either, but is slow.
| So my personal decision matrix is: if I really care about
| compression, use xz; if I want pretty good compression and
| great speed -- that is, if before I would have used gzip -- use
| zstd; and if I really want the fastest possible speed and can
| give up some compression, use lz4.
| jopsen wrote:
| Most of the time you also care about ease of use and
| compatibility.
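The gzip/xz/lz4 decision matrix described above is easy to get a feel for with Python's standard library, which ships the same codec families under zlib (the gzip codec), bz2, and lzma (the xz codec); zstd itself has no stdlib binding, so it is left out. This is a rough sketch, not a benchmark -- the sample data is an illustrative assumption and real ratios depend heavily on the input:

```python
import bz2
import lzma
import time
import zlib

# Moderately repetitive sample data (an assumption chosen for
# illustration; real-world ratios vary a lot with the input).
data = b"the quick brown fox jumps over the lazy dog\n" * 20_000

codecs = [
    ("zlib (gzip codec)", lambda d: zlib.compress(d, 6)),
    ("bz2 (bzip2)", lambda d: bz2.compress(d, 9)),
    ("lzma (xz codec)", lambda d: lzma.compress(d)),
]

for name, compress in codecs:
    start = time.perf_counter()
    out = compress(data)
    elapsed_ms = (time.perf_counter() - start) * 1000
    # Higher ratio = smaller output; lower time = faster codec.
    print(f"{name:18s} ratio {len(data) / len(out):8.1f}:1  {elapsed_ms:7.1f} ms")
```

On input like this, lzma typically produces the smallest output but takes the longest, which matches the "xz for ratio, something faster for speed" tradeoff in the comment.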
| apendleton wrote:
| Maybe in a generic-you sense ("one also cares"), but if by
| "you" you mean me, no: most of my compression needs are in
| situations where I control both the compression and
| decompression sides of the interaction, e.g., deciding how to
| store business data at rest on S3, and weighing the tradeoffs
| between cost of space, download time, and decompression
| time/CPU use. We migrated a bunch of workflows at my last job
| from gzip to either lz4 or zstd to take advantage of better
| tradeoffs there, and if I were building a similar pipeline from
| scratch now, gzip would not be a contender. Adding an extra
| dependency to my application is pretty trivial in exchange for
| shaving ten minutes' worth of download and decompression time
| off of every CI run.
| aidenn0 wrote:
| A few comments:
|
| 1. There are two speeds: compression and decompression; lz4
| only beats zstd when decompressing ("zstd -1" will compress
| faster than lz4, and you can crank that up several levels and
| still beat lz4_hc on compression). bzip2 is actually fairly
| competitive at compression for the ratios it achieves, but
| loses badly at decompression.
|
| 2. "zstd --ultra -22" gives nearly identical compression to xz
| on a corpus I just tested (an old Gentoo distfiles snapshot)
| while decompressing much faster (I didn't compare compression
| speeds because the files were already xz-compressed).
|
| [edit]
|
| Arch Linux (which likely tested a larger corpus than I did)
| reported a 0.8% regression in size when switching from xz to
| zstd at compression level 20. This supports your assertion that
| xz will beat zstd in compression ratio.
|
| [edit2]
|
| bzip2 accidentally[1] outperforms all other compression
| algorithms I've tried, handily, on large files that are all
| zeros; for example, 1GB of zeroes compressed with "dd
| if=/dev/zero bs=$((1024*1024)) count=1024 | bzip2 -9 >
| foo.bz2" generates a file that is only 785 bytes. zstd is 33k
| and xz is 153k.
| Of course my non-codegolfed script for generating 1GB of zeros
| is only 38 bytes...
|
| 1: There was a bug in the original BWT implementation that had
| degenerate performance on long runs of identical bytes, so
| bzip2 includes an RLE pass before the BWT.
| markdog12 wrote:
| Good blog post on zstd:
| https://gregoryszorc.com/blog/2017/03/07/better-compression-...
| greatgoat420 wrote:
| > Single file Libs
| > This move reflects a commitment on our part to support this
| > tool and this pattern of using zstd going forward.
|
| I love that they are moving toward supporting an amalgamation
| build. I and many others reach for SQLite because of this
| feature, and I think this will really increase the adoption of
| Zstd.
| felixhandte wrote:
| Glad to hear it! It's a pretty hefty single file, so it
| probably won't be qualifying for
| https://github.com/nothings/single_file_libs anytime soon...
| but hopefully people find it useful nonetheless.
| rektide wrote:
| When can we bring this to the web? Zstd, aka RFC 8478[1], is so
| good. That it can continue to improve at all feels almost
| unbelievable, but @Cyan4973 et al. continue to make it faster,
| somehow.
|
| Especially on mobile, with large assets, I feel like zstd's
| lightning-fast decompression time could be a huge win. It used
| to be that Brotli was the obvious choice for achieving high
| compression, but it doesn't feel so clear to me now. Here's one
| random run-off between the two[2].
|
| The other obvious use case is large-ish dynamic-ish data, where
| the cost of doing Brotli compression each time might be too
| high.
|
| [1] https://datatracker.ietf.org/doc/html/rfc8478
|
| [2] https://peazip.github.io/fast-compression-benchmark-
| brotli-z...
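aidenn0's all-zeros experiment above is easy to reproduce with the stdlib bindings for the same codecs. A minimal sketch, assuming a 16 MiB input scaled down from the 1 GB in the comment purely to keep the demo quick (so the absolute sizes differ from the 785 B / 153 k figures quoted there):

```python
import bz2
import lzma
import zlib

# 16 MiB of zeros -- scaled down from the comment's 1 GB so this
# runs in a few seconds rather than minutes.
zeros = bytes(16 * 1024 * 1024)

bz2_size = len(bz2.compress(zeros, 9))
xz_size = len(lzma.compress(zeros))
zlib_size = len(zlib.compress(zeros, 9))

# bzip2's RLE pre-pass (added to dodge the degenerate-BWT bug in
# footnote [1]) collapses the runs before the BWT ever sees them,
# so bzip2 wins handily on this pathological input.
print(f"bzip2: {bz2_size} B, xz: {xz_size} B, zlib: {zlib_size} B")
```

The bzip2 output comes out smallest by a wide margin, matching the ordering in the comment, even though bzip2 loses to xz and zstd on almost any realistic input.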
| ac29 wrote:
| Caddy supports Zstd encoding:
| https://caddyserver.com/docs/caddyfile/directives/encode
|
| On the client end, curl does:
| https://curl.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
| meltedcapacitor wrote:
| Brotli is such an ugly hack (a hardcoded dictionary with a
| snapshot of the world as it looked from Mountain View on some
| random day...); the quicker it dies, the better.
| nvllsvm wrote:
| It's also impossible to identify whether an arbitrary file is
| compressed with Brotli. It lacks a magic number.
| https://github.com/google/brotli/issues/298
| rubyist5eva wrote:
| Is this the same "zstd" compression used in Fedora's btrfs
| transparent block-level compression? I have been thoroughly
| impressed with it in Fedora 34. If that's true, I had no idea
| that it was a Facebook project. Color me shocked.
| gliptic wrote:
| It wasn't originally. Facebook hired Yann Collet well after
| zstd was a working thing.
| terrelln wrote:
| Yeah, it is.
|
| The Linux kernel is currently using zstd-1.3.1, and I'm working
| on getting it updated to the latest zstd version.
| post-factum wrote:
| Looking forward to having modern zstd in-kernel! Thanks for
| your efforts.
| ipsum2 wrote:
| According to https://en.wikipedia.org/wiki/Btrfs the core
| developers of btrfs work at Facebook.
|
| > In June 2012, Chris Mason left Oracle for Fusion-io, which he
| > left a year later with Josef Bacik to join Facebook. While at
| > both companies, Mason continued his work on Btrfs.[27][17]
| greatgoat420 wrote:
| Facebook actually does a decent amount of work on Fedora, and
| was even part of the force behind making btrfs the default.
| sudeepj wrote:
| zstandard continues to amaze me. Compared to zlib (level=4, I
| think) it seems to have the best of both worlds (good speed and
| a comparable compression ratio).
___________________________________________________________________
(page generated 2021-05-14 23:00 UTC)