[HN Gopher] Zstandard RFC 8878
       ___________________________________________________________________
        
       Zstandard RFC 8878
        
       Author : itroot
       Score  : 63 points
       Date   : 2021-11-19 19:41 UTC (3 hours ago)
        
 (HTM) web link (datatracker.ietf.org)
 (TXT) w3m dump (datatracker.ietf.org)
        
       | buryat wrote:
       | i forgive facebook all their abuses just because they gave us
       | zstd
        
         | metafex wrote:
         | they didn't though. zstd has been around even before the main
         | dev joined fb, i distinctly remember it being under the
         | person's personal github name.
        
         | kzrdude wrote:
         | You could read about lz4 and then later zstd on
         | http://fastcompression.blogspot.com/ long before he joined
         | facebook.
        
         | oofbey wrote:
         | I think you should read more about Facebook. Try e.g. the
         | Damian Collins email dump, and read about how their Android app
         | tricked people into letting it collect all their phone call and
         | text message records, knowing full well users would hate it if
         | they found out.
         | 
         | Clearly they produce good technology. But the company is
         | morally bankrupt.
        
           | Y_Y wrote:
           | The PDF of the information I assume you're referring to
           | is: https://www.parliament.uk/documents/commons-
           | committees/cultu...
           | 
           | Here is a quote from the summary, but I could not find where
           | it was substantiated in the 250-page document:
           | 
           | > Facebook knew that the changes to its policies on the
           | Android mobile phone system, which enabled the Facebook app
           | to collect a record of calls and texts sent by the user would
           | be controversial. To mitigate any bad PR, Facebook planned to
           | make it as hard of possible for users to know that this was
           | one of the underlying features of the upgrade of their app.
        
       | jeffbee wrote:
       | Does this mean the Zstd magic number is now cast in stone?
        
         | wmf wrote:
         | The file format was finalized years ago, so yes.
        
         | cornstalks wrote:
         | It's an Informational RFC, not a Standards Track RFC
         | (https://en.wikipedia.org/wiki/Request_for_Comments#Status).
         | That said, I think the magic number is pretty firmly
         | established.
        
         | lifthrasiir wrote:
         | You may have mistaken Brotli (whose file format has no magic
         | number and prevents an easy identification) with Zstandard
         | (whose file format does have defined magic numbers 28 B5 2F FD
         | or [50-5F] 2A 4D 18).
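
A minimal sketch of how the two byte patterns quoted above could be used to
identify a file, assuming only what the comment states: a Zstandard frame
starts with 28 B5 2F FD, and a skippable frame starts with a byte in 50..5F
followed by 2A 4D 18. The function name and return labels are hypothetical.

```python
# Magic bytes as quoted in the comment above (on-disk byte order).
ZSTD_FRAME_MAGIC = bytes.fromhex("28B52FFD")
SKIPPABLE_TAIL = bytes.fromhex("2A4D18")


def classify_zstd_magic(header: bytes) -> str:
    """Classify the first four bytes of a file (hypothetical helper).

    Returns "zstd" for a Zstandard frame, "skippable" for a skippable
    frame, and "unknown" for anything else, including short input.
    """
    if len(header) < 4:
        return "unknown"
    magic = header[:4]
    if magic == ZSTD_FRAME_MAGIC:
        return "zstd"
    # First byte may be anything in 0x50..0x5F; the other three are fixed.
    if 0x50 <= magic[0] <= 0x5F and magic[1:] == SKIPPABLE_TAIL:
        return "skippable"
    return "unknown"
```

For a real file, pass the first four bytes read in binary mode, e.g.
classify_zstd_magic(open(path, "rb").read(4)).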
        
           | jeffbee wrote:
           | No I'm not confused, it's just that the Zstd magic number has
           | had _8 different values_ over the years, so I'm just
           | wondering if we're past that yet.
        
             | lifthrasiir wrote:
             | Ah sure. The wire format has been fixed since 0.8.0
             | (2016-08), so you must have seen a very early phase of
             | development (which took one full year).
        
         | stouset wrote:
         | Can you shed some light on why this might be something of
         | concern?
        
       | ggm wrote:
       | It's said to be a good fit for ZFS. I tend to use lz4 because it's
       | baked into the older systems I use, but it may be at a point
       | where my default should be zstd.
       | 
       | bz2/gz still predominate for compressed objects in filestore
       | from what I can see.
        
         | [deleted]
        
       | thriftwy wrote:
       | Zstandard has a very cool dictionary training feature, which lets
       | you keep a separate dictionary and get a 50% compression ratio on
       | very small (~100 bytes) but repetitive data such as database
       | records.
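
A rough sketch of why a separately stored dictionary helps on tiny records.
This is not zstd's dictionary training: it swaps in the preset-dictionary
support of Python's standard zlib module (the zdict parameter) as a stand-in,
with a hand-written dictionary instead of a trained one. The records and the
dictionary contents are made-up sample data.

```python
import zlib

# Hypothetical small, repetitive records (e.g. rows serialized as JSON).
RECORDS = [
    b'{"id": 1001, "status": "active", "country": "DE", "plan": "free"}',
    b'{"id": 1002, "status": "active", "country": "FR", "plan": "free"}',
    b'{"id": 1003, "status": "active", "country": "JP", "plan": "free"}',
]

# Hand-built shared dictionary; zstd would derive one from many samples.
SHARED_DICT = b'{"id": 1000, "status": "active", "country": "US", "plan": "free"}'


def compress(record: bytes, zdict: bytes = b"") -> bytes:
    """Compress one record, optionally primed with a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return comp.compress(record) + comp.flush()


def decompress(blob: bytes, zdict: bytes = b"") -> bytes:
    """Decompress one record; the same dictionary must be supplied again."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


for record in RECORDS:
    plain = compress(record)
    primed = compress(record, SHARED_DICT)
    print(len(record), len(plain), len(primed))
```

Each record is too short to contain much internal repetition, so compressing
it alone gains little; the dictionary supplies the shared context up front.
The dictionary has to be stored once and made available to every reader.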
        
       | kzrdude wrote:
       | by the way, zlib-ng also seems interesting. In the sense that
       | it's cleaning up and improving a very aged library
       | https://github.com/Dead2/zlib-ng
        
       | vlovich123 wrote:
       | Is there a reason zstd isn't popular for HTTP and only brotli and
       | gzip see adoption?
        
         | zinekeller wrote:
         | Because Facebook doesn't have a browser.
         | 
         | (But seriously, Mozilla engineers warned the Chrome team that
         | they were rushing the inclusion of Brotli, since the
         | compression wars were heating up. They still proceeded though,
         | which is unsurprising.)
        
           | lifthrasiir wrote:
           | While that might well be one reason, it should also be noted
           | that Zstandard optimizes for decompression speed with a
           | reasonable compression ratio while Brotli concentrates on the
           | compression ratio at the slight expense of speed (though it
           | is very hard to do a fair comparison). This is evident from
           | their defaults, where zstd uses a fairly low level (3 out of
           | -7..22) and Brotli uses the maximum level (11 out of 0..11).
           | But both have decompression speeds far exceeding 100 MB/s,
           | which is the practical limit for most Internet users, so
           | zstd's higher decompression speed wouldn't matter much in the
           | web context.
        
         | lifthrasiir wrote:
         | Brotli was arguably designed specifically for the web, because
         | it was originally used in the WOFF2 font format and also has a
         | large preset dictionary collected from the web (including
         | HTML, CSS and JS fragments). Zstandard had no such
         | consideration, and while it _could_ be as efficient as Brotli
         | with a suitable dictionary, it has less merit than Brotli in
         | the web context.
        
         | duskwuff wrote:
         | Brotli has some pretty wild optimizations for web content,
         | including a gigantic (~120 KB) predefined dictionary packed
         | full of sequences commonly used in HTML/JS/CSS content. This
         | gives it a huge advantage on small text files.
        
         | jhgb wrote:
         | I assume it's because it's very new? That would seem like an
         | obvious explanation.
        
           | wolf550e wrote:
           | zstd is from 2015.
        
             | loeg wrote:
             | For comparison, brotli is from 2013.
        
       | erichocean wrote:
       | Does Zstandard still have the junk Facebook license attached to
       | it?
        
         | lifthrasiir wrote:
         | Not since 1.3.1.
        
           | erichocean wrote:
           | Thanks!
        
       | felixhandte wrote:
       | As someone on the Zstd team, I'm always happy to see it on HN!
       | I'm curious though what motivates the submission?
        
         | thewakalix wrote:
         | Probably its use in elfshaker[0].
         | 
         | [0] https://news.ycombinator.com/item?id=29277779
        
         | kzrdude wrote:
         | Zstd is always interesting.
         | 
         | For many applications (file formats), ubiquity is important, so
         | it would be fun if zstd becomes ubiquitous and can be relied on
         | to be available. Let's say for example in future versions of
         | HDF (HDF5 or later).
        
       | m0zg wrote:
       | Zstd is an amazing bit of work and all I ever use for data
       | compression nowadays (or LZ4 when speed is even more critical).
       | Several times the compression/decompression speed of gzip,
       | approximately the same compression ratio with default settings.
       | 
       | It's also supported by tar in recent Linux distros, if zstd is
       | installed, so "tar acf blah.tar.zst *" works fine, and "tar xf
       | blah.tar.zst" works automatically as well. Give it a try, folks,
       | and retire gzip shortly afterwards.
        
         | nigeltao wrote:
         | > Several times the compression/decompression speed of gzip
         | 
         | Just be careful that you're comparing against the best
         | implementation of gzip. One recent re-implementation of zcat
         | was 3.1x faster than /bin/zcat (and the CRC-32 implementation
         | within was 7.3x faster than /bin/crc32). Both programs decode
         | exactly the same file format. They're just different
         | implementations. For details, see:
         | https://nigeltao.github.io/blog/2021/fastest-safest-png-deco...
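
A small sketch of the kind of measurement this caveat applies to: timing one
particular gzip implementation (here Python's standard gzip module) on one
synthetic payload. The helper name, the payload and the repeat count are
assumptions for illustration; the number it prints describes only this
implementation on this machine, not the gzip format itself.

```python
import gzip
import time


def decompress_throughput_mb_s(decompress, blob: bytes, repeats: int = 5) -> float:
    """Best observed throughput, in MB of decompressed output per second."""
    best = float("inf")
    size = 0
    for _ in range(repeats):
        start = time.perf_counter()
        out = decompress(blob)
        elapsed = time.perf_counter() - start
        size = len(out)
        best = min(best, elapsed)
    # Guard against a zero reading from a coarse timer.
    return size / max(best, 1e-9) / 1e6


# Synthetic, highly repetitive payload (about 9 MB); real data will differ.
payload = b"The quick brown fox jumps over the lazy dog. " * 200_000
blob = gzip.compress(payload)
rate = decompress_throughput_mb_s(gzip.decompress, blob)
print(f"{rate:.0f} MB/s")
```

Any other decoder for the same format can be passed in as the decompress
argument, which keeps the comparison on identical input.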
        
         | mjevans wrote:
         | I get why someone might want to avoid .zstd; but that is the
         | short name offered for humans.
         | 
         | Was .zs not sufficient if a file format ending in 'std' is so
         | abhorrent?
        
           | m0zg wrote:
           | I'm not the one who came up with the extension. It just sort
           | of organically happened I guess. I'd prefer "zstd" myself,
           | but, frankly, "zst" is fine as well.
        
             | diroussel wrote:
             | Four-letter extensions work really well for .java and
             | .json. It seems strange to abbreviate zstd any further.
        
               | m0zg wrote:
               | I'd like to point out though that "zstd" is itself an
               | abbreviation, and "zstandard" would be quite onerous.
        
       ___________________________________________________________________
       (page generated 2021-11-19 23:01 UTC)