[HN Gopher] Zstandard RFC 8878
___________________________________________________________________
Zstandard RFC 8878

Author : itroot
Score  : 63 points
Date   : 2021-11-19 19:41 UTC (3 hours ago)

(HTM) web link (datatracker.ietf.org)
(TXT) w3m dump (datatracker.ietf.org)

| buryat wrote:
| i forgive facebook all their abuses just because they gave us zstd
| metafex wrote:
| they didn't though. zstd has been around even before the main dev joined fb, i distinctly remember it being under the person's personal github name.
| kzrdude wrote:
| You could read about lz4 and then later zstd on http://fastcompression.blogspot.com/ long before he joined facebook.
| oofbey wrote:
| I think you should read more about Facebook. Try e.g. the Damian Collins email dump, and read about how their android app tricked people into letting it record all phone call and text message records, knowing full well users would hate it if they found out.
|
| Clearly they produce good technology. But the company is morally bankrupt.
| Y_Y wrote:
| The PDF of the information I assume you're referring to is: https://www.parliament.uk/documents/commons-committees/cultu...
|
| Here is a quote from the summary, but I could not find where it was substantiated in the 250-page document:
|
| > Facebook knew that the changes to its policies on the Android mobile phone system, which enabled the Facebook app to collect a record of calls and texts sent by the user would be controversial. To mitigate any bad PR, Facebook planned to make it as hard of possible for users to know that this was one of the underlying features of the upgrade of their app.
| jeffbee wrote:
| Does this mean the Zstd magic number is now cast in stone?
| wmf wrote:
| The file format was finalized years ago, so yes.
| cornstalks wrote:
| It's an Informational RFC, not a Standards Track RFC (https://en.wikipedia.org/wiki/Request_for_Comments#Status).
| That said, I think the magic number is pretty firmly established.
| lifthrasiir wrote:
| You may have mistaken Brotli (whose file format has no magic number, which prevents easy identification) for Zstandard (whose file format does have defined magic numbers: 28 B5 2F FD or [50-5F] 2A 4D 18).
| jeffbee wrote:
| No I'm not confused, it's just that the Zstd magic number has had _8 different values_ over the years, so I'm just wondering if we're past that yet.
| lifthrasiir wrote:
| Ah sure. The wire format has been fixed since 0.8.0 (2016-08), so you must have seen a very early phase of development (which took one full year).
| stouset wrote:
| Can you shed some light on why this might be something of concern?
| ggm wrote:
| It's said to be a good fit for ZFS. I tend to use lz4 because it's baked into the older systems I use, but it may be at a point where my default should be zstd.
|
| bz2/gz still predominates for compressed objects in filestore from what I can see.
| [deleted]
| thriftwy wrote:
| Zstandard has a very cool dictionary training feature, which lets you keep a separate dictionary and get a 50% compression ratio on very small (~100b) but repetitive data such as database records.
| kzrdude wrote:
| by the way, zlib-ng also seems interesting. In the sense that it's cleaning up and improving a very aged library https://github.com/Dead2/zlib-ng
| vlovich123 wrote:
| Is there a reason zstd isn't popular for HTTP and only brotli and gzip see adoption?
| zinekeller wrote:
| Because Facebook doesn't have a browser.
|
| (But seriously, Mozilla engineers warned the Chrome team that they were rushing the inclusion of Brotli, since the compression wars were heating up. They still proceeded though, which is unsurprising.)
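A minimal sketch of the magic-number check lifthrasiir describes above, assuming the byte sequences quoted in the thread (28 B5 2F FD for a Zstandard frame, [50-5F] 2A 4D 18 for a skippable frame). The function name `classify_magic` is hypothetical and not part of any zstd library; the sketch only looks at the first four bytes and does not validate the rest of the frame.

```python
# Magic bytes as quoted in the thread (little-endian on-disk order).
ZSTD_FRAME_MAGIC = bytes.fromhex("28b52ffd")
SKIPPABLE_MAGIC_TAIL = bytes.fromhex("2a4d18")


def classify_magic(header: bytes) -> str:
    """Classify a buffer by its first four bytes (hypothetical helper)."""
    if len(header) < 4:
        return "unknown"
    if header[:4] == ZSTD_FRAME_MAGIC:
        return "zstd"
    # Skippable frames: first byte in 0x50..0x5F, followed by 2A 4D 18.
    if 0x50 <= header[0] <= 0x5F and header[1:4] == SKIPPABLE_MAGIC_TAIL:
        return "skippable"
    return "unknown"
```

To check a file, read its first four bytes, e.g. `classify_magic(open(path, "rb").read(4))`, where `path` is whatever file you want to inspect. A gzip header (1F 8B ...) falls through to "unknown".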
| lifthrasiir wrote:
| While that might well be one reason, it should also be noted that Zstandard optimizes for decompression speed with a reasonable compression ratio, while Brotli concentrates on the compression ratio at a slight expense of speed (though it is very hard to do a fair comparison). This is evident from their defaults, where zstd uses a fairly low level (3 out of -7..22) and Brotli uses the maximum level (11 out of 1..11). But both have decompression speeds far exceeding 100 MB/s, which is the practical limit for most Internet users, so zstd's higher decompression speed wouldn't matter much in the web context.
| lifthrasiir wrote:
| Brotli was arguably designed specifically for the web, because it was originally used in the WOFF2 font format and also had a large preset dictionary collected from the web (including HTML, CSS and JS fragments). Zstandard had no such consideration, and while it _could_ be as efficient as Brotli with a correct dictionary, it has less merit than Brotli in the web context.
| duskwuff wrote:
| Brotli has some pretty wild optimizations for web content, including a gigantic (~120 KB) predefined dictionary packed full of sequences commonly used in HTML/JS/CSS content. This gives it a huge advantage on small text files.
| jhgb wrote:
| I assume it's because it's very new? That would seem like an obvious explanation.
| wolf550e wrote:
| zstd is from 2015.
| loeg wrote:
| For comparison, brotli is from 2013.
| erichocean wrote:
| Does Zstandard still have the junk Facebook license attached to it?
| lifthrasiir wrote:
| Not since 1.3.1.
| erichocean wrote:
| Thanks!
| felixhandte wrote:
| As someone on the Zstd team, I'm always happy to see it on HN! I'm curious though what motivates the submission?
| thewakalix wrote:
| Probably its use in elfshaker[0].
|
| [0] https://news.ycombinator.com/item?id=29277779
| kzrdude wrote:
| Zstd is always interesting.
|
| For many applications (file formats), ubiquity is important, so it would be fun if zstd becomes ubiquitous and can be relied on to be available. Let's say for example in future versions of HDF (HDF5 or later).
| m0zg wrote:
| Zstd is an amazing bit of work and all I ever use for data compression nowadays (or LZ4 when speed is even more critical). Several times the compression/decompression speed of gzip, approximately the same compression ratio with default settings.
|
| It's also supported by tar in recent Linux distros, if zstd is installed, so "tar acf blah.tar.zst *" works fine, and "tar xf blah.tar.zst" works automatically as well. Give it a try, folks, and retire gzip shortly afterwards.
| nigeltao wrote:
| > Several times the compression/decompression speed of gzip
|
| Just be careful that you're comparing against the best implementation of gzip. One recent re-implementation of zcat was 3.1x faster than /bin/zcat (and the CRC-32 implementation within was 7.3x faster than /bin/crc32). Both programs decode exactly the same file format. They're just different implementations. For details, see: https://nigeltao.github.io/blog/2021/fastest-safest-png-deco...
| mjevans wrote:
| I get why someone might want to avoid .zstd; but that is the short name offered for humans.
|
| Was .zs not sufficient if a file format ending in 'std' is so abhorrent?
| m0zg wrote:
| I'm not the one who came up with the extension. It just sort of organically happened I guess. I'd prefer "zstd" myself, but, frankly, "zst" is fine as well.
| diroussel wrote:
| Four letter extensions work really well for .java and .json. It seems strange to abbreviate zstd any further.
| m0zg wrote:
| I'd like to point out though that "zstd" is itself an abbreviation, and "zstandard" would be quite onerous.
___________________________________________________________________
(page generated 2021-11-19 23:01 UTC)