[HN Gopher] QOI - The "Quite OK Image Format" for fast, lossless...
       ___________________________________________________________________
        
       QOI - The "Quite OK Image Format" for fast, lossless image
       compression
        
       Author : JeanMo
       Score  : 98 points
       Date   : 2021-12-23 13:12 UTC (9 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | booi wrote:
        | Seems like they benchmarked it against libpng, which shows
        | anywhere from 3-5x faster decompression and 30-50x faster
        | compression. That's pretty impressive, and even though
        | libpng isn't the most performant of the png libraries, it's
        | by far the most common.
       | 
        | I think the rust png library is ~4x faster than libpng,
        | which could erase the decompression advantage, but that 50x
        | faster compression speed is extremely impressive.
       | 
       | Can anybody tell if there's any significant feature differentials
       | that might explain the difference (color space, pixel formats, ..
       | etc)?
        
         | sakras wrote:
         | I think fundamentally it's faster just because it's dead
         | simple. It's just a mash of RLE, dictionary encoding, and delta
         | encoding, and it does it all in a single pass. PNG has to break
         | things into chunks, apply a filter, deflate, etc.
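          | 
          | A toy single-pass encoder in that spirit, with
          | hypothetical opcode names and a made-up hash (not the
          | real QOI bitstream):

```python
# Toy QOI-style single-pass encode: RLE for repeated pixels, a small
# cache ("dictionary") of recently seen colors, small per-channel
# deltas, and a literal fallback. Opcodes and hash are illustrative.
def toy_encode(pixels):
    index = [None] * 64            # recently-seen-color cache
    prev = (0, 0, 0)               # encoder starts from a fixed pixel
    out, run = [], 0
    for px in pixels:
        if px == prev:             # RLE: extend the current run
            run += 1
            continue
        if run:
            out.append(("RUN", run))
            run = 0
        h = (px[0] * 3 + px[1] * 5 + px[2] * 7) % 64
        if index[h] == px:         # cache hit: emit the slot number
            out.append(("INDEX", h))
        else:
            dr, dg, db = (px[i] - prev[i] for i in range(3))
            if all(-2 <= d <= 1 for d in (dr, dg, db)):
                out.append(("DIFF", dr, dg, db))   # delta encoding
            else:
                out.append(("RGB", px))            # literal fallback
            index[h] = px
        prev = px
    if run:
        out.append(("RUN", run))
    return out
```

          | Everything is one linear scan with O(1) state, which is
          | why there's no chunking, filtering, or deflate stage to
          | pay for.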
        
       | corysama wrote:
       | I think QOI inspired the creation of
       | https://github.com/richgel999/fpng which creates standard PNGs
       | and compares itself directly to QOI.
        
       | cornstalks wrote:
       | A couple previous interesting discussions from this past month:
       | 
       | - "The QOI File Format Specification" 214 points | 3 days ago |
       | 54 comments: https://news.ycombinator.com/item?id=29625084
       | 
       | - "QOI: Lossless Image Compression in O(n) Time" 1057 points | 29
       | days ago | 293 comments:
       | https://news.ycombinator.com/item?id=29328750
        
       | zigzag312 wrote:
       | Is there any open source audio compression format like that?
       | Lossless and very fast. I haven't found any yet.
       | 
       | EDIT: I'm thinking about a format that would be suitable as a
       | replacement for uncompressed WAV files in DAWs. Rendered tracks
       | often have large sections of silence and uncompressed WAVs have
       | always seemed wasteful to me.
        
         | 323 wrote:
         | WavPack might fit the bill. It has decent software support. Not
         | sure if DAWs can use it natively, they might unpack it to a
         | temp folder.
         | 
         | https://www.wavpack.com/
        
           | zigzag312 wrote:
           | Reaper does. Unfortunately, WavPack has a bit too much
           | performance overhead.
        
             | 323 wrote:
             | It's a 20 year old format.
             | 
             | ZStandard is a very good compressor, with an especially
             | fast decompressor. Maybe someone should try using this
             | instead of zlib in an audio format (FLAC, WavPack, ...)
        
               | BlueSwordM wrote:
               | I mean, is there really a need for utilizing ZStd for
               | audio compression?
               | 
                | FLAC is extremely good at compressing audio, has
                | very fast encode and uber fast decode. It also
                | doesn't use zlib...
        
         | LeoPanthera wrote:
          | FLAC is always lossless, but has variable compression
          | levels, so you can trade compression for speed.
         | 
         | Using the command line "flac" tool, "flac -0" is the fastest,
         | "flac -8" is the slowest, but produces the smallest files.
         | 
         | In my experience, 0-2 all produce roughly equivalent sized
         | files, as do 4-8.
        
           | makapuf wrote:
            | I tried passing stereo wavs in 2 x 16 bits (4 bytes) as
            | rgba to qoi, but I haven't been very successful.
        
             | adgjlsfhk1 wrote:
             | That's not surprising. QOI is heavily optimized for images
             | which tend to be relatively continuous, while audio tends
             | to oscillate a ton.
        
         | dmitrygr wrote:
         | gzip -1 is lossless and fast. It will somewhat compress pcm
         | data :)
        
           | zigzag312 wrote:
            | You would lose the fast seeking ability with gzip. Or
            | am I mistaken?
        
             | jzwinck wrote:
             | You can only seek within a gzip file if you write it with
             | some number of Z_FULL_FLUSH points which are resumable. The
             | command line gzip program does not support this, but it's
             | easy using zlib. For example you might do a Z_FULL_FLUSH
             | roughly every 50 MB of compressed data. Then you can seek
             | to any byte in the file, search forward or backward for the
             | flush marker, and decompress forward from there as much as
             | you want. If your data is sorted or is a time series, it's
             | easy to implement binary search this way. And standard
             | gunzip etc will still be able to read the whole file as
             | normal.
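              | 
              | A sketch of that scheme with Python's zlib, using raw
              | deflate for brevity (the gzip container just adds a
              | header/trailer on top; the flush points work the same
              | way):

```python
import zlib

# Emit a Z_FULL_FLUSH after each chunk. A full flush both byte-aligns
# the stream and resets the compression dictionary, so decoding can
# restart cold at any recorded offset.
def compress_with_flush_points(chunks):
    co = zlib.compressobj(wbits=-15)   # raw deflate, no zlib header
    blob, offsets, pos = [], [], 0
    for chunk in chunks:
        offsets.append(pos)            # restart point for this chunk
        data = co.compress(chunk) + co.flush(zlib.Z_FULL_FLUSH)
        blob.append(data)
        pos += len(data)
    blob.append(co.flush(zlib.Z_FINISH))
    return b"".join(blob), offsets

blob, offs = compress_with_flush_points([b"a" * 1000, b"b" * 1000])
# Start decoding at the second flush point: no earlier bytes needed.
tail = zlib.decompressobj(wbits=-15).decompress(blob[offs[1]:])
assert tail == b"b" * 1000
```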
        
         | StreamBright wrote:
         | ALAC? FLAC? What is the problem with these?
        
           | zigzag312 wrote:
            | FLAC is limited to 24-bit depth. I was thinking of an
            | intermediate format suitable for use in DAWs and
            | samplers that also supports floating point to avoid
            | clipping.
        
             | LeoPanthera wrote:
             | 24-bit integer and 32-bit float have the same dynamic range
             | available, so you are not losing any fidelity.
             | 
             | However, frankly, if you're working professionally with
             | audio like that, the best solution is simply to have
             | sufficient disk space available to work with raw audio.
             | 
             | Use FLAC to compress the final product, when you are done.
        
               | zigzag312 wrote:
               | With 24-bit integer you are at risk of clipping.
               | 
               | EDIT: Floating point is useful while you are working to
               | avoid any accidental clipping. As an intermediate format,
               | like a ProRes for video. FLAC is great as a final format.
        
               | kloch wrote:
               | They have the same precision but float has vastly larger
               | dynamic range due to the 8-bit exponent. When normalized
               | and quantized for output this does result in roughly the
               | same effective dynamic range (depending on how much of
               | the integer range was originally used).
               | 
                | The issue is that audio is typically mixed close to
                | the maximum, so any processing step can easily lead
                | to clipping. One
               | solution is to use float or larger integers internally
               | during each processing step and normalize/convert back to
               | 24-bit integer to write to disk. Another (better imo)
               | option would be to do all intermediate steps and disk
               | saves in a floating point format and only
               | normalize/quantize for output once.
               | 
                | I haven't worked with professional audio in over 25
                | years (before everything went fully digital), but I
                | would be surprised if floating point formats were
                | not an option for encoding and intermediate
                | workflows. Many quantization steps seem like a bad
                | idea.
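                | 
                | A toy illustration of that headroom argument (made-
                | up numbers, pure Python):

```python
# Float intermediates keep signal that an integer path destroys: a
# mix pushed ~5 dB over full scale survives in float and can be
# normalized later; quantizing to 24-bit PCM clips irreversibly.
FULL_SCALE = 2**23 - 1   # max positive 24-bit signed PCM value

def to_int24(x):
    # Quantize a float sample (nominal -1.0..1.0) to 24-bit PCM,
    # clipping anything outside the representable range.
    v = round(x * FULL_SCALE)
    return max(-FULL_SCALE - 1, min(FULL_SCALE, v))

samples = (0.9, -0.7, 0.5)
boosted = [s * 2.0 for s in samples]    # hot processing step, peak 1.8
restored = [s / 2.0 for s in boosted]   # float path: gain undone
assert restored == list(samples)        # bit-exact, nothing lost
clipped = [to_int24(s) for s in boosted]
assert clipped[0] == FULL_SCALE         # int path: the peak is gone
```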
        
               | zigzag312 wrote:
               | > I would be surprised if floating point formats were not
               | an option for encoding and intermediate workflows.
               | 
                | For bouncing tracks to disk, uncompressed 32-bit
                | floating point formats are available, but I am not
                | aware of any fast losslessly compressed 32-bit
                | floating point format.
        
               | adzm wrote:
               | Most DAWs and plugins and audio interfaces nowadays use
               | floating point internally.
        
               | 323 wrote:
               | All professional audio production software these days
               | internally works with 32/64 bit floats. That's the native
               | format, because it allows you to go above 0 dBFS (maximum
               | level), as long as you go back below it at the end of the
               | chain.
        
             | StreamBright wrote:
             | Nice, I was not aware.
        
             | artiii wrote:
              | Check WavPack (32-bit PCM, floats, etc.), but it's
              | slower (not by much) than flac, offering slightly
              | better compression.
        
               | zigzag312 wrote:
               | WavPack seems a bit too slow already. 3x slower decode
               | compared to FLAC in this test
               | https://stsaz.github.io/fmedia/audio-formats/
        
         | wombatmobile wrote:
         | I'd also like to know what's the best (or any) lossless audio
         | compression process/tools.
         | 
         | My application is to send audio (podcast recordings) to a
         | remote audio engineer friend who will do the post processing,
         | then round trip it to me to complete the editing.
         | 
         | Wav is so big it makes a 1 hr podcast a difficult proposition.
         | 
          | MP3 is unsuitable because compression introduces too many
          | artefacts; the quality suffers unacceptably.
          | 
          | What do other people do in these circumstances?
        
           | phonon wrote:
           | 1 hour of CD quality mono FLAC encoded is about 100-150 MB.
           | Is that small enough?
        
           | selectodude wrote:
            | FLAC and ALAC can be losslessly converted back to WAV,
            | and cut the file size roughly in half.
        
       | FullyFunctional wrote:
        | Since we are rehashing this for the 3rd (4th?) time, I'll
        | repeat my (and apparently many others') key critique: there
        | is no thought at all given to enabling parallel decoding,
        | be it thread-parallel or SIMD (or both). That makes it very
        | much a past-millennium-style format that will age very
        | poorly.
       | 
       | At the very least, break it into chunks and add an offset
       | directory header. I'm sure one could do something much better,
       | but it's a start.
       | 
       | EDIT: typo
        
         | meltedcapacitor wrote:
          | A thread can scan the opcodes only to find cut-off points
          | and distribute the actual decoding to other cores. Surely
          | you can do that scan with some SIMD magic, and the
          | decoding threads too, without baking the properties of
          | today's SIMD into the encoding.
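          | 
          | A hedged sketch of that scan pass with a toy tag scheme
          | (not the real QOI wire format):

```python
# The first byte of each opcode determines its total length, so a
# scanner can hop through the stream finding byte-aligned split
# points without decoding any pixels; workers then decode the spans.
OP_LEN = {0: 5, 1: 2, 2: 2}   # toy tags: 0 = literal RGBA,
                              # 1 = run length, 2 = index slot

def find_split_points(stream, span=2):
    # Return byte offsets at roughly every `span` opcodes.
    splits, pos, n = [0], 0, 0
    while pos < len(stream):
        pos += OP_LEN[stream[pos]]
        n += 1
        if n % span == 0 and pos < len(stream):
            splits.append(pos)
    return splits

stream = bytes([0, 0, 0, 0, 0,   # literal pixel
                1, 3,            # run of 3
                2, 7,            # index slot 7
                0, 1, 1, 1, 1])  # literal pixel
assert find_split_points(stream) == [0, 7]
```

          | One caveat: QOI's color index and previous-pixel state
          | carry across the whole stream, so workers would also need
          | the starting state for each span (or the format would
          | have to reset state at cut points), plus prefix sums of
          | run lengths to know where each span lands in the output.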
        
         | stathibus wrote:
         | _Who cares_ that it 's not set up for simd?
         | 
         | Seriously, who?
         | 
         | This project is interesting because of how well it does
         | compared to other systems of much higher complexity and without
         | optimizing the implementation to high heaven. We can all learn
         | something from that.
        
           | FullyFunctional wrote:
            | Good question. The answer is all the poor souls that N
            | years later find themselves stuck with data in a legacy
            | format that they have to struggle to decode faster.
           | 
            | Of all the artifacts in our industry, few things live
            | longer than formats. E.g. we are still unpacking tar
            | files (Tape ARchive), transmitted over IPv4, decoded by
            | machines running x86 processors (and others, sure).
            | None of these formats could possibly have anticipated
            | the evolution that followed, nor predicted the
            | explosive popularity they would have. And all of them
            | (the latter two notably) have overheads with real
            | material costs. IPv6 fixed all the misaligned fields,
            | but IPv4 is still dominant. Ironically, RISC-V didn't
            | learn from x86 and added variable-length instructions,
            | making decoding harder to scale than necessary.
           | 
           | I'm not sure what positive lessons you think we should learn
           | from QOI. It's not hard to come up with simple formats. It's
           | much harder coming up with a format that learns from past
           | failures and avoids future pitfalls.
        
             | ricardobeat wrote:
              | QOI is designed with a very specific purpose in mind:
              | fast decoding for games. These images are unlikely to
              | be large enough to benefit from multithreading, and
              | if you have a lot of them you can simply decode them
              | in parallel. It's not meant to be the "best" image
              | format.
        
             | nynx wrote:
             | Unrelated to the rest of your comment, but risc-v does not
             | have variable-length instructions. It has compressed
             | instructions, but they're designed in such a way to be
             | easily and efficiently integrated into the decoder for
             | normal instructions, which are all 32 bits.
        
       | jqpabc123 wrote:
       | Interesting format. It would be much more interesting if browsers
       | supported it.
        
         | dnautics wrote:
          | Not sure what you're expecting given how new it is. Why
          | not write a polyfill as an exercise for yourself? Convert
          | it to png, then save as an image tag with a data url.
          | 
          | Here, look: some people adapted it to iOS in _one_ hour
          | of faffing around on Twitch:
          | https://www.twitch.tv/videos/1241476768?tt_medium=mobile_web...
        
         | ReactiveJelly wrote:
         | It's always gonna be chicken-and-egg for this, and browsers
         | won't spend the time sandboxing and supporting a codec until
         | it's already popular.
         | 
         | So this will probably see a JS / Webasm shim, and if that
         | proves popular, Blink and Gecko will consider it.
         | 
          | The day might come soon when browsers just greenlight a
          | webasm interface for codecs. "We'll put packets in
          | through this function, and take frames out through this
          | function, like ffmpeg. Other than that, you're running in
          | a sandbox with X MB of RAM, Y seconds of CPU per frame,
          | and no I/O. Anything you can do within that is valid."
        
       | sreekotay wrote:
        | If QOI is interesting because of speed, you might take a
        | look at fpng, a recent, actively developed png
        | reader/writer that achieves comparable speed/compression to
        | QOI while staying png compliant.
       | 
       | https://github.com/richgel999/fpng
       | 
       | Disclaimer: have not actively tried either.
        
         | jws wrote:
         | I find it interesting that QOI avoids any kind of Huffman style
         | coding.
         | 
          | Huffman encoding lets you store frequently used values in
          | fewer bits than rarely occurring values, but the cost of
          | a naive implementation is a branch on every encoded bit.
          | You can mitigate this with a state machine keyed by the
          | accumulated prefix bits plus as many bits as you want to
          | process in one whack, but those tables will blow out your
          | L1 data cache and trash a lot of your L2 cache as
          | well.[1]
         | 
          | The "opcode" strategy in QOI is going to give you
          | branches, but they appear nearly perfectly predictable
          | for common image types[2], so that helps. It has a table
          | of recent colors, but that occupies only a few cache
          | lines.
         | 
         | In all, it seems a better fit for the deep pipelines and wildly
         | varying access speeds across cache and memory layers which we
         | find today.
         | 
         | 
          | [1] I don't think it ever made it into a paper, but in
          | the mid-80s, when the best our Vax ethernet adapters
          | could do was ~3Mbps, I was getting about 10Mbps of
          | decompressed 12-bit monochrome imagery out of a ~1.3MIPS
          | computer using this technique.
         | 
          | [2] I also wouldn't be surprised if this statement is
          | false. It just seems that for continuous-tone images one
          | of RGBA, DIFF, or LUMA is going to win for any given
          | region of a scan line.
        
           | adgjlsfhk1 wrote:
           | One thing to note is that QOI composes really nicely with
           | high quality entropy encoders like LZ4 and ZSTD. LZ4 gives a
           | roughly 5% size reduction with negligible speed impact, and
           | ZSTD gives a 20% size reduction with moderate speed impact
           | (https://github.com/nigeltao/qoi2-bikeshed/issues/25).
        
       ___________________________________________________________________
       (page generated 2021-12-23 23:00 UTC)