[HN Gopher] Wuffs' PNG image decoder
       ___________________________________________________________________
        
       Wuffs' PNG image decoder
        
       Author : est31
       Score  : 274 points
       Date   : 2021-04-06 17:32 UTC (5 hours ago)
        
 (HTM) web link (nigeltao.github.io)
 (TXT) w3m dump (nigeltao.github.io)
        
       | ape4 wrote:
       | Wuffs == Wrangling Untrusted File Formats Safely it appears
        
       | the8472 wrote:
       | _> decompressing the entire image all-at-once (into one large
       | intermediate buffer) instead of one row at a time (into smaller,
       | re-usable buffers). All-at-once requires more intermediate memory
       | but allows substantially more of the image to be decoded in the
       | zlib-decompressor's fastest code paths._
       | 
       | Mozilla's imagelib has a neat trick to gain performance while
       | working with smaller memory buffers when you want to render the
       | output image at a smaller resolution than it is encoded. They
       | call it downscale-during-decode.
       | 
       | So while this library pulls zlib decompression into the decode
       | loop imagelib optimizes at the other end of the pipeline and
       | pulls downscaling into the decode loop.
       | 
       | imagelib's optimization has the advantage that it helps with both
       | small images as long as they're not displayed pixel-exact and
       | also makes decoding humungous images feasible without causing the
       | browser to explode with OOMs.
        
       | [deleted]
        
       | 1-6 wrote:
       | I wonder if the same performance benefits apply to Encoding PNGs.
        
         | dividuum wrote:
         | Looks like there isn't an encoder for PNG:
         | https://github.com/google/wuffs/tree/main/std/png
         | 
         | Given the project goals, I guess most encoders don't make a lot
         | of sense: For image encoding you basically provide an "x * y *
         | bytes_per_pixel" memory area and an encoder does its magic on
         | that. There isn't really any complicated untrusted input in
         | that case.
        
           | 1-6 wrote:
           | Darn, I had a handful of NumPy arrays which could use a fast
           | encoder.
        
             | xyzzy_plugh wrote:
             | But I'm sure there are plenty of fast encoders? Wuffs isn't
             | the path to a fast encoder. It's a path to something safe
             | that is also pretty fast.
             | 
             | If you're encoding PNGs from NumPY arrays you don't care
             | _at all_ about safety.
             | 
             | If, for some reason I can't quite comprehend, whatever
             | encoder you have on hand isn't fast enough, you can just
             | disable compression optimizations -- but at that point you
             | may as well be writing out a bitmap.
        
           | commandlinefan wrote:
           | The performance tradeoff isn't as justifiable, either: most
           | images are decompressed orders of magnitude more times than
           | they're compressed (generally just once).
        
             | robocat wrote:
             | I presume there are companies that do a lot of image
             | compression which have the financial incentive to employ
             | people to improve the compression efficiency. Either
             | because of the CPU costs where the company will get a
             | return on engineering hours to improve the speed of the
             | algorithm; or because of reduced bandwidth costs where the
             | company has an incentive to reduce the size of the
             | compressed image even if the CPU usage might be higher.
             | 
             | However it isn't obvious whether such companies have any
             | incentive to share their code, even just because the code
             | is too specific to their hardware?
        
             | mjevans wrote:
             | optipng and pngcrush come to mind, but in the spirit of big
             | O notation a run through those might be considered one
             | macro-encoding process.
        
             | Someone wrote:
             | I'm not sure I would take that bet. How much security
             | camera footage is compressed, recorded, never looked at,
             | and discarded after x days?
             | 
             | Certainly in number of images (as opposed to number of
             | bytes), those might tilt things the other way.
        
         | minimaxir wrote:
         | semi-related since the post is talking about Rust, oxipng is a
         | decently fast PNG optimizer, albeit not an encoder per se:
         | https://github.com/shssoichiro/oxipng
        
       | ExcavateGrandMa wrote:
       | sounds epic but an eager use of SIMD for anything...
       | 
       | hmmmm.
        
       | ChuckMcM wrote:
       | Nice. Still putting everything into memory make algorithm go
       | brrrr is not too surprising :-). I'm really happy that folks
       | continue to improve it though.
        
       | matu3ba wrote:
       | Are there any compiletime benchmarks?
       | 
       | I would be interested how much their approach scales in terms of
       | code size, since they use likely SMT solvers to guarantee having
       | no arithmetic overflows (which is more than Rust guarantees).
       | Ideally as comparison with compile times of Rust programs.
        
         | aw1621107 wrote:
         | > since they use likely SMT solvers to guarantee having no
         | arithmetic overflows
         | 
         | At least based on a quick skim of the docs, they use a
         | combination of programmer assertions, interval arithmetic, and
         | their type system for bounds checking and ensuring no
         | arithmetic overflows.
        
           | matu3ba wrote:
           | Checked as well. They can only do this, because they require
           | that heap memory owned by pointers does not get fragmented.
           | So programmers must manage pointer offsets to each heap
           | structure themself. Hence you never get potential pointer
           | soup (pointers pointing to subparts and pointing to other
           | things), which Rust resolves with pointer lifetime checking.
           | 
           | Also loop invariants etc must all be annotated. I would be
           | more interested how they plan to handle IO effects or if they
           | want to omit that.
        
       | miohtama wrote:
       | First time hearing about Wuffs. It says Wuffs compiles to C. In
       | this case, will the choice of C compiler affect the performance?
        
       | com2kid wrote:
       | My favorite part of the Wuffs github page has to be the
       | definition it gives for Dependent Types
       | 
       | > ...Dependent types are a way to implement compile-time bounds
       | checking, but they're not the only way, and there's more to
       | programming languages than their type systems. Wuffs does not use
       | dependent types.
       | 
       | Wuffs looks really interesting! I don't think I've ever even
       | heard of a language that is designed to only be used in an
       | auxiliary role along side another general purpose programming
       | language!
       | 
       | Edit: As pointed out below, scripting languages like Lua are used
       | in an auxiliary role, but in a very different way! Wuffs is
       | designed for use in the deepest most crucial parts of an app,
       | where as scripting languages are typically GC'd and used at the
       | very highest levels, often times by animations, game designers,
       | or even end users!
        
         | tyingq wrote:
         | Ragel might be a related example: http://www.colm.net/open-
         | source/ragel/
        
         | kccqzy wrote:
         | I don't like that your quote elided the actual definition of
         | dependent types. It reflects badly on the project for people
         | who didn't bother to read the full quote. For completeness,
         | this seems to be the full quote:
         | 
         | > A type that depends on another value. For example, a variable
         | n's type might be "the length of s", for some other slice-typed
         | variable s. Dependent types are a way to implement compile-time
         | bounds checking, but they're not the only way, and there's more
         | to programming languages than their type systems. Wuffs does
         | not use dependent types.
        
         | coldtea wrote:
         | > _I don 't think I've ever even heard of a language that is
         | designed to only be used in an auxiliary role along side
         | another general purpose programming language!_
         | 
         | Embeddedable scripting languages come to mind. E.g.:
         | 
         | "Lua is a lightweight, high-level, multi-paradigm programming
         | language designed primarily for embedded use in applications."
        
           | com2kid wrote:
           | Ah, of course. But opposite use case!
           | 
           | Lua is used at the very top level of a program, sometimes its
           | even user accessible!
           | 
           | Wuffs is the exact opposite, it is designed for use in the
           | deepest levels of a program!
           | 
           | Very cool idea.
        
         | zellyn wrote:
         | lex, yacc?
        
         | masklinn wrote:
         | > Wuffs looks really interesting! I don't think I've ever even
         | heard of a language that is designed to only be used in an
         | auxiliary role along side another general purpose programming
         | language!
         | 
         | It's positioned somewhat separately as you don't codegen from
         | it, but TLA+? It exists to model and verify a program, but you
         | still write the program separately.
        
           | wahern wrote:
           | It compiles to C.
        
             | masklinn wrote:
             | I don't think TLA+ compiles to C, not as part of the normal
             | usage / workflow.
        
               | wahern wrote:
               | Right, but Wuffs does. I suspect I misunderstood your
               | previous comment.
        
         | blt wrote:
         | Halide (https://halide-lang.org/) is one example, designed for
         | implementing low-level image processing functions.
        
         | fpoling wrote:
         | Wasm does at runtime what Wuffs does at the compile time for a
         | typical slowdown of 2-3 times. As with Wuffs there is no
         | allocation in the basic Wasm, the program works on pre-
         | allocated buffers.
        
         | zucker42 wrote:
         | Futhark is a auxillary language that targets parallel
         | processors (e.g. GPUs) to be used alongside a general purpose
         | language.
        
         | [deleted]
        
         | stefan_ wrote:
         | There is T0, part of BearSSL. This presentation has an
         | explanation of it (starts slide 19):
         | 
         | https://t1lang.github.io/NorthSec-20190516.pdf
        
       | The_rationalist wrote:
       | So will it makes its way into browsers?
        
       | rurban wrote:
       | Looks good, but has somewhat widely false claims:
       | 
       | > SMHasher is a test and benchmark suite for a variety of hash
       | function implementations. It can provide data for claims like
       | "our new Foo hash function is faster than the widely used Bar,
       | Baz and Qux hash functions". However, when comparing Foo to
       | CRC-32, be aware that a SIMD-accelerated CRC-32 implementation
       | can be 47x faster than SMHasher's simple CRC-32 implementation.
       | 
       | In fact smhasher implements both, the slow soft crc32 he cites,
       | and the fastest crc32_pclmul, which is not just 47x faster, but
       | 5000x faster. Other than wuff's implementation of the
       | crc32_pclmul variant.
       | 
       | Compare https://github.com/rurban/smhasher/#smhasher
       | crc32  392.10 vs crc32_pclmul 1972140.38 MiB/sec
       | 
       | Reminds me a lot on ATS
        
       | quelsolaar wrote:
       | The check-summing in PnG is poorly designed. It uses 2 different
       | checksum on top of each other. The format itself checksum each
       | chunk, but then the pixel data is using zlibs compression header
       | that have a second checksum on top of that, using a different
       | algorithm.
        
       | wiml wrote:
       | Another tool along these lines is Galois' Ivory language
       | https://ivorylang.org/ , a Haskell-embedded language for writing
       | safe/reliable C.
        
         | matu3ba wrote:
         | You might be interested in cakeml and cogent. What Safety
         | Integrity Level can ivory guarantee? level2?
        
         | moonchild wrote:
         | Another one is verifast.
        
       | zelphirkalt wrote:
       | > Also, unlike Go or Rust, Wuffs' memory safety is enforced at
       | compile time, not by inserting runtime checks that e.g. the i in
       | a[i] is within bounds or that (x + y) doesn't overflow a u32.
       | 
       | Am I missing something, or is this statement simply wrong? Have
       | not used Go, but Rust checks its stuff at compile time as far as
       | I know.
        
         | cakoose wrote:
         | Yes, Rust checks a lot of things at compile time, but not all.
         | For example, dynamic bounds checks (indexing into a Vec or
         | slice).
        
         | alpaca128 wrote:
         | Rust does use runtime checks for overflows and will panic if an
         | overflow happens. In cases where you don't care or where it's
         | intentional you can use `overflowing_add()` and its variants.
        
           | steveklabnik wrote:
           | ... in debug builds only, by default. In release, by default,
           | there are no checks, and you get two's compliment overflow,
           | though the correct thing to do is use those variants, yes.
        
         | nneonneo wrote:
         | Both of these checks happen at runtime in Rust. Overflow checks
         | in particular only happen in debug mode, not in release mode.
        
           | legulere wrote:
           | * if llvm does not optimize them out
        
         | nwmcsween wrote:
         | Statement is correct, Rust does insert runtime bounds checking
         | when needed.
        
         | steveklabnik wrote:
         | Rust does as much as possible at compile time, and some of its
         | most famous features are entirely at compile time, but it will
         | also use runtime checks where appropriate.
         | 
         | Some amount of runtime checking is inherent in any program.
         | Even dependently typed programs must, for example, check input
         | at runtime. They can make it easier to have absolutely minimal
         | runtime checks, however. Rust is adding a limited form of
         | dependent types for this reason.
        
           | NickPollard wrote:
           | I've not been following recent Rust development as closely,
           | can you elaborate on what limited form of dependent types
           | Rust is adding?
        
             | nyanpasu64 wrote:
             | Functions and types can take integers as monomorphization-
             | time template parameters (const generics).
             | 
             | It would be nice if you could pass (n, t) where n is
             | supplied at runtime, the type of t depends on the value of
             | n, and the compiler only lets you use t if your code is
             | valid for all reachable values of n.
             | 
             | eg. fn(n: usize, arr1: &[i32; n], arr2: &[i32, n]) allowed
             | the function body to assume the two slices can be zipped
             | without losing elements.
             | 
             | Note that i don't have enough experience with either
             | dependent types, zz, or wuffs to explain much further.
        
               | the8472 wrote:
               | > eg. fn(n: usize, arr1: &[i32; n], arr2: &[i32, n])
               | allowed the function body to assume the two slices can be
               | zipped without losing elements.
               | 
               | Often you can wrangle the optimizer into eliminating all
               | but one bounds check by passing slices instead and then
               | slicing one of the inputs in terms of length of the
               | other. The Zip iterator adapter also has optimizations
               | for the case of two slices where iteration will only
               | require two length queries at the beginning and no
               | further bounds checks.
        
         | pitaj wrote:
         | This is true for Rust in most cases but with const generics and
         | fixed-size arrays, Rust can check those as part of the type
         | system at compile time.
        
         | aidenn0 wrote:
         | Rust will sometimes have to insert implicit runtime checks.
         | Wuffs appears to require the user to insert explicit runtime
         | checks when needed (it will reject any flow that could possibly
         | overflow, but if you check for overflow it's smart enough to
         | allow that).
        
         | justwalt wrote:
         | This was my understanding as well, at least for the majority of
         | memory related tasks.
         | 
         | Edit: actually, it would help if I read the actual quote in
         | your post. I think they're right in saying that those things
         | are checked at runtime in Rust, though I could be mistaken.
        
         | masklinn wrote:
         | If the error is flagrant enough e.g.                   let a =
         | [0;4];         println!("{}", a[5]);
         | 
         | or                   let a = 128u8 + 128u8;
         | 
         | the compiler may tell you, but in general you can add any two
         | `u8` or index an array with any `usize`, the issues will be
         | checked for at runtime (or not at all for the overflow in
         | release mode, by default).
        
         | boogies wrote:
         | I think it's right: https://uploads.peterme.net/nimsafe.html
        
       | ivoras wrote:
       | Surprisingly, Go's png decoder is half as slow as the default
       | libpng :(
       | 
       | https://nigeltao.github.io/blog/2021/fastest-safest-png-deco...
        
         | kevincox wrote:
         | This isn't that surprising. In general Go is about half the
         | speed of C to barring specifically optimized code or calling
         | out to a faster library this would be expected.
        
         | michaelcampbell wrote:
         | Ugh, what does "half as slow" mean? Twice as fast? Half the
         | speed?
         | 
         | Any comparison to "slow" is generally the wrong direction.
        
       | adamrezich wrote:
       | why did the title change on this submission? is it not the
       | fastest in the world?
        
         | bugfix wrote:
         | The title changed multiple times. I've seen at least 3 in the
         | last 30 minutes (safest, fastest and now this one).
        
           | adamrezich wrote:
           | why? usually I only see this happen for misleading headlines
           | and appending "(year)" to old articles that might be
           | construed as recent. what's misleading about the actual
           | article's headline? as far as I can tell it seems to be true.
        
       | bob1029 wrote:
       | I know there are information theory tradeoffs, but if you want to
       | go _really_ fast, use JPEG. Specifically, LibJpegTurbo.
       | 
       | I have been playing around with this, and it is fast enough to
       | encode a 1080p+ 8bpp framebuffer in under 5 milliseconds. This is
       | fast enough for interactive applications. Most web browsers
       | likely use something similar to this, so as long as you encode
       | quickly on the server you can expect fast decode on the client.
        
       | dividuum wrote:
       | Interesting. First time I've head of Wuffs. As someone that still
       | uses C, this in particular sounds neat:
       | 
       | (from https://github.com/google/wuffs#goals-and-non-goals)
       | 
       | """ Wuffs' goal is to produce software libraries that are as safe
       | as Go or Rust, roughly speaking, but as fast as C, and that can
       | be used anywhere C libraries are used.
       | 
       | [...] Wuffs the Library is available as transpiled C code. Other
       | C/C++ projects can use that library without requiring the Wuffs
       | the Language toolchain. Those projects can use Wuffs the Library
       | like using any other third party C library. """
        
         | quotemstr wrote:
         | All of Windows (including the NT kernel) is written using the
         | same basic proof model that Wuffs uses, except that the
         | constraints are specified as annotations on top of C and C++
         | instead of as a new programming language.
         | 
         | I prefer the annotation approach, TBH, but I'm glad to see
         | people working on safety. Specifically, Wuffs (like Rust) has
         | this problem where it tries to fill the same niche as C and C++
         | and improve on these older languages in specific ways, but in
         | addition to being different in ways necessary to achieve the
         | language's objective is also different from C and C++ in
         | totally gratuitous ways that are basically just aesthetic
         | differences --- for example, "type var" vs "var: type".
         | 
         | I'm a big believer in technical continuity. IMHO, a lot of
         | recent language developments should have instead been
         | extensions of C, C++, Java, or Python.
        
           | WalterBright wrote:
           | > I'm a big believer in technical continuity.
           | 
           | So am I. D is designed to be an easy transition from C and
           | C-With-Classes. For example, here is some code in C that was
           | translated to D (it's part of the Digital Mars C++ compiler):
           | 
           | C:
           | 
           | https://github.com/DigitalMars/Compiler/blob/dmc-
           | cxx/dm/src/...
           | 
           | D:
           | 
           | https://github.com/DigitalMars/Compiler/blob/master/dm/src/d.
           | ..
           | 
           | They look pretty much the same. The code generated is the
           | same. In those repositories you can also see how I translated
           | the C versions to D with plenty of examples.
           | 
           | The biggest impediment is the C preprocessor. You wouldn't
           | really want to carry that forward.
           | 
           | After removing dependency on the C preprocessor, most of the
           | work is global search/replacing things like `->` to `.`.
        
           | kbenson wrote:
           | > I'm a big believer in technical continuity. IMHO, a lot of
           | recent language developments should have instead been
           | extensions of C, C++, Java, or Python.
           | 
           | If that's what you really wanted, and if we always did that,
           | then you would probably want extensions to Perl instead of
           | Python, since it predated it slightly and was much more
           | popular much earlier.
           | 
           | Instead, I think what you're really asking for is language
           | extensions to the things you are familiar with. Those happen
           | to be things a lot of other people are also familiar with,
           | but that doesn't change that the same strategy results in
           | Perl still getting the lion's share of attention. I, for one,
           | would be perfectly happy with that, as I love Perl, but many
           | people would not, and possibly not for any real _technical_
           | reason.
           | 
           | Anyone that thinks the case for Rust or D is better served by
           | extensions to C or C++ but thinks that Python is objectively
           | a better language than Perl should probably look deeply at
           | exactly why they hold those views.
           | 
           | P.S. That's not really to say you hold those views. Consider
           | this comment less a critique of your specific statement than
           | slight tangent spurred by it.
        
           | remexre wrote:
           | I work on (well, mostly near) an extensible C compiler,
           | designed so extension authors can independently create
           | extensions, and users can import them as easily as libraries:
           | https://github.com/melt-umn/ableC/ (and I'd love to answer
           | any questions you have about our approach or code.)
           | 
           | IMO this approach hasn't taken off because maintaining
           | compatibility with C while adding safety (or really just
           | about any property) means implementing your own sublanguage
           | that can't arbitrarily call C functions while maintaining
           | your safety properties. On the other hand, C being able to
           | call into your sublanguage easier is a benefit versus jury-
           | rigging Cargo into your build system (in the case of Rust).
           | 
           | On the other hand, this approach works great for adding
           | extensions that increase the expressive power of C with new
           | abstractions, for example algebraic data types, C++-like
           | templating, etc.
        
           | amelius wrote:
           | But the C++ spec has become quite unwieldy, so I can
           | understand the desire to start a totally new language.
        
             | edflsafoiewq wrote:
             | The previous post on the blog has an observation that
             | applies to languages as well
             | 
             | > Software has a Peter Principle. If a piece of code is
             | comprehensible, someone will extend it, so they can apply
             | it to their own problem. If it's incomprehensible, they'll
             | write their own code instead. Code tends to be extended to
             | its level of incomprehensibility.
        
           | matu3ba wrote:
           | What tool/framework do they use for annotation and are there
           | standards like SMTlib2, but for annotating those kind of
           | things? Can this tools verify that the annotations are
           | correct wrt the code below or above?
        
             | quotemstr wrote:
             | https://docs.microsoft.com/en-us/cpp/code-
             | quality/understand...
        
         | mleonhard wrote:
         | Wuffs touts its lack of memory allocation as a safety feature
         | [0]. Rust could add a `#![forbid(alloc)]` declaration and
         | achieve the same effect. I think this would be a good idea.
         | 
         | Wuffs types are not parameterized with lifetimes, so they may
         | contain references only to items with static lifetime. This
         | means that Wuffs cannot implement linked lists, hash tables, or
         | other common data structures. To support those structures,
         | Wuffs will need to either add lifetime parameters and a borrow-
         | checker (like Rust) or add reference counting (like Rust `Arc`,
         | C++ `shared_ptr`, Golang, Java, etc).
         | 
         | [0]
         | https://github.com/google/wuffs/blob/v0.3.0-beta.1/doc/note/...
        
           | [deleted]
        
           | marius_k wrote:
           | As I understand wuffs goal is to be good enough for parsers
           | coders decoders. It looks like they can still achieve that by
           | sacrificing some of the nice features of other languages.
        
           | andrepd wrote:
           | Isn't that stupidly limiting?
        
             | amelius wrote:
             | After NoSQL, now we also have NoAlloc :)
        
           | pipeep wrote:
           | > Rust could add a `#![forbid(alloc)]` declaration and
           | achieve the same effect.
           | 
           | Isn't that pretty much what `#![no_std]` is for? You can
           | still use an allocator with no_std, but you have to
           | explicitly import it.
           | 
           | https://docs.rust-embedded.org/book/intro/no-std.html
        
           | nynx wrote:
           | Rust already has a "mode" that forbids allocations: no-std
           | without the alloc crate.
        
             | [deleted]
        
         | andrewla wrote:
         | The language itself is a little unreadable at first glance, but
         | the idea of it is a very good one. sbt [1] is an amazing
         | project for small embeddable toy programs, but using fuzzers
         | rapidly shows how unsafe the code is, and the benchmarks in the
         | original article show the extent of performance compromises
         | made to make it work.
         | 
         | It seems like this would be an interesting approach to a lot of
         | security programming where it involves data structures, since
         | those have typically been a source of issues. Having a memory-
         | safe ASN.1 parser would be really nice, considering how much
         | difficulty that has caused in the security space.
         | 
         | With a lot of security programming there's a reliance on
         | constant-time algorithms, which this may not be well-suited to,
         | however.
         | 
         | [1] https://github.com/nothings/stb
        
       ___________________________________________________________________
       (page generated 2021-04-06 23:00 UTC)