[HN Gopher] Lossless Image Compression Through Super-Resolution
       ___________________________________________________________________
        
       Lossless Image Compression Through Super-Resolution
        
       Author : beagle3
       Score  : 269 points
       Date   : 2020-04-07 13:17 UTC (9 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | Animats wrote:
       | This is a lot like "waifu2x".[1] That's super-resolution for
       | anime images.
       | 
       | [1] https://github.com/nagadomi/waifu2x
        
       | dvirsky wrote:
       | How do ML based lossy codecs compare to state of the art lossy
       | compression? Intuitively it sounds like something AI will do much
       | better. But this is rather cool.
        
         | MiroF wrote:
         | They perform better, from what I've read.
        
           | qayxc wrote:
           | Depends entirely on your definition of "better".
           | 
           | In terms of quality vs bit rate, ML-based methods are
           | superior.
           | 
           | In terms of computation and memory requirements, they're
           | orders of magnitude worse. It's a trade-off; TINSTAAFL.
        
             | MiroF wrote:
             | > memory requirements
             | 
             | Agreed, although this bit is unclear - the compressed
             | representations of the ML-based methods take up much less
             | space in memory than traditional methods, but yes - the
             | decompression pipeline is memory-intensive due to
             | intermediary feature maps.
        
       | nojvek wrote:
        | Does anyone know how much better the compression ratio is
        | compared to PNG, which is also a lossless encoder?
        
       | trevyn wrote:
       | On the order of 10% smaller than WebP, substantially slower
       | encode/decode.
        
         | baq wrote:
         | is webp lossless?
        
           | dubcanada wrote:
           | It's both lossless and lossy -
           | https://en.wikipedia.org/wiki/WebP
        
           | propinquity wrote:
           | Webp supports lossy and lossless.
        
         | [deleted]
        
         | ltbarcly3 wrote:
          | The encode/decode is almost certainly not optimized; it's using
          | PyTorch and is a research project. A 10x speedup with a tuned
          | implementation is probably easily reachable, and I wouldn't be
          | surprised if 100x were possible even without using a GPU.
        
           | qayxc wrote:
           | Where did you get that from? PyTorch is already pretty
           | optimised and relies on GPU acceleration.
           | 
           | The only parts that are slow in comparison are the bits
           | written in Python and those are just the frontend
           | application.
           | 
           | There's not much room for performance improvement.
        
         | fredophile wrote:
         | That could be an acceptable trade off for some applications. I
         | could see this being useful for companies that host a lot of
         | images. You only need to encode an image once but pay the
         | bandwidth costs every time someone downloads it. Decoding speed
         | probably isn't the limiting factor of someone browsing the web
         | so that shouldn't negatively impact your customer's experience.
        
           | imhoguy wrote:
           | > Decoding speed probably isn't the limiting factor of
           | someone browsing the web so that shouldn't negatively impact
           | your customer's experience.
           | 
            | Unless it is, with battery-powered devices. However I would
            | say that with general web browsing without ad-blocking it
            | wouldn't count much either, in terms of bandwidth or
            | processing milliwatts.
        
       | jbverschoor wrote:
        | I thought super-resolution uses multiple input files to "enhance".
        | For example, extracting a high-res image from a video clip.
        
         | s_gourichon wrote:
          | They reformulate the decompression problem as a
          | super-resolution problem of the kind you just described.
          | Instead of getting variety from multiple frames of a video clip,
          | they use the generalization properties of a neural network.
         | 
         | "For lossless super-resolution, we predict the probability of a
         | high-resolution image, conditioned on the low-resolution input"
        
       | propter_hoc wrote:
       | This is really interesting but out of my league technically. I
       | understand that super-resolution is the technique of inferring a
       | higher-resolution truth from several lower-resolution captured
       | photos, but I'm not sure how this is used to turn a high-
       | resolution image into a lower-resolution one. Can someone explain
       | this to an educated layman?
        
         | mywittyname wrote:
          | From peeking at the code, it seems like each lower-res image is
         | a scaled down version of the original plus a tensor that is
         | used to upscale to the previous image. The resulting tensor is
         | saved and the scaled image is used as the input to the next
         | iteration.
         | 
         | The decode process takes the last image from the process above,
         | and iteratively applies the upscalers until the original image
         | has been reproduced.
         | 
         | Link to the code in question:
         | https://github.com/caoscott/SReC/blob/master/src/l3c/bitcodi...
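
        To make that loop concrete, here is a minimal runnable sketch of the
        encode/decode structure described above. It is illustrative, not the
        actual SReC code: `predict_upscale` is plain nearest-neighbour
        upsampling standing in for the CNN, and the residuals are kept as raw
        arrays where SReC would entropy-code the pixels against the predicted
        probabilities.

          import numpy as np

          def downscale(img):
              """2x2 average pooling: the scaled-down version of each level."""
              h, w = img.shape
              pooled = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
              return pooled.round().astype(np.int16)

          def predict_upscale(img, shape):
              """Stand-in for the CNN predictor: nearest-neighbour upsampling."""
              up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
              return up[:shape[0], :shape[1]]

          def encode(img, levels=3):
              pyramid = [img.astype(np.int16)]
              for _ in range(levels):
                  pyramid.append(downscale(pyramid[-1]))
              residuals = []
              for lo, hi in zip(pyramid[1:], pyramid[:-1]):
                  residuals.append(hi - predict_upscale(lo, hi.shape))
              return pyramid[-1], residuals  # lowest-res image + one delta per level

          def decode(lowest, residuals):
              img = lowest
              for delta in reversed(residuals):
                  img = predict_upscale(img, delta.shape) + delta
              return img

          original = np.random.randint(0, 256, (64, 64))
          low, deltas = encode(original)
          assert np.array_equal(decode(low, deltas), original)  # bit-exact round trip

        The deltas are cheap to store only when the predictor is good, which
        is where the trained network earns its keep.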
        
           | peter_d_sherman wrote:
           | If we substitute "information" for "image", "low information"
           | for "low resolution" and "high information" for "high
           | resolution", perhaps compression could be obtained
           | generically on any data (not just images) by taking a high
            | information bitstream, using a CNN or CNNs (as per this
           | paper) to convert it into a shorter, low information
           | bitstream plus a tensor, and then an entropy (difference)
           | series of bits.
           | 
           | To decompress then, reverse the CNN on the low information
           | bitstream with the tensor.
           | 
           | You now have a high information bitstream which is _almost_
           | like your original.
           | 
           | Then use the entropy series of bits to fix the difference.
           | You're back to the original.
           | 
           | Losslessly.
           | 
           | So I wonder if this, or a similar process can be done on non-
           | image data...
           | 
           | But that's not all...
           | 
           | If it works with non-image data, it would also say that
           | mathematically, low information (lower) numbers could be
           | converted into high information (higher) numbers with a
           | tensor and entropy values...
           | 
            | We could view the CNN + tensor as a mathematical function, and
           | we can view the entropy as a difference...
           | 
           | In other words:
           | 
           |  _Someone who is a mathematician might be able to derive some
           | identities, some new understandings in number theory from
           | this_...
        
             | valine wrote:
             | Convolution only works on data that is spatially related,
             | meaning data points that are close to each other are more
             | related than data points that are far apart. It doesn't
             | give meaningful results on data like spreadsheets where
             | columns or rows can be rearranged without corrupting the
             | underlying information.
             | 
             | If by non-image data you mean something like audio, then
             | yes it could probably work.
        
       | crazygringo wrote:
       | This is utterly fascinating.
       | 
       | To be clear -- it stores a low-res version in the output file,
       | uses neural networks to predict the full-res version, then
       | encodes the difference between the predicted full-res version and
       | the actual full-res version, and stores that difference as well.
       | (Technically, multiple iterations of this.)
       | 
       | I've been wondering when image and video compression would start
       | utilizing standard neural network "dictionaries" to achieve
       | greater compression, at the (small) cost of requiring a local NN
       | file that encodes all the standard image "elements".
       | 
       | This seems like a great step in that direction.
        
         | OskarS wrote:
         | It's a really cool idea, but I don't know if this would ever be
         | a practical method for image compression. First of all, you
         | could never change the neural network without breaking the
         | compression, so you can't ever "update" it. Like: what if you
         | figure out a better network? Too bad! I mean, I guess you
          | could, but then you need to version the files and keep
         | copies of all the networks you've ever used, but this gets
         | messy quick.
         | 
         | And speaking of storing the networks: I don't know that you
         | would ever want to pay the memory hit that it would take to
         | store the entire network in memory just to decompress images or
         | video, nor the performance hit the decompression takes. The
         | trade-off here is trading reduced drive space for massively
         | increased RAM and CPU/GPU time. I don't know any case where
         | you'd want to make that trade-off, at least not at this
         | magnitude.
         | 
         | Again though: it's an awesome idea. I just don't know that's
         | ever going to be anything other than a cool ML curiosity.
        
           | cbhl wrote:
           | Even if it's not useful for general-purpose compression, it
           | may still be useful in a more restricted domain. In text
           | compression, Brotli can be found in Chrome with a dictionary
           | that is tuned for HTTP traffic. And in audio compression,
           | LPCnet is a research codec that used Wavenet (neural nets for
           | speech synthesis) to compress speech to 1.6kb/s (prior
           | discussion from 2019 at
           | https://news.ycombinator.com/item?id=19520194).
        
           | daveguy wrote:
           | I think the idea is the network is completely trained and
           | encoded along with the image and delta data. A new network
           | would just require retraining and storing that new network
           | along with the image data. It doesn't use a global network
           | for all compressions.
        
             | codeflo wrote:
             | I don't think this would work, the size of the network
             | would likely dominate the size of the compressed image.
        
               | mstade wrote:
               | Wouldn't the network be part of the decoder?
        
               | mastre_ wrote:
               | Yes, and this is why you couldn't update the network.
               | Still, much like how various compression algos have
               | "levels," this standard could be more open in this
               | regard, adding new networks (sort of what others above
               | refer to as versions) and the image could just specify
               | which network it uses. Maybe have a central repo from
               | where the decoder could pull a network it doesn't have
               | (i.e. I make a site and encode all 1k images on it using
               | my own network, pull the network to your browser once so
               | you can decode all 1k images). And even support a special
               | mode where the image explicitly includes the network to
               | be used for decoding it along with image data (could make
                | sense for very large images, as well as for
               | specialized/demonstrational/test purposes).
               | 
               | All in all, a very interesting idea.
        
               | mstade wrote:
                | I wonder what the security implications of all this are;
                | it sounds dangerous to just run any old network. I suppose
               | maybe if it's sandboxed enough with very strongly defined
               | inputs and outputs then the worst that could happen is
               | you get garbled imagery?
        
             | stefs wrote:
             | they include the trained models under the "model weights"
             | section. imagenet is ~20mb, openimages is ~17mb.
             | 
             | now this might be prohibitive for images over the web, but
              | it'd be interesting to see whether it might be applicable for
              | images with huge resolutions for printing, where single
              | images are hundreds of megabytes.
        
           | crazygringo wrote:
           | For a standard network, you're right there would only be one
           | version. So you just make sure it's very carefully put
           | together. (If a massively better one comes along, then you
           | just make it a new file format.)
           | 
           | And as for performance/resources -- great point. But what
           | about video, where the space/bandwidth improvements become
           | drastically more important?
           | 
            | Since h.264 and h.265 already have dedicated hardware, would
           | it be reasonable to assume that a chip dedicated to this
           | would handle it just fine?
           | 
           | And that if you've already got hardware for video, then of
           | course you'd just re-use it for still images?
        
             | bryanrasmussen wrote:
             | >(If a massively better one comes along, then you just make
             | it a new file format.)
             | 
             | I guess you could have versioning of your file format, and
             | some sort of organization that standardized it.
        
               | colejohnson66 wrote:
               | Then you get the layperson who doesn't understand that
               | and asks why their version 42 .imgnet won't open in a
               | program only supporting up to 10 (but they don't know
               | their image is v42 and the program only supports v10).
                | It's easier to understand different formats than
                | different versions.
        
           | burntoutfire wrote:
           | > I don't know that you would ever want to pay the memory hit
           | that it would take to store the entire network in memory just
           | to decompress images or video, nor the performance hit the
           | decompression takes.
           | 
            | The big memory load wouldn't necessarily be a problem for the
           | likes of Youtube and Netflix - they could just have dedicated
           | machines which do nothing else but decoding. The performance
           | penalty could be a killer though.
        
           | tobr wrote:
           | > First of all, you could never change the neural network
           | without breaking the compression, so you can't ever "update"
           | it. Like: what if you figure out a better network? Too bad!
           | 
           | Isn't this just a special version of a problem any type of
           | compression will always have? There's all kinds of ways you
           | can imagine improving on a format like JPEG, but the reason
           | it's useful is because it's locked down and widely supported.
        
             | HelloNurse wrote:
             | Usual compression standards are mostly adaptive, estimating
             | statistical models of input from implicit prior
             | distributions (e.g. the probability of A followed by B
             | begins at p(A)p(B)), reasonable assumptions (e.g. scanlines
             | in an image follow the same distribution), small and fixed
             | tables and rules (e.g. the PNG filters): not only a low
              | volume of data, but data that can only change with a major
              | change of algorithm.
             | 
             | A neural network that models upscaling is, on the other
             | hand, not only inconveniently big, but also completely
             | explicit (inviting all sorts of tweaking and replacement)
             | and adapted to a specific data set (further _demanding_
             | specialized replacements for performance reasons).
             | 
             | Among the applications that are able to store and process
             | the neural network, which is no small feat, I don't think
             | many would be able to amortize the cost of a tailored
             | neural network over a large, fixed set of very homogeneous
             | images.
             | 
             | The imagenet64 model is over 21 MB: saving 21 MB over PNG
             | size, at 4.29 vs 5.74 bpp (table 2a in the article),
             | requires a set of more than 83 MB of perfectly
             | imagenet64-like PNG images, which is a lot. Compressing
             | with a custom upscaling model the image datasets used for
             | neural network experiments, which are large and stable, is
             | the most likely good application (with the side benefit of
             | producing useful and interesting downscaled images for free
             | in addition to compressing the originals).
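
        The break-even estimate above follows directly from the figures
        quoted (the ~21 MB imagenet64 model and the table 2a bitrates); a
        quick back-of-the-envelope check:

          model_mb = 21                      # imagenet64 model weights, as quoted above
          png_bpp, srec_bpp = 5.74, 4.29     # bitrates from table 2a of the paper
          saved_fraction = 1 - srec_bpp / png_bpp    # ~25% smaller than PNG
          break_even_mb = model_mb / saved_fraction  # PNG data needed to pay off the model
          print(round(break_even_mb, 1))             # ~83.1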
        
           | ltbarcly3 wrote:
           | Why can't you update it?
           | 
           | There could be a release of a new model every 6 months or
           | something (although even that is probably too often, the
           | incremental improvement due to statistical changes in the
           | distribution of images being compressed isn't likely to
           | change much over time), and you just keep a copy of all the
           | old models (or lazily download them like msft foundation c++
           | library versions when you install an application).
           | 
           | The models themselves aren't very large.
        
             | zo1 wrote:
             | Not just that, but you could take a page out of the
             | "compression" book and treat the NN as a sort of dictionary
             | in that it is part of the compressed payload. Maybe not the
             | whole NN, but perhaps deltas from a reference
             | implementation, assuming the network structure remains the
             | same and/or similar.
        
             | spuz wrote:
             | I don't know why this comment was downvoted - it's a
             | legitimate question.
             | 
             | One scenario I can picture is the Netflix app on your TV.
             | Firstly, they create a neural network trained on the video
             | data in their library and ship it to all their clients
             | while they are idle. They could then stream very high-
             | quality video at lower bandwidth than they currently use
             | and, assuming decoding can be done quickly enough, provide
             | a great experience for their users. Any updates to the
             | neural network could be rolled out gradually and in the
             | background.
        
               | devinplatt wrote:
               | Google used to do something called SDCH (Shared
               | Dictionary Compression for HTTP), where a delta
               | compression dictionary was downloaded to Chrome.
               | 
               | The dictionary had to be updated from time to time to
               | keep a good compression rate as the Google website
               | changed over time. There was a whole protocol to handle
               | verifying what dictionary the client had and such.
        
           | contravariant wrote:
           | If you've got a big enough image you can include the model
           | parameters with the image.
        
         | jbverschoor wrote:
         | Great explanation. If it is and stays lossless, it would make
         | an awesome photo archiving and browsing tool.
         | 
          | Browse thumbnails, open original. Without any processes to
          | generate these files or keep them in sync.
        
           | greesil wrote:
           | The jpeg2000 standard allows for multiscale resolutions by
            | using wavelet transforms instead of the discrete cosine
            | transform.
        
           | mceachen wrote:
           | The majority of photos you already have most likely contain
           | thumbnail and larger preview images embedded in the EXIF
           | header.
           | 
           | Raw images typically contain an embedded, full-sized JPEG
           | version of the image as well.
           | 
           | All of these are easily extracted with `exiftool -b
           | -NameOfBinaryTag $file > thumb.jpg`.
           | 
           | I've found while making PhotoStructure that the quality of
            | these embedded images is surprisingly inconsistent, though.
           | Some makes and models do odd things, like handle rotation
           | inconsistently, add black bars to the image (presumably to
           | fit the camera display whose aspect ratio is different from
           | the sensor), render the thumb with a color or gamma shift, or
           | apply low quality reduction algorithms (apparent due to
           | nearest-neighbor jaggies).
           | 
           | I ended up having to add a setting that lets users ignore
           | these previews or thumbnails (to choose between "fast" and
           | "high quality").
        
             | jbverschoor wrote:
             | The point is to have originals available at a good
             | compression rate. Having a thumbnail in the original sucks,
             | as I don't want lossy compression on my originals.
        
         | Scene_Cast2 wrote:
         | There is already a startup that makes a video compression codec
         | based on ML - http://www.wave.one/video-compression - I am
         | personally following their work because I think it's pretty
         | darn cool.
        
         | dodobirdlord wrote:
         | This technique has also been used in the ogg-opus audio codec.
        
         | baq wrote:
          | so you could say it precomputes a function (and its inverse)
         | which allows computing a very space-efficient information-dense
         | difference between a large image and its thumbnail?
        
         | ska wrote:
         | It's an old idea really, or a collection of old ideas with a NN
         | twist. Not really clear how much that latter bit brings to the
         | table but interesting to think about.
         | 
         | The "dictionary" approach was roughly what vector quantization
         | was all about. The idea of turning lossy encoders into lossless
          | by also encoding the error is an old one too, but somewhat
          | derailed by the focus on embeddable codecs, with the ideal that
          | each additional bit read will improve your estimate.
          | 
          | I think the potential novelty here is really in the
          | unfortunately-named-but-too-late-now super-resolution aspects.
          | You could do the same sort of thing ages ago with say IFS
          | projection, or wavelet (and related) trees, or VQ dictionaries
          | with a resolution bump, but they were limited by the training a
          | bit (although this approach might have some overtraining issues
          | that make it worse for particular applications).
        
         | lidHanteyk wrote:
         | Of the papers at Mahoney's page [0], "Fast Text Compression
         | with Neural Networks" dates to 2000; people have been applying
         | these techniques for decades.
         | 
         | [0] http://mattmahoney.net/dc/
        
         | cztomsik wrote:
          | Here's a great book on this: http://mattmahoney.net/dc/dce.html
          | 
          | The guy was using neural networks for compression a long time
          | ago, before it was a thing again.
         | 
         | EDIT: Oh, somebody mentioned it already (but it's really good,
         | free & totally worth reading)
        
         | tobib wrote:
         | Indeed very fascinating.
         | 
         | Reminds me of doing something similar, albeit a thousand times
         | dumber in ~2004 when I had to find a way to "compress" interior
         | automotive audio data, indicator sounds, things like that. At
          | some point, instead of using traditional compression, I
          | synthesized a wave function and only stored its parameters
          | and the delta from the actual wave, which achieved great
         | compression ratios. It was expensive to compress but virtually
         | free to decompress. And as a side effect my student mind was
         | forever blown by the beauty of it.
        
         | GuB-42 wrote:
         | Even though the implementation details are far from trivial,
         | the general idea is fairly typical. Most advanced compression
         | algorithms work the same way.
         | 
          | - Using the previously decoded data, try to predict what's
          | next, and the probability of being right.
         | 
         | - Using an entropy coder, encode the difference between what is
         | predicted and the actual data. The predicted probability will
         | be used to define how many bits to assign to each possible
          | value. The higher the probability, the fewer bits will be used
          | for a "right" answer and the more bits will be used for a
          | "wrong" answer.
         | 
         | Decoding works by "replaying" what the encoder did.
         | 
          | The most interesting part is the prediction. So much so that some
          | people think of compression as a better test for AIs than the
          | Turing test. You are basically asking the computer to solve one
          | of these sequence-based IQ tests.
         | 
          | And of course neural networks are one of the first things we
          | tend to think of when we want to implement an AI, and
          | unsurprisingly it is not an uncommon approach for compression.
         | For instance, the latest PAQ compressors use neural networks.
         | 
         | Of course, all that is a general idea. How to do it in practice
         | is where the real challenge is. Here, the clever part is to
         | "grow" the image from low to high resolution. Which kinds of
         | reminds me of wavelet compression.
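
        A minimal sketch of that predict-then-entropy-code loop, with a
        bigram character-count model standing in for the neural network. To
        keep it short, the counts are taken from the whole string up front,
        whereas a real codec builds them adaptively and "replays" them on
        decode as described above; an ideal entropy coder spends about
        -log2(p) bits per symbol.

          import math
          from collections import Counter, defaultdict

          text = "the cat sat on the mat"

          # Predictor: P(next char | previous char), standing in for the model.
          pairs = Counter(zip(text, text[1:]))
          context_totals = defaultdict(int)
          for (prev, _), n in pairs.items():
              context_totals[prev] += n

          # Entropy coder: charge about -log2(p) bits per symbol, so confident
          # "right" predictions are cheap and surprises are expensive.
          bits = 8  # first character sent raw
          for prev, cur in zip(text, text[1:]):
              p = pairs[(prev, cur)] / context_totals[prev]
              bits += -math.log2(p)

          print(f"{bits:.1f} bits vs {8 * len(text)} bits raw")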
        
           | ithinkso wrote:
           | Anyone interested in this approach to lossless compression
            | should visit (and try to win!) the Hutter Prize website[1][2].
            | The goal is to compress 1GB of English Wikipedia.
           | 
           | [1] http://prize.hutter1.net/
           | 
           | [2] https://en.wikipedia.org/wiki/Hutter_Prize
        
           | remcob wrote:
           | > encode the difference between what is predicted and the
           | actual data
           | 
           | Minor nitpick: In the idealized model there is no single
           | prediction that you can take the difference with. There is
           | just a probability distribution and you encode the actual
           | data using this distribution.
           | 
           | Taking the difference between the most likely prediction and
           | the actual data is just a very common implementation
           | strategy.
        
             | anamexis wrote:
             | How is "the most likely prediction" not a "single
             | prediction that you can take the difference with"?
        
         | dorgo wrote:
          | My interpretation: Create and distribute a library of all
         | possible images (except ones which look like random noise or
         | are otherwise unlikely to ever be needed). When you want to
         | send an image, find it in the library and send its index
         | instead. Use advanced compression (NNs) to reduce the size of
         | the library.
        
       | Der_Einzige wrote:
       | This technology is super awesome... and it's been available for
       | awhile.
       | 
       | A few years ago, I worked for #bigcorp on a product which, among
       | other things, optimized and productized a super resolution model
       | and made it available to customers.
       | 
       | For anyone looking for it - it should be available in several
       | open source libraries (and closed source #bigcorp packages) as an
        | already trained model which is ready to deploy.
        
       | hinkley wrote:
       | I wonder how well this technique works when the depth of field is
       | infinite?
       | 
       | Out of focus parts of an image should be pretty darned easy to
       | compress using what is effectively a thumbnail.
       | 
       | That said, the idea of having an image format where 'preview'
       | code barely has to do any work at all is pretty damned cool.
        
       | asciimike wrote:
       | Reminds me of RAISR (https://ai.googleblog.com/2016/11/enhance-
       | raisr-sharp-images...).
       | 
       | I remember talking with the team and they had production apps
       | using it and reducing bandwidth by 30%, while only adding a few
       | hundred kb to the app binary.
        
       | Animats wrote:
       | How does it work for data other than Open Images, if trained on
       | Open Images? If it recognizes fur, it's going to be great on cat
       | videos.
        
       | acjohnson55 wrote:
       | Interesting. It sounds like the idea is fundamentally like
        | factoring out knowledge of "real image" structure into a neural
        | net. In a way, this is similar to the perceptual models used to
       | discard data in lossy compression.
        
         | dehrmann wrote:
         | I wonder if there's a way to do this more like traditional
         | compression; performance is a huge issue for compression, and
         | taking inspiration from a neural network might be better than
         | actually using one. Conceptually, this is like a learned
          | dictionary that's captured by the neural net; it's just that
          | this is fuzzier.
        
           | retrac wrote:
           | Training the model is extremely expensive computationally,
           | but using it often isn't.
           | 
           | For example, StyleGAN takes months of compute-time on a
           | cluster of high-end GPUs to train to get the photorealistic
           | face model we've all seen. But generating new faces from the
           | trained model only takes mere seconds on a low-end GPU or
           | even a CPU.
        
       | ilaksh wrote:
       | I asked a question about a similar idea on Stack Overflow in
       | 2014. https://cs.stackexchange.com/questions/22317/does-there-
       | exis...
       | 
       | They did not have any idea and they were dicks about it as usual.
        
       | [deleted]
        
       | pbhjpbhj wrote:
       | It seems like "lossless" isn't quite right; some of the
       | information (as opposed to just the algo) seems to be in the NN?
       | 
       | Is a soft-link a lossless compression?
       | 
        | It's like the old joke about a pub where they optimise by
        | numbering all the jokes... the joke number by itself isn't enough;
        | it can be used to losslessly recover the joke, but it's using the
        | community's storage to hold the data.
        
         | tverbeure wrote:
         | When you consider the compression algorithm itself a form of
         | information, your point about it not being quite lossless could
         | be applied to any lossless compression method.
        
           | pbhjpbhj wrote:
           | Yes, but there's a line somewhere, isn't there? A link to a
           | database isn't "lossless compression", even if it returns an
           | artefact unchanged.
        
         | chickenpotpie wrote:
         | As long as you get back the exact same image you put in, it's
         | lossless.
        
           | pbhjpbhj wrote:
           | So this "https://news.ycombinator.com/reply?id=22804687&goto=
           | threads%... is a lossless encoding of your comment because it
           | returns the content? That doesn't seem right to me.
        
             | chickenpotpie wrote:
             | We're talking about lossless compression. A URL is a way to
             | locate a resource, it is not compression. Compression is
              | taking an existing resource and transforming it into
             | something smaller. A URL isn't a transformation. If I
             | delete my comment the URL no longer refers to anything.
        
             | jfkebwjsbx wrote:
             | No, because that uses external data.
        
           | yters wrote:
           | In that sense, I can losslessly compress everything down to
           | zero bits, and recover the original artifact perfectly with
           | the right algorithm.
        
             | chickenpotpie wrote:
             | Yes you can. However, it doesn't mean it's good lossless
             | compression.
        
             | ebg13 wrote:
             | You're ignoring two things:
             | 
             | 1) that the aggregate savings from compressing the images
             | needs to outweigh the initial cost of distributing the
             | decompressor.
             | 
              | 2) to be lossless, decompression must be deterministic and
             | unambiguous, so you can't compress _everything_ down to
             | zero bits; you can compress only _one_ thing down to zero
             | bits, because otherwise you wouldn't be able to
             | unambiguously determine which thing is represented by your
             | zero bits.
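
        Point 2 above is just the pigeonhole principle; a quick check of the
        counting argument in plain Python:

          n = 8
          inputs = 2 ** n                                  # 256 distinct 8-bit inputs
          shorter_outputs = sum(2 ** k for k in range(n))  # 255 strings shorter than 8 bits
          print(inputs, shorter_outputs)                   # 256 > 255: some input cannot shrink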
        
               | yters wrote:
               | In each case I pick out an algorithm beforehand that will
               | inflate my zero bits to whatever artifact I desire.
        
               | karpierz wrote:
               | Then that becomes part of the payload that you
               | decompress, and you no longer have a 0 byte payload.
        
               | ebg13 wrote:
               | "I will custom write a new program to (somehow) generate
               | each image and then distribute that instead of my image"
               | is not a compression algorithm. But I think you'd do well
               | over at the halfbakery.
        
               | yters wrote:
               | It works well if it's the only image!
        
               | ebg13 wrote:
               | Now you're chasing your own tail. You've gone from "I can
               | losslessly compress everything" to "I can losslessly
               | compress exactly one thing only".
        
               | yters wrote:
               | I'm arguing it's the same as this image compression
               | technique. They rely on a huge neural network which must
               | exist wherever the image is to be decompressed.
               | 
               | If I'm allowed to bring along an unlimited amount of
               | background data, then I can compress everything down to
               | zero bits.
               | 
                | In contrast, an algorithm like LZ78 can be expressed in a
                | 5-line Python script and perform decently on a wide variety
               | of data types.
        
               | ebg13 wrote:
                | > _If I'm allowed to bring along an unlimited amount of
               | background data, then I can compress everything down to
               | zero bits._
               | 
               | If by "background data" you mean the decompressor, this
               | is patently false. No matter how much information is
               | contained in the decompressor (The Algorithm + stable
               | weights that don't change), you can only compress one
                | thing down to any given new representation (low-resolution
                | image + differential from rescale using stable weights).
               | 
               | If by "background data" you mean new data that the
               | decompressor doesn't already have, then you're ignoring
               | the definition of compression. Your compressed data is
               | all bits sent on the fly that aren't already possessed by
               | the side doing the decompression regardless of obtuse
               | naming scheme.
               | 
                | > _I'm arguing it's the same as this image compression
               | technique._
               | 
               | That's wrong, because this scheme doesn't claim to send a
               | custom image generator instead of each image, which is
               | what you're proposing.
        
               | milesvp wrote:
               | You have to convey which algorithm, which takes bits. And
               | at the very least you need a pointer to a file, which
               | also takes bits. You'd do well to look for archives of
               | alt.comp.compression.
               | 
               | There was also a classic thread that surfaced recently on
                | HN about a compression challenge whereby someone
                | disingenuously attempted to compress a file of random data
                | (incompressible by definition) by splitting it on a
               | character then deleting that character from each file.
                | It was a simple algorithm that appeared to require fewer
               | bits to encode. The problem is, all this person did was
               | shift the bits to the filesystem's metadata, which is not
               | obvious from the command line. The final encoding ended
               | up taking more bits once you take said metadata into
               | account.
        
             | atorodius wrote:
              | You are missing the point that you can use the same
              | algorithm to compress N -> infinity images. So the algorithm
              | size amortizes.
        
               | yters wrote:
               | Is that true?
        
               | nullc wrote:
               | Unless it's overfit on some particular inputs, but if
               | so-- it's bad science.
               | 
               | Ideally they would have trained the network on a non-
               | overlapping collection of images from their testing but
               | if they did that I don't see it mentioned in the paper.
               | 
               | The model is only 15.72 MiB (after compression with xz),
               | so it would amortize pretty quickly... even if it was
               | trained on the input it looks like it still may be pretty
               | competitive at a fairly modest collection size.
        
         | c3534l wrote:
         | This is true of all compression formats. The receiver has to
         | know how to decode it. A good compression algorithm will
         | attempt to send only the data that the receiver can't predict.
         | If we both know you're sending English words, then I shouldn't
         | have to tell you "q then u" - we should assume there's a "u"
         | after "q" unless otherwise specified. This isn't new to this
         | technique, it's a very common and old one (it's one of the
         | first ones I learned watching a series of youtube lectures on
         | compression or maybe it was just information theory in general)
         | and it has been commonly called lossless compression none the
         | less.
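
        A toy version of that "u after q" convention, written so that a bare
        "q" (as in "Qatar") still round-trips exactly. The escape marker and
        function names are made up for this sketch, and it assumes the marker
        never appears in the input.

          ESC = "\x00"  # flags the rare 'q' that is NOT followed by 'u'

          def encode(text):
              out, i = [], 0
              while i < len(text):
                  if text[i] == "q":
                      if i + 1 < len(text) and text[i + 1] == "u":
                          out.append("q")        # 'u' is implied, don't send it
                          i += 2
                          continue
                      out.append("q" + ESC)      # exception: bare 'q', mark it
                      i += 1
                      continue
                  out.append(text[i])
                  i += 1
              return "".join(out)

          def decode(data):
              out, i = [], 0
              while i < len(data):
                  if data[i] == "q":
                      if i + 1 < len(data) and data[i + 1] == ESC:
                          out.append("q")        # escaped bare 'q'
                          i += 2
                          continue
                      out.append("qu")           # default: put the implied 'u' back
                      i += 1
                      continue
                  out.append(data[i])
                  i += 1
              return "".join(out)

          for s in ["quick quiz", "qatar", "iraq"]:
              assert decode(encode(s)) == s      # lossless, and "quick quiz" got shorter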
        
           | pbhjpbhj wrote:
           | https://news.ycombinator.com/item?id=22807000
           | 
           | [Aside: I've heard they know nothing of qi in Qatar, ;oP]
        
             | c3534l wrote:
             | You're just not sending the database - you're sending
             | whether or not it's the database. If you can only send a
             | binary "is the database or is not the database" then 0 and
             | 1 is indeed fully losslessly compressed information. If
             | that's really what you want, then that's how you would do
             | it. Full, perfect, lossless compression reduces your data
             | down to only that information not shared between the sender
             | and the receiver. Sending either 1 or 0 is, in fact,
             | exactly what you want to do if the receiver already knows
             | the contents of the database. Compression asks the question
             | "what is the smallest amount of data I have to receive for
             | the person on the other side to reconstruct the original
             | data?" If the answer is "1 bit" then that's a perfectly
             | valid message - the only information the receiver is
             | missing is a single bit.
        
       | 6510 wrote:
       | Reminds me of this.
       | 
       | https://en.wikipedia.org/wiki/Jan_Sloot
       | 
       | Gave me a comical thought if such things can be permitted.
       | 
       | You split into rgb and b/w, turn the pictures into blurred vector
       | graphics. Generate and use an incredibly large spectrum of
       | compression formulas made up of separable approaches that each
       | are sorted in such a way that one can dial into the most movie-
       | like result.
       | 
       | 3d models for the top million famous actors and 10 seconds of
       | speech then deepfake to infinite resolution.
       | 
       | Speech to text with plot analysis since most movies are pretty
       | much the same.
       | 
        | Sure, it won't be lossless but replacing a few unknown actors with
       | famous ones and having a few accidental happy endings seems
       | entirely reasonable.
        
       | tjchear wrote:
        | Would massive savings be achieved if an image sharing app like,
        | say, Instagram were to adopt it, considering a lot of user-
       | uploaded travel photos of popular destinations look more or less
       | the same?
        
         | chickenpotpie wrote:
         | My guess is that it would be much more expensive unless it's a
         | frequently accessed image. CPU and GPU time is much more
         | expensive than storage costs on any cloud provider.
        
           | greenpizza13 wrote:
           | Where it might be very useful is for companies who distribute
           | Cellular IOT devices where they pay for each byte uploaded.
           | That could have a real impact on cost with the tradeoff being
           | more work on-device (which can be optimized).
        
             | chickenpotpie wrote:
             | Also could be great for using up spare CPU/GPU cycles or
             | having an AWS spot instance that triggers when pricing is
             | low to compress images.
        
           | kevinventullo wrote:
           | Wouldn't it be cheaper if the image is _infrequently_
            | accessed? I'm thinking in the extreme case where you have
           | some 10-year-old photo that no one's looked at in 7 years. In
           | that case the storage costs are everything because the
           | marginal CPU cost is 0.
        
             | chickenpotpie wrote:
             | It depends if the decompression is done on the server or on
             | the client. If the client is doing the decompressing it
             | would be better to compress frequently accessed images
             | because it would lower bandwidth costs. If the server does
             | the decompressing it would be better for infrequently
             | accessed images to save on CPU costs.
        
           | ackbar03 wrote:
            | That's sort of the conclusion I reached when I was looking at
           | the stuff before. The economics of it don't quite work out
           | yet
        
       | ackbar03 wrote:
       | This is interesting but I'm not sure if the economics of it will
       | ever work out. It'll only be practical when the computation costs
       | become lower than storage costs
        
         | MiroF wrote:
         | > computation costs become lower than storage costs
         | 
         | Most applications are memory movement constrained rather than
         | compute constrained.
        
         | baliex wrote:
         | Think of network bandwidth too!
        
         | The_Colonel wrote:
          | Think YouTube or Netflix - it's compressed once and then
          | delivered a hundred million times to consumers.
        
           | ackbar03 wrote:
            | but if it's something that's requested / viewed a lot, that's
            | probably something you don't want to be compressing /
            | decompressing all the time. Neural networks still take quite a
            | lot of computational power and require GPUs.
            | 
            | If it's something you don't necessarily require all the time,
            | it's still probably cheaper to just store it instead of running
            | it through an ANN. You just need to look at the prices of a GPU
            | server compared with storage costs on AWS and the estimated
            | run time to see there is still a large difference.
            | 
            | I mean I could be wrong (and I'd love to be, since I looked at
            | a lot of SR stuff before) but that's sort of the conclusion I
            | reached before and I don't really see that anything has
            | significantly changed since.
        
       | eximius wrote:
       | Is this actually lossless - that is, the same pixels as the
       | original are recovered, guaranteed? I'm surprised such guarantees
       | can be made from a neural network.
        
         | tasty_freeze wrote:
          | The way many compressors work is that, based on recent data, they
          | try to predict the immediately following data. The prediction doesn't
         | have to be perfect; it just has to be good enough that only the
         | difference between the prediction and the exact data needs to
         | be encoded, and encoding that delta usually takes fewer bits
         | than encoding the original data.
         | 
         | The compression scheme here is similar. Transmit a low res
         | version of an image, use a neural network to guess what a 2x
         | size image would look like, then send just the delta to fix
         | where the prediction was wrong. Then do it again until the
         | final resolution image is reached.
         | 
         | If the neural network is terrible, you'd still get a lossless
         | image recovery, but the amount of data sent in deltas would be
         | greater than just sending the image uncompressed.
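
        A self-contained sketch of that last point, with a synthetic gradient
        "photo" and nearest-neighbour upsampling standing in for the network:
        reconstruction is bit-exact with either predictor, but the terrible
        one leaves a delta that needs far more bits to encode.

          import numpy as np

          def entropy_bits(values):
              """Ideal bits/sample for symbols drawn from this histogram."""
              _, counts = np.unique(values, return_counts=True)
              p = counts / counts.sum()
              return -(p * np.log2(p)).sum()

          rng = np.random.default_rng(0)
          gradient = np.add.outer(np.arange(128), np.arange(128)) // 2
          full = (gradient + rng.integers(0, 4, (128, 128))).astype(np.int16)
          low = full.reshape(64, 2, 64, 2).mean(axis=(1, 3)).round().astype(np.int16)

          nn_upscale = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)
          predictors = {
              "decent (nearest-neighbour upscale)": nn_upscale,
              "terrible (all zeros)": np.zeros_like(full),
          }
          for name, pred in predictors.items():
              delta = full - pred
              assert np.array_equal(pred + delta, full)   # lossless either way
              print(name, f"{entropy_bits(delta):.2f} bits/pixel in the delta")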
        
           | eximius wrote:
           | Ah, I understand! I wasn't aware that was how they worked!
        
         | jlebar wrote:
         | The neural net predicts the upscaled image, then they add on
         | the delta between that prediction and the desired output. No
         | matter what the neural net predicts, you can always generate
         | _some_ delta.
        
           | eximius wrote:
           | I was failing to understand the purpose of the delta. :)
        
             | magicalhippo wrote:
             | Note that this is similar to how for example MPEG does it
             | with the intermediate frames and motion vectors. First it
             | encodes a full frame using basically regular JPEG, then for
             | the next frames it first does motion estimation by
             | splitting the image into 8x8 blocks and then for each block
             | it tries to find the position in the previous frame which
             | best fits it. The difference in position is called the
             | motion vector for that block.
             | 
              | It can then take all the "best fit" blocks from the previous
             | frame and use it to generate a prediction of the next
             | frame. It then computes the difference between the
             | prediction and the actual frame, and stores this difference
             | along with the set of motion vectors used to generate the
             | prediction image.
             | 
              | If nothing much has changed, just the camera moving about a
              | bit, the difference between the prediction and the actual
             | frame data is very small and can be easily compressed.
             | Also, the range of the motion vectors is typically limited
             | to +/-16 pixels, and you only need one per block, so they
             | take up very little space.
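
        A brute-force sketch of that block-matching step (8x8 blocks, +/-16
        search range, as described above). In a real encoder the motion
        vectors and the residual would then go on to transform and entropy
        coding; the names here are made up for the sketch.

          import numpy as np

          def motion_estimate(prev, cur, block=8, search=16):
              """For each block of `cur`, find the best-matching block in `prev`
              within +/-`search` pixels; keep the offset (motion vector) and the
              prediction error (residual)."""
              h, w = cur.shape
              vectors = {}
              residual = np.zeros(cur.shape, dtype=np.int16)
              for by in range(0, h, block):
                  for bx in range(0, w, block):
                      target = cur[by:by + block, bx:bx + block]
                      best, best_err = (0, 0), None
                      for dy in range(-search, search + 1):
                          for dx in range(-search, search + 1):
                              y, x = by + dy, bx + dx
                              if 0 <= y <= h - block and 0 <= x <= w - block:
                                  err = np.abs(prev[y:y + block, x:x + block] - target).sum()
                                  if best_err is None or err < best_err:
                                      best, best_err = (dy, dx), err
                      dy, dx = best
                      pred = prev[by + dy:by + dy + block, bx + dx:bx + dx + block]
                      vectors[(by, bx)] = best           # points back into the previous frame
                      residual[by:by + block, bx:bx + block] = target - pred
              return vectors, residual

          prev = np.zeros((32, 32), dtype=np.int16)
          prev[12:20, 10:18] = 200                       # an object in the previous frame
          cur = np.zeros_like(prev)
          cur[14:22, 13:21] = 200                        # same object, moved down 2, right 3
          vectors, residual = motion_estimate(prev, cur)
          print(vectors[(8, 8)], int(np.abs(residual).sum()))   # (-2, -3) and a zero residual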
        
       | m3at wrote:
        | Related, for another domain: lossless text compression using
        | LSTM: https://bellard.org/nncp/
        | 
        | (this is by Fabrice Bellard; one wonders how he achieves so
        | much)
        
       ___________________________________________________________________
       (page generated 2020-04-07 23:00 UTC)