[HN Gopher] Lossless Image Compression Through Super-Resolution
___________________________________________________________________
 
Lossless Image Compression Through Super-Resolution
 
Author : beagle3
Score  : 269 points
Date   : 2020-04-07 13:17 UTC (9 hours ago)
 
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
 
| Animats wrote:
| This is a lot like "waifu2x".[1] That's super-resolution for
| anime images.
| 
| [1] https://github.com/nagadomi/waifu2x
| dvirsky wrote:
| How do ML-based lossy codecs compare to state-of-the-art lossy
| compression? Intuitively it sounds like something AI will do much
| better. But this is rather cool.
| MiroF wrote:
| They perform better, from what I've read.
| qayxc wrote:
| Depends entirely on your definition of "better".
| 
| In terms of quality vs bit rate, ML-based methods are
| superior.
| 
| In terms of computation and memory requirements, they're
| orders of magnitude worse. It's a trade-off; TINSTAAFL.
| MiroF wrote:
| > memory requirements
| 
| Agreed, although this bit is unclear - the compressed
| representations of the ML-based methods take up much less
| space in memory than traditional methods, but yes - the
| decompression pipeline is memory-intensive due to
| intermediate feature maps.
| nojvek wrote:
| Does anyone know how much better the compression ratio is
| compared to PNG, which is also a lossless encoder?
| trevyn wrote:
| On the order of 10% smaller than WebP, with substantially slower
| encode/decode.
| baq wrote:
| is webp lossless?
| dubcanada wrote:
| It's both lossless and lossy -
| https://en.wikipedia.org/wiki/WebP
| propinquity wrote:
| Webp supports lossy and lossless.
| [deleted]
| ltbarcly3 wrote:
| The encode/decode is almost certainly not optimized; it's using
| PyTorch and is a research project. A 10x speedup with a tuned
| implementation is probably easily reachable, and I wouldn't be
| surprised if 100x were possible, even without using a GPU.
| qayxc wrote:
| Where did you get that from? PyTorch is already pretty
| optimised and relies on GPU acceleration.
| 
| The only parts that are slow in comparison are the bits
| written in Python, and those are just the frontend
| application.
| 
| There's not much room for performance improvement.
| fredophile wrote:
| That could be an acceptable trade-off for some applications. I
| could see this being useful for companies that host a lot of
| images. You only need to encode an image once, but you pay the
| bandwidth costs every time someone downloads it. Decoding speed
| probably isn't the limiting factor for someone browsing the web,
| so it shouldn't negatively impact your customers' experience.
| imhoguy wrote:
| > Decoding speed probably isn't the limiting factor for
| someone browsing the web, so it shouldn't negatively impact
| your customers' experience.
| 
| Unless it is with battery-powered devices. However, I would
| say that with general web browsing without ad-blocking it
| wouldn't count for much either in terms of bandwidth or
| processing milliwatts.
| jbverschoor wrote:
| I thought super-resolution uses multiple input files to
| "enhance". For example - extracting a high-res image from a
| video clip.
| s_gourichon wrote:
| They reformulate the decompression problem in the shape of a
| super-resolution problem conforming to what you just wrote.
| Instead of getting variety through the images of a video clip,
| they use the generalization properties of a neural network.
| | "For lossless super-resolution, we predict the probability of a | high-resolution image, conditioned on the low-resolution input" | propter_hoc wrote: | This is really interesting but out of my league technically. I | understand that super-resolution is the technique of inferring a | higher-resolution truth from several lower-resolution captured | photos, but I'm not sure how this is used to turn a high- | resolution image into a lower-resolution one. Can someone explain | this to an educated layman? | mywittyname wrote: | From peaking at the code, it seems like each lower res image is | a scaled down version of the original plus a tensor that is | used to upscale to the previous image. The resulting tensor is | saved and the scaled image is used as the input to the next | iteration. | | The decode process takes the last image from the process above, | and iteratively applies the upscalers until the original image | has been reproduced. | | Link to the code in question: | https://github.com/caoscott/SReC/blob/master/src/l3c/bitcodi... | peter_d_sherman wrote: | If we substitute "information" for "image", "low information" | for "low resolution" and "high information" for "high | resolution", perhaps compression could be obtained | generically on any data (not just images) by taking a high | information bitstream, using a CNN or CNN's (as per this | paper) to convert it into a shorter, low information | bitstream plus a tensor, and then an entropy (difference) | series of bits. | | To decompress then, reverse the CNN on the low information | bitstream with the tensor. | | You now have a high information bitstream which is _almost_ | like your original. | | Then use the entropy series of bits to fix the difference. | You're back to the original. | | Losslessly. | | So I wonder if this, or a similar process can be done on non- | image data... | | But that's not all... | | If it works with non-image data, it would also say that | mathematically, low information (lower) numbers could be | converted into high information (higher) numbers with a | tensor and entropy values... | | We could view the CNN + tensor as mathematical function, and | we can view the entropy as a difference... | | In other words: | | _Someone who is a mathematician might be able to derive some | identities, some new understandings in number theory from | this_... | valine wrote: | Convolution only works on data that is spatially related, | meaning data points that are close to each other are more | related than data points that are far apart. It doesn't | give meaningful results on data like spreadsheets where | columns or rows can be rearranged without corrupting the | underlying information. | | If by non-image data you mean something like audio, then | yes it could probably work. | crazygringo wrote: | This is utterly fascinating. | | To be clear -- it stores a low-res version in the output file, | uses neural networks to predict the full-res version, then | encodes the difference between the predicted full-res version and | the actual full-res version, and stores that difference as well. | (Technically, multiple iterations of this.) | | I've been wondering when image and video compression would start | utilizing standard neural network "dictionaries" to achieve | greater compression, at the (small) cost of requiring a local NN | file that encodes all the standard image "elements". | | This seems like a great step in that direction. 
| OskarS wrote:
| It's a really cool idea, but I don't know if this would ever be
| a practical method for image compression. First of all, you
| could never change the neural network without breaking the
| compression, so you can't ever "update" it. Like: what if you
| figure out a better network? Too bad! I mean, I guess you
| could, but then you need to version the files and keep
| copies of all the networks you've ever used, and this gets
| messy quickly.
| 
| And speaking of storing the networks: I don't know that you
| would ever want to pay the memory hit that it would take to
| store the entire network in memory just to decompress images or
| video, nor the performance hit the decompression takes. The
| trade-off here is trading reduced drive space for massively
| increased RAM and CPU/GPU time. I don't know any case where
| you'd want to make that trade-off, at least not at this
| magnitude.
| 
| Again though: it's an awesome idea. I just don't know that it's
| ever going to be anything other than a cool ML curiosity.
| cbhl wrote:
| Even if it's not useful for general-purpose compression, it
| may still be useful in a more restricted domain. In text
| compression, Brotli can be found in Chrome with a dictionary
| that is tuned for HTTP traffic. And in audio compression,
| LPCNet is a research codec that used WaveNet (neural nets for
| speech synthesis) to compress speech to 1.6kb/s (prior
| discussion from 2019 at
| https://news.ycombinator.com/item?id=19520194).
| daveguy wrote:
| I think the idea is that the network is completely trained and
| encoded along with the image and delta data. A new network
| would just require retraining and storing that new network
| along with the image data. It doesn't use a global network
| for all compressions.
| codeflo wrote:
| I don't think this would work; the size of the network
| would likely dominate the size of the compressed image.
| mstade wrote:
| Wouldn't the network be part of the decoder?
| mastre_ wrote:
| Yes, and this is why you couldn't update the network.
| Still, much like how various compression algos have
| "levels," this standard could be more open in this
| regard, adding new networks (sort of what others above
| refer to as versions), and the image could just specify
| which network it uses. Maybe have a central repo from
| which the decoder could pull a network it doesn't have
| (i.e. I make a site and encode all 1k images on it using
| my own network; you pull the network to your browser once
| so you can decode all 1k images). And even support a special
| mode where the image explicitly includes the network to
| be used for decoding it along with the image data (could make
| sense for very large images, as well as for
| specialized/demonstrational/test purposes).
| 
| All in all, a very interesting idea.
| mstade wrote:
| I wonder what the security implications of all this are;
| it sounds dangerous to just run any old network. I suppose
| maybe if it's sandboxed enough, with very strongly defined
| inputs and outputs, then the worst that could happen is
| you get garbled imagery?
| stefs wrote:
| they include the trained models under the "model weights"
| section. imagenet is ~20mb, openimages is ~17mb.
| 
| now this might be prohibitive for images over the web, but
| it'd be interesting whether it might be applicable to
| images with huge resolutions for printing, where single
| images are hundreds of megabytes
| crazygringo wrote:
| For a standard network, you're right there would only be one
| version.
| So you just make sure it's very carefully put
| together. (If a massively better one comes along, then you
| just make it a new file format.)
| 
| And as for performance/resources -- great point. But what
| about video, where the space/bandwidth improvements become
| drastically more important?
| 
| Since h.264 and h.265 already have dedicated hardware, would
| it be reasonable to assume that a chip dedicated to this
| would handle it just fine?
| 
| And that if you've already got hardware for video, then of
| course you'd just re-use it for still images?
| bryanrasmussen wrote:
| > (If a massively better one comes along, then you just make
| it a new file format.)
| 
| I guess you could have versioning of your file format, and
| some sort of organization that standardized it.
| colejohnson66 wrote:
| Then you get the layperson who doesn't understand that
| and asks why their version 42 .imgnet won't open in a
| program only supporting up to 10 (but they don't know
| their image is v42 and the program only supports v10).
| It's easier to understand different formats than
| different versions.
| burntoutfire wrote:
| > I don't know that you would ever want to pay the memory hit
| that it would take to store the entire network in memory just
| to decompress images or video, nor the performance hit the
| decompression takes.
| 
| The big memory load wouldn't necessarily be a problem for the
| likes of Youtube and Netflix - they could just have dedicated
| machines which do nothing but decoding. The performance
| penalty could be a killer though.
| tobr wrote:
| > First of all, you could never change the neural network
| without breaking the compression, so you can't ever "update"
| it. Like: what if you figure out a better network? Too bad!
| 
| Isn't this just a special version of a problem any type of
| compression will always have? There's all kinds of ways you
| can imagine improving on a format like JPEG, but the reason
| it's useful is because it's locked down and widely supported.
| HelloNurse wrote:
| Usual compression standards are mostly adaptive, estimating
| statistical models of the input from implicit prior
| distributions (e.g. the probability of A followed by B
| begins at p(A)p(B)), reasonable assumptions (e.g. scanlines
| in an image follow the same distribution), and small, fixed
| tables and rules (e.g. the PNG filters): not only a low
| volume of data, but data that can only change as part of a
| major change of algorithm.
| 
| A neural network that models upscaling is, on the other
| hand, not only inconveniently big, but also completely
| explicit (inviting all sorts of tweaking and replacement)
| and adapted to a specific data set (further _demanding_
| specialized replacements for performance reasons).
| 
| Among the applications that are able to store and process
| the neural network, which is no small feat, I don't think
| many would be able to amortize the cost of a tailored
| neural network over a large, fixed set of very homogeneous
| images.
| 
| The imagenet64 model is over 21 MB: saving 21 MB over PNG
| size, at 4.29 vs 5.74 bpp (table 2a in the article),
| requires a set of more than 83 MB of perfectly
| imagenet64-like PNG images, which is a lot. Compressing
| with a custom upscaling model the image datasets used for
| neural network experiments, which are large and stable, is
| the most likely good application (with the side benefit of
| producing useful and interesting downscaled images for free
| in addition to compressing the originals).
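| 
| (For the arithmetic behind that break-even: the rate saving
| is 1 - 4.29/5.74, about 25.3%, so recouping a 21 MB model
| takes roughly 21 MB / 0.253 ~= 83 MB of PNGs that the model
| fits as well as it fits imagenet64.)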
| ltbarcly3 wrote:
| Why can't you update it?
| 
| There could be a release of a new model every 6 months or
| something (although even that is probably too often; the
| statistical distribution of images being compressed isn't
| likely to change much over time, so the incremental
| improvement would be small), and you just keep a copy of all
| the old models (or lazily download them, like msft foundation
| c++ library versions when you install an application).
| 
| The models themselves aren't very large.
| zo1 wrote:
| Not just that, but you could take a page out of the
| "compression" book and treat the NN as a sort of dictionary,
| in that it is part of the compressed payload. Maybe not the
| whole NN, but perhaps deltas from a reference
| implementation, assuming the network structure remains the
| same and/or similar.
| spuz wrote:
| I don't know why this comment was downvoted - it's a
| legitimate question.
| 
| One scenario I can picture is the Netflix app on your TV.
| Firstly, they create a neural network trained on the video
| data in their library and ship it to all their clients
| while they are idle. They could then stream very high-
| quality video at lower bandwidth than they currently use
| and, assuming decoding can be done quickly enough, provide
| a great experience for their users. Any updates to the
| neural network could be rolled out gradually and in the
| background.
| devinplatt wrote:
| Google used to do something called SDCH (Shared
| Dictionary Compression for HTTP), where a delta
| compression dictionary was downloaded to Chrome.
| 
| The dictionary had to be updated from time to time to
| keep a good compression rate as the Google website
| changed over time. There was a whole protocol to handle
| verifying which dictionary the client had, and such.
| contravariant wrote:
| If you've got a big enough image you can include the model
| parameters with the image.
| jbverschoor wrote:
| Great explanation. If it is and stays lossless, it would make
| an awesome photo archiving and browsing tool.
| 
| Browse thumbnails, open the original. Without any processes to
| generate / keep in sync these files.
| greesil wrote:
| The jpeg2000 standard allows for multiscale resolutions by
| using wavelet transforms instead of the discrete cosine
| transform.
| mceachen wrote:
| The majority of photos you already have most likely contain
| thumbnail and larger preview images embedded in the EXIF
| header.
| 
| Raw images typically contain an embedded, full-sized JPEG
| version of the image as well.
| 
| All of these are easily extracted with `exiftool -b
| -NameOfBinaryTag $file > thumb.jpg`.
| 
| I've found while making PhotoStructure that the quality of
| these embedded images is surprisingly inconsistent, though.
| Some makes and models do odd things, like handle rotation
| inconsistently, add black bars to the image (presumably to
| fit the camera display whose aspect ratio is different from
| the sensor's), render the thumb with a color or gamma shift,
| or apply low-quality reduction algorithms (apparent due to
| nearest-neighbor jaggies).
| 
| I ended up having to add a setting that lets users ignore
| these previews or thumbnails (to choose between "fast" and
| "high quality").
| jbverschoor wrote:
| The point is to have originals available at a good
| compression rate. Having a thumbnail in the original sucks,
| as I don't want lossy compression on my originals.
| Scene_Cast2 wrote:
| There is already a startup that makes a video compression codec
| based on ML - http://www.wave.one/video-compression - I am
| personally following their work because I think it's pretty
| darn cool.
| dodobirdlord wrote:
| This technique has also been used in the Ogg Opus audio codec.
| baq wrote:
| so you could say it precomputes a function (and its inverse)
| which allows computing a very space-efficient, information-
| dense difference between a large image and its thumbnail?
| ska wrote:
| It's an old idea really, or a collection of old ideas with a NN
| twist. Not really clear how much that latter bit brings to the
| table, but interesting to think about.
| 
| The "dictionary" approach was roughly what vector quantization
| was all about. The idea of turning lossy encoders into lossless
| ones by also encoding the error is an old one too, but somewhat
| derailed by the focus on embeddable codecs with the ideal that
| each additional bit read will improve your estimate.
| 
| I think the potential novelty here is really in the
| unfortunately-named-but-too-late-now super-resolution aspects.
| You could do the same sort of thing ages ago with, say, IFS
| projection, or wavelet (and related) trees, or VQ dictionaries
| with a resolution bump, but they were limited by the training a
| bit (although this approach might have some overtraining issues
| that make it worse for particular applications).
| lidHanteyk wrote:
| Of the papers at Mahoney's page [0], "Fast Text Compression
| with Neural Networks" dates to 2000; people have been applying
| these techniques for decades.
| 
| [0] http://mattmahoney.net/dc/
| cztomsik wrote:
| Here's a great book on this: http://mattmahoney.net/dc/dce.html
| 
| The guy was using neural networks for compression a long time
| ago, before it was a thing again.
| 
| EDIT: Oh, somebody mentioned it already (but it's really good,
| free & totally worth reading)
| tobib wrote:
| Indeed very fascinating.
| 
| Reminds me of doing something similar, albeit a thousand times
| dumber, in ~2004 when I had to find a way to "compress" interior
| automotive audio data, indicator sounds, things like that. At
| some point, instead of using traditional compression, I
| synthesized a wave function and only stored its parameters
| and the delta from the actual wave, which achieved great
| compression ratios. It was expensive to compress but virtually
| free to decompress. And as a side effect my student mind was
| forever blown by the beauty of it.
| GuB-42 wrote:
| Even though the implementation details are far from trivial,
| the general idea is fairly typical. Most advanced compression
| algorithms work the same way.
| 
| - Using the previously decoded data, try to predict what's
| next, and the probability of being right
| 
| - Using an entropy coder, encode the difference between what is
| predicted and the actual data. The predicted probability will
| be used to define how many bits to assign to each possible
| value. The higher the probability, the fewer bits will be used
| for a "right" answer and the more bits will be used for a
| "wrong" answer.
| 
| Decoding works by "replaying" what the encoder did.
| 
| The most interesting part is the prediction. So much so that
| some people think of compression as a better test for AI
| than the Turing test. You are basically asking the computer to
| solve one of these sequence-based IQ tests.
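| 
| To put numbers on that bit assignment (toy figures of my own,
| not from the paper): an ideal entropy coder spends about
| -log2(p) bits on a symbol it assigned probability p.
| 
|     import math
|     bits = lambda p: -math.log2(p)
|     print(bits(0.9))   # ~0.15 bits: confident and right, nearly free
|     print(bits(0.01))  # ~6.64 bits: confident and wrong, expensive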
| 
| And of course neural networks are one of the first things we
| tend to think of when we want to implement an AI, and
| unsurprisingly this is not an uncommon approach for
| compression. For instance, the latest PAQ compressors use
| neural networks.
| 
| Of course, all that is the general idea. How to do it in
| practice is where the real challenge is. Here, the clever part
| is to "grow" the image from low to high resolution, which kind
| of reminds me of wavelet compression.
| ithinkso wrote:
| Anyone interested in this approach to lossless compression
| should visit (and try to win!) the Hutter Prize website[1][2].
| The goal is to compress 1GB of English Wikipedia.
| 
| [1] http://prize.hutter1.net/
| 
| [2] https://en.wikipedia.org/wiki/Hutter_Prize
| remcob wrote:
| > encode the difference between what is predicted and the
| actual data
| 
| Minor nitpick: in the idealized model there is no single
| prediction that you can take the difference with. There is
| just a probability distribution, and you encode the actual
| data using this distribution.
| 
| Taking the difference between the most likely prediction and
| the actual data is just a very common implementation
| strategy.
| anamexis wrote:
| How is "the most likely prediction" not a "single
| prediction that you can take the difference with"?
| dorgo wrote:
| My interpretation: create and distribute a library of all
| possible images (except ones which look like random noise or
| are otherwise unlikely to ever be needed). When you want to
| send an image, find it in the library and send its index
| instead. Use advanced compression (NNs) to reduce the size of
| the library.
| Der_Einzige wrote:
| This technology is super awesome... and it's been available for
| a while.
| 
| A few years ago, I worked for #bigcorp on a product which, among
| other things, optimized and productized a super-resolution model
| and made it available to customers.
| 
| For anyone looking for it - it should be available in several
| open source libraries (and closed source #bigcorp packages) as
| an already trained model which is ready to deploy.
| hinkley wrote:
| I wonder how well this technique works when the depth of field
| is infinite?
| 
| Out-of-focus parts of an image should be pretty darned easy to
| compress using what is effectively a thumbnail.
| 
| That said, the idea of having an image format where 'preview'
| code barely has to do any work at all is pretty damned cool.
| asciimike wrote:
| Reminds me of RAISR (https://ai.googleblog.com/2016/11/enhance-
| raisr-sharp-images...).
| 
| I remember talking with the team; they had production apps
| using it and reducing bandwidth by 30%, while only adding a few
| hundred kb to the app binary.
| Animats wrote:
| How does it work for data other than Open Images, if trained on
| Open Images? If it recognizes fur, it's going to be great on cat
| videos.
| acjohnson55 wrote:
| Interesting. It sounds like the idea is fundamentally to factor
| out knowledge of "real image" structure into a neural net. In a
| way, this is similar to the perceptual models used to discard
| data in lossy compression.
| dehrmann wrote:
| I wonder if there's a way to do this more like traditional
| compression; performance is a huge issue for compression, and
| taking inspiration from a neural network might be better than
| actually using one. Conceptually, this is like a learned
| dictionary that's captured by the neural net, it's just that
| this is fuzzier.
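| 
| For reference, the non-fuzzy version of that already exists:
| PNG predicts each pixel from its neighbors with a small fixed
| filter and DEFLATE-compresses the residuals. A toy sketch of
| PNG's "Sub" filter (left-neighbor prediction; a network would
| just replace the one-line predictor):
| 
|     import numpy as np
| 
|     def sub_filter(row):
|         # predict each byte from its left neighbor, keep the residual
|         pred = np.concatenate(([0], row[:-1])).astype(np.int64)
|         return (row.astype(np.int64) - pred) % 256  # invertible
| 
|     row = np.array([200, 201, 199, 202, 203], dtype=np.uint8)
|     res = sub_filter(row)            # clusters near 0/255 -> low entropy
|     restored = np.cumsum(res) % 256  # undo: running sum mod 256
|     assert np.array_equal(restored.astype(np.uint8), row)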
| retrac wrote:
| Training the model is extremely expensive computationally,
| but using it often isn't.
| 
| For example, StyleGAN takes months of compute-time on a
| cluster of high-end GPUs to train to get the photorealistic
| face model we've all seen. But generating new faces from the
| trained model takes mere seconds on a low-end GPU or
| even a CPU.
| ilaksh wrote:
| I asked a question about a similar idea on Stack Exchange in
| 2014. https://cs.stackexchange.com/questions/22317/does-there-
| exis...
| 
| They did not have any idea and they were dicks about it as
| usual.
| [deleted]
| pbhjpbhj wrote:
| It seems like "lossless" isn't quite right; some of the
| information (as opposed to just the algo) seems to be in the
| NN?
| 
| Is a soft-link a lossless compression?
| 
| It's like the old joke about a pub where they optimise by
| numbering all the jokes... just the joke number isn't enough;
| it can be used to losslessly recover the joke, but it's using
| the community's storage to hold the data.
| tverbeure wrote:
| When you consider the compression algorithm itself a form of
| information, your point about it not being quite lossless could
| be applied to any lossless compression method.
| pbhjpbhj wrote:
| Yes, but there's a line somewhere, isn't there? A link to a
| database isn't "lossless compression", even if it returns an
| artefact unchanged.
| chickenpotpie wrote:
| As long as you get back the exact same image you put in, it's
| lossless.
| pbhjpbhj wrote:
| So this "https://news.ycombinator.com/reply?id=22804687&goto=
| threads%... is a lossless encoding of your comment because it
| returns the content? That doesn't seem right to me.
| chickenpotpie wrote:
| We're talking about lossless compression. A URL is a way to
| locate a resource; it is not compression. Compression is
| taking an existing resource and transforming it into
| something smaller. A URL isn't a transformation. If I
| delete my comment the URL no longer refers to anything.
| jfkebwjsbx wrote:
| No, because that uses external data.
| yters wrote:
| In that sense, I can losslessly compress everything down to
| zero bits, and recover the original artifact perfectly with
| the right algorithm.
| chickenpotpie wrote:
| Yes you can. However, it doesn't mean it's good lossless
| compression.
| ebg13 wrote:
| You're ignoring two things:
| 
| 1) that the aggregate savings from compressing the images
| need to outweigh the initial cost of distributing the
| decompressor.
| 
| 2) that to be lossless, decompression must be deterministic
| and unambiguous, so you can't compress _everything_ down to
| zero bits; you can compress only _one_ thing down to zero
| bits, because otherwise you wouldn't be able to
| unambiguously determine which thing is represented by your
| zero bits.
| yters wrote:
| In each case I pick out an algorithm beforehand that will
| inflate my zero bits to whatever artifact I desire.
| karpierz wrote:
| Then that becomes part of the payload that you
| decompress, and you no longer have a 0-byte payload.
| ebg13 wrote:
| "I will custom-write a new program to (somehow) generate
| each image and then distribute that instead of my image"
| is not a compression algorithm. But I think you'd do well
| over at the halfbakery.
| yters wrote:
| It works well if it's the only image!
| ebg13 wrote:
| Now you're chasing your own tail. You've gone from "I can
| losslessly compress everything" to "I can losslessly
| compress exactly one thing only".
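| 
| (The counting argument, for the record: there are 2^n
| distinct files of n bits but only 2^n - 1 files shorter than
| n bits, so no lossless scheme can shrink every input; the
| pigeonhole principle guarantees some files must grow.)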
| yters wrote:
| I'm arguing it's the same as this image compression
| technique. They rely on a huge neural network which must
| exist wherever the image is to be decompressed.
| 
| If I'm allowed to bring along an unlimited amount of
| background data, then I can compress everything down to
| zero bits.
| 
| In contrast, an algorithm like LZ78 can be expressed in a
| 5-line python script and perform decently on a wide variety
| of data types.
| ebg13 wrote:
| > _If I'm allowed to bring along an unlimited amount of
| background data, then I can compress everything down to
| zero bits._
| 
| If by "background data" you mean the decompressor, this
| is patently false. No matter how much information is
| contained in the decompressor (the algorithm + stable
| weights that don't change), you can only compress one
| thing down to any given new representation (low-resolution
| image + differential from rescale using stable weights).
| 
| If by "background data" you mean new data that the
| decompressor doesn't already have, then you're ignoring
| the definition of compression. Your compressed data is
| all the bits sent on the fly that aren't already possessed
| by the side doing the decompression, regardless of the
| obtuse naming scheme.
| 
| > _I'm arguing it's the same as this image compression
| technique._
| 
| That's wrong, because this scheme doesn't claim to send a
| custom image generator instead of each image, which is
| what you're proposing.
| milesvp wrote:
| You have to convey which algorithm, which takes bits. And
| at the very least you need a pointer to a file, which
| also takes bits. You'd do well to look for archives of
| alt.comp.compression.
| 
| There was also a classic thread that surfaced recently on
| HN about a compression challenge whereby someone
| disingenuously attempted to compress a file of random data
| (incompressible by definition) by splitting it on a
| character and then deleting that character from each file.
| It was a simple algorithm that appeared to require fewer
| bits to encode. The problem is, all this person did was
| shift the bits to the filesystem's metadata, which is not
| obvious from the command line. The final encoding ended
| up taking more bits once you take said metadata into
| account.
| atorodius wrote:
| You are missing the point that you can use the same
| algorithm to compress N -> infinity images, so the
| algorithm size amortizes.
| yters wrote:
| Is that true?
| nullc wrote:
| Unless it's overfit on some particular inputs, but if
| so -- it's bad science.
| 
| Ideally they would have trained the network on a collection
| of images that doesn't overlap with their test set, but if
| they did that I don't see it mentioned in the paper.
| 
| The model is only 15.72 MiB (after compression with xz),
| so it would amortize pretty quickly... even if it was
| trained on the input, it looks like it still may be pretty
| competitive at a fairly modest collection size.
| c3534l wrote:
| This is true of all compression formats. The receiver has to
| know how to decode it. A good compression algorithm will
| attempt to send only the data that the receiver can't predict.
| If we both know you're sending English words, then I shouldn't
| have to tell you "q then u" - we should assume there's a "u"
| after "q" unless otherwise specified.
| This isn't new to this technique; it's a very common and old
| one (it's one of the first ones I learned watching a series of
| youtube lectures on compression, or maybe it was just
| information theory in general) and it has been commonly called
| lossless compression nonetheless.
| pbhjpbhj wrote:
| https://news.ycombinator.com/item?id=22807000
| 
| [Aside: I've heard they know nothing of qi in Qatar, ;oP]
| c3534l wrote:
| You're just not sending the database - you're sending
| whether or not it's the database. If you can only send a
| binary "is the database or is not the database" then 0 and
| 1 is indeed fully losslessly compressed information. If
| that's really what you want, then that's how you would do
| it. Full, perfect, lossless compression reduces your data
| down to only that information not shared between the sender
| and the receiver. Sending either 1 or 0 is, in fact,
| exactly what you want to do if the receiver already knows
| the contents of the database. Compression asks the question
| "what is the smallest amount of data I have to receive for
| the person on the other side to reconstruct the original
| data?" If the answer is "1 bit" then that's a perfectly
| valid message - the only information the receiver is
| missing is a single bit.
| 6510 wrote:
| Reminds me of this:
| 
| https://en.wikipedia.org/wiki/Jan_Sloot
| 
| It gave me a comical thought, if such things can be permitted.
| 
| You split the film into RGB and b/w, and turn the pictures into
| blurred vector graphics. Generate and use an incredibly large
| spectrum of compression formulas, made up of separable
| approaches, each sorted in such a way that one can dial into
| the most movie-like result.
| 
| 3D models for the top million famous actors and 10 seconds of
| speech, then deepfake to infinite resolution.
| 
| Speech-to-text with plot analysis, since most movies are pretty
| much the same.
| 
| Sure, it won't be lossless, but replacing a few unknown actors
| with famous ones and having a few accidental happy endings
| seems entirely reasonable.
| tjchear wrote:
| Would massive savings be achieved if an image sharing app like,
| say, Instagram were to adopt it, considering a lot of user-
| uploaded travel photos of popular destinations look more or
| less the same?
| chickenpotpie wrote:
| My guess is that it would be much more expensive unless it's a
| frequently accessed image. CPU and GPU time is much more
| expensive than storage costs on any cloud provider.
| greenpizza13 wrote:
| Where it might be very useful is for companies who distribute
| cellular IoT devices, where they pay for each byte uploaded.
| That could have a real impact on cost, with the tradeoff being
| more work on-device (which can be optimized).
| chickenpotpie wrote:
| Also could be great for using up spare CPU/GPU cycles, or
| having an AWS spot instance that triggers when pricing is
| low to compress images.
| kevinventullo wrote:
| Wouldn't it be cheaper if the image is _infrequently_
| accessed? I'm thinking of the extreme case where you have
| some 10-year-old photo that no one's looked at in 7 years. In
| that case the storage costs are everything, because the
| marginal CPU cost is 0.
| chickenpotpie wrote:
| It depends on whether the decompression is done on the server
| or on the client. If the client is doing the decompressing,
| it would be better to compress frequently accessed images,
| because that would lower bandwidth costs. If the server does
| the decompressing, it would be better to compress
| infrequently accessed images, to save on CPU costs.
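| 
| A back-of-envelope way to frame it (every number below is a
| made-up placeholder, not a real cloud price):
| 
|     # hypothetical cost model: does NN compression pay for itself?
|     def pays_off(bytes_saved, downloads, egress_per_gb,
|                  encode_cost, decode_cost_each=0.0):
|         saved = bytes_saved / 1e9 * downloads * egress_per_gb
|         spent = encode_cost + decode_cost_each * downloads
|         return saved > spent
| 
|     # client-side decode (decode cost ~ 0): hot images win big
|     print(pays_off(2e5, 1_000_000, 0.09, 0.01))        # True
|     # server-side decode: per-download GPU time eats the savings
|     print(pays_off(2e5, 1_000_000, 0.09, 0.01, 1e-4))  # False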
| ackbar03 wrote:
| That's sort of the conclusion I reached when I was looking at
| this stuff before. The economics of it don't quite work out
| yet.
| ackbar03 wrote:
| This is interesting, but I'm not sure if the economics of it
| will ever work out. It'll only be practical when the
| computation costs become lower than the storage costs.
| MiroF wrote:
| > computation costs become lower than storage costs
| 
| Most applications are memory-movement constrained rather than
| compute constrained.
| baliex wrote:
| Think of network bandwidth too!
| The_Colonel wrote:
| Think YouTube or Netflix - it's compressed once and then
| delivered a hundred million times to consumers.
| ackbar03 wrote:
| but if it's something that's requested / viewed a lot, that's
| probably something you don't want to be compressing and
| decompressing all the time. Neural networks still take quite
| a lot of computational power and require GPUs.
| 
| If it's something you don't necessarily require all the time,
| it's still probably cheaper to just store it instead of
| running it through an ANN. You just need to look at the prices
| of a GPU server compared with storage costs on AWS, and the
| estimated run time, to see there is still a large difference.
| 
| I mean, I could be wrong (and I'd love to be, since I looked
| at a lot of SR stuff before) but that's sort of the conclusion
| I reached, and I don't really see that anything has
| significantly changed since.
| eximius wrote:
| Is this actually lossless - that is, the same pixels as the
| original are recovered, guaranteed? I'm surprised such
| guarantees can be made from a neural network.
| tasty_freeze wrote:
| The way many compressors work is that, based on recent data,
| they try to predict the immediately following data. The
| prediction doesn't have to be perfect; it just has to be good
| enough that only the difference between the prediction and the
| exact data needs to be encoded, and encoding that delta usually
| takes fewer bits than encoding the original data.
| 
| The compression scheme here is similar. Transmit a low-res
| version of an image, use a neural network to guess what a
| 2x-size image would look like, then send just the delta to fix
| where the prediction was wrong. Then do it again until the
| final-resolution image is reached.
| 
| If the neural network is terrible, you'd still get lossless
| image recovery, but the amount of data sent in deltas would be
| greater than just sending the image uncompressed.
| eximius wrote:
| Ah, I understand! I wasn't aware that was how they worked!
| jlebar wrote:
| The neural net predicts the upscaled image, then they add on
| the delta between that prediction and the desired output. No
| matter what the neural net predicts, you can always generate
| _some_ delta.
| eximius wrote:
| I was failing to understand the purpose of the delta. :)
| magicalhippo wrote:
| Note that this is similar to how, for example, MPEG does it
| with the intermediate frames and motion vectors. First it
| encodes a full frame using basically regular JPEG; then for
| the next frames it does motion estimation by splitting the
| image into 8x8 blocks, and for each block it tries to find
| the position in the previous frame which best fits it. The
| difference in position is called the motion vector for that
| block.
| 
| It can then take all the "best fit" blocks from the previous
| frame and use them to generate a prediction of the next
| frame.
| It then computes the difference between the
| prediction and the actual frame, and stores this difference
| along with the set of motion vectors used to generate the
| prediction image.
| 
| If nothing much has changed, just the camera moving about a
| bit, the difference between the prediction and the actual
| frame data is very small and can be easily compressed.
| Also, the range of the motion vectors is typically limited
| to +/-16 pixels, and you only need one per block, so they
| take up very little space.
| m3at wrote:
| Related, for another domain: lossless text compression using an
| LSTM: https://bellard.org/nncp/
| 
| (this is by Fabrice Bellard; one wonders how he can achieve so
| much)
___________________________________________________________________
 (page generated 2020-04-07 23:00 UTC)