[HN Gopher] Adobe Photoshop's 'super resolution' made my jaw hit... ___________________________________________________________________ Adobe Photoshop's 'super resolution' made my jaw hit the floor Author : giuliomagnifico Score : 143 points Date : 2021-03-13 19:03 UTC (3 hours ago) (HTM) web link (petapixel.com) (TXT) w3m dump (petapixel.com) | frereubu wrote: | Pixelmator did this a while ago (December 2019): | https://www.pixelmator.com/blog/2019/12/17/all-about-the-new... | Can't comment on the quality of either because I haven't used | them, but they both seem to go to 200%. | nerbert wrote: | I use it regularly (for my personal photos) and it depends on | the photo. My observation is that pictures of natural elements | (clouds, water, stars) tend to yield better results than, for | example, a family picture in my house. | marcodiego wrote: | The samples in the article show very good results at preserving | details, so that curves do not get blurry when scaled up, but are | not particularly impressive. | | Off topic: I remember a few years ago, some students were very | impressed by GIMP's Lanczos-3 upscaling, which was much better than | the Photoshop version they had access to at the time. | PaulHoule wrote: | The 'Preserve Details 2.0' upscaler from Photoshop does an | amazingly good job. In particular, I started with a 500x500 | square image of a gundam sketch illustration which showed scan | lines when printed directly on an 8-inch square, but with 4x4 | scaling the image was close to perfect. | uniqueid wrote: | I thought it was interesting when Pixelmator introduced 'ML Super | Resolution', but that was in 2019. | https://9to5mac.com/2019/12/17/pixelmator-pro-super-resoluti... | [deleted] | jl6 wrote: | Is it just me who found the article immensely frustrating for | lacking same-region-crop side-by-side comparisons? 
| ghshephard wrote: | I don't even understand why someone would use a headline like | "Jaw hit the floor" without even bothering to share the two | images. It's not like Adobe Photoshop doesn't have the ability | to export images... | lupire wrote: | Adobe paid to promote their stuff | rrrrrrrrrrrryan wrote: | No, it's not just you. An overlay with a slider is pretty much | the standard way of comparing two near-identical images these | days, but this article not even having a side-by-side is just | downright lazy. | 0-_-0 wrote: | Exactly: an article about super-resolution doesn't actually show | what super-resolution is doing, despite the terribly clickbait-y | headline. | [deleted] | Waterluvian wrote: | Does the ML algorithm with the millions-of-images training set | get run locally or as a remote service? | | When I use ML these days, is it all hardcore data crunching by | remote servers, or is some of it running on my phone/laptop? | Invictus0 wrote: | The model is trained on Adobe servers and run locally on your | device. The training of the model is much more processor- | intensive than actually utilizing the trained model, usually by | multiple orders of magnitude. | Waterluvian wrote: | Ah okay, thanks. | | So any model that can be trained for generic use (e.g. not | trying to deep-fake my specific face) can presumably be run | on local machines. | | Thanks! | robk wrote: | Topaz Gigapixel is quite good, though, despite the negative | comments in the article. I'd rather give my money to Topaz for | this one feature than keep paying Adobe subscription fees | indefinitely. | AmVess wrote: | I've been using it for a year. Each version gets a little | better than the last, and it gets regular updates. | | It also has the rather large benefit of not being tied to | Creative Cloud, which is one of the worst bits of software ever | made. 
| CharlesW wrote: | Because they're David to Adobe's Goliath, I also feel compelled | to mention that I've just recently discovered/purchased this | and am incredibly impressed with it. | afavour wrote: | I guess professional photos have always been touched up, and this | is just an automation of that process. But I've long felt a | little odd about the use of machine learning in photography this | way. How long before Google Photos recreates dark photos by just | reassembling all of the items it believes are there from machine | learning? | medlyyy wrote: | I mean, the only reason things like this aren't commonplace is | cost (of the skillset, tools). Basically anything's possible | these days with CGI; everything is purely a matter of the | amount of effort you want to put in. | | And for artistic purposes, why does it matter how the final | result was arrived at? If we have powerful and easy techniques | for realising an artistic vision, that doesn't seem like a bad | thing? | kristiandupont wrote: | ...I guess the _enhance_ jokes were just rendered void! | | https://www.youtube.com/watch?v=Vxq9yj2pVWk | [deleted] | endisneigh wrote: | I'm not sure if that's the case with this tech. I could see, in | the near future, a scenario in which many, many individuals | (thousands) are photographing the same things in the same area | and you can intelligently superimpose things to "enhance", tho. | benlumen wrote: | CSI and the like were just ahead of their time | | edit: I was joking, but people pointing out that you still | can't create something out of nothing etc. might not be thinking | big enough. I think this technology absolutely has the | potential to help. Police are literally still using artists' | impressions - photofits - to find perpetrators | hojjat12000 wrote: | I think the artist's impression has a lot more value than a | highly realistic generated face. If you see an artist's | impression, you will see the facial features that were | noticeable. 
Such as a mole, the shape of the | nose, or the thickness of the eyebrows. Then you have a template that your | brain uses to match those features with any face that you | see. However, if I show you a highly realistic face, your | brain will take a different impression. Your brain has been trained | on faces for thousands of years. It will try to match the | face perfectly. | | An artist's impression tells the audience that it is | inaccurate. A realistic photo tells the audience that this is | _exactly_ who we are looking for. | medlyyy wrote: | Yep. To be useful for exploring potential "true" values, a | system would probably need some way of showing you the | distribution of its guesses, so you can get an idea of | whether there is any significant information there. | | That aside, you'd still probably need an ML PhD to have a | chance of correctly interpreting the results, given the | myriad potential issues with current systems. | tyingq wrote: | This demo is a fun way to see the ML take different guesses at how to | inpaint the missing data: | http://www.chuanxiaz.com/project/pluralistic/ | qayxc wrote: | Absolutely not. If there's not enough information available, | there's not enough information, full stop. | | Plausible (i.e. "good looking" or "believable") results are not | the same as actual data, which is why _enhance_ wouldn't work | on vehicle licence plates or faces, for example. | | Sure, the result might be a plausible-looking face or text, but | it's still not a valid representation of what was originally | captured. That's the danger with using such methods for | extracting actual information - it looks fine and is suitable | for decorative purposes, but nothing else. | nwienert wrote: | No, there certainly is a chance for ML to improve here. | | Let's take the classic example of enhancing a blurry photo to | get a license plate. 
| | Humans may not be able to see much in the blur, but an AI | trained on many different highly down-res'd images could at | least give you plausible outcomes from far less data than a | human brain would need to say anything with confidence. | | You wouldn't hold it up as the absolute truth, but you'd run | the potential plate and see if it matched some other data you | have. | | So yes, it wouldn't magically add any more information to the | image, but it could be far better at taking low information | and giving plausible outcomes that then need to be | verified. | qayxc wrote: | > Let's take the classic example of enhancing a blurry | photo to get a license plate. | | That's not the same as fabricating information, though. A | blurry image still contains a whole bunch of information | and correlation data that just isn't present in a handful | of pixels. | | This is not super-resolution, but something different | entirely. Super-resolution would mean producing a readable | license plate from just a handful of pixels. That is an | impossible task, since the pixels alone would necessarily | match more than one plate. | | The algorithm would therefore have to "guess", and the | result will match something it has been trained on | (read: plausible), but by no means necessarily the correct | one, no matter how many checks you run against a database. | | To illustrate the point, I took an image of a random | license plate and scaled it down to 12x6 pixels. 4x | super-resolution would bring it to 48x24 pixels and should | produce perfectly readable results. | | Here's how it looks (original, down-scaled to 48x24, and | down-scaled to 12x6 pixels): | https://pasteboard.co/JSu3WDU.png | | The 48x24 pixel version could easily be upscaled to even | make the state perfectly readable. A 4x super-resolution | upscale of the 12x6 version, however, would be doomed to | fail no matter what. | | That's what I'm getting at. 
| | Just for shits and giggles, here's the AI 4x | super-resolution result: https://pasteboard.co/JSu7jkP.png | | Edit: while I'm having fun with super-resolution, here's | the upscaled result from the 48x24 pixel version: | https://pasteboard.co/JSu9Qh6.png | hojjat12000 wrote: | Actually, the opposite. These algorithms are more | susceptible to noise; they may generate sharp, perfect | license plate numbers (that are totally fabricated and | completely wrong) from a blurry image. But by no means | should you even consider the results to have hints of | truth. | | A GAN produces totally different results if you slightly | change the input. | | So, as others are also saying, these "enhances" are great | for decoration and absolutely should be ignored as facts or | truth (especially when it comes to faces, license plates, and | other things used by law enforcement). | nwienert wrote: | No. If their training set isn't too far off what you use | it for, it is valid. Just because it's not guaranteed | doesn't mean it's not more accurate than hitting sharpen | and squinting. | | You're fighting against "would it be reliable", but that | isn't the claim. | | The claim is: could it be better than a human? And the | answer is yes; it just depends on how well trained it is | and on the dataset. | | But this is also entirely testable. I guarantee that, much like | Go, if we set up a "human vs AI guess the blurry image" | competition, the AI will blow us out of the water. It's | simply a data and training issue, and humans don't spend | hours on end practicing enhancing images like they do | playing chess. | | Again - it won't be perfect, obviously. It will have | false positives, of course. | | That doesn't mean it can't be better than a human. | | Also, GANs are pretty irrelevant; the model structure has | nothing to do with the theory. 
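The many-to-one argument in the exchange above - that distinct high-resolution plates collapse onto the same handful of low-resolution pixels - can be sketched numerically. This is a minimal illustration with synthetic arrays (the sizes mirror the 48x24 / 12x6 example; no real plate images or products are involved):

```python
import numpy as np

def downsample(img, factor):
    """Block-average downsampling: each output pixel is the mean of a
    factor x factor block, so all detail inside a block is discarded."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

rng = np.random.default_rng(0)

# A synthetic 24x48 "plate", and a second one that adds a strong
# fine-scale checkerboard which averages to exactly zero in every
# 4x4 block -- i.e. detail that block averaging cannot see.
plate_a = rng.random((24, 48))
detail = 0.5 * (-1.0) ** np.add.outer(np.arange(24), np.arange(48))
plate_b = plate_a + detail

low_a = downsample(plate_a, 4)  # 6x12 pixels, like the 12x6 crop above
low_b = downsample(plate_b, 4)

print(np.abs(plate_a - plate_b).max())  # 0.5 -- clearly different plates
print(np.allclose(low_a, low_b))        # True -- identical low-res crops
```

Any upscaler, learned or not, sees only the low-resolution crop, so it cannot distinguish the two originals; the best it can do is pick one plausible candidate.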
| perl4ever wrote: | >These algorithms are more susceptible to noise, they may | generate sharp perfect license plate numbers (that are | totally fabricated and completely wrong) from a blurry | image. | | This is not really an issue that is new, or limited to | things that are called AI. | | https://www.theregister.com/2013/08/06/xerox_copier_flaw_ | mea... | medlyyy wrote: | That's a fair point. Police use artist sketches from | witness descriptions to help identify a suspect. It's a | similar idea. | | The difficulty will be making sure people treat it the same | way, because it _looks_ like a normal image. | qayxc wrote: | Hm. So I took the example image, upscaled it by 200%, applied a | sharpen filter (all in Paint.NET), and compared the result to the | AI-upscaled image. TBH, I couldn't see a difference. | | 2x upscaling isn't all that impressive to begin with (e.g. | producing 4 pixels from 1) and can be done in fairly high quality | using traditional non-learning algorithms. | | I'm much more impressed by 4x and 8x super-resolution. I'm really | not sure what the big deal is with 2x. | FpUser wrote: | Same here. Playing a bit with CLAHE, microcontrast and | sharpening gives visually the same, if not better, results. | crazygringo wrote: | Came here to say the same thing. | | I was expecting the upscaled image to have extra "invented" | detail from the supposed ML, as I've seen elsewhere. | | But looking at these upscaled images, there isn't any at all. | There's no extra texture, nothing. | | I can't find any difference at all, like you say, from just | some bicubic interpolation with sharpening. | | No jaw-dropping here, unfortunately. | formerly_proven wrote: | I concur; in the second surfer example, the enhanced image looks | a lot like blowing up the original 2x and applying careful | sharpening. | afavour wrote: | > and applying careful sharpening | | That alone makes it worthwhile, surely? No careful | application required. 
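The baseline these comments compare against - a plain 2x upscale followed by sharpening - is simple enough to sketch. This is a rough, illustrative stand-in (nearest-neighbour upscale plus a box-blur unsharp mask; the function names and parameters are made up, not any product's actual pipeline):

```python
import numpy as np

def upscale_2x(img):
    """Nearest-neighbour 2x upscale: every pixel becomes a 2x2 block."""
    return np.kron(img, np.ones((2, 2)))

def box_blur(img):
    """3x3 box blur with edge replication -- a crude Gaussian stand-in."""
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for di in range(3):
        for dj in range(3):
            out += padded[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out / 9.0

def unsharp_mask(img, amount=0.8):
    """Classic sharpening: add back the detail the blur removes."""
    return img + amount * (img - box_blur(img))

rng = np.random.default_rng(1)
small = rng.random((4, 4))
result = unsharp_mask(upscale_2x(small))

print(result.shape)  # (8, 8): twice the resolution, zero new information
```

The point being made in the thread is that at 2x, an ML upscaler only has to beat this handful of lines, and the difference is hard to see.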
| ShamelessC wrote: | For sure, but the title is quite an overstatement and reads | like it's from someone who hasn't really been paying | attention to the many existing open-source super-resolution | offerings. | [deleted] | ISL wrote: | As a metrologist (and photographer), the difficulty with these | techniques is that they can over-represent the information | contained within an image; they present an image of what was | "probably" there, rather than representing what was. These aren't | so different from our own brains, which remember what we thought | we saw, rather than the light that reached our retinas. | | These methods are already in extensive use (most smartphone | images use extensive noise-reduction techniques), but we must be | ever-cognizant that image-processing techniques can add yet | another layer of nuance and uncertainty when we try to understand | an image. | social_quotient wrote: | It's like systems that add zeros (floating-point math) to the | right of the decimal. 18.00000 is not the same as 18. | | Thoughts? | dejj wrote: | You probably want to go with 0.1+0.2 != 0.3 | HenryBemis wrote: | I would say it's like this: the pixel's RGB at address 1x1 is | 0-0-0 and the pixel at address 1x2 is 0-0-2, and we squeeze | between them a pixel with color 0-0-1 (averaging the two | values near it), assuming we do this on an image that is 1 | pixel high and e.g. 2 pixels wide, so that the new image | would be: | | 1x1 0-0-0 (original) | | 1x2 0-0-1 (made up) | | 1x3 0-0-2 (original) | etrautmann wrote: | No - the added zeros are not based on a statistical model of what's | most likely to be right of the decimal place. In natural | images, the statistical structure allows for this image | upscaling, but without revealing any previously hidden detail - | just using known statistics of the world to show what might | be there. | gmfawcett wrote: | That's not how floating-point math works? 
At least not for | standard floats (IEEE 754), and except for very large | integers (near 2^m, where m is the number of mantissa bits in the | FP type). | Floats have an exact representation for integers within their | mantissa range -- i.e., '18' is exactly the same as | '18.0000'. | genericone wrote: | I don't think you mean for floating point, but for mechanical | tolerances. Many times, you don't want to pay an extra $50,000 | for the 5 digits of precision... but sometimes you do. It would be a | shitty system if it automatically messed up all your part | tolerances. | dorkwood wrote: | An example I saw getting traction on Twitter a few months ago | was a photo of Melania Trump that was purported to be a body | double. Since the original image was blurry, someone used an AI | upscaler to "enhance" the photograph and increase the | resolution. Then the comments started to roll in: the teeth are | different! The tip of her nose doesn't match! It's not her! | | Technically, they were correct -- it wasn't her. It was an | algorithm's best-guess reconstruction based on training data of | other people's faces. Unfortunately, neither the original | poster nor anyone else in the thread seemed to grasp this | concept. | roughly wrote: | I think this point is worth pushing on a bit harder, which is | to say that the "additional details" in the picture are guesses | by the software, not actual additional details. The data | present in the picture is fixed; the software uses that data to | build educated guesses about what was actually there. If the photo | doesn't contain enough data to actually determine what a given | piece of text in the image says, the software can provide a | guess, but it's just that: a guess. Similarly, if the photo | doesn't provide enough detail to positively identify a person, | the "super resolution" one cannot be used to positively | identify them either, as it's a guess made from incomplete | data, not genuinely new data. 
| | The point is worth belaboring because people have a tendency to | take the output from these systems as Truth, and while they can | be interesting and useful, they should not be used for things | for which the truth has consequences without understanding | their limitations. | | You're right to compare this to how our brains reconstruct our | own memories, and the implications that has for eyewitness | testimony should inform how we consider the outputs from these | systems. | TimTheTinker wrote: | This "guessing" is nice for the sake of artistry, but we've | got to be careful when knowing what actually _was_ there is | important--like when photos are submitted as evidence in | court cases, or when determining the identity of a person | from a photo as part of an investigation. | smnrchrds wrote: | I hope such photos are submitted as the camera takes them. With | or without this new feature, photoshopping a photo before | presenting it to court must be illegal. | roughly wrote: | If you consider photos taken by cell phones, it's hard to | really say what "as the camera takes them" means - a lot | of ML-driven retouching happens "automagically" with most | modern cell phones already, and I'd expect more in the | future. | kqr wrote: | It goes even further than that. Image sensors don't | capture images. They record electricity that can be | interpreted as an image. | | This might seem like a quibble, but once you dive a | little deeper into it, you realise that there's enormous | latitude and subjectivity in the way you do that | interpretation. | | What's even crazier is that this didn't start with digital | photography. Analogue film photography has the same | problem. The silver on the film doesn't become an image | until it's interpreted by someone in the darkroom. | | There is no such thing as an objective photograph. It's | always a subjective interpretation of an ambiguous | record. | klodolph wrote: | With analog photography you could at least use E-6. 
Processing | was tightly controlled and standardized, and once | processed, you had an image. | | The nice thing about this was that you could hand the E-6 | off to a magazine and end up with a photograph printed in | the magazine that was very close to the original film. | Any color shifts or changes in contrast you could see | just with your eyes. You could drop the film in a scanner | and visually confirm that the scan looks identical to the | original. (You cannot do this with C-41.) | | This was not used for forensic photography, though. The | point of using E-6 was for the photographer to make | artistic decisions and capture them on film, so they can | get back to taking photos. My understanding is that crime | scene photography was largely C-41, once it was | relatively cheap. | rozab wrote: | In some use cases, like OCR, the accuracy of these guesses | can be established in a scientific way. And it tends to be | very good. | sanj wrote: | Unless they're not: | https://www.dkriesel.com/en/blog/2013/0802_xerox- | workcentres... | roughly wrote: | I agree; I'd say two things in response, though: | | 1. However good the guess is, it's still just that: a | guess. Taking the standard of "evidence in a murder case", | the OCR can and probably should be used to point | investigators in the right direction so they can go and | collect more data, but it should not be considered | sufficient as evidence itself. | | 2. OCR is a relatively constrained solution space - success | in those conditions doesn't mean the same level of accuracy | can or will be reached outside of that constrained space. | | To be clear, though - I'm making a primarily epistemic | argument, not one based on utility. There are a lot of | areas for which these kinds of machine-guessing systems are | of enormous utility; we just shouldn't confuse what they're | doing with actual data collection. 
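The point above about establishing the accuracy of these guesses in a scientific way is, in fact, how super-resolution methods are usually evaluated: hold out ground-truth images, downsample them, reconstruct, and score the reconstruction. A minimal sketch of that loop using PSNR (the sizes and the naive pixel-doubling "reconstruction" below are placeholders for a real model):

```python
import numpy as np

def psnr(reference, reconstruction, peak=1.0):
    """Peak signal-to-noise ratio in dB: higher means a closer
    match to the known ground truth."""
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(2)
truth = rng.random((32, 32))  # held-out ground-truth image

# Downsample 2x by block averaging, then "reconstruct" by pixel doubling.
low = truth.reshape(16, 2, 16, 2).mean(axis=(1, 3))
guess = np.kron(low, np.ones((2, 2)))

score = psnr(truth, guess)
print(score > 0)  # True: a finite dB score to average over a test set
```

An upscaler that beats bicubic on average PSNR (or on perceptual metrics such as SSIM) over thousands of held-out images is measurably better, even though any individual output is still a guess.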
| sweezyjeezy wrote: | If you're using this to try and enhance super grainy CCTV | footage to get a face or license plate I'd agree. Purely in the | context of this article, the author is just upscaling an | already high-definition image 2x. There's very little artifice | that can be really added at this level that a human could | perceive IMO. | newobj wrote: | This is perceptual/creative enhance. Not Blade Runner enhance. | jrockway wrote: | I think this is probably good for what people use photos for; | it lets them show a crop without the image looking pixelated. | That means if they just want a photo to draw you in to their | blog post, they don't have to take a perfect photograph with | the right lens and right composition at the right time. And I | think that's fine. No new information is created by ML | upscaling, but it will look just good enough to fade into the | background. | | I personally take a lot of high resolution art photos. One that | is deeply in my memory is a picture I took of the Manhattan | bridge from the Brooklyn side with a 4x5 camera. I can get out | the negative and view it under magnification and read the | street signs across the river. (I would link you, but Google | downrez'd all my photos, so the negatives are all I have.) ML | upscaling probably won't let you do that, but on the other | hand, it's probably pointless. It's not something that has a | commercial use, it's just neat. If you want to know what the | street signs on the FDR say, you can just look at Google Street | View. | | (OK, maybe it does have some value. I used to work in an office | that had pictures blown up to room-size used as wallpaper in | conference rooms. It looked great, and satisfied my desire to | get close and see every detail. But, you know you're taking | that kind of picture in advance, and you use the right tools. | You can rent a digital medium format camera. You can use film | and get it drum scanned. 
But, for people that just need a | picture for an article, fake upscaling is probably good enough. | The picture isn't an art exhibit, or an attempt to collect | visual data. It's just something to draw you into the article | in the 3 milliseconds before you see a wall of text and | bounce.) | danaliv wrote: | _> These aren't so different from our own brains, which | remember what we thought we saw, rather than the light that | reached our retinas._ | | Never mind memories; there are parts of our eyes that aren't | responsive to light at all. We're always hallucinating. | [deleted] | prox wrote: | Would it be right to say it is a synthesis on top of an | analysis? It wasn't what was observed. For some things it might | not matter, but "it looks shopped" isn't really a positive in | my book. Although the use case in the article is pretty handy, | for printing stuff a lot larger. | [deleted] | oblio wrote: | I like how he realizes the impact for pro cameras but doesn't | highlight the elephant in the room: phone cameras. | | This means that soon, for many people, digital cameras outside of | their smartphone will become an even more niche product. | agumonkey wrote: | I always expected smartphones already used approximations of SR | to compensate for lower-quality optics / ICs. | t-writescode wrote: | This is already the case. I rarely have a need to take out my | SLR - it's just too bulky to have a reason for it, unless I'm | going on an adventure where photography is one of the main | purposes, or the main purpose. | | I've gone on hiking trips where my "challenge" was to only use | my phone camera. It wasn't much of a challenge for landscapes. | dingaling wrote: | > It wasn't much of a challenge for landscapes. | | Well, if you collapse the problem space to a single point that | corresponds to a phone's standard field of view, then it | won't be a problem... 
| | But what if you wanted to catch a photo of a rare bird in | flight at 500mm equivalent, or a surfer caught at 1/4000th of | a second? | HenryBemis wrote: | Most people are like that. But when I go for a 'photowalk' I | cannot imagine not using my (#1) DSLR (or my (#2) super-duper | zoom point-and-shoot camera). | | A phone (imho) is for quick and dirty, not for 'it's time to | do proper photography'. | Causality1 wrote: | Any chance of Adobe licensing this tech out? I can see it making | one hell of a difference when it comes to zoomed-in pictures on | phones. | lprd wrote: | Neat. If only Adobe would do away with their absurd pricing | models. I'll never use an Adobe product again after trying to end | my subscription with them. | wlesieutre wrote: | If you're on a Mac, Pixelmator Pro is a $40 purchase with a | similar feature | | https://www.pixelmator.com/blog/2019/12/17/all-about-the-new... | Black101 wrote: | Yeah, a monthly payment forever is ridiculous... | lprd wrote: | I think what bothers me the most is that it's not just a | monthly subscription. When you sign up, you are entering a | one-year contract with them. Sure, you can cancel at any | time... just pay the remaining amount due and you can walk | away. | omnimus wrote: | Exactly. One of the arguments Adobe was making in | professional circles about the subscription switch was that | people would save money, because they would be able to | subscribe to each piece of software, and for short periods | of time, when they needed it. | | The truth is that anything other than the full suite (and maybe | the photographer plan) doesn't make sense financially. And | then they killed the month-by-month subscription, as you | said. | Grakel wrote: | Absolutely. And the results shown in this article aren't | particularly impressive. I'll be sticking with photopea.com, | even though I get free CC through work. | lprd wrote: | Creative Cloud was the point at which I noticed a shift in Adobe's | priorities. 
I don't know if they switched CEOs at the time, | but I started disliking Adobe more and more from that point | forward. I couldn't believe the amount of crud Creative Cloud | puts on your system, not to mention all of the tracking and | phoning home their software does. | mcrutcher wrote: | This seems highly relevant as to what is actually going on: | http://www.johncostella.com/magic/ | zamadatix wrote: | For the tl;dr, the bit at the end (and the linked paper) can cover | the topic without the backstory, if that's not your sort of | thing: | | "As noted above, in 2021 I analytically derived the Fourier | transform of the Magic Kernel in closed form, and found, | incredulously, that it is simply the cube of the sinc function. | This implies that the Magic Kernel is just the rectangular | window function convolved with itself twice--which, in | retrospect, is completely obvious. This observation, together | with a precise definition of the requirement of the Sharp | kernel, allowed me to obtain an analytical expression for the | exact Sharp kernel, and hence also for the exact Magic Kernel | Sharp kernel, which I recognized is just the third in a | sequence of fundamental resizing kernels. These findings | allowed me to explicitly show why Magic Kernel Sharp is | superior to any of the Lanczos kernels. It also allowed me to | derive further members of this fundamental sequence of kernels, | in particular the sixth member, which has the same | computational efficiency as Lanczos-3, but has far superior | properties." | tomc1985 wrote: | A non-tech muggle's jaw hitting the floor is practically par for | the course. I'm so tired of reading these breathless assessments from | people who don't know any better | intricatedetail wrote: | I hate articles where the author shows an option but won't actually | tell you where it is located in the application. I spent 10 minutes | looking for it in the latest PS and couldn't find it. 
Then I clicked | the link to a related article about "Enhance Details", and it seems | like the option could be in Lightroom instead? I tried to use it | myself because the illustrations in the article don't look too | impressive, but the author's enthusiasm got me to look for it. | perl4ever wrote: | What I felt was missing was a "conventional" enlargement with | the previous best algorithm side by side with the AI one. | marcodiego wrote: | How about a less clickbaity title? | zarmin wrote: | oh man, wait til you see the rest of the internet. | spion wrote: | So what happens if you take a tiny 10x10 image and run it through | "super resolution" about 8-10 times? | sweezyjeezy wrote: | https://openaccess.thecvf.com/content_cvpr_2018/papers/Yu_Su... | | Stuff like this - assuming it's a GAN under the hood, it just | tries to guess a 'plausible' possible interpolation, but if | you're giving it very little information about what's in the | original image, there will be a wide range of plausible images | it could have arisen from, so the output can be very far from | the truth. | Someone wrote: | I suggest reading the Adobe blog post at | https://blog.adobe.com/en/publish/2021/03/10/from-the-acr-te... | instead. It has sample images side by side with bicubic | upsampling. | | Even better comparisons are in the blog post for a competing | product: https://www.pixelmator.com/blog/2019/12/17/all-about- | the-new... (likely the same algorithm, but using a different | training set, so results will be different from what Adobe's | product does). | | It has comparisons with nearest-neighbor, bilinear and Lanczos | filters and uses a slider to make it easier to see the | difference. | | Papers on this task: https://paperswithcode.com/task/image-super- | resolution ___________________________________________________________________ (page generated 2021-03-13 23:01 UTC)