[HN Gopher] Color-Diffusion: using diffusion models to colorize ...
       ___________________________________________________________________
        
       Color-Diffusion: using diffusion models to colorize black and white
       images
        
       Author : dvrp
       Score  : 101 points
       Date   : 2023-08-03 20:24 UTC (2 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | buildbot wrote:
       | Does it work on arbitrary image sizes?
       | 
        | One of the nice features of the somewhat old DeOldify colorizer
        | is support for any resolution. It actually does better than
        | Photoshop's colorization: https://blog.maxg.io/colorizing-
        | infrared-images-with-photosh...
       | 
       | Edit - technically, I suppose, the way Deoldify works is by
       | rendering the color at a low resolution and then applying the
       | filter to a higher resolution using OpenCV. I think the same sub-
       | sampling approach could work here...
        
         | erwannmillon wrote:
          | Technically yes, the encoder and unet are convolutional and
          | support arbitrary input sizes, but the model was trained at
          | 64x64px because of compute limitations. You could probably
          | resume the training from a 64x64 resolution checkpoint and
          | train at a higher resolution.
         | 
          | But like most diffusion models, it doesn't generalize very
          | well to resolutions outside of its training distribution
        
       | asciimov wrote:
        | I'm not a fan of b&w colorization. Often the colors are wrong,
        | either outright color errors (like the choices for clothing or
        | cars) or a failure to take lighting conditions into account
        | (late-in-day shadows but midday brightness).
       | 
       | Then there is the issue of B&W movies. Using this kind of tech
       | might not give pleasing results as the colors used for sets and
       | outfits were chosen to work well for film contrast and not for
       | story accuracy. That "blue" dress might really be green. (Please,
       | just leave B&W movies the way they are.)
        
         | zamadatix wrote:
         | I think colorization with some effort put in can be pretty
         | decent. E.g. I prefer the 2007 colorization of It's a Wonderful
         | Life to the original. It's never perfect but I don't think
         | that's a prerequisite to being better. Some will always
         | disagree though.
         | 
          | Just about every fully automated colorized video tends to be
          | pretty bad though. Particularly the YouTube "8k colorized
          | interpolated" kind of low-effort channels that just pump them
          | out without caring whether it's actually any good.
        
       | iamflimflam1 wrote:
       | I wonder if this can be used for color correction in videos.
        
       | erwannmillon wrote:
        | Btw, I did this in pixel space for simplicity, cool animations,
        | and compute costs. Would be really interesting to do this as an
        | LDM (though of course you can't really do the LAB color space
        | thing, unless you maybe train an AE specifically for that color
        | space).
       | 
       | I was really interested in how color was represented in latent
       | space and ran some experiments with VQGAN clip. You can actually
       | do a (not great) colorization of an image by encoding it w/
       | VQGAN, and using a prompt like "a colorful image of a woman".
       | 
        | Would be fun to experiment with if anyone wants to try; would
        | love to see any results if someone wants to build on it
        
         | xigency wrote:
         | Question, how long did it take to train this model and what
         | hardware did you use?
        
         | carbocation wrote:
         | > _I did this in pixel space for simplicity, cool animations,
         | and compute costs_
         | 
          | A slight nitpick: wouldn't doing diffusion in the latent
          | space be cheaper?
        
       | data-ottawa wrote:
       | Off topic: this has an absolutely 90's sci-fi movie effect
       | watching the gifs, it's funny how the tech just wound up looking
       | like that.
        
         | erwannmillon wrote:
         | hahaha it reminded me of some "zoom and enhance" stuff when I
         | was making the animations
        
           | nerdponx wrote:
           | Looks like something you'd see in an X Files episode.
        
           | barrkel wrote:
           | It reminded me of the days of antenna pass-through VCR
           | players, where you had to tune into your VCR's broadcast
           | signal when you couldn't use SCART.
        
       | snvzz wrote:
       | All the examples are portraits of people.
       | 
       | I have to wonder whether it works well with anything else.
        
         | erwannmillon wrote:
          | trained on CelebA, so no, but you could for sure train this on
          | a more varied dataset
        
           | Eisenstein wrote:
           | Would it be as simple as feeding it a bunch of decolorized
           | images along with the originals?
        
             | atorodius wrote:
             | yes, so infinite training data. but the challenge will be
             | scaling to large resolutions and getting global consistency
        
               | jrockway wrote:
                | Is that challenging? Humans have awful color resolution
                | perception, so even if you have a huge black-and-white
                | image, people would think it looks right even with very
                | low-resolution color information. Or, if the AI
                | hallucinates a lot of high-frequency color noise, it
                | wouldn't be noticeable.
               | 
               | Wikipedia has a great example image here:
               | https://en.wikipedia.org/wiki/Chroma_subsampling. Most
               | people would say all of them looked fine at 1:1
               | resolution.
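The chroma-subsampling point can be shown in a few lines of numpy (a rough sketch with a hand-rolled BT.601 conversion, not production color handling): averaging chroma over 2x2 blocks leaves the luma, which carries the detail we actually resolve, numerically untouched.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    # BT.601 full-range conversion; rgb is float in [0, 1]
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def ycbcr_to_rgb(ycc):
    y, cb, cr = ycc[..., 0], ycc[..., 1], ycc[..., 2]
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return np.stack([r, g, b], axis=-1)

def subsample_chroma(rgb, factor=2):
    """Keep luma at full resolution; average chroma over factor x factor blocks."""
    ycc = rgb_to_ycbcr(rgb)
    h, w = ycc.shape[:2]
    blocks = ycc[..., 1:].reshape(h // factor, factor, w // factor, factor, 2)
    low = blocks.mean(axis=(1, 3))                          # chroma downsample
    up = low.repeat(factor, axis=0).repeat(factor, axis=1)  # nearest upsample
    out = ycc.copy()
    out[..., 1:] = up
    return ycbcr_to_rgb(out)

rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))
recon = subsample_chroma(img, factor=2)
# Luma survives the round trip exactly; all the error is in the chroma
err = np.abs(rgb_to_ycbcr(recon)[..., 0] - rgb_to_ycbcr(img)[..., 0]).max()
```

The chroma channels of `recon` are blurred, but because the luma channel is byte-for-byte the original, the result still reads as sharp.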
        
               | atorodius wrote:
                | I meant more from a compute standpoint, the models are
                | expensive to run at full res
        
               | jrockway wrote:
               | I see what you mean. I think that you can happily scale
               | the B&W image down, run the model, and then scale the
               | chroma information back up.
               | 
               | Something I was thinking about after writing the comment
                | is that the model is probably trained on chroma-
                | subsampled images. Digital cameras do it with the Bayer
                | filter, and video cameras add 4:2:0 or similar
                | subsampling as they compress the image. So the AI
               | is probably biased towards "look like this photo was
               | taken with a digital camera" versus "actually reconstruct
               | the colors of the image". What effect this actually has,
               | I don't know!
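The scale-down / colorize / chroma-upsample idea from above could be sketched like this in numpy (`colorize_fn` is a hypothetical stand-in for the actual model; the box downsample and nearest-neighbour upsample are deliberately crude):

```python
import numpy as np

def colorize_lowres_then_upsample(gray_full, colorize_fn, work_size=64):
    """Run a (hypothetical) colorizer at the model's working resolution,
    then transplant its low-res chroma onto the full-resolution luma.

    gray_full:   (H, W) float luma in [0, 1], H and W multiples of work_size
    colorize_fn: maps a (work_size, work_size) gray image to a
                 (work_size, work_size, 2) array of chroma channels
    """
    H, W = gray_full.shape
    # crude box downsample to the model's working resolution
    gray_small = gray_full.reshape(work_size, H // work_size,
                                   work_size, W // work_size).mean(axis=(1, 3))
    chroma_small = colorize_fn(gray_small)
    # nearest-neighbour chroma upsample back to full resolution
    chroma_full = (chroma_small.repeat(H // work_size, axis=0)
                               .repeat(W // work_size, axis=1))
    # full-res luma + upsampled low-res chroma
    return np.concatenate([gray_full[..., None], chroma_full], axis=-1)

# toy stand-in for the model: constant warm chroma everywhere
fake_model = lambda g: np.full(g.shape + (2,), 0.1)

gray = np.random.default_rng(1).random((256, 256))
out = colorize_lowres_then_upsample(gray, fake_model)
```

The luma channel of `out` is exactly the input grayscale image, so all full-resolution detail is preserved regardless of how coarse the predicted chroma is.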
        
               | atorodius wrote:
                | good point, I hadn't realized that you only need to
                | predict chroma! That actually greatly simplifies things
               | 
               | re. chroma subsampling in training data: this is actually
               | a big problem and a good generative model will absolutely
               | learn to predict chroma subsampled values (or JPEG
               | artifacts even!). you can get around it by applying
               | random downscaling with antialiasing during training.
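That augmentation could look roughly like this (numpy-only sketch; box averaging stands in for a proper antialiasing filter such as Lanczos, which a real pipeline would likely use):

```python
import numpy as np

def random_antialiased_downscale(img, rng, factors=(1, 2, 4)):
    """Training-time augmentation: randomly box-filter-downscale the target
    so the model cannot latch onto chroma-subsampling or JPEG artifacts.
    Box averaging acts as a crude antialiasing low-pass filter.
    """
    f = rng.choice(factors)
    if f == 1:
        return img
    h, w, c = img.shape
    h2, w2 = (h // f) * f, (w // f) * f  # crop so the size divides evenly
    img = img[:h2, :w2]
    return img.reshape(h2 // f, f, w2 // f, f, c).mean(axis=(1, 3))

rng = np.random.default_rng(42)
batch = [random_antialiased_downscale(np.ones((65, 65, 3)), rng)
         for _ in range(8)]
sizes = {b.shape for b in batch}
```

Each training example then arrives at a random effective resolution, so subsampled chroma patterns stop being a reliable shortcut for the model.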
        
               | drapado wrote:
               | I guess you can always use a two-stage process. First
               | colorize, then upscale
        
               | atorodius wrote:
                | yeah, you can use SOTA super-res, but that tends to be
                | generative too (sometimes diffusion-based itself, more
                | commonly GAN-based). it can be a challenge to
                | synthesize the right high-res details.
               | 
               | but that's basically the stable diffusion paper
               | (diffusion in latent space plus GAN superres)
        
             | erwannmillon wrote:
              | basically the training works as follows: Take a color image
              | in RGB. Convert it to LAB, an alternative color space where
              | the first channel is a greyscale image and the other two
              | channels represent the color information.
             | 
             | In a traditional pixel-space (non latent) diffusion model,
             | you noise all the RGB channels and train a Unet to predict
             | the noise at a given timestep.
             | 
              | When colorizing an image, the Unet always "knows" the black
              | and white image (i.e. the L channel).
             | 
             | This implementation only adds noise to the color channels,
             | while keeping the L channel constant.
             | 
             | So to train the model, you need a dataset of colored
             | images. They would be converted to LAB, and the color
             | channels would be noised.
             | 
             | You can't train on decolorized images, because the neural
             | network needs to learn how to predict color with a black
             | and white image as context. Without color info, the model
             | can't learn.
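A minimal numpy sketch of that forward-noising step (the actual repo uses PyTorch, so the DDPM-style schedule and names here are illustrative): only the a/b channels are noised, and the untouched L channel is what the denoiser conditions on.

```python
import numpy as np

def forward_diffuse_color(lab, t, alphas_cumprod, rng):
    """Forward diffusion in the spirit of the description above:
    noise only the a/b colour channels, keep the L channel fixed.

    lab: (H, W, 3) image in LAB, scaled to roughly [-1, 1]
    t:   integer timestep indexing into the noise schedule
    """
    l, ab = lab[..., :1], lab[..., 1:]
    noise = rng.standard_normal(ab.shape)
    a_bar = alphas_cumprod[t]
    noisy_ab = np.sqrt(a_bar) * ab + np.sqrt(1.0 - a_bar) * noise
    # the Unet would be trained to predict `noise` given [l, noisy_ab] and t
    return np.concatenate([l, noisy_ab], axis=-1), noise

# linear beta schedule, as in vanilla DDPM
betas = np.linspace(1e-4, 0.02, 1000)
alphas_cumprod = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
lab = rng.uniform(-1, 1, size=(64, 64, 3))
noisy, eps = forward_diffuse_color(lab, t=500,
                                   alphas_cumprod=alphas_cumprod, rng=rng)
```

At every timestep the L channel of `noisy` is identical to the input, which is exactly the "greyscale image as context" property described above.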
        
               | coldtea wrote:
                | > _You can't train on decolorized images, because the
                | neural network needs to learn how to predict color with a
                | black and white image as context. Without color info, the
                | model can't learn._
                | 
                | I think the parent means decolorized images used to
                | test the success and guide the training (since they can
                | be readily compared with the colored images they resulted
                | from, which would be the perfect result).
                | 
                | Not using decolorized images alone to train for coloring
                | (which doesn't even make sense).
        
               | omoikane wrote:
               | Is there a reason for using LAB as opposed to YCbCr? My
               | understanding is that YCbCr is another model that
               | separates luma (Y) from chroma (Cb and Cr), but JPEG uses
               | YCbCr natively, so I wonder if there would be any
               | advantage in using that instead of LAB?
        
               | TylerE wrote:
                | The Y in YCbCr is linear, and is just a grayscale image.
                | The L channel in LAB is non-linear (as are A and B); it
                | is a complex transfer function designed to mimic the
                | response of the human eye.
               | 
               | A YCbCr colorspace is directly mapped from RGB, and thus
               | is limited to that gamut.
               | 
                | LAB can encode colors brighter than diffuse white (a la
                | #ffffff), like an outdoor scene in direct sunlight.
               | 
               | Sorta HDR (LAB) vs non-HDR (YCbCr).
               | 
                | This image (https://upload.wikimedia.org/wikipedia/common
                | s/thumb/f/f3/Ex...) is a good demo: the left side was
                | processed in LAB, the right in YCbCr. Even reduced back
                | down to a jpeg, the left side is obviously more lifelike,
                | since the highlights and tones were preserved until much
                | later in the processing pipeline.
        
               | atorodius wrote:
               | You can take arbitrary images and convert them to
               | grayscale for training, and do conditional diffusion
        
               | bemusedthrow75 wrote:
               | But convert them to grayscale how?
               | 
               | Black and white film doesn't have one single colour
               | sensitivity. Play around with something like DxO FilmPack
               | sometime (it has excellent measurement-based
               | representations of black and white film stocks).
               | 
               | It's a much more complex problem than it might seem on
               | the surface.
        
               | atorodius wrote:
               | fair, but can't you just randomize the grayscale
               | generation for training?
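One way to randomize it, as a hedged sketch: draw random non-negative channel weights per training example instead of the fixed Rec. 601 ones, loosely simulating unknown film sensitivities. (This only varies spectral weighting; it does not model film transfer curves.)

```python
import numpy as np

def random_grayscale(rgb, rng):
    """Convert RGB to gray with random channel weights that sum to 1,
    standing in for an unknown film stock's color sensitivity
    (vs. the fixed Rec. 601 weights 0.299 / 0.587 / 0.114)."""
    w = rng.dirichlet(np.ones(3))  # random non-negative weights, sum to 1
    return rgb @ w

rng = np.random.default_rng(7)
img = rng.random((32, 32, 3))
gray = random_grayscale(img, rng)
```

Because the weights are convex, the output stays in the input's value range; each draw yields a differently "sensitized" grayscale of the same scene.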
        
               | bemusedthrow75 wrote:
               | But since you do not have access to colour originals of
               | historical photos in almost every instance, you cannot
               | possibly train the network to have any instinct for the
               | colour sensitivity of the medium, can you?
               | 
               | An extreme example:
               | 
               | https://www.cabinetmagazine.org/issues/51/archibald.php
               | 
               | https://www.messynessychic.com/2016/05/05/max-factors-
               | clown-...
               | 
                | Colourising old TV footage can _only_ result in a
                | misrepresentation, because the underlying colours were
                | falsified just to get any kind of usable representation
                | on the medium itself.
               | 
               | And this caricatured example underpins the problem with
               | colourisation: contemporary bias is unavoidable, and can
               | be misleading. Can you take a black and white photo of an
               | African-American woman in the 1930s and accurately colour
               | her skin?
               | 
               | You cannot.
        
               | [deleted]
        
               | dragonwriter wrote:
               | > Can you take a black and white photo of an African-
               | American woman in the 1930s and accurately colour her
               | skin?
               | 
                | AI colorization will, in general, be _plausible_, not
                | _accurate_.
        
               | morelisp wrote:
               | In other words, bullshit.
        
               | snvzz wrote:
               | The original color information just isn't there.
               | 
               | So bullshit is the best you're going to get.
        
               | morelisp wrote:
               | Well, you could also _not put more bullshit in the world
               | by not doing the thing._
        
               | roywiggins wrote:
               | People have been colorizing photos as long as there have
               | been photos.
        
               | wruza wrote:
               | Why are you so negative about it? Pretty sure many people
               | would find it impressive to colorize old photos to look
               | at them as if these were taken in color.
               | 
               | Should artists not put their bs in the world? Writers?
               | Musicians? Most of it is made up but plausible to make
               | you feel something subjective.
        
               | dragonwriter wrote:
               | No more so than any other colorization method that isn't
               | dependent on out-of-band info about the particular image
               | (and even that is just more constrained informed
               | guesswork.)
               | 
               | That's what happens when you are filling in missing info
               | that isn't in your source.
               | 
               | EDIT: Of course, color photography can be "bullshit"
               | rather than accurate in relation to the actual colors of
               | things in the image; as is the case with the red, blue,
               | and _green_ (actual colors of the physical items)
               | uniforms in Star Trek: The Original Series. But, also
               | fairly frequently, lots of not-intentionally-distortive
               | reproductions of skin tones (often most politically
               | sensitive in the US with racially non-White subjects,
               | where there are also plenty of examples of _deliberate_
               | manipulation.)
        
               | morelisp wrote:
                | Showing color X on TVs by actually making the thing color
                | Y in the studio is, well, _filming_, not bullshit. It's
                | an intentional choice playing out as intended. It is
                | meant to communicate a particular thing and does so.
        
               | dragonwriter wrote:
               | That particular thing was _not_ intentional, and is the
               | reason why the (same color in person, different material)
               | command wrap uniform that is supposed to be color-matched
               | to the made-as-green uniforms isn't on screen.
               | 
               | But, yes, in general inaccurate color reproduction can be
               | intentionally manipulated with planning to intentionally
               | create appearances in photos that do not exist in
               | reality.
        
               | jackpeterfletch wrote:
                | _shrug_ people like looking at colorised photos because
                | it helps root the image within the setting of the real
                | world they occupy.
                | 
                | For some it's more evocative, regardless of the
                | absolute accuracy.
               | 
               | Having a professional do it for that picture of your
               | great grandad is expensive.
               | 
               | Having a colourisation subreddit do it is probably worse
               | for accuracy.
               | 
               | I think there is a place for this bullshit.
        
               | erwannmillon wrote:
                | Yeah, the model is racist for sure. That's a limitation
                | of the dataset though (CelebA is not known for its
                | diversity, but it was easy for me to work with; I trained
                | this model on Colab)
                | 
                | And plausibility is a feature, not a bug.
               | 
                | There are always many plausibly correct colorizations of
                | an image, which you want the model to be able to capture
                | in order to be versatile.
               | 
               | Many colorization models introduce additional losses
               | (such as discriminator losses) that avoid constraining
               | the model to a single "correct answer" when the solution
               | space is actually considerably larger.
        
               | atorodius wrote:
                | This is true, but if you have some reference images, you
                | can probably adapt some of the recent diffusion
                | adaptation work such as DreamBooth, to tell the model
                | "hey, this period looked like this", and finetune it.
               | 
               | https://dreambooth.github.io/
        
       | ChrisArchitect wrote:
       | Author's writeup on this from May:
       | https://medium.com/@erwannmillon/color-diffusion-colorizing-...
        
       | aziaziazi wrote:
       | How much would it cost to colorize a movie with a fork of this?
        
         | morelisp wrote:
         | [flagged]
        
         | NBJack wrote:
          | I think the bigger question is whether it would be stable
          | enough. Many SD-like models struggle with consistency across
          | multiple images (i.e. frames) even when content doesn't change
          | much. Would be a cool problem to see tackled.
        
           | erwannmillon wrote:
           | temporal coherence is def an issue with these types of
           | models, though I haven't tested it out with ColorDiffusion.
           | Assuming you're not doing anything autoregressive (from frame
           | to frame) to do temporal coherence, you can also parallelize
           | the colorization of each frame, which would affect cost.
           | 
           | Tbh most cost effective would be a conditional GAN though
        
             | lajamerr wrote:
             | Change up the model. That allows it to see previous frames
             | and 1-2 future frames.
             | 
             | Then train the model on movies that are color and then turn
             | them black and white.
             | 
             | That way you can train temporal coherence.
        
         | leetharris wrote:
         | Quick math:
         | 
         | 24 frames per second * 60 seconds per minute * 90 minute movie
         | length = 129600 frames
         | 
          | If you could get cost to a penny per frame, about $1,300? But
          | I'd bet you could easily get it an order of magnitude less in
          | terms of cost. So $130 or so?
         | 
         | And that's assuming you do 100% of frames and don't have any
         | clever tricks there.
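The arithmetic, as a quick sanity check (the per-frame cost is an assumption, not a measured number): 129,600 frames at a penny each comes to about $1,300.

```python
# Back-of-the-envelope cost of colorizing a feature film frame by frame
fps = 24
minutes = 90
frames = fps * 60 * minutes   # 24 * 60 * 90 = 129,600 frames
cost_per_frame = 0.01         # assumption: one cent of GPU time per frame
total = frames * cost_per_frame
print(f"{frames} frames at ${cost_per_frame}/frame -> ${total:,.2f}")
```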
        
           | caturopath wrote:
           | I'm willing to bet that if you just treated each frame as an
           | image, it would result in some weird stuff when you played
           | them as a movie.
           | 
           | > penny per frame
           | 
           | Where did this come from?
        
             | leetharris wrote:
             | I do lots of large scale ML work, this was just sort of a
             | random educated "order of magnitude" guess.
        
       | jurassic wrote:
       | This is a cool party trick, but I don't see a need for this in
       | any real applications. Black and white is its own art form, and a
       | lot of really great black and white images would look like
       | absolute garbage if you could convert them to color. This is
       | because the things that make a great black and white image
       | (dramatic contrasts, emphasis on shape/geometry, texture, etc)
       | can lose a lot of their impact when you introduce color. Our
       | aesthetic tolerance for contrast seems significantly reduced in
       | color because our expectations for the image are more anchored in
       | how things look in the real world. And colors which can be very
       | pleasing in some images are just distracting in others.
       | 
       | So all this is to say.... I don't think there would be commercial
       | demand to, say, "upgrade" classic movies with color. Those films
       | were shot by cinematographers who were steeped in the black &
       | white medium and made lighting and compositional choices that
       | take greatest advantage of those creative limitations.
        
         | [deleted]
        
         | dragonwriter wrote:
         | > I don't think there would be commercial demand to, say,
         | "upgrade" classic movies with color.
         | 
         | There was, and maybe there will be again once we get far enough
         | from the consumer burnout from the absolute deluge of that in,
         | mostly, the 1980s-1990s.
         | 
         | https://en.m.wikipedia.org/wiki/List_of_black-and-white_film...
        
         | simonw wrote:
         | I've run colorization like this against historic photographs
         | and it had a very real impact on me - I found myself able to
         | imagine life when the photo or video was taken much more easily
         | when it was no longer in black and white.
         | 
         | Here's an example I really enjoyed, of a snowball fight in
         | 1896:
         | https://twitter.com/JoaquimCampa/status/1311391615425093634
        
         | bemusedthrow75 wrote:
         | > I don't think there would be commercial demand to, say,
         | "upgrade" classic movies with color.
         | 
         | Alas there has been serious money in this in the past (VHS and
         | as I understand it US cable TV).
         | 
         | I would not assume that we have more taste now than we did
         | then. (The state of cinema suggests the opposite to me at
         | least.)
        
         | MrVandemar wrote:
          | Some of the old Doctor Who stories were filmed in colour, but
          | only black and white copies survive. The colourisations have
          | been ... very good, better than I would have thought, but not
          | perfect. Could be a good application.
        
         | pythonguython wrote:
          | Counterexample: They Shall Not Grow Old, a WW1 documentary
          | film with mostly colorized footage and recreated audio. The
          | film was commercially successful and I found it to be a great
          | watch.
        
       | bemusedthrow75 wrote:
       | Colourising old photographs is the banal apotheosis application
       | of diffusion AI.
       | 
       | It's the pinnacle of the whole thing: "imagine it for me in a way
       | that conforms to my contemporary expectations".
       | 
       | If you're going to colourise images, have the decency to do it by
       | hand. If possible on a print with brushes.
       | 
       | Edit: didn't think this would be popular. Maybe it's the
       | historical photography nerd in me, but colourising images without
       | effort and thought is like smashing vintage glass windows for the
       | fun of it: cultural vandalism.
        
         | crazygringo wrote:
         | If you're going to write code, have the decency to do it on
         | punch cards. If possible by hand punching, rather than using a
         | keypunch machine.
        
           | bemusedthrow75 wrote:
           | This isn't the point I am making.
           | 
           | The point I am making is that colourisation is subjective
           | art, and that alone.
           | 
           | Colourisation cannot fail to enforce contemporary biases
           | based on poor understanding of the materials. It will darken
           | or lighten skin inappropriately, and mislead in any number of
           | ways.
           | 
           | Doing it by hand (in photoshop or on a print) acknowledges
           | the inherent bias that is involved in colourisation.
           | 
           | Automating it is banal at best and dangerous at worst;
           | colourised images risk distorting history.
        
             | dragonwriter wrote:
             | > Doing it by hand (in photoshop or on a print)
             | acknowledges the inherent bias that is involved in
             | colourisation.
             | 
             | No, doing it by hand doesn't acknowledge that your
             | interpretation is a fallible interpretation shaped by bias,
             | just like translating a written work (e.g., the Bible, for
             | a noted example where this has been done often without any
             | such acknowledgement being conveyed) by human effort
             | doesn't do that.
             | 
             | Acknowledging bias in translation of either kind is _an
             | entirely separate action_ , orthogonal to the method of the
             | translation itself.
        
             | geon wrote:
             | How can it affect the lightness channel when it is locked?
        
               | bemusedthrow75 wrote:
               | The point is that the source black and white image is not
               | truthful about skin colour. The film locks in a level of
               | lightness but that lightness may be very wrong (depending
               | on the red and blue sensitivity of the film, the colour
               | of the light, the time of day, the print, whether a
               | filter was being sued).
               | 
               | So if you colourise an image of someone who appears to be
               | a light-skinned 1930s African-American with colours that
               | appear to conform to our contemporary understanding of
               | light-skinned Black people of our era, you might be
               | getting it right, of course.
               | 
               | But you might be getting it quite, quite wrong, in a way
               | that matters.
        
             | coldtea wrote:
             | > _Automating it is banal at best and dangerous at worst;
             | colourised images risk distorting history_
             | 
              | Well, faces still have a certain tint, the sky is mostly
              | blue, the grass green, water is blue, mud pools are brown,
              | the ground too, a lot of historical fabrics are certain
              | inherent colors, known flowers have known colors,
              | brownstones are red/brown. A lot of it is just not that
              | subjective.
             | 
             | Besides different color film stock (or camera sensor "color
             | science") can already result in dozens of widely different
             | colorings of the same exactly scene.
        
               | bemusedthrow75 wrote:
               | > Well, faces still have a certain tint
               | 
               | Do they? A _certain_ tint?
               | 
               | You _cannot_ accurately colourise skin from photographic
               | film without an _enormous_ amount of knowledge of the
               | taking and processing of the film, and of the lighting
               | and subject.
               | 
               | An AI can't do it any better than a painter. You can't
               | take a scan of a print or a negative and get skin tones
               | right.
               | 
               | Think about how weird the skin tones are from scans of
               | wet-plate photography plates compared to the same process
               | used in antiquity with the aim of producing a carbon
               | print.
        
               | coldtea wrote:
               | > _Do they? A certain tint?_
               | 
               | Yes. There's just not a single one across all faces - but
               | I wasn't meaning that.
               | 
                | What I mean is, we know the kinds of tints a face will
                | have. A face is not suddenly going to be blue or green or
                | poppy red. And by how light a black and white face
                | appears, we can tell quite well if it's a darker one
                | (olive to brown) or a lighter one (pinkish towards pale).
               | 
               | If we get it wrong within a range it's no big deal. Color
               | film stocks would also vary it widely.
               | 
                | Hell, even actual people who met the person we colourise
                | in real life will each remember (or even experience in
                | real time) their face's hue somewhat differently.
        
               | bemusedthrow75 wrote:
               | But how brown? How pink? How light? How dark?
               | 
               | This is an enormously important issue.
               | 
               | Black and white films of different technologies and
               | manufacturers and eras actually lighten or darken skin
               | tones. Really _very_ significantly.
               | 
               | And it's not going to be obvious from the final positive,
               | unless there's _extensive_ data with those images about
               | how the photography was done. And there never is.
               | 
               | Editing because I can no longer reply: the question of
               | whether a skin tone is a dark one or a light one has had
               | severe real life impacts on people whose lives are now
               | only represented in photographs. You can't write this off
               | as micromanagement; it's about the ethics of
               | representation.
        
               | coldtea wrote:
               | > _But how brown? How pink? How light? How dark? This is
               | an enormously important issue_
               | 
               | Is it?
               | 
                | If two colour film stocks took the same image of them,
                | each would show their hue a little (or a lot) differently.
               | 
               | Even if two different people actually met the same
               | person, they will probably describe their face as
               | slightly different tones from memory. (And let's not even
               | get into different types of color-blindness they could
               | have had).
               | 
               | Hell, a person's hue will even look different to the same
               | person looking at them, in real time, depending on the
               | changes in lighting and the shade at the scene as they
                | talk (e.g. sun behind clouds vs direct sun vs shade vs
               | bulbs).
               | 
               | It's not really "enormously important" to micromanage the
               | (non-existent) exact right brown or right pink.
        
             | PartiallyTyped wrote:
             | > Automating it is banal at best and dangerous at worst;
             | colourised images risk distorting history.
             | 
              | There's a lot of irony in acknowledging this but not
              | acknowledging that each and every one of us has their own
              | biases inherent to our perception and experiences.
             | 
              | Like the blue/black vs white/gold dress; we all perceive
              | things differently even on identical images, monitors,
              | screens, etc.
        
             | crazygringo wrote:
             | > _Colourisation cannot fail to enforce contemporary biases
             | based on poor understanding of the materials. It will
             | darken or lighten skin inappropriately, and mislead in any
             | number of ways._
             | 
             | If anything, an AI trained on a large and diverse dataset
             | is probably going to wind up being much _more_ accurate
             | with regards to skin color than a human colorist would be
             | in most cases.
             | 
             | The problem here isn't whether colorization is done by man
             | or machine; it's just ensuring that colorized photos are
             | identified as such. Which they usually are -- that's not a
             | new problem to be solved.
        
               | bemusedthrow75 wrote:
               | No it's not, not really.
               | 
               | A diverse data set of black and white images doesn't have
               | any kind of knowledge of the colour sensitivity of the
               | medium in that moment.
               | 
               | What film was it? How was it processed? Is it a scan of a
               | negative or a print? What was the colour of the lighting?
               | Was a particular colour tint filter used on the lens? Was
               | the subject wearing makeup optimised for black and white
               | photography?
               | 
               | The black and white image, standing alone, cannot tell
               | you this, I think. Sure, it might get a bit better at,
               | say, identifying a 1950s TV show. But what is the
               | "correct" accurate colour representation of that scene,
               | when televisual makeup was wildly unnatural in colour?
        
               | crazygringo wrote:
               | But do people have any of that knowledge either? Most of
               | the time, I don't think so -- they colorize stuff in a
               | way that just "looks right" or "looks natural" or "looks
               | nice" to their eye, that's all.
               | 
               | And the dataset an AI is going to train on should be
               | using original color photos that are then converted to
               | B&W across a wide variety of color curves. So it should
               | be fairly robust to all sorts of film types. So again, I
               | repeat that it's probably going to wind up being _more_
               | accurate with regard to skin tone than a human (with
               | their aesthetic biases) usually would.
        
               | bemusedthrow75 wrote:
               | > But do people have any of that knowledge either? Most
               | of the time, I don't think so -- they colorize stuff in a
               | way that just "looks right" or "looks natural" or "looks
               | nice" to their eye, that's all.
               | 
               | No, indeed. Which is why doing it by hand is more
               | respectful of the notion that it is subjective.
               | 
               | Automatic colourisation is and will be viewed
               | differently, as more "scientific", when it's still
               | absolutely beholden to the same biases and maybe
               | misconceptions that we can't unpick because they come
               | from poor training data.
               | 
                | Finally: "original colour photos" are also a problem. Not
                | only for the part of history where they don't exist, but
                | also for the part (until the early 1960s) when the colour
                | rendition of those photos was false or incomplete. You can
                | get a little closer to understanding
               | what that colour looked like, but it's important to
               | understand that colour emulsions vary in the way they
               | work: it's not black and white film with extra colour
               | sensitivity.
               | 
               | So at best you will be colourising the black and white
               | film to look like the colour film, which is not reality.
               | And there are well-understood problems with correct
               | representation of skin tones with colour film until the
               | mid-eighties.
               | 
               | I can see your point; I just think there's a bigger
               | picture here (pun not intended) that you're not seeing.
        
               | crazygringo wrote:
               | > _Automatic colourisation is and will be viewed
               | differently, as more "scientific"_
               | 
               | Then the solution is to correct that misperception, not
               | deny ourselves a useful tool.
               | 
                | > _I can see your point; I just think there's a bigger
               | picture here (pun not intended) that you're not seeing._
               | 
               | My overarching point is that this is a tool like any
               | other. And the idea that "doing it by hand is more
               | respectful of the notion that it is subjective" I will
               | push back on 100%.
               | 
               | There is nothing disrespectful about colorizing a photo,
               | automatically or by hand. But it should always be clearly
               | communicated that it is subjective not objective, whether
               | human or machine.
               | 
               | Again, if someone believes the colorization is somehow
               | "real" or "scientific" because a computer did it, then
               | correct their misbelief. Don't stop using the tool.
               | That's the bigger picture here.
        
             | erwannmillon wrote:
              | Fair enough. Honestly, this was just a fun side project. I
              | actually coded this up last October when I was doing a deep
              | dive to learn about diffusion models, and saw that no one
              | had ever applied them to colorization. This was just a fun
              | opportunity to build a project that no one had done before.
         | pkoiralap wrote:
         | Making music without actually knowing anything about it is the
         | banal apotheosis application of Generative AI. - Music nerd in
         | me
         | 
         | Creating art without actually knowing anything about it is the
         | banal apotheosis application of Diffusion AI. - Artist in me
         | 
         | Using ChatGPT to write essays that are better than anyone could
         | have ever written is the banal apotheosis application of LLMs -
         | Teacher in me
         | 
          | It is already here. Better to use, appreciate, and try to
          | understand how it works than to complain about it doing a
          | better job. In this instance, for example, the model can be
          | made to generate multiple outputs or, even better, generate
          | output based on precise user input.
        
           | bemusedthrow75 wrote:
           | I'm actually concerned it is doing a _worse_ job, in
            | important ethical ways, than a hand colourist. But I've
           | explained elsewhere.
           | 
           | Colourisation cannot be done accurately from a black and
           | white image without context that is almost always lacking.
           | Hand colouring is _less_ dishonest.
        
         | dragonwriter wrote:
         | > But since you do not have access to colour originals of
         | historical photos in almost every instance, you cannot possibly
         | train the network to have any instinct for the colour
         | sensitivity of the medium, can you?
         | 
          | Plenty of people say that about colorization, period. While I
          | disagree with them, that seems more sensible to me than your
          | position, which just seems to be fetishizing suffering.
        
         | coldtea wrote:
         | When did colorizing images become an "art"?
         | 
         | What if the "effort" way is less accurate?
        
           | vorpalhex wrote:
           | There is a community of people who carefully recolor
           | historical photos by hand. It's really beautiful time
           | consuming work and often they invest heavily to get the
           | colors to be correct.
        
           | bemusedthrow75 wrote:
           | The effort is obviously going to be less accurate.
           | 
           | But it reflects the fact that an accurate colourisation of a
           | black and white image without access to every possible detail
           | about the scene and processing from the photographer's
           | perspective is impossible.
           | 
           | Black and white film is substantially more complex and varied
           | than people understand. Its sensitivities are complex and
           | vary from processing run to processing run, and people at the
           | time knew of the weaknesses of black and white and often used
           | false colour to get an acceptable rendition.
           | 
           | Colourisation is a form of expression, not a form of
           | recovery.
        
             | coldtea wrote:
             | > _But it reflects the fact that an accurate colourisation
             | of a black and white image without access to every possible
              | detail about the scene and processing from the
              | photographer's perspective is impossible._
             | 
             | Accurate colourisation is impossible even in a color
             | photograph. There is no "canonical" film stock that
             | accurately represents all actual real-life colors.
             | 
             | The expectation from colourisation is not an accurate
             | representation of the original colors, but a good
              | application of color based on our knowledge (whether from
              | historical facts a human colorist knows or from an NN's
              | training on similar objects and materials) that matches a
              | realistic representation of the scene.
             | 
             | If a human colourist draws a dress and doesn't know the
             | color of it, nor have they any historical information about
             | what the person depicted wore that day, they're going to
             | take a guess. That's kind of what the NN will do as well.
        
         | vorpalhex wrote:
         | I think a lot of it depends on what you are doing and why.
         | 
         | Yes, recolors can be inaccurate but they can make historical
         | moments feel more alive and connected. At the same time one can
         | imagine the issues of a recolor that is inaccurate and that is
         | troubling with historical photographs.
         | 
         | At the same time I have a bunch of old family photos I'd love
         | to recolorize. Maybe the colors won't be quite right but that's
         | an OK failure mode for family photos!
         | 
         | I'd love to see a version where you can drop just a spot or two
         | of the correct color and let the AI fill it out. My grandmother
         | had stark red hair but most algorithms will color her as a
         | blond. It'd be nice to fix that, using one of the color photos
         | we do have.
        
           | erwannmillon wrote:
            | You can do this with a spatial palette T2I-Adapter or
            | ControlNet. Give a super low-res spatial palette as
            | conditioning, like this: https://camo.githubusercontent.com/8e488996fd309165fb065b0cd...
           | 
           | https://github.com/TencentARC/T2I-Adapter
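To make the "super low-res spatial palette" idea concrete, here is a minimal pure-Python sketch of how such a conditioning image can be built: average the photo into a coarse grid of flat colour cells. The function name and grid size are illustrative assumptions, not part of T2I-Adapter's API; a real pipeline would consume the result as an image tensor.

```python
# Build a "spatial palette": average an image into a coarse grid of
# flat colour blocks, the kind of low-res colour hint fed to a
# T2I-Adapter or ControlNet as conditioning. (Illustrative sketch.)

def spatial_palette(pixels, cells=8):
    """Average `pixels` (rows of (r, g, b) tuples) into a cells x cells
    grid of flat colour blocks; image dims must be divisible by `cells`."""
    h, w = len(pixels), len(pixels[0])
    bh, bw = h // cells, w // cells
    out = [[None] * w for _ in range(h)]
    for cy in range(cells):
        for cx in range(cells):
            # Collect every pixel in this grid cell and average it.
            block = [pixels[y][x]
                     for y in range(cy * bh, (cy + 1) * bh)
                     for x in range(cx * bw, (cx + 1) * bw)]
            avg = tuple(sum(p[i] for p in block) // len(block)
                        for i in range(3))
            # Fill the whole cell with the averaged colour.
            for y in range(cy * bh, (cy + 1) * bh):
                for x in range(cx * bw, (cx + 1) * bw):
                    out[y][x] = avg
    return out

# A 16x16 test image: left half red, right half blue.
img = [[(255, 0, 0)] * 8 + [(0, 0, 255)] * 8 for _ in range(16)]
palette = spatial_palette(img, cells=2)
print(palette[0][0], palette[0][15])  # -> (255, 0, 0) (0, 0, 255)
```

A user-supplied colour spot (like the red hair mentioned above) would simply be painted into the relevant cells of this palette before conditioning.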
        
         | geon wrote:
          | How was anything destroyed? The original grayscale is still
          | there.
        
           | bemusedthrow75 wrote:
           | Colourised images absolutely replace mono images in image
           | searches, unfortunately; I've seen this again and again. It
           | gets more difficult to find originals.
           | 
           | But also you have to consider that bias is being introduced
           | in the colour rendition. That causes damage.
           | 
           | For example, you could see a photograph of an African
           | American woman in the 20s or 30s, and your AI would say, this
           | is an African American woman and colour her skin in some way.
           | 
           | But a lighter-skinned-looking African American woman in a
           | pre/early-post-war photo is a challenge. She may have had
           | darker skin -- been unable to "pass" -- and the film simply
           | didn't get that across because of its colour sensitivity.
           | 
           | Or she may actually have been light-skinned and able to
           | "pass" (or wearing makeup that helped).
           | 
           | Automatically colouring that image introduces risks to the
           | reading of history; you can read that woman's entire life
           | completely wrong.
           | 
           | It's also common with photos of men from that era who worked
           | outdoors. Many of them will come across much darker-skinned
           | in photos than they actually would have appeared in real
           | life, because not-readily-visible sun damage can look odd in
           | mono. But if you colourise all those sun-baked people the
           | same way, what happens to those of mixed heritage among them?
           | (A thing that is already rather "airbrushed out" of history.)
           | 
           | Without knowing about the lighting, the material, the
           | processing and the source of the positive (is it a negative
           | scan? was it a good one? or is it a scan of a print?) you
           | cannot make accurate impressions of skin tone.
           | 
           | And given the power and importance of photography in the
           | history of the USA in particular -- photography coincides
           | with and actually helps define the modern unified US self-
           | image -- this is not something to blaze through without care.
           | 
           | This is a far less tricky problem in more homogeneous
           | societies, obviously. But even then, there is this perception
           | from photographs that British women in the 1920s were all
            | deathly pale; colourisation preserves an illusion that
           | actually comes in part from photographic style.
        
         | mrkeen wrote:
         | Nice, I'll have to try smashing vintage glass windows. Thanks
         | for the tip!
        
         | erwannmillon wrote:
          | Touché. Nevertheless, colors go brrrrrrrr
        
           | bemusedthrow75 wrote:
           | Don't get me wrong. It's impressive technology. I'm amazed at
           | what it can do.
           | 
           | Also horrified.
        
       ___________________________________________________________________
       (page generated 2023-08-03 23:00 UTC)