hngopher.com

       [HN Gopher] Diffusion with Offset Noise: Finetuning SD to genera...
       ___________________________________________________________________
        
       Diffusion with Offset Noise: Finetuning SD to generate very dark or
       light images
        
       Author : siraben
       Score  : 89 points
       Date   : 2023-02-27 17:55 UTC (5 hours ago)
        
 (HTM) web link (www.crosslabs.org)
 (TXT) w3m dump (www.crosslabs.org)
        
       | sschueller wrote:
       | The whole thing with SD is extremely interesting and difficult to
       | keep up with as there are so many new things coming in almost
       | daily.
       | 
       | Just last week I saw ControlNet[1], which adds a lot more control.
       | 
       | Today I saw what Corridor Crew[2] did to stabilize the randomness
       | when you want to make videos. Very exciting.
       | 
       | [1] https://github.com/lllyasviel/ControlNet
       | 
       | [2] https://www.youtube.com/watch?v=_9LX9HSQkWo
        
         | p1esk wrote:
         | In the Corridor Crew video you posted, the most interesting
         | part (at least for ML practitioners) is the beginning.
        
         | cwkoss wrote:
         | The A1111 webUI and extension ecosystem is a beautiful example
         | of what FOSS can be.
         | 
         | - User can install with a few clicks, no thinking about command
         | line
         | 
         | - webUI has hover tooltips on everything, so users can mostly
         | figure out what's going on without ever needing to touch
         | documentation
         | 
         | - A1111 has a tab which can load a list of extensions from a
         | github page
         | 
         | - Click to install, then refresh UI and extension just works
         | 
         | - Users are getting incredibly powerful new extensions every
         | week or two - Deforum lets you sequentially generate as many
         | frames as you want and stitch them into a video, ControlNet
         | lets you copy-paste features from a source image to your
         | target image(s). ControlNet was added to A1111 ~2 weeks ago
         | and is already integrated into the Deforum tab so you can use
         | both together.
         | 
         | Truly beautiful. I'd love to see more FOSS projects that felt
         | so user friendly, generous with features, and rapid. Really fun
         | to play with new cutting edge tech every couple weeks.
        
       | tangjurine wrote:
       | An easy extension would be to have random patches of the image be
       | offset by a random color, of varying sizes.
       | 
       | Pretty cool!
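
A minimal sketch of the patch idea suggested above, assuming the offset-noise
recipe is "per-pixel Gaussian noise plus one random value per channel". The
function name, the patch count, and both scale values are hypothetical and only
for illustration. It uses plain Python lists; real training code would operate
on tensors, and in Stable Diffusion the noise is added to latents, so "color"
here loosely means a per-channel shift.

```python
import random


def offset_noise_with_patches(channels, height, width,
                              offset_scale=0.1, patch_scale=0.1,
                              num_patches=4, seed=None):
    """Return noise[c][y][x]: white noise + per-channel offset + patch offsets."""
    rng = random.Random(seed)

    # Standard per-pixel Gaussian noise.
    noise = [[[rng.gauss(0.0, 1.0) for _ in range(width)]
              for _ in range(height)]
             for _ in range(channels)]

    # Offset noise: one random value per channel, added to every pixel.
    for c in range(channels):
        shift = offset_scale * rng.gauss(0.0, 1.0)
        for row in noise[c]:
            for x in range(width):
                row[x] += shift

    # Suggested extension (assumption): rectangles of random size and
    # position, each shifted by its own random per-channel value.
    for _ in range(num_patches):
        patch_h = rng.randint(1, height)
        patch_w = rng.randint(1, width)
        top = rng.randint(0, height - patch_h)
        left = rng.randint(0, width - patch_w)
        for c in range(channels):
            shift = patch_scale * rng.gauss(0.0, 1.0)
            for y in range(top, top + patch_h):
                for x in range(left, left + patch_w):
                    noise[c][y][x] += shift

    return noise


# Example: noise for a 4-channel, 8x8 latent-sized grid.
sample = offset_noise_with_patches(4, 8, 8, seed=0)
```

Setting `num_patches=0` reduces this to the plain per-channel offset, which
makes it easy to compare the two variants under the same seed.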
        
       | Lerc wrote:
       | Is the noise function used for just the starting data provided to
       | the first iteration of denoising, or does it get called
       | repeatedly throughout the iterations?
        
         | jackmott wrote:
         | [dead]
        
       | ec109685 wrote:
       | Diffusion is so interesting. Unlike LLMs, which have some
       | parallels to how the human mind works, it's not as obvious that
       | reverse engineering from noise to a prompt has any similar
       | parallels.
       | 
       | Will this cause us to hit walls at some point or actually exceed
       | what a human can create?
        
         | albertzeyer wrote:
         | I would actually say the opposite.
         | 
         | We know that the biological brain does a lot of iterative
         | refinement via recurrent processing (attractor dynamics), which
         | is very similar to how diffusion works.
         | 
         | However, while prediction is also a core functionality of the
         | brain, it's not really that you always auto-regressively
         | generate word-by-word.
        
       | braingenious wrote:
       | There is already a publicly-available model that uses this!
       | 
       | https://civitai.com/models/11193/illuminati-diffusion-v11
        
         | SV_BubbleTime wrote:
         | As a ckpt that's cool, but I'd like to see it as a LORA so you
         | can use any checkpoint you already have. That would (let's face
         | it... will next week) be amazing.
        
           | jupiterelastica wrote:
           | What is LORA in this context? Google only brought me to wifi
           | networks :D
        
             | braingenious wrote:
             | https://huggingface.co/blog/lora
             | 
             | I'm really new to all this and I'm learning new stuff about
             | it every day!
             | 
             | I only understand a fraction of what this article says
             | though :(
        
             | SV_BubbleTime wrote:
             | LOw-Rank Adaptation
             | 
             | It's a way to cut just the
             | components/styles/themes/patterns out from a model and
             | apply them into other models.
             | 
             | So if I have a Disney characters checkpoint, but I really
             | like this MakeGiantEyes checkpoint, if I can get it down to
             | a MakeGiantEyes LoRA, I can apply that on top of my Disney
             | Characters model which is already a custom trained set. It
             | definitely does not always work, but when it does it's like
             | magic. At a practical level, it's a model-modifier.
             | 
             | For example... Here is a Peter Griffin LoRA.
             | https://civitai.com/models/13606/peter-griffin-lora or
             | https://civitai.com/models/13763/thomas-the-train-i-lora
             | 
             | ... It took me a minute to get those because I had to sort
             | through a LOT of things that would probably get me banned
             | here. If anyone wanted to know which nationalities were
             | using SD more... It's Asians, hands down, all day long, and
             | I think that's interesting.
             | 
             | EDIT: And if you were wondering what a Textual Inversion is
             | vs a LoRA... Don't ask me! They're both model modifiers,
             | but as I understand it, textual inversions are good for
             | faces (which is why most of those are people, and they are
             | kilobytes in size), and LoRAs aren't as good for faces
             | specifically but better for themes.
             | 
             | TI examples (I couldn't use any of the million women...
             | there are almost none that would be appropriate to post.
             | Even though civitai does a good job of removing the nsfw
             | posts of real people, even with clothes on some are still
             | just too much... Thirst is driving AI now)
             | https://civitai.com/models/11039/ian-mckellen or
             | https://civitai.com/models/8060/seu-madruga
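
A rough sketch of the low-rank idea behind the LoRA discussion above, assuming
the usual formulation in which a weight matrix W is adjusted by adding the
product of two thin matrices, W + scale * (B @ A). The helper names and the toy
4x4 numbers are hypothetical. It is only meant to show why a LoRA file is small
and why it can be layered on top of a checkpoint that shares the same layer
shapes, not how any particular tool implements it.

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def apply_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a) without modifying weight."""
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy base weight (4x4 identity) and a rank-1 update.
base = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
lora_b = [[1.0], [0.0], [0.0], [2.0]]   # shape 4 x 1
lora_a = [[0.5, 0.0, 0.0, 0.0]]         # shape 1 x 4

merged = apply_lora(base, lora_b, lora_a, scale=1.0)

# The update stores r * (d + k) numbers instead of d * k.
full_params = 4 * 4
lora_params = 4 * 1 + 1 * 4
```

Because `base` is left untouched, the same `lora_b` and `lora_a` pair can be
applied to a different base matrix of the same shape, and `scale` acts as the
strength knob.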
        
           | BudaDude wrote:
           | Turns out it came faster than that.
           | 
           | https://civitai.com/models/8765/theovercomer8s-contrast-fix-...
        
             | SV_BubbleTime wrote:
             | Yeah, what was I thinking! Next week is ControlNet2 followed
             | by SD3 (but not screwed up this time).
        
       | Jack000 wrote:
       | I'm curious if this generalizes to mid frequencies (i.e., add
       | some blurred noise in addition to the offset) and what effect
       | that might have on the generations.
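
One way the mid-frequency variant asked about above could be sketched, assuming
"blurred noise" means a second Gaussian noise field passed through a simple box
blur and added to the usual white noise plus a global offset. The function
names, the blur radius, and the scale values are hypothetical, and a single
channel is shown for brevity. Whether this helps generation is exactly the open
question in the comment; the code only shows how such noise could be
constructed.

```python
import random


def box_blur(img, radius):
    """Average each pixel over a (2*radius+1)^2 window, cropped at the border."""
    height, width = len(img), len(img[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def multi_scale_noise(height, width, offset_scale=0.1, mid_scale=0.1,
                      radius=2, seed=None):
    """White noise + blurred (mid-frequency) noise + one global offset."""
    rng = random.Random(seed)
    white = [[rng.gauss(0.0, 1.0) for _ in range(width)]
             for _ in range(height)]
    extra = [[rng.gauss(0.0, 1.0) for _ in range(width)]
             for _ in range(height)]
    mid = box_blur(extra, radius)             # mid-frequency component
    shift = offset_scale * rng.gauss(0.0, 1.0)  # zero-frequency component
    return [[white[y][x] + mid_scale * mid[y][x] + shift
             for x in range(width)]
            for y in range(height)]


# Example: one 16x16 channel of combined noise.
noise = multi_scale_noise(16, 16, seed=0)
```

Blurring averages independent samples and so lowers the variance of the blurred
field, which means `mid_scale` is not directly comparable to `offset_scale` and
would need tuning on its own.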
        
       | 2bitencryption wrote:
       | This is really interesting. Another thing I noticed in my fun
       | with SD is that it is _extremely_ stubborn about colors during
       | the denoising process.
       | 
       | That is, whatever color a region of the image has during
       | denoising step 3, it will almost surely have that color at step
       | 50, even if it makes no logical sense for the thing in that
       | location to have that color.
       | 
       | This may not seem bad, but it's annoying when doing anything
       | image-to-image, because regardless of the prompt you give it, the
       | colors are "sticky".
       | 
       | If you have an image of an apple, and you use image-to-image with
       | the prompt "an image of an orange", you will get a very reddish
       | orange (in my experience at least).
        
         | sp332 wrote:
         | Does it work to make a greyscale image and let the denoiser
         | find a color by descent?
        
           | danielvf wrote:
           | From experience in related things - you are going to get a
           | fairly grey orange, and a fairly grey image overall.
        
       ___________________________________________________________________
       (page generated 2023-02-27 23:00 UTC)