[HN Gopher] Generate images fast with SD 1.5 while typing on Gradio
___________________________________________________________________

Generate images fast with SD 1.5 while typing on Gradio

Author : smusamashah
Score  : 76 points
Date   : 2023-11-12 15:54 UTC (7 hours ago)

(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)

| Der_Einzige wrote:
| The fact that LCM LoRAs turn regular SD models into pseudo-LCM
| models is insane.
|
| Most people in the AI world don't understand that ML is like
| actual alchemy. You can merge models like they are chemicals. A
| friend of mine called it "a new chemistry of ideas" upon seeing
| many features in Automatic1111 (including model and token
| merges) used simultaneously to generate unique images.
|
| Also, LoRAs exist on a spectrum based on their dimensionality.
| Tiny LoRAs should only be capable of relatively tiny changes. My
| guess is that this is a big LoRA, nearly the same size as the
| base checkpoint.

  | keonix wrote:
  | Wait until you hear about frankenmodels. You rip parts out of
  | one model (often attention heads), transplant them into
  | another, and somehow that produces coherent results!
  | Witchcraft
  |
  | https://huggingface.co/chargoddard

    | GaggiX wrote:
    | > somehow that produces coherent results
    |
    | With or without finetuning? Also, is there a practical
    | motivation for creating them?

      | keonix wrote:
      | > with or without finetuning?
      |
      | With, but it's still bonkers that it works so well.
      |
      | > Also is there a practical motivation for creating them?
      |
      | You could get in-between model sizes (like 20b instead of
      | 13b or 34b). Before better quantization it was useful for
      | inference (if you were unlucky with VRAM size), but now I
      | see it being useful only for training, because you can't
      | train on quants.

  | GaggiX wrote:
  | lcm-lora-sdv1-5 is 67.5M and lcm-lora-sdxl is 197M, so they
  | are much smaller than the entire model. Would be cool to check
  | the rank used with these LoRAs, though.

    | liuliu wrote:
    | 64.

  | temp72840 wrote:
  | This is nuts. I did a double take at this comment - I thought
  | you _must_ have been talking about LoRAing an LCM distilled
  | from Stable Diffusion.
  |
  | LCMs are spooky black magic; I have no intuitions about them.

    | ttul wrote:
    | When I was taking Jeremy Howard's course last fall, the
    | breakthrough in SD was going from 1000 steps to 50 steps via
    | classifier-free guidance, which is this neat hack where you
    | run inference with your conditioning and without, and then
    | mix the results. To this day I still don't get it. But it
    | works.
    |
    | Now we find this way to skip to the end by building a model
    | that learns the high-dimensional curvature of the path that
    | a diffusion process takes through space on its way to an
    | acceptable image, and we just basically move the model along
    | that path. That's my naive understanding of LCM. Seems too
    | good to be true, but it does work, and it has a good
    | theoretical basis too. Makes you wonder what is next? Will
    | there be a single-step network that can train on LCM to
    | predict the final destination? Lol, that would be pushing
    | things too far..

      | hadlock wrote:
      | Sounds like we've invented the kind of psychic time travel
      | they use in Minority Report. Let me show you right over to
      | the Future Crimes division. We're arresting this guy
      | making cat memes today because the curve of his online
      | history traces that of a radicalized blah blah blah

  | ttul wrote:
  | To me, the crazy thing about LoRAs is that they work perfectly
  | well adapting model checkpoints that were themselves derived
  | from the base model on which the LoRA was trained. So you can
  | take the LCM LoRA for SD 1.5 and it works perfectly well on,
  | say, RealisticVision 5.1, a fine-tuned derivative of SD 1.5.
  |
  | You'd think that the fine-tuning would make the LCM LoRA not
  | work, but it does. Apparently the changes in weights
  | introduced through even pretty heavy fine-tuning do not wreck
  | the transformations the LoRA needs to make in order for LCM or
  | other LoRA adaptations to work.
  |
  | To me this is alchemy.

    | yorwba wrote:
    | Finetuning and LoRAs both involve additive modifications to
    | the model weights. Addition is commutative, so the order in
    | which you apply them doesn't matter for the resulting
    | weights. Moreover, neural networks are designed to be
    | differentiable, i.e. they behave approximately linearly with
    | respect to small additive modifications of the weights. So
    | as long as your finetuning and LoRA each change the weights
    | only a little bit, you can finetune with or without the
    | LoRA, or equivalently train the LoRA on the finetuned model
    | or on its base, and get mostly the same result.
    |
    | So this is something that can be somewhat explained using
    | not terribly handwavy mathematics. Picking hyperparameters,
    | on the other hand...

  | smusamashah wrote:
  | Ok. I have seen the term LCM LoRA a number of times, and I
  | have used both Stable Diffusion and LoRAs for fun for quite a
  | while, but I always thought this LCM LoRA was a new kind of
  | thing. It's simply not possible with the current samplers to
  | return an image in just 4 steps. Are you saying that merely by
  | adding a LoRA we can get existing models and samplers to
  | generate a good-enough image in 4 steps?

    | jyap wrote:
    | Yes, check out this blog post:
    | https://huggingface.co/blog/lcm_lora
    |
    | I've used it with my home GPU. Really fast, which makes it
    | more interactive and real-time.

    | catwell wrote:
    | It's a different sampler too.
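For anyone who wants to try this, the recipe in the Hugging Face
post linked above boils down to a few lines of diffusers code. The
sketch below is a minimal adaptation of that post, not the demo
author's exact code; the model IDs are the public SD 1.5 checkpoint
and the LCM LoRA discussed in this thread, and the prompt is just a
placeholder.

    # 4-step SD 1.5 generation with the LCM LoRA, following the
    # Hugging Face blog post linked above (diffusers >= 0.22).
    import torch
    from diffusers import DiffusionPipeline, LCMScheduler

    # Any SD 1.5 checkpoint (or a fine-tuned derivative) should work.
    pipe = DiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Swap in the LCM scheduler -- the "different sampler" above.
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

    # Apply the LCM LoRA on top of the base weights.
    pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

    # 4 steps and low guidance instead of the usual ~25-50 steps.
    image = pipe(
        "a cat wearing a spacesuit, studio photo",
        num_inference_steps=4,
        guidance_scale=1.0,
    ).images[0]
    image.save("cat.png")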
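One caveat, per the same post: the scheduler swap matters (this is
the "different sampler" catwell mentions), and guidance_scale
should stay low, roughly 1.0 to 2.0, since the consistency
distillation already bakes guidance into the model; in diffusers, a
value of 1.0 effectively disables classifier-free guidance.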
| jimmySixDOF wrote:
| And here is a demo mashing up Leap Motion free-space hand
| tracking and a projector to manipulate a "BigGAN's high-
| dimensional space of pseudo-real images", making it more like
| modern dance meets sculpting meets spatial computing, with a hat
| tip to the 2008 work of Johnny Chung Lee while at Carnegie
| Mellon.
|
| https://x.com/graycrawford/status/1100935327374626818

| smlacy wrote:
| https://nitter.net/abidlabs/status/1723074108739706959

| r-k-jo wrote:
| Here is a collection of demos with fast LCM on HuggingFace:
|
| https://huggingface.co/collections/latent-consistency/latent...
___________________________________________________________________
(page generated 2023-11-12 23:00 UTC)