[HN Gopher] High-performance image generation using Stable Diffu...
       ___________________________________________________________________
        
       High-performance image generation using Stable Diffusion in KerasCV
        
       Author : tosh
       Score  : 317 points
       Date   : 2022-09-28 08:28 UTC (14 hours ago)
        
 (HTM) web link (keras.io)
 (TXT) w3m dump (keras.io)
        
       | ShamelessC wrote:
        | Nice! I'll take anything over the huggingface version - the API
        | design by huggingface, where CLIP is in transformers and
        | everything else is in diffusers, is not a great developer
        | experience (unless you're the type of person that likes their
        | Python to look like half-baked J2EE).
        
       | capableweb wrote:
        | Tried to get this running on my 2080ti (11GB VRAM) but I'm
        | hitting OOM issues. So while performance seems better, I'm
        | unable to actually verify it myself as it doesn't run. Some of
        | the PyTorch forks work on as little as 6GB of VRAM (or maybe
        | even 4GB?), but it's always good to have implementations that
        | optimize for different factors; this one seems to trade memory
        | usage for raw generation speed.
       | 
       | Edit: there seems to be a more "full" version of the same work
       | available here, made by one of the authors of the submission
       | article: https://github.com/divamgupta/stable-diffusion-
       | tensorflow
        
         | WithinReason wrote:
         | Just breaking the attention matrix multiply into parts allows a
         | significant reduction of memory consumption at minimal cost.
         | There are variants out there that do that and more.
         | 
         | Short version: Attention works as a matrix multiply that looks
         | like this: s(QK)V where QK is a large matrix but Q,K,V and the
          | result are all small. You can break Q into horizontal strips
          | Q1..QN. Then the result is the vertical concatenation of:
          | s(Q1*K)*V         s(Q2*K)*V         s(Q3*K)*V         ...
          | s(QN*K)*V
         | 
         | Since you're reusing the memory for the computation of each
         | block you can get away with much less simultaneous RAM use.
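          | 
          | A minimal NumPy sketch of that slicing (block size and shapes
          | are illustrative, not taken from the KerasCV code):
          | 
          |   import numpy as np
          | 
          |   def softmax(x):
          |       x = x - x.max(axis=-1, keepdims=True)  # stabilize
          |       e = np.exp(x)
          |       return e / e.sum(axis=-1, keepdims=True)
          | 
          |   def sliced_attention(Q, K, V, block=128):
          |       # softmax(Q K^T) V computed one strip of Q rows at a
          |       # time, so the full n x n score matrix never exists
          |       # in memory at once.
          |       out = []
          |       for i in range(0, Q.shape[0], block):
          |           Qi = Q[i:i + block]            # strip of Q rows
          |           s = Qi @ K.T / np.sqrt(Q.shape[-1])
          |           out.append(softmax(s) @ V)
          |       return np.concatenate(out, axis=0) # vertical concat
          | 
          |   Q, K, V = (np.random.randn(1024, 64) for _ in range(3))
          |   ref = softmax(Q @ K.T / np.sqrt(64)) @ V
          |   assert np.allclose(sliced_attention(Q, K, V), ref)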
        
           | liuliu wrote:
            | PyTorch doesn't offer an in-place softmax, which adds about
            | 1GiB of extra memory for inference (of stable diffusion).
            | Although none of these are significant improvements compared
            | to just switching to FlashAttention inside the UNet model.
        
           | GistNoesis wrote:
           | Yeah, the problem is indeed in the attention computation.
           | 
           | You can do something like that but it's far from optimal.
           | 
           | From memory consumption perspective, the right way to do it,
           | is to never materialize the intermediate matrices.
           | 
            | You can do it by using a custom op that computes att =
            | scaledAttention(Q,K,V) and the gradient dQ,dK,dV =
            | scaledAttentionBackward(Q,K,V,att,datt).
           | 
           | The memory needed for these ops is the memory to store
           | Q,K,V,attn,dQ,dK,dV,dattn + extra temporary memory.
           | 
            | When you do the work to minimize memory consumption, this
            | extra temporary memory is really small: 6 *
            | attention_horizon^2 * number_of_cores_running_in_parallel
            | numbers.
            | 
            | But even though there is not much recomputation, this kernel
           | won't run as fast due to the pattern of memory access, unless
           | you spend some time manually optimizing it.
           | 
            | The place to do it is at the level of the autodiff
            | framework, aka TensorFlow or PyTorch, with low-level
            | C++/CUDA code.
           | 
            | Anybody can write a custom kernel, but deploying,
            | maintaining and distributing them is a nightmare. So the
            | only people who could and should have done it are the
            | TensorFlow or PyTorch guys.
           | 
           | In fact they probably have, but it's considered a strategic
           | advantage and reserved for internal use only.
           | 
            | Mere mortals like us have to use workarounds (splitting
            | matrices, KeOps, gradient checkpointing...) to avoid being
            | penalized too much by the limited ops of out-of-the-box
            | autodiff frameworks like TensorFlow or PyTorch.
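            | 
            | As a concrete example of the gradient-checkpointing
            | workaround, here is a hedged TensorFlow sketch (shapes are
            | illustrative): the softmax(QK) intermediate is recomputed
            | during backprop instead of being kept alive.
            | 
            |   import tensorflow as tf
            | 
            |   def scaled_attention(q, k, v):
            |       d = tf.cast(tf.shape(q)[-1], q.dtype)
            |       s = tf.matmul(q, k, transpose_b=True) / tf.sqrt(d)
            |       return tf.matmul(tf.nn.softmax(s, axis=-1), v)
            | 
            |   # Recompute the block in the backward pass instead of
            |   # storing its intermediates: compute traded for memory.
            |   ckpt_attention = tf.recompute_grad(scaled_attention)
            | 
            |   q = tf.random.normal([8, 1024, 64])
            |   k = tf.random.normal([8, 1024, 64])
            |   v = tf.random.normal([8, 1024, 64])
            |   with tf.GradientTape() as tape:
            |       tape.watch([q, k, v])
            |       loss = tf.reduce_sum(ckpt_attention(q, k, v))
            |   grads = tape.gradient(loss, [q, k, v])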
        
         | Karuma wrote:
          | There are forks that even work on 1.8GB of VRAM! They work
          | great on my GTX 1050 2GB.
         | 
         | This is by far the most popular and active right now:
         | https://github.com/AUTOMATIC1111/stable-diffusion-webui
        
           | jtap wrote:
            | Just as another point of reference: I followed the Windows
            | install and I'm running this on my 1060 with 6GB memory.
            | With no setting changes it takes about 10 seconds to
            | generate an image. I often run with sampling steps up to 50
            | and that takes about 40 seconds to generate an image.
        
           | rmurri wrote:
           | What settings and repo are you using for GTX 1050 with 2GB?
        
             | Karuma wrote:
             | I'm using the one I linked in my original post:
             | https://github.com/AUTOMATIC1111/stable-diffusion-webui
             | 
             | The only command line argument I'm using is --lowvram, and
             | usually generate pictures at the default settings at
             | 512x512 image size.
             | 
             | You can see all the command line arguments and what they do
             | here: https://github.com/AUTOMATIC1111/stable-diffusion-
             | webui/wiki...
        
           | [deleted]
        
           | extesy wrote:
           | > This is by far the most popular and active right now:
           | https://github.com/AUTOMATIC1111/stable-diffusion-webui
           | 
           | While technically the most popular, I wouldn't call it "by
           | far". This one is a very close second (500 vs 580 forks):
           | https://github.com/sd-webui/stable-diffusion-webui/tree/dev
        
             | Karuma wrote:
             | That's why I said "right now", since I feel that most
             | people have moved from the one you linked to AUTOMATIC's
             | fork by now. hlky's fork (the one you linked) was by far
             | the most popular one until a couple of weeks ago, but some
             | problems with the main developer's attitude and a never-
             | ending migration from Gradio to Streamlit filled with
             | issues made it lose its popularity.
             | 
             | AUTOMATIC has the attention of most devs nowadays. When you
             | see any new ideas come up, they usually appear in
             | AUTOMATIC's fork first.
        
           | jaggs wrote:
           | This needs Windows 10/11 though?
        
             | Karuma wrote:
             | Nope. There are instructions for Windows, Linux and Apple
             | Silicon in the readme:
             | https://github.com/AUTOMATIC1111/stable-diffusion-webui
             | 
             | There's also this fork of AUTOMATIC1111's fork, which also
             | has a Colab notebook ready to run, and it's way, way faster
             | than the KerasCV version:
             | https://github.com/TheLastBen/fast-stable-diffusion
             | 
             | (It also has many, many more options and some nice, user-
             | friendly GUIs. It's the best version for Google Colab!)
        
               | jaggs wrote:
               | Brilliant thanks.
        
           | sophrocyne wrote:
           | While AUTOMATIC is certainly popular, calling it the most
           | active/popular would be ignoring the community working on
           | Invoke. Forks don't lie.
           | 
           | https://github.com/invoke-ai/InvokeAI
        
             | counttheforks wrote:
             | > Forks don't lie.
             | 
             | They sure do. InvokeAI is a fork of the original repo
             | CompVis/stable-diffusion and thus shares its fork counter.
             | Those 4.1k forks are coming from CompVis/stable-diffusion,
             | not InvokeAI.
             | 
             | Meanwhile AUTOMATIC1111/stable-diffusion-webui is not a
             | fork itself, and has 511 forks.
        
               | pwillia7 wrote:
               | Subjectively, AUTOMATIC has taken over -- I have not
               | heard of invoke yet but will check it out.
        
               | toqy wrote:
               | The only reason to use it imo has been if you need mac/m1
               | support, but that's probably in other forks by now
        
               | sophrocyne wrote:
               | Welp - TIL.
               | 
               | Thanks for the correction.
               | 
               | Any idea on how to count forks of a downstream fork? If
               | anyone would know... :)
        
       | rcarmo wrote:
       | This is _markedly_ faster than the PyTorch versions I've seen
       | (nothing against the library, just categorizing the
        | implementations). It would be nice to see this include the
        | little quality-of-life additional models (eye fixes, upscaling,
        | etc.), but I suspect the optimizations are transferable.
       | 
        | Either way, getting 3 images at 25 iterations in under 10
        | seconds (a quick Colab test, which is where I've taken to
        | comparing these things) is just ridiculously fast.
        
         | zone411 wrote:
         | Which GPU did you test on Colab? Are you comparing with one of
         | the fp16 PyTorch versions? Their test shows little improvement
         | on V100.
         | 
          | PyTorch is now quite a bit more popular than Keras in
          | research-type code (except when it comes from Google), so I
          | don't know if these enhancements will get ported. This port
          | was done by people working on Keras, which is kind of telling
          | - there isn't a lot of outside interest.
        
           | _ntka wrote:
            | This is not true; the initial Keras port of the model was
            | done by Divam Gupta, who is not affiliated with Keras or
            | Google. He works at Meta.
           | 
            | The benchmark in the article uses mixed precision (and
            | equivalent generation settings) for both implementations;
            | it's a fair benchmark.
           | 
           | In the latest StackOverflow global developer survey,
           | TensorFlow had 50% more users than PyTorch.
        
             | zone411 wrote:
             | Two Keras creators are listed as authors on this post. If
             | they were not involved, this should be specified. I
             | specifically talked about research and StackOverflow is not
             | in any way representative of what's used. Do you disagree
             | that the majority of neural net research papers now only
             | have PyTorch implementations, not TensorFlow? Also,
             | according to Google Trends, PyTorch is more popular: https:
             | //trends.google.com/trends/explore?geo=US&q=pytorch,te....
             | BTW, I would love it if TF made a strong comeback, it's
             | always better to have two big competing frameworks and I
             | have some issues with PyTorch, including with its
             | performance.
        
             | polygamous_bat wrote:
             | > In the latest StackOverflow global developer survey,
             | TensorFlow had 50% more users than PyTorch.
             | 
             | It also doesn't help that PyTorch has its own discussion
             | forum [1] where most pytorch questions end up.
             | 
             | [1]: https://discuss.pytorch.org/
        
           | kgwgk wrote:
           | Should we expect people not working on keras to have the
           | interest and ability to get it to work on keras?
        
             | zone411 wrote:
             | If these people have existing Keras code they want to
             | integrate or they are interested in developing it further
             | in Keras, then it shouldn't require any insider knowledge
             | to create a Keras version of a small but popular open-
             | source project like this. I am very sure we'd get a PyTorch
             | version made by outsiders quickly if Stable Diffusion was
             | originally released in Keras/TF.
        
               | kgwgk wrote:
               | What is your definition of outsider?
               | 
               | We got a Keras version made by Divam Gupta very quickly
               | after Stable Diffusion was released.
               | 
               | Is he not an outsider?
        
               | zone411 wrote:
               | From what I can tell this Keras version was just released
               | (the date on the post is Sep. 25) and the first author
               | listed is the creator of Keras. Is this incorrect? I am
               | not familiar with Divam Gupta and I would consider
               | outsiders to be people not paid by Google.
        
               | kgwgk wrote:
               | https://mobile.twitter.com/divamgupta/status/157123450432
               | 020...
               | 
               | https://github.com/divamgupta/stable-diffusion-tensorflow
               | 
               | Now they are working together. That may be "telling" to
               | you but I'm not sure why that should cast a negative
               | light on Keras, really.
        
               | zone411 wrote:
               | I didn't say that it casts a negative light on Keras.
               | Just on its popularity among outsiders. There are
               | thousands of great libraries out there that are much less
               | popular than Keras or PyTorch. And BTW, JAX is a useful
               | Google-created framework that's growing in popularity
               | among researchers and pushed PyTorch to improve
               | (functorch), so I have nothing against Google projects.
        
               | kgwgk wrote:
               | The reason why we're having this discussion is that what
               | you call a Keras outsider ported Stable Diffusion to
               | Keras last week.
               | 
               | It's hard to understand how that can say anything
               | negative about the popularity of Keras among outsiders.
        
               | zone411 wrote:
               | So why are Keras creators listed as authors on this post
               | and why is it on Keras' official site? Compare this to
               | hundreds of PyTorch SD forks that have been thrown up on
               | GitHub.
               | 
               | The OP was wondering whether additional enhancements will
                | also be ported and that's what I was responding to. It's
               | simply much less likely that a new paper will get a Keras
               | implementation than a PyTorch implementation.
        
         | nextaccountic wrote:
         | Is this faster even after applying the optimizations that
          | reduce VRAM usage? (some of which the Keras version seems to
         | lack)
        
       | labarilem wrote:
       | Very interesting performance. Also a very good write-up. Can't
       | wait to try this.
        
       | gpderetta wrote:
       | I have a mediocre GPU but a fast CPU (with a lot of RAM). Would I
       | see improvements there?
       | 
       | I guess I should give it a try.
        
         | senthilnayagam wrote:
          | Tried it yesterday; on an Intel i9 MacBook Pro it takes about
          | 300 seconds per image.
        
           | gpderetta wrote:
           | You mean the keras version? How does it compare to the
           | original one? Currently on my 10850k I get 2.4s/iteration,
           | which is borderline usable. I haven't managed (nor tried very
            | hard) to get the CUDA version working on my 1070; I expect
            | it to be a little better, but I don't want to fight with RAM
            | issues.
        
           | ttflee wrote:
           | How many steps did you perform?
           | 
           | I tried some and found no major differences after 16 steps or
           | so with given random seed.
        
         | ttflee wrote:
          | On an Intel MacBookPro 2020, CPU-only, the original one[1]
          | using PyTorch utilized only one core. A TensorFlow
          | implementation[2] with oneDNN support, which utilized most of
          | the cores, ran at ~11 sec/iteration. Another OpenVINO-based
          | implementation[3] ran at ~6.0 sec/iteration.
         | 
         | [1] https://github.com/CompVis/stable-diffusion/
         | 
         | [2] https://github.com/divamgupta/stable-diffusion-tensorflow/
         | 
         | [3] https://github.com/bes-dev/stable_diffusion.openvino/
        
           | gpderetta wrote:
           | Yes, I use [3] and I get 2.4s/iter on my 10 core machine. I
           | was wondering if keras would give additional help here. I'll
           | have to try I guess.
        
       | erwinh wrote:
        | Not really my area of expertise, but if, as stated by the
        | article, 2 lines of code can already get a 2x performance gain,
        | what more can be done to improve performance in the coming
        | years?
        
         | londons_explore wrote:
         | It's not two lines of code... It's 2 lines that enable tens of
         | thousands of lines of library code by invoking a new
         | optimizer...
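          | 
          | For reference, the two lines in question are probably along
          | these lines (a sketch; the exact KerasCV constructor arguments
          | are my guess, not copied from the article):
          | 
          |   import keras_cv
          |   from tensorflow import keras
          | 
          |   # Line 1: run the math in float16, keep float32 variables.
          |   keras.mixed_precision.set_global_policy("mixed_float16")
          | 
          |   # Line 2: ask for XLA compilation of the generation graph
          |   # (argument name assumed).
          |   model = keras_cv.models.StableDiffusion(jit_compile=True)
          | 
          |   images = model.text_to_image(
          |       "photograph of an astronaut riding a horse",
          |       batch_size=3)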
        
         | MintsJohn wrote:
          | I'm curious whether this really is "the fastest model yet";
          | there are PyTorch optimizations as well.
         | 
         | Something like global optimization has been done in pytorch,
         | here's a blog about it: https://www.photoroom.com/tech/stable-
         | diffusion-25-percent-f...
         | 
         | Mixed precision seems pretty much default looking at a few
         | Stable Diffusion notebooks.
         | 
         | More intriguing, there's also a more local optimization that
         | makes pytorch faster: https://www.photoroom.com/tech/stable-
         | diffusion-100-percent-...
         | 
         | Unless it's already there, that last one would be interesting
         | to add to keras.
         | 
          | All in all, this machine learning ecosystem is wild. As a
          | software dev, things like cache locality and preferring
          | computation over memory access are basic optimizations, yet in
          | machine learning they seem widely disregarded; I've seen
          | models happily swapping between GPU and system memory to do
          | numpy calculations.
         | 
          | Hopefully Stable Diffusion changes things; the work towards
          | optimization is there, it just often seems disregarded. As
          | Stable Diffusion is a popular open model that, when optimized,
          | can be run locally (and not as SaaS, where you just add extra
          | compute power, which seems cheaper than engineers) and has a
          | lot of enthusiasm behind it, it might just be the spark that
          | makes optimization sexy again.
        
       | shadowgovt wrote:
       | Bonus points for this article being one of the clearest
       | explanations for how Stable Diffusion works that I've seen to-
       | date.
        
       | unspecldn wrote:
       | How do I deploy this? Can someone offer some guidance please?
        
       | monkmartinez wrote:
       | Is the H5 file type that much different than whatever the Pytorch
       | versions are using?
       | 
       | The model is loaded from Huggingface during the instantiation of
       | the stable diffusion class. It is loaded as an H5 file which I
       | believe is unique to Keras[0]. I don't have any experience with
       | Keras so I can't say if that is good or bad. I wanted to see
       | where they were getting the weights as the blog post didn't
       | demonstrate an explicit loading function/call like Pytorch.
       | 
       | Gonna run it and see... although I have like 40GB of stable
       | diffusion weights on my computer now.
       | 
       | [0] https://github.com/keras-team/keras-
       | cv/blob/master/keras_cv/...
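        | 
        | For what it's worth, a minimal sketch of the Keras HDF5 weight
        | format in question (toy model, placeholder filename);
        | functionally it plays the same role as a PyTorch .ckpt/.pt state
        | dict: named weight tensors on disk.
        | 
        |   from tensorflow import keras
        | 
        |   model = keras.Sequential(
        |       [keras.layers.Dense(4, input_shape=(8,))])
        |   model.save_weights("tiny.h5")   # ".h5" selects HDF5 format
        | 
        |   clone = keras.Sequential(
        |       [keras.layers.Dense(4, input_shape=(8,))])
        |   clone.load_weights("tiny.h5")   # weights matched back by layer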
        
       | mikereen wrote:
       | enhance
        
       | xiphias2 wrote:
        | "Note that when running on a M1 MacBookPro, you should not
        | enable mixed precision, as it is not yet well supported by
        | Apple's Metal runtime"
       | 
       | It is a bit sad if this is just a closed software issue that
       | cannot be fixed :(
        
         | ribit wrote:
         | Mixed precision won't do anything on Apple Silicon anyway since
         | there is no performance advantage to using FP16 (aside from
          | decreasing register pressure and RAM bandwidth, which won't
          | happen here as the data is FP32 to start with).
        
         | capableweb wrote:
          | Is it really that sad? Closed software/hardware won't get
          | support (official or community) for things until the
          | maintainer of the software adds it, and people who buy that
          | kind of hardware are more than aware of this pitfall (and in
          | fact, sometimes even see it as a benefit).
        
           | lynndotpy wrote:
           | I'm a new MacOS user and, while I did anticipate some of
           | these issues, I do often find myself surprised when running
            | into them. This was one such surprise I hit recently.
        
       | nextaccountic wrote:
       | Does this run on AMD?
       | 
       | A problem I see is that a lot of times everything works fine on
       | rocm+hip, but since nvidia dominates the machine learning market
        | (and thus most researchers run Nvidia), most forks don't bother
       | checking and just advertise compatibility with nvidia and
       | sometimes apple M1.
       | 
       | Problem is, AMD GPUs are much cheaper!
        
         | mrtksn wrote:
         | Well, high-end stuff is always on Nvidia and Apple Silicon
          | seems to get some love because of its unified memory, which
          | makes it possible in the first place, plus its popularity
          | among developers.
         | 
         | AMD seems to be popular among gamers on budget and the budget
         | cards often don't have the VRAM required by default. So, AMD
         | seems to be in this weird place where the people who can make
         | it work don't care.
        
           | mrtranscendence wrote:
           | For what it's worth, at the consumer level AMD cards -- at
           | least recently -- have tended to have more VRAM than Nvidia
           | cards. My 3080 Ti, which I bought for $1400 (though it now
           | goes for ~$1k), has less RAM (12GB) than a 6800 XT that you
           | can get for $600 (16GB).
        
         | cypress66 wrote:
         | > Problem is, AMD GPUs are much cheaper!
         | 
         | Are they? I believe Nvidia (consumer) gpus have better
         | price/performance than amd for AI.
        
           | nextaccountic wrote:
           | I don't know about AI performance (does this happen only
           | because of the overhead of providing CUDA through rocm+HIP?),
           | but I was just checking and at least in my country (Brazil),
           | for any given memory size (12GB, 8GB, 4GB) I can find cheaper
            | AMD GPUs than Nvidia GPUs.
           | 
           | Here I'm considering that the main constraint is VRAM and
           | while stable diffusion now runs even on GPUs with 2GB RAM,
            | there are always new developments that require more VRAM (for
           | example, Dreambooth requires 12GB as of today)
        
           | mrtranscendence wrote:
           | Maybe for AI? For other tasks, especially gaming, they punch
           | well above their weight relative to Nvidia (though they lack
           | features in comparison). It's also possible to get a 16GB
           | card for much cheaper from AMD than Nvidia.
        
       | gdubs wrote:
       | Has anyone tried running this with an AMD card on Mac? At first
       | glance it's able to run on Metal (given the M1 compatibility)...
        
       | mrtksn wrote:
       | On a 16GB 8c8g Macbook Air M1, the PyTorch implementation takes
       | about 3.6s/step which is about 3 minutes per image with the
        | default parameters. I wonder how much faster this would be. If
        | there's anyone out there with a similar system who wants to
        | compare, could you please write up your findings?
        
         | thisisjasononhn wrote:
          | Not an M1 comparison, but I'm working on testing various GPU
          | vs M1 setups with a few accessible cloud providers. My
          | impression is that times should be the same, but it's nice to hear
         | other real-world stats for M1 with SD. Makes me really want to
         | rent the Hetzner M1 now.
         | 
         | Which repo or build are you using BTW, is it the one related to
         | this readme?
         | 
         | https://github.com/magnusviri/stable-diffusion/blob/main/REA...
        
           | stared wrote:
           | I would love to see it, but this file is not accessible.
        
             | thisisjasononhn wrote:
             | Sorry about that, web link rot sure is real eh.
             | 
             | This is an example of the original file:
             | https://github.com/magnusviri/stable-
             | diffusion/blob/79ac0f34...
             | 
             | Which seems to have been renamed, and cleaned up a bit
             | here: https://github.com/magnusviri/stable-
             | diffusion/blob/main/doc...
             | 
             | However, per the note on the magnusviri repo, the following
             | repo should be used for a stable set of this SD Toolkit:
             | https://github.com/invoke-ai/InvokeAI
             | 
             | with instructions here https://github.com/invoke-
             | ai/InvokeAI/blob/main/docs/install...
        
           | mrtksn wrote:
           | >Which repo or build are you using BTW, is it the one related
           | to this readme? https://github.com/magnusviri/stable-
           | diffusion/blob/main/REA...
           | 
           | Yes, this one. However it was like a month ago I think, so
           | speeds might have improved. I'm getting ~2.2s/step with
           | another implementation:
           | https://news.ycombinator.com/item?id=33006447
        
             | thisisjasononhn wrote:
             | Wow, that sounds like a good improvement.
             | 
             | I am also wondering, do you follow the general advice of 1
             | iteration and 1 sample, for example:
             | 
             | --n_samples 1 --n_iter 1 (when referencing commands using
             | txt2img.py)
             | 
             | I figure you could wait a bit for things to process going
             | further, but curious just if you're getting results like
             | that with higher sample/iter settings.
        
               | mrtksn wrote:
               | I usually go with the default parameters.
        
         | mft_ wrote:
         | I've not tried it, but this approach apparently takes 10-20s
         | per image?
         | 
         | https://reddit.com/r/StableDiffusion/comments/xbo3y7/oneclic...
        
           | mrtksn wrote:
            | I just gave it a spin; it took 1 min 52 sec for a 50-step
            | image, which is ~2.2s/step. It seems faster than my original
            | installation (which might also have improved in speed, as it
            | was at a very beta stage when I tried it) but definitely not
            | 20 seconds for a 50-step image at 512x512 resolution.
           | 
           | Maybe they use lower parameters.
           | 
           | edit:
           | 
           | 50 steps at 256x256 resolution took 55 seconds.
           | 
           | 50 steps at 768x768 resolution took 8 min, exactly.
           | 
            | PS: my Macbook Air is modified with thermal pads, so it
            | takes a bit longer to start throttling than usual. Either
            | way, it's
           | very dependent on the ambient temperature.
        
       | WatchDog wrote:
       | I don't quite understand the benefit of mixed precision.
       | 
       | It seems like using high precision is useful for training, but if
       | not training, why not just use float16 weights and save the
       | memory?
        
         | NavinF wrote:
         | Converting weights to float16 after training will reduce
         | quality/accuracy whereas mixed precision has a negligible
         | effect on quality/accuracy and dramatically improves
         | performance.
         | 
         | If you really just want to save memory, there's plenty of other
         | low hanging fruit. It's just not a priority for most devs since
         | mid tier GPUs start at 10GB whereas a typical model only has
         | 0.5GB weights. Activations and intermediate calculations use
         | way more memory.
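          | 
          | A small Keras sketch of the distinction: under the
          | "mixed_float16" policy the compute dtype is float16 while the
          | stored variables stay float32 (attribute names as in TF 2.x).
          | 
          |   from tensorflow import keras
          | 
          |   keras.mixed_precision.set_global_policy("mixed_float16")
          |   layer = keras.layers.Dense(8)
          |   layer.build((None, 16))
          |   print(layer.compute_dtype)   # float16: the matmul precision
          |   print(layer.variable_dtype)  # float32: the stored weights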
        
         | zone411 wrote:
         | You usually can. But it can take some work if you're using any
         | libraries that expect FP32 and it might be slower, depending on
         | the GPU. The FP16 support isn't quite as good as FP32.
        
       | dennisy wrote:
       | This is amazing! I am more used to TF so very happy to see this!
       | 
       | Has anyone got a suggestion on how to fine tune this model?
        
       | itronitron wrote:
       | someone should compare results with just doing a keyword search
       | on deviantart
        
       | JoeAltmaier wrote:
       | The otter examples highlight something you can't control using
       | these things: the 'eats shoots and leaves' phenomenon.
       | 
       | The prompt was "A cute otter in a rainbow whirlpool holding
       | shells, watercolor"
       | 
       | Seems like the otter should be holding shells, the way a normal
       | human parses it.
       | 
       | The tool showed the otter 'in holding-shells', which are shells
       | that hold otters apparently. Also some random shells strewn
       | about, as the technique is sensitive to spurious detail sprouting
       | up from single words.
       | 
       | Until the tool permits some kind of syntactic diagramming or so
       | forth, we'll not be able to control for this.
       | 
       | Just the other day here, I saw a picture of a fork and some
       | plastic mushrooms. The prompt was 'plastic eating mushrooms'
       | which was ambiguous even to humans. The tool chose to illustrate
       | the subclass of mushrooms 'eating-mushrooms' (as opposed to
       | poison mushrooms or decorative mushrooms I suppose) made of
       | plastic.
       | 
       | When we're playing around this can seem whimsical and artistic.
       | But a graphic designer might want some semblance of control over
       | the process.
       | 
       | Not sure how a solution would work.
        
         | CuriouslyC wrote:
         | Graphic designers lean on img2img in their workflows more than
         | txt2img, as that gives you the control you speak of.
        
         | UncleEntity wrote:
         | My favorite is when you do "<whatever> bla, bla, bla, wearing a
         | t-shirt by <artist>" and it gives you an image of <whatever>
         | wearing a t-shirt with a print in the style of the artist.
         | Which adds extra dimensions to play with so isn't all that bad.
        
         | CrazyStat wrote:
         | This is the compositionality problem--the language model
         | sometimes doesn't quite know how to put the words together.
          | Better language models will help in the future; in the
          | meantime you can give it a helping hand by prompt engineering
          | or using img2img.
        
       | honksillet wrote:
        | Can this be used to train your own model? I have a moderately
        | large medical image dataset that I would like to try this with
        | for data augmentation.
        
       | jawadch93 wrote:
        
       ___________________________________________________________________
       (page generated 2022-09-28 23:00 UTC)