[HN Gopher] Show HN: InvokeAI, an open source Stable Diffusion t...
       ___________________________________________________________________
        
       Show HN: InvokeAI, an open source Stable Diffusion toolkit and
       WebUI
        
       Hey everyone!  Excited to share the release of `InvokeAI 2.0 -
       A Stable Diffusion Toolkit`, an open source project that aims
       to provide both enthusiasts and professionals with a suite of
       robust image creation tools. Optimized for efficiency, InvokeAI
       needs only ~3.5GB of VRAM to generate a 512x768 image (and less
       for smaller images), and is compatible with Windows/Linux/Mac
       (M1 & M2).  InvokeAI was one of the earliest forks of the core
       CompVis repo (formerly lstein/stable-diffusion), and recently
       evolved into a full-fledged, community-driven open source
       Stable Diffusion toolkit titled InvokeAI. The new version
       introduces an entirely new WebUI front-end with a Desktop mode,
       and an optimized back-end server that can be driven via the CLI
       or extended with your own fork.  This version of the app
       improves in-app workflows, leveraging GFPGAN and CodeFormer for
       face restoration and RealESRGAN for upscaling. Additionally,
       the CLI supports a large variety of features:

       - Inpainting
       - Outpainting
       - Prompt Unconditioning
       - Textual Inversion
       - Improved quality for high-resolution images (Embiggen,
         hi-res fixes, etc.)
       - And more...

       Planned future updates include UI-driven
       outpainting/inpainting, robust Cross Attention support, and an
       advanced node workflow for automating and sharing your
       workflows with the community.  We're excited by the release,
       and about the future of democratizing the ability to create.
       Check out the repo (https://github.com/invoke-ai/InvokeAI) to
       get started, and join us on Discord
       (https://discord.gg/ZmtBAhwWhy)!
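       For a concrete sense of the CLI, here is a sketch of a typical
       interactive session. The switches follow the dream-style syntax
       inherited from lstein/stable-diffusion (`-s` steps, `-W`/`-H`
       dimensions, `-n` image count, `-C` CFG scale, `-G` face-restoration
       strength, `-U` upscale factor); exact flags may differ between
       versions, so treat this as illustrative rather than authoritative:

```text
$ python scripts/invoke.py
* Initializing, be patient...
invoke> "a ruined castle on a cliff, dramatic lighting" -s 50 -W 512 -H 768 -n 3
invoke> "portrait of an astronaut, studio photo" -s 30 -C 7.5 -G 0.8 -U 2
invoke> q
```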
        
       Author : sophrocyne
       Score  : 207 points
       Date   : 2022-10-10 18:48 UTC (4 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | KaoruAoiShiho wrote:
        | Is there anything new here that might interest an existing
        | user of automatic1111's GUI enough to switch?
        
         | sophrocyne wrote:
          | To be fair - Auto has been acquiring features at an insane
          | clip (recently getting banned from the SD Discord over
          | accusations of code theft lol)
         | 
          | I think Invoke is competitive for now, but its biggest
          | advantages are an improved UX and a large community with an
          | ambitious roadmap focused more on enthusiasts/pros.
         | 
         | I'd give it a whirl and see where you end up preferring to do
         | your SD projects :)
        
           | KaoruAoiShiho wrote:
            | Oh, I see that my comment might be interpreted as snarky;
            | I was literally just asking for a list of the things that
            | are new or different, because that would be very helpful
            | for everyone.
        
             | sophrocyne wrote:
             | Oh, no snark interpreted! Very valid question.
             | 
              | I legitimately think the answer is:
              | 
              | - Experimental/novel SD features: Automatic typically
              |   has them first.
              | - Better UI/UX and exploration/gallery workflows:
              |   Invoke.
             | 
             | From a feature perspective, both have their own flavor of
             | certain features.
             | 
             | Goal of Invoke is to eventually have prompt/concept
             | library, a node-based workflow UI (with the ability to
             | share techniques), etc.
             | 
              | It's kind of: switch if you want a better UX now; keep
              | an eye on it if you want new workflow solutions long
              | term.
        
               | hleszek wrote:
               | Or you can stay on automatic1111 and use
               | https://diffusionui.com/b/automatic1111 for better UX.
        
         | nikkwong wrote:
         | One thing is that invoke-ai can be run via CLI or possibly
          | programmatically. I haven't found a good way to do that with the
         | automatic GUI. Personally, I've also found features in
         | automatic to be buggy. For example, batching seems to always
         | break the UI for me personally. With the invoke-ai fork, I can
         | run the CLI and produce images all night if I want to.
         | 
          | The bee's knees would be being able to use automatic as a
          | CLI or with some programmatic interface, because it is more
          | feature-rich. But I haven't seen anything that allows that
          | yet, so I'm stuck with its clunky UI, or with invoke-ai.
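          As a sketch of that kind of unattended batch run: the
          lstein-lineage CLI has supported reading prompts from a file,
          so an overnight job can look something like this (file name,
          switches, and paths are illustrative, not exact):

```text
$ cat prompts.txt
"a watercolor lighthouse at dawn" -s 50 -n 20
"an isometric pixel-art city block" -s 50 -n 20

$ python scripts/invoke.py --from_file prompts.txt --outdir outputs/overnight
```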
        
         | hleszek wrote:
         | You can use https://diffusionui.com/b/automatic1111 once the
         | automatic1111 webui is running:
         | 
         | - better inpainting
         | 
         | - a gallery to easily compare generations and easily regenerate
         | images with small modifications
         | 
          | - responsive design --> works great on mobile; swipe
          | left/right to switch between pictures in the same
          | generation, and up/down to switch to another generation to
          | compare
         | 
         | Here is the repo: https://github.com/leszekhanusz/diffusion-ui
         | 
          | Also, if you don't have the hardware, you can get images
          | for free using the Stable Horde (https://stablehorde.net),
          | a cluster of backends provided for free by volunteers.
         | 
         | You can test it here: https://diffusionui.com/b/stable_horde
        
           | nobo122 wrote:
            | Automatic1111's webui is what 90% of the StableDiffusion
            | community uses, but he recently made the decision to use
            | a company's proprietary code after it was leaked by a
            | hacker. When confronted about it, instead of removing it
            | as requested, he chose to lie about it, despite the git
            | history evidence and the fact that the paper he claimed
            | to have used as a reference wasn't related at all to the
            | techniques used by the stolen code.
            | 
            | The company whose code was stolen works closely with the
            | man behind SD, and the decision was made to merely ban
            | him from the community instead of torpedoing the repo via
            | DMCA.
        
             | sdflhasjd wrote:
              | This seems to have been proven not to be the case; the
              | code in fact originates from a pre-Stable-Diffusion
              | MIT-licensed repo.
              | 
              | It's not even a particularly large or interesting piece
              | of code; the only reason it's controversial is that the
              | code is only necessary to use the leaked NovelAI
              | models.
        
           | [deleted]
        
       | iFire wrote:
        | Can you make the InvokeAI UI as easy to install as running a
        | Windows 11 command line script?
       | 
       | I couldn't get it to work following https://invoke-
       | ai.github.io/InvokeAI/installation/INSTALL_WI...
       | 
       | Similar to https://github.com/cmdr2/stable-diffusion-
       | ui/releases/tag/v2...
        
         | sophrocyne wrote:
          | One-click install is a goal; we just need a contributor
          | who is confident taking it on as a project.
        
       | cercatrova wrote:
        | Speaking of SD, I wonder if 1.4 will be the last truly open
        | release, as Emad said 1.5 would release a while ago but it's
        | been held up for "compliance" reasons. Maybe they got legal
        | threats over using artists' works and stock images. If so,
        | that would be sad to see.
       | 
       | In a way it reminds me of people who make unofficial remakes of
       | games but get cease and desists if they show gameplay while in
       | development. The correct move is to fully develop the game and
       | release it, then if you get C&Ds, too late, the game is already
       | available to download.
        
         | sophrocyne wrote:
         | My take is that the "genie is out of the bottle"
         | 
          | Single-source "massive models" may be more difficult to
          | get out, but Emad said they're working on licensing a ton
          | of content to train future models. Even then, anyone can
          | train new models now - the output from Dreambooth and
          | Textual Inversion is already impressive, and seems like
          | just the beginning.
         | 
         | Going to be an interesting road ahead.
        
           | cmxch wrote:
            | Definitely is out of the bottle, especially when
            | training-capable cards are getting within reach of
            | regular people.
           | 
           | Sort of hinted at it upthread, but would be interesting if
           | this eventually brings competition to the GPU compute space
           | (AMD, Intel?) .
        
           | judge2020 wrote:
           | Just train on the output of existing models minus any photos
           | with watermarks - being twice removed is sure to make it even
           | harder to claim copyright :)
        
         | swyx wrote:
         | make what you will of it but as of yesterday this was his
         | answer to one of my readers:
         | https://twitter.com/EMostaque/status/1579204017636667392
         | 
         | > No actually dev decision. Generative models are complex to
         | release responsibly and team still working on release
         | guidelines as they get much better, 1.5 is only a marginal FID
         | improvement.
        
           | londons_explore wrote:
           | Sounds like some middle manager is trying to put his foot
           | down and is saying things like "no more releases till we have
           | designed and tested a 37 step release signoff procedure"
        
         | F2hP18Foam wrote:
         | I wonder if the same will happen to Midjourney or Dall-E. I
         | have generated images on Midjourney that literally had a
         | 'Shutterstock' watermark plastered across them. This watermark
         | was conspicuously missing when the image was upscaled.
        
           | noduerme wrote:
           | I've had stock photo watermarks show up repeatedly in SD
           | generations as well.
        
       | nohat wrote:
        | I've been using a modified version of lstein's fork since
        | almost the beginning. Recommended! It does lack some of the
        | features of e.g. automatic1111, but it has a good CLI, and it
        | actually has a license, which is pretty important (as NovelAI
        | has learned).
        
       | tehsauce wrote:
        | Shameless plug: if anyone is interested in building apps
        | using stable diffusion and wants to keep things as cheap as
        | possible, I built a very user-friendly API that is 1/4 the
        | cost of the official stable diffusion API. There is also a
        | free demo.
       | 
       | You can try it out:
       | 
        | https://computerender.com
        
         | capableweb wrote:
         | The page at https://computerender.com/cost.html has the title
         | "How is computerender 4x cheaper than other services hosting
         | Stable Diffusion?" but doesn't actually explain how/why it is
          | cheaper, just that "crowd-sourced servers are much more
          | difficult to work", without elaborating on how what you're
          | doing differs from that.
         | 
         | Care to shine some light on it? Using something like
         | runpod/vast.ai would be my guess?
        
       | neilv wrote:
        | Nice! lstein's is the SD fork that I ended up using, and I'm
        | delighted to see it evolve into InvokeAI and keep getting
        | better.
        
       | pdntspa wrote:
       | Min requirements say 12gb, I take it this doesn't have the
       | optimizations that automatic1111 has for <8gb cards?
        
         | capableweb wrote:
          | You can run it with lower VRAM for sure; up until some
          | weeks ago, I was using that repository with an 11GB card.
        
       | paulirish wrote:
       | PSA: You can email support@github to ask them to "detach my repo
       | as a fork", in case the repo has matured so much it shouldn't
       | have the "forked from ..." treatment.
        
         | suyash wrote:
         | That's all good but it's nice to give credit where credit is
         | due. I like how they do it in the README.
        
       | lawik wrote:
        | Oh, I used the dream.py script to back a Telegram bot. It later
       | ended up in my demo for my talk Chat Bots as User Interfaces
       | (with Elixir): https://www.youtube.com/watch?v=DFGHaER6_j4
       | 
       | I primarily used the InvokeAI release because I found it was easy
       | to get going with on Linux and then it was simple enough to hack
       | around with.
       | 
        | It's also the first tool I've ever used where I've ridden
        | the ragged edge of what my 3070 is okay with. I've had
        | graphical glitches due to occupying all the video memory (KDE
        | doesn't like it), and I've had to quit apps to make it work.
       | 
       | Thanks for making a useful thing of all this Stable Diffusion
       | stuff. I've enjoyed it.
        
       | cmsj wrote:
        | Yay! I built an IRC bot for SD using lstein's repo because
        | it was the first one that I could get to work reliably on M1,
        | so I'm really glad to see the project continue to develop as
        | InvokeAI!
        
       | Timwi wrote:
       | Sounds awesome! Unfortunately, it says that it requires a GPU.
       | Please consider making it accessible to people without a GPU, for
       | example using OpenVino like this (command line only) project
       | does:
       | 
       | https://github.com/bes-dev/stable_diffusion.openvino
       | 
       | Thanks!
        
         | geuis wrote:
          | What you're asking for isn't entirely practical for local
          | installs. Yes, you can run SD on a CPU, but each image
          | takes minutes at a time vs. seconds on a GPU.
          | 
          | For example, it's not possible to run SD on my 2-year-old
          | 16-inch Intel MacBook Pro. This is because PyTorch doesn't
          | have support for the slightly older AMD GPU on board.
          | There's a newer framework called ROCm for AMD cards that
          | allows them to work with recent versions of PyTorch.
          | 
          | Given all that, the requirement to have an Nvidia card is
          | entirely acceptable, and for the most part a technical
          | requirement.
        
         | hleszek wrote:
         | Those who don't have a GPU could use the Stable Horde:
         | https://stablehorde.net
        
       | cmxch wrote:
        | How hard a requirement is the Nvidia graphics chip?
        | Polaris-era AMD chips do work decently at the 4GB level
        | (although a bit finicky), and Navi/Big Navi AMD cards work
        | reasonably well with modern ROCm.
        
         | pja wrote:
          | Stable Diffusion works for me with a Polaris GPU. I had to
          | compile my own local copy of PyTorch to use it, but
          | everything runs.
        
           | cmxch wrote:
           | Which documentation/build environment are you using?
           | 
           | I'm using Ubuntu(to follow what AMD has for ROCm) and
           | building the entirety of (gfx803 patched) ROCm from source.
           | 
           | It works with some forks but not others.
        
             | pja wrote:
             | IIRC it's just the standard AMD build from
             | https://repo.radeon.com/rocm/apt/5.2.3/ I think.
             | 
             | It's possible I had to do something weird, but I think it
             | supports Polaris OOB. rocminfo certainly thinks so.
             | 
              | I'm running it on Debian testing, with an equivs
              | package to get it to install cleanly:
              | 
              |     $ cat rocm-equivs
              |     Package: amdgpu-driver-fixes
              |     Provides: python,libstdc++-7-dev,libgcc-7-dev
              |     Architecture: all
              |     Description: Fixes the AMD GPU driver installation
              |      on Debian testing
             | 
             | I had to compile my own version of pytorch to get gfx803
             | support there.
             | 
             | I'll see if I can recreate the steps & create a runbook.
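              For reference, a custom gfx803 pytorch build usually
              boils down to pinning the target architecture before
              compiling from source - a sketch, assuming a working
              ROCm install and the standard `PYTORCH_ROCM_ARCH` build
              variable:

```text
$ git clone --recursive https://github.com/pytorch/pytorch
$ cd pytorch
$ export PYTORCH_ROCM_ARCH=gfx803
$ python tools/amd_build/build_amd.py   # hipify the CUDA sources for ROCm
$ python setup.py install
```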
        
             | MayeulC wrote:
              | I am in the same boat with a gfx803 card. What patch
              | did you use? The ones here?
              | https://github.com/xuhuisheng/rocm-build
              | 
              | I also tried to compile pytorch with its Vulkan
              | backend, but ended up throwing in the towel, as LDFLAGS
              | are a mess to get right (I successfully compiled it,
              | but that was only part of the build chain, and I
              | decided I had better things to spend time on). I wonder
              | how that would perform; ncnn works pretty decently.
        
               | pja wrote:
               | I did try that, but I don't think the pytorch Vulkan
               | backend is complete enough. I never did get a working
               | install going down that route though, so I could be
               | wrong.
        
         | wasyl wrote:
          | I ran some SD fork on a Radeon Pro Vega 20 GPU. I'm not
          | familiar with the whole setup, but it was running the
          | "torch mps backend"? Anyway, it was pretty fast and worked
          | well, so I'm a bit surprised at the lack of Intel Mac
          | support among all those SD forks.
        
       | swyx wrote:
        | [OT] It's been hard for me to trace the universe of Stable
        | Diffusion forks, so I've been maintaining a list here:
        | https://github.com/sw-yx/prompt-eng#sd-major-forks
        | 
        | Please let me know/send PRs if I missed anything; it's been a
        | couple of months, so I'm overdue for a round of
        | cleanup/reorganizing.
        
         | jayd1616 wrote:
         | Here's another one for you:
         | https://github.com/brycedrennan/imaginAIry
        
           | stavros wrote:
           | Imaginairy is great.
        
         | wyldfire wrote:
         | Is "dreambooth" a fork? Or another feature that has been
         | created by composing Stable Diffusion w/something else?
        
           | zenlikethat wrote:
           | Another feature. You can "teach" SD a new concept, e.g., a
           | new person, with a limited number of training images.
        
           | swyx wrote:
            | Yeah, you're right - more the latter. I should split it
            | out.
            | 
            | OG Dreambooth is proprietary Google code; what everyone's
            | using is a third-party replication of it using SD.
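            As an illustration, that third-party replication is
            commonly run via the Hugging Face diffusers Dreambooth
            example script, along these lines (flags abbreviated and
            illustrative; model path, data directory, and prompt are
            placeholders):

```text
$ accelerate launch train_dreambooth.py \
    --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
    --instance_data_dir="./photos_of_subject" \
    --instance_prompt="a photo of sks person" \
    --resolution=512 --train_batch_size=1 --max_train_steps=800 \
    --output_dir="./dreambooth-model"
```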
        
         | capableweb wrote:
         | I'm personally working on a UI as well, that is using InvokeAI
         | :) It has a bit of a different focus, namely organization of
         | generated images and facilitating generating a lot of images
         | quickly via randomization. Here is the current page for it:
         | https://patreon.com/auto_sd_workflow
         | 
         | Currently expanding it a lot with some fun features:
         | 
          | - multi-GPU support (the first UI to support that, I think)
         | 
         | - no-click installer (installs everything when you start it up)
         | that works on Windows, Linux and macOS
         | 
         | - A cloud version where you can "rent" access to the UI + very
         | powerful GPU instances without having to run anything locally
         | yourself.
         | 
          | Been waiting to submit it all to HN as a Show HN, but I
          | have to wait a bit for everything to get into place first
          | :)
        
         | grosswait wrote:
         | Maybe I missed it but didn't see
         | https://github.com/divamgupta/diffusionbee-stable-diffusion-...
        
         | hexomancer wrote:
         | Shameless plug: I was frustrated with the poor UI of notebook-
         | based frontends so I wrote a desktop version here:
         | https://github.com/ahrm/UnstableFusion .
         | 
         | Here is a video of some of its features:
         | https://www.youtube.com/watch?v=XLOhizAnSfQ&t=1s
        
         | hleszek wrote:
          | I see you've got my
          | https://github.com/leszekhanusz/diffusion-ui GUI, but it
          | seems to be linked to a completely unrelated face-swapping
          | interface?
         | 
         | And in communities, you can probably add
         | https://stablehorde.net
        
           | swyx wrote:
            | Uhhh... copy-paste brainfart, sorry. Thanks for the
            | correction.
        
       ___________________________________________________________________
       (page generated 2022-10-10 23:00 UTC)