[HN Gopher] Easy Stable Diffusion XL in your device, offline
___________________________________________________________________
Easy Stable Diffusion XL in your device, offline
Author : haebom
Score  : 219 points
Date   : 2023-12-01 14:34 UTC (8 hours ago)
(HTM) web link (noiselith.com)
(TXT) w3m dump (noiselith.com)
| ProllyInfamous wrote:
| The 16GB (base model) M2 Pro Mini, despite its overall awesomeness
| (running DiffusionBee.app, etc.)... does not meet the minimum system
| requirements (Apple Silicon requires 32GB RAM).
|
| So now I have to contemplate shopping for a new Mac TWICE in one year
| (never happened before).
| sophrocyne wrote:
| https://github.com/invoke-ai/InvokeAI - runs on Apple silicon, and can
| squeeze out SDXL images on a 16GB Mac with the SSD-1B or Turbo models.
| wsgeorge wrote:
| Currently using SDXL (through Hugging Face Diffusers) on an M1 16GB
| Mac. Takes on average 4-5 minutes to generate an image. It's usable.
| ttul wrote:
| Good lord. I can get a 2048x2048 upscaled output from a very complex
| ComfyUI workflow on a 4090 in 15 seconds. This includes three IPAdapter
| nodes, a sampling stage, a three-stage iterative latent upscaler, and
| multiple ControlNets. Macs are not close to competitive for inference
| yet.
| rsynnott wrote:
| I mean, a 4090 would appear to cost $2000, and came out a year ago; it
| has about 70bn transistors. The M1 could be had for $700 in a desktop,
| $1000 as part of a laptop, came out three years ago, and has 16bn
| transistors, some of which are CPU.
|
| An M3 Ultra might be a more reasonable comparison for the 4090.
| michaelt wrote:
| 24GB cards weren't always $2000. I've seen people on this very forum
| [1] who bought two 3090s for just $600 each.
|
| Agree the prices are crazy right now, though.
|
| [1] https://news.ycombinator.com/item?id=37438847
| myself248 wrote:
| When choosing a machine with non-expandable RAM, you went with the
| minimum configuration?
| That's a choice, I suppose, but the outcome wasn't exactly hard to
| foresee.
| sophrocyne wrote:
| There are already a number of local inference options that are
| (crucially) open source, with more robust feature sets.
|
| And if the defense here is "but Auto1111 and Comfy don't have as
| user-friendly a UI", that's also already covered.
| https://github.com/invoke-ai/InvokeAI
| GaggiX wrote:
| Also just Krita with the AI diffusion plugin:
| https://github.com/Acly/krita-ai-diffusion
| blehn wrote:
| No idea whether or not the UI is user-friendly, but the installation
| steps alone for InvokeAI are already a barrier for 99.9% of the world.
| Not to say Noiselith couldn't be open source, but it's clearly offering
| something different from InvokeAI.
| internet101010 wrote:
| I switched to InvokeAI and won't go back to the basic A1111 web UI. I
| like how everything is laid out, there are workflow features, you can
| easily recall _all_ properties (prompt, model, LoRA, etc.) used to
| generate an image, things can be organized into boards, and all of the
| boards/images/metadata are stored in a very well-designed SQLite
| database that can be tapped into via DataGrip.
| quitit wrote:
| automatic1111: great for fast implementation of the most recent
| generative features
|
| comfyui: excellent for workflows and for recalling them, as they're
| saved into the resulting image metadata (i.e. sharing images shares the
| image generation pipeline)
|
| InvokeAI: great UX and community; arguably a bit behind in features, as
| they were focused on making the UI work well. Now at the stage of
| bringing in the best features of competitors. Like you, I can easily
| recommend it above all other options.
| squeaky-clean wrote:
| > recalling the workflows, as they're saved into the resulting
| > image metadata (i.e. sharing images, shares the image generation
| > pipeline)
|
| Doesn't A1111 already do this?
| There's a PNG Info tab where you can drag and drop a PNG and it will
| pull the prompt, negative prompt, model, etc. And then there's a button
| to send it to the main generation tab. It doesn't automatically load
| the model, but that may be intentional because of how long it takes to
| change loaded models.
| holoduke wrote:
| Can you actually use those workflows through some sort of API, to
| automate them from, let's say, a Python script? Played around with
| Comfy. Really nice, but I would like to automate it within my own
| environment.
| sophrocyne wrote:
| Yeah, Invoke's nodes/workflows backend can be hit via the API. That's
| how the entire front-end UI (and workflow editor/IDE) is built.
|
| I'm positive this can be done with Comfy too.
| smcleod wrote:
| Yeah, InvokeAI is fantastic!
| AuryGlenz wrote:
| I realize it may be good marketing, but it's odd to have the fact that
| it's on-device and offline be the primary differentiator when that's
| probably how most people use Stable Diffusion already.
|
| I'd probably focus more on it being easy to install and use, as that's
| something that isn't done much. For me, if it doesn't have ControlNet,
| upscaling, some kind of face detailer, and preferably regional
| prompting, I'm out.
|
| I also kind of wish all of these people who want to make their own SD
| generators would instead work on one of the open-source ones that
| already exist.
|
| While an app store might be a good idea, in a world with Auto1111 and
| all of its extensions I think it's going to go over poorly with the
| Stable Diffusion community, for what it's worth.
| michaelt wrote:
| I think there's probably a bunch of people who don't use things like
| A1111 because of the complexities of the
| download-this-which-downloads-this-which-downloads-this-then-you-manually-download-this-and-this
| setup model.
|
| I can see how something simpler might appeal to _new_ users, even if it
| doesn't appeal to _existing_ users.
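[Editor's note: the PNG Info behavior discussed above works because A1111 writes its generation settings into a PNG tEXt chunk keyed "parameters" (ComfyUI similarly embeds "prompt"/"workflow"). A minimal, stdlib-only sketch of pulling those chunks out, with no dependency on any particular UI:]

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string.

    A1111 stores generation settings under the "parameters" keyword;
    ComfyUI uses chunks named "prompt" and "workflow".
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    chunks = {}
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt body is keyword, NUL separator, latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)
    return chunks

def make_chunk(ctype: bytes, body: bytes) -> bytes:
    """Assemble one PNG chunk (handy for building a test image)."""
    return (struct.pack(">I", len(body)) + ctype + body
            + struct.pack(">I", zlib.crc32(ctype + body)))
```

Usage would be something like `read_text_chunks(open("gen.png", "rb").read()).get("parameters")`; iTXt chunks (compressed text) are left out of this sketch.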
| AuryGlenz wrote:
| Sure, and I agree with that. As I said, I'd probably push that just as
| much as it being "offline," if not more.
| philipov wrote:
| You hit the nail on the head when you said it's good marketing, but go
| all the way. The thing you find odd tells you who they want to use
| their product; you're not their target audience. They are trying to
| convert people from online-only services like DALL-E, not people who
| already use SD.
| prepend wrote:
| I've oddly found many cloud wrappers around Stable Diffusion. So I like
| the upfront on-device/offline description.
|
| It was weird when I was first playing with SD how many packages did
| severe phone-home stuff or VMs or whatever instead of just downloading
| a bunch of stuff and running it.
| solarkraft wrote:
| I've used SD on my device, but I found it worth it to pay for the
| hosted version because it's much faster.
| alienreborn wrote:
| Interesting, will check it out to see how it compares with
| https://diffusionbee.com, which I have been using for the last few
| months for fun.
| janmo wrote:
| I just checked out both and Noiselith produces much, much better
| results.
| rgbrgb wrote:
| Just installed, this is very cool. Local AI is the future I want (and
| what I'm working on too). A few notes from using it...
|
| Pros:
|
| - seems pretty self-contained
|
| - built-in model installer works really well and helps you download
| anything from CivitAI (I installed
| https://civitai.com/models/183354/sdxl-ms-paint-portraits)
|
| - image generation is high quality and stable
|
| - shows intermediate steps during generation
|
| Cons:
|
| - downloads a 6.94GB SDXL model file somewhere without asking or
| showing location/size. Just figured out you can find/modify the
| location in the settings.
| - very slow on first generation as it loads the model; there's no
| record of how long generations take, but I'd guess a couple of minutes
| (M1 Max MacBook, 64GB)
|
| - multiple user feedback modules (bottom left is a very intrusive chat
| thing I'll never use + top right call for beta feedback)
|
| - not open source like competitors
|
| - runs 7 processes, idling at ~1GB RAM usage
|
| - non-native UX on macOS, missing hotkeys you'd expect and a help menu.
| Electron app?
|
| Overall 4/5 stars, would open again :)
| liuliu wrote:
| You should check out Draw Things on macOS. It works well enough for
| SDXL on 8GiB macOS devices.
| miles wrote:
| Are you the developer by any chance? If so, it would be helpful to
| state it.
| liuliu wrote:
| I am. I thought this was obvious. My statement is objective. I would go
| as far as: it is the only app that works on 8GiB macOS devices with
| SDXL-family models.
| adamjc wrote:
| How would that be obvious to anyone?
| cellularmitosis wrote:
| "You should check out this thing" has a very different implied context
| than "You should check out this thing I made". The first sounds like a
| recommendation from an enthusiastic user, not from the author. Because
| of this, discovering that you are the author makes your recommendation
| feel deceptive.
| liuliu wrote:
| I am sorry if you feel that way. I joined HN when it was a small,
| tight-knit community without much of a marketing presence. The
| "obvious" comment is more of a "people know other people" kind of
| thing. I didn't try to deceive anyone into using the app (and why would
| I?).
|
| If you feel this is unannounced self-promotion, yes, it is, and it can
| be done better.
|
| ---
|
| Also, the "objective" comment was meant to say "the original comment is
| still objective", not that you can be objective only by being a
| developer. Being a developer can obviously bias your opinion.
| TheHumanist wrote:
| What do you mean it was obvious?
| Only the developer could make that objective statement?
| ProfessorLayton wrote:
| Whoa, well let me just say thanks for the awesome app!! It's pretty
| entertaining to spin this up in situations where I don't have internet
| (airplane, subway, etc.)
|
| I was also surprised at how well it ran on my iPhone 11 before I
| replaced it with a 15 Pro.
|
| (Let me know if you're looking for some product design help/advice;
| totally happy to contribute pro bono. No worries if not, of course!)
| vunderba wrote:
| Nice app - but for future reference, it is _very_ much not obvious to
| any native English speaker. "You should check out X" sounds like a
| random recommendation.
| rgbrgb wrote:
| Thanks. Yeah, I played with your app early on and just fired it up
| again to see the progress. Frankly, I find the interface pretty
| intimidating, but it is cool that you can easily stitch generations
| together.
|
| Unsolicited UX recs:
|
| - strongly recommend a default model. The list you give is crazy long.
| The UI text below the picker kind of recommends SD 1.5, but the last
| one in the list is selected by default. Many of them are called the
| same thing (ironically, the name is "Generic", lol).
|
| - have the panel on the left closed by default, or show a simplified
| view that I can expand to an "advanced" view. Consider sorting the left
| panel controls by how often I would want to edit them (personally, I'm
| not going to touch the model, but it is the first thing).
|
| You are doing great work, but I wouldn't underestimate the value of
| simplifying the interface for a first-time user. It seems to have a ton
| of features, but I don't know what I should actually be paying
| attention to / adjusting.
|
| Is there a business model attached to this, or do you have a hypothesis
| for what one might look like?
| liuliu wrote:
| Agreed on the UX feedback. It has accumulated a lot of cruft moving
| from the old technologies to the new.
| This just echoes my early feedback that co-iterating the UI and the
| technology is difficult; you'd better pick the side you want to be on,
| and there is only one correct side (and unfortunately, the current app
| is trying hard to be on both sides).
| philote wrote:
| Another con is it only works on Silicon Macs.
| Vicinity9635 wrote:
| Apple Silicon*, I presume?
|
| This could honestly be the excuse I need (want) to order an absolute
| beast of a MacBook Pro to replace my 2013 model.
| quitit wrote:
| If it's just for hobby/interest work, then just a heads-up that even
| first-generation Apple Silicon will turn over about one image a second
| with SDXL Turbo. The M3s, of course, are quite a bit faster.
|
| The performance gains in recent models and PyTorch are currently
| outpacing hardware advances by a significant margin, and there is still
| a lot of low-hanging fruit in this regard.
| wayfinder wrote:
| If you want an absolute beast, especially for this stuff, you probably
| want Intel + Nvidia. Apple Silicon is a beast in power efficiency, but
| a top-of-the-line M3 does not come close to the top-of-the-line Intel +
| Nvidia combo.
| Vicinity9635 wrote:
| Well, this would just be the excuse. I'm typing this on a Ryzen 5950X
| w/ 32GB of RAM and a 4090. So I guess I already have the beast?
| mikae1 wrote:
| _> not open source like competitors_
|
| Who are the competitors?
| quitit wrote:
| DiffusionBee: AGPL-3.0 license (native app)
|
| InvokeAI: Apache License 2.0 (web-browser UI)
|
| automatic1111: AGPL-3.0 license (web-browser UI)
|
| ComfyUI: GPL-3.0 license (web-browser UI)
|
| There are more, but I don't pay enough attention to keep track.
| mikae1 wrote:
| Thanks! https://lmstudio.ai/ too. For the more technically inclined,
| perhaps.
| dragonwriter wrote:
| I don't think LM Studio competes with Stable Diffusion frontends, even
| for the technically inclined.
| vunderba wrote:
| I'd also recommend InvokeAI, an open-source offering which has a very
| nice editable canvas and is very performant with diffusers.
|
| https://github.com/invoke-ai/InvokeAI
| sytelus wrote:
| +1 for asking for the download location.
| amelius wrote:
| So it's free, but not open source.
|
| What is the catch?
| sib wrote:
| They will have a non-free (as in beer) version once they exit beta (per
| the website).
| tracerbulletx wrote:
| All the real homies use ComfyUI
| weakfish wrote:
| Elaborate?
| tracerbulletx wrote:
| I'm being kind of tongue-in-cheek, because I understand that this is
| about making things really easy, and ComfyUI is a node-based editor
| that most people would have trouble with. But the best UI for local SD
| generation that the community is using is
| https://github.com/comfyanonymous/ComfyUI
| rish wrote:
| Agreed. It's worth the learning curve for the sheer power it can bring
| to your workflows. I've always wanted to toy around with node-based
| architectures, and this seemed quite easy after using A1111
| extensively. The community providing ready-to-go workflows has made it
| quite enjoyable too.
| ttul wrote:
| If you are a programmer at heart, ComfyUI will feel very comfortable
| (pun intended). It's basically a visual programming environment
| optimized for the kind of compositional programming that machine
| learning models desire. The next thing this space needs is someone to
| build an API hosting every imaginable model on a vast farm of GPUs in
| the cloud. Use ComfyUI and other apps to orchestrate the models
| locally, but send data to the cloud and benefit from sharing GPU
| resources far more efficiently.
|
| If anyone has a spare thousand hours to kill, I would build that and
| connect it up with the various front-ends, including ComfyUI, A1111,
| etc. Not a small amount of effort, but it would be rewarding.
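[Editor's note: the scripted ComfyUI automation asked about earlier in the thread is already possible: a locally running ComfyUI server answers HTTP on port 8188, and a workflow exported via "Save (API Format)" can be POSTed to its /prompt endpoint. A rough stdlib-only sketch; the endpoint path, default port, and payload envelope are as I understand ComfyUI's script examples, so verify against your version:]

```python
import json
import urllib.request

def build_payload(workflow: dict, client_id: str = "my-script") -> bytes:
    """Wrap a workflow graph in the JSON envelope ComfyUI's /prompt endpoint expects."""
    return json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")

def queue_prompt(workflow: dict, host: str = "127.0.0.1:8188") -> dict:
    """Queue a workflow on a locally running ComfyUI server; returns the server's
    response (which includes a prompt_id you can poll for results)."""
    req = urllib.request.Request(
        f"http://{host}/prompt",
        data=build_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Typical use: `queue_prompt(json.load(open("workflow_api.json")))`, where the JSON file is the graph exported from the ComfyUI web UI in API format.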
| verdverm wrote:
| This is when I feel the 24GB memory limit of the MacBook/Air
| liuliu wrote:
| Again, try Draw Things; it runs well for SDXL on 8GiB macOS devices.
| verdverm wrote:
| Yeah, I know there are options. I'm more interested in language models
| than image generation anyway, so llama.cpp
| brucethemoose2 wrote:
| I would highly recommend Fooocus to anyone who hasn't tried it:
| https://github.com/lllyasviel/Fooocus
|
| There are a bajillion local SD pipelines, but this one is, _by far_,
| the one with the highest-quality output out of the box, with short
| prompts. It's remarkable.
|
| And that's because it integrates a bajillion SDXL augmentations that
| other UIs do not implement or enable by default. I've been using Stable
| Diffusion since 1.5 came out, and even having followed the space
| extensively, setting up an equivalent pipeline in ComfyUI (much less
| diffusers) would be a pain. It's like a "greatest hits and best
| defaults" for SDXL.
| liuliu wrote:
| Yeah, Fooocus is much better if you are going for the best locally
| generated result. Lvmin puts all his energy into making beautiful
| pictures. Also, it is GPL licensed, which is a plus in my book.
| rvz wrote:
| Looks like a complete contraption to set up, and it looks very
| unpleasant to use at first glance when compared against Noiselith.
|
| The hundreds of Python scripts, and requiring the user to touch the
| terminal, show why something like Noiselith should exist for normal
| users rather than developers or programmers.
|
| I would rather take a packaged solution that just works over a bunch of
| scripts requiring a terminal.
| liuliu wrote:
| You have to make trade-offs in software development. Fooocus trades for
| the best picture rather than the most beautiful interface, and also for
| simplicity of use. I think it is a good trade-off given that the
| technology is improving at breakneck speed.
|
| Look, DiffusionBee is still maintained, but still has no SDXL support.
| Anyone who bets that the technology is done and it is time to focus on
| the UI is making the wrong bet.
| rgbrgb wrote:
| This project is really cool, and I like the stated philosophy in the
| README. I think it's making the right trade-off in terms of setting
| useful defaults and not showing you 100 arcane settings. However, the
| installation is too hard. It's a student project and free, so I'm not
| criticizing the author at all, but I think it's a pretty fair and
| useful criticism of the software and likely a significant bottleneck to
| adoption.
| Tiberium wrote:
| Huh? It has a really simple interface, much, much simpler than anything
| else that uses SD/SDXL locally. Installation is also simple on
| Windows/Linux; I don't know about macOS.
| Liquix wrote:
| Installation/setup is dead simple. Up and running in under 3 minutes:
|
|     git clone https://github.com/lllyasviel/Fooocus.git
|     cd Fooocus
|     pip3 install -r requirements_versions.txt
|     python3 entry_with_update.py
| Filligree wrote:
| Let's see...
|
| > pip3: command not found
|
| Okay. I'll need to install it? What package might that be in, hmm.
| Moving on, I already know it's Python.
|
| > /usr not writeable
|
| Guess I'll use sudo...
|
| = = =
|
| Obviously I know better than to do this, but _very few people would_.
| This is not "dead simple"! It's only simple for Python programmers who
| are already familiar with the ecosystem.
|
| Now, fortunately, the actual documentation does say to use venv. That's
| still not "dead simple"; you still need to understand the commands
| involved. There's definitely space for a prepackaged binary.
| pixl97 wrote:
| The people who make software that does useful things and the people who
| understand system security live on different planets. One day they'll
| meet each other and have a religious war.
|
| This said, it's nice when developers attempt to detect the executable
| they need and warn which package is missing.
| brucethemoose2 wrote:
| There are projects that set up "fat" Python executables or portable
| installs, but the problem with PyTorch ML projects is that the download
| would be many gigabytes.
|
| Additionally, some package choices depend on hardware.
|
| In the end, a lot of the more popular projects have "one-click" scripts
| for automated installs, and there are some for Fooocus specifically,
| but the issue there is that they're not as visible as the main repo,
| and not necessarily something the dev wants to endorse.
| zirgs wrote:
| Or you can use the Stability Matrix package manager.
| brucethemoose2 wrote:
| Yeah, VoltaML is another excellent choice in Stability Matrix.
| pmarreck wrote:
| You have to build it yourself on a Mac, and we all know how "fun"
| building Python projects is.
| jessepasley wrote:
| Just spent about 10 minutes building it on a MacBook Pro M1. I came in
| with significant bias against Python projects, but getting Fooocus to
| run was very, very easy.
| pmarreck wrote:
| That's good to know!
| calamari4065 wrote:
| Is this at all usable on a CPU-only system with a ton of RAM?
| brucethemoose2 wrote:
| Not really. There is a very fast LCM model preset now, but it's still
| going to be painful.
|
| SDXL in particular isn't one of those "compute-light, bandwidth-bound"
| models like llama (or Fooocus's own mini prompt-expansion LLM, which in
| fact runs on the CPU).
|
| There is a repo focused on CPU-only SD 1.5.
| neilv wrote:
| Looks like the web UI of the self-hosted install of Fooocus sells out
| the user to Google Tag Manager.
|
| Can our entire field please realize that running this surveillance is a
| bad move, and just stop doing it?
| LorenDB wrote:
| Why do we never see AMD support in these projects?
| stuckkeys wrote:
| I think it is more a matter of why AMD does not support these projects.
| NVIDIA is involved everywhere. AMD could easily do the same. At least
| from what I have observed on the internetz.
| mg wrote:
| Would it be possible to run Stable Diffusion in the browser via WebGPU?
| skocznymroczny wrote:
| https://websd.mlc.ai/#text-to-image-generation-demo
| stuckkeys wrote:
| Installed it. Ran it. Generated. Slow for some reason. Deleted it.
| Looks similar to Pinokio, and that is open source.
| dreadlordbone wrote:
| After installation, it wouldn't run on my Windows machine unless I
| granted public and private network access. Kinda tripped me up since it
| says "offline".
| kemotep wrote:
| If you disconnected completely from the internet, did it still run?
|
| It is completely wrong to advertise it as "offline" if it requires an
| active internet connection to run.
| tredre3 wrote:
| I had a similar experience.
|
| On the first run it downloads about 30GB of data. I don't know whether
| it would work offline on subsequent runs, because for me it never ran
| again without crashing!
|
| Also, upon uninstallation it left behind all its data (not user data,
| mind you, but the executable itself, its Python venv, its updater, and
| all the models; uninstall basically just removed the shortcut in the
| Start menu).
| m3kw9 wrote:
| Does not work at all; it needs you to go and find a "model". Just
| download it for me, man.
| solarkraft wrote:
| I find it interesting that it requires 16GB of RAM on Windows but 32GB
| on a Mac. Unfortunately, that leaves me out...
| mthoms wrote:
| I think that's probably because RAM on a Mac is shared with the GPU. On
| Windows, you need 16GB RAM _plus_ 8GB on the GPU.
| stared wrote:
| I keep getting "Failed to generate. Please try again" 10 seconds after
| the model loads. It is hardly helpful, as trying again gives the same
| error.
|
| Apple Silicon M1, 32GB RAM, in any case.
| stets wrote:
| Definitely exciting to see more local clients come out. As mentioned in
| other comments, there are some great ones out already. I've used
| automatic1111, which is quick and doesn't require a ton of tuning.
| But it still has lots of knobs and options, which makes it difficult
| initially. Fooocus is super quick but of course offers less
| customization.
|
| Then there's ComfyUI, the holy grail of complicated, but with that
| complication comes the ability to do so much. It is a node-based app
| that allows you to create custom workflows. Once your image is
| generated, you can pipe that "node" somewhere else and modify it, e.g.
| upscale the image or do other things.
|
| I'd like to see if Noiselith or some others offer support for SDXL
| Turbo -- it came out only a few days ago, but in my opinion it is a
| complete game-changer. It can generate 512x512 images in ~half a second
| on consumer GPUs. The images aren't crazy quality, but the ability to
| make a prompt like "fox in the woods", see it instantly, and then add
| "wearing a hat" and see it instantly generate again is so valuable.
| Prior to that, I'd wait 12 seconds for an image. Sounds like not a big
| deal, but being able to iterate so quickly makes local image generation
| so much more fun.
| evanjrowley wrote:
| How's support for AMD GPUs? I only saw Nvidia listed.
| skocznymroczny wrote:
| The main issue with AMD is that to get reasonable performance you need
| to use ROCm, and ROCm is only available on Linux. They started porting
| parts of ROCm to Windows, but it's not enough to be usable yet; that
| might be different in a few months.
| seydor wrote:
| But what's gonna happen to all those AI valuations if we all go
| offline?
| kleiba wrote:
| Sales prompt: "Young woman with blonde curls in front of a fantasy
| world background, come hither eyes, sitting with her legs spread,
| wearing a white shirt and jeans hot pants."
|
| I mean, really??
| smcleod wrote:
| Yeah, that's creepy as.
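[Editor's note: the rapid-iteration loop described above is mostly a matter of sampler settings: SDXL Turbo is distilled to run with a single denoising step and classifier-free guidance disabled. A minimal sketch using Hugging Face diffusers, assuming the `diffusers` and `torch` packages, a CUDA GPU, and the `stabilityai/sdxl-turbo` checkpoint are available; the heavy imports are kept inside the function so the settings helper loads without them:]

```python
def turbo_settings(width: int = 512, height: int = 512) -> dict:
    """SDXL Turbo sampler settings: one denoising step, guidance off."""
    return {
        "num_inference_steps": 1,  # Turbo is distilled for single-step sampling
        "guidance_scale": 0.0,     # classifier-free guidance must be disabled
        "width": width,
        "height": height,
    }

def generate(prompt: str, out_path: str = "out.png") -> None:
    """Generate one image with SDXL Turbo via Hugging Face diffusers."""
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
    ).to("cuda")
    image = pipe(prompt=prompt, **turbo_settings()).images[0]
    image.save(out_path)

# Iterate by tweaking the prompt and rerunning, e.g.:
# generate("fox in the woods")
# generate("fox in the woods wearing a hat")
```

In practice you would load the pipeline once and reuse it across prompts; it is reconstructed per call here only to keep the sketch self-contained.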
| momojo wrote:
| I'm genuinely curious how many people in the open source community are
| pouring their sweat and blood into these projects that are, at the end
| of the day, enabling guys to transform their MacBooks into
| insta-porn-books.
| rcoveson wrote:
| If the prompt weren't somewhat sexual, divisive, or offensive, it would
| be wide open to the chorus of "still not as good as
| midjourney/dall-e/imagen". Freedom from restriction is one of the main
| selling points.
| KolmogorovComp wrote:
| Glad I'm not the only one who found it inappropriate. Feels very much
| like a dog whistle.
| rcoveson wrote:
| What's subtle about it? In the dog-whistle analogy, who are they who
| cannot hear the whistle?
|
| To me this is more like yelling "ROVER! COME HERE, BOY!" at the top of
| your lungs.
| NKosmatos wrote:
| As others have stated, local AI (completely offline after the
| model/weight download) is the way to go. If I have the hardware, why
| shouldn't I be able to run all this fancy software on my own machine?
|
| There are many great suggestions and links to other similar/better
| packages, so follow the comments for more info, thanks :-)
___________________________________________________________________
(page generated 2023-12-01 23:00 UTC)