[HN Gopher] Run Stable Diffusion on Intel CPUs
___________________________________________________________________

Run Stable Diffusion on Intel CPUs

Author : amrrs
Score  : 88 points
Date   : 2022-08-29 19:13 UTC (3 hours ago)

(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| polskibus wrote:
| Can't get it to install requirements on Windows with Python 3.10
| and MS Build Tools 2022. Any tips?

  | smoldesu wrote:
  | I found a pretty good Docker container for it, though that's
  | only really switching you from solving Python problems to
  | Docker ones. Worth trying out if you have a Linux box or WSL
  | installed though:
  | https://github.com/AbdBarho/stable-diffusion-webui-docker

  | desindol wrote:
  | It needs Python 3.9.

| amrrs wrote:
| On reddit I found some older GPUs take about 5 mins, and here
| this video[1] says 5 mins for CPU using this OpenVino library.
| Not sure if OpenVino makes CPU chips compete with GPUs. Has
| anyone heard of OpenVino before?
| 
| 1. https://youtu.be/5iXhhf7ILME

  | minimaxir wrote:
  | OpenVINO is developed by Intel themselves, and is one of many
  | methods to freeze models to make CPU inference possible and
  | performant.
  | 
  | https://en.wikipedia.org/wiki/OpenVINO

  | T-A wrote:
  | https://github.com/openvinotoolkit/openvino#supported-hardwa...

| torotonnato wrote:
| 7' 12" on an ancient Intel Core i5-3350P CPU @ 3.10GHz (!) using
| BERT BasicTokenizer, default arguments

| [deleted]

| aaaaaaaaaaab wrote:
| Love this. OpenAI are _livid_. :^)

  | enchiridion wrote:
  | Why?

| yieldcrv wrote:
| Where can I get up to speed on what's coming down the pipeline
| in this AI/ML image-making scene?
| 
| (And learn the agreed-upon terms)

  | aaaaaaaaaaab wrote:
  | No one can tell.
  | 
  | Pandora's box has been opened.
  | 
  | Nothing is true, everything is permitted.

| yayr wrote:
| then - how far away are we from having it on M1/M2 Macs, at
| least with regular processing? openvino may be one path I
| suppose: https://github.com/openvinotoolkit/openvino/issues/11554

  | homarp wrote:
  | PyTorch for M1
  | (https://pytorch.org/blog/introducing-accelerated-pytorch-tra...)
  | will not work:
  | https://github.com/CompVis/stable-diffusion/issues/25 says
  | "StableDiffusion is CPU-only on M1 Macs because not all the
  | pytorch ops are implemented for Metal. Generating one image
  | with 50 steps takes 4-5 minutes."

    | andybak wrote:
    | By comparison I can generate 512x512 images every 15 seconds
    | on an RTX 3080 (although there's an initial 30 second setup
    | penalty for each run)

    | yayr wrote:
    | those guys are also working on it atm :-)
    | https://github.com/lstein/stable-diffusion/pull/179

  | yayr wrote:
  | looks like there is an easier path using Metal shaders:
  | https://dev.to/craigmorten/setting-up-stable-diffusion-for-m...
  | 
  | and https://github.com/magnusviri/stable-diffusion/tree/apple-si...

    | zmmmmm wrote:
    | this worked fine for me, and running side by side with Intel
    | CPU + nVidia 2070 it actually does not take much longer (and
    | as a sibling said, seems to be working at full precision).
    | It is one of the first things I've done that has properly
    | made my M1 Max's fan spin up hard though!

    | garblegarble wrote:
    | I've been using this on my M1 Max and it works pretty well,
    | 1.65 iterations per second (full precision, whereas my PC's
    | 3080 can only do half-precision due to limited memory)... a
    | 50-iteration image in about 40 seconds or so.

      | MattRix wrote:
      | Your 3080 should be able to do full precision.
      | Are you sure you don't have the batch size set greater
      | than 1, or another issue along those lines?

        | garblegarble wrote:
        | Thank you and smoldesu for letting me know it should
        | work, I'll have a better look into what's going on - it
        | didn't immediately work on Windows in full precision
        | (probably a batch size issue as you suggested) and I
        | gave up...
        | 
        | I shouldn't have given up so easily, but my tolerance
        | for annoyances on Windows is pretty low (that Windows
        | machine is kept for gaming, the last time I used a
        | Windows machine for anything but launching Steam was
        | when Windows 2000 was the hot new thing...)

      | smoldesu wrote:
      | > full precision, whereas my PC's 3080 can only do half-
      | precision due to limited memory
      | 
      | What model are you using? I've been running full-precision
      | SD1.4 on my 3070, albeit with less than 10% VRAM headroom.

  | pmalynin wrote:
  | I got it working in about an hour on an M1 Ultra, mostly
  | compiling things and having to tweak some model code to be
  | compatible with Metal. It works pretty well, about 1/10 to
  | 1/20 of the performance I can get on a 3080.

| motoboi wrote:
| OpenVINO is an unsung hero.

| ByThyGrace wrote:
| What's the status of running SD on AMD GPUs?

  | homarp wrote:
  | https://rentry.org/tqizb explains how to install ROCm and then
  | PyTorch for ROCm.
  | 
  | ROCm does not support APUs; here is the list of supported GPUs:
  | https://docs.amd.com/bundle/Hardware_and_Software_Reference_...

    | synergy20 wrote:
    | what does APU mean here?

      | ace2358 wrote:
      | CPU + GPU on the same die/chiplet. "APU" is AMD marketing
      | speak for an integrated GPU.

      | barkingcat wrote:
      | AMD's integrated GPU together with the processor.

| mysterydip wrote:
| I didn't see any requirements on the page beyond a CPU on that
| list. Do you need a certain amount of RAM? Will more RAM speed
| things up to a degree?
___________________________________________________________________
(page generated 2022-08-29 23:00 UTC)
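As a rough illustration of the OpenVINO route discussed in the thread,
the pattern below shows CPU inference with the openvino.runtime Python
API (OpenVINO 2022.x). The "unet.xml" file name and the input shape are
placeholders rather than the actual pipeline from the linked repository,
which chains a text encoder, a UNet and a VAE decoder inside a diffusion
scheduler loop.

    # Minimal OpenVINO CPU-inference sketch (assumes the OpenVINO 2022.x
    # openvino.runtime API). "unet.xml" is a hypothetical IR file and the
    # input shape is illustrative only.
    import numpy as np
    from openvino.runtime import Core

    core = Core()                                # enumerate plugins (CPU, GPU, ...)
    model = core.read_model("unet.xml")          # load a model converted to OpenVINO IR
    compiled = core.compile_model(model, "CPU")  # compile the graph for the host CPU

    request = compiled.create_infer_request()
    latents = np.random.randn(1, 4, 64, 64).astype(np.float32)  # dummy input tensor
    request.infer([latents])                     # one forward pass on the CPU
    output = request.get_output_tensor(0).data   # numpy view of the first output
    print(output.shape)

The conversion to IR and the compile step are what let the runtime fuse
and quantize the graph for the local CPU, which is where the speedups
over a plain eager-mode PyTorch run come from.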
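On the full- versus half-precision question in the thread: with the
Hugging Face diffusers pipeline (one common way to run Stable Diffusion,
not necessarily what the commenters above used), precision is chosen
when the weights are loaded, and the batch size is simply the number of
prompts passed per call. A minimal sketch, assuming the late-2022
diffusers API and an accepted model license:

    # Sketch of loading Stable Diffusion in full vs. half precision with
    # Hugging Face diffusers (late-2022 API; the gated CompVis weights
    # require accepting the model license / `huggingface-cli login`).
    import torch
    from diffusers import StableDiffusionPipeline

    model_id = "CompVis/stable-diffusion-v1-4"
    use_half = torch.cuda.is_available()  # fp16 halves VRAM on GPU; stick to fp32 on CPU

    if use_half:
        pipe = StableDiffusionPipeline.from_pretrained(
            model_id, torch_dtype=torch.float16, revision="fp16"
        )
    else:
        pipe = StableDiffusionPipeline.from_pretrained(model_id)  # full precision (fp32)

    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

    # Batch size is implicit in the prompt list: one prompt = a batch of 1,
    # which is what keeps full precision within reach of a ~10 GB card.
    images = pipe(["a photo of an astronaut riding a horse"],
                  num_inference_steps=50).images
    images[0].save("out.png")

Whether fp32 fits on a 10 GB card also depends on resolution; at 512x512
with a batch of 1 it is generally workable, which matches the batch-size
suggestion made in the thread.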