[HN Gopher] PyTorch 2.0 Release
___________________________________________________________________
 
PyTorch 2.0 Release
 
Author : DreamFlasher
Score  : 130 points
Date   : 2023-03-15 20:57 UTC (2 hours ago)
 
(HTM) web link (pytorch.org)
(TXT) w3m dump (pytorch.org)
 
| [deleted]
|
| mdaniel wrote:
| discussion from (presumably) the PyTorch Conference announcement:
| https://news.ycombinator.com/item?id=33832511
|
| brucethemoose2 wrote:
| I'm hoping torch.compile is a gateway to "easy" non-Nvidia
| accelerator support in PyTorch.
|
| Also, I have been using torch.compile for the Stable Diffusion
| unet/vae since February, to good effect. I'm guessing similar
| optimizations will pop up for LLaMA.
|
| datadeft wrote:
| Could you give a bit more detail about this? Do you have a
| link?
|
| brucethemoose2 wrote:
| See the reply above ^
|
| voz_ wrote:
| Is there somewhere I can see your Stable Diffusion +
| torch.compile code? I am interested in how you integrated it.
|
| brucethemoose2 wrote:
| In `diffusers` implementations (like InvokeAI) it's pretty easy:
| https://github.com/huggingface/diffusers/blob/42beaf1d23b5cc...
|
| But I also compile the VAE and some other modules; I will reply
| again later when I can look at my local code. Some modules (like
| face restoration or the scheduler) still don't like
| torch.compile.
|
| For the Automatic1111 repo (and presumably other original
| Stability AI implementations), I just add `m.model =
| torch.compile(m.model)` here:
| https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob...
|
| I tried changing the options in the config dict one by one, but
| TBH nothing seems to make a significant difference beyond the
| default settings in benchmarks.
|
| I haven't messed with compiling LoRA training yet, as I don't
| train much and it is sufficiently fast, but I'm sure it could
| be done.
|
| brucethemoose2 wrote:
| Here is the InvokeAI code, minus the codeformer/gfpgan changes
| that don't work yet:
|
| https://gist.github.com/brucethemoose/ea64f498b0aa51adcc88f5...
|
| I intend to start some issues for this on the repo soon(TM).
|
| fpgaminer wrote:
| The thing I'm looking forward to most is having Flash Attention
| built in. Right now you have to use xformers or similar, but
| that dependency has been a nightmare to use: it breaks, it
| requires specific concoctions of dependencies or else conda will
| barf, and it is impossible to pin because I have to use -dev
| releases, which they constantly drop from the repositories.
|
| PyTorch 2.0 comes with a few different efficient transformer
| implementations built in. And unlike 1.13, they work during
| training and don't require specific configurations. They seemed
| to work just fine during my pre-release testing. Also, having
| this built into PyTorch might mean more pressure to keep it
| optimized. As-is, xformers targets the A100 primarily, with
| other archs as an afterthought.
|
| And, as promised, `torch.compile` worked out of the box,
| providing IIRC a nice ~20% speedup on a ViT without any other
| tuning.
|
| I did have to do some dependency fiddling on the pre-release
| version. I've been looking forward to the "stable" release
| before using it more extensively.
|
| Anyone else seeing nice boosts from `torch.compile`?
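For readers who want to try the two features discussed above, a
minimal sketch, assuming a PyTorch 2.0 install; `TinyAttention`
and the tensor shapes are made up for illustration:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyAttention(nn.Module):
        """Toy block that exercises the built-in fused attention."""
        def forward(self, q, k, v):
            # PyTorch 2.0's scaled_dot_product_attention dispatches to
            # a FlashAttention-style fused kernel when dtype, shape,
            # and hardware allow it -- no xformers dependency needed.
            return F.scaled_dot_product_attention(q, k, v)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # (batch, heads, sequence length, head dim) -- arbitrary sizes
    q = k = v = torch.randn(2, 8, 128, 64, device=device)

    # torch.compile is a one-line opt-in: the first call triggers
    # compilation, and subsequent calls reuse the generated kernels.
    model = torch.compile(TinyAttention().to(device))
    out = model(q, k, v)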
| mardifoufs wrote:
| > Python 3.11 support on Anaconda Platform
| >
| > Due to lack of Python 3.11 support for packages that PyTorch
| > depends on, including NumPy, SciPy, SymPy, Pillow and others
| > on the Anaconda platform, we will not be releasing Conda
| > binaries compiled with Python 3.11 for PyTorch Release 2.0.
| > The Pip packages with Python 3.11 support will be released;
| > hence if you intend to use PyTorch 2.0 with Python 3.11,
| > please use our Pip packages.
|
| It really sucks that anaconda always lags behind. I know the
| reasoning*, and I know it makes sense for what a lot of teams
| use it for... but on our side we are now looking more and more
| into dropping it, since we are more of an R&D team. We already
| use containers for most of our pipelines, so just using pip
| might be viable.
|
| * Though I guess Anaconda bit off more than it can chew w.r.t.
| managing an entire Python universe and keeping it up to date.
| Conda-forge is already almost a requirement, but using the
| official package (with pip, in this case) has its own benefits
| for very complex packages like pytorch.
|
| DreamFlasher wrote:
| Afaik NumPy, SciPy, SymPy and Pillow are not managed/owned by
| Anaconda? At least here: https://numpy.org/about/ Anaconda
| isn't mentioned.
|
| DreamFlasher wrote:
| Ah, yeah, they do have a Python 3.11 release, just not on
| anaconda. Okay, for a couple of years now there hasn't been a
| good reason to use anaconda anyway.
|
| mardifoufs wrote:
| Yes, that's the issue! Most of the software is already ready,
| usable, and just works... unless you use anaconda. Now that I
| think about it, is there some technical reason for that? I
| always thought it was mostly about stability, but I can't
| imagine python 3.11 being so unstable as to warrant waiting a
| whole year before even porting.
|
| brucethemoose2 wrote:
| The Arch Linux PyTorch 2.0 packages are great if you are
| looking for "cutting edge," as they are compiled against CUDA
| 12.1 now, instead of 11.8 like the official nightly releases.
| You can also get AVX2-patched, optimized CPython packages
| through CachyOS or ALHP.
|
| But even Arch is still stuck on Python 3.10.
|
| simonw wrote:
| "the MPS backend" - that's the thing that lets Torch run
| accelerated on M1/M2 Macs!
|
| datadeft wrote:
| Yes, though I am not sure to what extent MPS is a viable
| alternative to CUDA. You seem to write a lot about ML models.
| Do you have a detailed write-up on this subject?
|
| sebzim4500 wrote:
| Based on George Hotz's testing it is very broken. It's possible
| it has improved since then, I guess, but he streamed this a few
| weeks ago.
|
| dagmx wrote:
| It supports a subset of the operators (as mentioned in the
| release notes). I don't think it's broken for the ones that it
| does support, though.
|
| mochomocha wrote:
| That's been my experience. However, when fallback to CPU
| happens, it sometimes ends up making a specific graph execution
| slower. But that's explicitly mentioned by the warning and
| pretty much expected.
|
| brucethemoose2 wrote:
| You'd think it would fall back to GPU/CPU for unsupported
| operations instead of failing, but I guess that's easier said
| than done.
|
| norgie wrote:
| Yes, this is my experience. Many off-the-shelf models still
| don't work, but several of my own models work great as long as
| they don't use unsupported operators.
|
| glial wrote:
| Where can I find a list of the supported operators?
|
| bigbillheck wrote:
| Based on George Hotz's performance at Twitter I wouldn't bet he
| wasn't holding it wrong.
|
| jeron wrote:
| so, you would bet he was holding it wrong?
|
| danieldk wrote:
| We tested inference for all spaCy transformer models and they
| work:
|
| https://explosion.ai/blog/metal-performance-shaders
|
| It depends very much on the ops that your model is using.
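A minimal sketch of opting into the MPS backend discussed above,
assuming PyTorch 1.12+ on an Apple Silicon Mac; the linear layer
is a placeholder for a real model:

    import torch
    import torch.nn as nn

    # Use the Metal (MPS) backend when available, else fall back
    # to CPU.
    device = torch.device(
        "mps" if torch.backends.mps.is_available() else "cpu"
    )

    model = nn.Linear(64, 10).to(device)  # placeholder model
    x = torch.randn(8, 64, device=device)
    y = model(x)

    # Ops not yet implemented on MPS raise an error by default;
    # launching with PYTORCH_ENABLE_MPS_FALLBACK=1 makes PyTorch
    # run those ops on the CPU instead, with the per-op slowdown
    # mentioned in the thread.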
___________________________________________________________________
(page generated 2023-03-15 23:00 UTC)