[HN Gopher] PyTorch 2.0 Release
___________________________________________________________________
 
PyTorch 2.0 Release
 
Author : DreamFlasher
Score  : 130 points
Date   : 2023-03-15 20:57 UTC (2 hours ago)
 
(HTM) web link (pytorch.org)
(TXT) w3m dump (pytorch.org)
 
| [deleted]
|
| mdaniel wrote:
| discussion from (presumably) the PyTorch Conference announcement:
| https://news.ycombinator.com/item?id=33832511
|
| brucethemoose2 wrote:
| I'm hoping torch.compile is a gateway to "easy" non-Nvidia
| accelerator support in PyTorch.
|
| Also, I have been using torch.compile for the Stable Diffusion
| unet/vae since February, to good effect. I'm guessing similar
| optimizations will pop up for LLaMA.
|
| datadeft wrote:
| Could you give a bit more detail about this? Do you have a
| link?
|
| brucethemoose2 wrote:
| See the reply above ^
|
| voz_ wrote:
| Is there somewhere I can see your Stable Diffusion +
| torch.compile code? I am interested in how you integrated it.
|
| brucethemoose2 wrote:
| In `diffusers` implementations (like InvokeAI) it's pretty easy:
| https://github.com/huggingface/diffusers/blob/42beaf1d23b5cc...
|
| But I also compile the VAE and some other modules; I will reply
| again later when I can look at my local code. Some modules (like
| face restoration or the scheduler) still don't like
| torch.compile.
|
| For the Automatic1111 repo (and presumably other original
| Stability AI implementations), I just add `m.model =
| torch.compile(m.model)` here:
| https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob...
|
| I tried changing the options in the config dict one by one, but
| TBH nothing seems to make a significant difference beyond the
| default settings in benchmarks.
|
| I haven't messed with compiling LoRA training yet, as I don't
| train much and it is sufficiently fast, but I'm sure it could
| be done.
|
| brucethemoose2 wrote:
| Here is the InvokeAI code, minus the codeformer/gfpgan changes
| that don't work yet:
|
| https://gist.github.com/brucethemoose/ea64f498b0aa51adcc88f5...
|
| I intend to start some issues for this on the repo soon(TM).
|
| fpgaminer wrote:
| The thing I'm looking forward to most is having Flash Attention
| built in. Right now you have to use xformers or similar, but
| that dependency has been a nightmare to use: it breaks, it
| requires specific concoctions of dependencies or else conda will
| barf, and it is impossible to pin because I have to use -dev
| releases, which they constantly drop from the repositories.
|
| PyTorch 2.0 comes with a few different efficient transformer
| implementations built in. And unlike 1.13, they work during
| training and don't require specific configurations. They seemed
| to work just fine during my pre-release testing. Also, having
| this built into PyTorch might mean more pressure to keep it
| optimized. As-is, xformers targets the A100 primarily, with
| other archs as an afterthought.
|
| And, as promised, `torch.compile` worked out of the box,
| providing IIRC a nice ~20% speedup on a ViT without any other
| tuning.
|
| I did have to do some dependency fiddling on the pre-release
| version. I've been looking forward to the "stable" release
| before using it more extensively.
|
| Anyone else seeing nice boosts from `torch.compile`?
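For readers who want to try the two features discussed above, a
minimal sketch, assuming a PyTorch 2.0 install; `TinyAttention`
and the tensor shapes are made up for illustration:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyAttention(nn.Module):
        """Toy block that exercises the built-in fused attention."""
        def forward(self, q, k, v):
            # PyTorch 2.0's scaled_dot_product_attention dispatches to
            # a FlashAttention-style fused kernel when dtype, shape,
            # and hardware allow it -- no xformers dependency needed.
            return F.scaled_dot_product_attention(q, k, v)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # (batch, heads, sequence length, head dim) -- arbitrary sizes
    q = k = v = torch.randn(2, 8, 128, 64, device=device)

    # torch.compile is a one-line opt-in: the first call triggers
    # compilation, and subsequent calls reuse the generated kernels.
    model = torch.compile(TinyAttention().to(device))
    out = model(q, k, v)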
| mardifoufs wrote:
| > Python 3.11 support on Anaconda Platform
| >
| > Due to lack of Python 3.11 support for packages that PyTorch
| > depends on, including NumPy, SciPy, SymPy, Pillow and others
| > on the Anaconda platform, we will not be releasing Conda
| > binaries compiled with Python 3.11 for PyTorch Release 2.0.
| > The Pip packages with Python 3.11 support will be released;
| > hence if you intend to use PyTorch 2.0 with Python 3.11,
| > please use our Pip packages.
|
| It really sucks that anaconda always lags behind. I know the
| reasoning*, and I know it makes sense for what a lot of teams
| use it for... but on our side we are now looking more and more
| into dropping it, since we are more of an R&D team. We already
| use containers for most of our pipelines, so just using pip
| might be viable.
|
| * Though I guess Anaconda bit off more than it can chew w.r.t.
| managing an entire Python universe and keeping it up to date.
| Conda-forge is already almost a requirement, but using the
| official package (with pip, in this case) has its own benefits
| for very complex packages like pytorch.
|
| DreamFlasher wrote:
| Afaik NumPy, SciPy, SymPy and Pillow are not managed/owned by
| Anaconda? At least here: https://numpy.org/about/ Anaconda
| isn't mentioned.
|
| DreamFlasher wrote:
| Ah, yeah, they do have a Python 3.11 release, just not on
| anaconda. Okay, for a couple of years now there hasn't been a
| good reason to use anaconda anyway.
|
| mardifoufs wrote:
| Yes, that's the issue! Most of the software is already ready,
| usable, and just works... unless you use anaconda. Now that I
| think about it, is there some technical reason for that? I
| always thought it was mostly about stability, but I can't
| imagine python 3.11 being so unstable as to warrant waiting a
| whole year before even porting.
|
| brucethemoose2 wrote:
| The Arch Linux PyTorch 2.0 packages are great if you are
| looking for "cutting edge," as they are compiled against CUDA
| 12.1 now, instead of 11.8 like the official nightly releases.
| You can also get AVX2-patched, optimized CPython packages
| through CachyOS or ALHP.
|
| But even Arch is still stuck on Python 3.10.
|
| simonw wrote:
| "the MPS backend" - that's the thing that lets Torch run
| accelerated on M1/M2 Macs!
|
| datadeft wrote:
| Yes, though I am not sure to what extent MPS is a viable
| alternative to CUDA. You seem to write a lot about ML models.
| Do you have a detailed write-up on this subject?
|
| sebzim4500 wrote:
| Based on George Hotz's testing it is very broken. It's possible
| it has improved since then, I guess, but he streamed this a few
| weeks ago.
|
| dagmx wrote:
| It supports a subset of the operators (as mentioned in the
| release notes). I don't think it's broken for the ones that it
| does support, though.
|
| mochomocha wrote:
| That's been my experience. However, when fallback to CPU
| happens, it sometimes ends up making a specific graph execution
| slower. But that's explicitly mentioned by the warning and
| pretty much expected.
|
| brucethemoose2 wrote:
| You'd think it would fall back to GPU/CPU for unsupported
| operations instead of failing, but I guess that's easier said
| than done.
|
| norgie wrote:
| Yes, this is my experience. Many off-the-shelf models still
| don't work, but several of my own models work great as long as
| they don't use unsupported operators.
|
| glial wrote:
| Where can I find a list of the supported operators?
|
| bigbillheck wrote:
| Based on George Hotz's performance at Twitter I wouldn't bet he
| wasn't holding it wrong.
|
| jeron wrote:
| so, you would bet he was holding it wrong?
|
| danieldk wrote:
| We tested inference for all spaCy transformer models and they
| work:
|
| https://explosion.ai/blog/metal-performance-shaders
|
| It depends very much on the ops that your model is using.
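A minimal sketch of opting into the MPS backend discussed above,
assuming PyTorch 1.12+ on an Apple Silicon Mac; the linear layer
is a placeholder for a real model:

    import torch
    import torch.nn as nn

    # Use the Metal (MPS) backend when available, else fall back
    # to CPU.
    device = torch.device(
        "mps" if torch.backends.mps.is_available() else "cpu"
    )

    model = nn.Linear(64, 10).to(device)  # placeholder model
    x = torch.randn(8, 64, device=device)
    y = model(x)

    # Ops not yet implemented on MPS raise an error by default;
    # launching with PYTORCH_ENABLE_MPS_FALLBACK=1 makes PyTorch
    # run those ops on the CPU instead, with the per-op slowdown
    # mentioned in the thread.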
___________________________________________________________________
(page generated 2023-03-15 23:00 UTC)