[HN Gopher] Growing open source from Torch to PyTorch
       ___________________________________________________________________
        
       Growing open source from Torch to PyTorch
        
       Author : plinkplonk
       Score  : 35 points
       Date   : 2021-08-02 20:23 UTC (2 days ago)
        
 (HTM) web link (soumith.ch)
 (TXT) w3m dump (soumith.ch)
        
       | posharma wrote:
        | PyTorch is amazing. The article was a good read, although I'm
        | confused: how can an ML framework not be obsessed with
        | speed/performance?
        
         | smhx wrote:
         | Author here. Being conscious about speed and performance is
         | different from making that your competitive advantage or USP.
         | 
         | Our main focus is usability, and one of our secondary focuses
         | is to not look like clowns in the performance department.
         | 
          | So we tend to make decisions that trade performance for
          | usability rather than the other way around.
        
           | mirker wrote:
           | Thanks for the post.
           | 
            | One question: one of the advantages of having a clean
            | design is that performance is easier to optimize, since the
            | 80/20 rule of performance becomes much more obvious. How
           | true was this in your experience? Were there any major
           | performance-related design changes or was performance
           | optimization a matter of tuning a few selected functions?
        
           | ampdepolymerase wrote:
            | You are doing a good job balancing the two. Julia's Flux
            | did the opposite, and it has severe performance problems
            | compared to PyTorch, despite being more usable and easier
            | to install.
           | 
           | Installing PyTorch with Poetry is next to impossible. Flux
           | got this right by bundling the GPU drivers. Their
           | installation is also standardized and does not require the
           | weird pip -f flag for CPU only installations.
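            | 
            | For context, a CPU-only PyTorch install at the time of this
            | thread looked roughly like the following; the exact version
            | pins are illustrative:

```shell
# The extra -f flag points pip at PyTorch's own wheel index, which is
# the part that dependency managers like Poetry struggle to model.
pip install torch==1.9.0+cpu torchvision==0.10.0+cpu \
    -f https://download.pytorch.org/whl/torch_stable.html
```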
        
             | amkkma wrote:
             | >it has severe performance problems
             | 
              | It had. It's now at around parity with PyTorch.
              | 
              | And no, it wasn't about a usability tradeoff.
              | 
              | It was about being more general: a more general compiler,
              | more general code, more composable code.
              | 
              | Since then, the team has been optimizing that and adding
              | compiler optimizations to the language that benefit all
              | code. ML-style code stresses the compiler in a particular
              | way; PyTorch handles the ML array-heavy case specially.
             | 
              | Julia will be doing the same, but it's laying the
              | groundwork for domain-specific optimizations to be done
              | in package and user space. A different sort of
              | philosophy.
              | 
              | It was about being more ambitious: laying the groundwork
              | for a more powerful tool in general, at some short-term
              | cost.
              | 
              | They could have just written a framework that baked in
              | fp32/fp16/fp64 with CUDA kernels, tracing, and operator-
              | overloading computational graphs, and gotten more speedup
              | over PyTorch (in fact, Avalon.jl takes that approach),
              | with better usability.
             | 
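              | The operator-overloading graph style mentioned above can
              | be sketched in a few lines of plain Python (an
              | illustrative toy, not any framework's actual code):

```python
# Toy reverse-mode autodiff via operator overloading: each op records
# its parents and local derivatives, building the computational graph
# implicitly as ordinary Python code runs (illustrative sketch only).
class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self.parents = parents  # pairs of (parent Var, local derivative)

    def __add__(self, other):
        return Var(self.value + other.value,
                   parents=((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   parents=((self, other.value), (other, self.value)))

    def backward(self, seed=1.0):
        # Accumulate chain-rule contributions along every path.
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x = Var(3.0)
y = Var(4.0)
z = x * y + x      # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # prints 5.0 3.0
```

              | Each arithmetic operation records its inputs and local
              | gradients, and backward() replays the chain rule in
              | reverse, which is the core trick behind eager-mode
              | autodiff.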
              | But they didn't, and now there's a burgeoning ecosystem
              | that does things no other framework can. The marginal
              | benefit for current vanilla ML is smaller, because that
              | space is stuck in a local optimum, but I think that is
              | going to change: https://www.stochasticlifestyle.com/useful-
              | algorithms-that-a...
             | 
              | In the meantime, places like MIT, Moderna, and NASA are
              | reaping the benefits.
        
               | jsinai wrote:
               | > In the meantime, places like MIT, moderna, NASA etc are
               | reaping the benefits.
               | 
                | Can you elaborate? MIT is well known, but it would be
                | interesting to know how Moderna and NASA are using Flux.
        
               | amkkma wrote:
               | Sure!
               | 
               | NASA: https://www.youtube.com/watch?v=tQpqsmwlfY0
               | 
               | Moderna: https://pumas.ai/
               | https://discourse.julialang.org/t/has-moderna-used-pumas-
               | ai-...
               | 
                | There are many, many more. These unique and sought-after
                | capabilities are what got Julia Computing its $24M
                | Series A (https://twitter.com/Viral_B_Shah/status/1417128416206376960)
        
               | amkkma wrote:
                | Some specific steps that will push it past JAX/PyTorch
                | for chunky, array-heavy GPU code (it can already beat
                | or match OpenBLAS/MKL for kernels written in scalar
                | form):
               | 
               | 1. Better compile time memory management
               | (https://github.com/aviatesk/EscapeAnalysis.jl)
               | 
                | 2. Linear-algebra passes built on the generic,
                | composable compiler ecosystem:
                | https://youtu.be/IlFVwabDh6Q?t=818
                | 
                | 3. Metatheory.jl e-graph-based symbolic optimization
                | interleaved with the abstract interpreter:
                | https://github.com/0x0f0f0f/Metatheory.jl
               | 
                | 4. Partial evaluation via mixed concrete and abstract
                | interpretation
                | 
                | 5. Compiler-based auto-parallelism with Dagger.jl
               | 
                | 6. New compiler-integrated AD (as a package) that isn't
                | based on an accidental lispy compiler hack like Zygote:
                | https://github.com/JuliaDiff/Diffractor.jl
                | 
                | 7. Changes to array semantics that will include generic
                | immutability/ownership concepts.
               | 
                | And many more. The key is that all the initial
                | groundwork that traded specific speed for fundamental
                | flexibility will then feed back into making the ML use
                | case faster than if it had focused on that initially.
                | People can do all kinds of crazy yet composable things,
                | in pure Julia, without modifying the base compiler.
               | 
                | Bonus: being able to modify the type lattice to track
                | custom program properties. This means you aren't stuck
                | with the global tradeoffs of a static type system and
                | can do things like opt in to tracking array shapes at
                | compile time, per module:
                | https://twitter.com/KenoFischer/status/1407810981338796035
                | Other packages, such as those for quantum computing,
                | are planning to do their own analyses. It's generic,
                | and the use cases and compositions aren't frozen at the
                | outset (unlike, for example, the Swift 'tensors fitting
                | perfectly' proposal).
        
             | smhx wrote:
             | We ship everything needed for userland -- including parts
             | of CUDA/CuBLAS and CuDNN that we need (which is why our
             | binaries are so fat).
             | 
              | GPU drivers live in kernel-land, and I don't think we can
              | actually install GPU drivers as part of a `pip install`.
              | I'll look into what Flux is doing, but I doubt they ship
              | GPU drivers.
             | 
             | Separately, thanks for flagging the Poetry issue, we might
             | prioritize it, especially if the fix is easy.
        
               | amkkma wrote:
               | yes Flux doesn't ship GPU drivers. It ships everything
               | else (like CUDA toolkit etc) as needed, using the
               | artifact / pkg system, for all mainstream OSes. Doesn't
               | interfere with system libraries.
               | 
               | https://julialang.org/blog/2019/11/artifacts/
        
       | albertzeyer wrote:
        | It was necessary to move away from Lua to stay relevant in
        | the machine learning community. Python was a natural choice
        | because Theano and TensorFlow were already there.
       | 
       | PyTorch could make use of the best API ideas from the other
       | frameworks (also higher-level like Keras). And it was executed
       | well. All these core principles of easy debuggability are indeed
       | very important to win developers. Clean code, understandable
       | code, flexibility, these are all very related to that, or mostly
       | the same thing.
       | 
        | It's easy for a successful framework to get bloated, complex,
        | and complicated, though. I wonder how PyTorch will look in a
        | few years. I also remember the first TensorFlow releases, where
        | the whole source code was quite easy to understand. Then
        | TensorFlow added more and more things, and many different kinds
        | of APIs, started deprecating earlier things, etc. PyTorch's
        | internal code is also already much more complex than it was
        | initially.
       | 
        | One reason JAX is now popular is that it again started with a
        | fresh API, based on a new idea of code transformations, which
        | seems nice and powerful.
       | 
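        | A toy illustration of that transformation idea: a transform
        | consumes a function and returns a new function. Here a
        | hypothetical finite-difference grad stands in for JAX's real
        | tracing-based one:

```python
# Sketch of the function-to-function transformation style JAX uses.
# Real jax.grad traces the function and builds an exact derivative;
# this finite-difference stand-in only approximates one, but shows the
# shape of the API: functions in, transformed functions out.
def grad(f, eps=1e-6):
    def df(x):
        # Central difference approximates f'(x).
        return (f(x + eps) - f(x - eps)) / (2 * eps)
    return df

def square(x):
    return x * x

dsquare = grad(square)          # a new function derived from square
print(round(dsquare(3.0), 3))   # prints 6.0
```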
       | When looking at these developments, I really wonder what the
       | future will look like. It's good to have new ideas and new or
       | improved APIs. It's also good to adapt things for new kinds of
       | hardware (GPUs, TPUs, maybe other neuromorphic hardware later).
        
         | jimsimmons wrote:
          | Keras was a copy of the Torch API. If you read the original
          | Keras README, it literally says so.
        
       | amkkma wrote:
        | As a Julia user, thanks for this! Inspiring and packed with
        | pearls. There's a lot we can learn from the Python community.
        
       | blt wrote:
       | This article does a good job explaining how PyTorch gained an
       | advantage over TensorFlow. The 1.0 release of TensorFlow with
       | graphs and feed_dicts was a little clunky but made sense. After
       | 1.0 the second-system effect took hold quickly. Eager mode,
       | Keras, TFX ... it all started to look like a mess.
        
       ___________________________________________________________________
       (page generated 2021-08-04 23:00 UTC)