[HN Gopher] Apple: Transformer architecture optimized for Apple ...
       ___________________________________________________________________
        
       Apple: Transformer architecture optimized for Apple Silicon
        
       Author : behnamoh
       Score  : 78 points
       Date   : 2023-03-23 22:31 UTC (28 minutes ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | totoglazer wrote:
       | (2022)
        
       | hannofcart wrote:
       | As someone entirely at sea with the rapid pace of development in
       | this sphere:
       | 
       | 1. Is this a new LLM from Apple?
       | 
       | 2. Is this a way to optimize running LLMs like Llama locally on
       | M1 macs?
       | 
       | 3. Something else altogether?
        
         | uoaei wrote:
         | 2. A Transformer is a core building block of LLMs.
         | 
         | > [T]he device spec for this reference implementation is M1 or
         | newer chips for the Mac and A14 and newer chips for the iPhone
         | and iPad
        
         | jeffbee wrote:
          | It's none of those things. It's a set of tweaks to existing
          | code so that it runs better on Apple's hardware. This other
          | article is far more informative than the repo:
         | https://machinelearning.apple.com/research/neural-engine-tra...
        
         | sheepscreek wrote:
         | #2. A way to optimize running LLMs locally on Apple Silicon
         | (including iPhones)
         | 
          | I am just a little better informed. As I understand it,
          | their code improves inference speed and memory consumption
          | for Transformer models built with PyTorch and the Hugging
          | Face libraries.
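          | 
          | For a feel of the API, here's a minimal sketch based on my
          | reading of the repo's README (the module path and arguments
          | are from memory, so treat them as assumptions):
          | 
          |     import transformers
          |     # ANE-optimized drop-in, per the repo's README
          |     from ane_transformers.huggingface import distilbert as ane_distilbert
          | 
          |     name = "distilbert-base-uncased-finetuned-sst-2-english"
          | 
          |     # Stock Hugging Face model, kept for comparison.
          |     baseline = transformers.AutoModelForSequenceClassification \
          |         .from_pretrained(name, return_dict=False, torchscript=True).eval()
          | 
          |     # Same weights, restructured with ANE-friendly ops and layouts.
          |     optimized = ane_distilbert.DistilBertForSequenceClassification \
          |         .from_pretrained(name, return_dict=False, torchscript=True).eval()
          | 
          | From there the README traces the optimized model with
          | torch.jit.trace and converts it with coremltools so Core ML
          | can schedule it onto the Neural Engine.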
        
       | iamsanteri wrote:
       | The race is on and ecosystems are moving fast.
        
       | great_psy wrote:
        | Maybe Apple will have a bigger effect on AI adoption than any
        | other company.
       | 
       | Local inference is huge for anything that requires even a little
       | bit of privacy.
        
       | endisneigh wrote:
        | I'd say that within 5 years Apple will have optimized Apple
        | Silicon and its tech, along with language-model improvements,
        | such that you will be able to get GPT-4-level performance on
        | an iPhone 19 with inference happening entirely locally.
        | 
        | OpenAI is doing great work and is serious competition, but I
        | think many underestimate big tech. Once they're properly
        | motivated, they'll catch up quickly. I think we can agree
        | that OpenAI is a sufficient motivator.
        
         | bottlepalm wrote:
         | Maybe we should launch 100 of them out into space in different
          | directions. Their very low mass means we should be able to
          | push them to a pretty high velocity.
        
       | passwordoops wrote:
        | Weird, I just read this tweet [0] arguing that Apple will be
        | launching its own secure and private LLM that runs on-device
        | (edge compute).
       | 
       | https://twitter.com/LinusEkenstam/status/1638999208911949845...
        
       | tinyhouse wrote:
        | This is great. I cannot wait to try it on my laptop, as I
        | like to do dev locally. But I don't understand the deployment
        | part: besides on-device, how would you deploy this on a
        | server, say, given that servers are all Linux-based?
        
       | au8er wrote:
        | While the GitHub repo contains the code, the article
        | describing the optimisations is here:
        | https://machinelearning.apple.com/research/neural-engine-tra....
        | 
        | TL;DR: execution of PyTorch models on Apple's Neural Engine,
        | plus standard data-oriented optimisations (changing the matrix
        | layout, chunking to improve temporal cache locality, and
        | minimising redundant memory copies).
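        | 
        | To make the layout point concrete, here's a minimal sketch of
        | the data-format idea the article describes (a Linear layer
        | re-expressed as a 1x1 Conv2d over a (B, C, 1, S) tensor).
        | This is my own illustration, not Apple's code:
        | 
        |     import torch
        |     import torch.nn as nn
        | 
        |     # Usual Transformer layout: (batch, seq, channels).
        |     B, S, C_in, C_out = 1, 128, 768, 3072
        |     linear = nn.Linear(C_in, C_out)
        |     x = torch.randn(B, S, C_in)
        |     y_ref = linear(x)                        # (B, S, C_out)
        | 
        |     # ANE-friendly layout: identical math as a 1x1 conv over
        |     # (batch, channels, 1, seq).
        |     conv = nn.Conv2d(C_in, C_out, kernel_size=1)
        |     conv.weight.data = linear.weight.data[:, :, None, None]
        |     conv.bias.data = linear.bias.data
        |     x_bc1s = x.transpose(1, 2).unsqueeze(2)  # (B, C_in, 1, S)
        |     y_conv = conv(x_bc1s)                    # (B, C_out, 1, S)
        | 
        |     # Both layouts compute the same result.
        |     assert torch.allclose(
        |         y_ref, y_conv.squeeze(2).transpose(1, 2), atol=1e-4)
        | 
        | The (B, C, 1, S) format matches the Neural Engine's preferred
        | memory layout, so the same matmuls run without the costly
        | transposes and copies the article calls out.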
        
       ___________________________________________________________________
       (page generated 2023-03-23 23:00 UTC)