[HN Gopher] What we know about the Apple Neural Engine
___________________________________________________________________

What we know about the Apple Neural Engine

Author : SerCe
Score  : 265 points
Date   : 2023-03-25 11:04 UTC (11 hours ago)

(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| jeffrallen wrote:
| Ane means donkey in French.
|
| Just sayin'.
| marginalia_nu wrote:
| Alright, then, Apple Semantic System.
| eastbound wrote:
| No, those initials were already taken by Atlassian Software
| Systems. They seem to have lodged the paperwork with that name
| in 2002, and to have dropped it later on (they went with TEAM
| instead when going public in 2015), but back in 2010 when I
| applied, there was a book (a collection of news articles) in the
| waiting room for candidates titled "Atlassian Software Systems".
|
| Great guys.
|
| https://youtu.be/VfyUbuFoiBU
| marginalia_nu wrote:
| Can't make this shit up.
| dylan604 wrote:
| Don't forget the Advanced SubStation Alpha subtitle format
| mrweasel wrote:
| Which, shortened, also means donkey. Brilliant.
| ls612 wrote:
| Does anyone know if the neural engine on the new M1/M2 Max is
| directly hooked up to the unified memory the way the GPU is?
| wmf wrote:
| Define "directly", I guess.
| ls612 wrote:
| My understanding is that the CPU and GPU both have DMA to the
| memory at some incredible speed since it's all on the same chip.
| Does the ANE have that same DMA speed and latency?
| jamiek88 wrote:
| I believe so, as it's used by Adobe among others. This was from
| a conversation with an Adobe engineer gushing about UMA/DMA and
| what an improvement it was over the fan-whirring, jet-engine end
| of the Intel era.
|
| I can't find any documentation about it, though; everyone is
| just working under that assumption.
| anentropic wrote:
| Do we think Apple are going to provide more info and maybe a
| public API over time?
|
| Or are they keeping it obscure for commercial reasons?
|
| Or just not very competent/don't care?
|
| Seems weird having these amazing chips and only blunt tools
| my123 wrote:
| CoreML.
|
| Directly exposing the ANE wouldn't make much sense, as it's an
| IP block that changes between generations in incompatible ways.
| brookst wrote:
| This is the answer. CoreML gives you an abstraction over
| different generations and sizes of underlying NPU.
|
| You might not _want_ the abstraction, but love it or hate it,
| that's kind of the Apple way.
|
| It will be very interesting to see what their next chips look
| like, since we're getting to the point where HW designs will
| reflect the rise of the, uh, transformers.
| sebzim4500 wrote:
| Can this really be everything publicly known about the ANE?
| Sounds hard to believe; I would have thought someone would have
| reverse engineered _something_ about it by now.
| SomeHacker44 wrote:
| See the other commenter above about GeoHot's analysis, which is
| much more in depth.
| detrites wrote:
| My question too. This semi-answer on the page seems to
| contradict itself (source: https://github.com/hollance/neural-
| engine/blob/master/docs/p... ):
|
| "> Can I program the ANE directly?
|
| Unfortunately not. You can only use the Neural Engine through
| Core ML at the moment.
|
| There currently is no public framework for programming the ANE.
| There are several private, undocumented frameworks, but
| obviously we cannot use them, as Apple rejects apps that use
| private frameworks.
|
| (Perhaps in the future Apple will provide a public version of
| AppleNeuralEngine.framework.)"
|
| The last part links to this bunch of headers:
|
| https://github.com/nst/iOS-Runtime-Headers/tree/master/Priva...
|
| So might it be more accurate to say you can program it directly,
| but won't end up with something that can be distributed on the
| App Store?
| saagarjha wrote:
| Correct. (It is also unlikely that Apple exposes the Neural
| Engine directly.)
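[Editor's note] The abstraction described above — CoreML deciding where each layer runs and silently falling back from the ANE to the GPU or CPU — can be pictured with a toy sketch. This is purely illustrative Python: the function and the capability sets are invented for the example and are not Apple API.

```python
# Toy sketch of the compute-unit fallback a CoreML-style abstraction
# implies: try the most specialized unit first, fall back when an op
# isn't supported there. All names here are illustrative.
def place_op(op: str, ane_ops: set, gpu_ops: set) -> str:
    """Return which compute unit a given op would land on."""
    if op in ane_ops:
        return "ane"
    if op in gpu_ops:
        return "gpu"
    return "cpu"  # the CPU can always run the op

# Hypothetical capability sets for one chip generation.
ANE_OPS = {"conv", "matmul", "relu"}
GPU_OPS = ANE_OPS | {"softmax", "custom_layer"}

plan = {op: place_op(op, ANE_OPS, GPU_OPS)
        for op in ("conv", "softmax", "topk")}
print(plan)  # {'conv': 'ane', 'softmax': 'gpu', 'topk': 'cpu'}
```

Because the capability sets change between ANE generations, exposing only the abstraction lets Apple reshuffle them without breaking shipped apps — which is the point my123 and brookst are making.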
| djoldman wrote:
| geohot's findings:
|
| https://github.com/geohot/tinygrad/tree/master/accel/ane
| ljlolel wrote:
| https://news.ycombinator.com/item?id=35302833
| detrites wrote:
| Ok, this is much more like what I expected from the OP.
|
| Anyone disappointed, here be full details on everything.
| bitL wrote:
| It's really terrible that Apple markets this as the next big
| thing but forgets to include detailed documentation, so people
| have to experiment and figure out what works...
| lonelyasacloud wrote:
| Partly Apple's docs haven't been great for a while, partly
| that's just how they roll, and partly they're trying (like most
| everyone) to figure out what their strategy is going to be in a
| post-GPT-4 world [0].
|
| [0] Persist with their own models running locally, how much to
| integrate with the rest of the OS while maintaining the privacy
| moral high ground, that sort of thing.
| barkingcat wrote:
| Apple didn't "forget"; they never want to release Apple
| proprietary docs. It's their competitive advantage/moat.
| saagarjha wrote:
| People don't have to do anything. You use CoreML to program it.
| cynicalsecurity wrote:
| Proprietary software, dude. It really sucks.
| m3kw9 wrote:
| Can LLMs run on it?
| egman_ekki wrote:
| maybe https://github.com/apple/ml-ane-transformers
| enzomanico wrote:
| Yes, of course.
| simonw wrote:
| So my phone and my laptop both have the capability to perform 15
| trillion operations per second, just in the neural engine?
|
| What kind of things are taking advantage of this right now? It's
| gotta be more than just Face ID, right?
|
| What's my laptop likely to be doing with that?
| k_bx wrote:
| They're putting it everywhere they can. From Notes, to pressing
| pause on a video in QuickTime or Safari and copying text from a
| frame instantly.
| [deleted]
| gedy wrote:
| I can imagine something like Siri running on device much more
| effectively against local content.
| The cynic in me doesn't want to hope too much for cloudless
| services like this, but one can hope.
| sroussey wrote:
| This is true since iOS 15 moved Siri on device.
| sroussey wrote:
| https://www.engadget.com/ios-15-siri-on-device-app-privacy-1...
| IIAOPSW wrote:
| Siri wasn't a product. She was an emergent feature they couldn't
| extinguish.
| conradev wrote:
| It's used for a variety of things:
|
| - Biometrics (Face ID and Touch ID)
|
| - Image analysis (face matching, aesthetic evaluations, etc.)
|
| - Text-to-speech and speech-to-text (smaller models on device,
| used for privacy/latency/reliability)
|
| - Small ad-hoc models, like Raise to Speak on Apple Watch and
| the Hey Siri detector
| (https://machinelearning.apple.com/research/hey-siri)
|
| These things have been in phones for 5 years now and have been
| used from day one.
| simonw wrote:
| Right, but do any of those things really need 15 trillion
| operations per second? Have they been getting noticeably better
| with upgraded phone models?
| jamiek88 wrote:
| Yes, definitely.
|
| I could only find a blurry YouTube video of the instruction
| manual for an old, old heater in my house.
|
| I paused the video on the bit I needed, where the guy had zoomed
| in, and was able to copy and paste the text that I could barely
| read into a Notes doc.
|
| There's no one splashy thing, just lots of little quality-of-
| life improvements.
| burnished wrote:
| I got one recently and generally think the phone is garbage, but
| the OCR built into pictures is really something else. I took a
| photo of a label for a barcode when I couldn't see it myself but
| could get my hand nearby. It was at an odd angle, but when I
| pressed my finger to the text I was interested in, the phone
| captured it immediately, highlighted it, and I copied it nice as
| you please.
| blululu wrote:
| No, but the first-party users should not consume all the compute
| on the chip. The bigger the margin, the better for the device.
| The other aspect of this is speed and power consumption (battery
| life is a top-3 phone feature across pretty much all consumers).
| secretsatan wrote:
| ARKit makes use of it on the phone: there's plane detection and
| classification, image and object detection, segmentation for
| people occlusion, and probably more behind the scenes.
|
| I find it a little frustrating that we aren't using the built-in
| capabilities of iPhones more in our company. I still kinda think
| Apple tech is a pariah in some circles, so we have to run with
| stuff on the cloud that costs us money over, heaven forbid,
| something you could run on an iPhone.
| kmeisthax wrote:
| There's an app called Draw Things, for iOS/iPadOS/macOS/etcOS,
| that uses the ANE to run Stable Diffusion on your
| phone/tablet/laptop.
| fauigerzigerk wrote:
| I don't know for sure, but things like text recognition (Live
| Text) or object recognition in Photos (Visual Look Up) are
| obvious candidates.
|
| I think the Neural Engine is absolutely key to Apple's strategy.
| They want people to buy expensive devices, and they don't want
| to process user data on their servers.
|
| Users get privacy. Apple gets money. It's a pretty coherent
| business model.
| jjoonathan wrote:
| Privacy isn't the only benefit of local compute; users also get
| colossal bandwidth, tiny latency, and high reliability.
| crazygringo wrote:
| On the other hand, it kills your battery.
|
| Back when dictation was done in the cloud, I could dictate all
| day on my iPhone, no problem.
|
| Now that it's on-device, it kills my battery in a couple of
| hours.
|
| The latency is absolutely improved, and continuous dictation
| (not stopping every 30s) is a godsend.
|
| But it does absolutely destroy your battery life.
| bibanez wrote:
| Don't worry too much, because there is Moore's law to the
| rescue. NPUs benefit from new processes.
| mcculley wrote:
| Moore's Law makes it a good long-term strategy for Apple.
| The GP is complaining about his battery life today.
| lucideer wrote:
| I hope it's not disrespectful to point this out, less than 24
| hours after his passing, but I don't think Gordon would object
| to my pointing out that Moore's Law has a finite length. Some
| have argued it expired up to 13 years ago; Moore himself
| predicted another 2 years or so.
| nwienert wrote:
| I built an always-on local OCR system that used ML on the
| CPU/GPU a few years ago, and I can say with confidence it
| doesn't use much. We literally scanned your entire screen every
| two seconds and it used less than 1% in total, and this was
| before CoreML, which is far more efficient. I think it's FUD
| that it is that significant.
| fauigerzigerk wrote:
| Agreed.
|
| On the downside, we have to acknowledge that it is hugely
| inefficient for everyone to own expensive hardware that has to
| sit idle most of the time because it would otherwise drain the
| battery.
|
| Where low latency is not an absolute necessity, the economic
| pull of the cloud will be tremendous, especially if mobile
| networks become ubiquitous and fast.
| vinay_ys wrote:
| That's a weak argument. Lots of hardware sits idle in the cloud
| as well. And on your phone it's not expensive. In fact, the
| $/TFLOP is cheaper on a phone than in the cloud - the cloud has
| to deal with all kinds of complexity that you assume away in
| your local single-tenant phone context.
| fauigerzigerk wrote:
| I wouldn't be so sure. A quick web search brings up average
| server utilisation numbers for large-scale cloud providers
| between 45% and 65%. That's probably an order of magnitude or
| two higher than what you could do on a mobile device without
| absolutely annihilating the battery.
| flutas wrote:
| > They want people to buy expensive devices and they don't want
| to process user data on their servers.
|
| > Users get privacy. Apple gets money.
|
| Apple also gets users to subsidize the cost of compute
| indefinitely (by buying the expensive phone), rather than using
| their servers.
| blululu wrote:
| It's not a subsidy. It's a pricing structure for a commercial
| transaction. Fundamentally, a business cannot just give out free
| compute. In the long run, the user of computation needs to pay
| for it. It's a question of whether people feel more satisfied
| paying for it in a lump sum bundled with a device or through a
| subscription plan on the cloud. For frequent, on-demand, low-
| latency applications I would suspect that people will always be
| happier running the computations locally.
| saagarjha wrote:
| Apple also runs an OS on that device, so they can't just offload
| infinite computation to it: it would use too much battery.
| iamgopal wrote:
| 15 trillion operations per second? Of what kind? Addition? Isn't
| that mind-blowing?
| selectodude wrote:
| Matrix multiplication
| amelius wrote:
| Of what size??
| sebzim4500 wrote:
| I know you probably didn't mean this, but in case anyone is
| confused: the ANE is not doing 15 trillion matrix
| multiplications per second. It is doing 15 trillion scalar
| operations in order to multiply a much smaller number of
| matrices.
| blululu wrote:
| To my knowledge this is mostly used by internal tools, though a
| number of common 3rd-party APIs (QR code scanning) use HW
| acceleration under the hood. Internally there is a ton of ML
| running on the device. The most obvious is the touch screen and
| inputs, and the camera. 3rd-party developers have access to this
| via CoreML, but unless latency is critical it is usually easier
| to develop and run ML in the cloud. For camera apps using ML,
| this chip is going to be used either explicitly or implicitly.
| simonw wrote:
| Oh, the touch screen! That's fascinating. Is that definitely
| running stuff on the neural engine?
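[Editor's note] sebzim4500's distinction between scalar operations and whole matrix multiplies can be made concrete with back-of-envelope arithmetic. A sketch in Python, using the thread's headline 15-trillion-ops figure and ignoring precision, utilization, and memory bandwidth:

```python
# Back-of-envelope: how many dense matrix multiplies 15 trillion
# scalar ops/sec buys. A dense NxN matmul costs about 2*N**3 scalar
# ops (roughly N multiplies and N adds per output element, with
# N*N output elements).
def matmul_ops(n: int) -> int:
    return 2 * n ** 3

ANE_OPS_PER_SEC = 15e12            # headline figure from the thread

ops = matmul_ops(1024)             # ~2.15e9 scalar ops per matmul
matmuls_per_sec = ANE_OPS_PER_SEC / ops
print(round(matmuls_per_sec))      # roughly 7000 1024x1024 matmuls/sec at peak
```

So "15 trillion operations per second" translates to on the order of thousands of large matrix multiplies per second at best, not trillions.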
| blululu wrote:
| If you think about it, a capacitive touch sensor provides a
| noisy grayscale image, and the goal is to detect and classify
| blobs as touch gestures as quickly and accurately as possible.
| Since it is running at all times and latency really burns the
| UX, this has always been done on a HW accelerator.
| ManuelKiessling wrote:
| Running Microsoft Teams. Barely.
| waboremo wrote:
| Scene analysis in Photos, image captions, and machine
| translations are also done using the ANE. CoreML also utilizes
| it when possible.
| mmaunder wrote:
| Has anyone done any work on using a model for transcription on
| the local device using the ANE? I've heard it kills the battery.
| Having to transcribe voice in the cloud is a serious impediment
| to end-to-end encryption for certain applications.
| intalentive wrote:
| This is close: https://github.com/ggerganov/whisper.cpp
| thedonkeycometh wrote:
| [dead]
| ah- wrote:
| There's also basic ANE support for Asahi now:
|
| https://github.com/eiln/ane
|
| https://github.com/eiln/anecc
|
| https://github.com/AsahiLinux/m1n1/pull/296/files
| rowanG077 wrote:
| That's misleading. It's more apt to say it's being worked on.
| This is not available in any Asahi release at this time.
___________________________________________________________________
(page generated 2023-03-25 23:00 UTC)