[HN Gopher] Arm Announces Armv9 Architecture: SVE2, Security, an...
       ___________________________________________________________________
        
       Arm Announces Armv9 Architecture: SVE2, Security, and the Next
       Decade
        
       Author : marc__1
       Score  : 161 points
       Date   : 2021-03-30 18:07 UTC (4 hours ago)
        
 (HTM) web link (www.anandtech.com)
 (TXT) w3m dump (www.anandtech.com)
        
       | bogomipz wrote:
       | The article states:
       | 
       | >"SVE2 was announced back in April 2019, and looked to solve this
       | issue by complementing the new scalable SIMD instruction set with
       | the needed instructions to serve more varied DSP-like workloads
       | that currently still use NEON."
       | 
       | Could someone say what "DSP-like workloads" would be? I
       | understand what a DSP is but I'm wondering what type of workloads
       | that are not signal processing share similar characteristics.
        
         | zsmi wrote:
         | My guess is anything that produces an ordered sequence of
         | numbers that need to be dealt with. For example a network card
         | or any character based IO device. Just a couple of examples off
         | the top of my head.
        
       | monocasa wrote:
       | I'm really surprised that Apple released Mac ARM cores without
       | SVE. It feels like Neon compat is going to be an albatross for a
       | platform like Mac that can't be quite as aggressive at removing
       | ISA features as iOS devices can be.
       | 
       | But hey, maybe they just say 'screw it' and remove it anyway.
        
         | Tuna-Fish wrote:
         | As Apple and AMD are currently clearly demonstrating, SIMD just
         | really doesn't matter much.
         | 
         | Only a portion of the workloads that are commonly used can be
         | profitably vectorized using SIMD. The curiously perverse nature
         | of SIMD is that the wider the vectors, the smaller the
         | proportion of time used by these portions is, and therefore the
         | less you gain from further vector width increases. AMD is
         | currently spanking Intel in almost all practical vector
         | workloads, despite having half the vector width. Apple isn't
         | far off either, despite a quarter of the vector width.
         | 
         | It wouldn't really ever significantly hurt Apple if they just
         | literally never implemented any flavor of SVE. Spending all
         | that engineering effort on improving scalar throughput probably
         | has much better real-world payoff.
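         | 
         | The diminishing-returns argument above is essentially Amdahl's
         | law. A minimal sketch, assuming (purely for illustration) that
         | 30% of baseline runtime vectorizes and scales perfectly with
         | vector width; the function name and the 30% figure are
         | hypothetical, not measurements of any real chip:

```python
def overall_speedup(vector_fraction: float, width_factor: float) -> float:
    """Amdahl's law: only the vectorizable share of runtime gets faster."""
    scalar_time = 1.0 - vector_fraction            # untouched by wider vectors
    vector_time = vector_fraction / width_factor   # idealized linear scaling
    return 1.0 / (scalar_time + vector_time)


# Assumed, illustrative workload: 30% of baseline runtime is vectorizable.
VECTOR_FRACTION = 0.30

for width in (1, 2, 4, 8, 16):
    speedup = overall_speedup(VECTOR_FRACTION, width)
    print(f"{width:>2}x wider vectors -> {speedup:.2f}x overall")
```

         | Under that assumption the overall gain can never exceed
         | 1 / (1 - 0.30), roughly 1.43x, no matter how wide the vectors
         | get, and each doubling of width buys less than the previous one.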
        
           | monocasa wrote:
           | The thing with SVE though is that it's one of those CDC6600
           | inspired designs like RV-V and ARM Helium that does a much
           | better job abstracting the number of hardware vector lanes.
           | That means way better power consumption at the low end
           | transparently which is very much on Apple's radar.
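           | 
           | What "abstracting the number of hardware vector lanes" means
           | can be sketched in plain Python. This is only a simulation of
           | the vector-length-agnostic loop style, not real SVE code;
           | `vla_add` and its `lanes` parameter are hypothetical names
           | standing in for the hardware vector length:

```python
def vla_add(a, b, lanes):
    """Add two equal-length lists with a loop that never hard-codes the width.

    `lanes` models the hardware vector length, which the same code
    discovers at run time instead of baking it into the binary.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = [0] * len(a)
    i = 0
    while i < len(a):
        # Predicate: how many lanes are active this iteration
        # (similar in spirit to SVE's `whilelt`).
        active = min(lanes, len(a) - i)
        for lane in range(active):  # models one predicated vector add
            out[i + lane] = a[i + lane] + b[i + lane]
        i += lanes  # advance by the hardware vector length
    return out


# The same loop gives the same answer on "narrow" and "wide" hardware.
print(vla_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], lanes=2))
print(vla_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], lanes=8))
```

           | The point of the design is that a low-end core can pick a
           | small `lanes` value and a high-end core a large one without
           | the software being rewritten or recompiled.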
        
           | TinkersW wrote:
           | Your example fails to point out _why_ AMD and Apple are able
           | to compete despite having smaller vector widths, and no it
           | isn't because "SIMD just really doesn't matter".
           | 
           | It is because AMD and Apple have wider architectures with
           | more vector ports, they can execute 3 or 4 of these
           | instructions per cycle while Intel can only execute 2 (or even
           | 1 in some cases with avx512).
           | 
           | AMD has already said they will be adding AVX512 to the next
           | zen, so they apparently think SIMD matters.
           | 
           | Apple will almost certainly implement SVE, they would be
           | stupid to not do so, and they aren't stupid.
        
         | wtallis wrote:
         | The initial lack of SVE strikes me as very similar to the first
         | round of Intel Macs launching with 32-bit only processors. It
         | was 5 years before OS X dropped support for those machines, and
         | 13 years before macOS dropped support for 32-bit applications.
         | I wouldn't be surprised to see Apple accelerate those deadlines
         | a bit this time around, and make macOS start requiring SVE 3-4
         | years after they introduce supporting hardware. That would
         | probably be the point at which it was appropriate for third-
         | party applications to start requiring SVE-capable hardware.
         | 
         | I don't think keeping NEON capability in hardware is going to
         | hold back their chips much, so they probably won't be under any
         | pressure to break compatibility with NEON-using apps anytime
         | soon.
        
           | my123 wrote:
           | Will take much longer than that, because even binaries can be
           | shared between macOS and iOS this round.
           | 
           | And the iPhone 12 won't be out of support in 3 years.
        
       | 2bitencryption wrote:
       | Question - now that Apple is shipping its own ARM silicon - is
       | Apple beholden to the ARMv9/10/11/etc future? (edit: duh, Apple
       | has been shipping ARM silicon long before M1, I forgot)
       | 
       | Does Apple now have enormous input into the ARM spec process? (Or
       | maybe they did already, because iPhones?)
        
         | gchadwick wrote:
         | The conditions in the agreement between arm and Apple aren't
         | public so you can't say for sure. Historically arm hasn't
         | allowed architecture licensees to do their own thing. You're
         | implementing arm standard architecture (and passing the
         | conformance test suite) or nothing. Looks like they've given
         | Apple some special dispensation to do things differently (e.g.
         | the custom AMX instructions that have been uncovered
         | https://gist.github.com/dougallj/7a75a3be1ec69ca550e7c36dc75...).
         | 
         | For ARMv9 this potentially means they can pick and choose what
         | they want to implement. I'm sure Apple has had plenty of input
         | into the specification (along with other partners).
        
           | my123 wrote:
           | They can only implement a superset of a specification, no
           | mix-and-match allowed.
           | 
           | As an exception, an ARMv8.x-A chip can have some ARMv8.x+1-A
           | extensions, but _never_ 8.x+2-A extensions (forbidden by
           | Arm).
           | 
           | You just have one ISA minor revision of wiggle room.
           | 
           | And yes, Arm architectures are co-designed with plenty of
           | input from partners.
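           | 
           | The "one minor revision of wiggle room" rule, as described in
           | this comment, fits in a one-line check. A minimal sketch that
           | encodes only the rule as stated here, not Arm's actual
           | licensing or architecture text; `extension_allowed` is a
           | hypothetical helper:

```python
def extension_allowed(base_minor: int, extension_minor: int) -> bool:
    """Check an extension against the rule as stated in the comment.

    A core whose baseline is Armv8.<base_minor>-A may include extensions
    introduced up to Armv8.<base_minor + 1>-A, but nothing newer.
    """
    return extension_minor <= base_minor + 1


# Example: a hypothetical Armv8.4-A baseline.
for ext in (3, 4, 5, 6):
    verdict = "allowed" if extension_allowed(4, ext) else "forbidden"
    print(f"v8.{ext} extension on a v8.4 core: {verdict}")
```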
        
             | monocasa wrote:
             | Except we have evidence that they've been implementing a
             | subset of the specification on M1. Like VHE stuck on.
             | 
             | The not well kept rumor is that Apple has a much looser
             | than architectural license due to their very close
             | relationship with ARM, both generally since the early 90s,
             | and that Apple contributed very heavily to early AArch64
             | design and it's arguably theirs as much as it is ARM's.
        
               | my123 wrote:
               | VHE is implemented and stuck as on indeed.
               | 
               | There's also the fact that only pc and sp are retained on
               | WFI, without the other GPRs.
               | 
               | Those quirks only affect bare metal kernel-mode (EL2),
               | and do not affect user-mode or virtualised machines at
               | EL1 in any way.
        
           | ampdepolymerase wrote:
           | ARM was founded as an Apple joint venture. Their relationship
           | is probably special.
        
         | Nokinside wrote:
         | Apple don't use ARM's microarchitecture.
         | 
         | Apple designs their own chips using ARM ISA (instruction set
         | architecture). They design completely different chips separate
         | from ARM.
         | 
         | Handful of companies like Apple, Broadcom, Marvell, Intel,
         | Qualcomm, Samsung, have bought ARM architecture license.
        
         | my123 wrote:
         | Apple is a co-founder of Arm. Arm partners all play a role in
         | the design of the Arm specifications.
        
           | kps wrote:
           | To be clear, Apple was a co-founder of Advanced RISC Machines
           | Ltd (which became the current Arm Ltd), five years after the
           | ARM (Acorn RISC Machine) _architecture_.
        
         | barkingcat wrote:
         | Apple is one of the co-founders of the Arm holdings company all
         | the way back then.
         | 
         | Even though Apple and Arm are possibly "at-arms" organizations
         | these days (no one outside of these organizations really will
         | know, except the lawyers, and the contractual obligations
         | between Apple and Arm are likely locked up and highly
         | secretive)
         | 
         | I'd say the other way around, that Arm is beholden to where
         | Apple wants to take the Arm ecosystem, if only by way of
         | showing other participants in the arm ecosystem what is
         | possible. IE. Amazon is able to use the M1 as a gauge to how
         | far it might be able to take its own Graviton cores.
        
           | monocasa wrote:
           | Yeah, agreed. People assume that the power relationship
           | between Apple and ARM is the same as between Apple and other
           | architectural license holders, but all signs point to their
           | relationship being very different with them being at least
           | full partners and maybe Apple being the one dictating terms.
        
         | GeekyBear wrote:
         | The fact that the current M1 already ships with an undocumented
         | Matrix Multiplication implementation leads me to believe that
         | Apple was one of the development partners for this new version
         | of the spec.
         | 
         | https://medium.com/swlh/apples-m1-secret-coprocessor-6599492...
         | 
         | >All licensees are not equal however, the first few are called
         | lead licensees and companies pay an added fee for this honor.
         | ARM picks 2-3 lead licensees for each market segment and works
         | closely with them.
         | 
         | https://semiaccurate.com/2013/08/07/a-long-look-at-how-arm-l...
        
       | Quillbert182 wrote:
       | It will be interesting to see what effect this has on future
       | Raspberry Pis and other single board computers.
        
         | yalogin wrote:
         | If the OS doesn't want to use it, would it even impact the Pi?
        
         | wmf wrote:
         | The RPi seems to intentionally lag around 5 years behind the
         | Arm roadmap so maybe the RPi 7 will have Armv9 in 2027.
         | 
         | I don't intend this as a criticism, just a fact. Newer cores
         | cost more so RPi has to use older technology to meet its price
         | target.
        
           | Waterluvian wrote:
           | This reminds me of Game Boy vs. Game Gear and I think it's a
           | very smart move.
        
           | monocasa wrote:
           | There's a rumor on the grapevine that RPi doesn't pay the
           | license fees for their cores. There's probably more to it
           | than that (maybe they do for compute modules which are
           | explicitly not for the .edu market?), but the word on the street
           | is that that baseline licensing cost is $0 for them from ARM.
           | 
           | I ultimately think you're right, and we won't see a V9 in an
           | RPi for a while, but it's a more complex situation than most
           | SoC integrators that has a slight chance of working out in
           | RPi's favor. Does ARM care enough to make a tiny core in
           | their gate count niche? Does ARM want to give the free new
           | cores to increase market share and get V9 features in the
           | hands of tinkerers? etc.
        
             | Teknoman117 wrote:
             | Why would RPi pay the license fee for ARM cores? Up until
             | the Pi Pico, they didn't make their own SoCs. You don't
             | have to be a licensee if you're only consuming processors.
             | 
             | edit - the microcontroller was called Pico, not Nano
        
               | monocasa wrote:
               | RPi is basically a subsidiary of Broadcom. And it was RPi
               | engineers that worked on each of the SoCs past the
               | original BCM2835.
        
               | ohazi wrote:
               | They're not _actually_ a subsidiary, but historically
               | they've been pretty close. Regardless, Broadcom is
               | absolutely the entity responsible for negotiating with
               | ARM and paying the license fee.
               | 
               | If there's any truth to this rumor it's probably ARM
               | discounting the license fee for the fraction of devices
               | that Broadcom sells to the Raspberry Pi foundation.
               | 
               | ARM knows that RPi is the go-to ARM SBC, and they benefit
               | tremendously when developers treat it as a first-class
               | platform.
        
               | monocasa wrote:
               | Sure I didn't say that they were a true subsidiary, I
               | used the word "basically".
               | 
               | At that point, yes Broadcom is ultimately paying the fees
               | to ARM, but separating RPi from that negotiation is an
               | oversimplification. RPi absolutely has a seat at that
               | table.
               | 
               | And ARM probably doesn't care that much about it being
               | the go-to SBC (some cheap Chinese board would take on
               | that mantle without RPi leading the charge), but instead
               | that it's the practical successor to what ARM was founded
               | to do. Put passable performance, hackable computers
               | designed by Brits in front of British school children as
               | cheaply as possible.
        
         | john_alan wrote:
         | Yep agree.
         | 
         | Raspberry Pi needs to do two things to become really useful.
         | 
         | 1) ship chips with native-AES/hardware crypto
         | 
         | 2) get rid of SD and have onboard NAND, like a phone
         | 
         | I've raised this on their forums but just get flamed for some
         | reason by senior engineers suggesting that these ideas are
         | ridiculous and it's for education in Africa, I was even banned
         | for suggesting they were shortsighted.
         | 
         | Maybe education's where it started, I don't see why they are
         | blind to the reality that most current users are tinkerers and
         | Linux hobbyists.
        
           | pjmlp wrote:
           | Linux hobbyists have plenty of boards to choose from, even
           | before Pi existed like the Beagle Board.
        
           | rvz wrote:
           | > 2) get rid of SD and have onboard NAND, like a phone
           | 
           | Oh dear. That will involve the process of flashing OS onto
           | the Pi and will increase the risk of bricking it. That is the
           | reason for using SD cards instead of a NAND chip.
           | 
           | Secondly, you can't upgrade the storage on it either and
           | would have to choose a RPi with fixed storage space on it.
           | Might as well get a M1 Mac Mini.
           | 
           | I cannot imagine having to choose a future RPi 5 having
           | either an 8GB, 16GB, or 32GB NAND and being unable to upgrade
           | the space on it. So no thanks and no deal to (2).
        
             | cozzyd wrote:
             | The BeagleBoneBlack has eMMC and an SD card and a hardware
             | switch to force it to boot off the SD card. Plus, flash is
             | rarely really bricked, you can just hook it up to an SPI
             | bus...
        
             | Teknoman117 wrote:
             | The compute modules have a DFU mode and the flash tool
             | loads a small binary to effectively turn it into a USB
             | stick to write to the internal flash.
             | 
             | But that being said, I've burned out plenty an SD card.
             | Haven't managed to do that to an eMMC module yet.
             | 
             | Also, on devices like the Beaglebone, you can use eMMC and
             | an SD card at the same time.
        
             | CameronNemo wrote:
             | On lots of pine boards you can flip a switch or short some
             | pins to prevent the bootrom from using the SPI flash or
             | eMMC. The SPI is not removable, but flashing a bad payload
             | to it is not an unrecoverable error.
        
             | derefr wrote:
             | If the Pi can still boot from USB, then you can re-flash
             | the NAND. No chance of bricking it.
        
             | qwertox wrote:
             | I agree, I have enough bad experience with on-board NAND
             | from other SBCs; it's no fun at all.
             | 
             | The SD cards are nice for most use cases, but on some
             | boards I'd really like to see a SATA port or M.2 slot which
             | can be booted from and an x4 PCIe port (even if through an
             | optional board or a HAT).
             | 
             | I see the SD cards on the Raspis as modern floppy discs.
             | They are good to have, but not ideal for some cases.
        
               | unixhero wrote:
               | M.2 is the way
               | 
               | People have already achieved it by hardware hacking
        
               | geerlingguy wrote:
               | Hardware hacking not necessary if you use the Compute
               | Module 4; many boards are already integrating M.2 slots:
               | https://pipci.jeffgeerling.com/boards_cm
               | 
               | My hope is the next Pi model B might include an M.2 slot
               | (maybe 42mm long) on the bottom.
        
               | willis936 wrote:
               | I personally would not trade the hat real estate for an
               | M.2 slot in most use cases. I _use_ that hat slot for
               | peripherals that enable unique and fun applications.
               | Micro SD is horribly unreliable, but idk what the
               | solution is without trading a significant amount of
               | space.
        
             | read_if_gay_ wrote:
             | Just because there's onboard storage doesn't mean they
             | couldn't still give it an M.2 or something. And you can
             | install an OS just fine by booting off USB, just as you'd
             | do with any other computer.
        
           | tyingq wrote:
           | Rpi does seem to have already split to serve two different
           | use cases. The normal board for hobby/educational purposes,
           | and the compute module for embedded "real" use. That works
           | for #2. I suspect #1 depends a lot on what Broadcom can give
           | them at a price point that preserves the Rpi history of being
           | dirt cheap.
        
           | PontifexMinimus wrote:
           | > 1) ship chips with native-AES/hardware crypto
           | 
           | > 2) get rid of SD and have onboard NAND, like a phone
           | 
           | Why are these important?
        
             | ed25519FUUU wrote:
             | At least for 1) hardware accelerated encryption/decryption
             | makes many things faster. Even just browsing the web.
        
           | etaioinshrdlu wrote:
           | What's so important about hardware crypto? For me, the real
           | game-changer would be some type of better GPGPU support on
           | existing and future devices. Most of the FLOPs are in the GPU
           | and it's not very practical to use it for compute due to lack
           | of drivers and APIs.
        
             | stefan_ wrote:
             | There aren't a lot of flops at all in the RPi GPU, though.
             | They probably want some sort of Neural Network inference
             | engine, both because it is easy enough and makes for good
             | educational content.
        
               | etaioinshrdlu wrote:
               | It turns out to be a difficult question to answer as to
               | the number of FLOPs of the Pi 4 GPU, see here:
               | https://www.raspberrypi.org/forums/viewtopic.php?t=244519
               | 
               | But it does appear that the GPU has perhaps 2x to 5x the
               | FLOPs of the CPU, albeit with not much in the way of APIs
               | to actually use it.
        
             | hedora wrote:
             | Hardware crypto has an incredibly low gate count, and
             | essentially all network communication requires crypto.
             | Without it, you waste the (much more power and die-space
             | intensive) CPU on AES.
        
           | numpad0 wrote:
           | Got to have proper power section first. The left-bottom
           | section. And designated external power headers rather than
           | the USB or GPIO backfeeding (PoE pins might work for that
           | already?). Some claim the chronic SD card longevity issue in
           | all Pis comes from the dirty power the Pi feeds them.
        
           | KindOne wrote:
           | >onboard NAND.
           | 
           | I disagree. Having onboard NAND seems like a disadvantage,
           | for like when it goes bad from all the usage.
           | 
           | Should have one M.2 slot.
        
             | CameronNemo wrote:
             | Have you seen the new firefly rk3568 offering? They have
             | lots of I/O including an M.2 slot and SATA port.
        
             | api wrote:
             | These aren't mutually exclusive. I would vote for onboard
             | NAND and an M.2 slot, and perhaps also a jumper or
             | something to select which to boot from.
             | 
             | Normal thing might be to boot from on-board NAND but put
             | real workloads on M.2 if present.
             | 
             | I have to also vote for crypto extensions. The Pi 4 is the
             | only ARM64 I have ever seen that lacks AES and CLMUL. Makes
             | it pretty lame for web serving, router/firewall/VPN
             | gateway, etc.
        
               | john_alan wrote:
               | Yep lack of crypto makes it dog slow for so many things.
               | Including running a little Monero node or whatever.
               | 
               | I would love to know why they chose to exclude this, it
               | couldn't make a big difference in price could it?
        
               | rubyist5eva wrote:
               | > Yep lack of crypto makes it dog slow for so many
               | things. Including running a little Monero node or
               | whatever.
               | 
               | Good.
               | 
               | The last thing we need is profit mad crypto miners buying
               | up every single Pi and we end up with shortages and
               | insane prices just like with gaming GPUs.
        
               | john_alan wrote:
               | Running a node is about seeding the chain not mining,
               | ofc.
               | 
               | Also a free market means people can do with the stuff
               | they buy as they wish.
               | 
               | Whether I mine on it or stick it up my arse is just as
               | valid as you gaming.
        
               | rubyist5eva wrote:
               | Doesn't change the fact I'm glad they excluded features
               | that make it unfavourable to crypto people - for any
               | reason - so that I can buy a pi for what it's intended
               | for at a decent price instead of wondering if people are
               | buying them to shove them up their arse.
        
               | sudosysgen wrote:
               | Crypto extensions will never make the rpi even remotely
               | cost competitive with even very old GPUs.
               | 
               | The main advantage will be general network ops.
        
               | john_alan wrote:
               | I wouldn't bother to educate someone so closed minded.
        
               | rubyist5eva wrote:
               | Just because I don't want to shove a Raspberry Pi up my
               | arse doesn't mean I'm closed minded.
        
               | api wrote:
               | He said node not miner. There is no way a Pi even with
               | AES could possibly be worth running as a miner.
        
           | zucker42 wrote:
           | There are plenty of SBCs with AES support.
        
             | john_alan wrote:
             | yep but not with Raspbian and the community, or wifi, or
             | ...
        
         | oblio wrote:
         | Honest question from an outsider: except for the
         | hobbyist/techie points, is there a solid reason why you'd buy a
         | Raspberry Pi?
         | 
         | Aren't there any Intel Celeron or some such super cheap x86
         | small boards? I imagine you'll get a ton more oomph, probably
         | much better software support and they shouldn't cost that much
         | more.
        
           | asimovfan wrote:
           | I use a rpi 4 8gb as my daily driver. It does everything and
           | it's hard to get distracted with.
        
           | ahhname wrote:
           | The x86 based SBC are usually a lot more expensive. Usually
           | they start around $100 or so, but at that point they use way
           | outdated chips. For instance, the basic LattePanda has an
           | Atom x5-Z8350, which is limited to 2GB of DDR3 RAM.
           | 
           | The Pis really can do a lot too. Mine hosts a vpn, pihole,
           | calibre-web, and a couple of other basic things. Really the
           | only downside to the ARM based SBCs is they mostly use
           | MicroSD cards, which probably are the greatest bottleneck in
           | the system and the most likely thing to fail.
        
           | teraflop wrote:
           | Almost every x86 board on the market is way more expensive
           | and/or way more power-hungry than the Raspberry Pi.
           | 
           | The only one I'm aware of that's in the same ballpark is the
           | Atomic Pi, which was apparently a limited production run
           | using heavily discounted surplus components. It's not as
           | popular as you'd expect given the $40 price point, which I
           | assume is because there isn't anything like the same level of
           | community support that the Raspberry Pi has.
        
             | ekianjo wrote:
             | > Almost every x86 board on the market is way more
             | expensive and/or way more power-hungry than the Raspberry
             | Pi.
             | 
             | Used Celeron NUCs are not more expensive, and they are way
             | more powerful.
        
               | teraflop wrote:
               | Could you share a link, then? The prices I've seen for
               | the Intel NUCs are more like $100+, even used, compared
               | to $30 for the cheapest Raspberry Pi 4.
        
             | mkaic wrote:
             | Happy Atomic Pi customer here, and their lack of popularity
             | confuses me too. Even comes with wifi out of the box! :P
        
           | senko wrote:
           | Bought a couple hundred of them for on-site devices
           | (intentionally avoiding IoT branding here). We use them as
           | plug (into mains and net) and play appliances, zero knowledge
           | required by the end user.
           | 
           | Cheap, robust, solid support and tools. Definitely hit the
           | sweet spot for what we needed.
        
           | gregmac wrote:
           | It's a standardized piece of kit at a great price point with
           | broad and long-term availability around the world. There's
           | simply not anything that hits all those points in the x86
           | world.
           | 
           | It's a good target if you want to provide something very
           | close to a "turn-key appliance" without actually selling your
           | own hardware: You can provide an image that a user writes to
           | a microSD card, plugs in, and it just works.
           | 
           | For example: HomeAssistant [1] for home automation, OctoPi
           | [2] 3D printer software, OpenElec [3] media center, RetroPie
           | [4] game console emulator, Volumio [5] music player, and lots
           | more [6].
           | 
           | [1] https://www.home-assistant.io/installation
           | 
           | [2] https://octoprint.org/download/
           | 
           | [3] https://openelec.tv/
           | 
           | [4] https://retropie.org.uk/
           | 
           | [5] https://volumio.org/
           | 
           | [6] https://github.com/thibmaek/awesome-raspberry-pi
        
             | chaosharmonic wrote:
             | > It's a good target if you want to provide something very
             | close to a "turn-key appliance" without actually selling
             | your own hardware: You can provide an image that a user
             | writes to a microSD card, plugs in, and it just works.
             | 
             | Better yet - they boot from USB now (and even NVMe if
             | you're using a CM4).
        
           | centimeter wrote:
           | The raspberry pi is pretty popular here, but as someone who
           | works on both embedded and server applications, I find it to
           | be a massive piece of shit. The broadcom SoCs they use are
           | complete garbage, with really poor reliability and several
           | badly broken peripherals.
        
           | bananabreakfast wrote:
           | There's a reason raspberry pi's are on arm. x86 does not do
           | well in small form factor and would actually perform far
           | worse in this situation.
        
           | kzrdude wrote:
           | I'm working in embedded, admittedly I'm not a hardcore
           | hardware guy. But everyone's using rpis for being the basic
           | driver for our little gizmo during development. It's simply a
           | good, cheap board to use for R&D and connecting stuff to.
        
           | robert_foss wrote:
           | There is no x86 equivalent at that price point & form factor.
        
             | unixhero wrote:
             | Nor energy usage.
        
               | ThrowawayR2 wrote:
               | That's often said but seems to be a myth.
               | 
               | - The Odroid H2+ x86 board claims about ~4W idle power:
               | https://www.hardkernel.com/shop/odroid-h2plus
               | 
               | - This measurement of idle power on the Raspberry Pi 4
               | says it uses ~3.4W:
               | https://www.tomshardware.com/reviews/raspberry-pi-4
        
               | AshamedCaptain wrote:
               | Actually, that would only be right if the Raspberry Pi
               | was any good regarding power usage. However the Raspberry
               | Pi is utterly crap at power savings, with practically no
               | power saving modes and very power-hungry (though cheap)
               | components. (excluding the RPI0 models, which have no
               | peripherals whatsoever).
               | 
               | I have a _very old_ PN40 from Asus -- this is a full PC
               | with an intel Celeron, a SATA SSD and about 8GB of RAM
               | that idles at 1.7W (serving websites via GB Ethernet) as
               | measured _at the wall_. This is just standard PC Linux
               | distro with zero customization (other than `powertop
               | --auto-tune`).
               | 
               | For comparison, the latest RPI4B without the SSD and with
               | 1/8 RAM idles at 3.5ishW on the wall. Even the older RPI3
               | I could never get below 2W, and that is without any
               | peripherals (no display, no Wifi, no eth, min USB) and
               | significant tuning.
               | 
               | On the other hand the PN40 + components cost
               | significantly more (probably close to 10x), and the CPU
               | performance itself is not that good these days.
        
               | ekianjo wrote:
               | Latest x86 mobile chips (with U in their names) are very
               | low-energy consumers.
        
               | my123 wrote:
               | Intel tried to make: https://ark.intel.com/content/www/us
               | /en/ark/products/79084/i...
               | 
               | in the past as a relatively cheap x86 CPU. ($9.62 per
               | unit)
               | 
               | However, it's a 400MHz 486 with some backports (1st
               | generation Pentium, no MMX even) and broken LOCK prefix
               | so that software specifically needed to be recompiled for
               | it.
               | 
               | And at 2.2 W too, it never ended up succeeding anywhere,
               | with no successor.
        
               | toast0 wrote:
               | I was pretty sure (without citations) the quark is a die
               | shrunk pre-mmx pentium (but post division table bug), are
               | you sure it's a 486?
               | 
               | The right era pentium had the same LOCK prefix issue (we
               | used to call it the F00F bug)
               | 
               | Edit: Thanks for link; I must have anchored on the
               | instruction set and ignored that it was using the 486
               | pipeline.
        
               | my123 wrote:
               | https://www.linleygroup.com/newsletters/newsletter_detail
               | .ph...
               | 
               | "The new x86 CPU uses an old-school 486 scalar pipeline
               | and the original Pentium instruction set with some modern
               | enhancements"
        
               | tedunangst wrote:
               | The fact it's got the f00f bug certainly made it sound
               | like a dusted off pentium core.
        
               | jandrese wrote:
               | Yeah, but the PC104 boards that those got stuck on were
               | still $250 each.
        
             | mkaic wrote:
             | Atomic Pi is pretty darn close, though it's twice the size.
             | Severely underrated SBC imo
        
             | ThrowawayR2 wrote:
             | The Intel Compute Stick
             | (https://www.intel.com/content/www/us/en/products/boards-
             | kits...) and its clones (which can still be found easily on
             | eBay) and the Intel Compute Card
             | (https://www.intel.com/content/www/us/en/compute-
             | card/compute...) showed that x86 could match the Raspberry
             | Pi form factor with roughly the same idle power and much
             | better peak performance. However, the parent is correct
             | that Intel couldn't (or wasn't willing to) match the price
             | point.
        
               | my123 wrote:
               | Raspberry Pis are on much older Arm
               | architectures/manufacturing processes, with nowhere near
               | the performance of a smartphone, and with very reduced
               | energy efficiency.
               | 
               | Raspberry Pi 3 was a Cortex-A53 backported to 40nm
               | outright.
               | 
               | Raspberry Pi 4 is on 28nm. (for comparison, Apple A8 and
               | Snapdragon 810 were on 20nm already, in 2014-15)
               | 
               | It runs with a Cortex-A72 at a low clock (1.5GHz, phones
               | are even at twice that clock nowadays) and quite high
               | power consumption because of the process node.
               | 
               | The memory interface is narrow (32-bit bus, phones ship
               | with a 64-bit bus and laptops/desktops with a 128b one)
               | at a low data rate (LPDDR4-3200).
               | 
               | In addition to that, the CPU can only use 5GB/sec of it,
               | with the remainder being reserved for the GPU only.
               | 
               | Those were some of the sacrifices needed to reach this
               | price point. (and why it isn't representative at all of
               | the performance of higher-end ARMs)
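               | 
               | A rough sketch of the bandwidth arithmetic behind those
               | numbers. The helper name is made up for illustration, and
               | the 64-bit and 128-bit rows assume the same LPDDR4-3200
               | data rate purely for comparison (real phones and laptops
               | often run faster memory). Peak bandwidth is bus width in
               | bytes times transfers per second:

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfers_mt_s: int) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    bus_width_bits: width of the memory interface in bits.
    transfers_mt_s: data rate in mega-transfers per second (3200 for LPDDR4-3200).
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_mt_s / 1000


# Raspberry Pi 4 as described in the comment: 32-bit bus, LPDDR4-3200.
pi4_peak = peak_bandwidth_gb_s(32, 3200)       # 12.8 GB/s

# Assumption: wider buses at the same data rate, for comparison only.
phone_peak = peak_bandwidth_gb_s(64, 3200)     # 25.6 GB/s
laptop_peak = peak_bandwidth_gb_s(128, 3200)   # 51.2 GB/s

# The comment says the CPU can use about 5 GB/s of the Pi 4's peak.
cpu_share = 5.0 / pi4_peak

print(f"Pi 4 peak: {pi4_peak:.1f} GB/s, CPU-usable share: {cpu_share:.0%}")
```

               | So by that arithmetic the CPU sees a bit under 40% of an
               | already narrow interface.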
        
               | JoshTriplett wrote:
               | The Raspberry Pi is also more solder-friendly, with GPIO
               | and similar. The Compute Stick makes sense if you're
               | going to connect to well-defined ports.
        
       | pjmlp wrote:
       | Nice to see more focus on C Machines with memory tagging.
        
       | ArkanExplorer wrote:
       | Will we see the next-generation of consoles running this instead
       | of x64?
        
         | jayd16 wrote:
         | Nintendo Switch is already Arm. If nVidia lands another
         | console, it would probably be Arm.
        
         | vbezhenar wrote:
         | Is there any public supplier of ARM chips comparable to x64?
         | Apple M1 is awesome, but Apple's not going to sell them.
        
           | mdasen wrote:
           | Basically, no. Amazon has their Graviton processors, but
           | they're not selling them. Nuvia's Phoenix processors might
           | have become that, but they've been bought by Qualcomm and
           | we'll see what happens.
           | 
           | Qualcomm is likely going to want to focus their talent on the
           | mobile market where they've historically been running only
           | slightly customized ARM designs. I think Qualcomm would like
           | to close the performance gap between it and Apple and fend
           | off MediaTek who is taking an increasing amount of
           | market share. As the US weans off CDMA, it's likely that
           | Samsung might end up using their own chips more. So Qualcomm
           | might want to focus Nuvia on mobile.
           | 
           | Qualcomm had tried its hand at Intel competitors, both on the
           | consumer and server side. They seem to have given up on that
           | for now.
           | 
           | Ampere is trying to get into the server space, and Anandtech
           | notes that they're competitive with the AMD EPYC Rome series
           | (https://www.anandtech.com/show/16315/the-ampere-altra-
           | review...). Oracle said they'd launch some in 2021, but who
           | knows how limited that will be or whether their plans will
           | change.
           | 
           | Amazon will want to use their own chips rather than pay a
           | third-party. More and more datacenter operations seem
           | concentrated in the big three providers who might not want to
           | let a new third-party get margin there (specifically, Amazon,
           | Google, and Microsoft). I can't imagine Google not going with
           | an in-house design if they wanted to launch an ARM platform.
           | 
           | Consumer devices are difficult. macOS won't work on non-Apple
           | hardware. The Windows ARM experience will be sub-par because
           | there's no dictator to force ARM on everyone (like Apple).
           | Apple can say, "we're moving to ARM" and developers either
           | get on board or are left behind. If Microsoft says, "we're
           | moving to ARM" it's more like, "we're going to add ARM
           | support, but we'll always be a first-class experience on
           | Intel and you can still expect to run Win16 apps from 1990 on
           | your new ARM computer and if developers and consumers don't
           | show interest, we're flexible and we'll pivot away from
           | ARM...so maybe don't buy an ARM machine right now because we
           | haven't been able to convince developers...and since you
           | won't buy the ARM machines, we'll probably just think it's a
           | flop in two years and put fewer resources towards ARM...so
           | there's no real reason for an ARM CPU company to want to make
           | good CPUs...which reinforces why consumers shouldn't buy
           | them..."
           | 
           | We are seeing movement in the space, but it's hard. I don't
           | think we'll see a lot of consumer stuff for Windows and Linux
           | comparable to Intel. I think it's just hard to break into
           | that space. With Linux, the market is small already. With
           | Windows, trying to convince consumers on a less-compatible
           | experience or an experience that Microsoft is less committed
           | to is hard. I think it's easier to compete against AMD and
           | Intel in the server market where so much software is already
           | CPU-independent and doesn't have the same reliance on
           | consumer software and compatibility. I think if you're making
           | consumer processors, you want to target Android and
           | Chromebook where you won't be dealing with convincing
           | consumers to select a lesser-compatible, lesser-supported
           | alternative to Wintel.
        
           | rwmj wrote:
           | There are loads of high performance Arm chips, but they're
           | pretty much all in the server space, ie power hungry. But
           | does any of that matter for a console? The Switch seems to be
           | phenomenally successful and yet is powered by a very modest
           | 64 bit Arm chip (4 x Cortex A57, an 8 year old
           | microarchitecture).
        
             | wk_end wrote:
             | Since the Wii Nintendo has been successful occupying a
             | different niche than Microsoft/Sony. The Xbox and the
             | Playstation both sell based on top-of-the-line (console)
             | performance; the Switch sells for other reasons.
             | 
             | I doubt either Microsoft or Sony are going to change tack
             | and try to fight Nintendo (and probably lose) on Nintendo's
             | home turf.
        
               | pjmlp wrote:
               | Exactly, Nintendo never went for latest-generation
               | hardware, they rather focus on gameplay.
        
         | wmf wrote:
         | If I were MS or Sony I'd choose the GPU first and take whatever
         | CPU comes with that GPU.
        
           | monocasa wrote:
           | Microsoft's public comments at Hot Chips and The Platform
           | Security Summit suggest that they did exactly that for the
           | past two generations.
        
           | ArkanExplorer wrote:
           | That might be an Nvidia GPU - with DLSS - but who would
           | manufacture the CPU?
           | 
           | Or, could we see Nvidia (since it has acquired ARM) jumping
           | into the console space?
           | 
           | A suite of three consoles ranging from mobile/portable, to
           | 1080p/TV, to 4K/Desktop quality could be very appealing,
           | especially if the top-end model could also use a mouse and
           | keyboard and dual-boot into ARM Windows.
           | 
           | Nvidia would just need to design a Linux game OS and
           | storefront. If they offered a 10% developer commission and
           | paid for some big exclusives, they could have a very
           | compelling product.
        
             | wmf wrote:
             | If you want an Nvidia GPU you might as well let Nvidia
             | design the whole SoC so they're going to use ARM cores that
             | they're familiar with. Xavier and Orin are designed for
             | cars but you can imagine how they could be modified into
             | console SoCs.
        
             | jayd16 wrote:
             | You should look at their Tegra/Shield offering as well as
             | the Switch. They're way ahead of you.
        
               | monocasa wrote:
               | Sort of. They appear to be in a bit of a holding pattern,
               | having not released a competitive SoC in that space for
               | some years.
               | 
               | The rumor is that the Switch contract only came because
               | Nvidia had a firesale on those older chips that they
               | expected to make their way into flagship Android
               | devices, but instead sat in inventory for years.
               | 
               | Maybe after the ARM acquisition goes through (if it
               | does), they'll start looking down that line again.
        
               | my123 wrote:
               | NVIDIA was involved in the Switch since before the
               | Tegra X1 was even announced. Those rumours aren't true.
               | 
               | (You can take a peek at LinkedIn for example, of former
               | Nintendo engineers, which makes the timeline more clear)
        
               | monocasa wrote:
               | Can you give some more pointers? There's a lot of former
               | Nintendo engineers.
               | 
               | I find it very difficult to believe that contrary to the
               | rumors, Nintendo had been sitting on a SoC for many years
               | without releasing a product or even pushing for a die
               | shrink. Like, the Tegra X1 was announced in 2014, and
               | released into products by 2015, and the Switch didn't
               | come out until 2017.
               | 
               | Turning off an entire core complex also points to it not
               | being designed for them. Nintendo isn't known for paying
               | for gates they aren't using.
        
               | my123 wrote:
               | For example:
               | 
               | https://www.linkedin.com/in/eyhchen from the NVIDIA side
               | 
               | "Gave a power consumption related demo to Nintendo team
               | during sales process" (for his Jul 2013-Dec 2014 period
               | of employment)
               | 
               | https://www.linkedin.com/in/gyferic from the Nintendo
               | side
               | 
               | "Benchmark parallel processing - OpenMP stress test on
               | SoC Nvidia Tegra X1" (for a Sept 2014-Mar 2015 period of
               | employment)
        
       | grishka wrote:
       | Oh yeah, security. Can't wait to see all the exciting new ways
       | the device manufacturers will employ all these features in a
       | user-hostile manner.
        
       | The_rationalist wrote:
       | When are commercial implementations of SVE coming?
        
         | my123 wrote:
         | For server, in Neoverse-V1 this year.
         | 
         | For client, stay tuned.
        
           | The_rationalist wrote:
           | Nice. OpenJDK has managed a breakthrough with their hardware
           | agnostic Vector API (the first of its kind?) so that every
           | SIMD algorithm coded with it will work equally well on ARM
           | (and also can target at runtime the best vector width if
           | available)
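           | 
           | To illustrate the "pick the vector width at runtime" idea, here
           | is a toy model in plain Python. It is not the JDK Vector API
           | or SVE itself, and the function name and `lanes` parameter
           | are invented for this sketch. The same loop runs unchanged for
           | any lane count, and a per-lane predicate masks off the tail
           | instead of needing a separate scalar cleanup loop:

```python
def vla_add(a, b, lanes):
    """Element-wise add written in a vector-length-agnostic style.

    `lanes` stands in for the hardware vector width, which the code only
    learns at run time. Each outer iteration models one vector operation.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if lanes < 1:
        raise ValueError("lanes must be positive")

    n = len(a)
    out = [0] * n
    for base in range(0, n, lanes):
        # Predicate: True for lanes that map to a valid element.
        # Only the final chunk can have inactive lanes.
        active = [base + j < n for j in range(lanes)]
        for j, lane_on in enumerate(active):
            if lane_on:
                out[base + j] = a[base + j] + b[base + j]
    return out


# The result does not depend on the lane count chosen at run time.
data = list(range(10))
ones = [1] * 10
results = [vla_add(data, ones, lanes) for lanes in (2, 4, 8, 16)]
```

           | Every entry in `results` is the same list, which is the
           | portability property being praised: the source does not
           | hard-code a width.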
        
             | iron2disulfide wrote:
             | TIL about JDK's vector API. That's awesome; I mostly work
             | in the C and C++ space and have learned to embrace the
             | gigantic ifdef's for various arch-specific vectorized
             | instructions.
        
             | nn3 wrote:
             | Or rather work equally badly. Not sure if I would describe
             | this as a breakthrough.
        
       | The_rationalist wrote:
       | What's really interesting about SVE is that it allows lower than
       | 128-bit parallelism, e.g. 64. I have seen mentions that some
       | algorithms show best performance with such values.
        
         | astrange wrote:
         | This is true for graphics/video codecs where you want to move
         | around pixels in blocks smaller than 128-bit. The MMX
         | instructions nobody likes in x86 are actually still pretty
         | useful here.
         | 
         | But you can do SIMD-in-GPR tricks, or dedicated hardware, or
         | GPGPU to replace that, so it's not a big problem if it's
         | missing.
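         | 
         | A minimal sketch of one such SIMD-in-GPR trick: adding eight
         | packed 8-bit lanes held in a single 64-bit word, with no carry
         | leaking between lanes. It uses Python integers masked to 64
         | bits to stand in for a general-purpose register; the helper
         | names are made up for illustration:

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
LOW7 = 0x7F7F7F7F7F7F7F7F   # low 7 bits of every byte lane
HIGH = 0x8080808080808080   # top bit of every byte lane


def swar_add_u8(x: int, y: int) -> int:
    """Add eight unsigned 8-bit lanes packed in 64-bit words (wrapping per lane)."""
    # Adding only the low 7 bits of each lane cannot carry into the next lane.
    low = (x & LOW7) + (y & LOW7)
    # The top bit of each lane is then fixed up with XOR (add without carry-out).
    return (low ^ ((x ^ y) & HIGH)) & MASK64


def pack(lanes):
    """Pack eight byte values into one 64-bit integer (little-endian)."""
    return int.from_bytes(bytes(lanes), "little")


def unpack(word):
    """Unpack a 64-bit integer into eight byte values (little-endian)."""
    return list(word.to_bytes(8, "little"))


a = [255, 1, 128, 127, 0, 200, 16, 99]
b = [1, 1, 128, 1, 0, 100, 240, 1]
packed_sum = unpack(swar_add_u8(pack(a), pack(b)))
lane_by_lane = [(p + q) & 0xFF for p, q in zip(a, b)]
```

         | `packed_sum` matches `lane_by_lane`: one pair of 64-bit adds and
         | a few logical ops do the work of eight byte-wide adds.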
        
           | Narishma wrote:
           | I think even the old ARM11 in the first Raspberry Pi
           | supported some SIMD-in-GPR features, though I'm not sure if
           | software took advantage of them.
        
       ___________________________________________________________________
       (page generated 2021-03-30 23:00 UTC)