[HN Gopher] AWS Graviton2
       ___________________________________________________________________
        
       AWS Graviton2
        
       Author : yarapavan
       Score  : 83 points
       Date   : 2020-01-25 17:56 UTC (5 hours ago)
        
 (HTM) web link (perspectives.mvdirona.com)
 (TXT) w3m dump (perspectives.mvdirona.com)
        
       | neonate wrote:
       | https://web.archive.org/web/20200125180037/https://perspecti...
        
       | miohtama wrote:
       | This is good news. Are Linux server distributions for ARM64s yet
       | on-par with their PC counterparts? Getting base layer software is
       | not going to be an issue?
        
         | QuinnyPig wrote:
         | Been using Ubuntu on one for a few weeks; the only things I
         | missed were a few Docker containers that weren't built, and
         | aws-vault didn't have an ARM binary. I built my own, and
         | aws-vault shipped a new release with ARM support 20 minutes
         | after I whined about it on Twitter.
         | 
         | Everything else has been flawless.
        
       | sdan wrote:
       | This is great news. Except that a ton of software doesn't support
       | ARM.
       | 
       | When I was trying to shift all my current infrastructure onto a
       | couple of RPis, many of the Docker containers didn't support ARM
       | (QEMU and buildx aren't reliable) and other software didn't
       | support ARM either.
       | 
       | Unless there's a good way to go from AMD64 (x86-64) to ARM, I'm
       | not entirely sure how great Graviton or other competitors will get.
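
A minimal sketch of the first step in handling mixed x86/ARM fleets: working out which container platform string applies to the current host. It only uses Python's standard `platform.machine()`; the alias table and the `docker_platform` helper are hypothetical names introduced for illustration, and the table is an assumption that covers only a few common machine names.

```python
import platform

# Hypothetical alias table (an assumption, not an exhaustive list):
# machine names as reported by the OS, mapped to Docker-style arch names.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
    "armv7l": "arm/v7",
}


def docker_platform(machine=None):
    """Return a 'linux/<arch>' string for the given or current machine."""
    # Fall back to the running host when no machine name is passed in.
    name = (machine or platform.machine()).lower()
    if name not in _ARCH_ALIASES:
        raise ValueError(f"unrecognised architecture: {name!r}")
    return "linux/" + _ARCH_ALIASES[name]
```

Calling `docker_platform()` with no argument inspects the running host; a 64-bit Raspberry Pi OS or a Graviton instance would typically report `aarch64`. An image that publishes no manifest for the returned platform is the case the comment above describes.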
        
         | JunkDNA wrote:
         | Back in the late '90s and early '00s, there were a ton of CPU
         | platforms around: SGI MIPS, DEC Alpha, Intel, Sun SPARC,
         | etc. While I will admit it was a colossal pain working
         | somewhere that had all of those, it was often possible to
         | recompile from source to get things to run. I'm not suggesting
         | it's trivial, but given the incredible investment in ARM in the
         | mobile space, the wind is at least at your back today. It
         | certainly has got to be much easier than it was in the days of
         | being the only person in the world trying to recompile an
         | obscure open source scientific computing package for DEC Alpha.
         | Commercial software is a different beast, but even there, the
         | incentive will be high to do a port if lots of people start
         | migrating to this.
        
       | gautamcgoel wrote:
       | This is great, but I'd be _really_ excited if we could go out and
       | buy the chips ourselves instead of having to pay the Amazon tax
       | and run our code on untrusted systems in the cloud. Of course,
       | Amazon has little incentive to sell the chips, since it gives
       | them a competitive advantage against other cloud providers.
        
       | otterley wrote:
       | (I work for AWS. Opinions are my own and not necessarily those of
       | my employer.)
       | 
       | I've been doing some initial M6g tests in my lab, and while I'm
       | not able to disclose benchmarks, I can say that my real-world
       | experience so far reflects what's been claimed elsewhere.
       | 
       | Graviton2 is going to be a game changer. It's not like the usual
       | experience with ARM where you have to trade off performance for
       | price, and decide whether migrating is worth the recompilation
       | effort. In my lab, performance of the workloads I've tried so far
       | is uniformly _better_ than on the equivalent M5 configuration
       | running on the Intel processor. You're not sacrificing anything
       | by running on Graviton2.
       | 
       | If your workloads are based on scripting languages, Java, or Go,
       | or you can recompile your C/C++ code, you're going to want to use
       | these instances if you can. The pricing is going to make it
       | irresistible. Basically, unless you're running COTS (commercial
       | off-the-shelf software), it's a no-brainer.
        
         | staticassertion wrote:
         | Would these chips be reasonable for something like an i3en
         | class?
        
         | m0zg wrote:
         | I feel like this is where hyperthreading is finally starting to
         | bite Intel in the rear. Cloud providers have been selling
         | "VCPUs" that aren't actual cores. I bet most customers don't
         | even know what they're buying. Even if ARM cores are slower
         | (and they don't really have to be), they're still going to be
         | faster than hyperthreads.
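
The vCPU point can be made concrete with a little arithmetic. This is a sketch under two assumptions: an SMT-enabled x86 instance exposes two hardware threads per core as vCPUs, while a Graviton2 vCPU is a full core (as the spec list quoted elsewhere in this thread states). The function name is hypothetical.

```python
def physical_cores(vcpus, threads_per_core):
    """Number of physical cores behind a given vCPU count."""
    if threads_per_core < 1:
        raise ValueError("threads_per_core must be at least 1")
    if vcpus % threads_per_core:
        raise ValueError("vcpus must be a multiple of threads_per_core")
    return vcpus // threads_per_core


# Same advertised vCPU count, different amount of silicon behind it.
smt_cores = physical_cores(16, threads_per_core=2)     # hyperthreaded x86
non_smt_cores = physical_cores(16, threads_per_core=1)  # one core per vCPU
```

This counts cores only; it says nothing about how fast each core is, which is the part the replies below dispute.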
        
           | lallysingh wrote:
           | I don't think so. Most apps get 0.5-0.6 instructions
           | completed per clock cycle, and Intel cores can put out
           | multiple instructions per cycle.
           | 
           | Most customers don't measure, and don't optimize cache usage
           | enough for the actual tradeoffs (cache, TLBs) to matter.
        
             | m0zg wrote:
             | Most, yes. But things like databases, video/data
             | compression, compute / deep learning workloads, etc, _are_
             | negatively affected by the fact that cores aren't really
             | cores. Basically anything that's actually using the CPU to
             | an appreciable extent will be affected by that. Add to that
             | the hyperthreading-specific CVEs as well.
        
         | vbezhenar wrote:
         | It is surprising, because I was under the impression that Java
         | has so many optimizations for x86 and ARM was so new that it's
         | almost impossible to beat x86 without very significant
         | investments. It's nice to hear that I was wrong.
        
           | Twirrim wrote:
           | ARM has been so dominant in the mobile market that there has
           | been a lot of effort around ARM optimisations, both in
           | compilers and interpreted languages. Way more so than
           | alternative architectures like MIPS etc.
        
           | pm215 wrote:
           | It's not very "new". People have been working on porting,
           | improving and optimizing software for the Arm server
           | ecosystem for more than a decade now; really performant and
           | widely available hardware may be new on the scene, but it
           | would be silly to wait for that before starting work on the
           | software side of things...
        
         | rbranson wrote:
         | I am extremely excited about this development. Less so getting
         | everything that assumes x86 in the stack supporting a
         | "parameterized" platform.
         | 
         | (Hi Michael!)
        
           | floatboth wrote:
           | heh, I've had a very fun experience with spotinst.com -- my
           | a1 spot instance went down and that service couldn't restore
           | it because it did not label the AMI as arm64. Reported that
           | to them, got a couple acknowledgements but haven't heard back
           | in a while, so presumably this is still not fixed.
        
           | otterley wrote:
           | Hi Rick! I know it's only a part of the toolchain needed to
           | support the migration, but multi-arch Docker image support
           | for Amazon ECR is definitely on our immediate roadmap.
        
         | QuinnyPig wrote:
         | I too have been playing with m6g, and while I'm allowed to
         | disclose benchmarks I haven't bothered to run any; the pedantry
         | that unleashes is enough to drive me to drink.
         | 
         | "Lies, damned lies, and benchmarks." I can say that the
         | qualitative experience is superb; everything "just works" as
         | you'd expect it to, and performance is stellar.
        
         | chroem- wrote:
         | Does the chip still manage to have superior power efficiency
         | versus x86 even at these performance levels?
        
           | chris_overseas wrote:
           | From the comments section of the article:
           | 
           | > Because there are so many power sensitive applications
           | where ARMs are used, much has been invested in power
           | minimization and management and they do very well. It's easy
           | to get remarkably better power consumption with an ARM part.
           | But, in this particular case, our focus was more on server-
           | side price/performance and, with that focus, our power
           | consumption isn't really materially better than the alternatives.
        
           | QuinnyPig wrote:
           | To my uninformed understanding, power efficiency wasn't
           | really the design goal. If it's AWS's power bill, do we care
           | as much (presuming sustainable energy)?
        
             | marcinzm wrote:
             | >If it's AWS's power bill, do we care as much (presuming
             | sustainable energy)?
             | 
             | I care because it's an interesting question regarding
             | future trends in technology. More broadly, people care
             | about a lot more than just what has direct short term
             | applications to their jobs.
        
               | QuinnyPig wrote:
               | Fair. Trouble is, we're never going to get Graviton2
               | devices of our own outside of an AWS datacenter or
               | device; the power profile is intellectually interesting,
               | but not likely to ever enter the public sphere.
        
               | marcinzm wrote:
               | If something is possible and economical then others will
               | copy it in time.
        
               | nine_k wrote:
               | I'm not sure why Amazon won't try to recoup some of the
               | investment by designing and marketing other boards /
               | devices with this chip, or by just selling it in
               | quantities to some manufacturers (think telecom, home
               | entertainment, industrial equipment, etc).
        
               | QuinnyPig wrote:
               | Based upon my experience, I'd say it's going to be
               | profitable in its own right just by powering EC2
               | instances. The same argument could be said to apply to
               | Apple's ARM chips.
        
             | Scaevolus wrote:
             | Power and heat dissipation are a huge part of TCO for
             | datacenters. I'm sure Amazon picked an appropriate point on
             | the performance per watt curve.
        
             | qeternity wrote:
             | Given that your cost is going to be correlated to AWS's
             | cost of goods sold, I'd argue of course we care.
        
               | QuinnyPig wrote:
               | I care what AWS charges me; I don't have the energy to
               | care what their underlying cost structure and its
               | constituent parts look like.
        
               | wmf wrote:
               | Indeed; we shouldn't care about AWS power efficiency but
               | we also shouldn't assume that it's bad just because we
               | can't see it.
        
           | ksec wrote:
           | Just a small reminder that this is comparing an x86 part on
           | Intel 14nm+++++ with an ARM part on TSMC 7nm, i.e. the power
           | efficiency would likely have absolutely nothing to do with
           | the ISA in use.
        
       | gok wrote:
       | There are some interesting implications for widely deployed
       | processors that are literally never publicly seen because they
       | spend their whole life in a highly locked down data center. I
       | wonder if things like Meltdown could have ever been discovered if
       | researchers could only poke at the chips via EC2.
        
         | QuinnyPig wrote:
         | I believe the "metal" variants expose the processor extensions
         | you'd need to discover Meltdown.
        
           | floatboth wrote:
           | Also, AWS processors use off-the-shelf Arm Cortex/Neoverse
           | cores, and stuff like Spectre is core-level.
        
           | ec109685 wrote:
           | That's a good point. And AWS freely allows this type of
           | security research as well:
           | https://twitter.com/TeriRadichel/status/1101228943128969218
        
       | yarapavan wrote:
       | * [James Hamilton] believes there is a high probability we are now
       | looking at what will become the first high volume ARM Server.
       | More speeds and feeds:
       | 
       |   - >30B transistors in 7nm process
       |   - 64KB icache, 64KB dcache, and 1MB L2 cache
       |   - 2TB/s internal, full-mesh fabric
       |   - Each vCPU is a full non-shared core (not SMT)
       |   - Dual SIMD pipelines/core including ML optimized int8 and fp16
       |   - Fully cache coherent L1 cache
       |   - 100% encrypted DRAM
       |   - 8 DRAM channels at 3200 MHz
       | 
       | * ARM Servers have been inevitable for a long time but it's great
       | to finally see them here and in customers' hands in large numbers.
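
As a rough worked example, the last item in the spec list implies a theoretical peak memory bandwidth. The sketch below assumes that "3200 MHz" means 3200 mega-transfers per second (as in DDR4-3200) and that each channel is 64 bits wide; neither detail is stated in the quoted list, and the function name is hypothetical.

```python
def peak_dram_bandwidth_gb_s(channels, mega_transfers_per_s, bus_width_bits=64):
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    # Each transfer moves one bus-width worth of data per channel.
    bytes_per_transfer = bus_width_bits / 8
    bytes_per_second = channels * mega_transfers_per_s * 1e6 * bytes_per_transfer
    return bytes_per_second / 1e9


# 8 channels at an assumed 3200 MT/s with an assumed 64-bit bus each.
graviton2_peak = peak_dram_bandwidth_gb_s(8, 3200)
```

Under those assumptions the result is about 204.8 GB/s per socket. That is an upper bound from the arithmetic alone, not a measured figure.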
        
         | kortilla wrote:
         | Is there somewhere to buy something like this for use at home?
        
           | downrightmike wrote:
           | Probably not until end of life for these chips
        
         | nine_k wrote:
         | What really stands out for me is "100% encrypted DRAM".
         | 
         | How efficient is this? Can different cores have different
         | encryption keys, so that different VMs under a hypervisor can't
         | benefit from breaking the hypervisor's protections?
        
           | praseodym wrote:
           | I can't speak for the Graviton2 CPUs, but AMD Epyc CPUs have
           | RAM encryption with per-VM keys for increased isolation:
           | https://developer.amd.com/sev/
        
       | Dunedan wrote:
       | > Here's comparative data between M6g and M5, the previous
       | generation instance type
       | 
       | Instead of comparing the 7nm Graviton2 processor against a 14nm
       | Intel processor, I'd like to see its performance compared to an
       | AMD Epyc 2 processor, which would be a more apples-to-apples
       | comparison as both are "7nm" parts. Unfortunately Epyc 2
       | processors aren't available from AWS yet (but are already
       | announced: https://aws.amazon.com/de/blogs/aws/in-the-works-new-
       | amd-pow...).
        
         | Rapzid wrote:
         | This is what I'm curious about, as Epyc currently represents the
         | state of the art in x86, right? What's the TCO comparison when
         | factoring in performance, density, and power?
        
       | pranith wrote:
       | I wonder when and how Azure and Google Cloud will compete with
       | AWS in this market.
       | 
       | They could buy ARM processors available in the market, but I
       | doubt they will be able to get them as cheaply as AWS, which
       | builds its own.
        
         | jacques_chester wrote:
         | I believe Amazon have subbed out manufacture to TSMC.
        
           | pranith wrote:
           | True, but the design is licensed from ARM and tuned to AWS's
           | requirements.
        
         | justicezyx wrote:
         | I led the support for multi arch images for Borg.
         | 
         | Google had the software stack ready for internal workloads a
         | long time ago; PowerPC was used.
         | 
         | https://www.forbes.com/sites/patrickmoorhead/2018/03/19/head...
        
           | pranith wrote:
           | I am sure the software is multi-arch ready but I wonder if
           | they are evaluating ARM servers either for use internally or
           | to launch in the cloud...
        
         | gundmc wrote:
         | Does Microsoft make any of their own silicon?
         | 
         | It feels like Google has been directing their in-house designs
         | on ML/TPUs while Amazon went all in on ARM. It will be
         | interesting to see how those bets pay off.
        
           | floatboth wrote:
           | No, but Microsoft bought some off the shelf Ampere and
           | Cavium/Marvell servers. But they keep them for internal use
           | only for now :(
           | 
           | Huawei makes their own silicon and servers with that silicon
           | -- also only internal, not available on huaweicloud :(
           | 
           | The only other player is Scaleway who bought first gen Cavium
           | ThunderX's way back when. And Packet of course but that's
           | bare metal only, no cheap small VPSes.
        
             | pranith wrote:
             | If they are good enough for internal use, they should be
             | good enough for public use. Not really sure what is
             | stopping them from exposing it to the public... I am sure
             | there is _some_ demand for it.
        
               | gundmc wrote:
               | There are a number of reasons not to launch as an
               | external cloud offering. A few:
               | 
               | - Reliability (performance and availability) could be
               | below Azure standards
               | 
               | - Supply chain maturity - they may have difficulty
               | scaling procurement and deployment to meet orders
               | 
               | - Lock in - major cloud providers typically provide
               | product guarantees with advance notice on the order of
               | years before a deprecation. It's a big commitment to
               | launch a product externally.
               | 
               | - Business case - maybe the TCO doesn't make sense when
               | compared with Azure's data on demand and price point
        
       ___________________________________________________________________
       (page generated 2020-01-25 23:00 UTC)