[HN Gopher] Blacksmith - Rowhammer bit flips on all DRAM devices...
       ___________________________________________________________________
        
       Blacksmith - Rowhammer bit flips on all DRAM devices today despite
       mitigations
        
       Author : buran77
       Score  : 488 points
       Date   : 2021-11-15 16:27 UTC (6 hours ago)
        
 (HTM) web link (comsec.ethz.ch)
 (TXT) w3m dump (comsec.ethz.ch)
        
       | HALtheWise wrote:
       | Maybe OS's or hypervisors should support a mode where different
       | processes/security domains are forced into contiguous blocks of
       | physical memory with buffer zones between. Especially for cloud
       | computing, pre-allocating memory to different VMs should be
       | reasonable. I could even see browsers taking advantage of it,
       | given that they already force javascript threads into per-domain
       | processes for Spectre/Meltdown protection.
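
A toy sketch of the buffer-zone idea above, assuming a deliberately simplified model in which physical memory is a flat sequence of DRAM rows and bit flips only reach rows within a fixed distance. `GUARD_ROWS`, `Region`, `allocate_domains` and `min_gap` are hypothetical names, and real DRAM address mapping (banks, channels, interleaving) is far messier than this.

```python
from dataclasses import dataclass

GUARD_ROWS = 2  # assumed "blast radius" in rows; a made-up value for illustration


@dataclass
class Region:
    domain: str
    first_row: int
    last_row: int  # inclusive


def allocate_domains(requests, total_rows, guard_rows=GUARD_ROWS):
    """Give each domain one contiguous run of rows, separated by unused guard rows."""
    regions = []
    next_row = 0
    for domain, rows_needed in requests:
        if regions:
            next_row += guard_rows  # leave a buffer after the previous domain
        last_row = next_row + rows_needed - 1
        if last_row >= total_rows:
            raise MemoryError(f"not enough rows for {domain}")
        regions.append(Region(domain, next_row, last_row))
        next_row = last_row + 1
    return regions


def min_gap(regions):
    """Smallest number of unused rows between neighbouring regions."""
    ordered = sorted(regions, key=lambda r: r.first_row)
    return min(b.first_row - a.last_row - 1 for a, b in zip(ordered, ordered[1:]))


layout = allocate_domains([("vm-a", 100), ("vm-b", 50), ("vm-c", 25)], total_rows=1024)
for region in layout:
    print(region)
print("smallest gap:", min_gap(layout))
```

The cost is visible in the model: every boundary between domains burns `guard_rows` rows that nobody can use, which is why pre-allocating a few large VM regions is a better fit than many small ones.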
        
       | javajosh wrote:
       | Um, how concerned should I be about this? Is it time to turn off
       | javascript in the browser? And if it is, is this not the End
       | Times for browser distributed software? I ask because sometimes
       | you need to ask if it's really the end of the world or just
       | Monday.
        
         | notyourday wrote:
         | > Um, how concerned should I be about this?
         | 
         | Not at all. It is another piece of Chicken Little security
         | theater that sounds sexy and sophisticated but in practice means
         | nothing.
         | 
         | If you are concerned about this issue you should first ensure
         | that you never type "npm install" or copy something off Stack
         | Overflow into your code.
        
           | acuozzo wrote:
           | > Not at all.
           | 
           | https://github.com/IAIK/rowhammerjs
           | 
           | This has existed for years and its use has been observed in
           | the wild. RowHammer isn't new.
        
         | Terr_ wrote:
         | > Is it time to turn off javascript in the browser?
         | 
         | I've been browsing with Javascript off-by-default for years and
         | recommend it regardless.
         | 
         | Sure, perhaps a site I trusted and whitelisted will try to hack
         | me, but I feel it's far less likely than some random typo-
         | squatting site or advertising-broker logic.
        
         | egberts1 wrote:
         | JavaScript, despite compiler hardening, is still susceptible to
         | SPECTRE-like actions. It's in the compilers' mitigation effort
         | and CPU memory controllers both.
         | 
         | Something about CPU firmware and its inability to fix some
         | aspect of memory bank controllers.
        
         | pas wrote:
         | It depends on how concerned you are about the regular security
         | issues. Which ideally depends on the things you are responsible
         | for. If it's your family photos in iCloud? Nah. If it's the
         | nuclear launch codes for the whole world? Then yeah, you should
         | already be isolating, separating, compartmentalizing...
        
           | myself248 wrote:
           | > photos in iCloud?
           | 
           | I mean, the last fappening was done by phishing, as I recall.
           | Are you saying another one, done by rowhammer, wouldn't be a
           | problem?
           | 
           | I suspect most folks have stuff they'd rather not make
           | public. Credit card numbers, at the very least. This affects
           | us all.
        
             | pas wrote:
             | While there are clearly bad (and some interesting) ethical
             | aspects of the Fappening, it was not a big problem for
             | society. Every day many people experience the same violation
             | of their privacy (revenge porn sites, accidentally sending
             | nudes to the wrong address, etc), and life goes on. (And as
             | far as I know none of the affected celebs have taken
             | drastic measures, while unfortunately this cannot be said
             | for non-celebs, as cyberbullying regularly gets people to
             | attempt suicide with "success".)
             | 
             | Similarly, day-to-day ransomware attacks probably cost more
             | than securing browsers will.
        
           | javajosh wrote:
           | You don't have to be in charge of launch codes to worry about
           | this one. I mean, if anyone can 'npm install rowhammer' and
           | reliably attack anyone that visits their site, even in a
           | modern browser, then it's time to turn js off. I'm okay if
           | there are some minor or even moderate XSS exploits out there;
           | but a hardware vuln accessible to javascript means javascript
           | goes bye bye for now, at least from my local machine's
           | browser.
        
             | pas wrote:
             | If it comes to that you'll hear about it on HN first. If
             | folks start to exploit it you'll hear about it.
             | 
             | If it comes you'll hear about "apps" that freeze every
             | other browser process while you do online banking.
             | 
             | That said, proactively blocking JS is great for many
             | reasons, and you can whitelist sites you sort of trust.
        
               | ziddoap wrote:
               | >If it comes to that you'll hear about it on HN first. If
               | folks start to exploit it you'll hear about it.
               | 
               | This is a pretty laissez-faire attitude with regards to
               | security-critical exploits. _You_ might be one of the first
               | ones affected. After all, _someone_ has to be the first one
               | before they can post it on HN.
               | 
               | In a more general sense, it's a bad idea to rely on
               | someone else as your primary alert. Whether it is an HN
               | post, news article, whatever.
               | 
               | (I know that you aren't quite saying to use it as a
               | primary alert, but a cursory read of your comment implies
               | it, so it's worth discussing.)
        
               | scandinavian wrote:
               | > You might be one of the first ones affected
               | 
               | If you are regularly the first target of zero-days, your
               | threat model should be stricter, yes. The other 7.9
               | billion people in the world can afford to be a little
               | less strict.
        
       | tambourine_man wrote:
       | This kind of thing depresses me. We're building our
       | infrastructure on quicksand.
        
       | sroussey wrote:
       | I guess another plus for DDR5, which has some ECC built in,
       | including in the non-ECC versions.
        
         | temac wrote:
         | The internal ECC in basic DDR5 is intended to provide a
         | reliability just good enough to be at the level of DDR4 without
         | ECC (or maybe slightly above, but probably not a lot?), and
         | likely the error rate would just be crazy high without any DDR5
         | internal "ECC" at all -- even without a rowhammer pattern or
         | cosmic rays, etc. I'm not sure it will help against rowhammer
         | (and variants)?
        
           | sroussey wrote:
           | Agreed, though did the spec for ddr5 finalize after row-
           | hammer? Seems like an opportunity to adjust for attacks.
           | 
           | They should do another set of tests on DDR5 and compare to
           | DDR4.
        
           | egberts1 wrote:
           | Original author already said DDR5 ECC, once mapped, will too
           | suffer SPECTRE.
        
             | AaronFriel wrote:
             | SPECTRE is a CPU vulnerability in the speculative execution
             | that modern processors perform to create a side channel to
             | leak data through.
             | 
             | Could you elaborate on what you mean by DDR5 "suffering"
             | from SPECTRE? I believe that vulnerability is memory
             | technology agnostic and theoretically would work if you
             | could boot a CPU absent any memory at all, as the leaks
             | occur through CPU caches.
        
               | egberts1 wrote:
               | CPU? More like memory controller and their capacitance
               | banks.
        
       | patryn20 wrote:
       | All these new hardware level physics exploits in software
       | fascinate me. At the same time they make me wonder if the
       | hardware will ever truly be able to be secure and perhaps we need
       | to just move on to new methods and concepts in hardware design
       | and maybe trade size for security.
       | 
       | It also brings to mind that Simpsons scene: "Stop stop! He's
       | already dead!"
        
         | guerrilla wrote:
         | How curious are you? Here's a whole class on exactly that from
         | the authors of the OP [1] at ETH Zurich.
         | 
         | 1.
         | https://www.youtube.com/watch?v=AJBmIaUneB0&list=PL5Q2soXY2Z...
        
           | mhh__ wrote:
           | Onur Mutlu is a really good lecturer, to my eyes at least.
        
             | guerrilla wrote:
             | Some of his TAs are really funny too :)
        
         | cout wrote:
         | It also makes me think of the phonograph in Godel Escher Bach.
         | One character made a record player and the other made a record
         | that always caused the record player to break. The crux was
         | that a record player cannot be built that can play all records;
         | it is impossible.
        
         | kaba0 wrote:
         | I think moving such complexity to hardware was a mistake (like
         | branch predictor, etc). Perhaps exposing a very low level API
         | to CPU functionality (like microcode level) and JIT compiling
         | existing x86 to that could perhaps work? We have enough
         | problems managing complexity in software but at least it is
         | fixable there.
         | 
         | (As for the possibility of such, there is already an x86 to x86
         | JIT compiler that increases performance)
        
           | notriddle wrote:
           | What does any of this stuff have to do with Rowhammer, an
           | attack that doesn't even target the CPU, which just so
           | happens to have ECC on all its registers and cache lines?
        
             | kaba0 wrote:
             | Parent was talking about modern hardware in general, and
             | it's not like CPUs don't get their own fair share of
             | similar vulnerabilities. Also, a JIT compiler can probably
             | realize that the same memory region gets rowhammered and
             | may throttle execution of such thread.
        
         | thinkingemote wrote:
         | I fully expect Apple's M1 chip to get a Spectre-type
         | vulnerability at some point and then any improvements it had
         | will disappear.
        
           | mhh__ wrote:
           | M1 is vulnerable to Spectre (some variants at least) as far
           | as I'm aware.
           | 
           |  _Any_ speculation around memory accesses will yield this.
        
             | zsmi wrote:
             | That's why the solution is to only allow speculation when
             | it's "ok". M1 is believed to have a number of Spectre-esque
             | security mechanisms built into it to determine just that.
             | For example:
             | https://patents.google.com/patent/US20200192673A1
             | 
             | Also, in dougallj's code [1] the zeroing of registers should
             | be superfluous so it is assumed the function below is
             | needed to make the experiments run stably by claiming
             | ownership of the registers as part of a general anti-
             | speculation security mechanism.
             | 
             | static int add_prep(uint32_t *ibuf, int instr_type);
             | 
             | The M1 explainer [2] has a lot of interesting ideas like this
             | contained inside it.
             | 
             | [1] https://gist.github.com/dougallj/5bafb113492047c865c0c8
             | cfbc9...
             | 
             | [2] https://news.ycombinator.com/item?id=28549954
        
       | buran77 wrote:
       | I think the scary part is this one:
       | 
       | > "DDR4 systems with ECC will likely be more exploitable, after
       | reverse-engineering the ECC functions," researchers Razavi and
       | Jattke said.
        
         | cjsplat wrote:
         | I believe that is saying understanding ECC function details
         | makes it easier to exploit ECC devices, _not_ that it will be
         | easier than exploiting non-ECC devices.
         | 
         | The ECC code word is bigger so it is a larger target, but you
         | have to flip multiple bits to cause pain. If you have 2-bit
         | detection, you need to flip three bits to get something that
         | corrects to a different value.
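
The three-bit point can be tried on a toy code. The sketch below uses an extended Hamming (8,4) code, which corrects one flipped bit and detects two (SECDED). It is an assumption chosen for illustration only: ECC DIMMs use much wider code words, and the real ECC functions are undocumented, which is why the researchers talk about reverse-engineering them. `encode`, `decode` and `flip` are hypothetical helper names.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) code word."""
    d = [(nibble >> i) & 1 for i in range(4)]
    word = [0] * 8                      # index 0 holds the overall parity bit
    word[3], word[5], word[6], word[7] = d
    word[1] = word[3] ^ word[5] ^ word[7]   # parity over positions with bit 0 set
    word[2] = word[3] ^ word[6] ^ word[7]   # parity over positions with bit 1 set
    word[4] = word[5] ^ word[6] ^ word[7]   # parity over positions with bit 2 set
    word[0] = sum(word[1:]) % 2             # makes the total number of ones even
    return word


def decode(word):
    """Return (status, data); data is None when the error is only detected."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos
    odd_parity = sum(word) % 2
    if syndrome == 0 and not odd_parity:
        status = "ok"
    elif odd_parity:
        # Odd number of flips: the decoder assumes exactly one and repairs it.
        # A syndrome of 0 here means the overall parity bit itself is blamed.
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a non-zero syndrome: uncorrectable.
        return "detected", None
    data = word[3] | word[5] << 1 | word[6] << 2 | word[7] << 3
    return status, data


def flip(word, positions):
    """Return a copy of word with the given bit positions inverted."""
    out = list(word)
    for p in positions:
        out[p] ^= 1
    return out


codeword = encode(0b1011)
print(decode(flip(codeword, [5])))        # ('corrected', 11): one flip is repaired
print(decode(flip(codeword, [5, 6])))     # ('detected', None): two flips are caught
print(decode(flip(codeword, [3, 5, 6])))  # ('corrected', 12): three flips, wrong data
```

In the last case the decoder reports a successful correction while returning a different value, which is the silent failure an attacker would be aiming for.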
        
         | staticassertion wrote:
         | Where are you reading this quote? I'm seeing essentially the
         | opposite in the linked post.
         | 
         | > ECC cannot provide complete protection against Rowhammer but
         | makes exploitation harder.
         | 
         | edit: I looked at some linked papers and I see similar quotes,
         | though not that one.
         | 
         | edit2:
         | 
         | https://arstechnica.com/gadgets/2021/11/ddr4-memory-is-even-...
         | 
         | > "DDR4 systems with ECC will likely be more exploitable, after
         | reverse-engineering the ECC functions," researchers Razavi and
         | Jattke said.
         | 
         | OK, so not more exploitable relative to non-ECC RAM, just
         | relative to ECC RAM before the reverse engineering.
        
       | 015a wrote:
       | Eventually, it will be obvious that running shared workloads on a
       | single piece of physical hardware has fundamentally
       | unremediatable security implications. This slow recognition is
       | deeply perpendicular with how the current landscape of x86 chips
       | are both manufactured and priced, as well as how cloud providers
       | have structured billions of dollars in DC investment; in other
       | words, they'll down-play it. This will be a massive opportunity
       | for ARM & SoC manufacturers in the coming years, as it's far
       | better positioned to offer, for example, a single rack appliance
       | with 64 individual 2-core mini-computers at a price-point
       | competitive with a 128 core x86 rack, as one computer.
       | 
       | Computing moves in cycles:
       | 
       | - 2000s: gigahertz race on each core
       | 
       | - 2010s: increase core counts, multicore everything
       | 
       | - 2020s: back to core-efficiency and increasing per-core
       | performance. M1 is already leading this charge, but is obviously
       | a mismatch for a DC environment.
       | 
       | AMD and Intel need to adjust, or face extinction. It's not just
       | about pushing ultra-high per-core performance (they're both good
       | at this); it's about pushing for more efficiency, so per-blade
       | density in a DC can be pushed higher in the face of more, smaller
       | individual computers. If they don't evolve, AWS will for them
       | [1].
       | 
       | [1] https://aws.amazon.com/ec2/graviton/
        
         | short12 wrote:
         | AMD seems to be doing just fine. With a roadmap for the future.
         | They will be just fine
        
         | baq wrote:
         | AMD and Intel deliver CPUs tailor made for large cloud
         | customers already; those SKUs are not available via normal
         | channels.
        
           | dathinab wrote:
           | Their next gen "normal" server CPUs also have "cloud
           | optimized CPUs".
           | 
           | While this seems to be mainly about Power/Heat/Perf.
           | optimizing for typical "cloud" workloads it might also have
           | design decisions related to this problem. We will see, but
           | they look interesting anyway.
        
           | belter wrote:
           | Can you elaborate or provide a reference on how those CPUs
           | are different?
        
             | baq wrote:
             | not really, just heard about those. but it wouldn't really
             | be that much different than a customized SKU for a
             | playstation or an xbox.
        
               | belltaco wrote:
               | But what model number would they show up as in the OS
               | facing the users? From what I see most CPU names are the
               | ones available in the market.
        
             | ThrowawayR2 wrote:
             | A recent HN posting points to a news article about AMD's
             | customized processors for Facebook although it provides
             | little detail:
             | https://news.ycombinator.com/item?id=29204257
             | 
             | " _The custom Epyc 7003 part that Facebook has commissioned
             | from AMD has 36 cores out of its potential 64 cores
             | activated, and with a bunch of optimizations for Facebook
             | applications at the web, database, and inference tiers of
             | its application stack._ "
        
             | Syonyk wrote:
             | Anyone who would know the fine grained differences is
             | almost certainly NDA'd up quite tightly.
             | 
             | But it's likely things like different core counts,
             | frequency performance curves, perhaps more or different
             | memory/IO controller counts, perhaps instruction extensions
             | of some particular interest (I'd wager BF16 was available
             | in data center SKUs long before consumer SKUs), etc.
             | 
             | It's nothing radical, but if you're going to buy an awful
             | lot of them and want something slightly custom, Intel will
             | definitely do that for you.
        
               | javajosh wrote:
               | What kind of levers do big ISPs/cloud providers want over
               | their CPUs, other than the basic efficiency gains we all
               | want? Isn't it risky doing any customization since you
               | don't get commodity pricing?
        
               | swiftcoder wrote:
               | At AWS scale you can feasibly just buy up the entire run
               | of chips. What's a few million CPUs more or less between
               | friends?
        
               | mywittyname wrote:
               | Additionally, Amazon is absolutely the type of company to
               | create their own x86 processor if Intel/AMD are unwilling
               | to make some customization.
               | 
               | Intel does not want to be competing with an Amazon Basics
               | processor.
        
               | Grazester wrote:
               | You basically need a license for X86 from Intel for this.
               | Do you think x86 is open source?
        
               | to11mtm wrote:
               | > Isn't it risky doing any customization since you don't
               | get commodity pricing?
               | 
               | Depends on how many you order and how reasonable the
               | customization is. Like, what if I wanted 10,000 38-core
               | Xeons with a 2.5GHz base clock and 3.5GHz turbo clock?
               | Such a part doesn't exist, yet sits smack dab between two
               | existing models, the Xeon Platinum 8368 and 8368Q.
               | Assuming yields are already good enough, that might be
               | the sort of thing a CPU maker will do. (Might need more
               | than 10k units though, IDK.)
        
               | baq wrote:
               | I'd wager Pat doesn't get out of bed for 10k units and
               | neither does Lisa.
        
               | wolf550e wrote:
               | It's unlikely to be a different die (silicon chip), so the
               | usual number of cores and usual size of caches, and it's
               | unlikely that AWS gets a secret x86 instruction
               | unlockable via msr or something like that. It's probably
               | just constants for the frequency scaling (e.g. all cores
               | and single core turbo).
               | 
               | Cloudflare mentioned they use a custom Intel SKU on their
               | blog, and they're not as big as AWS.
        
               | ghbe44 wrote:
               | I'm familiar with Google's custom SKUs, and I assure you
               | they're unique enough to experience microcode bugs nobody
               | else on the planet ever hits. There was a weird FDIV one
               | back in the day that wasn't the same FDIV bug that hit us
               | all a long time ago. Cloudflare isn't really at that
               | customization table.
               | 
               | Your take sounds accurate for non-FAANGs (and a couple of
               | those). For Google, not so much. There is a lot of custom
               | acceleration and such that isn't available to any other
               | customer, and a nearby commenter is right, every detail
               | about them is locked behind a very strong NDA. I've heard
               | thirdhand AWS does indeed have custom instructions in the
               | virtualization extensions stuff, but don't know if that's
               | true.
        
               | monocasa wrote:
               | My understanding is that Google's SKUs are the same die.
               | Non Google versions just have the Google specific silicon
               | fused off, or require a MSR knock sequence to turn on.
        
               | ghbe44 wrote:
               | > I'd wager BF16 was available in data center SKUs long
               | before consumer SKUs
               | 
               | Right you are, Ken.
        
         | [deleted]
        
         | passivate wrote:
         | But you still need data exchange between parallel/concurrent
         | workloads. The security focus will shift to the data nodes
         | which exchange data between the CPUs. And then the focus will
         | probably shift towards latency and performance and moving these
         | data modules closer to each other.. kinda like RAM :P
        
         | froh wrote:
         | IBM Z series silicon (z architecture and its predecessors
         | s390,...) is designed with multi tenancy in mind from the get
         | go. Finding a way to escape virtualization let alone
         | partitioning to access confidential competitor data was a no
         | go.
         | 
         | And indeed to my understanding spectre, meltdown, rowhammer and
         | similar attacks are not an issue there.
         | 
         | https://en.wikipedia.org/wiki/Z/Architecture
         | 
         | I wonder when more features from the mainframe cross pollinate
         | Intel, AMD, and Arm CPU architectures.
        
           | monocasa wrote:
           | Z Series is very much susceptible to meltdown, spectre, and
           | rowhammer. IBM says that it should be fine because Spectre et
           | al need to be running untrusted workloads to work, but they
           | haven't updated their advisories since NetSpectre. : /
           | 
           | A lot of the talk of mainframe levels of security is specious
           | at best.
        
             | froh wrote:
             | Wow. You're right, for Spectre, patches were needed on System
             | z.
             | 
             | https://www.suse.com/de-de/support/kb/doc/?id=000019105
             | 
             | Meltdown didn't affect z nor amd.
        
               | monocasa wrote:
               | Series Z and POWER up to and including POWER9 are
               | susceptible to meltdown as well.
               | 
               | https://www.zdnet.com/article/meltdown-spectre-ibm-preps-
               | fir...
        
               | classichasclass wrote:
               | But, not 7400 G4 and earlier.
               | 
               | https://tenfourfox.blogspot.com/2018/01/actual-field-
               | testing...
        
               | monocasa wrote:
               | Which aren't POWER cores. POWER is an IBM trademark; the
               | 7400 is a Motorola PowerPC core.
        
               | classichasclass wrote:
               | A fair point, though early POWER cores probably aren't
               | for the same reasons these aren't (can't speculate
               | through indirect branches using SPRs), and IBM was
               | involved in the development of both.
        
         | throwaway894345 wrote:
         | > Eventually, it will be obvious that running shared workloads
         | on a single piece of physical hardware has fundamentally
         | unremediatable security implications.
         | 
         | If I understand correctly, Homomorphic Encryption aims to solve
         | for these kinds of attacks (although presumably the
         | computations are more expensive and programs must be
         | restructured to use HE primitives?).
         | https://en.wikipedia.org/wiki/Homomorphic_encryption
         | 
         | EDIT: Why the downvotes? Am I mistaken?
        
           | teddyh wrote:
           | HE is _stupidly_ inefficient, and is entirely an academic
           | oddity, and there is every indication that it will always
           | remain that way. Bringing up HE in a context of real problems
           | needing real solutions is unproductive.
        
             | ghaff wrote:
             | If solving a problem in a stupidly inefficient way is the
             | only way of mitigating said problem, you may not have much
             | choice--at least for some use cases. Saying something
             | _can't_ be an answer because it isn't an efficient answer
             | (today) is also unproductive.
        
           | pbronez wrote:
           | Yeah HE is a totally different security model. It moves all
           | the trust out of the hardware. You ship encrypted workloads
           | to an untrusted party, who computes on the encrypted data and
           | returns an encrypted result. Then you decrypt the result to
           | use it.
           | 
           | And yeah, this is just as inefficient as you'd think.
        
             | rocqua wrote:
             | I'd say it's much more inefficient than you'd think.
             | 
             | Firstly, the computations on encrypted data just take a LOT
             | longer, especially for fully homomorphic encryption, with a
             | single 64-bit addition taking microseconds.
             | 
             | Secondly, the FHE code cannot branch based on data. If it
             | could, it would know something about the data, and it
             | wouldn't be proper encryption. This means an if statement
             | becomes "calculate both branches, and throw away the result
             | you don't need". Similarly, a for loop becomes "give me an
             | upper bound of how long this will run", then "loop for
             | exactly the upper bound number of times, if the loop is
             | 'done' early, just throw away the result of the remaining
             | operations".
             | 
             | FHE is really cool, and has its uses in situations where
             | you want to cooperate without needing to trust, but it is
             | stupidly inefficient. (Things get really cool if two
             | parties want to cooperate and the computing party can e.g.
             | branch based on their local unencrypted data)
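
The "calculate both branches" and "loop to the upper bound" rewrites described above can be sketched with ordinary integers standing in for ciphertexts. This is only an illustration of the control-flow shape, not an FHE library: `select`, `oblivious_abs` and `oblivious_bit_length` are hypothetical names, and the `int(x < 0)` style comparisons stand in for what would be an encrypted 0/1 flag in a real scheme.

```python
def select(cond, if_true, if_false):
    """Arithmetic multiplexer. cond must be 0 or 1; there is no branch on it."""
    return cond * if_true + (1 - cond) * if_false


def oblivious_abs(x):
    """An 'if x < 0: return -x else: return x' with both branches evaluated."""
    is_negative = int(x < 0)           # stand-in for an encrypted comparison bit
    return select(is_negative, -x, x)  # one result is computed and thrown away


def oblivious_bit_length(x, max_bits=64):
    """A 'while x > 0: x >>= 1; count += 1' loop, run for a fixed upper bound."""
    count = 0
    for _ in range(max_bits):          # trip count does not depend on x
        still_running = int(x > 0)     # stand-in for an encrypted flag
        count = count + still_running  # adds 0 once the loop is logically done
        x = select(still_running, x >> 1, x)
    return count


print(oblivious_abs(-7), oblivious_abs(3))
print(oblivious_bit_length(5), oblivious_bit_length(1000))
```

Note that `oblivious_bit_length(5)` still performs all 64 iterations even though the answer is settled after three, which is exactly the overhead being described.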
        
           | simiones wrote:
           | HE is incredibly slow, and unlikely to ever be fast enough
           | for common workflows. As in, an HE implementation of an
           | algorithm that runs in seconds in Python on a regular machine
           | might run in minutes or tens of minutes.
           | 
           | Not to mention, to avoid side-channel attacks, an HE scheme
           | still needs to always run the longest possible sequence of
           | operations regardless of input (otherwise, information about
           | the input data is leaked through timing). So an HE version of
           | a quick-sort scheme would always run in O(n^2), otherwise it
           | would leak details about the contents of the list. In some
           | cases it would even have to run in the same amount of time
           | regardless of the _size_ of the list, to avoid leaking
           | information about that.
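
A minimal sketch of what a sort with an input-independent sequence of operations can look like, using odd-even transposition sort instead of quicksort. Plain integers again stand in for encrypted values, `compare_exchange` and `oblivious_sort` are hypothetical names, and the choice of this particular sorting network is an assumption made for brevity rather than something taken from any HE scheme.

```python
def compare_exchange(a, b):
    """Return (min, max) of a and b without branching on the comparison."""
    swap = int(a > b)                 # stand-in for an encrypted comparison bit
    lo = swap * b + (1 - swap) * a
    hi = swap * a + (1 - swap) * b
    return lo, hi


def oblivious_sort(values):
    """Odd-even transposition sort.

    Which positions get compared, and how many comparisons happen, depends
    only on len(values), never on the contents.
    """
    items = list(values)
    n = len(items)
    comparisons = 0
    for round_number in range(n):                 # always n rounds
        for i in range(round_number % 2, n - 1, 2):
            items[i], items[i + 1] = compare_exchange(items[i], items[i + 1])
            comparisons += 1
    return items, comparisons


shuffled, work_shuffled = oblivious_sort([5, 1, 4, 2, 3])
already, work_already = oblivious_sort([1, 2, 3, 4, 5])
print(shuffled, work_shuffled)
print(already, work_already)
```

Both calls do the same amount of work, so timing reveals nothing about the ordering of the input. The list length still leaks, which is the second point made above; hiding that would mean padding every list to a fixed maximum size.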
        
             | tomc1985 wrote:
             | Seems like it still might be useful for small bits of data
             | like session tokens or other encryption keys
        
             | Y_Y wrote:
             | To be fair, the performance problem you're talking about
             | should affect latency rather than throughput. If you can
             | batch lots of operations (not all controlled by the same
             | user) then you can do things as fast as you can without
             | leaking (much) information.
             | 
             | This is still phenomenally slow, of course.
        
         | landonxjames wrote:
         | "We have never shared two threads on a core between EC2
         | instances" -
         | https://www.youtube.com/watch?v=kQ4H6XO-iao&t=2485s
         | 
         | Interesting that AWS has been mitigating for side channel
         | attacks since before they became a big news item. Curious about
         | Azure and GCP's stance on this
        
           | mlyle wrote:
           | Maybe they were super smart and foresaw side-channel being
           | such a big problem.
           | 
           | Or, maybe, they just thought the lack of deterministic
           | performance created billing/accounting/customer service
           | problems. (One hyperthread can just about completely starve
           | the other in many circumstances).
        
             | thrashh wrote:
             | Well, you pick EC2 because you _want_ dedicated cores.
             | 
             | You pick a VPS because you want to save costs and share
             | cores.
             | 
             | So it's not so much AWS choosing as you the customer
             | choosing.
        
               | secondcoming wrote:
               | I was under the impression that AWS provides vCPUs, not
               | dedicated CPUs unless you go bare-metal?
        
               | bostik wrote:
               | You can get dedicated tenancy instances in AWS, for an
               | upfront monthly cost of ~$1.2k (per account) and about
               | +10% on top of your normal EC2 instance costs.
               | 
               | I ran the numbers back in 2015 and persuaded the previous
               | employer to go for dedicated tenancy with all
               | performance-critical and privacy-sensitive workloads. I
               | was effectively hedging against unknown but practically
               | guaranteed cross-VM attacks to pacify a paranoid
               | regulator.
               | 
               | Then Rowhammer happened. Less than a day later, our
               | contact with the regulator comes to me asking how it
               | affects us. Being able to answer - with absolute
               | confidence - that it did not, was one of the proudest
               | moments of my career. And the turnaround from "regulator
               | comes asking awkward questions" to "regulator is happy
               | and sees no reason to ask again" of less than 48 hours
               | must be some kind of record too.
        
               | vngzs wrote:
               | You get dedicated time on a core while your task is
               | running. For instance t3.medium is 2 vCPUs (because
               | hyperthreading) but as you can see in [0] it's only one
               | physical core.
               | 
               | [0]: https://aws.amazon.com/ec2/physicalcores/
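               | 
               | A rough sketch of that vCPU arithmetic, assuming 2 hardware
               | threads per core. The helper name `physical_cores` is made
               | up for illustration and is not an AWS API:

```python
import math


def physical_cores(vcpus: int, threads_per_core: int = 2) -> int:
    """Physical cores needed to back `vcpus` hardware threads.

    Assumption: each vCPU is one hardware thread and a core exposes
    `threads_per_core` of them (2 with hyperthreading, 1 without).
    """
    if vcpus <= 0 or threads_per_core <= 0:
        raise ValueError("vcpus and threads_per_core must be positive")
    return math.ceil(vcpus / threads_per_core)


# A 2-vCPU instance with hyperthreading sits on a single physical core.
print(physical_cores(2))
```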
        
             | dathinab wrote:
             | > Maybe they were super smart and foresaw side-channel
             | being such a big problem.
             | 
             | The looming threat of side-channel attacks on SMT systems
             | has been known since, well, before we had SMT systems
             | (because it can also apply to co-processors and non-SMT
             | multi-core systems).
             | 
             | The difference between back then and today is "we believe
             | it's possible but haven't found a way yet" versus "there are
             | multiple known ways", as well as it being widely known
             | instead of just in some communities.
             | 
             | The reason we still shipped the problematic CPUs is that
             | the improvement in performance, and with it competitiveness
             | and short-term revenue, was more important.
             | 
             | There also was a shift in what people expect from security
             | and which attack vectors are relevant. For example, user
             | applications a user installed were generally trusted as
             | much as the user, while today we increasingly move to not
             | trusting any applications even if they are installed by a
             | trusted user and produced by a trusted third party.
             | Similarly, running arbitrary untrusted native code from
             | multiple untrusted users while "upholding" side-channel
             | protection wasn't often an important requirement in the
             | past.
        
             | belter wrote:
             | Microsoft Research looked into it - Paper is from 2020 and
             | is reference 24 in the document mentioned in the main post
             | here.
             | 
             | "Are We Susceptible to Rowhammer? An End-to-End Methodology
             | for Cloud Providers"
             | 
             | https://arxiv.org/pdf/2003.04498.pdf
             | 
             | Although their answer in this paper was diplomatic, my
             | interpretation is that they confirm it as a problem. Their
             | conclusion was that it would not be as bad as it was
             | considered at the time. To be revisited in the context of
             | this more recent work.
             | 
             | Edit: Adding main reference
             | 
             | "BLACKSMITH: Scalable Rowhammering in the Frequency Domain"
             | https://comsec.ethz.ch/wp-content/files/blacksmith_sp22.pdf
        
           | discreteevent wrote:
           | That's ok unless you are running something virtual like
           | kubernetes on top of the EC2 instance but want to ensure
           | isolation between containers/pods.
        
             | my123 wrote:
             | That's insecure today anyway. Containers are not a security
             | boundary. (If you don't use a VMM like gVisor, Firecracker
             | or go the Drawbridge way)
        
           | KingMachiavelli wrote:
           | They offer vCPUs in multiples of 2 so it makes logical sense
           | to divide the resources that way; performance would be a lot
           | more unpredictable if you could be sharing a single core with
           | another EC2 user/instance.
        
           | kmeisthax wrote:
           | Viable side-channel attacks on SMT/Hyper-Threaded designs have
           | been known about since 2005, only a few years after Intel
           | brought their first SMT designs to market.
        
         | KingMachiavelli wrote:
         | I don't see how Graviton/custom ARM chips are evidence of this
         | predicted trend. ARM chips tend to have even higher thread
         | counts and poorer per-core performance. The biggest security
         | difference is the absence of SMT/hyper-threading.
         | 
         | I think it will come down to what you are willing to call
         | 'individual' processors. But actually having physically
         | distinct memory seems like a lot of overhead for attacks that
         | won't matter for 90% of users. Also I would think that the on-
         | board ECC of DDR5 would protect it against these types of
         | attacks.
        
           | 015a wrote:
           | Graviton is not a prediction of the trend; it's a signal that
           | Amazon is willing to make very deep investment into custom,
           | customer-facing hardware if Intel/AMD can't deliver what they
           | need.
           | 
           | The trend is yet to come. My statement is that, if AMD/Intel
           | don't adapt, Amazon has the hardware investment to leave
           | them behind, just like Apple did.
           | 
           | But to be clear on two points: They will probably adapt. And
           | Amazon/etc will probably never leave them behind fully. DCs,
           | especially public cloud, are not all-or-nothing like Apple's
           | Mac Lineup is.
           | 
           | Then the question follows, why would they want something
           | Intel/AMD isn't offering right now? The trend is System-on-
           | Chip. Beyond security (this isn't the last electrical
           | interference/speculative execution-like attack we'll see),
           | SoCs are easier to service (easier != cheaper. holistic
           | replacement versus per-component debugging. servers are
           | cattle, not pets). Denser. More vertically integrated.
           | Capable of far higher IO performance. Lots of benefits.
           | 
           | Mega-servers with 256 cores and 4 terabytes of memory still
           | have a huge place in all DCs; but not when multiple untrusted
           | workloads are running simultaneously. They're not for
           | EC2/Fargate/Lambda/etc; they're for S3. Highly managed,
           | trusted workloads.
        
         | Arrath wrote:
         | > This will be a massive opportunity for ARM & SoC
         | manufacturers in the coming years, as its far better positioned
         | to offer, for example, a single rack appliance with 64
         | individual 2-core mini-computers at a price-point competitive
         | with a 128 core x86 rack, as one computer.
         | 
         | I'm curious about the eventual end-game of security in this
         | space. Take the 64 individual processors in your example, give
         | each one their own independent memory bus to their own ram
         | chip, isolate them from each other as much as possible. What
         | else can be done, if a malicious process on processor Z has to
         | go all the way to disk to try to get back at data working on
         | processor J, is that as maximally secure as it can be without
         | being in a completely separate chassis with only network access
         | to the other device?
        
         | com2kid wrote:
         | > Eventually, it will be obvious that running shared workloads
         | on a single piece of physical hardware has fundamentally
         | unremediatable security implications.
         | 
         | Sure, but this is worse than that. This is "your online poker
         | game client can get access to your web browser's bank account
         | session info."
         | 
         | We need process isolation within a single machine, or else we
         | are kinda screwed as a field.
        
           | thrashh wrote:
           | IMO this is a perspective from software engineering.
           | 
           | But this is an electrical problem. Interference is a huge
           | issue with any engineering that involves physical things and
           | these kinds of attacks are just interference problems. This
           | issue is no different from a microwave knocking out your Wi-
           | Fi. These attacks have become possible because the acceptable
           | interference threshold that chip makers have been using has
           | turned out to be too low.
           | 
           | How do you fix interference problems? First, you choose a new
           | threshold of acceptable interference and then you engineer
           | better isolation, you lower density, and/or you switch
           | technology.
           | 
           | We could make shared computing completely safe tomorrow if we
           | wanted to, so I think calling this the end of shared computing
           | is quite alarmist. The issue is that we collectively want to
           | both have the cake and eat it too: we currently have a
           | certain cost-to-compute ratio that we have become accustomed
           | to and we don't want to compromise that. We're basically
           | buying time until we can invent a new technology that can
           | achieve the same density without the same level of
           | interference.
        
             | AnimalMuppet wrote:
             | "Lower density" is a _really_ hard sell at the moment...
        
             | yholio wrote:
             | In a world where insecure high density exists, secure low
             | density is at odds with cloud computing: the cloud makes
             | sense only if you can efficiently utilize your computing
             | power round the clock and make it cheaper than shipping
             | mostly idle terminals. If securing the cloud is expensive,
             | then there's a cutoff point where it's better to ship
             | highly dense, cheap terminals.
             | 
             | So maybe "screwed as a field" is not an exaggeration if the
             | field is butt computing.
        
           | jeremyjh wrote:
           | Right, we may be entering an era in which secure network
           | computing is impractical and the impact could be very far
           | reaching.
        
             | johnvaluk wrote:
             | I've been experimenting with isolating work activity from
             | personal activity. It's amazing how difficult it is to
             | prevent information from leaking between networks and
             | applications. It's hard to find alternative solutions for
             | entrenched convergence/convenience features like
             | copy/paste, messaging and entertainment. Working remotely
             | reveals too much about your personal and work relationships
             | to corporations and VPNs only help to connect more dots. I
             | can't curl up in a hammock with 3 laptops and a phone on a
             | nice day, so I keep returning to a single device that does
             | it all.
        
             | orlytho wrote:
             | Sorry, entering? Just like climate change some of us have
             | been warning that a lack of focus on and willingness to
             | challenge the fundamentals of building software and systems
             | would lead to non-securable computing in the general case.
             | That warning has been sounding since the 1990s. Nobody
             | cares. It took a meat plant and pipeline paying a ransom in
             | cryptocurrency for everyone to notice that we are
             | completely and irredeemably fucked as a computing species.
             | We are already there, my guy, and it's only a matter of
             | time now.
             | 
             | Think about someone trying to do basic ETL. Like having a
             | tabular file and summing it or something. Don't use Excel,
             | we say, stand up a $4 million Spark and AWS architecture
             | with seven hundred pitfalls that can let bored Russians
             | take over your whole network as if they were going to the
             | dry cleaners because remember, you just might be Google
             | someday. That's where we are. It's been a complete industry
             | failure for a decade and it's only getting worse.
             | Accelerating, even: now you need some operationally-
             | terrifying Kubernetes to even be at the table, and then as
             | an industry we (rightly) say running this stuff ourselves
             | is too hard, so pay Amazon to do it rather than even ponder
             | if we have settled on the right approach.
             | 
             | Tada: Humanity just lost computing to three companies. We
             | very likely aren't getting it back.
             | 
             | There are probably 5,000 people doing this work who can
             | adequately secure such a system and make it mostly
             | impermeable. Where middle computing is royally screwed is
             | that nearly all of them work in San Francisco or its clones
             | abroad. So then you get "best practice" blog posts and
             | industry think pieces and the lowest bidder ties them
             | together into something resembling a competent computing
             | system. That's been the state of the art since 2004
             | everywhere except Santa Clara county.
             | 
             | With the exception of some areas in the IC and DoD, I just
             | described the entirety of US government IT. That ETL
             | example? It's actually real and underpins a small part of
             | Medicare across several government contracts. Because the
             | tools the valley exports are all they've got, and we sure
             | love building systems with massive footguns, and then
             | shaming organizations publicly for missing item #543 on the
             | tribal "secure your computing system" checklist and
             | shooting themselves.
             | 
             | The entire industry must change, top to bottom, but just
             | like climate change, again, that's a nonstarter. Posix and
             | the Web are not the path forward and I hope I live to see
             | the industry figure that out. I'm increasingly skeptical.
             | The good news is my hometown might flood into the sea
             | first, sparing me from considering in my last moments that
             | every argument I've _ever_ made in this profession has
             | fallen on deaf ears and that everyone has to derive our
             | industry's peril from first principles for themselves.
        
               | jeremyjh wrote:
               | Right, on average industry has been failing forever. The
               | difference now is it might not be practical for _anyone_
               | to actually secure an internet server or web browser,
               | full stop. I think that is a fundamentally different
               | situation.
        
               | orlytho wrote:
               | I think we have been in that situation since everybody
               | started mimicking how Google does things
        
         | FpUser wrote:
         | >"as its far better positioned to offer, for example, a single
         | rack appliance with 64 individual 2-core mini-computers at a
         | price-point competitive with a 128 core x86 rack,"
         | 
         | I have a server application capable of utilizing many threads
         | and thousands of requests/s. You think I will deploy it on a
         | tiny 2-core CPU? No thank you, it currently runs on a powerful
         | dedicated server from Hetzner where I control everything.
         | 
         | >"AMD and Intel need to adjust, or face extinction ....
         | 
         | If they don't evolve, AWS will for them..."
         | 
         | Sounds like pontification / FUD.
        
         | temac wrote:
         | The M1 family certainly doesn't go in the direction of fewer
         | cores, and even if you had a single one you could probably
         | rowhammer during your timeslice and then patiently wait for the
         | target process to execute. Since even individual computers
         | execute random garbage code straight from the Internet (e.g.
         | JS), there is still a need for internal security.
         | 
         | That being said, and even if I consider that a quite different
         | subject, I agree that the current efficiency story of Intel is
         | not very good, but I hope they will improve in a not so far
         | future. The dev lifecycle of CPUs is quite long and it seems to
         | be an obvious target. I suspect they will be _forced_ to
         | improve their efficiency, because that's actually where the
         | performance potential is today (the current dissipation level
         | of their latest desktop CPUs is not reasonable, and prevents
         | scalability). And trying to lower the core count can also lead
         | to high consumption, e.g. if you want your performance back by
         | increasing the frequency. Wide and "slow" is needed, and it is
         | harder to increase the internal number of execution units per
         | core and have them actually used, than to increase the core
         | count -- plus ironically one way to do that is through HT,
         | which goes against your wish to share less hardware. (Now if
         | you compare their P-core and E-core in Alder Lake the story is
         | more complicated, but their marketing figures seem very
         | strange, so I won't conclude anything for now. The current
         | instances of P-cores we have are for that weird desktop market
         | with unreasonably high TDP anyway.)
         | 
         | Now if you really want miniaturized _individual_ computers that
         | would not be shared at all, I'm not sure the market will
         | actually go in that direction because big systems will
         | continue to be needed (and clusters are a niche mostly for
         | HPC), and I'm not sure a "slightly more secure on smaller
         | systems" market would be interesting enough. Esp. in an era of
         | chip shortage. And also because it would _still_ be bigger than
         | a shared equivalent system made with the same tech... But if
         | that's really a niche that has to be addressed, I suspect it
         | would mostly be a matter for Intel to create new small _and_
         | slower SKUs ("slower" compared to their desktop insanities)
         | -- they even kind of have that already, but yes the physical
         | miniaturization aspect is not handled yet -- that does not
         | really depend that much on the cores, though. And even in
         | those computers, I'm not sure there would be much demand for
         | very low core counts. The threading of pretty much all
         | workloads tends to increase, nowadays.
         | 
         | One last point, after the e.g. Pentium 4 fiasco nobody really
         | left the IPC race. AMD had some difficulties when trying
         | "weird" ideas (_part_ of which were because of their marketing
         | communication), and then again a completely new design from
         | scratch to market takes time. In general there was a pause of
         | performance growth around 2016 for a few years, and that was
         | mostly Intel having _process_ problems and the rest of the
         | industry catching up (and then overtaking them).
        
         | oopsyDoodl wrote:
         | Intel gets it and is adjusting to be a foundry that builds
         | chips to application spec.
         | 
         | For me cloud computing is just where the best pay is. I do not
         | at all see it as the future of computing.
         | 
         | One reason is ML will help us realize we write code we don't
         | need; so much of it is syntax sugar for business specific
         | needs; infra, security... it'll be realized cloud software is
         | solving unemployment not technical problems of value. That many
         | issues with software back in the day were lower quality
         | networks, and consumer hardware. I mean any phone can abstract
         | metadata from any one user's amount of behavior, we do it in a
         | DC because that's where the jobs are. Chip manufacturing will
         | include ML normalized logic for specific application.
         | 
         | LAN IOT will improve and we'll realize the Metaverse can be
         | implemented with a local client and AI generated art, on a
         | mobile GPU's power in a few years. Middle men like Zuckerberg
         | face the most uncertain future. He failed to diversify as well
         | as Bezos, Newell, and others.
         | 
         | IMO, Valve is a serious threat with Steamdeck; an open IOT
         | brain in a kid friendly form factor could be the new cigarette.
         | Even Apple may have to take them seriously. My kids' iPads need
         | replacing soon; a flat glass slab with no interactive controls,
         | requiring another $800+ machine to develop on, bloated
         | development tools, fees, and a bunch of cloud logins, are not
         | going to motivate kids to feel creative.
        
         | api wrote:
         | I am not at all convinced that this is not a solvable problem.
         | It may require significant changes in how schedulers work, such
         | as resurrecting the idea of processor affinity.
         | 
         | Unfortunately it will likely have negative performance
         | implications for multi-tenant workloads.
        
       | yread wrote:
       | Interesting that there is a lot of variation between the modules
       | - some get 1.1 million bit flips and some 14. Perhaps that was
       | the ECC?
        
       | egberts1 wrote:
       | Does not work on Intel Core Generation 1 (specifically Xeon
       | Westmere EP hex-core) with DDR3-ECC.
       | 
       | Perhaps I should start snapping them all up ... because market
       | demand.
        
         | pomian wrote:
         | Did you discover anywhere how vulnerable DDR3 RAM is yet? Both
         | ECC and regular? Since this is a hardware induced
         | vulnerability, maybe it doesn't exist?
        
           | egberts1 wrote:
           | And this Intel Xeon X5660 overclocked to 4.1GHz using
           | DDR3-1600 makes it the cheapest 6-CPU per chipset evah ... 7
           | years running (since 2015) and hopefully the safest.
           | 
           | https://overclock-then-game.com/index.php/benchmarks/1-x5660...
        
       | joebob42 wrote:
       | To my understanding it's still hard to exploit this to steal
       | information / break access / etc just because you'd need to know
       | where the right bits were. On the other hand, if all we want to
       | do is break our adversary's process / crash it / make it perform
       | arbitrary incorrect behavior, this seems substantially easier to
       | accomplish even if we have no idea at all which bits we are
       | flipping.
        
       | boibombeiro wrote:
       | Brainstorming some solutions.
       | 
       | Maybe a randomized algorithm for ECC. Every so often changes how
       | the ECC is computed?
       | 
       | The region nearby where privileged information is stored could
       | have a speed limit on multiple sequential writes?
       | 
       | Add blocks that are more susceptible to those attacks to do early
       | detection. A honeypot bit.
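       | 
       | A minimal sketch of the checking half of that honeypot idea,
       | assuming a canary buffer filled with a known pattern. The names
       | `make_canary` and `find_flips` are hypothetical, and plain Python
       | cannot place the buffer in physically adjacent DRAM rows, so this
       | only shows the detection logic:

```python
PATTERN = 0xAA  # assumed fill pattern (10101010), chosen for illustration


def make_canary(size: int) -> bytearray:
    """Allocate a buffer where every byte holds the known pattern."""
    return bytearray([PATTERN]) * size


def find_flips(canary: bytearray) -> list[tuple[int, int]]:
    """Return (byte offset, bit index) for every bit that differs from PATTERN."""
    flips = []
    for offset, byte in enumerate(canary):
        diff = byte ^ PATTERN
        for bit in range(8):
            if diff & (1 << bit):
                flips.append((offset, bit))
    return flips


canary = make_canary(4096)
canary[100] ^= 0x04  # simulate a single disturbance error in bit 2
print(find_flips(canary))  # [(100, 2)]
```

       | A real detector would have to poll often enough to notice flips
       | before the protected data is hit, which is its own cost.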
        
       | fbanon wrote:
       | Can't wait for the first large-scale exploit of this stuff, which
       | will finally force DRAM manufacturers to fix their faulty crap.
        
         | passivate wrote:
         | Why wish for exploits that harm regular users who have nothing
         | to do with design decisions made by DRAM manufacturers?
        
       | zw123456 wrote:
       | Does anyone know if SDRAM (SDR) is susceptible to this attack?
       | 
       | If you don't need speed but need security...
        
       | rkagerer wrote:
       | How far _back_ in time must one go to find main memory tech that
       | would be immune to this? (eg. SRAM, magnetic core, vacuum tubes?)
        
         | josteink wrote:
         | DDR3 supposedly. It's not packed as densely.
        
         | mdrzn wrote:
         | Somewhere they mention that even just DDR3 is immune from this,
         | because the chip is not "crammed" enough to be susceptible to
         | this attack.
        
       | zanethomas wrote:
       | Funny thing about that.
       | 
       | At Alpha Microsystems, about 1982, I was in charge of the
       | diagnostic programming group. At that time failing memory boards
       | were expensive and customers would not be happy with such
       | components.
       | 
       | I wrote a memory testing diagnostic that was based on knowing
       | exactly how addresses were mapped to cell locations so I could
       | try to provoke such failures.
       | 
       | Chip manufacturers were aware of this problem which is why they
       | scrambled the addresses.
       | 
       | Potential vendors, Motorola et al, were required to provide
       | mapping information before we would consider their chips.
       | 
       | Now I'm curious to know what such mapping looks like with modern
       | memory chips.
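       | 
       | For illustration only, here is a toy linear mapping from a
       | physical address to (bank, row, column) and back. The bit widths
       | are assumptions picked for the example; real memory controllers
       | and chips typically scramble or XOR address bits, which is
       | exactly the information a vendor would have to disclose:

```python
COL_BITS = 13   # assumed: 8 KiB of data per row
BANK_BITS = 4   # assumed: 16 banks


def decode(addr: int) -> tuple[int, int, int]:
    """Split an address into (bank, row, column) under the toy layout."""
    col = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = addr >> (COL_BITS + BANK_BITS)
    return bank, row, col


def encode(bank: int, row: int, col: int) -> int:
    """Inverse of decode()."""
    return (row << (COL_BITS + BANK_BITS)) | (bank << COL_BITS) | col


def neighbor_rows(addr: int) -> list[int]:
    """Addresses in the rows directly above and below, same bank and column."""
    bank, row, col = decode(addr)
    return [encode(bank, r, col) for r in (row - 1, row + 1) if r >= 0]


victim = encode(bank=3, row=100, col=5)
print([decode(a) for a in neighbor_rows(victim)])  # [(3, 99, 5), (3, 101, 5)]
```

       | A diagnostic that knows the true mapping can target the
       | neighbors of a given cell directly instead of guessing.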
        
         | classichasclass wrote:
         | Not related to article: would love to hear about your time at
         | Alpha Microsystems. See https://ampm.floodgap.com/ (hosted on
         | an Alpha Micro Eagle 300).
        
           | zanethomas wrote:
           | Here's a bit of history ... I think the lead up helps to
           | understand what I did first at AM. I'll send you an email
           | with more info later.
           | 
           | 1976-1985 My first job was at Basic Four Corporation. I got
           | in as a test technician, assembling small refrigerator-sized
           | mini-computers, giving them their first tests and swapping
           | components until they passed. I soon learned how to use the
           | machine-language assembler and started writing small programs
           | to help me determine which components were failing without
           | swapping and hoping. Within a few months I was testing more
           | than 2x the number of machines the other techs were testing.
           | Management noticed and soon the other techs were getting up
           | to speed.
           | 
           | At this point management pretty-much turned me loose. I moved
           | up to diagnosing and repairing failed components (8"x11"
           | pcbs). My understanding of programming and digital circuits
           | allowed me to write small programs that could be used to
           | "light up" specific circuits on the board making it easy to
           | poke around with a scope to see where things went wrong.
           | Again this was a huge productivity boost and the technique
           | was propagated to other techs. A couple years later I went
           | back for a while part-time as a consultant and wrote my first
           | DSL for techs to use.
           | 
           | Next I talked my way into the firmware development group and
           | worked on firmware for tape, disk and other devices. This is
           | the period of time when microprocessors were being
           | incorporated into everything and my experience with the
           | Micro-68 put me in a good position to participate. I also got
           | to write microcode for a 2901-based cpu that was in
           | development.
           | 
           | And then, somehow, perhaps at a user's group, I learned about
           | Alpha Microsystems.
           | 
           | When I first visited Alpha Microsystems their idea of
           | "burning in" pcbs consisted of putting them in a powered
           | backplane, in a wooden box, with a lightbulb, where they sat
           | for some period of time.
           | 
           | Basic Four had serious testing which included putting entire
           | computers into temperature-cycling ovens where they ran tests
           | for 24 hours. That knocked out a pretty high percentage of
           | boards. After I told them what they were missing Alpha
           | Microsystems hired me to improve their process.
           | 
           | For the next several years I participated in creating the
           | flow of production and testing. A department evolved to
           | handle the hardware side of things and I became head of a
           | diagnostic programming department which grew to, variously,
           | between 6 and 10 programmers. After that department was
           | functioning and had someone who could step up I transferred
           | into the operating systems group. I was one of only three
           | people allowed to work on the operating system code, let
           | alone even see it since it was held as a trade secret.
           | 
           | During my last year at Alpha Microsystems a brilliant
           | programmer I had hired introduced me to the then just-
           | released Structure and Interpretation of Computer
           | Programs, a new textbook for students at MIT. That book's
           | use of the language Scheme introduced me to first-class
           | functions, closures, and many other concepts which found
           | their way into popular programming languages decades later.
           | The SICP had all the information one needed to create a
           | Scheme interpreter. I wrote one using 68000 assembler so I
           | could run the sample code.
        
             | kawsper wrote:
             | Wauw, amazing! Thank you so much for sharing your story!
        
               | zanethomas wrote:
               | ha, thanks!
               | 
               | i love programming so much i've never quit
               | 
               | taught fullstack at ucla BC (before covid)
               | 
               | did a large frontend project with vue the past year
               | 
               | i'll die with a keyboard under my fingers!
        
       | pomian wrote:
       | It is very reassuring that there is such an agency as the
       | Computer Security Group (this article). Run by and funded
       | independently by a multi national science agency. Likely without
       | oversight by any industrial organisation (read lobby group.) It
       | would be nice to have similar scientific bodies for other
       | livelihood and security threats, such as health, and logistics,
       | most especially food, drugs and environmental issues (chemicals
       | for example.) Will this agency become corrupted as those have
       | been over time?
        
         | contidrift wrote:
         | For cosmetics: https://www.ewg.org/skindeep/
        
       | Dylan16807 wrote:
       | So it's another Target Row Refresh _bypass_.
       | 
       | Which is only possible because the DRAM has limited memory for
       | recently-accessed rows.
       | 
       | When is a company going to put out chips that have the access
       | count stored _inside_ the row? It's the most obvious way to do
       | it and makes this entire class of attack impossible.
       | 
       | Edit: Okay, reading the paper more apparently LPDDR5 has
       | something similar to this. Why is LPDDR so divergent from normal
       | DDR?
        
       | r00fus wrote:
       | Is this purely an x86 concern or does it affect non-x86 usages of
       | DRAM? ie, ARM cores, Apple Silicon, RISC-V, etc.
        
         | zekica wrote:
         | It affects all processors that use this type of DRAM. Apple
         | uses LPDDR5x and from what I've read, they don't have ECC, so
         | this attack should work fine on M1.
        
       | nickcw wrote:
       | From the article: > We demonstrate that it is possible to trigger
       | Rowhammer bit flips on all DRAM devices today despite deployed
       | mitigations on commodity off-the-shelf systems with little
       | effort.
       | 
       | The fact that user space code can cause bit flips in your RAM is
       | a hardware bug. I'd love to see this code in memory testers like
       | memtest86 so I could send the RAM back if it ever caught a
       | problem like this.
       | 
       | I guess it shows just how close to the edge of not working our
       | modern computing environment is.
        
         | temac wrote:
         | The problem is that right now, DDR4 devices that work correctly
         | in that regard do not exist. Likely the best mitigation for any
         | application that is even slightly sensitive is DDR4 with ECC
         | (even if it may not be enough, it is vastly better than nothing,
         | and not just because of Rowhammer).
         | 
         | And I have no idea if the internal "ECC" of standard DDR5 helps
         | or not. It is not intended for regular ECC level of reliability
         | anyway. (And I have seen discussion about likely bitflips
         | detected in crash dumps of M1 Pro devices)
         | 
         | So, as much as I would like to return defective devices, I
         | would probably be left with no computer, no smartphone, etc.
        
           | jandrese wrote:
           | Maybe a paranoid app could try to allocate the memory
           | adjacent to any sensitive bits as a buffer? That would be
           | pretty difficult to do but might be possible if you are
           | tremendously paranoid and have a good way to examine the
           | hardware.
        
             | to11mtm wrote:
             | I wonder how easy that would be to reason about, though. I
             | really don't know much about modern DRAM circuitry, but I
             | feel like it might be abstracted away to some level. There
             | are also BIOS-tweakable settings that might make a difference
             | (e.g. channel configuration and/or bank interleaving, if
             | that's still a thing).
             | 
             | IDK though. Maybe there's a way. I've worked on code that
             | uses padding around a data structure to ensure it has its
             | own cache line. Maybe if you allocate a large enough
             | contiguous block you'll be OK?
        
             | mnw21cam wrote:
             | You'd have to have the operating system's cooperation,
             | because it may be mapping individual 4k pages all over the
             | place, and it's the only thing that has a chance of knowing
             | how the memory is laid out on the actual chips.
        
               | AshamedCaptain wrote:
               | And the CPU and the chipset's cooperation. In many cases
               | it is even a trade secret how data is striped across the
               | different slots/banks.
        
               | xxpor wrote:
               | Doesn't the same thing apply to exploiting Rowhammer,
               | though? If you're after specific data, how would you know
               | which physical address to hammer, even if you knew the
               | physical address of the target data?
        
               | jandrese wrote:
               | Yes, but there is a difference between an attack being
               | run by a low level hardware hacker vs. the software being
               | run by average users. This is why I said it would be
               | hard, you would need to encode a lot of low level
               | hardware knowledge into your application.
               | 
               | This might only be useful in very specialized
               | circumstances, like with kernel support on only a handful
               | of carefully chosen hardware platforms. However, someone
               | like Apple could do this on their systems, as they
               | control both the kernel and the hardware. Sensitive
               | memory locations could be cordoned off in special zones
               | with buffering to prevent Rowhammer type attacks.
        
               | StillBored wrote:
               | > it is even a trade secret how data is striped across
               | the different slots/banks
               | 
               | If that is true, vs. just the usual "you don't need to
               | know" trash common these days, it's crazy. A pretty good
               | picture can be built up with just software timing
               | analysis, and it's hardly rocket science to hook a logic
               | analyzer/scope up and determine bit swizzling and
               | page/bank interleave. Plus, many of the more open BIOS
               | vendors have tended to provide page/row/controller/socket
               | interleave options for quite a while, partially because
               | it can mean a 5-10% uplift for some applications to have
               | it set one way or the other. It's been one of those "how
               | do you tune your memory system" options for a couple of
               | decades now.
        
               | [deleted]
        
             | staticassertion wrote:
             | If your app is particularly paranoid you could just double
             | your memory usage, and verify errors against the clone.
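
A minimal sketch of the "keep a clone and compare" idea above, in Python. The class name `MirroredBuffer` and the helper `flip_bit` are hypothetical names made up for this illustration. Two caveats: a two-copy scheme can only detect a mismatch, not tell which copy is right, and a high-level language gives no control over where the two copies land in physical DRAM.

```python
class MirroredBuffer:
    """Keeps two copies of the data and compares them on every read."""

    def __init__(self, data: bytes):
        self._primary = bytearray(data)
        self._shadow = bytearray(data)

    def write(self, data: bytes) -> None:
        # Every write goes to both copies.
        self._primary[:] = data
        self._shadow[:] = data

    def read(self) -> bytes:
        # A mismatch means at least one copy changed behind our back.
        if self._primary != self._shadow:
            raise RuntimeError("copies diverged: possible bit flip")
        return bytes(self._primary)


def flip_bit(buf: bytearray, bit_index: int) -> None:
    """Simulate a hardware bit flip (for demonstration only)."""
    buf[bit_index // 8] ^= 1 << (bit_index % 8)


if __name__ == "__main__":
    buf = MirroredBuffer(b"secret")
    print(buf.read())
    flip_bit(buf._primary, 3)  # pretend Rowhammer hit the primary copy
    try:
        buf.read()
    except RuntimeError as err:
        print("detected:", err)
```

A third copy with majority voting would allow recovery as well as detection, at three times the memory cost.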
        
             | pas wrote:
             | Or use different banks of DRAM for different security
             | contexts, managed via NUMA?
        
           | indymike wrote:
           | > The problem is right now DDR4 devices that work correctly
           | in that regard, do not exist.
           | 
           | Yes, and this is actually the problem. Without safe hardware,
           | it is almost impossible to write safe software.
        
             | userbinator wrote:
             | Meanwhile you can still buy DDR3 --- used --- from various
             | shops in China, and it'll be 100% perfect and Rowhammer-
             | free because it was made before the insane process
             | shrinkages that caused it.
             | 
             | (Have done this a few times. Was hesitant initially because
             | of the low price and used nature, but when the first stick
             | passed MemTest86+ including RH perfectly, I was convinced.
             | Half a dozen more sticks later and still good. But maybe
             | they've started to run out of the good old stuff now...)
        
           | oopsyDoodl wrote:
           | I think this illustrates the cloud is unacceptable for
           | anything more than storage and retrieval.
           | 
           | All computed results from data science must include steps and
           | code to verify locally.
           | 
           | It calls into question privacy on federated networks and
           | crypto networks; any node can be manipulated locally to
           | change payload outputs on delivery, reveal secrets, disrupt
           | workloads.
           | 
           | This makes sense to a lot of folks in computer engineering
           | and physics, versus abstract software. No physical theory I
           | know of offers any guarantee our arbitrary computing machines
           | will ever be securable. We put fart pipes on Hondas.
           | 
           | Science proves it's titillating smoke and mirrors once again.
           | Still waiting for nuclear rocket cars.
           | 
           | I think this proves further as well why general computing
           | chips need to be replaced with workload specific designs,
           | where the anticipated inputs are well known and no vague
           | logic paths to intentionally allow software monkey patching
           | ever ship.
        
             | cheschire wrote:
             | Security doesn't scale at a price point that private sector
             | companies could typically afford.
             | 
             | Perhaps we fail at pricing security into the value of a
             | company, or maybe that's what risk appetite is about.
        
           | myself248 wrote:
           | Yep, though I agree with the parent, if a RAM returns data
           | from a location that is different from what is stored in that
           | location, assuming all recommended timings have been
           | followed, then that RAM is defective. If that means they
           | should be recommending a refresh after every single
           | operation, then that's what it means.
           | 
           | In other words, the whole industry sits on a bed of lies at
           | this point and it's only because government is technically
           | incompetent that we haven't seen the world's biggest class-
           | action.
        
             | ygjb wrote:
             | > In other words, the whole industry sits on a bed of lies
             | at this point and it's only because government is
             | technically incompetent that we haven't seen the world's
             | biggest class-action.
             | 
             | That's a pretty bold claim - if it's accurate, can you
             | point at the government regulation that precludes a class
             | action lawsuit? I would assume it's something in the T&C or
             | EULA for the hardware rather than a regulation?
        
               | userbinator wrote:
               | Bold but sadly correct. The industry runs on misdirection
               | and deception.
               | 
               | If users knew better, we'd have the equivalent of another
               | Pentium FDIV bug.
        
           | buran77 wrote:
           | According to an older paper [0], ECC can also be bypassed after
           | reverse-engineering the mitigation in DDR3 DIMMs. Also:
           | 
           | > "DDR4 systems with ECC will likely be more exploitable,
           | after reverse-engineering the ECC functions," researchers
           | Razavi and Jattke said
           | 
           | > What if I have ECC-capable DIMMs? Previous work [1] showed
           | that due to the large number of bit flips in current DDR4
           | devices, ECC cannot provide complete protection against
           | Rowhammer but makes exploitation harder.
           | 
           | [0] https://cs.vu.nl/~lcr220/ecc/ecc-rh-paper-eccploit-press-
           | pre...
           | 
           | [1]
           | https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8835222
        
             | FpUser wrote:
             | While it cannot protect against bit flips, it can most likely
             | warn about an attack in progress due to the large number of
             | bit flips.
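
To make the ECC discussion above concrete, here is a toy single-error-correct, double-error-detect (SECDED) code in Python. It is an assumption-laden illustration: it uses an extended Hamming (8,4) code on 4 data bits, whereas real ECC DIMMs protect 64-bit words with wider codes, and the function names are invented for this sketch. It shows one flip being corrected, two flips being flagged (the "warning" case), and three flips being silently "corrected" into wrong data.

```python
def encode(data_bits):
    """Encode 4 data bits as an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit; indexes 1..7 are the classic
    Hamming positions (parity at 1, 2, 4; data at 3, 5, 6, 7).
    """
    d1, d2, d3, d4 = data_bits
    c = [0] * 8
    c[3], c[5], c[6], c[7] = d1, d2, d3, d4
    c[1] = d1 ^ d2 ^ d4
    c[2] = d1 ^ d3 ^ d4
    c[4] = d2 ^ d3 ^ d4
    c[0] = sum(c[1:]) % 2
    return c


def decode(word):
    """Return (status, data_bits) for a possibly corrupted codeword."""
    c = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if c[pos]:
            syndrome ^= pos
    overall = sum(c) % 2
    if syndrome == 0 and overall == 0:
        status = "clean"
    elif overall == 1:
        # Odd parity looks like a single flip; syndrome 0 means index 0.
        c[syndrome] ^= 1
        status = "corrected"
    else:
        status = "uncorrectable"  # even number of flips, nonzero syndrome
    return status, [c[3], c[5], c[6], c[7]]


def flip(word, positions):
    """Return a copy of the codeword with the given positions flipped."""
    out = list(word)
    for p in positions:
        out[p] ^= 1
    return out


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    cw = encode(data)
    for positions in ([], [5], [3, 5], [3, 5, 6]):
        status, decoded = decode(flip(cw, positions))
        print(len(positions), "flips:", status, "data intact:", decoded == data)
```

The last case is the worrying one: the decoder reports success while handing back corrupted data, which is why a large number of flips per word weakens ECC as a defense.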
        
         | jacquesm wrote:
         | You'll be sending back all of your RAM then.
        
         | jdmichal wrote:
         | > I'd love to see this code in memory testers like memtest86 so
         | I could send the RAM back if it ever caught a problem like
         | this.
         | 
         | Is this an individual stick bug, or a design bug akin to
         | Spectre? The fact that they tested 40 different sticks and
         | seemed to find effective patterns for all of them makes me
         | think it's the latter. In which case, just return all your
         | computers?
        
           | jandrese wrote:
           | Yeah, this is a problem with high density DRAM. Fixes are
           | possible but difficult and expensive. In the cutthroat DRAM
           | market nobody seems willing to put in the engineering effort
           | to develop a solution and then manufacture it. It would have
           | to sell at a premium, in a market where it's difficult to
           | even find ECC memory at times.
           | 
           | The article states that ECC isn't sufficient to solve
           | Rowhammer, but it does make the attack harder. If you are
           | looking for a hardware solution it is your only option
           | currently, even if it is imperfect.
        
             | R0b0t1 wrote:
             | If you detect flips you can send it back, it's violating
             | the advertised interface.
        
               | 0xffff2 wrote:
               | Sure, but if it's possible to cause bit flips in
               | basically all currently available components, then what's
               | the point? Return all of your RAM and have a non-
               | functional computer?
        
               | therein wrote:
               | Can't get bit flips on my RAM if I have no RAM.
        
               | ziddoap wrote:
               | So you just.. Send back all of your RAM all the time
               | because they are all susceptible to bitflips?
               | 
               | This is not a tenable solution.
        
             | seiferteric wrote:
             | I imagine you are right that consumers don't care or know
             | enough to put pressure on manufacturers, but what about big
             | companies like the FANGs? Or Government agencies like NSA
             | etc? What are they doing about it? I imagine intelligence
             | agencies are well aware of the issue and have or are trying
             | to get solutions to protect their own infra.
        
               | codetrotter wrote:
               | Off topic but shouldn't FAANG really be referred to as
               | MAAAN now?
               | 
               | Google -> Alphabet
               | 
               | Facebook -> Meta
               | 
               | Meta, Apple, Amazon, Alphabet, Netflix. MAAAN.
        
               | tester756 wrote:
               | I like MAGMA more
        
               | festkal wrote:
               | MANGA
               | 
               | M - Meta A - Apple N - Netflix G - Google A - Amazon
        
               | gavinray wrote:
               | I'm now adopting this, thanks
        
               | somethingwitty1 wrote:
               | Google still exists. Facebook still exists. Meta/Alphabet
               | are mainly holding companies. For example, I received
               | reach outs from recruiters for Google and Facebook today.
               | Not from Meta and Alphabet. Maybe once Meta and Alphabet
               | deliver on things and build brand recognition, it should
               | change. Until then, I'd vote "no".
        
               | vidarh wrote:
               | If you're doing that: M3AN.
        
               | gjs278 wrote:
               | faang is a stupid fucking acronym, just kill it and refer
               | to them as nothing
        
               | daniel-thompson wrote:
               | AMANAM
               | 
               | Apple Microsoft Amazon Netflix Alphabet Meta
        
           | dboreham wrote:
           | It happened because at some point in the past (around 10
           | years ago I believe), the memory vendors decided that it was
           | ok to design product with known "pattern sensitivity",
           | because...I suppose they made more money that way and no
           | customer complained loudly. Pattern sensitivity in memory
           | chips is a very old problem, and previously was treated as a
           | fatal design error, fixed and the affected chips used for
           | parking lot infill. But today every chip sold is affected.
           | Basically the same as Boeing and the MAX, just fewer people
           | killed.
        
             | myself248 wrote:
             | > Pattern sensitivity in memory chips is a very old
             | problem, and previously was treated as a fatal design error
             | 
             | You seem to know more about this than I do. Where might one
             | go to read more about this? Are there whitepapers, design
             | docs, other resources which could be used to assert
             | positively that the industry used to consider this a
             | defect?
        
         | tbob22 wrote:
         | MemTest86 already has a rowhammer test, not sure how it
         | compares to blacksmith but some sticks I have do fail that test
         | and can only be mitigated by setting tREFI extremely low (while
         | also taking a large performance hit).
         | 
         | Most of the higher end hynix/micron/samsung sticks I've tried
         | do not fail at JEDEC or XMP after 7+ passes.
        
         | userbinator wrote:
         | The problem is that apparently the authors of memory testing
         | programs were somehow convinced to hide the severity of this
         | issue when it first appeared, claiming things like it being a
         | rare edge case and that _too much RAM would fail_.
         | 
         | I believe newer versions of MemTest86+ have Rowhammer tests but
         | it's disabled by default.
         | 
         | See my previous comment on this matter:
         | 
         | https://news.ycombinator.com/item?id=12410274
        
         | gjs278 wrote:
         | did you read the fucking headline? all ram is vulnerable to
         | this. you don't need to run the test or return anything. you
         | won't get a stick that isn't vulnerable. jesus you are dumb
        
         | StillBored wrote:
         | Pretty sure the passmark version does
         | https://www.memtest86.com/compare.html
         | 
         | Except that the distros are still shipping an older version
         | due to license issues (or something like that, AFAIK).
        
         | mithro wrote:
         | Google is working on platforms to make it easier to explore
         | this problem, see
         | https://opensource.googleblog.com/2021/11/Open%20source%20DD...
        
       | adriag wrote:
       | nice
        
       | anthk wrote:
       | HN itself shouldn't even need JS.
        
       | acomjean wrote:
       | This seems bad, but as a practical matter, what does this mean?
       | 
       | If you have a process running on your machine can it use this to
       | get root? Read Keys?
       | 
       | It looks like they ran their process for 12 hours to do the
       | flipping.
       | 
       | And if you're flipping your process's memory for that long, what
       | are the chances you are next to sensitive memory for another
       | process? It seems bad, but it seems like if you're randomly
       | flipping bits in memory the system will likely crash.
        
         | TacticalCoder wrote:
         | > If you have a process running on your machine can it use this
         | to get root? Read Keys?
         | 
         | Which keys? OpenBSD / OpenSSH for example does now (since
         | 2019?) encrypt SSH keys in RAM to prevent Rowhammer-like
         | attacks. They're mixing the key with a huge "pre key" and an
         | attacker would need to side-channel attack the entire pre key
         | to be able to read the real SSH keys.
         | 
         | So it's not as if there was _nothing_ that could be done to
         | guard against this.
         | 
         | It may require changing lots of software/libraries but at least
         | the security-conscious ones already started, without waiting
         | for this "blacksmith" attack.
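
The "huge pre key" idea described above can be sketched roughly as follows. This is not OpenSSH's code: the names `shield`/`unshield`, the 16 KiB prekey size, and the hash-based XOR keystream are assumptions chosen so the example runs with only the Python standard library (a real implementation would use a proper cipher). The point it illustrates is that the key protecting the secret is derived from the whole prekey, so an attacker who recovers the prekey with even one wrong bit derives a useless key.

```python
import hashlib
import secrets

PREKEY_SIZE = 16 * 1024  # assumption: size picked for illustration


def _keystream(prekey: bytes, length: int) -> bytes:
    """Derive a keystream from a hash of the entire prekey (toy cipher)."""
    seed = hashlib.sha512(prekey).digest()
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha512(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def shield(secret: bytes):
    """Return (prekey, shielded_secret); the plaintext can then be wiped."""
    prekey = secrets.token_bytes(PREKEY_SIZE)
    stream = _keystream(prekey, len(secret))
    return prekey, bytes(a ^ b for a, b in zip(secret, stream))


def unshield(prekey: bytes, shielded: bytes) -> bytes:
    """Recover the secret; requires the exact, complete prekey."""
    stream = _keystream(prekey, len(shielded))
    return bytes(a ^ b for a, b in zip(shielded, stream))


if __name__ == "__main__":
    prekey, blob = shield(b"-----PRIVATE KEY-----")
    print(unshield(prekey, blob))

    # One wrong bit anywhere in the recovered prekey ruins the derived key.
    damaged = bytearray(prekey)
    damaged[1000] ^= 0x01
    print(unshield(bytes(damaged), blob) == b"-----PRIVATE KEY-----")
```

Because a side channel that leaks memory bit by bit has some error rate, forcing the attacker to read many thousands of bits perfectly raises the cost of the attack considerably.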
        
         | guerrilla wrote:
         | > If you have a process running on your machine can it use this
         | to get root? Read Keys?
         | 
         | Yes. There's a BlackHat talk on that[1]. DoS is a big issue on
         | shared hardware too. You can crash other processes, the kernel
         | and the hypervisor.
         | 
         | > And if your flipping your process's memory for that long,
         | what are the chances you are next to sensitive memory for
         | another process?
         | 
         | You don't need to leave it up to chance. There are apparently
         | ways you can control it and of course you can just spray
         | everything.
         | 
         | 1.
         | https://www.blackhat.com/docs/us-15/materials/us-15-Seaborn-...
         | 
         | 2. https://github.com/google/rowhammer-
         | test/blob/master/rowhamm...
        
           | acomjean wrote:
           | Interesting. I can see taking the system down, especially on
           | shared hardware.
           | 
           | My (probably slightly naive) understanding is the OS is
           | allocating memory, acting like a memory cop so processes
           | don't overlap and swapping when needed and so forth. It seems
           | like these hardware errors might be mitigated by the OS.
           | 
           | I'm in a bit over my head in the pdf, but it explains: "Our
           | kernel privilege escalation works by using row hammering to
           | induce a bit flip in a page table entry (PTE) that causes the
           | PTE to point to a physical page containing a page table of
           | the attacking process. This gives the attacking process
           | read-write access to one of its own page tables, and hence to
           | all of physical memory."
           | 
           | I'm still a little fuzzy on how they get the right location
           | in the page table without hosing the system, but this gives
           | me enough of a gist. Thanks!
        
             | guerrilla wrote:
             | > My (probably slightly naive) understanding is the OS is
             | allocating memory, acting like a memory cop so processes
             | don't overlap and swapping when needed and so forth. It
             | seems like these hardware errors might be mitigated by the
             | OS.
             | 
             | You don't need to have access to the memory you want to
             | attack, only the ability to cause something to read from
             | memory that physically neighbors the memory you want to
             | attack.
             | 
             | Think of page tables like a big array in kernel memory (I'm
             | being imprecise.) The entries in this table are mappings
             | from your virtual addresses to physical addresses. You can
             | cause the kernel to read at least part (or all?) of your PTEs,
             | which means you can cause it to flip bits in
             | neighboring physical cells, which means you can modify the
             | PTEs. TL;DR: If you can cause the kernel to read some
             | specific memory then you can modify memory near it. In this
             | case, you can modify the page table entries, allocating
             | memory that isn't yours to you (or doing other funny
             | things.)
             | 
             | A simpler to explain but probably not realistic
             | explanation: Imagine if your UID is stored right next to
             | your GID in some kernel structure, if you could cause the
             | kernel to repeatedly read your GID then you could cause bit
             | flips in your UID, which would change which user your
             | user...
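
The page-table explanation above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it models a simplified x86-64 page table entry (flag bits at the bottom, physical frame number in bits 12 to 51), the frame number `0x1234` is made up, and no real memory is touched. It shows how flipping a single bit inside the frame-number field silently redirects a mapping to a different physical page while leaving the permission flags intact.

```python
PAGE_SHIFT = 12                                # 4 KiB pages
FRAME_MASK = ((1 << 40) - 1) << PAGE_SHIFT     # bits 12..51: frame number
FLAG_PRESENT, FLAG_WRITABLE, FLAG_USER = 0x1, 0x2, 0x4


def make_pte(frame: int, flags: int) -> int:
    """Build a simplified page table entry from a frame number and flags."""
    return (frame << PAGE_SHIFT) | flags


def frame_of(pte: int) -> int:
    """Extract the physical frame number from an entry."""
    return (pte & FRAME_MASK) >> PAGE_SHIFT


def flip_bit(value: int, bit: int) -> int:
    """Simulate a single Rowhammer-induced bit flip."""
    return value ^ (1 << bit)


if __name__ == "__main__":
    flags = FLAG_PRESENT | FLAG_WRITABLE | FLAG_USER
    pte = make_pte(0x1234, flags)

    # Flip bit 3 of the frame number (bit 15 of the entry).
    corrupted = flip_bit(pte, PAGE_SHIFT + 3)

    print(hex(frame_of(pte)), "->", hex(frame_of(corrupted)))
    print("flags unchanged:", (corrupted & 0x7) == (pte & 0x7))
```

In the attack quoted earlier in the thread, the attacker fills memory with its own page tables so that the redirected entry is likely to point at one of them, which is how a single flip becomes read-write access to physical memory.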
        
         | ARandumGuy wrote:
         | That's my question as well. I'm not a security expert, but this
         | doesn't seem all that concerning for anything other than the
         | highest security applications. It seems to me that executing
         | this attack not only requires running code on the target
         | machine (which admittedly isn't that big of a hurdle), but
         | requires basically complete knowledge of memory allocation at
         | the time of the attack, something that is fairly opaque and
         | ever changing on most hardware.
         | 
         | Is there something I'm missing here? What are the realistic
         | attacks that this vulnerability allows?
        
           | acuozzo wrote:
           | > What are the realistic attacks that this vulnerability
           | allows?
           | 
           | https://github.com/IAIK/rowhammerjs
        
           | joebob42 wrote:
           | It might be hard to do a targeted attack. On the other hand,
           | flipping random bits is likely to just start crashing stuff
           | and / or producing incorrect results pretty reliably.
        
           | guerrilla wrote:
           | See my reply, sibling to yours.
        
           | pfortuny wrote:
           | See my reply above: in-memory data analysis...
        
           | myself248 wrote:
           | There's no reason the code couldn't just blindly try every
           | attack until it finds one that succeeds. If there's a
           | malicious app that runs a lot (lookin' at you, random firefox
           | tab that eats 99% of my CPU for minutes at a time and then
           | goes idle again), it has all the time in the world.
        
         | pfortuny wrote:
         | This may be very bad for data analysts. Imagine trying to study
         | a huge in-memory database and getting different values each
         | time...
        
           | speedgoose wrote:
           | Assuming you run your database in a public cloud, what's the
           | likelihood of someone throwing money at the cloud provider to
           | run the attack on a host shared with your database, with no
           | added benefits to them?
        
             | kevingadd wrote:
             | The implication of the post was that your own data analysis
             | workloads could cause rowhammer flips unintentionally, I
             | think.
        
         | pomian wrote:
         | Complex modeling that runs for days could have completely
         | messed-up results, especially since the old standard of using
         | ECC is not proof of infallibility. Often those results are in
         | series, so early errors would compound, and it may be very hard
         | to determine if and why results are wrong.
        
       | zsmi wrote:
       | There was another paper last year that showed something similar,
       | and it went into way more detail on what TRR actually is.
       | 
       | TRRespass: Exploiting the Many Sides of Target Row Refresh
       | 
       | https://arxiv.org/pdf/2004.01807.pdf
       | 
       | Memory controller mitigation of RowHammer can work pretty well,
       | if one actually has it turned on. Which is unfortunately rarely
       | the case even in 2021.
        
       | philipkglass wrote:
       | Is this possibly a route to jailbreaking for iOS, via the
       | temporary provisioning profile for apps? It seems like you could
       | run a Rowhammer memory corruption app on your personal device
       | until getting escalated privileges. Newer OS releases may not
       | patch the hole since this is a hardware flaw. But I admittedly
       | have only the vaguest idea of what defenses need overcoming on a
       | modern iOS device.
        
         | londons_explore wrote:
         | Rowhammer is still pretty hard to exploit because typically you
         | can't reliably flip most bits, and you can normally only flip
         | bits that are very close in physical memory address to those
         | you control.
         | 
         | Combine that with a lack of knowledge of physical memory
         | addresses and inability to have much control over memory
         | layouts, and it really gets tricky to gain privileges outside a
         | lab environment in a reasonably short time.
         | 
         | Remember that flipping bits at random will almost certainly
         | kernel panic the machine before it gives you root access.
         | 
         | I'm sure a determined attacker could do it though.
        
           | chasil wrote:
           | Hopefully, the OpenBSD extreme implementation of ASLR makes
           | it even safer.
           | 
           | On every boot, there is a brand new kernel and C library:
           | 
           | reordering libraries: done
           | reorder_kernel: kernel relinking done
           | 
           | ASLR has been compromised in the past, so this likely isn't
           | completely secure.
        
           | jandrese wrote:
           | This seems like a case where targeting an iPhone might make
           | life easier. The hardware is quite uniform for a particular
           | model.
        
             | SV_BubbleTime wrote:
             | Which gets you some architecture knowledge, but doesn't
             | promise or indicate that your userland ram space is
             | adjacent to anything important. It's a better start at
             | playing the game though.
        
           | guerrilla wrote:
           | > very close in physical memory address to those you control.
           | 
           | Not quite. It doesn't require the ability to write to the
           | neighboring cells, just read from them.
        
       | belter wrote:
       | Great work. Fascinating and depressing at the same time. Like
       | watching your house on fire, but not being able to avoid getting
       | mesmerized by the beautiful flames and tones as your designer
       | furniture burns away :-)
       | 
       | "...Are there any DIMMs that are safe?
       | 
       | We did not find any DIMMs that are completely safe. According to
       | our data, some DIMMs are more vulnerable to our new Rowhammer
       | patterns than others.
       | 
       | Which implications do these new results have for me?
       | 
       | Triggering bit flips has become more easy on current DDR4
       | devices, which facilitates attacks. As DRAM devices in the wild
       | cannot be updated, they will remain vulnerable for many years.
       | 
       | How can I check whether my DRAM is vulnerable?
       | 
       | The code of our Blacksmith Rowhammer fuzzer, which you can use to
       | assess your DRAM device for bit flips, is available on GitHub. We
       | also have an early FPGA version of Blacksmith, and we are working
       | with Google to fully integrate it into an open-source FPGA
       | Rowhammer-testing platform.
       | 
       | Why hasn't JEDEC fixed this issue yet?
       | 
       | A very good question! By now we know, thanks to a better
       | understanding, that solving Rowhammer is hard but not impossible.
       | We believe that there is a lot of bureaucracy involved inside
       | JEDEC that makes it very difficult.
       | 
       | What if I have ECC-capable DIMMs?
       | 
       | Previous work showed that due to the large number of bit flips in
       | current DDR4 devices, ECC cannot provide complete protection
       | against Rowhammer but makes exploitation harder.
       | 
       | What if my system runs with a double refresh rate?
       | 
       | Besides an increased performance overhead and power consumption,
       | previous work (e.g., Mutlu et al. and Frigo et al.) showed that
       | doubling the refresh rate is a weak solution not providing
       | complete protection.
       | 
       | Why did you anonymize the name of the memory vendors?
       | 
       | We were forced to anonymize the DRAM vendors of our evaluation.
       | If you are a researcher, please get in touch with us to receive
       | more information. ..."
        
         | gruez wrote:
         | why paste the FAQ into your comment?
        
           | belter wrote:
           | It's only a partial quote of the whole FAQ. I know it's usual
           | to get a gist for an article from the comments before getting
           | to the full details...
        
             | hungryforcodes wrote:
             | Thanks! I wasn't going to read the article -- and this
             | answers my questions:)
        
         | [deleted]
        
         | Rd6n6 wrote:
         | Is this a threat to servers only, or to any network-attached
         | computer with DDR4?
         | 
         | By coincidence, I've been bluescreening with RAM-related error
         | codes for the last 2 days haha
        
           | belter wrote:
           | Threat to your phone and your routers:
           | 
           | "Drive-by Rowhammer attack uses GPU to compromise an Android
           | phone" [2018]
           | 
           | https://news.ycombinator.com/item?id=16984663
           | 
           | "Inducing Rowhammer Faults through Network Requests"
           | 
           | https://arxiv.org/pdf/1805.04956.pdf
        
             | Rd6n6 wrote:
             | Scary, thanks
        
       | zokier wrote:
       | I would imagine that SME/TME (AMD/Intel memory encryption) would
       | mitigate Rowhammer-style attacks quite effectively because
       | attackers would not be able to control the physical bit patterns
       | anymore?
        
         | guerrilla wrote:
         | Nope. It only requires being able to read neighboring cells. It
         | would make a privilege escalation attack harder but not
         | impossible. DoS attacks would still be relatively easy.
        
           | LogonType10 wrote:
           | >DoS attacks would still be relatively easy.
           | 
           | Local DoS? Can you elaborate on this?
        
             | guerrilla wrote:
             | Rowhammer is a memory corruption technique, so if you
             | corrupt the right memory of a process in the right way then
             | you can crash it; same for a kernel or hypervisor.
        
             | 420official wrote:
             | They are saying that privilege escalation is harder because
             | it's really challenging to target specific bits to flip,
             | whereas flipping random bits will eventually lead to a
             | crash of some kind causing the service to fail which is
             | effectively a DoS
        
       | Andys wrote:
       | It would be so much easier if the RAM just got moved onto the
       | CPU. As chips get more dense and NAND becomes cheaper, I could
       | see rowhammer-susceptible DRAM just going away completely for
       | many forms of computing.
        
       | buryat wrote:
       | it's a ploy to sell more DRAM
        
         | Syonyk wrote:
         | One of the more absurd plots...
         | 
         | "It's broken! Buy more of the broken stuff!"
        
         | egberts1 wrote:
         | It's a plot to revive demand for DDR3.
        
         | cout wrote:
         | Redundant array of inexpensive RAM?
        
           | zokier wrote:
           | You might be joking, but memory mirroring is a thing on
           | higher-end servers
        
       | Animats wrote:
       | OK, for starters, ECC has to become standard.
       | 
       | Then the rate of ECC errors has to be monitored. If something is
       | trying a rowhammer attack, it's going to cause unwanted bit flips
       | which the ECC will correct. Normally, the ECC error rate is very
       | low. Under attack, it should go up. So an attack should be
       | noticeable. You might get some false alarms, but that just means
       | it's time to replace memory.
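       | 
       | A minimal sketch of the monitoring idea above, assuming the
       | platform exposes a cumulative corrected-error counter (on
       | Linux, the EDAC subsystem publishes ce_count files under
       | /sys/devices/system/edac/mc/). The class name, window size and
       | thresholds are illustrative assumptions, not tuned values:
       | 
```python
from collections import deque


class CorrectedErrorMonitor:
    """Flags a sudden rise in the corrected-error (CE) rate.

    Feed it cumulative CE counter readings taken at a fixed interval.
    All parameters are hypothetical defaults chosen for illustration.
    """

    def __init__(self, window=5, factor=10.0, floor=5):
        self.deltas = deque(maxlen=window)  # recent per-interval increases
        self.factor = factor                # alarm if delta > factor * baseline
        self.floor = floor                  # ...and delta is at least this large
        self.last = None

    def observe(self, cumulative_count):
        """Return True if this reading looks like an abnormal burst."""
        if self.last is None:
            self.last = cumulative_count
            return False
        # A counter reset would give a negative difference; clamp it to zero.
        delta = max(0, cumulative_count - self.last)
        self.last = cumulative_count
        baseline = sum(self.deltas) / len(self.deltas) if self.deltas else 0.0
        alarm = delta >= self.floor and delta > self.factor * baseline
        if not alarm:
            # Only quiet intervals update the baseline, so a sustained
            # burst cannot raise the threshold against itself.
            self.deltas.append(delta)
        return alarm


monitor = CorrectedErrorMonitor()
readings = [0, 0, 1, 1, 2, 2, 500]  # made-up counter samples
print([monitor.observe(r) for r in readings])
```
       | 
       | Only the last reading trips the alarm here. As the comment
       | says, an alarm could equally mean a failing module, so this
       | only tells you that something needs a closer look.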
        
         | arcticbull wrote:
         | Luckily all DDR5 DIMMs will have on-chip ECC. My understanding
         | is it's not a complete mitigation but does make exploitation
         | harder.
        
           | snak wrote:
           | Yes, the article mentions it:
           | 
           | > What if I have ECC-capable DIMMs?
           | 
           | > Previous work showed that due to the large number of bit
           | flips in current DDR4 devices, ECC cannot provide complete
           | protection against Rowhammer but makes exploitation harder.
        
             | hinkley wrote:
             | It sounds to me like ECC isn't being included in the DDR5
             | spec due to magnanimity so much as because it doesn't
             | function without it. That ECC has become 'load-bearing'.
             | 
             | Does that mean we need an extended ECC to deal with
             | critical systems that require additional robustness?
        
               | Legion wrote:
               | Who error checks the error checkers?
        
               | RedShift1 wrote:
               | It's just a matter of time before someone finds a way to
               | exploit the ECC part, calls it Hammerrow and brings us
               | back to square one...
        
               | aaaaaaaaaaab wrote:
               | Rowhamming would be a better pun, as DDR5 uses a Hamming
               | code for error correction.
        
           | eqvinox wrote:
           | > My understanding is it's not a complete mitigation but does
           | make exploitation harder.
           | 
           | It won't. It's designed to counter silicon limitations to
           | increased density, i.e. it's made to _correct the errors that
           | result from packing cells beyond the limit of error-free
           | operation_.
           | 
           | The extra redundancy from on-chip ECC is intended to be
           | "consumed" by the chip itself, and since this will allow
           | optimizing chip manufacture to denser and cheaper, it's no
           | question at all that it will get pushed to the very limit.
           | 
           | There's still "classic" ECC for DDR5. 8 bits mapped to 9,
           | terminated at the CPU which can look at things. That's what I
           | want, need, and will buy.
           | 
           | P.S.: Shame on Intel for still walling off desktop CPUs from
           | ECC. https://ark.intel.com/content/www/us/en/ark/search/featu
           | refi...
        
             | Dylan16807 wrote:
             | > It won't. It's designed to counter silicon limitations to
             | increased density, i.e. it's made to correct the errors
             | that result from packing cells beyond the limit of error-
             | free operation.
             | 
             | I'd love to see actual parameters for the error correction
             | codes, but DDR5 could pretty easily be a lot more robust
             | than DDR4.
             | 
             | When you have no error correction at all, you need
             | ridiculously high reliability. Even if these new memory
             | cells have a much higher error rate, if they're
             | designed to seamlessly handle a few bits in the same row
             | flipping then the overall reliability could skyrocket.
             | 
             | Edit: Oh, there's a paper from Micron talking about DDR5
             | only having single bit correction internally. That's not as
             | useful as it could be against attacks...
             | 
             | > There's still "classic" ECC for DDR5. 8 bits mapped to 9
             | 
             | But Single Error Correction, Dual Error Detection is not
             | enough to prevent attacks.
             | 
             | Also because DDR5 uses a smaller width you actually need to
             | map 8 bits to 10.
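             | 
             | To illustrate why SECDED alone falls short, here is a toy
             | extended Hamming(8,4) code. This is an assumption-laden
             | sketch: real ECC DIMMs use much wider codewords, and the
             | function names are made up for this example. It corrects
             | one flip, detects two, and can silently miscorrect three:
             | 
```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of bits).

    Index 0 holds the overall parity bit; indices 1..7 form a
    Hamming(7,4) codeword with parity bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of ones even
    return code


def flip(code, *positions):
    """Return a copy of the codeword with the given bit positions flipped."""
    out = list(code)
    for pos in positions:
        out[pos] ^= 1
    return out


def decode(code):
    """Return (nibble, status); nibble is None if the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    odd_parity = sum(code) % 2 == 1
    if odd_parity:
        # Odd number of flips: the decoder assumes exactly one and fixes it.
        # A syndrome of 0 points at the overall parity bit (index 0).
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even parity but non-zero syndrome: two flips, detected only.
        return None, "uncorrectable"
    else:
        status = "ok"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1011)
print(decode(flip(word, 5)))        # one flip
print(decode(flip(word, 2, 6)))     # two flips
print(decode(flip(word, 3, 5, 6)))  # three flips
```
             | 
             | The three-flip case comes back marked "corrected" but with
             | the wrong data, which is the failure mode an attacker who
             | can cause several flips in one word would aim for.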
        
             | arcticbull wrote:
             | I think on-chip ECC would mitigate this problem just as
             | well as off-chip ECC. Off-chip ECC is meant to catch errors
             | during transmission (i.e. 72 bits transmitted for 64 bit
             | words), not necessarily just the ones that occur internal
             | to the package.
             | 
             | I agree it's meant to counter limitations due to increased
             | density, but it should catch this to an extent also as this
             | error is induced on-package right, not during transmission.
             | Or am I mistaken?
        
             | dogma1138 wrote:
             | That's not exactly correct. Whilst it is there mainly to
             | allow for higher densities and frequencies, it's designed to
             | prevent bit flips from happening on chip.
             | 
             | It's not end-to-end ECC, as in it doesn't prevent flips that
             | happen on the bus or in CPU cache, but it does prevent
             | single-bit errors on DRAM.
        
       | Karliss wrote:
       | They were able to get bit flips with all the modules, but the
       | difference between 100000 and 15 bit flips during 12h seems
       | significant to me. Whatever mitigation manufacturer B has seems
       | to work a lot better than the others. That's a potential reason
       | for choosing to buy that instead of the others. If that were to
       | be improved further and the probability decreased 10000 times
       | more, it might reach the point where it's comparable to random
       | bit flips from cosmic radiation.
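       | 
       | Working through the numbers in this comment as a rough sketch.
       | The flip counts and the 10000x factor are taken from the
       | comment as-is; the variable names are just for illustration:
       | 
```python
HOURS = 12
flips_worst = 100_000   # highest count quoted in the comment, per 12 h
flips_best = 15         # lowest count quoted in the comment, per 12 h
improvement = 10_000    # the hypothetical further reduction

ratio = flips_worst / flips_best        # gap between the two modules
rate_now = flips_best / HOURS           # flips per hour on the best module
rate_improved = rate_now / improvement  # flips per hour after the reduction
per_year = rate_improved * 24 * 365     # flips per year of nonstop hammering

print(f"worst/best ratio: {ratio:.0f}x")
print(f"best module today: {rate_now:.2f} flips/hour")
print(f"after a further {improvement}x: {per_year:.3f} flips/year")
```
       | 
       | That works out to about 1.25 flips per hour today and roughly
       | one flip per year of continuous hammering after the
       | hypothetical improvement.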
        
       | undoware wrote:
       | Former startup CSO here. There is not enough coffee in the world
       | to make this day not terrible for anyone whose job involves
       | worrying about security.
        
       | guerrilla wrote:
       | Wow, it's so cool to see this post. I was just watching the
       | digital design class from ETH Zurich and they went through
       | Rowhammer (repeated three times) and I was wondering if the
       | industry had solved anything and I suspected not based on their
       | explanations. Now I wish I was actually sitting in class even
       | more.
       | 
       | I highly recommend that series by the way.[1] It's not just an
       | architecture class because it's actually pretty up to date and
       | highly focused on security.
       | 
       | 1.
       | https://www.youtube.com/watch?v=AJBmIaUneB0&list=PL5Q2soXY2Z...
        
       | mettamage wrote:
       | It was an honor to have Kaveh as a teacher when he was still
       | working for VUSec :)
        
       | ben_w wrote:
       | This is probably a naive question, but: could this sort of
       | attack be prevented by having the physical values be a trivially
       | encrypted version of the logical values? I'm thinking something
       | as simple as:
       | 
       |     $value XOR f(memory address, random number etched onto chip)
        
         | jsmith45 wrote:
         | That could certainly make it more difficult to exploit, sure,
         | but keep in mind that being able to force a specific value
         | change is not a hard requirement for this to be a security bug.
         | 
         | Even then, memory vendors tend to want to compete on frequency
         | and access timings, which means doing any additional work not
         | strictly required by the JEDEC standard will make their product
         | appear worse than competing products, so I doubt they will want
         | to do that.
         | 
         | Plus a similar technique could actually be done by the CPU's
         | memory controller to similar effect, and historically DRAM
         | design has favored pushing things to the controller when
         | possible.
        
         | pxx wrote:
         | Flipping a bit flips the output of xor.
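         | 
         | A small sketch of that point, assuming a hypothetical mask
         | function f built from SHA-256 (any per-address mask behaves
         | the same way). A flip in the stored cell comes out as the
         | same flip in the logical value:
         | 
```python
import hashlib


def mask(address, chip_secret):
    """Hypothetical f(address, secret): one pseudo-random byte per address."""
    digest = hashlib.sha256(chip_secret + address.to_bytes(8, "little")).digest()
    return digest[0]


def store(value, address, chip_secret):
    """Physical cell contents for a logical byte."""
    return value ^ mask(address, chip_secret)


def load(cell, address, chip_secret):
    """Logical byte recovered from the physical cell contents."""
    return cell ^ mask(address, chip_secret)


secret = b"etched-at-the-fab"  # stand-in for the per-chip random number
address, value = 0x1234, 0xA5

cell = store(value, address, secret)
hammered = cell ^ (1 << 3)  # simulate a disturbance flipping physical bit 3

# XOR with a fixed mask cannot undo the flip: logical bit 3 flips too.
print(load(hammered, address, secret) == value ^ (1 << 3))
```
         | 
         | What the mask does hide is which physical pattern ends up in
         | the cells, so the attacker has less control over the stored
         | data pattern, but the corruption itself still lands.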
        
           | matja wrote:
           | AMD EPYC processors already support AES encryption of memory
           | (https://developer.amd.com/sev/) where VMs themselves cannot
           | know the key. Interesting that I didn't see that mentioned in
           | the paper as a possible mitigation.
        
             | guerrilla wrote:
             | Why do you believe this would help? Have you seen the
             | attack from Black Hat?[1] It doesn't require being able
             | to read any plain text and it doesn't matter how the data
             | is stored, only that it's near. You don't even have to have
             | any access to the target memory or even know where it is,
             | only the ability to cause something else to read it
             | predictably.
             | 
             | 1. https://www.blackhat.com/docs/us-15/materials/us-15-Seab
             | orn-...
        
       ___________________________________________________________________
       (page generated 2021-11-15 23:00 UTC)