[HN Gopher] Blacksmith - Rowhammer bit flips on all DRAM devices... ___________________________________________________________________ Blacksmith - Rowhammer bit flips on all DRAM devices today despite mitigations Author : buran77 Score : 488 points Date : 2021-11-15 16:27 UTC (6 hours ago) (HTM) web link (comsec.ethz.ch) (TXT) w3m dump (comsec.ethz.ch) | HALtheWise wrote: | Maybe OS's or hypervisors should support a mode where different | processes/security domains are forced into contiguous blocks of | physical memory with buffer zones between. Especially for cloud | computing, pre-allocating memory to different VMs should be | reasonable. I could even see browsers taking advantage of it, | given that they already force javascript threads into per-domain | processes for Spectre/Meltdown protection. | javajosh wrote: | Um, how concerned should I be about this? Is it time to turn off | javascript in the browser? And if it is, is this not the End | Times for browser distributed software? I ask because sometimes | you need to ask if it's really the end of the world or just | Monday. | notyourday wrote: | > Um, how concerned should I be about this? | | Not at all. It is another one of the Chicken Little's security | theater that sound sexy and sophisticated but in practice mean | nothing. | | If you are concerned about this issue you should first ensure | that you never type "npm install" or copy something off Stack | Overflow into your code. | acuozzo wrote: | > Not at all. | | https://github.com/IAIK/rowhammerjs | | This has existed for years and its use has been observed in | the wild. RowHammer isn't new. | Terr_ wrote: | > Is it time to turn off javascript in the browser? | | I've been browsing with Javascript off-by-default for years and | recommend it regardless. | | Sure, perhaps a site I trusted and whitelisted will try to hack | me, but I feel it's far less likely than some random typo- | squatting site or advertising-broker logic. 
| egberts1 wrote: | JavaScript, despite compiler hardening, is still subject to | SPECTRE-like actions. It's in the compilers' mitigation effort | and CPU memory controllers both. | | Something about CPU firmware and its inability to fix some | aspect of memory bank controllers. | pas wrote: | It depends on how concerned you are about the regular security | issues. Which ideally depends on the things you are responsible | for. If it's your family photos in iCloud? Nah. If it's the | nuclear launch codes for the whole world? Then yeah, you should | already be isolating, separating, compartmentalizing... | myself248 wrote: | > photos in iCloud? | | I mean, the last fappening was done by phishing, as I recall. | Are you saying another one, done by rowhammer, wouldn't be a | problem? | | I suspect most folks have stuff they'd rather not make | public. Credit card numbers, at the very least. This affects | us all. | pas wrote: | While there are clearly bad (and some interesting) ethical | aspects of the Fappening, it was not a big problem for | society. Every day many people experience the same violation | of their privacy (revenge porn sites, accidentally sending | nudes to the wrong address, etc), and life goes on. (And as | far as I know none of the affected celebs has taken | drastic measures, while unfortunately this cannot be said | for non-celebs, as cyberbullying regularly gets people to | attempt suicide with "success".) | | Similarly the day-to-day ransomware attacks probably cost more | than securing browsers will. | javajosh wrote: | You don't have to be in charge of launch codes to worry about | this one. I mean, if anyone can 'npm install rowhammer' and | reliably attack anyone that visits their site, even in a | modern browser, then it's time to turn js off. 
I'm okay if | there are some minor or even moderate XSS exploits out there; | but a hardware vuln accessible to javascript means javascript | goes bye bye for now, at least from my local machine's | browser. | pas wrote: | If it comes to that you'll hear about it on HN first. If | folks start to exploit it you'll hear about it. | | If it comes you'll hear about "apps" that freeze every | other browser process while you do online banking. | | That said, proactively blocking JS is great for many | reasons, and you can whitelist sites you sort of trust. | ziddoap wrote: | >If it comes to that you'll hear about it on HN first. If | folks start to exploit it you'll hear about it. | | This is a pretty laissez-faire attitude with regard to | security-critical exploits. _You_ might be one of the | first ones affected. After all, _someone_ has to be the first | one before they can post it on HN. | | In a more general sense, it's a bad idea to rely on | someone else as your primary alert. Whether it is a HN | post, news article, whatever. | | (I know that you aren't quite saying to use it as a | primary alert, but a cursory read of your comment implies | it, so it's worth discussing.) | scandinavian wrote: | > You might be one of the first ones affected | | If you are regularly the first target of zero-days your | threat model should be stricter, yes. The other 7.9 | billion people in the world can afford to be a little | less strict. | tambourine_man wrote: | This kind of thing depresses me. We're building our | infrastructure on quicksand. | sroussey wrote: | I guess another plus for DDR5 which has some ECC built in, | including in the non-ECC versions. 
| temac wrote: | The internal ECC in basic DDR5 is intended to provide | reliability just good enough to be at the level of DDR4 without | ECC (or maybe slightly above, but probably not a lot?), and | likely the error rate would just be crazy high without any DDR5 | internal "ECC" at all -- even without a rowhammer pattern or | cosmic rays, etc. I'm not sure it will help against rowhammer | (and variants)? | sroussey wrote: | Agreed, though did the spec for DDR5 finalize after row- | hammer? Seems like an opportunity to adjust for attacks. | | They should do another set of tests on DDR5 and compare to | DDR4. | egberts1 wrote: | Original author already said DDR5 ECC, once mapped, will too | suffer SPECTRE. | AaronFriel wrote: | SPECTRE is a CPU vulnerability in the speculative execution | that modern processors perform to create a side channel to | leak data through. | | Could you elaborate on what you mean by DDR5 "suffering" | from SPECTRE? I believe that vulnerability is memory | technology agnostic and theoretically would work if you | could boot a CPU absent any memory at all, as the leaks | occur through CPU caches. | egberts1 wrote: | CPU? More like memory controller and their capacitance | banks. | patryn20 wrote: | All these new hardware level physics exploits in software | fascinate me. At the same time they make me wonder if the | hardware will ever truly be able to be secure and perhaps we need | to just move on to new methods and concepts in hardware design | and maybe trade size for security. | | It also brings to mind that Simpsons scene: "Stop stop! He's | already dead!" | guerrilla wrote: | How curious are you? Here's a whole class on exactly that from | the authors of the OP [1] at ETH Zurich. | | 1. | https://www.youtube.com/watch?v=AJBmIaUneB0&list=PL5Q2soXY2Z... | mhh__ wrote: | Onur Mutlu is a really good lecturer, to my eyes at least. 
| guerrilla wrote: | Some of his TAs are really funny too :) | cout wrote: | It also makes me think of the phonograph in Gödel, Escher, Bach. | One character made a record player and the other made a record | that always caused the record player to break. The crux was | that a record player cannot be built that can play all records; | it is impossible. | kaba0 wrote: | I think moving such complexity to hardware was a mistake (like | branch predictor, etc). Perhaps exposing a very low level API | to CPU functionality (like microcode level) and JIT compiling | existing x86 to that could work? We have enough | problems managing complexity in software but at least it is | fixable there. | | (As for the possibility of such, there is already an x86 to x86 | JIT compiler that increases performance) | notriddle wrote: | What does any of this stuff have to do with Rowhammer, an | attack that doesn't even target the CPU, which just so | happens to have ECC on all its registers and cache lines? | kaba0 wrote: | Parent was talking about modern hardware in general, and | it's not like CPUs don't get their own fair share of | similar vulnerabilities. Also, a JIT compiler can probably | realize that the same memory region gets rowhammered and | may throttle execution of such a thread. | thinkingemote wrote: | I fully expect Apple's M1 chip to get a Spectre-type | vulnerability at some point and then any improvements it had | will disappear. | mhh__ wrote: | M1 is vulnerable to Spectre (some variants at least) as far | as I'm aware. | | _Any_ speculation around memory accesses will yield this. | zsmi wrote: | That's why the solution is to only allow speculation when | it's "ok". M1 is believed to have a number of Spectre-esque | security mechanisms built into it to determine just that. 
| For example: | https://patents.google.com/patent/US20200192673A1 | | Also, in dougallj's code [1] the zeroing of registers should | be superfluous so it is assumed the function below is | needed to make the experiments run stably by claiming | ownership of the registers as part of a general anti- | speculation security mechanism. | | static int add_prep(uint32_t *ibuf, int instr_type); | | The M1 explainer [2] has a lot of interesting ideas like this | contained inside it. | | [1] https://gist.github.com/dougallj/5bafb113492047c865c0c8cfbc9... | | [2] https://news.ycombinator.com/item?id=28549954 | buran77 wrote: | I think the scary part is this one: | | > "DDR4 systems with ECC will likely be more exploitable, after | reverse-engineering the ECC functions," researchers Razavi and | Jattke said. | cjsplat wrote: | I believe that is saying understanding ECC function details | makes it easier to exploit ECC devices, _not_ that it will be | easier than exploiting non-ECC devices. | | The ECC code word is bigger so it is a larger target, but you | have to flip multiple bits to cause pain. If you have 2 bit | detect, you need to flip three bits to get something that | corrects to a different value. | staticassertion wrote: | Where are you reading this quote? I'm seeing essentially the | opposite in the linked post. | | > ECC cannot provide complete protection against Rowhammer but | makes exploitation harder. | | edit: I looked at some linked papers and I see similar quotes, | though not that one. | | edit2: | | https://arstechnica.com/gadgets/2021/11/ddr4-memory-is-even-... | | > "DDR4 systems with ECC will likely be more exploitable, after | reverse-engineering the ECC functions," researchers Razavi and | Jattke said. | | OK, so not more exploitable relative to non-ECC RAM, just | relative to ECC RAM pre-RE. 
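The point made above about ECC ("if you have 2 bit detect, you need to flip three bits to get something that corrects to a different value") can be illustrated with a toy single-error-correct, double-error-detect code. This is only a sketch under stated assumptions: it uses a textbook extended Hamming (8,4) code over 4 data bits, not the wider and undocumented ECC functions of real memory controllers that the researchers say must be reverse-engineered. The names `encode`, `decode`, `flip` and `outcomes` are made up for this illustration.

```python
from itertools import combinations

# Toy extended Hamming (8,4) layout (an assumption for illustration only):
# position 0 holds overall parity, positions 1, 2, 4 hold Hamming parity,
# and positions 3, 5, 6, 7 hold the four data bits.
DATA_POSITIONS = (3, 5, 6, 7)


def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED code word (list of bits)."""
    code = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) & 1  # make the total number of 1 bits even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when an error is only detected."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions of all set bits
    odd_parity = sum(code) & 1
    word = list(code)
    if syndrome == 0 and not odd_parity:
        status = "ok"
    elif odd_parity:
        # Odd number of flips: the decoder assumes exactly one and repairs it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Even parity but nonzero syndrome: two flips, uncorrectable.
        return "detected", None
    nibble = sum(word[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, nibble


def flip(code, positions):
    """Return a copy of the code word with the given bit positions inverted."""
    out = list(code)
    for pos in positions:
        out[pos] ^= 1
    return out


def outcomes(nibble, n_flips):
    """Set of decoder outcomes over every way to flip n_flips of the 8 bits."""
    results = set()
    for positions in combinations(range(8), n_flips):
        status, value = decode(flip(encode(nibble), positions))
        if status == "corrected" and value != nibble:
            status = "silently wrong"
        results.add(status)
    return results


if __name__ == "__main__":
    for n_flips in (1, 2, 3):
        print(n_flips, "flip(s):", outcomes(0b1011, n_flips))
```

In this toy code one flip is always repaired, two flips are always reported as uncorrectable, and three flips always decode to a different value while the decoder reports a successful correction. That is the reason an attacker who knows the code layout wants three precisely placed flips per code word rather than one or two.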
| 015a wrote: | Eventually, it will be obvious that running shared workloads on a | single piece of physical hardware has fundamentally | unremediatable security implications. This slow recognition is | deeply at odds with how the current landscape of x86 chips | are both manufactured and priced, as well as how cloud providers | have structured billions of dollars in DC investment; in other | words, they'll down-play it. This will be a massive opportunity | for ARM & SoC manufacturers in the coming years, as they are far | better positioned to offer, for example, a single rack appliance | with 64 individual 2-core mini-computers at a price-point | competitive with a 128 core x86 rack, as one computer. | | Computing moves in cycles: | | - 2000s: gigahertz race on each core | | - 2010s: increase core counts, multicore everything | | - 2020s: back to core-efficiency and increasing per-core | performance. M1 is already leading this charge, but is obviously | a mismatch for a DC environment. | | AMD and Intel need to adjust, or face extinction. It's not just | about pushing ultra-high per-core performance (they're both good | at this); it's about pushing for more efficiency, so per-blade | density in a DC can be pushed higher in the face of more, smaller | individual computers. If they don't evolve, AWS will for them | [1]. | | [1] https://aws.amazon.com/ec2/graviton/ | short12 wrote: | AMD seems to be doing just fine. With a roadmap for the future. | They will be just fine. | baq wrote: | AMD and Intel deliver CPUs tailor made for large cloud | customers already; those SKUs are not available via normal | channels. | dathinab wrote: | Their next gen "normal" server CPUs also have "cloud | optimized CPUs". | | While this seems to be mainly about Power/Heat/Perf. | optimizing for typical "cloud" workloads it might also have | design decisions related to this problem. We will see, but | they look interesting anyway. 
| belter wrote: | Can you elaborate or provide a reference on how those CPUs | are different? | baq wrote: | not really, just heard about those. but it wouldn't really | be that much different than a customized SKU for a | playstation or an xbox. | belltaco wrote: | But what model number would they show up as in the OS | facing the users? From what I see most CPU names are the | ones available in the market. | ThrowawayR2 wrote: | A recent HN posting points to a news article about AMDs | customized processors for Facebook although it provides | little detail: | https://news.ycombinator.com/item?id=29204257 | | " _The custom Epyc 7003 part that Facebook has commissioned | from AMD has 36 cores out of its potential 64 cores | activated, and with a bunch of optimizations for Facebook | applications at the web, database, and inference tiers of | its application stack._ " | Syonyk wrote: | Anyone who would know the fine grained differences is | almost certainly NDA'd up quite tightly. | | But it's likely things like different core counts, | frequency performance curves, perhaps more or different | memory/IO controller counts, perhaps instruction extensions | of some particular interest (I'd wager BF16 was available | in data center SKUs long before consumer SKUs), etc. | | It's nothing radical, but if you're going to buy an awful | lot of them and want something slightly custom, Intel will | definitely do that for you. | javajosh wrote: | What kind of levers do big ISPs/cloud providers want over | their CPUs, other than the basic efficiency gains we all | want? Isn't it risky doing any customization since you | don't get commodity pricing? | swiftcoder wrote: | At AWS scale you can feasibly just buy up the entire run | of chips. What's a few million CPUs more or less between | friends? | mywittyname wrote: | Additionally, Amazon is absolutely the type of company to | create their own x86 processor if Intel/AMD are unwilling | to make some customization. 
| | Intel does not want to be competing with an Amazon Basics | processor. | Grazester wrote: | You basically need a license for x86 from Intel for this. | Do you think x86 is open source? | to11mtm wrote: | > Isn't it risky doing any customization since you don't | get commodity pricing? | | Depends on how many you order and how reasonable the | customization is. Like, what if I wanted 10,000 38-core | Xeons with a 2.5GHz Base clock and 3.5GHz Turbo clock? | Such a part doesn't exist, yet sits smack dab between two | existing models, the Xeon Platinum 8368 and 8368Q. | Assuming yields are already good enough, that might be | the sort of thing a CPU maker will do. (Might need more | than 10k units though, IDK.) | baq wrote: | I'd wager Pat doesn't get out of bed for 10k units and | neither does Lisa. | wolf550e wrote: | It's unlikely a different die (silicon chip), so the | usual number of cores and usual size of caches, and it's | unlikely that AWS gets a secret x86 instruction | unlockable via MSR or something like that. It's probably | just constants for the frequency scaling (e.g. all cores | and single core turbo). | | Cloudflare mentioned they use a custom Intel SKU on their | blog, and they're not as big as AWS. | ghbe44 wrote: | I'm familiar with Google's custom SKUs, and I assure you | they're unique enough to experience microcode bugs nobody | else on the planet ever hits. There was a weird FDIV one | back in the day that wasn't the same FDIV bug that hit us | all a long time ago. Cloudflare isn't really at that | customization table. | | Your take sounds accurate for non-FAANGs (and a couple of | those). For Google, not so much. There is a lot of custom | acceleration and such that isn't available to any other | customer, and a nearby commenter is right, every detail | about them is locked behind a very strong NDA. I've heard | thirdhand AWS does indeed have custom instructions in the | virtualization extensions stuff, but don't know if that's | true. 
| monocasa wrote: | My understanding is that Google's SKUs are the same die. | Non Google versions just have the Google specific silicon | fused off, or require an MSR knock sequence to turn on. | ghbe44 wrote: | > I'd wager BF16 was available in data center SKUs long | before consumer SKUs | | Right you are, Ken. | [deleted] | passivate wrote: | But you still need data exchange between parallel/concurrent | workloads. The security focus will shift to the data nodes | which exchange data between the CPUs. And then the focus will | probably shift towards latency and performance and moving these | data modules closer to each other.. kinda like RAM :P | froh wrote: | IBM Z series silicon (z architecture and its predecessors | s390,...) is designed with multi tenancy in mind from the get | go. Finding a way to escape virtualization let alone | partitioning to access confidential competitor data was a no | go. | | And indeed to my understanding Spectre, Meltdown, Rowhammer and | similar attacks are not an issue there. | | https://en.wikipedia.org/wiki/Z/Architecture | | I wonder when more features from the mainframe cross pollinate | Intel, AMD, and ARM CPU architectures. | monocasa wrote: | Z Series is very much susceptible to Meltdown, Spectre, and | Rowhammer. IBM says that it should be fine because Spectre et | al need to be running untrusted workloads to work, but they | haven't updated their advisories since NetSpectre. : / | | A lot of the talk of mainframe levels of security is specious | at best. | froh wrote: | Wow. You're right, for Spectre, patches were needed on System | z. | | https://www.suse.com/de-de/support/kb/doc/?id=000019105 | | Meltdown didn't affect z or AMD. | monocasa wrote: | Series Z and POWER up to and including POWER9 are | susceptible to Meltdown as well. | | https://www.zdnet.com/article/meltdown-spectre-ibm-preps-fir... | classichasclass wrote: | But, not 7400 G4 and earlier. 
| | https://tenfourfox.blogspot.com/2018/01/actual-field-testing... | monocasa wrote: | Which aren't POWER cores. POWER is an IBM trademark; the | 7400 is a Motorola PowerPC core. | classichasclass wrote: | A fair point, though early POWER cores probably aren't | susceptible for the same reasons these aren't (can't speculate | through indirect branches using SPRs), and IBM was | involved in the development of both. | throwaway894345 wrote: | > Eventually, it will be obvious that running shared workloads | on a single piece of physical hardware has fundamentally | unremediatable security implications. | | If I understand correctly, Homomorphic Encryption aims to solve | for these kinds of attacks (although presumably the | computations are more expensive and programs must be | restructured to use HE primitives?). | https://en.wikipedia.org/wiki/Homomorphic_encryption | | EDIT: Why the downvotes? Am I mistaken? | teddyh wrote: | HE is _stupidly_ inefficient, and is entirely an academic | oddity, and there is every indication that it will always | remain that way. Bringing up HE in a context of real problems | needing real solutions is unproductive. | ghaff wrote: | If solving a problem in a stupidly inefficient way is the | only way of mitigating said problem, you may not have much | choice--at least for some use cases. Saying something | _can't_ be an answer because it isn't an efficient answer (today) | is also unproductive. | pbronez wrote: | Yeah HE is a totally different security model. It moves all | the trust out of the hardware. You ship encrypted workloads | to an untrusted party, who computes on the encrypted data and | returns an encrypted result. Then you decrypt the result to | use it. | | And yeah, this is just as inefficient as you'd think. | rocqua wrote: | I'd say it's much more inefficient than you'd think. | | Firstly the computations on encrypted data just take a LOT | longer, especially for Fully Homomorphic encryption. 
With a | single 64 bit addition taking microseconds. | | Secondly, the FHE code cannot branch based on data. If it | could, it would know something about the data, and it | wouldn't be proper encryption. This means an if statement | becomes "calculate both branches, and throw away the result | you don't need". Similarly, a for loop becomes "give me an | upper bound of how long this will run", then "loop for | exactly the upper bound number of times, if the loop is | 'done' early, just throw away the result of the remaining | operations". | | FHE is really cool, and has its uses in situations where | you want to cooperate without needing to trust, but it is | stupidly inefficient. (Things get really cool if two | parties want to cooperate and the computing party can e.g. | branch based on their local unencrypted data) | simiones wrote: | HE is incredibly slow, and unlikely to ever be fast enough | for common workflows. As in, an HE implementation of an | algorithm that runs in seconds in Python on a regular machine | might run in minutes or tens of minutes. | | Not to mention, to avoid side-channel attacks, an HE scheme | still needs to always run the longest possible sequence of | operations regardless of input (otherwise, information about | the input data is leaked through timing). So an HE version of | a quick-sort scheme would always run in O(n^2), otherwise it | would leak details about the contents of the list. In some | cases it would even have to run in the same amount of time | regardless of the _size_ of the list, to avoid leaking | information about that. | tomc1985 wrote: | Seems like it still might be useful for small bits of data | like session tokens or other encryption keys | Y_Y wrote: | To be fair, the performance problem you're talking about | should affect latency rather than throughput. If you can | batch lots of operations (not all controlled by the same | user) then you can do things as fast as you can without | leaking (much) information. 
| | This is still phenomenally slow, of course. | landonxjames wrote: | "We have never shared two threads on a core between EC2 | instances" - https://www.youtube.com/watch?v=kQ4H6XO- | iao&t=2485s | | Interesting that AWS has been mitigating for side channel | attacks since before they became a big news item. Curious about | Azure and GCP's stance on this | mlyle wrote: | Maybe they were super smart and foresaw side-channel being | such a big problem. | | Or, maybe, they just thought the lack of deterministic | performance created billing/accounting/customer service | problems. (One hyperthread can just about completely starve | in many circumstances). | thrashh wrote: | Well, you pick EC2 because you _want_ dedicated cores. | | You pick a VPS because you want to save costs and share | cores. | | So it's not so much AWS choosing as you the customer | choosing. | secondcoming wrote: | I was under the impression that AWS provides vCPUs, not | dedicated CPUs unless you go bare-metal? | bostik wrote: | You can get dedicated tenancy instances in AWS, for an | upfront monthly cost of ~$1.2k (per account) and about | +10% on top of your normal EC2 instance costs. | | I ran the numbers back in 2015 and persuaded the previous | employer to go for dedicated tenancy with all | performance-critical and privacy-sensitive workloads. I | was effectively hedging against unknown but practically | guaranteed cross-VM attacks to pacify a paranoid | regulator. | | Then Rowhammer happened. Less than a day later, our | contact with the regulator comes to me asking how it | affects us. Being able to answer - with absolute | confidence - that it did not, was one of the proudest | moments of my career. And the turnaround from "regulator | comes asking awkward questions" to "regulator is happy | and sees no reason to ask again" of less than 48 hours | must be some kind of record too. | vngzs wrote: | You get dedicated time on a core while your task is | running. 
For instance t3.medium is 2 vCPUs (because | hyperthreading) but as you can see in [0] it's only one | physical core. | | [0]: https://aws.amazon.com/ec2/physicalcores/ | dathinab wrote: | > Maybe they were super smart and foresaw side-channel | being such a big problem. | | The looming threat of side-channel attacks on SMT systems | has been known since, well, before we had SMT systems | (because it also can apply to co-processor, and non SMT | multi core systems). | | The difference between back then and today is "we believe | it's possible but haven't found a way yet" and "there are | multiple known ways", as well as it being widely known | instead of just in some communities. | | The reason we still shipped the problematic CPUs is | because improvement in perf. and as such competitiveness | and revenue on the short term were more important. | | There also was a shift in what people expect from security | and which attack vectors are relevant. For example user | applications a user installed were generally trusted as | much as the user. While today we increasingly move to not | trusting any applications even if they are installed by a | trusted user and produced by a trusted third party. Similarly, | running arbitrary untrusted native code from multiple | untrusted users and "upholding" side-channel protection | wasn't often an important requirement in the past. | belter wrote: | Microsoft Research looked into it - Paper is from 2020 and | is reference 24 in the document mentioned in the main post | here. | | "Are We Susceptible to Rowhammer? An End-to-End Methodology | for Cloud Providers" | | https://arxiv.org/pdf/2003.04498.pdf | | Although their answer in this paper was diplomatic, my | interpretation is that they confirm it as a problem. Their | conclusion was it would not be as bad as it was considered at | the time. To be revisited in the context of this more | recent work. 
| | Edit: Adding main reference | | "BLACKSMITH: Scalable Rowhammering in the Frequency Domain" | https://comsec.ethz.ch/wp-content/files/blacksmith_sp22.pdf | discreteevent wrote: | That's ok unless you are running something virtual like | kubernetes on top of the EC2 instance but want to ensure | isolation between containers/pods. | my123 wrote: | That's insecure today anyway. Containers are not a security | boundary. (If you don't use a VMM like gVisor, Firecracker | or go the Drawbridge way) | KingMachiavelli wrote: | They offer vCPUs in multiples of 2 so it makes logical sense | to divide the resources that way; performance would be a lot | more unpredictable if you could be sharing a single core with | another EC2 user/instance. | kmeisthax wrote: | Viable sidechannel attacks on SMP/Hyperthreaded designs have | been known about since 2005, only a few years after Intel | brought their first SMP designs to market. | KingMachiavelli wrote: | I don't see how Graviton/custom ARM chips are evidence of this | predicted trend. ARM chips tend to have even higher thread | counts and poorer per-core performance. The biggest security | difference is the absence of SMT/hyper-threading. | | I think it will come down to what you are willing to call | 'individual' processors. But actually having physically | distinct memory seems like a lot of overhead for attacks that | won't matter for 90% of users. Also I would think that the on- | board ECC of DDR5 would protect it against these types of | attacks. | 015a wrote: | Graviton is not a prediction of the trend; its a signal that | Amazon is willing to make very deep investment into custom, | customer-facing hardware if Intel/AMD can't deliver what they | need. | | The trend is yet to come. My statement is that, if AMD/Intel | doesn't adapt, Amazon has the hardware investment to leave | them behind, just like Apple did. | | But to be clear on two points: They will probably adapt. 
And | Amazon/etc will probably never leave them behind fully. DCs, | especially public cloud, are not all-or-nothing like Apple's | Mac Lineup is. | | Then the question follows, why would they want something | Intel/AMD isn't offering right now? The trend is System-on- | Chip. Beyond Security (this isn't the last electrical | interference/speculative execution-like attack we'll see). | SoCs are easier to service (easier != cheaper. holistic | replacement versus per-component debugging. servers are | cattle, not pets). Denser. More vertically integrated. | Capable of far higher IO performance. Lots of benefits. | | Mega-servers with 256 cores and 4 terabytes of memory still | have a huge place in all DCs; but not when multiple untrusted | workloads are running simultaneously. They're not for | EC2/Fargate/Lambda/etc; they're for S3. Highly managed, | trusted workloads. | Arrath wrote: | > This will be a massive opportunity for ARM & SoC | manufacturers in the coming years, as its far better positioned | to offer, for example, a single rack appliance with 64 | individual 2-core mini-computers at a price-point competitive | with a 128 core x86 rack, as one computer. | | I'm curious about the eventual end-game of security in this | space. Take the 64 individual processors in your example, give | each one their own independent memory bus to their own ram | chip, isolate them from each other as much as possible. What | else can be done, if a malicious process on processor Z has to | go all the way to disk to try to get back at data working on | processor J, is that as maximally secure as it can be without | being in a completely separate chassis with only network access | to the other device? | com2kid wrote: | > Eventually, it will be obvious that running shared workloads | on a single piece of physical hardware has fundamentally | unremediatable security implications. | | Sure, but this is worse than that. 
This is "your online poker | game client can get access to your web browser's bank account | session info." | | We need process isolation within a single machine, or else we | are kinda screwed as a field. | thrashh wrote: | IMO this is a perspective from software engineering. | | But this is an electrical problem. Interference is a huge | issue with any engineering that involves physical things and | these kinds of attacks are just interference problems. This | issue is no different from a microwave knocking out your Wi- | Fi. These attacks have become possible because the acceptable | interference threshold that chip makers have been using has | turned out to be too low. | | How do you fix interference problems? First, you choose a new | threshold of acceptable interference and then you engineer | better isolation, you lower density, and/or you switch | technology. | | We could make shared computing completely safe tomorrow if we | wanted to so I think calling this the end to shared computing | is quite alarmist. The issue is that we collectively want to | both have the cake and eat it too: we currently have a | certain cost-to-compute ratio that we have become accustomed | to and we don't want to compromise that. We're basically | buying time until we can invent a new technology that can | achieve the same density without the same level of | interference. | AnimalMuppet wrote: | "Lower density" is a _really_ hard sell at the moment... | yholio wrote: | In a world where insecure high density exists, secure low | density is at odds with cloud computing: the cloud makes | sense only if you can efficiently utilize your computing | power round the clock and make it cheaper than shipping | mostly idle terminals. If securing the cloud is expensive, | then there's a cutoff point where it's better to ship | highly dense, cheap terminals. | | So maybe "screwed as a field" is not an exaggeration if the | field is butt computing. 
| jeremyjh wrote: | Right, we may be entering an era in which secure network | computing is impractical and the impact could be very far | reaching. | johnvaluk wrote: | I've been experimenting with isolating work activity from | personal activity. It's amazing how difficult it is to | prevent information from leaking between networks and | applications. It's hard to find alternative solutions for | entrenched convergence/convenience features like | copy/paste, messaging and entertainment. Working remotely | reveals too much about your personal and work relationships | to corporations and VPNs only help to connect more dots. I | can't curl up in a hammock with 3 laptops and a phone on a | nice day, so I keep returning to a single device that does | it all. | orlytho wrote: | Sorry, entering? Just like climate change some of us have | been warning that a lack of focus on and willingness to | challenge the fundamentals of building software and systems | would lead to non-securable computing in the general case. | That warning has been sounding since the 1990s. Nobody | cares. It took a meat plant and pipeline paying a ransom in | cryptocurrency for everyone to notice that we are | completely and irredeemably fucked as a computing species. | We are already there, my guy, and it's only a matter of | time now. | | Think about someone trying to do basic ETL. Like having a | tabular file and summing it or something. Don't use Excel, | we say, stand up a $4 million Spark and AWS architecture | with seven hundred pitfalls that can let bored Russians | take over your whole network as if they were going to the | dry cleaners because remember, you just might be Google | someday. That's where we are. It's been a complete industry | failure for a decade and it's only getting worse. 
| Accelerating, even: now you need some operationally- | terrifying Kubernetes to even be at the table, and then as | an industry we (rightly) say running this stuff ourselves | is too hard, so pay Amazon to do it rather than even ponder | if we have settled on the right approach. | | Tada: Humanity just lost computing to three companies. We | very likely aren't getting it back. | | There are probably 5,000 people doing this work who can | adequately secure such a system and make it mostly | impermeable. Where middle computing is royally screwed is | that nearly all of them work in San Francisco or its clones | abroad. So then you get "best practice" blog posts and | industry think pieces and the lowest bidder ties them | together into something resembling a competent computing | system. That's been the state of the art since 2004 | everywhere except Santa Clara county. | | With the exception of some areas in the IC and DoD, I just | described the entirety of US government IT. That ETL | example? It's actually real and underpins a small part of | Medicare across several government contracts. Because the | tools the valley exports are all they've got, and we sure | love building systems with massive footguns, and then | shaming organizations publicly for missing item #543 on the | tribal "secure your computing system" checklist and | shooting themselves. | | The entire industry must change, top to bottom, but just | like climate change, again, that's a nonstarter. Posix and | the Web are not the path forward and I hope I live to see | the industry figure that out. I'm increasingly skeptical. | The good news is my hometown might flood into the sea | first, sparing me from considering in my last moments that | every argument I've _ever_ made in this profession has | fallen on deaf ears and that everyone has to derive our | industry's peril from first principles for themselves. | jeremyjh wrote: | Right, on average industry has been failing forever. 
The | difference now is it might not be practical for _anyone_ | to actually secure an internet server or web browser, | full stop. I think that is a fundamentally different | situation. | orlytho wrote: | I think we have been in that situation since everybody | started mimicking how Google does things | FpUser wrote: | >"as its far better positioned to offer, for example, a single | rack appliance with 64 individual 2-core mini-computers at a | price-point competitive with a 128 core x86 rack," | | I have a server application capable of utilizing many threads and | thousands of requests/s. You think I will deploy it on a tiny 2-core | CPU? No thank you, it currently runs on a powerful dedicated | server from Hetzner where I control everything. | | >"AMD and Intel need to adjust, or face extinction .... | | If they don't evolve, AWS will for them..." | | Sounds like pontification / FUD. | temac wrote: | The M1 family certainly doesn't go in the direction of fewer | cores, and even if you had a single one you could probably | rowhammer during your timeslice and then patiently wait for the | target process to execute. Since even individual computers | execute random garbage code straight from the Internet (e.g. | JS), there is still a need for internal security. | | That being said, and even if I consider that a quite different | subject, I agree that the current efficiency story of Intel is | not very good, but I hope they will improve in the not-so-distant | future. The dev lifecycle of CPUs is quite long and it seems to | be an obvious target. I suspect they will be _forced_ to | improve their efficiency, because that's actually where the | performance potential is today (the current dissipation level | of their latest desktop CPUs is not reasonable, and prevents | scalability). And trying to lower the core count can also lead | to high consumption, e.g. if you want your performance back by | increasing the frequency.
Wide and "slow" is needed, and it is | harder to increase the internal number of execution units per | core and have them actually used, than to increase the core | count -- plus ironically one way to do that is through HT, | which goes against your wish to share less hardware. (Now if | you compare their P-core and E-core in Alder Lake the story is | more complicated, but their marketing figures seem very | strange, so I won't conclude anything for now. The current | instances of P-core we have are for that weird desktop market | with unreasonably high TDP anyway.) | | Now if you really want miniaturized _individual_ computers that | would not be shared at all, I'm not sure the market will | actually go in that direction because big systems will | continue to be needed (and clusters are a niche mostly for | HPC), and I'm not sure a "slightly more secure on smaller | systems" market would be interesting enough. Esp. in an era of | chip shortage. And also because it would _still_ be bigger than | a shared equivalent system made with the same tech... But if | that's really a niche that has to be addressed, I suspect it | would mostly be a matter for Intel to create new small _and_ | slower SKUs ("slower" compared to their desktop insanities) | -- they even kind of have that already, but yes the physical | miniaturization aspect is not handled yet -- that does not | really depend that much on the cores, though. And even in | those computers, I'm not sure there would be much demand for | very low core counts. The threading of pretty much all | workloads tends to increase, nowadays. | | One last point, after the e.g. Pentium 4 fiasco nobody really | left the IPC race. AMD had some difficulties when trying | "weird" ideas (_part_ of which were because of their marketing | communication), and then again a completely new design from | scratch to market takes time.
In general there was a pause of | performance growth around 2016 for a few years, and that was | mostly Intel having _process_ problems and the rest of the | industry catching up (and then overtaking them). | oopsyDoodl wrote: | Intel gets it and is adjusting to be a foundry that builds | chips to application spec. | | For me cloud computing is just where the best pay is. I do not | at all see it as the future of computing. | | One reason is ML will help us realize we write code we don't | need; so much of it is syntax sugar for business specific | needs; infra, security... it'll be realized cloud software is | solving unemployment not technical problems of value. That many | issues with software back in the day were lower quality | networks, and consumer hardware. I mean any phone can abstract | metadata from any one users amount of behavior, we do it in a | DC because that's where the jobs are. Chip manufacturing will | include ML normalized logic for specific application. | | LAN IOT will improve and we'll realize the Metaverse can be | implemented with a local client and AI generated art, on a | mobile GPUs power in a few years. Middle men like Zuckerberg | face the most uncertain future. He failed to diversify as well | as Bezos, Newell, and others. | | IMO, Valve is a serious threat with Steamdeck; an open IOT | brain in a kid friendly form factor could be the new cigarette. | Even Apple may have to take them seriously. My kids iPads need | replacing soon; a flat glass slab with no interactive controls, | requiring another $800+ machine to develop on, bloated | development tools, fees, and a bunch of cloud logins, are not | going to motivate kids to feel creative. | api wrote: | I am not at all convinced that this is not a solvable problem. | It may require significant changes in how schedulers work, such | as resurrecting the idea of processor affinity. | | Unfortunately it will likely have negative performance | implications for multi-tenant work loads. 
| yread wrote: | Interesting that there is a lot of variation between the modules | - some get 1.1 million bit flips and some 14. Perhaps that was | the ECC? | egberts1 wrote: | Does not work on Intel Core Generation 1 (specifically Xeon | Westmere EP hex-core) with DDR3-ECC. | | Perhaps I should start snapping them all up ... because market | demand. | pomian wrote: | Did you discover anywhere how vulnerable DDR3 RAM is yet? Both | ECC and regular? Since this is a hardware-induced | vulnerability, maybe it doesn't exist? | egberts1 wrote: | And this Intel Xeon X5660 overclocked to 4.1 GHz using | DDR3-1600 makes it the cheapest 6-CPU per chipset evah ... 7 | years running (since 2015) and hopefully the safest. | | https://overclock-then-game.com/index.php/benchmarks/1-x5660... | joebob42 wrote: | To my understanding it's still hard to exploit this to steal | information / break access / etc just because you'd need to know | where the right bits were. On the other hand, if all we want to | do is break our adversary's process / crash it / make it perform | arbitrary incorrect behavior, this seems substantially easier to | accomplish even if we have no idea at all which bits we are | flipping. | boibombeiro wrote: | Brainstorming some solutions. | | Maybe a randomized algorithm for ECC. Every so often, change how | the ECC is computed? | | The region nearby where privileged information is stored could | have a speed limit on multiple sequential writes? | | Add blocks that are more susceptible to those attacks to do early | detection. A honeypot bit. | fbanon wrote: | Can't wait for the first large-scale exploit of this stuff, which | will finally force DRAM manufacturers to fix their faulty crap. | passivate wrote: | Why wish for exploits that harm regular users who have nothing | to do with design decisions made by DRAM manufacturers? | zw123456 wrote: | Does anyone know if SDRAM (SDR) is susceptible to this attack? | | If you don't need speed but need security...
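The "honeypot bit" idea brainstormed above can be sketched in software, purely as an illustration. Everything here (`CANARY_BYTE`, `make_canary`, `find_flips`) is a hypothetical name, and the sketch assumes the canary buffer somehow sits physically next to the data being guarded, which ordinary user-space code cannot arrange. It shows the detection logic only, not a working Rowhammer defence.

```python
# Hypothetical canary ("honeypot") buffer: memory holding a known pattern
# that is scanned periodically for unexpected bit flips.
CANARY_BYTE = 0x55  # alternating 0101 pattern, chosen arbitrarily for this sketch


def make_canary(size: int) -> bytearray:
    """Allocate a buffer filled with the known pattern."""
    return bytearray([CANARY_BYTE]) * size


def find_flips(canary: bytearray) -> list[tuple[int, int]]:
    """Return (byte_offset, bit_index) for every bit that no longer matches."""
    flips = []
    for offset, value in enumerate(canary):
        diff = value ^ CANARY_BYTE  # set bits mark the flipped positions
        bit = 0
        while diff:
            if diff & 1:
                flips.append((offset, bit))
            diff >>= 1
            bit += 1
    return flips


canary = make_canary(4096)
canary[100] ^= 0b0000_1000  # simulate a single flipped bit
print(find_flips(canary))   # [(100, 3)]
```

A hardware version of this would presumably use deliberately weak cells; the software analogue can only report flips after the fact.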
| rkagerer wrote: | How far _back_ in time must one go to find main memory tech that | would be immune to this? (eg. SRAM, magnetic core, vacuum tubes?) | josteink wrote: | DDR3 supposedly. It's not packed as densely. | mdrzn wrote: | Somewhere they mention that even just DDR3 is immune from this, | because the chip is not "crammed" enough to be susceptible to | this attack. | zanethomas wrote: | Funny thing about that. | | At Alpha Microsystems, about 1982, I was in charge of the | diagnostic programming group. At that time failing memory boards | were expensive and customers would not be happy with such | components. | | I wrote a memory testing diagnostic that was based on knowing | exactly how addresses were mapped to cell locations so I could | try to provoke such failures. | | Chip manufacturers were aware of this problem which is why they | scrambled the addresses. | | Potential vendors, Motorola et al, were required to provide | mapping information before we would consider their chips. | | Now I'm curious to know what such mapping looks like with modern | memory chips. | classichasclass wrote: | Not related to article: would love to hear about your time at | Alpha Microsystems. See https://ampm.floodgap.com/ (hosted on | an Alpha Micro Eagle 300). | zanethomas wrote: | Here's a bit of history ... I think the lead up helps to | understand what I did first at AM. I'll send you an email | with more info later. | | 1976-1985 My first job was at Basic Four Corporation. I got | in as a test technician, assembling small refrigerator-sized | mini-computers, giving them their first tests and swapping | components until they passed. I soon learned how to use the | machine-language assembler and started writing small programs | to help me determine which components were failing without | swapping and hoping. Within a few months I was testing more | than 2x the number of machines the other techs were testing. 
| Management noticed and soon the other techs were getting up | to speed. | | At this point management pretty-much turned me loose. I moved | up to diagnosing and repairing failed components (8"x11" | pcbs). My understanding of programming and digital circuits | allowed me to write small programs that could be used to | "light up" specific circuits on the board making it easy to | poke around with a scope to see where things went wrong. | Again this was a huge productivity boost and the technique | was propagated to other techs. A couple years later I went | back for a while part-time as a consultant and wrote my first | DSL for techs to use. | | Next I talked my way into the firmware development group and | worked on firmware for tape, disk and other devices. This is | the period of time when microprocessors were being | incorporated into everything and my experience with the | Micro-68 put me in a good position to participate. I also got | to write microcode for a 2901-based cpu that was in | development. | | And then, somehow, perhaps at a user's group, I learned about | Alpha Microsystems. | | When I first visited Alpha Microsystems their idea of | "burning in" pcbs consisted of putting them in a powered | backplane, in a wooden box, with a lightbulb, where they sat | for some period of time. | | Basic Four had serious testing which included putting entire | computers into temperature-cycling ovens where they ran tests | for 24 hours. That knocked out a pretty high percentage of | boards. After I told them what they were missing Alpha | Microsystems hired me to improve their process. | | For the next several years I participated in creating the | flow of production and testing. A department evolved to | handle the hardware side of things and I became head of a | diagnostic programming department which grew to, variously, | between 6 and 10 programmers. 
After that department was | functioning and had someone who could step up I transferred | into the operating systems group. I was one of only three | people allowed to work on the operating system code, let | alone even see it since it was held as a trade secret. | | During my last year at Alpha Microsystems a brilliant | programmer I had hired introduced me to the then | just-released Structure and Interpretation of Computer | Programs, a new textbook for students at MIT. That book's | use of the language Scheme introduced me to first-class | functions, closures, and many other concepts which found | their way into popular programming languages decades later. | The SICP had all the information one needed to create a | Scheme interpreter. I wrote one using 68000 assembler so I | could run the sample code. | kawsper wrote: | Wauw, amazing! Thank you so much for sharing your story! | zanethomas wrote: | ha, thanks! | | i love programming so much i've never quit | | taught fullstack at ucla BC (before covid) | | did a large frontend project with vue the past year | | i'll die with a keyboard under my fingers! | pomian wrote: | It is very reassuring that there is such an agency as the | Computer Security Group (this article). Run by and funded | independently by a multinational science agency. Likely without | oversight by any industrial organisation (read lobby group.) It | would be nice to have similar scientific bodies for other | livelihood and security threats, such as health, and logistics, | most especially food, drugs and environmental issues (chemicals | for example.) Will this agency become corrupted as those have | been over time? | contidrift wrote: | For cosmetics: https://www.ewg.org/skindeep/ | Dylan16807 wrote: | So it's another Target Row Refresh _bypass_. | | Which is only possible because the DRAM has limited memory for | recently-accessed rows. | | When is a company going to put out chips that have the access | count stored _inside_ the row?
It 's the most obvious way to do | it and makes this entire class of attack impossible. | | Edit: Okay, reading the paper more apparently LPDDR5 has | something similar to this. Why is LPDDR so divergent from normal | DDR? | r00fus wrote: | Is this purely an x86 concern or does it affect non-x86 usages of | DRAM? ie, ARM cores, Apple Silicon, RISC-V, etc. | zekica wrote: | It affects all processors that use this type of DRAM. Apple | uses LPDDR5x and from what I've read, they don't have ECC, so | this attack should work fine on M1. | nickcw wrote: | From the article: > We demonstrate that it is possible to trigger | Rowhammer bit flips on all DRAM devices today despite deployed | mitigations on commodity off-the-shelf systems with little | effort. | | The fact that user space code can cause bit flips in your RAM is | a hardware bug. I'd love to see this code in memory testers like | memtest86 so I could send the RAM back if it ever caught a | problem like this. | | I guess it shows just how close to the edge of not working our | modern computing environment is. | temac wrote: | The problem is right now DDR4 devices that work correctly in | that regard, do not exist. Likely the best mitigation for any | sensitive application, even so lightly, is DDR4 with ECC (even | if it may not be enough, it is vastly better than nothing, and | not just because of rowhammer) | | And I have no idea if the internal "ECC" of standard DDR5 helps | or not. It is not intended for regular ECC level of reliability | anyway. (And I have seen discussion about likely bitflips | detected in crash dumps of M1 Pro devices) | | So, as much as I would like to return defective devices, I | would probably be left with no computer, no smartphone, etc. | jandrese wrote: | Maybe a paranoid app could try to allocate the memory | adjacent to any sensitive bits as a buffer? 
That would be | pretty difficult to do but might be possible if you are | tremendously paranoid and have a good way to examine the | hardware. | to11mtm wrote: | I wonder how easy that would be to reason about though. I | really don't know much about modern DRAM circuitry; I feel | like it might be abstracted away to some level, also there | are bios-tweaky settings that might make a difference (i.e. | things like channel configuration and/or bank interleaving, | if that's still a thing.) | | IDK though. Maybe there's a way. I've worked on code that | uses padding around a data structure to ensure it has its | own cache line. Maybe if you allocate a large enough | contiguous block you'll be OK? | mnw21cam wrote: | You'd have to have the operating system's cooperation, | because it may be mapping individual 4k pages all over the | place, and it's the only thing that has a chance of knowing | how the memory is laid out on the actual chips. | AshamedCaptain wrote: | And the CPU and the chipset's cooperation. In many cases | it is even a trade secret how data is striped across the | different slots/banks. | xxpor wrote: | Doesn't the same thing apply to exploiting rowhammer though? If | you're after specific data, how would you know what | physical address to use, even if you knew the physical | address of the target data? | jandrese wrote: | Yes, but there is a difference between an attack being | run by a low level hardware hacker vs. the software being | run by average users. This is why I said it would be | hard, you would need to encode a lot of low level | hardware knowledge into your application. | | This might only be useful in very specialized | circumstances, like with kernel support on only a handful | of carefully chosen hardware platforms. However, someone | like Apple could do this on their systems, as they | control both the kernel and the hardware.
Sensitive | memory locations could be cordoned off in special zones | with buffering to prevent Rowhammer-type attacks. | StillBored wrote: | > it is even a trade secret how data is striped across | the different slots/banks | | If that is true, vs just the usual "you don't need to | know" trash common these days, it's crazy. A pretty good | picture can be built up with just software timing | analysis, but it's hardly rocket science to hook a logic | analyzer/scope up and determine bit swizzling and | page/bank interleave. Plus, many of the more open BIOS | vendors have tended to provide page/row/controller/socket | interleave options for quite a while, partially because | it can mean a 5-10% uplift for some applications to have | it set one way or the other. It's been one of those "how do | you tune your memory system" options for a couple decades | now. | [deleted] | staticassertion wrote: | If your app is particularly paranoid you could just double | your memory usage, and verify errors against the clone. | pas wrote: | Or use different banks of DRAM for different security | contexts, managed via NUMA? | indymike wrote: | > The problem is right now DDR4 devices that work correctly | in that regard, do not exist. | | Yes, and this is actually the problem. Without safe hardware, | it is almost impossible to write safe software. | userbinator wrote: | Meanwhile you can still buy DDR3 --- used --- from various | shops in China, and it'll be 100% perfect and | Rowhammer-free because it was made before the insane process | shrinkages that caused it. | | (Have done this a few times. Was hesitant initially because | of the low price and used nature, but when the first stick | passed MemTest86+ including RH perfectly, I was convinced. | Half a dozen more sticks later and still good. But maybe | they've started to run out of the good old stuff now...)
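The suggestion earlier in this subthread (double your memory usage and verify against the clone) might look roughly like the sketch below. `ShadowedBuffer` is a hypothetical name, and storing the clone bit-inverted is an assumption of this sketch, chosen so that a disturbance biased toward one flip direction cannot corrupt both copies identically. With two copies it can only detect a mismatch, not say which copy is right.

```python
class ShadowedBuffer:
    """Keeps a bitwise-inverted clone of the data and compares on every read."""

    def __init__(self, data: bytes) -> None:
        self._primary = bytearray(data)
        # The clone is stored inverted, so a valid pair always XORs to 0xFF.
        self._shadow = bytearray(b ^ 0xFF for b in data)

    def read(self) -> bytes:
        """Return the data, or raise if the two copies no longer agree."""
        for i, (p, s) in enumerate(zip(self._primary, self._shadow)):
            if p ^ s != 0xFF:
                raise RuntimeError(f"bit flip detected at byte {i}")
        return bytes(self._primary)


buf = ShadowedBuffer(b"launch codes")
print(buf.read())          # reads back cleanly while both copies agree
buf._primary[3] ^= 0x04    # simulate a flip in the primary copy
try:
    buf.read()
except RuntimeError as err:
    print(err)             # reports the mismatching byte offset
```

A third copy would allow majority voting instead of just raising, at the cost of tripling memory use; none of this helps if both copies end up in the same vulnerable row.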
| oopsyDoodl wrote: | I think this illustrates the cloud is unacceptable for | anything more than storage and retrieval. | | All computed results from data science must include steps and | code to verify locally. | | It calls into question privacy on federated networks and | crypto networks; any node can be manipulated locally to | change payload outputs on delivery, reveal secrets, disrupt | workloads. | | This makes sense to a lot of folks in computer engineering | and physics, versus abstract software. No physical theory I | know of offers any guarantee our arbitrary computing machines | will ever be securable. We put fart pipes on Hondas. | | Science proves it's titillating smoke and mirrors once again. | Still waiting for nuclear rocket cars. | | I think this proves further as well why general computing | chips need to be replaced with workload specific designs, | where the anticipated inputs are well known and no vague | logic paths to intentionally allow software monkey patching | ever ship. | cheschire wrote: | Security doesn't scale at a price point that private sector | companies could typically afford. | | Perhaps we fail at pricing security into the value of a | company, or maybe that's what risk appetite is about. | myself248 wrote: | Yep, though I agree with the parent, if a RAM returns data | from a location that is different from what is stored in that | location, assuming all recommended timings have been | followed, then that RAM is defective. If that means they | should be recommending a refresh after every single | operation, then that's what it means. | | In other words, the whole industry sits on a bed of lies at | this point and it's only because government is technically | incompetent that we haven't seen the world's biggest class- | action. 
| ygjb wrote: | > In other words, the whole industry sits on a bed of lies | at this point and it's only because government is | technically incompetent that we haven't seen the world's | biggest class-action. | | That's a pretty bold claim - if it's accurate, can you | point at the government regulation that precludes a class | action lawsuit? I would assume it's something in the T&C or | EULA for the hardware rather than a regulation? | userbinator wrote: | Bold but sadly correct. The industry runs on misdirection | and deception. | | If users knew better, we'd have the equivalent of another | Pentium FDIV bug. | buran77 wrote: | According to an older paper [0] ECC can also be bypassed after | reverse-engineering the mitigation in DDR3 DIMMs. Also: | | > "DDR4 systems with ECC will likely be more exploitable, | after reverse-engineering the ECC functions," researchers | Razavi and Jattke said | | > What if I have ECC-capable DIMMs? Previous work [1] showed | that due to the large number of bit flips in current DDR4 | devices, ECC cannot provide complete protection against | Rowhammer but makes exploitation harder. | | [0] https://cs.vu.nl/~lcr220/ecc/ecc-rh-paper-eccploit-press-pre... | | [1] | https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8835222 | FpUser wrote: | While it cannot protect from bit flips it can most likely | warn about an attack in progress due to a large number of bit | flips. | jacquesm wrote: | You'll be sending back all of your RAM then. | jdmichal wrote: | > I'd love to see this code in memory testers like memtest86 so | I could send the RAM back if it ever caught a problem like | this. | | Is this an individual stick bug, or a design bug akin to | Spectre? The fact that they tested 40 different sticks and seemed | to find effective patterns for all of them makes me think it's | the latter. In which case, just return all your computers? | jandrese wrote: | Yeah, this is a problem with high density DRAM.
Fixes are | possible but difficult and expensive. In the cutthroat DRAM | market nobody seems willing to put in the engineering effort | to develop a solution and then manufacture it. It would have | to sell at a premium, in a market where it's difficult to | even find ECC memory at times. | | The article states that ECC isn't sufficient to solve | Rowhammer, but it does make the attack harder. If you are | looking for a hardware solution it is your only option | currently, even if it is imperfect. | R0b0t1 wrote: | If you detect flips you can send it back, it's violating | the advertised interface. | 0xffff2 wrote: | Sure, but if it's possible to cause bit flips in | basically all currently available components, then what's | the point? Return all of your RAM and have a non- | functional computer? | therein wrote: | Can't get bit flips on my RAM if I have no RAM. | ziddoap wrote: | So you just.. Send back all of your RAM all the time | because they are all susceptible to bitflips? | | This is not a tenable solution. | seiferteric wrote: | I imagine you are right that consumers don't care or know | enough to put pressure on manufacturers, but what about big | companies like the FANGs? Or Government agencies like NSA | etc? What are they doing about it? I imagine intelligence | agencies are well aware of the issue and have or are trying | to get solutions to protect their own infra. | codetrotter wrote: | Off topic but shouldn't FAANG really be referred to as | MAAAN now? | | Google -> Alphabet | | Facebook -> Meta | | Meta, Apple, Amazon, Alphabet, Netflix. MAAAN. | tester756 wrote: | I like MAGMA more | festkal wrote: | MANGA | | M - Meta A - Apple N - Netflix G - Google A - Amazon | gavinray wrote: | I'm now adopting this, thanks | somethingwitty1 wrote: | Google still exists. Facebook still exists. Meta/Alphabet | are mainly holding companies. For example, I received | reach outs from recruiters for Google and Facebook today. | Not from Meta and Alphabet. 
Maybe once Meta and Alphabet | deliver on things and build brand recognition, it should | change. Until then, I'd vote "no". | vidarh wrote: | If you're doing that: M3AN. | gjs278 wrote: | faang is a stupid fucking acronym, just kill it and refer | to them as nothing | daniel-thompson wrote: | AMANAM | | Apple Microsoft Amazon Netflix Alphabet Meta | dboreham wrote: | It happened because at some point in the past (around 10 | years ago I believe), the memory vendors decided that it was | ok to design product with known "pattern sensitivity", | because...I suppose they made more money that way and no | customer complained loudly. Pattern sensitivity in memory | chips is a very old problem, and previously was treated as a | fatal design error, fixed and the affected chips used for | parking lot infill. But today every chip sold is affected. | Basically the same as Boeing and the MAX, just fewer people | killed. | myself248 wrote: | > Pattern sensitivity in memory chips is a very old | problem, and previously was treated as a fatal design error | | You seem to know more about this than I do. Where might one | go to read more about this? Are there whitepapers, design | docs, other resources which could be used to assert | positively that the industry used to consider this a | defect? | tbob22 wrote: | MemTest86 already has a rowhammer test, not sure how it | compares to blacksmith but some sticks I have do fail that test | and can only be mitigated by setting tREFI extremely low (while | also taking a large performance hit). | | Most of the higher end hynix/micron/samsung sticks I've tried | do not fail at JEDEC or XMP after 7+ passes. | userbinator wrote: | The problem is that apparently the authors of memory testing | programs were somehow convinced to hide the severity of this | issue when it first appeared, claiming things like it being a | rare edge case and that _too much RAM would fail_. 
| | I believe newer versions of MemTest86+ have a Rowhammer test but | it's disabled by default. | | See my previous comment on this matter: | | https://news.ycombinator.com/item?id=12410274 | gjs278 wrote: | did you read the fucking headline? all ram is vulnerable to | this. you don't need to run the test or return anything. you | won't get a stick that isn't vulnerable. jesus you are dumb | StillBored wrote: | Pretty sure the passmark version does | https://www.memtest86.com/compare.html | | Except that the distros are still shipping an older version | due to license issues (or something like that AFAIK). | mithro wrote: | Google is working on platforms to make it easier to explore | this problem, see | https://opensource.googleblog.com/2021/11/Open%20source%20DD... | adriag wrote: | nice | anthk wrote: | HN itself shouldn't even need JS. | acomjean wrote: | This seems bad, but as a practical matter, what does this mean? | | If you have a process running on your machine can it use this to | get root? Read Keys? | | It looks like they ran their process for 12 hours to do the | flipping. | | And if you're flipping your process's memory for that long, what | are the chances you are next to sensitive memory for another | process? It seems bad, but it seems like if you're randomly | flipping bits in memory the system will likely crash. | TacticalCoder wrote: | > If you have a process running on your machine can it use this | to get root? Read Keys? | | Which keys? OpenBSD / OpenSSH for example does now (since | 2019?) encrypt SSH keys in RAM to prevent rowhammer-like | attacks. They're mixing the key with a huge "pre key" and an | attacker would need to side-channel attack the entire pre key | to be able to read the real SSH keys. | | So it's not as if there was _nothing_ that could be done to | guard against this.
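The "pre key" idea described above can be illustrated with a toy sketch. This is not OpenSSH's actual construction: the 16 KiB pre-key size, the function names, and the SHAKE-256 keystream XOR are all assumptions made for illustration, standing in for a real cipher. The point it shows is that the stored secret is only recoverable by someone who has every bit of the large pre-key.

```python
import hashlib
import secrets

PREKEY_SIZE = 16 * 1024  # bytes of random pre-key; an assumption of this sketch


def _keystream(prekey: bytes, length: int) -> bytes:
    # Every byte of the pre-key feeds the hash, so a single unknown or wrong
    # pre-key bit changes the whole keystream.
    return hashlib.shake_256(prekey).digest(length)


def shield(secret: bytes) -> tuple[bytes, bytes]:
    """Return (prekey, shielded_secret); the plaintext need not stay in memory."""
    prekey = secrets.token_bytes(PREKEY_SIZE)
    stream = _keystream(prekey, len(secret))
    return prekey, bytes(a ^ b for a, b in zip(secret, stream))


def unshield(prekey: bytes, shielded: bytes) -> bytes:
    """Recover the secret just before it is needed."""
    stream = _keystream(prekey, len(shielded))
    return bytes(a ^ b for a, b in zip(shielded, stream))


prekey, shielded = shield(b"example private key material")
print(unshield(prekey, shielded))  # round-trips to the original secret
```

An attacker leaking memory a few bits at a time would have to recover all of the pre-key without error before the shielded value is of any use, which is the property the comment describes.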
| | It may require changing lots of software/libraries but at least | the security-conscious ones have already started, without waiting | for this "Blacksmith" attack. | guerrilla wrote: | > If you have a process running on your machine can it use this | to get root? Read Keys? | | Yes. There's a BlackHat talk on that[1]. DoS is a big issue on | shared hardware too. You can crash other processes, the kernel | and the hypervisor. | | > And if you're flipping your process's memory for that long, | what are the chances you are next to sensitive memory for | another process? | | You don't need to leave it up to chance. There are apparently | ways you can control it and of course you can just spray | everything. | | 1. | https://www.blackhat.com/docs/us-15/materials/us-15-Seaborn-... | | 2. https://github.com/google/rowhammer-test/blob/master/rowhamm... | acomjean wrote: | Interesting. I can see taking the system down, especially on | shared hardware. | | My (probably slightly naive) understanding is the OS is | allocating memory, acting like a memory cop so processes | don't overlap and swapping when needed and so forth. It seems | like these hardware errors might be mitigated by the OS. | | I'm in a bit over my head in the PDF, but it explains: "Our | kernel privilege escalation works by using row hammering to | induce a bit flip in a page table entry (PTE) that causes the | PTE to point to a physical page containing a page table of | the attacking process. This gives the attacking process | read-write access to one of its own page tables, and hence to | all of physical memory." | | I'm still a little fuzzy on how they get the right location | in the page table without hosing the system, but this gives | me enough of a gist. Thanks! | guerrilla wrote: | > My (probably slightly naive) understanding is the OS is | allocating memory, acting like a memory cop so processes | don't overlap and swapping when needed and so forth.
It | seems like these hardware errors might be mitigated by the | OS. | | You don't need to have access to the memory you want to | attack, only the ability to cause something to read from | memory that physically neighbors the memory you want to | attack. | | Think of page tables like a big array in kernel memory (I'm | being imprecise.) The entries in this table are mappings | from your virtual addresses to physical addresses. You can | cause the kernel to read at least part (or all?) of | your PTEs, which means you can cause it to flip bits in | neighboring physical cells, which means you can modify the | PTEs. TL;DR: If you can cause the kernel to read some | specific memory then you can modify memory near it. In this | case, you can modify the page table entries, allocating | memory that isn't yours to you (or doing other funny | things.) | | A simpler to explain but probably unrealistic | example: Imagine if your UID is stored right next to | your GID in some kernel structure, if you could cause the | kernel to repeatedly read your GID then you could cause bit | flips in your UID, which would change which user your | user... | ARandumGuy wrote: | That's my question as well. I'm not a security expert, but this | doesn't seem all that concerning for anything other than the | highest security applications. It seems to me that executing | this attack not only requires running code on the target | machine (which admittedly isn't that big of a hurdle), but | requires basically complete knowledge of memory allocation at | the time of the attack, something that is fairly opaque and | ever changing on most hardware. | | Is there something I'm missing here? What are the realistic | attacks that this vulnerability allows? | acuozzo wrote: | > What are the realistic attacks that this vulnerability | allows? | | https://github.com/IAIK/rowhammerjs | joebob42 wrote: | It might be hard to do a targeted attack.
On the other hand, | flipping random bits is likely to just start crashing stuff | and / or producing incorrect results pretty reliably. | guerrilla wrote: | See my reply, sibling to yours. | pfortuny wrote: | See my reply above: in-memory data analysis... | myself248 wrote: | There's no reason the code couldn't just blindly try every | attack until it finds one that succeeds. If there's a | malicious app that runs a lot (lookin' at you, random Firefox | tab that eats 99% of my CPU for minutes at a time and then | goes idle again), it has all the time in the world. | pfortuny wrote: | This may be very bad for data analysts. Imagine trying to study | a huge in-memory database and getting different values each | time... | speedgoose wrote: | Assuming you run your database in a public cloud, what's the | likelihood of someone throwing money at the cloud provider to | run the attack on a host shared with your database, with no | added benefits to them? | kevingadd wrote: | The implication of the post was that your own data analysis | workloads could cause rowhammer flips unintentionally, I | think. | pomian wrote: | Complex modeling that runs for days could have completely | messed up results, especially since the old standard of using | ECC is not proof of infallibility. Often those results are in | series, so early errors would compound, and it may be very hard to | determine if and why results are wrong. | zsmi wrote: | There was another paper last year that showed something similar, | and it went into way more detail on what TRR actually is. | | TRRespass: Exploiting the Many Sides of Target Row Refresh | | https://arxiv.org/pdf/2004.01807.pdf | | Memory controller mitigation of RowHammer can work pretty well, | if one actually has it turned on. Which is unfortunately rarely | the case even in 2021. | philipkglass wrote: | Is this possibly a route to jailbreaking for iOS, via the | temporary provisioning profile for apps?
It seems like you could | run a Rowhammer memory corruption app on your personal device | until getting escalated privileges. Newer OS releases may not | patch the hole since this is a hardware flaw. But I admittedly | have only the vaguest idea of what defenses need overcoming on a | modern iOS device. | londons_explore wrote: | Rowhammer is still pretty hard to exploit because typically you | can't reliably flip most bits, and you can normally only flip | bits that are very close in physical memory address to those | you control. | | Combine that with a lack of knowledge of physical memory | addresses and inability to have much control over memory | layouts, and it really gets tricky to gain privileges outside a | lab environment in a reasonably short time. | | Remember that flipping bits at random will almost certainly | kernel panic the machine before it gives you root access. | | I'm sure a determined attacker could do it though. | chasil wrote: | Hopefully, the OpenBSD extreme implementation of ASLR makes | it even safer. | | On every boot, there is a brand new kernel and C library: | reordering libraries: done reorder_kernel: kernel | relinking done | | ASLR has been compromised in the past, so this likely isn't | completely secure. | jandrese wrote: | This seems like a case where targeting an iPhone might make | life easier. The hardware is quite uniform for a particular | model. | SV_BubbleTime wrote: | Which gets you some architecture knowledge, but doesn't | promise or indicate that your userland RAM space is | adjacent to anything important. It's a better start at | playing the game though. | guerrilla wrote: | > very close in physical memory address to those you control. | | Not quite. It doesn't require the ability to write to the | neighboring cells, just read from them. | belter wrote: | Great work. Fascinating and depressing at the same time.
Like | watching your house on fire, but not being able to avoid getting | mesmerized by the beautiful flames and tones as your designer | furniture burns away :-) | | "...Are there any DIMMs that are safe? | | We did not find any DIMMs that are completely safe. According to | our data, some DIMMs are more vulnerable to our new Rowhammer | patterns than others. | | Which implications do these new results have for me? | | Triggering bit flips has become more easy on current DDR4 | devices, which facilitates attacks. As DRAM devices in the wild | cannot be updated, they will remain vulnerable for many years. | | How can I check whether my DRAM is vulnerable? | | The code of our Blacksmith Rowhammer fuzzer, which you can use to | assess your DRAM device for bit flips, is available on GitHub. We | also have an early FPGA version of Blacksmith, and we are working | with Google to fully integrate it into an open-source FPGA | Rowhammer-testing platform. | | Why hasn't JEDEC fixed this issue yet? | | A very good question! By now we know, thanks to a better | understanding, that solving Rowhammer is hard but not impossible. | We believe that there is a lot of bureaucracy involved inside | JEDEC that makes it very difficult. | | What if I have ECC-capable DIMMs? | | Previous work showed that due to the large number of bit flips in | current DDR4 devices, ECC cannot provide complete protection | against Rowhammer but makes exploitation harder. | | What if my system runs with a double refresh rate? | | Besides an increased | performance overhead and power consumption, previous work (e.g., | Mutlu et al. and Frigo et al.) showed that doubling the refresh | rate is a weak solution not providing complete protection. | | Why did you anonymize the name of the memory vendors? | | We were forced to anonymize the DRAM vendors of our evaluation. | If you are a researcher, please get in touch with us to receive | more information. ..." | gruez wrote: | why paste the FAQ into your comment?
| belter wrote: | It's only a partial quote of the whole FAQ. I know it's usual | to get a gist for an article from the comments before getting | to the full details... | hungryforcodes wrote: | Thanks! I wasn't going to read the article -- and this | answers my questions:) | [deleted] | Rd6n6 wrote: | Is this a threat to servers only, or to any network attached | computer with DDR4? | | By coincidence, I've been bluescreening with RAM related error | codes for the last 2 days haha | belter wrote: | Threat to your phone and your routers: | | "Drive-by Rowhammer attack uses GPU to compromise an Android | phone" [2018] | | https://news.ycombinator.com/item?id=16984663 | | "Inducing Rowhammer Faults through Network Requests" | | https://arxiv.org/pdf/1805.04956.pdf | Rd6n6 wrote: | Scary, thanks | zokier wrote: | I would imagine that SME/TME (AMD/Intel memory encryption) would | mitigate Rowhammer-style attacks quite effectively because | attackers would not be able to control the physical bit patterns | anymore? | guerrilla wrote: | Nope. It only requires being able to read neighboring cells. It | would make a privilege escalation attack harder but not | impossible. DoS attacks would still be relatively easy. | LogonType10 wrote: | >DoS attacks would still be relatively easy. | | Local DoS? Can you elaborate on this? | guerrilla wrote: | Rowhammer is a memory corruption technique, so if you | corrupt the right memory of a process in the right way then | you can crash it; same for a kernel or hypervisor. | 420official wrote: | They are saying that privilege escalation is harder because | it's really challenging to target specific bits to flip, | whereas flipping random bits will eventually lead to a | crash of some kind causing the service to fail, which is | effectively a DoS. | Andys wrote: | It would be so much easier if the RAM just got moved onto the | CPU.
As chips get more dense and NAND becomes cheaper, I could | see rowhammer-susceptible DRAM just going away completely for | many forms of computing. | buryat wrote: | it's a ploy to sell more DRAM | Syonyk wrote: | One of the more absurd plots... | | "It's broken! Buy more of the broken stuff!" | egberts1 wrote: | It's a plot to resurge demands for DDR3s. | cout wrote: | Redundant array of inexpensive RAM? | zokier wrote: | You might be joking, but memory mirroring is a thing on | higher-end servers | Animats wrote: | OK, for starters, ECC has to become standard. | | Then the rate of ECC errors has to be monitored. If something is | trying a rowhammer attack, it's going to cause unwanted bit flips | which the ECC will correct. Normally, the ECC error rate is very | low. Under attack, it should go up. So an attack should be | noticeable. You might get some false alarms, but that just means | it's time to replace memory. | arcticbull wrote: | Luckily all DDR5 DIMMs will have on-chip ECC. My understanding | is it's not a complete mitigation but does make exploitation | harder. | snak wrote: | Yes, the article mentions it: | | > What if I have ECC-capable DIMMs? | | > Previous work showed that due to the large number of bit | flips in current DDR4 devices, ECC cannot provide complete | protection against Rowhammer but makes exploitation harder. | hinkley wrote: | It sounds to me like ECC isn't being included in the DDR5 | spec due to magnanimity so much as because it doesn't | function without it. That ECC has become 'load-bearing'. | | Does that mean we need an extended ECC to deal with | critical systems that require additional robustness? | Legion wrote: | Who error checks the error checkers? | RedShift1 wrote: | It's just a matter of time before someone finds a way to | exploit the ECC part, calls it Hammerrow and brings us | back to square one... | aaaaaaaaaaab wrote: | Rowhamming would be a better pun, as DDR5 uses a Hamming | code for error correction. 
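To make the Hamming-code remarks in this subthread concrete, here is a minimal sketch of a toy SECDED code (extended Hamming, 4 data bits in an 8-bit word). It is an illustration only: real ECC DIMMs and DDR5 on-die ECC use much wider codewords, and the function names `encode`, `decode`, and `flip` are made up for this example. The point it shows is the one the FAQ makes: one flip per word is corrected, two are detected, but three can be silently "corrected" into wrong data.

```python
def encode(data):
    """Encode 4 data bits into an 8-bit extended Hamming codeword.

    Layout: index 0 is the overall parity bit, indexes 1, 2, 4 are
    Hamming parity bits, indexes 3, 5, 6, 7 carry the data.
    """
    c = [0] * 8
    c[3], c[5], c[6], c[7] = data
    c[1] = c[3] ^ c[5] ^ c[7]
    c[2] = c[3] ^ c[6] ^ c[7]
    c[4] = c[5] ^ c[6] ^ c[7]
    c[0] = c[1] ^ c[2] ^ c[3] ^ c[4] ^ c[5] ^ c[6] ^ c[7]
    return c


def decode(word):
    """Return (data_bits, status) for a possibly corrupted codeword."""
    c = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if c[pos]:
            syndrome ^= pos  # XOR of the positions of all set bits
    overall = 0
    for bit in c:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: the decoder assumes exactly one and fixes it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        c[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: at least two flips.
        status = "uncorrectable"
    return [c[3], c[5], c[6], c[7]], status


def flip(word, positions):
    """Return a copy of word with the given bit positions inverted."""
    out = list(word)
    for p in positions:
        out[p] ^= 1
    return out


original = [1, 0, 1, 1]
codeword = encode(original)

one = decode(flip(codeword, [6]))          # single flip
two = decode(flip(codeword, [5, 6]))       # double flip
three = decode(flip(codeword, [3, 5, 6]))  # triple flip

print(one)    # data restored, status "corrected"
print(two)    # status "uncorrectable" (detected, not fixed)
print(three)  # status "corrected", but the data is wrong
```

The triple-flip case is why a high flip density per word matters: the decoder has no way to tell three flips from one, so it reports success while handing back corrupted data.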
| eqvinox wrote: | > My understanding is it's not a complete mitigation but does | make exploitation harder. | | It won't. It's designed to counter silicon limitations to | increased density, i.e. it's made to _correct the errors that | result from packing cells beyond the limit of error-free | operation_. | | The extra redundancy from on-chip ECC is intended to be | "consumed" by the chip itself, and since this will allow | optimizing chip manufacture to be denser and cheaper, it's no | question at all that it will get pushed to the very limit. | | There's still "classic" ECC for DDR5. 8 bits mapped to 9, | terminated at the CPU which can look at things. That's what I | want, need, and will buy. | | P.S.: Shame on Intel for still walling off desktop CPUs from | ECC. https://ark.intel.com/content/www/us/en/ark/search/featurefi... | Dylan16807 wrote: | > It won't. It's designed to counter silicon limitations to | increased density, i.e. it's made to correct the errors | that result from packing cells beyond the limit of error- | free operation. | | I'd love to see actual parameters for the error correction | codes, but DDR5 could pretty easily be a lot more robust | than DDR4. | | When you have no error correction at all, you need | ridiculously high reliability. Even if these new memory | cells have a much higher error rate, if they're | designed to seamlessly handle a few bits in the same row | flipping then the overall reliability could skyrocket. | | Edit: Oh, there's a paper from Micron talking about DDR5 | only having single bit correction internally. That's not as | useful as it could be against attacks... | | > There's still "classic" ECC for DDR5. 8 bits mapped to 9 | | But Single Error Correction, Dual Error Detection is not | enough to prevent attacks. | | Also because DDR5 uses a smaller width you actually need to | map 8 bits to 10. | arcticbull wrote: | I think on-chip ECC would mitigate this problem just as | well as off-chip ECC.
Off-chip ECC is meant to catch errors | during transmission (i.e. 72 bits transmitted for 64 bit | words), not necessarily just the ones that occur internal | to the package. | | I agree it's meant to counter limitations due to increased | density, but it should catch this to an extent also as this | error is induced on-package right, not during transmission. | Or am I mistaken? | dogma1138 wrote: | That's not exactly correct: whilst it is there mainly to | allow for higher densities and frequencies, it's designed to | prevent bit flips from happening on chip. | | It's not end to end ECC as in it doesn't prevent flips that | happen on the bus or in CPU cache but it does prevent | single bit errors on DRAM. | Karliss wrote: | While they were able to get bitflips with all the modules, the | difference between 100000 and 15 bitflips during 12h seems | significant to me. Whatever mitigation manufacturer B has seems to | work a lot better than others. That's a potential reason for | choosing to buy that instead of others. If that were to be | improved further and decreased probability 10000 times more it | might reach the point where it's comparable to random bitflips | from cosmic radiation. | undoware wrote: | Former startup CSO here. There is not enough coffee in the world | to make this day not terrible for anyone whose job involves | worrying about security. | guerrilla wrote: | Wow, it's so cool to see this post. I was just watching the | digital design class from ETH Zurich and they went through | Rowhammer (repeated three times) and I was wondering if the | industry had solved anything and I suspected not based on their | explanations. Now I wish I was actually sitting in class even | more. | | I highly recommend that series by the way.[1] It's not just an | architecture class because it's actually pretty up to date and | highly focused on security. | | 1. | https://www.youtube.com/watch?v=AJBmIaUneB0&list=PL5Q2soXY2Z...
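Karliss's comparison above (100000 versus 15 bit flips in a 12 hour run, then a hypothetical further 10000x improvement) is easy to work through. The sketch below only redoes that arithmetic with the figures quoted in the comment; the variable names are made up, and the resulting rates apply to a fuzzer hammering the module continuously, not to normal workloads.

```python
RUN_HOURS = 12            # length of the test run quoted in the comment
WORST_FLIPS = 100_000     # flips on the most affected modules
BEST_FLIPS = 15           # flips on the least affected modules
FURTHER_REDUCTION = 10_000  # hypothetical extra improvement from the comment

worst_per_hour = WORST_FLIPS / RUN_HOURS
best_per_hour = BEST_FLIPS / RUN_HOURS
spread = WORST_FLIPS / BEST_FLIPS

# If the best case were improved by another factor of 10000:
improved_per_hour = best_per_hour / FURTHER_REDUCTION
hours_per_flip = 1 / improved_per_hour
days_per_flip = hours_per_flip / 24

print(f"worst: {worst_per_hour:.0f} flips/h, best: {best_per_hour:.2f} flips/h")
print(f"spread between modules: about {spread:.0f}x")
print(f"after a further 10000x: one flip every {hours_per_flip:.0f} h "
      f"(about {days_per_flip:.0f} days of continuous hammering)")
```

So the best module in the quoted data already sees roughly one flip per hour of hammering, and the hypothetical improved part would need most of a year of continuous hammering per flip.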
| mettamage wrote: | It was an honor to have Kaveh as a teacher when he was still | working for VUSec :) | ben_w wrote: | This is probably a naive question, but: could this sort of | attack be prevented by having the physical values be trivially | encrypted version of the logical values? I'm thinking something | as simple as: $value XOR f(memory address, random number etched onto chip) | jsmith45 wrote: | That could certainly make it more difficult to exploit, sure, | but keep in mind that being able to force a specific value | change is not a hard requirement for this to be a security bug. | | Even then, memory vendors tend to want to compete on frequency | and access timings, which means doing any additional work not | strictly required by the JEDEC standard will make their product | appear worse than competing products, so I doubt they will want | to do that. | | Plus a similar technique could actually be done by the CPU's | memory controller to similar effect, and historically DRAM | design has favored pushing things to the controller when | possible. | pxx wrote: | Flipping a bit flips the output of xor. | matja wrote: | AMD EPYC processors already support AES encryption of memory | (https://developer.amd.com/sev/) where VMs themselves cannot | know the key. Interesting that I didn't see that mentioned in | the paper as a possible mitigation. | guerrilla wrote: | Why do you believe this would help? Have you seen the | attack from the BlackHat talk?[1] It doesn't require being able | to read any plain text and it doesn't matter how the data | is stored, only that it's near. You don't even have to have | any access to the target memory or even know where it is, | only the ability to cause something else to read it | predictably. | | 1. https://www.blackhat.com/docs/us-15/materials/us-15-Seaborn-... ___________________________________________________________________ (page generated 2021-11-15 23:00 UTC)
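On ben_w's XOR idea and pxx's one-line reply above: a minimal sketch of why a fixed XOR mask does not stop a flipped cell from reaching the logical value. Everything here is assumed for illustration, including the `mask_for` function standing in for f(memory address, chip secret), the 8-bit word size, and the constants; no real DRAM is claimed to work this way.

```python
import hashlib


def mask_for(address: int, chip_secret: int) -> int:
    """Hypothetical f(address, secret): derive an 8-bit mask for one byte."""
    digest = hashlib.sha256(f"{address}:{chip_secret}".encode()).digest()
    return digest[0]


def scramble(value: int, address: int, chip_secret: int) -> int:
    """XOR the byte with its mask. Applying it twice restores the input."""
    return value ^ mask_for(address, chip_secret)


ADDRESS = 0x1000      # made-up address
SECRET = 0xC0FFEE     # made-up per-chip secret

logical = 0b1010_0101
stored = scramble(logical, ADDRESS, SECRET)        # what the cells hold
disturbed = stored ^ (1 << 3)                      # a disturbance flips cell 3
read_back = scramble(disturbed, ADDRESS, SECRET)   # what the CPU reads

changed_bits = read_back ^ logical
print(f"logical  : {logical:08b}")
print(f"read back: {read_back:08b}")
print(f"changed  : {changed_bits:08b}")  # exactly bit 3, same as the cell flip
```

The mask hides which physical value (0 or 1) a given cell holds, so an attacker can no longer choose data patterns or predict which logical values sit in flippable cells, but any flip that does happen shows up in the same bit position of the logical value. That matches jsmith45's point that forcing a specific value change is not required for this to be a security bug.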