[HN Gopher] CacheOut: Leaking Data on Intel CPUs via Cache Evict...
___________________________________________________________________
CacheOut: Leaking Data on Intel CPUs via Cache Evictions
Author : beefhash
Score  : 236 points
Date   : 2020-01-27 19:20 UTC (3 hours ago)
(HTM) web link (cacheoutattack.com)
(TXT) w3m dump (cacheoutattack.com)
| [deleted]
| loeg wrote:
| Tl;dr: Another Intel TSX async abort side channel.
|
| See also Intel's advisory: https://software.intel.com/security-software-guidance/softwa... (cite 23 in the CacheOut PDF).
| baybal2 wrote:
| Again, I reiterate: it's not humanly possible to make a performant, general-purpose CPU that is 100% safe from instruction-level attacks.
|
| Untrusted code execution must go; that's the only way.
|
| The genie is out of the bottle now. There will be more and more practical instruction-level attacks with each passing year.
| Dylan16807 wrote:
| You can avoid instruction-level attacks by putting untrusted code on an isolated core with an isolated memory module, right? That seems more likely to me than everyone disabling javascript.
| mooman219 wrote:
| Maybe with the recent influx of larger core count systems, this isn't entirely unreasonable in the future.
| baybal2 wrote:
| > You can avoid instruction-level attacks by putting untrusted code on an isolated core with an isolated memory module, right?
|
| No, unfortunately. Side channels will work even across NUMA domains and cores with their own memory.
|
| Side-channel-free hardware is extremely hard to do even in the simplest ISAs specially made for that purpose. Look at how the credit card industry keeps struggling to make safe smartcards.
| NullPrefix wrote:
| Then it should be written on the product packaging. Big letters.
| heavyset_go wrote:
| > _Untrusted code execution must go, that's the only way._
|
| This is how you get corporate rule over what can and can't run on machines you own.
|
| _Safer_ architectures can be developed, without handing over control to a third party.
| baybal2 wrote:
| I don't believe that even a complete comparch-101 in-order, non-pipelined architecture without branch prediction and register renaming can be made safe enough.
|
| But yes, the industry made an extremely risky bet a decade ago with both virtualisation and running unsafe Javascript with JIT.
|
| It will take many billions for the industry to do a U-turn, switch back to dedicated servers as the gold standard, and put a leash on the ambitions of browser makers.
| cpudestw2020 wrote:
| > I don't believe that even a complete comparch-101 in-order, non-pipelined architecture without branch prediction and register renaming can be made safe enough.
|
| Disagree. You do have to put a lot of work into the process: a formal spec as well as formal methods/simulation. There are certainly a lot of fun things to consider in that, but I don't think it's completely unfeasible.
|
| As an aside, one of the things that intrigues me about RISC-V is things such as a formal instruction set spec being openly published [1]. You could apply all sorts of tooling to that before actually creating silicon.
|
| > But yes, the industry made an extremely risky bet a decade ago with both virtualisation and running unsafe Javascript with JIT.
|
| Focusing so much on JITing JS was as bad an idea then as it is now. The entire world of the web is artificially propped up and will probably die the way it should have died when it started: with people frustrated by constant security vulnerabilities enabled by a group of ad companies who don't care about anything but money.
|
| > It will take many billions for the industry to do a U-turn, switch back to dedicated servers as the gold standard, and put a leash on the ambitions of browser makers.
|
| WRT dedicated servers, if anything the industry needs that push now anyway.
| I've yet to see a 'real' cost projection on a SaaS 'rewrite'; i.e. if the current thing has been in use 5 years longer than it should have, your cost models should go out 5 years longer than you intend your new solution to exist.
|
| Which would be ironic; my pain is on the beancounting side (usually what an org really cares about), yet security will be the more likely scenario.
|
| I think the cat is out of the bag for the browser market though. Users have on the whole gotten 're-enclosed' thanks to modern laptops tending towards small eMMC or SSD sizes.
|
| But FFS, this sort of thing is the reason WASM scares the daylights out of me.
|
| [1] https://github.com/mrLSD/riscv-fs
| cpudestw2020 wrote:
| Can't edit, but obviously the web won't die. It is going to change a bit over the next few years still; the privacy changes are likely to be a big catalyst for a lot of change...
| swebs wrote:
| >Untrusted code execution must go, that's the only way.
|
| Ironically, you have to enable third-party javascript to even view the page.
| legulere wrote:
| I don't think so. Side channel attacks have recently become more mainstream, and people are starting to design stuff with them in mind.
|
| What you see here is yet another attack on a relatively unused Intel-only x86 extension.
|
| The bigger problem is that Unixes and Windows are pretty bad at sandboxing syscalls by default.
| baybal2 wrote:
| I do remember there 100% were academic papers throwing barbs into Pentium 3 branch predictor security back in the nineties. They just went unnoticed for there being no Google logo on the paper.
|
| People in the hardware engineering community knew of their existence long ago. It was just them not being practically exploitable that kept them out of the headlines.
|
| But now we have a multibillion-buck virtualised hosting industry, and JS with JIT in every browser -- a million-buck incentive for black hats to poke into architectural vulnerabilities.
| dmix wrote:
| As someone unfamiliar with creating scientific papers, I'm curious if anyone knows what format the citation modal is using?
|
| https://imgur.com/a/bem4e8u
|
| Is it for TeX, or a common syntax for some publishing platforms to pick up?
| detaro wrote:
| Bibtex. Originally for the bibtex tool (which generates bibliographies for (La)TeX documents), but now common with other citation tools too.
| wwwhizz wrote:
| That's bibtex, for LaTeX.
| emn13 wrote:
| I mean, you gotta appreciate their efforts to make something so technical approachable by the general population, including gems like this in their Q&A :-D:
|
| _What is an operating system?_
|
| _An operating system (OS) is system software responsible for managing your computer hardware by abstracting it through a common interface, which can be used by the software running on top of it. Furthermore, the operating system decides how this hardware is shared by your software, and as such has access to all the data stored in your computer's memory. Thus, it is essential to isolate the operating system from the other programs running on the machine._
| cptskippy wrote:
| > An operating system (OS) is system software responsible for managing your computer hardware by abstracting it through a common interface
|
| So uh... that's a reference to the IME right? Not the user's installed OS.
| chmod775 wrote:
| No, they're speaking of the (user-installed) OS.
| scandinavian wrote:
| >CacheOut violates the operating system's privacy by extracting information from it that facilitates other attacks, such as buffer overflow attacks.
|
| >More specifically, modern operating systems employ Kernel Address Space Layout Randomization (KASLR) and stack canaries.
| KASLR randomizes the location of the data structures and code used by the operating system, such that the location is unknown to an attacker. Stack canaries put secret values on the stack to detect whether an attacker has tampered with the stack. CacheOut extracts this information from the operating system, essentially enabling full exploitation via other software attacks, such as buffer overflow attacks.
|
| Can anyone explain this scenario? Is this really a realistic scenario? Do they mean if you have code execution on a system and want to escalate privileges, you would find another network/socket service that is running on the same system, find an exploit in this service, and then leak the stack canary to allow corrupting the stack? There are often easier ways to defeat the stack canary.
| loeg wrote:
| Yes, it's a local privilege escalation issue, i.e. you must have local code execution first. Leaking KASLR/the stack canary just means you get 90s-level triviality of attacking any stack-overflowing API you can find. Without this bug, if you found a stack overflow you could trigger from your local unprivileged code, the target would likely detect the overflow due to the canary. If you defeated that mechanism, constructing shellcode might be more difficult due to (K)ASLR. With this bug, neither stack canaries nor (K)ASLR are effective defenses against unprivileged programs.
| scandinavian wrote:
| Sure, I was just kinda confused that this was the example they presented. I guess for highly targeted attacks, it might be somewhat useful.
|
| >Leaking KASLR/the stack canary just means you get 90s-level triviality of attacking any stack-overflowing API you can find.
|
| It does not seem trivial with this exploit, but maybe I'm just not getting it. With the low accuracy and transfer rate, it seems like a lot of stars need to line up with regards to how the service you attack functions.
| lukestateson wrote:
| Intel got cash in, but we've got cacheout.
| annoyingnoob wrote:
| Is disabling Hyper-Threading a viable workaround in this case?
| sjnu wrote:
| "CacheOut is effective even in the non-HyperThreaded scenario, where the victim always runs sequentially to the attacker."
| kohtatsu wrote:
| Did AMD design their processors with these side channel attacks in mind, or is it a matter of where the security research is focused?
| loeg wrote:
| It is specific to Intel TSX, which should already be disabled due to earlier published MDS vulnerabilities in the same space.
| jdsully wrote:
| Only the implicit TSX has been disabled. You can still use xbegin/xend/xabort.
|
| They had an additional mode that would transparently convert many spinlocks into transactions without code changes - that is now gone.
| loeg wrote:
| "Should" as in, I think the earlier vulnerability was damning enough that people /should/ have disabled TSX entirely. If they have not already, they /should/ now.
| jdsully wrote:
| TSX can provide very important speedups where there aren't adversarial workloads sharing memory. Your web browser shouldn't be using it, but I think turning it off globally doesn't make much sense. Unlike hyperthreading, use of TSX can be decided on an app-to-app basis.
|
| As core counts increase, spinlocks and other synchronization primitives simply become too expensive. We'll need transactional hardware support eventually.
| cosmiccatnap wrote:
| That's putting a lot of faith in the OS and its ability to sandbox correctly.
| loeg wrote:
| Especially when this publication shows that sandboxing correctly isn't possible with TSX.
| loeg wrote:
| It is dangerous to treat the security domain as app-to-app instead of whole-machine given the scope of the vulnerability. If your TSX workload runs on dedicated machines and/or you're ok with the reduction in kernel defenses, sure, enable it. But the default should be "off."
|
| Scaling workloads does not require transactional memory, and certainly doesn't require a vulnerable implementation of it. HTM might be the easiest way to scale a relatively naive algorithm, but the most scalable synchronization is none at all (or vanishingly infrequent) -- and that works just fine with conventional locks and atomics (both locked instructions and memory-model "atomics" such as release/acquire/seq_cst semantics).
| jdsully wrote:
| The problem TSX is attempting to address is bus traffic from those exact lock instructions you mention. These work OK for the relatively small core counts of today but won't scale to hundreds of cores.
|
| TSX brings hardware-supported optimistic locking and breaks the latency imposed by MESI and related protocols in use today. Of course it's great if you can get away with no synchronization at all - but then you might as well just use a GPU. TSX helps with those non-trivially parallelized problems that are still best performed on a CPU.
| loeg wrote:
| I explicitly mentioned memory-model atomics in addition to locked instructions in an attempt to prevent getting hung up on locked atomics. I guess that didn't work.
|
| Obviously many workloads require some coordination, but often something as trivial as allocating one of a given resource per CPU is sufficient to avoid most contention even on machines with 100s of CPU cores. Profile; improve. The same is required with HTM.
|
| Regardless of your thoughts on HTM and scaling technology, TSX is broken from a security standpoint, which is the primary subject of the fine article. HTM != TSX.
| _hl_ wrote:
| > AMD is not affected by CacheOut, as AMD does not offer any feature akin to Intel TSX on their current offering of CPUs.
|
| It seems to be down to the notoriously buggy TSX (hardware transactional memory) in Intel CPUs.
| rodgerd wrote:
| It's interesting that TSX seems to be one of those holy grails that causes more problems than it solves - trying to implement hardware transactional memory caused, I believe, huge problems for Sunacle's Rock processor team.
| eyegor wrote:
| AMD still has almost no server market share. This attack specifically leverages TSX, which is an Intel set of transactional extensions to the CPU. Intel has since published microcode updates to enable you to disable TSX.
| cosmiccatnap wrote:
| Not sure you've been keeping up, but in 2020 most major server makers have switched or at least offer an EPYC variant, because the price/performance is better even without Intel's constant security woes.
| topspin wrote:
| > AMD still has almost no server market share.
|
| Yet another Intel-exclusive side channel vulnerability might help change that. This side channel stuff is terrible for cloud operators. Every time they have to adopt another layer of mitigation, some fraction of their capacity disappears in a puff of shame and excuses.
| Symmetry wrote:
| In this case AMD just doesn't support TSX instructions. Before Meltdown, I'd guess that AMD just didn't feel they had the engineering resources to do after-the-fact validation of memory access permissions without architecturally leaking data in some weird corner case. But then along comes Meltdown and it turns out that non-architectural leaks are a thing that can happen, and if you try to do after-the-fact validation you can't prevent them.
| makomk wrote:
| Mostly, it seems to be a difference in design philosophy - AMD processors prevent speculative reads of data that shouldn't be accessible almost everywhere, whereas Intel ones allow them pretty much everywhere. This particular attack requires TSX, which AMD processors don't have, but I don't think it'd work on AMD processors anyway because they're not missing the security check that Intel ones are.
| If I remember rightly, there were other now-mitigated exploits for this line fill buffer leak that didn't require TSX.
|
| (The one exception is also interesting. AMD processors allow speculative reads past the end of x86 segments and past BOUND instructions, which of course no one uses these days. This suggests there may have been a deliberate decision to block them in the more important cases.)
| hinkley wrote:
| "Faster than possible" seems really appropriate here.
|
| They built a bunch of tech debt into their processors to boost their numbers, and now the hens are coming home to roost.
|
| What I'm wondering is how many changes this will make to their product roadmap, and to what degree it will make next-generation chips look lackluster compared to what people (think they) have now.
| temac wrote:
| It has been known since... forever? that one should not speculate across security boundaries. This was not enough to avoid Spectre, but it was largely enough to avoid TONS of security vulns Intel is also susceptible to.
|
| Somebody messed up big time. Or, from a business point of view, did they? Intel's current problems are manufacturing, and the continuously lower power of their "legacy" processors (except that, due to manufacturing problems, this "legacy" is still mostly the current one) makes it so that people are buying more. Of course there is AMD back in the game, but the market demand is large enough; plus AMD would have been there anyway, and, in the fiction where Intel did take a good part of the perf hit upfront instead of the secu vulns, just as competitive as in the current situation.
|
| The people most annoyed are the users. Intel got away with pretending these were not really defects in their product but only new SW tricks that they will help defend against, and their clients just let them say that without much complaint (well, I guess big ones got some rebate...)
but security researchers and/or processor designers know very well this is bullshit (see the vuln papers and FAQs) and that they simply fucked up big time on Meltdown, MDS, etc. I don't care that a few other vendors made some of the same mistakes: they are still mistakes and design flaws, and not even something new.
|
| Pretty much the only new shiny thing in this stream of vulns was Spectre and the few variants that appeared quite early on (but NOT Meltdown & co). The rest are design flaws that come from the "oh, not a big deal to leak that potentially privileged data, we will drop everything and trap before any derivative can get out anyway" mentality. Yeah, no, I'm sorry, but the founding paper on speculation already said not to do that :/ Either they did not do their homework, or they voluntarily chose to violate the rule.
| simias wrote:
| >It has been known since... forever? that one should not speculate across security boundaries.
|
| ... if you value security over raw performance. Clearly Intel decided at some point that it was worth playing with fire in order to get ahead in benchmarks. In their defence, it seems to have worked reasonably well for them for quite a while.
|
| >The people most annoyed are the users.
|
| I wish, but I wonder how much of that is true. Are most users even aware of these problems? They get patched automatically by OS vendors, and then most of the time they won't hear about them anymore. I think the "nobody gets fired for choosing Intel" mindset will probably still prevail for quite some time.
| qeternity wrote:
| For most of Intel's existence, people ran fairly trusted workloads side by side. It wasn't until "the cloud" that things really changed.
| jnordwick wrote:
| Before people get all nutty as usual:
|
| This is another TSX (transactional memory) issue, and you can disable TSX without much of a problem.
|
| The attacker basically needs to be running a binary on the machine (not JS in a browser or anything).
|
| The leakage is extremely slow, about 225 bytes/minute for ASCII text (a 4k page in 18 minutes). I'm not sure if that was an exact recovery either, just a probabilistic one. With noise reduction to enhance recovery, they said it took twice as long - so about 113 bytes/minute.
|
| It seems to only be able to control the bottom 12 bits of the address to recover (but I didn't fully get why), and it requires the process to be either reading or writing the data to get it into the L1 cache somehow, so data just sitting in RAM isn't good enough.
|
| The attacker still needs to figure out an address (even with ASLR there is still a lot of guessing, and if you have really sensitive data, just move it every second until you wipe it).
|
| Interesting, but kind of a non-issue.
| jandrese wrote:
| That's a lot of work, but the instant it's in a rootkit, everybody will be able to do it. At 113 bytes/minute, that's extracting an AES key from memory in about a minute.
|
| It's only really hard and messy for the first guy who implements it; after that it is much easier, albeit still fairly messy. I'm not saying we need to panic, but it's more than a "non-issue".
| scandinavian wrote:
| >but the instant it's in a rootkit, everybody will be able to do it.
|
| This makes no sense. If you have the privileges to install a rootkit, there is no need to use any speculative execution exploit.
| jandrese wrote:
| The first feature of a rootkit is to get root access. A userspace kernel-information leak utility would be very useful for this step.
| scandinavian wrote:
| Okay, we have differing definitions of what a rootkit is, it seems.
| jnordwick wrote:
| We have yet to even find Spectre in a kit; this will never see the light of day, especially since it is so easy to avoid (turn TSX off).
|
| You also don't get to extract constantly.
You only get a shot when the data is in the LFB, so the program needs to be actively reading or writing it to keep it moving back and forth between L1 and L2; at least that is the way I read the paper.
| Tuna-Fish wrote:
| > It seems to only be able to control the bottom 12 bits of the address to recover (but I didn't fully get why)
|
| 2^12 = 4096, or the x86 page size in bytes.
| cesarb wrote:
| And as for why the page size is relevant for the L1 cache: Intel (and AMD, and most other modern CPUs) use a VIPT (Virtually Indexed, Physically Tagged) L1 cache, where the virtual address (before the page table translation) is used to index the cache (this is faster since it can be done in parallel with the TLB lookup to get the physical address). To prevent the confusing situation where the same physical address has more than one index in the cache, only the bits of the address which do not change between the virtual address and the physical address can be used; these are the bits which correspond to the offset within the page.
|
| (As an aside, this also limits the size of the L1 cache, which is why it hasn't grown much despite the L2 and L3 caches growing a lot; an 8-way set-associative VIPT cache with a 4KiB page size is limited to 32KiB, absent tricks like page coloring. Perhaps this will change if 64-bit ARM servers become popular, since they can only address the largest amount of memory when using a 64KiB page size, and this would make enterprise distributions default to that page size.)
| FisDugthop wrote:
| By your own numbers, this could translate to a 30-second AES key exfiltration in the cloud. This isn't a non-issue, even if you personally aren't affected.
| jnordwick wrote:
| You have to find the address first, which is a lot of rummaging around; you aren't just handed it. That is going to take quite a bit of work, especially if the program builds any security measures in (e.g., allocating at a random address).
Good luck with that.
| Flockster wrote:
| Is this coordinated with OS providers? It only mentions that Intel released microcode updates.
___________________________________________________________________
(page generated 2020-01-27 23:00 UTC)