[HN Gopher] CacheOut: Leaking Data on Intel CPUs via Cache Evictions
       ___________________________________________________________________
        
       CacheOut: Leaking Data on Intel CPUs via Cache Evictions
        
       Author : beefhash
       Score  : 236 points
       Date   : 2020-01-27 19:20 UTC (3 hours ago)
        
 (HTM) web link (cacheoutattack.com)
 (TXT) w3m dump (cacheoutattack.com)
        
       | [deleted]
        
       | loeg wrote:
       | Tl;dr: Another Intel TSX async abort side channel.
       | 
       | See also Intel's advisory: https://software.intel.com/security-
       | software-guidance/softwa... (cite 23 in the CacheOut PDF).
        
       | baybal2 wrote:
        | Again, I reiterate: it's not humanly possible to make a
        | performant, general-purpose CPU that is 100% safe from
        | instruction-level attacks.
        | 
        | Untrusted code execution must go; that's the only way.
        | 
        | The genie is out of the bottle now. There will be more and more
        | practical instruction-level attacks each year.
        
         | Dylan16807 wrote:
         | You can avoid instruction-level attacks by putting untrusted
         | code on an isolated core with an isolated memory module, right?
         | That seems more likely to me than everyone disabling
         | javascript.
        
           | mooman219 wrote:
           | Maybe with the recent influx of larger core count systems,
           | this isn't entirely unreasonable in the future.
        
           | baybal2 wrote:
           | > You can avoid instruction-level attacks by putting
           | untrusted code on an isolated core with an isolated memory
           | module, right?
           | 
            | No, unfortunately. Side channels work even across NUMA
            | domains, and between cores with their own memory.
            | 
            | Side-channel-free hardware is extremely hard to build even
            | in the simplest ISAs designed specifically for it. Look at
            | how the credit card industry keeps struggling to make
            | smartcards safe.
        
         | NullPrefix wrote:
         | Then it should be written on the product packaging. Big
         | letters.
        
         | heavyset_go wrote:
          | > _Untrusted code execution must go; that's the only way._
         | 
         | This is how you get corporate rule over what can and can't run
         | on machines you own.
         | 
         |  _Safer_ architectures can be developed, without handing over
         | control to a third party.
        
           | baybal2 wrote:
            | I don't believe that even a complete comparch-101 in-order,
            | non-pipelined architecture without branch prediction or
            | register renaming can be made safe enough.
            | 
            | But yes, the industry made an extremely risky bet a decade
            | ago with both virtualisation and running unsafe Javascript
            | with JIT.
            | 
            | It will take many billions for the industry to do a U-turn,
            | switch back to dedicated servers as the gold standard, and
            | put a leash on the ambitions of browser makers.
        
             | cpudestw2020 wrote:
              | > I don't believe that even a complete comparch-101
              | in-order, non-pipelined architecture without branch
              | prediction or register renaming can be made safe enough.
              | 
              | Disagree. You do have to put a lot of work into the
              | process: a formal spec as well as formal methods and
              | simulation. There are certainly a lot of fun things to
              | consider in that, but I don't think it's completely
              | infeasible.
              | 
              | As an aside, one of the things that intrigues me about
              | RISC-V is that a formal instruction set spec is openly
              | published [1]. You could apply all sorts of tooling to
              | that before actually creating silicon.
             | 
              | > But yes, the industry made an extremely risky bet a
              | decade ago with both virtualisation and running unsafe
              | Javascript with JIT.
              | 
              | Focusing so much on JITting JS was as bad an idea then as
              | it is now. The entire world of the web is artificially
              | propped up and will probably die the way it should have
              | died when it started: with people frustrated by constant
              | security vulnerabilities enabled by a group of ad
              | companies who don't care about anything but money.
             | 
              | > It will take many billions for the industry to do a
              | U-turn, switch back to dedicated servers as the gold
              | standard, and put a leash on the ambitions of browser
              | makers.
              | 
              | WRT dedicated servers, if anything the industry needs
              | that push now anyway. I've yet to see a 'real' cost
              | projection for a SaaS 'rewrite'; i.e., if the current
              | thing has been in use 5 years longer than it should have,
              | your cost models should go out 5 years longer than you
              | intend your new solution to exist.
              | 
              | Which would be ironic; my pain is on the beancounting
              | side (usually what an org really cares about), yet
              | security will be the more likely scenario.
             | 
             | I think the cat is out of the bag for the browser market
             | though. Users have on the whole gotten 're-enclosed' thanks
             | to modern laptops tending towards small eMMC or SSD sizes.
             | 
             | But FFS, this sort of thing is the reason WASM scares the
             | daylights out of me.
             | 
             | [1] https://github.com/mrLSD/riscv-fs
        
               | cpudestw2020 wrote:
               | Can't edit, but obviously the web won't die. But it is
               | going to change a bit over the next few years still; the
               | privacy changes are likely to be a big catalyst for a lot
               | of change...
        
         | swebs wrote:
         | >Untrusted code execution must go, that's the only way.
         | 
         | Ironically, you have to enable third party javascript to even
         | view the page.
        
         | legulere wrote:
          | I don't think so. Side-channel attacks have recently become
          | more mainstream, and people are starting to design things
          | with them in mind.
          | 
          | What you see here is yet another attack on a relatively
          | little-used Intel-only x86 extension.
          | 
          | The bigger problem is that Unixes and Windows are pretty bad
          | at sandboxing syscalls by default.
        
           | baybal2 wrote:
            | I do remember there 100% were academic papers throwing
            | barbs at Pentium III branch predictor security back in the
            | nineties. They just went unnoticed because there was no
            | Google logo on the paper.
            | 
            | People in the hardware engineering community knew of these
            | issues long ago. It was only their impracticality to
            | exploit that kept them out of the headlines.
            | 
            | But now we have a multibillion-dollar virtualised hosting
            | industry, and JS with JIT in every browser -- a
            | million-dollar incentive for black hats to poke into
            | architectural vulnerabilities.
        
       | dmix wrote:
        | As someone unfamiliar with creating scientific papers, I'm
        | curious if anyone knows what format the citation modal is
        | using?
       | 
       | https://imgur.com/a/bem4e8u
       | 
       | Is it for TeX or a common syntax for some publishing platforms to
       | pick up?
        
         | detaro wrote:
         | Bibtex. Originally for the bibtex tool (which generates
         | bibliographies for (La)TeX documents), but now common with
         | other citation tools too.
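          | 
          | For anyone curious what that format looks like, a minimal
          | BibTeX entry has the shape below (the key, authors, and venue
          | here are placeholders, not the paper's actual citation):
          | 
          |   @inproceedings{cacheout2020,
          |     author    = {Author, First and Author, Second},
          |     title     = {CacheOut: Leaking Data on Intel CPUs
          |                  via Cache Evictions},
          |     booktitle = {Placeholder Venue},
          |     year      = {2020},
          |   }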
        
         | wwwhizz wrote:
         | That's bibtex, for LaTeX.
        
       | emn13 wrote:
        | I mean, you gotta appreciate their efforts to make something so
        | technical approachable to the general population, including
        | gems like this in their Q&A :-D:
       | 
       |  _What is an operating system?_
       | 
       |  _An operating system (OS) is system software responsible for
       | managing your computer hardware by abstracting it through a
       | common interface, which can be used by the software running on
       | top of it. Furthermore, the operating system decides how this
       | hardware is shared by your software, and as such has access to
        | all the data stored in your computer's memory. Thus, it is
       | essential to isolate the operating system from the other programs
       | running on the machine._
        
         | cptskippy wrote:
         | > An operating system (OS) is system software responsible for
         | managing your computer hardware by abstracting it through a
         | common interface
         | 
         | So uh... that's a reference to the IME right? Not the user's
         | installed OS.
        
           | chmod775 wrote:
            | No, they're speaking of the (user-installed) OS.
        
       | scandinavian wrote:
       | >CacheOut violates the operating system's privacy by extracting
       | information from it that facilitates other attacks, such as
       | buffer overflow attacks.
       | 
       | >More specifically, modern operating systems employ Kernel
       | Address Space Layout Randomization (KASLR) and stack canaries.
       | KASLR randomizes the location of the data structures and code
       | used by the operating system, such that the location is unknown
       | to an attacker. Stack canaries put secret values on the stack to
       | detect whether an attacker has tampered with the stack. CacheOut
       | extracts this information from the operating system, essentially
        | enabling full exploitation via other software attacks, such
        | as buffer overflow attacks.
       | 
        | Can anyone explain this scenario? Is it really realistic? Do
        | they mean that if you have code execution on a system and want
        | to escalate privileges, you would find another network/socket
        | service running on the same system, find an exploit in that
        | service, and then leak the stack canary to allow corrupting
        | the stack? There are often easier ways to defeat the stack
        | canary.
        
         | loeg wrote:
          | Yes, it's a local privilege escalation issue, i.e. you must
          | have local code execution first. Leaking KASLR/stack canary
          | values just means you get '90s-level triviality when
          | attacking any stack-overflowing API you can find. Without
          | this bug, if you found a stack overflow you could trigger
          | from your local unprivileged code, the target would likely
          | detect the overflow due to the canary. If you defeated that
          | mechanism, constructing shellcode might be more difficult due
          | to (K)ASLR. With this bug, neither stack canaries nor (K)ASLR
          | are effective defenses against unprivileged programs.
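          | 
          | To make the canary part concrete, here is a minimal C sketch
          | (not the compiler's actual codegen; the guard variable and
          | names are made up) of the pattern -fstack-protector emits.
          | The only thing keeping the overflow detectable is the secrecy
          | of the guard value, which is exactly what a leak like this
          | destroys:
          | 
          |   /* toy model of a compiler-inserted stack canary */
          |   #include <stdio.h>
          |   #include <stdlib.h>
          |   #include <string.h>
          | 
          |   static unsigned long stack_guard; /* secret, set at startup */
          | 
          |   static void handle_request(const char *input)
          |   {
          |       unsigned long canary = stack_guard; /* near return addr */
          |       char buf[64];
          | 
          |       strcpy(buf, input);          /* classic unbounded copy */
          | 
          |       if (canary != stack_guard)   /* tamper check on exit */
          |           abort();   /* caught -- unless the attacker already
          |                         knew stack_guard and wrote it back */
          |   }
          | 
          |   int main(void)
          |   {
          |       srand(1234);   /* toy entropy, illustration only */
          |       stack_guard = ((unsigned long)rand() << 32) | rand();
          |       handle_request("hello");
          |       puts("ok");
          |       return 0;
          |   }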
        
           | scandinavian wrote:
            | Sure, I was just kind of confused that this was the example
            | they presented. I guess for highly targeted attacks it
            | might be somewhat useful.
            | 
            | > Leaking KASLR/stack canary values just means you get
            | '90s-level triviality when attacking any stack-overflowing
            | API you can find.
            | 
            | It does not seem trivial with this exploit, but maybe I'm
            | just not getting it. With the low accuracy and transfer
            | rate, it seems like a lot of stars need to line up with
            | regards to how the service you attack functions.
        
       | lukestateson wrote:
       | Intel got cash in, but we've got cacheout.
        
       | annoyingnoob wrote:
       | Is disabling Hyper-threading a viable work-around in this case?
        
         | sjnu wrote:
         | "CacheOut is effective even in the non HyperThreaded scenario,
         | where the victim always runs sequentially to the attacker."
        
       | kohtatsu wrote:
       | Did AMD design their processors with these side channel attacks
       | in mind, or is it a matter of where the security research is
       | focused?
        
         | loeg wrote:
         | It is specific to Intel TSX, which should already be disabled
         | due to earlier published MDS vulnerabilities in the same space.
        
           | jdsully wrote:
            | Only the implicit form of TSX (HLE, Hardware Lock Elision)
            | has been disabled. You can still use explicit transactions
            | via xbegin/xend/xabort.
            | 
            | HLE was an additional mode that would transparently convert
            | many spinlocks into transactions without code changes --
            | that is now gone.
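            | 
            | For anyone who hasn't used the explicit interface: a
            | minimal sketch of RTM lock elision with the <immintrin.h>
            | intrinsics (build with -mrtm; the fallback lock and names
            | here are illustrative, not anyone's production code).
            | Re-reading the fallback lock inside the transaction is what
            | keeps it correct when another thread takes the lock for
            | real:
            | 
            |   #include <immintrin.h>
            |   #include <stdatomic.h>
            | 
            |   static atomic_int lock_held;  /* test-and-set fallback */
            |   static long counter;
            | 
            |   void increment(void)
            |   {
            |       unsigned status = _xbegin();   /* try a transaction */
            |       if (status == _XBEGIN_STARTED) {
            |           if (atomic_load_explicit(&lock_held,
            |                                    memory_order_relaxed))
            |               _xabort(0xff);  /* lock is held for real */
            |           counter++;          /* transactional update */
            |           _xend();            /* commit */
            |           return;
            |       }
            |       /* aborted, or TSX absent/disabled: take the lock */
            |       while (atomic_exchange_explicit(&lock_held, 1,
            |                                       memory_order_acquire))
            |           ;                   /* spin */
            |       counter++;
            |       atomic_store_explicit(&lock_held, 0,
            |                             memory_order_release);
            |   }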
        
             | loeg wrote:
             | "Should" as in, I think the earlier vulnerability was
             | damning enough that people /should/ have disabled TSX
             | entirely. If they have not already, they /should/ now.
        
               | jdsully wrote:
                | TSX can provide very important speedups where there
                | aren't adversarial workloads sharing memory. Your web
                | browser shouldn't be using it, but I think turning it
                | off globally doesn't make much sense. Unlike
                | hyperthreading, use of TSX can be decided on an
                | app-by-app basis.
                | 
                | As core counts increase, spinlocks and other
                | synchronization primitives simply become too expensive.
                | We'll need transactional hardware support eventually.
        
               | cosmiccatnap wrote:
                | That's putting a lot of faith in the OS and its ability
                | to sandbox correctly.
        
               | loeg wrote:
               | Especially when this publication shows that sandboxing
               | correctly isn't possible with TSX.
        
               | loeg wrote:
               | It is dangerous to treat the security domain as app-to-
               | app instead of whole-machine given the scope of
               | vulnerability. If your TSX workload runs on dedicated
               | machines and/or you're ok with the reduction in kernel
               | defenses, sure, enable it. But the default should be
               | "off."
               | 
               | Scaling workloads does not require transactional memory
               | and certainly doesn't require a vulnerable implementation
               | of it. HTM might be the easiest way to scale a relatively
               | naive algorithm, but the most scalable synchronization is
               | none at all (or vanishingly infrequent) -- and that works
               | just fine with conventional locks and atomics (both
               | locked instructions and memory model "atomics" such as
               | release/acquire/seq_cst semantics).
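                | 
                | A minimal sketch of those memory-model atomics (C11
                | <stdatomic.h>; the names are made up): a release store
                | paired with an acquire load publishes data between
                | threads without any lock or transaction:
                | 
                |   #include <stdatomic.h>
                |   #include <stdbool.h>
                | 
                |   static int payload;
                |   static atomic_bool ready;
                | 
                |   void producer(void)
                |   {
                |       payload = 42;              /* plain write */
                |       atomic_store_explicit(&ready, true,
                |           memory_order_release); /* publish */
                |   }
                | 
                |   int consumer(void)
                |   {
                |       while (!atomic_load_explicit(&ready,
                |                  memory_order_acquire))
                |           ;                      /* wait for publish */
                |       return payload;            /* sees 42 */
                |   }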
        
               | jdsully wrote:
                | The problem TSX is attempting to address is bus traffic
                | from those exact locked instructions you mention. These
                | work OK for the relatively small core counts of today
                | but won't scale to hundreds of cores.
                | 
                | TSX brings hardware-supported optimistic locking and
                | sidesteps the latency imposed by MESI and related
                | protocols in use today. Of course it's great if you can
                | get away with no synchronization at all -- but then you
                | might as well just use a GPU. TSX helps with those
                | non-trivially parallelized problems that are still best
                | performed on a CPU.
        
               | loeg wrote:
               | I explicitly mentioned memory model atomics in addition
               | to locked instructions in an attempt to prevent getting
               | hung up on locked atomics. I guess that didn't work.
               | 
               | Obviously many workloads require some coordination, but
               | often something as trivial as allocating one of a given
               | resource per CPU is sufficient to avoid most contention
               | even on 100s of CPU core machines. Profile; improve. The
               | same is required with HTM.
               | 
               | Regardless of your thoughts on HTM and scaling
               | technology, TSX is broken from a security standpoint,
               | which is the primary subject of the fine article. HTM !=
               | TSX.
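                | 
                | A sketch of the "one resource per CPU" idea (Linux
                | sched_getcpu(); the names and shard count are
                | illustrative): shard a counter so threads rarely touch
                | the same cache line, and only sum the shards when a
                | total is needed:
                | 
                |   #define _GNU_SOURCE
                |   #include <sched.h>       /* sched_getcpu() */
                |   #include <stdatomic.h>
                | 
                |   #define MAX_CPUS 256
                | 
                |   struct shard {
                |       _Alignas(64) atomic_long count; /* own line */
                |   };
                |   static struct shard shards[MAX_CPUS];
                | 
                |   void add(long n)
                |   {
                |       int cpu = sched_getcpu();  /* best-effort hint */
                |       if (cpu < 0 || cpu >= MAX_CPUS)
                |           cpu = 0;
                |       atomic_fetch_add_explicit(&shards[cpu].count,
                |           n, memory_order_relaxed);
                |   }
                | 
                |   long total(void)
                |   {
                |       long sum = 0;
                |       for (int i = 0; i < MAX_CPUS; i++)
                |           sum += atomic_load_explicit(
                |               &shards[i].count, memory_order_relaxed);
                |       return sum;
                |   }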
        
         | _hl_ wrote:
         | > AMD is not affected by CacheOut, as AMD does not offer any
         | feature akin to Intel TSX on their current offering of CPUs.
         | 
         | It seems to be down to the notoriously buggy TSX (hardware
         | transactional memory) in Intel CPUs.
        
           | rodgerd wrote:
            | It's interesting that hardware transactional memory seems
            | to be one of those holy grails that causes more problems
            | than it solves -- trying to implement it caused, I believe,
            | huge problems for Sun/Oracle's Rock processor team.
        
         | eyegor wrote:
          | AMD still has almost no server market share. This attack
          | specifically leverages TSX, which is Intel's set of
          | transactional memory extensions to the x86 ISA. Intel has
          | since published microcode updates that let you disable TSX.
        
           | cosmiccatnap wrote:
            | Not sure you've been keeping up, but in 2020 most major
            | server makers have switched or at least offer an EPYC
            | variant, because the price/performance is better even
            | without Intel's constant security woes.
        
           | topspin wrote:
           | > Amd still has almost no server marketshare.
           | 
           | Yet another Intel exclusive side channel vulnerability might
           | help change that. This side channel stuff is terrible for
           | cloud operators. Every time they have to adopt another layer
           | of mitigation some fraction of their capacity disappears in a
           | puff of shame and excuses.
        
         | Symmetry wrote:
          | In this case AMD just doesn't support TSX instructions. For
          | previous vulnerabilities like Meltdown, I'd guess that AMD
          | just didn't feel they had the engineering resources to do
          | after-the-fact validation of memory access permissions
          | without architecturally leaking data in some weird corner
          | case. But then along comes Meltdown, and it turns out that
          | non-architectural leaks are a thing that can happen, and if
          | you try to do after-the-fact validation you can't prevent
          | them.
        
         | makomk wrote:
          | Mostly, it seems to be a difference in design philosophy --
          | AMD processors prevent speculative reads of data that
          | shouldn't be accessible almost everywhere, whereas Intel ones
          | allow them pretty much everywhere. This particular attack
          | requires TSX, which AMD processors don't have, but I don't
          | think it'd work on AMD processors anyway because they're not
          | missing the security check that Intel ones are. If I remember
          | rightly, there were other now-mitigated exploits for this
          | line fill buffer leak that didn't require TSX.
         | 
         | (The one exception is also interesting. AMD processors allow
         | speculative reads past the end of x86 segments and past BOUND
         | instructions, which of course no-one uses these days. This
         | suggests there may have been a deliberate decision to block
         | them in the more important cases.)
        
           | hinkley wrote:
           | "Faster than possible" seems really appropriate here.
           | 
           | They built a bunch of tech debt into their processors to
           | boost their numbers, and now they hens are coming home to
           | roost.
           | 
           | What I'm wondering is how many changes this will make to
           | their product roadmap, and to what degree it will make next
           | generation chips look lackluster compared to what people
           | (think they) have now.
        
           | temac wrote:
            | It has been known since... forever? that one should not
            | speculate across security boundaries. That alone was not
            | enough to avoid Spectre, but it was largely enough to avoid
            | TONS of the security vulns Intel is also susceptible to.
            | 
            | Somebody messed up big time. Or, from a business point of
            | view, did they? Intel's current problems are manufacturing
            | and the steadily weaker performance of their "legacy"
            | processors (and thanks to the manufacturing problems, that
            | "legacy" is still mostly the current lineup), yet people
            | keep buying more of them. Of course AMD is back in the
            | game, but market demand is large enough; plus AMD would
            | have been there anyway, and in the hypothetical where Intel
            | had taken a good part of the perf hit upfront instead of
            | shipping the security vulns, they would have been about as
            | competitive as they are now.
            | 
            | The people most annoyed are the users. Intel got away with
            | pretending these were not really defects in their products
            | but just new SW tricks that they will help defend against,
            | and their clients let them say that without much complaint
            | (well, I guess the big ones got some rebate...), but
            | security researchers and processor designers know very well
            | this is bullshit (see the vuln papers and FAQs) and that
            | they simply fucked up big time on Meltdown, MDS, etc. I
            | don't care that a few other vendors made some of the same
            | mistakes: they are still mistakes and design flaws, and not
            | even something new.
            | 
            | Pretty much the only genuinely new thing in this stream of
            | vulns was Spectre and the few variants that appeared quite
            | early on (but NOT Meltdown & co). The rest are design flaws
            | that come from the "oh, it's not a big deal to leak that
            | potentially privileged data, we will drop everything and
            | trap before any derivative can get out anyway" mentality.
            | Yeah, no, I'm sorry, but the founding papers on speculation
            | already said not to do that :/ Either they did not do their
            | homework, or they deliberately chose to violate the rule.
        
             | simias wrote:
              | > It has been known since... forever? that one should not
              | speculate across security boundaries.
              | 
              | ... if you value security over raw performance. Clearly
              | Intel decided at some point that it was worth playing
              | with fire in order to get ahead in benchmarks. In their
              | defence it seems to have worked reasonably well for them
              | for quite a while.
              | 
              | > The people most annoyed are the users.
              | 
              | I wish, but I wonder how much of that is true. Are most
              | users even aware of these problems? They get patched
              | automatically by OS vendors, and then most of the time
              | users won't hear about them anymore. I think "nobody gets
              | fired for choosing Intel" will probably still prevail for
              | quite some time.
        
               | qeternity wrote:
               | For most of Intel's existence people ran fairly trusted
               | workloads side-by-side. It wasn't until "the cloud" that
               | things really changed.
        
       | jnordwick wrote:
        | Before people get all nutty as usual:
        | 
        | This is another TSX (transactional memory) issue, and you can
        | disable TSX without much of a problem.
        | 
        | The attacker basically needs to be running a binary on the
        | machine (not JS in a browser or anything).
        | 
        | The leakage is extremely slow, about 225 bytes/minute for ASCII
        | text (a 4k page in 18 minutes). I'm not sure if that was an
        | exact recovery either, just a probabilistic one. With noise
        | reduction to enhance recovery, they said it took twice as long
        | -- so about 113 bytes/minute.
        | 
        | It seems the attacker can only control the bottom 12 bits of
        | the address to recover (I didn't fully get why), and the
        | victim process needs to be actively reading or writing the
        | data so it ends up in the L1 cache; data just sitting in RAM
        | isn't good enough.
        | 
        | The attacker still needs to figure out an address (with ASLR
        | there is still a lot of guessing, and if you have really
        | sensitive data, just move it every second until you wipe it).
        | 
        | Interesting, but kind of a non-issue.
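        | 
        | On the "you can disable TSX" point, a small sketch of how to
        | check where a Linux box stands (assuming gcc's <cpuid.h> and a
        | kernel new enough to expose the tsx_async_abort sysfs entry;
        | disabling is typically done with the tsx=off boot parameter or
        | the microcode's TSX control MSR):
        | 
        |   #include <stdio.h>
        |   #include <cpuid.h>
        | 
        |   int main(void)
        |   {
        |       unsigned a, b, c, d;
        |       /* CPUID.(EAX=7,ECX=0):EBX bit 11 is the RTM flag */
        |       if (__get_cpuid_count(7, 0, &a, &b, &c, &d))
        |           printf("RTM (TSX) advertised: %s\n",
        |                  (b & (1u << 11)) ? "yes" : "no");
        | 
        |       FILE *f = fopen("/sys/devices/system/cpu/"
        |                       "vulnerabilities/tsx_async_abort", "r");
        |       if (f) {
        |           char line[256];
        |           if (fgets(line, sizeof line, f))
        |               printf("Kernel TAA status: %s", line);
        |           fclose(f);
        |       }
        |       return 0;
        |   }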
        
         | jandrese wrote:
          | That's a lot of work, but the instant it's in a rootkit
          | everybody will be able to do it. At 113 bytes/minute, that's
          | extracting an AES key from memory in well under a minute.
          | 
          | It's only really hard and messy for the first guy who
          | implements it; after that it is much easier, albeit still
          | fairly messy. I'm not saying we need to panic, but it's more
          | than a "non-issue".
        
           | scandinavian wrote:
           | >but the instant it's in a rootkit everybody will be able to
           | do it.
           | 
           | This makes no sense. If you have the privileges to install a
           | rootkit, there is no need to use any speculative execution
           | exploit.
        
             | jandrese wrote:
             | The first feature of a rootkit is to get root access. A
             | userspace kernel-information leak utility would be very
             | useful for this step.
        
               | scandinavian wrote:
               | Okay, we have differing definitions of what a rootkit is
               | it seems.
        
           | jnordwick wrote:
            | We have yet to even find Spectre in a kit; this will never
            | see the light of day, especially since it is so easy to
            | avoid (turn TSX off).
            | 
            | You also don't get to extract constantly. You only get a
            | shot while the data is in the line fill buffer, so the
            | program needs to be actively reading or writing it to keep
            | it moving back and forth between L1 and L2 -- at least
            | that's the way I read the paper.
        
         | Tuna-Fish wrote:
          | > It seems the attacker can only control the bottom 12 bits
          | of the address to recover (I didn't fully get why)
         | 
         | 2^12 = 4096, or the x86 page size in bytes.
        
           | cesarb wrote:
           | And as for why the page size is relevant for the L1 cache:
           | Intel (and AMD, and most other modern CPUs) use a VIPT
           | (Virtual Indexed Physical Tagged) L1 cache, where the virtual
           | address (before the page table translation) is used to index
           | the cache (this is faster since it can be done in parallel
           | with the TLB lookup to get the physical address). To prevent
           | the confusing situation where the same physical address has
           | more than one index in the cache, only the bits of the
           | address which do not change between the virtual address and
           | the physical address can be used; these are the bits which
           | correspond to the offset within the page.
           | 
           | (As an aside, this also limits the size of the L1 cache,
           | which is why it hasn't grown much despite the L2 and L3 cache
           | growing a lot; an 8-way set-associative VIPT cache with a
           | 4KiB page size is limited to 32KiB, absent tricks like page
           | coloring. Perhaps this will change if 64-bit ARM servers
           | become popular, since they can only address the largest
           | amount of memory when using a 64KiB page size, and this would
           | make enterprise distributions default to that page size.)
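            | 
            | For the curious, the arithmetic behind that 32 KiB limit as
            | a tiny C calculation (assumed sizes: 4 KiB pages, 64-byte
            | lines, 8 ways):
            | 
            |   #include <stdio.h>
            | 
            |   int main(void)
            |   {
            |       const unsigned page_size = 4096; /* page offset span */
            |       const unsigned line_size = 64;
            |       const unsigned ways      = 8;
            | 
            |       /* index+offset bits must stay within the page offset,
            |          so max size = page_size * ways */
            |       unsigned max_size = page_size * ways;          /* 32 KiB */
            |       unsigned sets = max_size / (ways * line_size); /* 64 */
            | 
            |       printf("max VIPT L1: %u KiB, %u sets\n",
            |              max_size / 1024, sets);
            |       return 0;
            |   }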
        
         | FisDugthop wrote:
         | By your own numbers, this could translate to a 30-second AES
         | key exfiltration in the cloud. This isn't a non-issue, even if
         | you personally aren't affected.
        
           | jnordwick wrote:
            | You have to find the address first, which is a lot of
            | rummaging around; you aren't just handed it. That is going
            | to take quite a bit of time, especially if the program
            | builds in any security measures (e.g., allocating at a
            | random address). Good luck with that.
        
       | Flockster wrote:
        | Was this coordinated with OS providers? The page only mentions
        | that Intel released microcode.
        
       ___________________________________________________________________
       (page generated 2020-01-27 23:00 UTC)