[HN Gopher] PCILeech
___________________________________________________________________
PCILeech

Author : peter_d_sherman
Score : 186 points
Date : 2020-02-25 14:03 UTC (8 hours ago)

(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| blibble wrote:
| I've had intel_iommu=on in my boot cmdline on my personal
| machines (ones I do my banking on!) for 5 years without issue
|
| (board+CPU for all machines both support it, dmesg confirms it is
| indeed on)

| tezza wrote:
| So an Action Replay for x64
|
| https://gamehacking.org/wiki/Action_Replay_(Amiga)

| teddyh wrote:
| Also for PC:
|
| https://www.youtube.com/watch?v=usaioMbE8EQ

| MaupitiBlue wrote:
| Doesn't this change everything wrt cheating?

| inetknght wrote:
| I would imagine it's no more "change everything" than having
| root access to your own hardware.

| tuvan wrote:
| These devices are apparently already used for cheating. Here
| are some blog posts from Riot Games and ESEA mentioning direct
| memory access devices as cheating vectors and preventative
| methods if you are interested.
| https://blog.esea.net/esea-hardware-cheats/
| https://euw.leagueoflegends.com/en-gb/news/dev/dev-null-anti...

| anonsivalley652 wrote:
| Roughly 10 years ago, I knew a guy who sold various
| exfil/border-extraction products to state actors including
| FireWire dongles similar to Inception (they may even be
| effectively the same thing, IDK). He specialized in memory
| dumping attacks on several platforms including Windows, FreeBSD,
| Linux and macOS.
|
| The core problem is that buses should be authenticated,
| authorized, encrypted, and selective-/least-privileged channels.
| Exposing a memory and expansion bus to the outside world in the
| name of "convenience" is insane. A trusted set of components in
| the OS and in hardware should:
|
| 0. Be able to do a HwIDS checksumming of all firmwares to detect
| tampering.
|
| 1. Limit devices' ability to connect unless they are authorized
| by the user, much like a "hardware firewall" UI, vaguely similar
| to, say, VMware Workstation/Fusion's dialog when plugging in a
| new USB device mixed with something like Little Snitch's dialog
| for a process wanting to connect to a particular port.
|
| 2. Authenticate devices with public/private key certs that are
| burned in, a function where the device can answer challenge
| requests, and Signal protocol-like construction properly modified
| for PKI. Then, and only then, can a host talk securely to a
| device over an encrypted channel.

| hoistbypetard wrote:
| We never sold anything for it, but for some demos back around
| 2004/2005 we flashed some FireWire iPods with custom firmware
| that performed the attack described here:
|
| https://web.archive.org/web/20071011191205/http://md.hudora....
|
| (To be clear, those slides aren't mine, but I can no longer
| find the firmware we based ours off of, and never did get
| permission to post source to our mods, which were never
| distributed.)
|
| I know a couple of our customers took entirely the wrong lesson
| from our demonstrations and banned mp3 players from their
| office buildings entirely afterwards :)

| heeen2 wrote:
| Weren't these used in a famous Counter-Strike cheating scandal
| in Norway?

| VectorLock wrote:
| I was just thinking "would this be an effective way to create
| undetectable cheats for video games?"

| landr0id wrote:
| Yep, they're also pretty popular with people who cheat in
| ESEA/FaceIT third-party ladders. One of the guys, ra1f,
| eventually outed himself [1]. The software-based anticheat is
| fairly decent, pushing people to hardware-based cheats built
| off of PCILeech to avoid detection.
|
| [1]: https://twitter.com/rra1f/status/1067518342595006466

| _aleph2c_ wrote:
| An interview with Ulf Frisk:
|
| https://www.youtube.com/watch?v=MIfY8g73xms&feature=emb_rel_...

| DrRobinson wrote:
| Another interview with him:
| https://www.youtube.com/watch?v=W5Yb3q9iJao

| hoistbypetard wrote:
| Does DMA over PCIe work using USB gadget mode with a Linux
| device? i.e. could a Pi be used easily and inexpensively to build
| an acquisition device for this?
|
| Edit: Bleh. Nevermind. I saw this photo:
|
| https://gist.githubusercontent.com/ufrisk/c5ba7b360335a13bba...
|
| with a PCIe adapter connected over what looked like USB3 and
| forgot that it's Thunderbolt on the MacBook. I was not quite to
| the middle of my first cup of coffee when I asked that.

| q3k wrote:
| Well USB is not PCIe and doesn't do DMA, so no.

| hoistbypetard wrote:
| Aren't most of the devices on the linked page doing PCIe over
| USB3?
|
| Edit: Bleh. Nevermind. I saw this photo:
|
| https://gist.githubusercontent.com/ufrisk/c5ba7b360335a13bba...
|
| with a PCIe adapter connected over what looked like USB3 and
| forgot that it's Thunderbolt on the MacBook. I was not quite
| to the middle of my first cup of coffee when I asked that.

| saagarjha wrote:
| USB 4!

| jjoonathan wrote:
| How widely deployed are IOMMUs these days? I thought they became
| a standard thing a few years ago.

| wtallis wrote:
| Intel used IOMMU support in their CPUs for product segmentation
| until Thunderbolt came around. It's now pretty widely supported
| in hardware, but seldom enabled by default at both the
| motherboard firmware level and OS level.

| drewg123 wrote:
| The problem is that, used properly, IOMMUs are horribly
| expensive.
|
| Consider a NIC driver where you're mapping an outgoing packet
| for DMA. What used to essentially be a virtual to physical
| translation becomes a virt to phys + entering the phys in the
| IOMMU + removing the mapping when the transmit is complete.
| This is expensive for hardware and software reasons. At one
| point I benchmarked a 100g setup on Linux, and with the IOMMU
| enabled, we lost about 90% of the bandwidth and most of the CPU
| time was spent in lock contention over the red-black tree that
| managed the IOMMU tables. This was 5-ish years ago, so perhaps
| things have gotten better.
|
| So that makes people want to just enable the IOMMU for SR-IOV
| (and full device) pass-thru to VMs. This is cheaper, since you
| just set the mapping up when you allocate phys mem for the
| guest, and tear them down when freeing phys mem.
|
| macOS used to use a really cool trick where they pre-mapped all
| mbufs into the IOMMU. That made network traffic transmit and
| receive comparatively fast. However, it also prevented lots of
| optimizations that modern operating systems use for zero-copy
| IO (like attaching pages from sendfile directly to mbufs,
| similar to skb_frags).

| the8472 wrote:
| > At one point I benchmarked a 100g setup on linux
|
| Datacenters have controlled physical access. I think IOMMU is
| far more important for anything with exposed Thunderbolt
| ports (including the upcoming USB4). So laptops, smartphones,
| workstations can still benefit from it even if it's currently
| not viable for cloud server-class workloads.

| spacenick88 wrote:
| I think the most expensive isn't the address translation
| itself but TLB housekeeping when mappings get remapped or
| invalidated. This is especially true with virtualization
| where the hypervisor often needs to do extra work like
| (un)pinning guest pages, translating guest real addresses to
| host real and reissuing TLB flushes.

| emmericp wrote:
| A problem on the hardware side is that Intel's IOMMU TLB is
| tiny (64 entries), so using huge pages for all DMA-accessible
| memory is absolutely required to get good performance out
| of it.
|
| We've done some benchmarks here:
| https://www.net.in.tum.de/fileadmin/bibtex/publications/pape...
| (Figure 9 on page 10)
|
| Only a very basic benchmark, working on more...

| ajross wrote:
| FWIW: 64 entries isn't particularly small for a dTLB, IIRC
| that's exactly the size on current Intel cores. The real
| problem is that device DMA, unlike software behavior, is
| distressingly non-local. The device will stream out a
| packet or storage block and then never touch that memory
| again (or not for a very long time -- memory buffers are
| huge relative to bandwidth on these things). The TLB just
| doesn't do you much good.

| emmericp wrote:
| There's only one level of TLBs in the IOMMU. And that's
| 64 entries.
|
| Yeah, I think the dTLB is only 64 entries on Intel CPUs
| as well, but there's a second larger layer behind that,
| and an even larger third layer. IIRC it's a total of 4096
| entries on recent Intel CPUs.

| drewg123 wrote:
| Nice paper, and thanks for the reminder of how small the
| IOMMU TLB is. We never hit this because we were testing
| full-sized packets (and really bigger, because of TSO) and
| hit host IOMMU management overheads at ~100k to 200k TSO
| sends/sec.

| emmericp wrote:
| Interesting, did you use huge pages?
|
| I think ~100k to 200k TSO "packets" per second should be
| doable with the IOMMU. But I guess it depends where the
| data is coming from. Could be one of the odd cases where
| copying data is faster than doing zero-copy, e.g., just
| copy everything into the same small set of small-ish
| buffers to keep the number of pages that need to be
| present in the IOMMU small?

| waddlesplash wrote:
| Ideally, one would only need to do this level of IOMMU for
| externally attached devices (i.e. Thunderbolt.) However, the
| problem is that the vast majority of kernel drivers in most
| OSes were _not_ written with malicious devices in mind, so
| they often do things like e.g. putting kernel address space
| pointers in device-writeable areas.
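
A minimal sketch of the usual alternative to exposing kernel pointers, offered as an illustration rather than anything taken from a real driver: the host hands the device an opaque random cookie and keeps the bookkeeping record in host-private memory. The names `Transfer`, `TransferTable`, `submit` and `complete` are hypothetical, and Python is used only to model the idea.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Transfer:
    """Host-private bookkeeping record (hypothetical); never device-visible."""
    buffer_len: int
    done: bool = False


class TransferTable:
    """Maps opaque cookies (all the device ever sees) to host-private records."""

    def __init__(self) -> None:
        self._pending: Dict[int, Transfer] = {}

    def submit(self, transfer: Transfer) -> int:
        # Give the device a random 64-bit cookie instead of an address.
        cookie = secrets.randbits(64)
        while cookie in self._pending:
            cookie = secrets.randbits(64)
        self._pending[cookie] = transfer
        return cookie

    def complete(self, cookie_from_device: int) -> Optional[Transfer]:
        # The echoed value is untrusted: look it up, never dereference it.
        transfer = self._pending.pop(cookie_from_device, None)
        if transfer is not None:
            transfer.done = True
        return transfer
```

With this shape, a device that echoes back a forged or replayed value gets `None` from `complete` instead of steering the driver to an attacker-chosen address.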
|
| For instance, XHCI (USB3) has an "address" field in its
| buffer descriptor structures that it often allows drivers to
| put anything into, and it will copy that into the "transfer
| finished" message. Some USB3 drivers just put the kernel
| address space pointer of the bookkeeping record in there; or
| worse, allocate the bookkeeping records along with the
| physical buffers themselves, so a malicious device could
| manipulate those if it knew where to look.

| peterburkimsher wrote:
| Does the attack in its current form work over Thunderbolt?
| I'd really like to mount my RAM as a drive, and see the
| files that various programs have open (the waveform from
| Rogue Amoeba's Fission, among others). Using gore just gave
| me a 2GB raw file, and it's hard for me to spot files (JPG,
| PNG) in there.

| peter_d_sherman wrote:
| Related: https://github.com/ufrisk/pcileech-fpga

| floatingatoll wrote:
| Two related links previously on HN (no comments):
|
| _Introducing the Memory Process File System for PCILeech_
| http://blog.frizk.net/2018/03/memory-process-file-system.htm...
|
| _Using your BMC as a DMA device: Plugging PCILeech to HPE ILO 4_
| https://www.synacktiv.com/posts/exploit/using-your-bmc-as-a-...

| loeg wrote:
| And from the other side of things, securing the host OS from
| devices, I found this article a few years ago interesting:
|
| _Fuzzing PCI express: security in plaintext (2017)_
| https://cloud.google.com/blog/products/gcp/fuzzing-pci-expre...
___________________________________________________________________
(page generated 2020-02-25 23:00 UTC)