[HN Gopher] PCILeech
       ___________________________________________________________________
        
       PCILeech
        
       Author : peter_d_sherman
       Score  : 186 points
       Date   : 2020-02-25 14:03 UTC (8 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | blibble wrote:
       | I've had intel_iommu=on in my boot cmdline on my personal
       | machines (ones I do my banking on!) for 5 years without issue
       | 
       | (board+CPU for all machines both support it, dmesg confirms it is
       | indeed on)
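
A check like the one described above could be scripted roughly as follows. This is a minimal sketch, assuming a Linux system where the boot parameters are readable at /proc/cmdline and active IOMMU groups show up under /sys/kernel/iommu_groups; the helper names are made up for illustration.

```python
import os


def cmdline_requests_intel_iommu(cmdline: str) -> bool:
    """Return True if a kernel command line carries intel_iommu=on."""
    for token in cmdline.split():
        key, _, value = token.partition("=")
        # The value may be a comma-separated option list.
        if key == "intel_iommu" and "on" in value.split(","):
            return True
    return False


def count_iommu_groups(sysfs_dir: str = "/sys/kernel/iommu_groups") -> int:
    """Count IOMMU groups listed in sysfs; 0 if the directory is missing."""
    try:
        return len(os.listdir(sysfs_dir))
    except OSError:
        return 0


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            cmdline = f.read()
    except OSError:
        cmdline = ""  # not Linux, or /proc not mounted
    print("intel_iommu=on requested:", cmdline_requests_intel_iommu(cmdline))
    print("IOMMU groups found:", count_iommu_groups())
```

A non-zero group count is only a hint that the IOMMU is active; the dmesg output mentioned above remains the more direct confirmation.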
        
       | tezza wrote:
       | So an Action Replay for x64
       | 
       | https://gamehacking.org/wiki/Action_Replay_(Amiga)
        
         | teddyh wrote:
         | Also for PC:
         | 
         | https://www.youtube.com/watch?v=usaioMbE8EQ
        
       | MaupitiBlue wrote:
       | Doesn't this change everything wrt cheating?
        
         | inetknght wrote:
         | I would imagine it's no more "change everything" than having
         | root access to your own hardware.
        
         | tuvan wrote:
         | These devices are apparently already used for cheating. Here
         | are some blog posts from Riot Games and ESEA mentioning direct
         | memory access devices as cheating vectors and preventative
         | methods if you are interested.
         | https://blog.esea.net/esea-hardware-cheats/
         | https://euw.leagueoflegends.com/en-gb/news/dev/dev-null-anti...
        
       | anonsivalley652 wrote:
       | Roughly 10 years ago, I knew a guy who sold various exfil/border-
       | extraction products to state actors including FireWire dongles
       | similar to Inception (they may even be effectively the same
       | thing, IDK). He specialized in memory dumping attacks on several
       | platforms including Windows, FreeBSD, Linux and macOS.
       | 
       | The core problem is that buses should be authenticated,
       | authorized, encrypted, and selective-/least-privileged channels.
       | Exposing a memory and expansion bus to the outside world in the
       | name of "convenience" is insane. A trusted set of components in
       | the OS and in hardware should:
       | 
       | 0. Be able to do HwIDS checksumming of all firmware to detect
       | tampering.
       | 
       | 1. Limit devices' ability to connect unless they are authorized
       | by the user, much like a "hardware firewall" UI: vaguely similar
       | to, say, VMware Workstation/Fusion's dialog when plugging in a
       | new USB device, mixed with something like Little Snitch's dialog
       | for a process wanting to connect to a particular port.
       | 
       | 2. Authenticate devices with public/private key certs that are
       | burned in, a function where the device can answer challenge
       | requests, and a Signal-protocol-like construction properly
       | modified for PKI. Then, and only then, can a host talk securely
       | to a device over an encrypted channel.
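
The challenge-request step in point 2 can be sketched as below. Note the substitution: the Python standard library has no public-key signatures, so this sketch uses an HMAC over a pre-shared key in place of the burned-in certificate the comment proposes. The `Device` and `Host` classes and the device ID are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Dict


class Device:
    """Hypothetical peripheral holding a burned-in secret key."""

    def __init__(self, key: bytes):
        self._key = key

    def respond(self, challenge: bytes) -> bytes:
        # Prove knowledge of the key without revealing it.
        return hmac.new(self._key, challenge, hashlib.sha256).digest()


class Host:
    """Hypothetical host keeping a user-approved list of device keys."""

    def __init__(self, authorized_keys: Dict[str, bytes]):
        self._keys = authorized_keys

    def authenticate(self, device_id: str, device: Device) -> bool:
        key = self._keys.get(device_id)
        if key is None:
            return False  # never authorized by the user
        challenge = secrets.token_bytes(32)  # fresh nonce defeats replay
        expected = hmac.new(key, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(expected, device.respond(challenge))
```

With a real key pair the host would store only the device's public key, so compromising the host's list would not let an attacker impersonate the device.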
        
         | hoistbypetard wrote:
         | We never sold anything for it, but for some demos back around
         | 2004/2005 we flashed some firewire iPods with custom firmware
         | that performed the attack described here:
         | 
         | https://web.archive.org/web/20071011191205/http://md.hudora....
         | 
         | (To be clear, those slides aren't mine, but I can no longer
         | find the firmware we based ours off of, and never did get
         | permission to post source to our mods, which were never
         | distributed.)
         | 
         | I know a couple of our customers took entirely the wrong lesson
         | from our demonstrations and banned mp3 players from their
         | office buildings entirely afterwards :)
        
       | heeen2 wrote:
       | Weren't these used in a famous Counter-Strike cheating scandal in
       | Norway?
        
         | VectorLock wrote:
         | I was just thinking "would this be an effective way to create
         | undetectable cheats for video games?"
        
         | landr0id wrote:
         | Yep, they're also pretty popular with people who cheat in
         | ESEA/FaceIT third-party ladders. One of the guys, ra1f,
         | eventually outed himself [1]. The software-based anticheat is
         | fairly decent, pushing people to hardware-based cheats built
         | off of PCILeech to avoid detection.
         | 
         | [1]: https://twitter.com/rra1f/status/1067518342595006466
        
       | _aleph2c_ wrote:
       | An interview with Ulf Frisk:
       | 
       | https://www.youtube.com/watch?v=MIfY8g73xms&feature=emb_rel_...
        
         | DrRobinson wrote:
         | Another interview with him:
         | https://www.youtube.com/watch?v=W5Yb3q9iJao
        
       | hoistbypetard wrote:
       | Does DMA over PCIe work using USB gadget mode with a Linux
       | device? i.e. could a Pi be used easily and inexpensively to build
       | an acquisition device for this?
       | 
       | Edit: Bleh. Nevermind. I saw this photo:
       | 
       | https://gist.githubusercontent.com/ufrisk/c5ba7b360335a13bba...
       | 
       | with a pcie adapter connected over what looked like USB3 and
       | forgot that it's thunderbolt on the macbook. I was not quite to
       | the middle of my first cup of coffee when I asked that.
        
         | q3k wrote:
         | Well, USB is not PCIe and doesn't do DMA, so no.
        
           | hoistbypetard wrote:
           | Aren't most of the devices on the linked page doing PCIe over
           | USB3?
        
             | saagarjha wrote:
             | USB 4!
        
       | jjoonathan wrote:
       | How widely deployed are IOMMUs these days? I thought they became
       | a standard thing a few years ago.
        
         | wtallis wrote:
         | Intel used IOMMU support in their CPUs for product segmentation
         | until Thunderbolt came around. It's now pretty widely supported
         | in hardware, but seldom enabled by default at both the
         | motherboard firmware level and OS level.
        
         | drewg123 wrote:
         | The problem is that, used properly, IOMMUs are horribly
         | expensive.
         | 
         | Consider a NIC driver where you're mapping an outgoing packet
         | for DMA. What used to essentially be a virtual to physical
         | translation becomes a virt to phys + entering the phys in the
         | iommu + removing the mapping when the transmit is complete.
         | This is expensive for hardware and software reasons. At one
         | point I benchmarked a 100g setup on linux, and with the IOMMU
         | enabled, we lost about 90% of the bandwidth and most of the CPU
         | time was spent in lock contention over the red-black tree that
         | managed the IOMMU tables. This was 5-ish years ago, so perhaps
         | things have gotten better.
         | 
         | So that makes people want to just enable the IOMMU for SR-IOV
         | (and full device) pass-thru to VMs. This is cheaper, since you
         | just set the mapping up when you allocate phys mem for the
         | guest, and tear them down when freeing phys mem.
         | 
         | macOS used to use a really cool trick where they pre-mapped all
         | mbufs into the IOMMU. That made network traffic transmit and
         | receive comparatively fast. However, it also prevented lots of
         | optimizations that modern operating systems use for zero-copy
         | IO (like attaching pages from sendfile directly to mbufs,
         | similar to skb_frags).
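
The difference between per-packet mapping and a pre-mapped buffer pool can be illustrated with a toy model that only counts mapping-table updates. This is a sketch under loud assumptions: `ToyIommu` is a made-up stand-in, not a kernel API, and it ignores locking, invalidation, and hardware cost entirely.

```python
class ToyIommu:
    """Counts mapping-table updates; a stand-in for real IOMMU tables."""

    def __init__(self):
        self.mappings = set()
        self.updates = 0

    def map(self, phys_page):
        self.mappings.add(phys_page)
        self.updates += 1

    def unmap(self, phys_page):
        self.mappings.discard(phys_page)
        self.updates += 1


def send_dynamic(iommu, packet_pages):
    """Map each packet's page before DMA and unmap it afterwards."""
    for page in packet_pages:
        iommu.map(page)
        # ... the device would DMA the packet here ...
        iommu.unmap(page)


def send_premapped(iommu, pool_pages, n_packets):
    """Map a fixed pool once, then reuse its pages for every packet."""
    for page in pool_pages:
        iommu.map(page)
    for i in range(n_packets):
        page = pool_pages[i % len(pool_pages)]
        if page not in iommu.mappings:
            raise RuntimeError("pool page unexpectedly unmapped")
        # ... the device would DMA the packet here ...


if __name__ == "__main__":
    dyn, pre = ToyIommu(), ToyIommu()
    send_dynamic(dyn, range(1000))
    send_premapped(pre, list(range(64)), 1000)
    print("dynamic updates:", dyn.updates)
    print("pre-mapped updates:", pre.updates)
```

The pre-mapped variant pays for its low update count in the way the comment describes: packets must live in pool pages, so pages from elsewhere cannot be attached without a copy.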
        
           | the8472 wrote:
           | > At one point I benchmarked a 100g setup on linux
           | 
           | Datacenters have controlled physical access. I think IOMMU is
           | far more important for anything with exposed Thunderbolt
           | ports (including the upcoming USB4). So laptops, smartphones,
           | and workstations can still benefit from it even if it's
           | currently not viable for cloud server-class workloads.
        
           | spacenick88 wrote:
           | I think the most expensive part isn't the address translation
           | itself but TLB housekeeping when mappings get remapped or
           | invalidated. This is especially true with virtualization,
           | where the hypervisor often needs to do extra work like
           | (un)pinning guest pages, translating guest real addresses to
           | host real addresses, and reissuing TLB flushes.
        
           | emmericp wrote:
           | A problem on the hardware side is that Intel's IOMMU TLB is
           | tiny (64 entries), so using huge pages for all DMA-accessible
           | memory is absolutely required to get good performance out of
           | it.
           | 
           | We've done some benchmarks here:
           | https://www.net.in.tum.de/fileadmin/bibtex/publications/pape...
           | (Figure 9 on page 10)
           | 
           | Only a very basic benchmark, working on more...
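
The huge-page argument comes down to simple arithmetic: the memory a TLB can cover without a miss is entries times page size. A short sketch, taking the 64-entry figure from the comment above as given and assuming the usual x86 page sizes of 4 KiB and 2 MiB:

```python
IOTLB_ENTRIES = 64  # figure quoted in the comment above

PAGE_SIZES = {
    "4 KiB": 4 * 1024,
    "2 MiB": 2 * 1024 * 1024,
}


def tlb_reach_bytes(entries: int, page_size: int) -> int:
    """Bytes of memory addressable without a TLB miss."""
    return entries * page_size


if __name__ == "__main__":
    # 64 * 4 KiB = 256 KiB; 64 * 2 MiB = 128 MiB
    for name, size in PAGE_SIZES.items():
        reach_kib = tlb_reach_bytes(IOTLB_ENTRIES, size) // 1024
        print(f"{name} pages: reach = {reach_kib} KiB")
```

With 4 KiB pages the reach is smaller than a typical NIC's set of receive buffers, which is consistent with the claim that huge pages are needed.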
        
             | ajross wrote:
             | FWIW: 64 entries isn't particularly small for a dTLB; IIRC
             | that's exactly the size on current Intel cores. The real
             | problem is that device DMA, unlike software behavior, is
             | distressingly non-local. The device will stream out a
             | packet or storage block and then never touch that memory
             | again (or not for a very long time -- memory buffers are
             | huge relative to bandwidth on these things). The TLB just
             | doesn't do you much good.
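
The locality point can be shown with a small simulation. This is a sketch that assumes an LRU replacement policy and one lookup per page touched; real IOTLB policies are not described in the thread, so treat the numbers as illustrative only.

```python
from collections import OrderedDict


def lru_hit_rate(page_accesses, entries=64):
    """Fraction of lookups that hit in an LRU-managed TLB."""
    tlb = OrderedDict()
    hits = 0
    total = 0
    for page in page_accesses:
        total += 1
        if page in tlb:
            hits += 1
            tlb.move_to_end(page)  # mark as most recently used
        else:
            tlb[page] = True
            if len(tlb) > entries:
                tlb.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


# Device-like pattern: stream through a ring of 4096 buffer pages.
streaming = [i % 4096 for i in range(100_000)]
# Software-like pattern: keep revisiting a 32-page working set.
local = [i % 32 for i in range(100_000)]

if __name__ == "__main__":
    print("streaming hit rate:", lru_hit_rate(streaming))
    print("local hit rate:", lru_hit_rate(local))
```

In the streaming case every page has been evicted by the time the ring comes back around to it, so a larger TLB only helps once it covers the whole ring.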
        
               | emmericp wrote:
               | There's only one level of TLBs in the IOMMU. And that's
               | 64 entries.
               | 
               | Yeah, I think the dTLB is only 64 entries on Intel CPUs
               | as well, but there's a second larger layer behind that,
               | and an even larger third layer. IIRC it's a total of 4096
               | entries on recent Intel CPUs.
        
             | drewg123 wrote:
             | Nice paper, and thanks for the reminder of how small the
             | IOMMU tlb is. We never hit this because we were testing
             | full-sized packets (and really bigger, because of TSO) and
             | hit host IOMMU management overheads at ~100K to 200k TSO
             | sends/sec.
        
               | emmericp wrote:
               | Interesting, did you use huge pages?
               | 
               | I think ~100k to 200k TSO "packets" per second should be
               | doable with the IOMMU. But I guess it depends where the
               | data is coming from. Could be one of the odd cases where
               | copying data is faster than doing zero-copy, e.g., just
               | copy everything into the same small set of small-ish
               | buffers to keep the number of pages that need to be
               | present in the IOMMU small?
        
           | waddlesplash wrote:
           | Ideally, one would only need to do this level of IOMMU for
           | externally attached devices (i.e. Thunderbolt.) However, the
           | problem is that the vast majority of kernel drivers in most
           | OSes were _not_ written with malicious devices in mind, so
           | they often do things like putting kernel address space
           | pointers in device-writeable areas.
           | 
           | For instance, XHCI (USB3) has an "address" field in its
           | buffer descriptor structures that drivers can often put
           | anything into, and the controller will copy it into the
           | "transfer finished" message. Some USB3 drivers just put the
           | kernel address space pointer of the bookkeeping record in
           | there; or worse, allocate the bookkeeping records along with
           | the physical buffers themselves, so a malicious device could
           | manipulate those if it knew where to look.
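
One common hardening pattern for this kind of field is to hand the device an opaque cookie and look the bookkeeping record up in driver-private memory when the completion arrives. The sketch below illustrates the idea only; `CompletionTable` is a hypothetical name, and this is not how any particular XHCI driver is written.

```python
import secrets


class CompletionTable:
    """Maps opaque cookies to bookkeeping records kept out of DMA memory."""

    def __init__(self):
        self._records = {}

    def register(self, record) -> int:
        """Store a record and return the cookie to place in the descriptor."""
        cookie = secrets.randbits(64)
        while cookie in self._records:  # retry on the unlikely collision
            cookie = secrets.randbits(64)
        self._records[cookie] = record
        return cookie

    def complete(self, cookie_from_device: int):
        """Return the record for a device-supplied cookie, or None.

        The value is untrusted: unknown or replayed cookies yield None
        instead of being dereferenced as a pointer.
        """
        return self._records.pop(cookie_from_device, None)
```

Because the device only ever sees the cookie, a forged completion can at worst name another in-flight transfer, and it reveals nothing about kernel address layout.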
        
             | peterburkimsher wrote:
             | Does the attack in its current form work over Thunderbolt?
             | I'd really like to mount my RAM as a drive, and see the
             | files that various programs have open (the waveform from
             | Rogue Amoeba's Fission, among others). Using gore just gave
             | me a 2GB raw file, and it's hard for me to spot files (JPG,
             | PNG) in there.
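
Spotting image files in a raw dump is usually done by scanning for their magic bytes. A minimal sketch, using the standard PNG signature and the JPEG start-of-image marker; the function name is made up, and finding where each file ends (and whether its pages are contiguous in the dump) is left out.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # 8-byte PNG file signature
JPEG_MAGIC = b"\xff\xd8\xff"      # JPEG start-of-image marker


def find_signatures(data: bytes):
    """Return sorted (offset, kind) pairs for every PNG/JPEG header found."""
    hits = []
    for kind, magic in (("png", PNG_MAGIC), ("jpeg", JPEG_MAGIC)):
        start = 0
        while True:
            idx = data.find(magic, start)
            if idx == -1:
                break
            hits.append((idx, kind))
            start = idx + 1
    return sorted(hits)


if __name__ == "__main__":
    sample = b"\x00" * 10 + PNG_MAGIC + b"junk" + JPEG_MAGIC + b"\xe0"
    for offset, kind in find_signatures(sample):
        print(f"{kind} header at offset {offset}")
```

For a 2GB dump, memory-mapping the file with the `mmap` module avoids reading it all into memory; mmap objects offer a `find` method with the same arguments.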
        
       | peter_d_sherman wrote:
       | Related: https://github.com/ufrisk/pcileech-fpga
        
       | floatingatoll wrote:
       | Two related links previously on HN (no comments):
       | 
       |  _Introducing the Memory Process File System for PCILeech_
       | http://blog.frizk.net/2018/03/memory-process-file-system.htm...
       | 
       |  _Using your BMC as a DMA device: Plugging PCILeech to HPE ILO 4_
       | https://www.synacktiv.com/posts/exploit/using-your-bmc-as-a-...
        
         | loeg wrote:
         | And from the other side of things, securing the host OS from
         | devices, I found this article interesting a few years ago:
         | 
         |  _Fuzzing PCI express: security in plaintext (2017)_
         | https://cloud.google.com/blog/products/gcp/fuzzing-pci-expre...
        
       ___________________________________________________________________
       (page generated 2020-02-25 23:00 UTC)