[HN Gopher] Notes on BPF and eBPF
___________________________________________________________________
Notes on BPF and eBPF

Author : mlerner
Score  : 97 points
Date   : 2022-01-02 19:48 UTC (3 hours ago)

(HTM) web link (jvns.ca)
(TXT) w3m dump (jvns.ca)

| daenz wrote:
| >eBPF programs can't access arbitrary kernel memory. Instead the
| kernel provides functions to get at some restricted subset of
| things.
|
| I must finally be becoming a security pessimist, because when I
| read those sentences the first thing I think is: these statements
| will not age well.
|
| Flocular wrote:
| The BPF capability should really only be given to root. I don't
| think it really gives any new attack surface. All I could see
| is it giving black-hats an easier interface to
| "kernel-level-fuckery".
|
| _3u10 wrote:
| It's not an easier interface. It's much easier to write
| kernel modules than to mess around with eBPF.
|
| stormbrew wrote:
| Easier to get started maybe, but there's something to be
| said for the 'ease' of having to worry a lot less about
| crashing the kernel you're working on.
|
| csdvrx wrote:
| Yes, "can't" should be replaced by "shouldn't".
|
| If there's a physical possibility, it's just a matter of time
| before someone finds a way, as was proved by the CPU cache bugs
| leaking information.
|
| fragmede wrote:
| That's definitely a security professional's read :)
|
| "Isn't supposed to be able to" is a lot longer and more
| distracting than the oversimplification-for-sake-of-understanding
| of "can't". As far as it being proven wrong though - that's
| already happened, e.g. CVE-2021-29154
|
| https://blog.kernelcare.com/vulnerability/specially-crafted-...
|
| daenz wrote:
| That's fair. I understand that in the ideal world, it would
| be "can't." I guess my concern is that the wording kind of
| hand-waves away any potential security issues, when people
| interested in this tech should absolutely be made aware of
| them.
|
| netsec_burn wrote:
| Many of the links under "things you can attach eBPF programs to"
| are broken, unfortunately.
|
| crtxcr wrote:
| >things you can attach eBPF programs to
|
| >...
|
| >seccomp / landlock security things
|
| Landlock does not use *BPF.
|
| Seccomp can only use BPF at this point, not eBPF (though there
| has been some work on it).
|
| pwnna wrote:
| BPF is indeed a pretty interesting technology. As the knowledge
| about it becomes more widespread, I anticipate that we will
| unlock some new capabilities in terms of tracing. Brendan
| Gregg's book
| (https://www.brendangregg.com/bpf-performance-tools-book.html)
| serves as a good intro to this, although you probably only need
| to read a small chunk of it as a lot of it is
| reference-book-style material.
|
| The author mentioned that you can trace MySQL with USDT, which
| is a tracepoint inserted by the developer at select locations in
| the code. These tracepoints form a "stable interface" for
| tracing/performance debugging, whereas uprobes, which hook into
| select userspace functions, are unstable because they can break
| whenever the binary is recompiled. Unfortunately, the USDT
| tracepoints (via DTrace) have been removed in MySQL 8.0. This
| makes it significantly more difficult to trace MySQL, although
| it's not impossible. I've done a proof of concept of tracing
| MySQL with uprobe instead of USDT in this repo[1], which can kind
| of give you the same results (and possibly more stuff, as I can
| more easily read arbitrary memory addresses due to how the old
| USDT tracepoints are structured). This is not stable though, as
| any MySQL upgrade may introduce incompatibility with the trace
| script, because I read memory addresses based on offsets (whereas
| with USDT this can be kept pretty stable). My appeal to Oracle to
| re-add this functionality[2] has unfortunately been rejected,
| which I think is a mistake given the wide range of possibilities
| unlocked via BPF.
|
| [1]: https://github.com/shuhaowu/mysqld-bpf
|
| [2]: https://bugs.mysql.com/bug.php?id=105741
|
| Another thing that I've been recently thinking of is using BPF to
| validate programs written for real-time Linux (via PREEMPT_RT).
| To my understanding, one of the main things to avoid is page
| faults [3]. With the proper BPF tracing scripts, I think we can
| validate that programs indeed avoid page faults in integration
| testing. I'm not sure if it is super useful yet, but as I'm
| trying to write a few RT programs, it's something that came to my
| mind.
|
| [3]: https://lwn.net/Articles/837019/
|
| In addition to tracing (so bpftrace-based/bcc-based tools), I've
| recently discovered that there are:
|
| 1. ebpfsnitch (https://github.com/harporoeder/ebpfsnitch): which
| is an application-level firewall without kernel modules.
|
| 2. ebpf-traffic-monitor
| (https://source.android.com/devices/tech/datausage/ebpf-traff...):
| which appears to be using BPF to account for traffic for
| different apps on Android.
|
| 3. kubectl trace (https://github.com/iovisor/kubectl-trace): Run
| tracing on k8s.
|
| There are apparently also use cases in the context of security,
| but I'm not familiar with them.
|
| kylequest wrote:
| Lots of good eBPF info from eBPF Summit:
| https://ebpf.io/summit-2021/ and https://ebpf.io/summit-2020/
|
| Also videos from eBPF Day KubeCon 2021:
| https://www.youtube.com/playlist?list=PLj6h78yzYM2Pm5nF_GmNQ...
|
| rammy1234 wrote:
| Is the PDF link broken in the blog?
|
| gnabgib wrote:
| It's available at:
| https://files.speakerdeck.com/presentations/130bc7df16db4556...
| (or click the download button on the slides page)
___________________________________________________________________
(page generated 2022-01-02 23:00 UTC)