[HN Gopher] The Garbage Collection Handbook, 2nd Edition ___________________________________________________________________ The Garbage Collection Handbook, 2nd Edition Author : zzxcvb Score : 285 points Date : 2023-04-08 11:28 UTC (11 hours ago) (HTM) web link (www.routledge.com) (TXT) w3m dump (www.routledge.com) | nektro wrote: | darn i thought this was gonna be a book about waste management | einpoklum wrote: | I feel the need for garbage collection is a language design mis- | feature. That is to say, producing garbage is a language design- | mis-feature. To quote Bjarne Stroustrup: | | > I don't like garbage. I don't like littering. My ideal is to | eliminate the > need for a garbage collector by not producing any | garbage. That is now > possible. | | and it's indeed possible. For example It's become pretty much a | non-issue in modern C++: | https://stackoverflow.com/a/48046118/1593077 (and C++ is not the | only example, it's just a prominent example of a language which | almost standardized garbage collection, but eventually did not go | that way.) | shwestrick wrote: | More and more people nowadays are programming at high levels of | abstraction. If you're designing the frontend of a website, or | making a mobile game, or developing a stock trading algorithm, | or whatever else, then you probably don't want to constantly | worry about details of memory management... | | On top of this, GC is necessary for some algorithms. Any data | structure with partial sharing (e.g. binary search tree with | versioning via path-copying) needs GC to be space-efficient. | You could either rely on a built-in GC, or write your own. If | you write your own, I think you'll find that it is tedious and | error-prone due to memory safety issues. | einpoklum wrote: | > then you probably don't want to constantly worry about | details of memory management... | | Oh, you actually don't have to, that's the whole point... in | the past, you (effectively) had a choice between careful | manual management of memory and garbage collection with its | overheads. These days, you can use constructs which take care | of that management for you, with very little or no overhead, | which don't involve garbage. | | It's true that sometimes GC is algorithmically necessary; but | then the high-level-of-abstraction argument is irrelevant. | And in those cases, a GC library does indeed come in useful. | You don't need to write your own, others have likely done it | already. | sirwhinesalot wrote: | C++ can actually produce quite a bit of garbage | unintentionally, it's why linters will remind you often to call | std::move. | | That said I much prefer deterministic resource cleanup even in | a janky language like C++ over a tracing GC. | einpoklum wrote: | It's true that C++ can trigger a lot of unnecessary copying | if one writes carelessly; but that's not the same as garbage, | in that both copies are used, and none of them continue | living indefinitely without special intervention. But point | taken about pass-by-value deficiencies. | kaba0 wrote: | You can't avoid garbage if you deal with generic programs -- | Rust and C++ can only implement a subset of all programs | without RC (which is the most basic GC algorithm). | | This is the same way to many other interesting CS properties -- | most of them are undecidable at compile time, so you have to do | it at runtime. | tomp wrote: | Modern C++ has reference counting ("smart pointers") and memory | leaks instead, so YMMV. | birdymcbird wrote: | "And memory leaks instead" I choked on my morning coffee | hahahahah | winrid wrote: | I wouldn't say it's a non issue. I frequently have to tune | allocators to fix heap fragmentation in databases... | einpoklum wrote: | Recognized on the last line of the post I linked to actually. | winrid wrote: | Oh sorry:) | vrglvrglvrgl wrote: | [dead] | ignoramous wrote: | On a related note, I found ART's implementation of _Concurrent | copying and compaction_ GC to be pretty novel. Here 's a nice | write up detailing how handling _page faults_ in userspace come | in handy to accomplish that: | https://www.tdcommons.org/cgi/viewcontent.cgi?article=4751&c... | (pdf) / | https://web.archive.org/web/20230216233459/https://www.tdcom... | | For context, here's a brief overview of the evolution of the | Android Runtime Garbage Collector: https://archive.is/Ue6Pj | NobleExpress wrote: | Certainly interesting, but there are no performance numbers | mentioned in the white paper comparing userfaultfd to read | barriers. So the actual benefit to switching to userfaultfd is | unknown (at least in publically accessible documents -- I'm | sure Google has done internal performance evaluations). | | Using page faults (and/or page protection) to perform | compaction instead of barriers is a pretty old technique (see | the 1988 paper by Appel, Ellis, and Li [1]; see also the | Compressor by Kermany and Petrank [2]). But handling page | faults were very expensive (at least on Linux) until the | addition of userfaultfd recently. | | [1]: https://dl.acm.org/doi/10.1145/960116.53992 [2]: | https://dl.acm.org/doi/10.1145/1133255.1134023 | t-3 wrote: | I found "A Unified Theory of Garbage Collection" helpful: | https://dl.acm.org/doi/10.1145/1035292.1028982 | spapas82 wrote: | 20 years ago in the programming languages lesson at the | university we started learning about OOP using Java and Ada as | examples. When the professor started describing Java, a fellow | student interrupted him to inform the class that "Java has | garbage collection" (and boast of his special knowledge). | | After that incident his nickname was "the garbage collector"! | chinchilla2020 wrote: | Plot Twist: | | 20 years later, he wrote "The Garbage Collection Handbook" and | is the leading authority on garbage collection. | ajdude wrote: | Coincidentally, Ada optionally supports garbage collection in | its specifications but it's up to the runtime to implement it. | mcculley wrote: | Which means no library can depend on it, pushing memory | management responsibilities into calling code. | pjmlp wrote: | And since no implementation has ever supported it, it has | been deprecated in ISO Ada 95 and further removed from it on | ISO Ada 2012. | | https://en.wikibooks.org/wiki/Ada_Programming/Pragmas/Contro. | .. | admax88qqq wrote: | There was this project though. | | https://github.com/Roldak/AGC | | The best part is that it's faster than manual management. | People will tell you they need do to malloc and free | manually for performance, but when you actually run the | numbers GC wins for a majority of use cases. | vlovich123 wrote: | Tracing garbage collectors don't generally win against | reference counting connectors, especially when those | reference counts are automatically elided via ARC (eg | swift and objective C) or because they're rarely used by | means of composition (c++ and rust). Additionally, | different kinds of application strategies are better | depending on the use case (eg a pool allocator that you | bulk drop at the end of some computation). | | What papers are you referencing showing tracing GCs | outperforming things? If it's just the website, I think | it's an artifact of a micro benchmark rather than | something that holds true for non trivial programs. | kaba0 wrote: | Then why do every performant managed language opts for | tracing GCs when they can? | | RC is used in lower level languages _because it doesn't | require runtime support_ , and can be implemented as a | library. | | As I wrote in another comment, even with elisions, you | are still trading off constant writes _on the working | thread_ for parallel work, and you even have to pay for | synchronization in parallel contexts. | KerrAvon wrote: | Because they don't care as much about working set size as | Apple does. | kaba0 wrote: | Sure, it is a tradeoff as basically every other technical | choice. | | But we were talking about performance here, and | especially in throughput, tracing GCs are much better. | adgjlsfhk1 wrote: | this isn't fully true. Java is in the process of getting | lxr which is uses both tracing and reference counting. | NobleExpress wrote: | Yes and no. LXR is a highly optimized deferred and | coalesced reference counting (amongst many other | optimizations) GC and it looks nothing like the RC | implementations you see in other languages like Python, | Nim, Swift, etc. So yes, reference counting can be | performant, _but_ only if you are using a deferred and | coalesced implementation as otherwise the simple | implementation requires atomic operations for every | pointer read/write operation (though compilers could | optimize it to use non-atomic operations when it's sure | semantics are preserved). | | EDIT: The post you're responding to is referring to the | simple standard implementation of RC. | NobleExpress wrote: | I've always heard of the "Swift elides reference counts" | statements but I've never seen it substantiated. I don't | claim to be a Swift GC expert by any means, but the | impression I get from the two Swift GC papers I've read | [1, 2] is that Swift has a very simple implementation of | RC. The RC optimization document (albeit the document is | incomplete) [3] also doesn't give me the impression that | Swift is doing much eliding of reference counts (I'm sure | it is doing it for simple cases). | | Do you have any links which might explain what kind of | eliding Swift is doing? | | EDIT: The major RC optimizations I have seen which elide | references are deferral and coalescing and I'm fairly | certain that Swift is doing neither. | | [1]: https://dl.acm.org/doi/abs/10.1145/3170472.3133843 | | [2]: https://doi.org/10.1145/3243176.3243195 | | [3]: https://github.com/apple/swift/blob/main/docs/ARCOpt | imizatio... | pjmlp wrote: | Swift and Objective-C ARC performance is quite poor. | KerrAvon wrote: | Compared to what, though? And is that still the case if | all OS components use whatever it is, as opposed to a few | applications? Memory efficiency is crucial for overall | system performance, and ARC is highly memory-efficient | compared to every production GC I'm aware of. | kenniskrag wrote: | How is that possible? Sounds counter-intuitive. | | Do you have a recommendation for reading? | lispm wrote: | There was an Ada implementation on the Lisp Machine, that | might have been using GC. | pjmlp wrote: | As for .NET and JVM implementations, but that is a | consequence of underlying platform, just like C++/CLI and | C++/CX, and none of them have been that big into the Ada | compiler market. | syngrog66 wrote: | > and further removed from it on ISO Ada 2012. | | and in that precise moment Ada proved it had garbage | collection all along | polynomial wrote: | Ah, same as C++ it sounds like. | cwzwarich wrote: | Anyone know what's new in this edition? Both the publisher's site | and the author's site are light on details. | fithisux wrote: | Do we need this book now that we have Rust ?? | dreamcompiler wrote: | That's like asking "Do we need Go now that we have Rust?" | fithisux wrote: | Exactly that. | | BTW My comment was ironic. | mcguire wrote: | No. Rust has completely consigned all of computer science to | the dustbin of irrelevance. | samsquire wrote: | if you need code to understand garbage collection, there is | walkthrough of garbage collector and C code at | http://maplant.com/gc.html is really helpful. | | I tweaked it to work on amd64 and started adding register | scanning based on what eatonphil's discord people told me to do. | | https://github.com/samsquire/garbage-collector | | It's not fit for any purpose but more of a learning exercise. | juunpp wrote: | Side topic, but the maplant website design is great. | mananaysiempre wrote: | Bob Nystrom (of _Game Programming Patterns_ , _Crafting | Interpreters_ , and dartfmt fame) also wrote a tutorial | implementation[1], of a precise tracing GC as opposed to a | conservative one. | | Regarding register scanning in a conservative GC, Andreas Kling | has made (or at least quoted) the amusing observation[2] that | your C runtime already has a primitive to dump all callee-save | registers to memory: setjmp(). So all you have to do to scan | both registers and stack is to put a jmp_buf onto the stack, | setjmp() to it, then scan the stack normally starting from its | address. | | [1] https://journal.stuffwithstuff.com/2013/12/08/babys-first- | ga... | | [2] https://youtu.be/IzB6iTeo8kk | sirwhinesalot wrote: | Implementations are unfortunately allowed to do whatever they | want to that jmp_buf, they could xor the contents for all you | know. Hopefully no implementation does something silly like | that. | mananaysiempre wrote: | This seems like a reasonable environmental assumption if | you're already scanning the stack conservatively. I'd be | more worried about pointer authentication (AArch64), | pointer encryption (Glibc) or perhaps register windows | (SPARC, Itanium). Still, as a cheap trick for avoiding | assembly it seems to work well enough in non-exotic | situations. | saagarjha wrote: | Seems like a reasonable thing to do to avoid forging | jmp_bufs. | jlokier wrote: | Glibc's mangles some pointer registers in setjmp(). It XORs a | per-process value with the stack pointer, frame pointer and | instruction pointer stored in the jmp_buf, on all (or nearly | all) architectures. | | Although losing the stack and instruction pointers is | unlikely to be a problem for the GC context, the frame | pointer register need not contain a frame pointer value. It | can be an arbitrary program value depending on compile | options. That's something to watch out for with this GC | technique. | mananaysiempre wrote: | You're right, and I shouldn't have dismissed the PTR_MANGLE | business so easily when I looked at the source[1]. In | hindsight, the __ILP32__ (i.e. x32) special case for the | high part of %rbp on x86-64 looks awfully suspicious even | if you don't know the details. | | Given that __attribute__((optimize("no-omit-frame- | pointer"))) doesn't seem to get GCC to save the parent | frame pointer on the stack reliably, while Clang doesn't | understand that atribute (or #pragma GCC optimize(...)) at | all, this now looks less slick than it initially seemed. | | ... Have I mentioned that I dislike hardening techniques? | | [1] https://elixir.bootlin.com/glibc/glibc-2.37/source/sysd | eps/x... | 29athrowaway wrote: | "If Java had true garbage collection, most programs would delete | themselves upon execution." - Robert Sewell | frou_dh wrote: | Who is Robert Sewell and why are his preferences interesting? | hutzlibu wrote: | I think it is a joke and at least to me it is funny, without | knowing who that guy is or was. | mr_00ff00 wrote: | 1. It's a joke | | 2. Many great programmers hold the opinion that Java is | horrible. Linus for example. So if Sewell doesn't seem | credible, try the creator of Linux and git. | pjmlp wrote: | I bet Linus does not get much outside C. | manuelabeledo wrote: | > Many great programmers hold the opinion that Java is | horrible. | | Could you name a few? | mr_00ff00 wrote: | Linus, Dijkstra, Stroustrup, to name a few. | | Time has shown OOP is not good. | kaba0 wrote: | Then it's great that Java is a general purpose | programming language with plenty of ways to do FP as | well. | | Also, being good at some subset of CS doesn't mean they | are good at language design. | saagarjha wrote: | Linus's thoughts on programming languages aren't | particularly enlightening either ;) | 29athrowaway wrote: | Someone who is right regarding this. | sadiq wrote: | Great that there's a new edition of this coming. This is the | definitive reference if you work with garbage collectors, it's | also very well written. | pjmlp wrote: | Great that is having an update, it is one of the most relevant | source of informations for GC algorithms. | sirwhinesalot wrote: | I don't know if it is included in the new edition of the book but | in case anyone is interested in a modern, highly efficient RC | implementation that does not rely on deferring the reference | count updates (which kills one of the advantages of RC in the | first place), check the Perseus paper. Koka (which uses perseus) | is quite competitive with OCaml and Haskell | | Just search for "perseus reference counting", you'll find it. It | uses linear logic to insert explicit "dup/drop" operations and | then merges and coalesces them. | throw10920 wrote: | Is there any particular reason you didn't link to the paper | itself (https://www.microsoft.com/en- | us/research/uploads/prod/2020/1...) or use Microsoft's scuffed | version of DOI (MSR-TR-2020-42)? The paper appears to be freely | available, so there shouldn't be copyright issues with linking | to it... | sirwhinesalot wrote: | Just that I'm on mobile and am lazy. Thanks for posting it! | mananaysiempre wrote: | Note that (the first edition of) this book considers reference | counting, including variations like deferred reference counting, | to be garbage collection (though not _tracing_ garbage | collection), and reviews it as well. | LadyCailin wrote: | Bizarre that you can't preorder it yet, you have to wait until | June, just to preorder. | rwmj wrote: | I managed to pre-order it on Amazon UK. | mananaysiempre wrote: | As a guess, they might want the accounting for preorders to | fall into Q3? Is that a thing people do? | mcculley wrote: | Maybe there is gamesmanship around best seller lists? | avinassh wrote: | its available on amazon to pre order - | https://www.amazon.com/dp/1032218037 | NeutralForest wrote: | Cool stuff! The last 10 years have seen lots of great languages | pushing the field so I'm excited to see what's in the book =) | _a_a_a_ wrote: | Met Richard Jones, one of the authors of the book, at its | original launch. Very nice guy. Bought the original book and did | some heavy reading on it - if you get this book, be prepared to | put time into it, it's readable but it's not one you can do a | quick skim of - and I really profited from it. It should be | required reading for all programmers. | | (To be clear, I'm referring to the very first version, I haven't | read the subsequent versions but given the quality of the first | I'd be very surprised if they were any less good). | | Edit: https://www.cs.kent.ac.uk/people/staff/rej/ - a jumpoff | page for his stuff. | | Edit2: https://www.cs.kent.ac.uk/people/staff/rej/gcbib/ - | "[This] online bibliographic database includes nearly 3,000 | garbage collection-related publications. It contains abstracts | for some entries and URLs or DOIs for most of the electronically | available ones, and is continually being updated. The database | can be searched online or downloaded as BibTeX, PostScript, or | PDF." Welcome to the ultimate rabbit hole I guess. | m3kw9 wrote: | Who would usually need this book? | mkl95 wrote: | I read the mark-and-sweep GC parts to gain some deeper | understanding about Go's garbage collector. Probably the best | resource for that kind of stuff other than reading actual code. | whartung wrote: | I've written two garbage collectors in my time, neither of | which were in a programming language. | | One was for an in memory cache of data relationships. Another | was to clean up soft references with an RDF graph. Neither | were, nor needed to be, particularly sophisticated. | | The cache was a compacting collector, the RDF one was mostly a | "connectedness" test, pruning those nodes fallen from the | graph. | | Recall that malloc has a simple garbage collector for its free | space, and arguably the level of sophistication that ranks a | modern malloc implementation is how it manages its free space. | | In the end detritus must be identified and resources reclaimed. | So you see how GC like systems can occur in divergent areas of | work. | parenthesis wrote: | This book (the 1st edition) gave me exactly what I needed when | writing the garbage collector for my programming language. | | Besides the great technical content, I found it to be a very | enjoyable, readable book. | rwmj wrote: | The first edition is a definitive reference for anyone writing | a garbage collector (and by extension, exploring writing a | programming language). Also people who are interested in how | GCs work. | | Probably less so for people trying to optimize a program to | work with an existing GC (eg. tweaking the JVM), but I suppose | knowing the basic principles can help. | mkl95 wrote: | Will there be a PDF / epub option? | daveoc64 wrote: | Looks like their ebooks are available through a number of | sources: | | https://www.routledge.com/our-products/ebooks | vincent-manis wrote: | If I'm not mistaken, Routledge uses that VitalSource dreck, so | you can only read their ebooks via a dedicated app. I don't buy | dead-tree books any more, but I'll make an exception for this | one. | cidd wrote: | Rust has left the building | rwmj wrote: | Rust also uses reference counting, probably the worst sort of | garbage collection. | mr_00ff00 wrote: | Tracing is the worst in terms of performance | kaba0 wrote: | Anyone claiming something like this obviously hasn't dig | into GCs. You honestly think that _writing_ into memory at | each access, especially atomically is anywhere near the | performance of a GC that can do most of its work in | parallel and just flip a bit to basically "having deleted" | everything no longer accessible? | mr_00ff00 wrote: | a bit flip is not writing? | | Also do traces not have to work atomically? The program | needs to stop, you can't have it check roots as it runs. | | I'll admit I am no GC researcher with ph.D experience, | but your comment makes it seem you aren't either. | pjmlp wrote: | Not really, here it is winning hands down over Swift's ARC | implementation. | | https://github.com/ixy-languages/ixy-languages | saagarjha wrote: | Wasn't this the comparison that decided to use reference | types for everything for no real reason? | kaba0 wrote: | I don't know anything about the benchmark, but how would | you test GC implementations without reference types? | saagarjha wrote: | Swift's value types have reference counts because they | may have members that need their lifetimes to be managed | appropriately. (For example, if they're reference types.) | danappelxx wrote: | This is incorrect, value types are not referenced counted | in Swift. If a value type contains a reference type | member (usually an anti pattern!), then that member's | reference count is indeed incremented when the value type | is copied. But it is not accurate to claim that value | types themselves have reference counts. | pjmlp wrote: | The reason being comparing how various schemes of | automatic reference memory management perform. | | Naturally if the purpose was to compare stack allocation | performance other approach would have been taken. | saagarjha wrote: | The goal was to write a network driver in several | languages. Nobody said anything about comparing memory | management techniques, nor would the Swift implementation | use a stack allocator anyways. | pjmlp wrote: | They would appreciate your contribution to fix the | benchmark. | | Apparently the ARC performance improvements announced at | WWDC 2022 weren't needed. | saagarjha wrote: | I don't think they would, considering they've already | finished their study. | pjmlp wrote: | You could have your own repo to counterpost every time | someone like me refers to that old study with tainted | results. | SideQuark wrote: | That depends. Deallocating a zillion little objects one a a | time can be slower than doing them all in a batch. | _a_a_a_ wrote: | > Tracing is the worst in terms of performance | | eh? | _a_a_a_ wrote: | Since you're clearly struggling, permit me to spell it | out for you: why is tracing worse in terms of performance | than reference counting, and under what circumstances? | mcguire wrote: | It usually isn't. Reference counting is good when you | need to rather tightly limit memory use, are wedded to | RAII, or need something simple enough that a programmer | can knock it out in an afternoon. | nu11ptr wrote: | Only when used in a naive way, which Rust does not. For | example, the increments/decrements are done only when "clone" | is called and scope exit respectively, and based on Rust | ownership/borrow checking, is rarely done combining the best | of both worlds (but yes, implementations with aggressive | increment/decrements in loops and on every function call can | be very slow). Rust also separates Arc (atomic refs) and Rc | (non-atomic refs) and enforces usage scenarios in the type | checker giving you cheap Rc in single threaded scenarios. | Reference counting when done in a smart way works pretty | well, but you obviously have to be a little careful of cycles | (which in my experience are pretty rare and fairly obvious | when you have such a data type). | sirwhinesalot wrote: | The increment/decrement calls only occur on an explicit call | to .clone(). No .clone(), no increment/decrement. | | You won't see many clones in rust code. | rwmj wrote: | It's how often reference counts are adjusted on hot paths | that matters (including in libraries), and back to the | original point, reference counting doesn't let you free | groups of objects in one go (unlike a tracing GC). | | Also it'd be nice if the reference counts were stored | separately from the objects. Storing them alongside the | object being tracked is a classic mistake made by reference | count implementations (it spreads the writes over a large | number of cache lines). I was actually surprised that Rust | doesn't get this right. | | Another issue with manual memory management is that you | can't compact the heap. | sirwhinesalot wrote: | The amount of reference-counted pointers in most Rust | code is a tiny fraction compared to boxes or compiler- | tracked-lifetime references. | | Yes in theory it would be more efficient to store all the | reference counts together, but that's _in theory_. In | practice most Rust apps will not call clone on a shared | pointer on a hot path and if they do it 's usually 1 such | pointer and they do something with the data as well (so | it's all 1 cache line anyway) | | You can't compare Rust/C++ with Swift/Nim when it comes | to RC, there just aren't enough reference count | operations for it to matter much (unless you're in a | shitty OO C++ codebase like me that pretends it is java | with std::shared_ptr everywhere) | | Apps where heap compaction would be relevant in a low- | level language like Rust or C++ will typically use a bump | allocator which will trounce any kind of GC. | ufo wrote: | The book also has a chapter on reference counting ;-) | medo-bear wrote: | nek minnit top hn post: garbage collection - | in rust | Dwedit wrote: | What I really want out of a garbage collector is a "Collect" | function with a deadline. Pick a max time it's allowed to run | before stopping and returning to the program. | auxym wrote: | Nim has exactly that with the GC_step proc. | | https://nim-lang.org/1.4.0/gc.html | | However recent and future versions (2.0) are moving towards a | different approach that is also applicable for deterministic | real time systems: ARC, which is basically alloc and free calls | inserted automatically by the compiler using static analysis | (no "runtime"). | grumpy_coder wrote: | That is necessary but often not sufficient. It still leaves the | problem of moving time from where garbage is created to the | bulk 'collect' step. I've run collect every frame to keep | stalls down, but the time always creeps up and up and you end | up doing forensic analysis of which code is making the garbage | in the first place. | NobleExpress wrote: | Real time GCs exist such as the IBM Metronome GC. Though I'll | be honest and say I haven't heard of many real-time GCs other | than the Metronome one. Certainly many new GCs have reduced | pause times dramatically but that's orthogonal to real-time GC | (as you can make pause times infinitesimally small but not let | the mutator actually progress). | pjmlp wrote: | See PTC and Aicas. | [deleted] | mananaysiempre wrote: | Here's a fairly extensive review that mentions some recent | research on "real-time" GCs, including one with a lower bound | for mutator utilization: | https://eschew.wordpress.com/2016/09/02/summarizing-gc/. | bob1029 wrote: | I've achieved GC timing that is good enough for real-time | competitive game hosting using .NET6+. Worst case over 4 hours | of load testing was an 8ms spike. Average 1-2ms. | | The magic trick is to intentionally collect as often as | reasonably possible (i.e. at batch/frame/tick processing | boundaries) and avoid using sophisticated GC schemes that | involve multiple threads or asynchrony. | | Oh, and obviously you need to minimize allocations throughout | or it won't matter. ___________________________________________________________________ (page generated 2023-04-08 23:00 UTC)