[HN Gopher] Implementing a safe garbage collector in Rust ___________________________________________________________________ Implementing a safe garbage collector in Rust Author : celeritascelery Score : 70 points Date : 2022-04-26 11:56 UTC (3 days ago) (HTM) web link (coredumped.dev) (TXT) w3m dump (coredumped.dev) | ridiculous_fish wrote: | I don't understand how the rooting works here. Usually GC | languages make the distinction between a "pointer" which is a | reference into the heap, and a "root" which is something known to | the GC that keeps a pointer live. I don't see that distinction | here. I think this is what the author is saying: | | 1. You can allocate from an arena; you get an object. | | 2. Allocated objects may be freed when a GC is triggered, unless | they are added as a root. It is your responsibility to add your | object as a root. | | 3. Roots are in a LIFO stack and cannot be moved. | | Ok, but we never actually see the root _type_. For example: | let rooted: Vec<Object<'root>> = ...; // we have a vec of root | objects let object: Object<'arena> = arena.add("new"); // | object is bound to arena rooted.push(object); // object | is now bound to rooted arena.garbage_collect(); // Object | is marked as live and not freed | | How is anything rooted here? The lifetime changed from 'arena to | 'root but I don't see a root being created. | | Some other misc questions: | | _What if instead of storing them as a map, we store them as a | stack instead? When something is rooted, it is pushed on the | stack. When it drops, it is popped from the stack._ | | But later we see roots not obeying a LIFO order, under | "Preventing escapes" where roots are dynamically created and | destroyed in an arbitrary order. | | _The function takes a &mut Arena, and at the end it returns an | Object with the same lifetime_ | | But earlier it says "we have to make sure the object can't move." | I'm confused what is being returned here: an object reference, or | a root. | celeritascelery wrote: | > How is anything rooted here? The lifetime changed from 'arena | to 'root but I don't see a root being created. | | In this example, the Vec has been rooted previously. So pushing | an object into the Vec will make it "transitively" rooted | (accessible from the root). You would root a struct with the | root_struct![1] macro, which works very similar to the root! | macro shown in the post. | | However you made you realize one error; The rooted `Vec` in the | example you pointed is a by value type, but in the | implementation you can only get references to rooted structs, | so that example needs to be updated. | | > But later we see roots not obeying a LIFO order, under | "Preventing escapes" where roots are dynamically created and | destroyed in an arbitrary order. | | Objects are just a copyable wrapper around a pointer, so they | are not the part that has the LIFO semantics. inside the root! | macro[2] there is a `StackRoot` type that is the actual "root". | The object just borrows from that so that is has a 'root | lifetime and is valid post gc. The actual root struct is not | exposed outside of the macro. | | I hope this makes the distinction between "roots" and "objects" | clearer. Objects are just pointers to heap data. When we root | an object we store the data it points to on the root stack and | create a new `StackRoot`. Then we say this object is rooted. | But the struct that "does the rooting" is inside the macro and | not exposed. Rooting a struct works similarly. | | [1] | https://github.com/CeleritasCelery/rune/blob/5a616efbed763b9... | | [2] | https://github.com/CeleritasCelery/rune/blob/5a616efbed763b9... | ridiculous_fish wrote: | The existence of the StackRoot type is definitely a missing | piece of the puzzle, thank you. So we have Object and | StackRoot. But then what is a "root object?" | let rooted: Vec<Object<'root>> = ...; // we have a vec of | root objects | | I would expect this vector to just contain Objects? | | I think conceptually you have two different Object types: a | "dangling" Object that is subject to collection, and a | "rooted" Object which is referenced by a root stored in the | RootSet. Part of my confusion is that there's no vocabulary | to distinguish these. In this code: | root!(value, gc) | | Visually it appears to "do something" to value but in fact | there's two different variables, with two different types, | both named 'value'. It would be clearer at least to me to | reify that by giving them different types instead of calling | both Object. | | In a moving GC this distinction becomes unavoidable: dangling | objects are raw pointers, while rooted objects refer to the | root location itself and require double indirection to | access. | | Anyways it's just a suggestion, take it or leave it. It's | very cool how Rust's lifetimes let you enforce that no | collections happen while an Object dangles. | | One last thought: because you're already on the hook for | passing in the GC when dereferencing an Object, you could | exploit this by making Object references just a 32 bit | offset, relative to the GC's heap. This would let you save | some memory. | amelius wrote: | This is cute but modern concurrent garbage collectors like the | ones used in VMs and languages such as Go are way more | complicated. In fact, the GC can be the most difficult aspect of | an entire environment including the compiler. | NeutralForest wrote: | Very cool project! What motivated you to start a Rust VM for | Emacs? | celeritascelery wrote: | I talk little bit about that in my introductory post[1]. | Basically my interest in Emacs, Rust, and interpreters came to | together and I decided to see what I could do. I was also | inspired by the remacs project[2] and I think Rust has a lot to | offer as dynamic language backend. I don't know where this will | go, but I am going to continue to work on it. | | [1]https://coredumped.dev/2021/10/21/building-an-emacs-lisp- | vm-... | | [2]https://github.com/remacs/remacs | NeutralForest wrote: | I use Emacs a lot so it's always a treat to learn more about | the internals, one way or another, have fun! | gwbas1c wrote: | One thing I'm trying to understand is: Is this a garbage | collector _for_ Rust; or is this a garbage collector _implemented | in_ Rust for another environment. | | It's confusing because the author refers to their Emacs VM; but | then refers to Rc in Rust. | nicoburns wrote: | It's currently the latter, but the author believes that this | approach could be cleaned up and generalised to also be the | former. I believe it would then be a general purpose GC library | in Rust that could be used for Rust code or any environment | written in Rust. | steveklabnik wrote: | The latter, implemented in Rust (in my understanding). | | They bring up Rc<T> because using reference counting as their | implementation is in theory a possibility, which they then | quickly (and rightfully, imho) dismiss. | zamalek wrote: | Although you probably could use it for something else, Boa[1] | is a good demonstration of this (it uses the GC crate, but | the same principle probably applies). | | [1]: https://github.com/boa-dev/boa ___________________________________________________________________ (page generated 2022-04-29 23:00 UTC)