[HN Gopher] Techniques for safe garbage collection in Rust
       ___________________________________________________________________
        
       Techniques for safe garbage collection in Rust
        
       Author : PaulHoule
       Score  : 81 points
       Date   : 2024-08-21 15:50 UTC (3 days ago)
        
 (HTM) web link (kyju.org)
 (TXT) w3m dump (kyju.org)
        
       | amelius wrote:
       | If you fit a gc on top of rust, the result is going to be less
       | efficient than when using a modern gc'd language to begin with.
       | So I'm curious what drove them to this.
        
         | the_mitsuhiko wrote:
         | The answer to your question is in the linked post.
        
         | Aurornis wrote:
         | This is a garbage collector written in Rust, not "on top of"
         | Rust.
         | 
         | This isn't equivalent to adding garbage collection to the
         | entire language. It's a garbage collected pointer type that can
         | be employed for specific use cases.
         | 
         | The article and the repo explain why they developed it:
         | Implementing VMs for garbage collected languages in Rust.
        
           | amelius wrote:
           | > It's a garbage collected pointer type that can be employed
           | for specific use cases.
           | 
           | You mean use cases where performance does not matter?
        
             | twiss wrote:
             | Do you think Lua (for example, or any other GC'd language)
             | has valid use cases? If so, it needs an implementation.
             | This blog post shows (part of) one way to do that.
        
             | remexre wrote:
             | Use-cases like "implementing a garbage-collected language."
        
               | amelius wrote:
               | High performance garbage collectors are typically written
               | in assembly because of the intricacies of the cache
               | hierarchy.
        
               | remexre wrote:
               | Which one(s) are you thinking of? The JVM's appear to be
               | in C++, GHC's and SBCL's are in C, Go's is in Go, and I'm
               | not familiar with other high-performance garbage-
               | collected platforms.
        
               | hayley-patton wrote:
               | Citation very much needed, please; assembly wouldn't give
               | you more control over caches than any language with a
               | prefetch intrinsic.
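
To illustrate the point about prefetch intrinsics, here is a minimal Rust sketch. It assumes an x86_64 target for the actual hint and falls back to a no-op elsewhere. The helper names `prefetch` and `sum_indirect` and the look-ahead distance `AHEAD` are made up for this example and are not taken from any collector mentioned in the thread.

```rust
/// Ask the CPU to start loading the cache line that holds `p`.
#[cfg(target_arch = "x86_64")]
#[inline(always)]
fn prefetch<T>(p: *const T) {
    use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
    // SAFETY: a prefetch is only a hint; it does not dereference `p`
    // architecturally and does not fault on a bad address.
    unsafe { _mm_prefetch::<_MM_HINT_T0>(p as *const i8) }
}

/// Fallback for other targets: do nothing (an assumption of this sketch).
#[cfg(not(target_arch = "x86_64"))]
#[inline(always)]
fn prefetch<T>(_p: *const T) {}

/// Sum `values` in the order given by `order`, prefetching the element
/// that will be needed `AHEAD` iterations from now.
/// Panics if `order` contains an index outside `values`.
fn sum_indirect(values: &[u64], order: &[usize]) -> u64 {
    const AHEAD: usize = 8; // hypothetical look-ahead distance
    let mut total = 0u64;
    for (i, &idx) in order.iter().enumerate() {
        if let Some(v) = order.get(i + AHEAD).and_then(|&n| values.get(n)) {
            prefetch(v as *const u64);
        }
        total += values[idx];
    }
    total
}
```

A tracing collector could issue the same kind of hint for objects sitting a few entries deep in its mark stack; whether that helps in practice depends on the workload and would need measuring.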
        
             | vlovich123 wrote:
             | Out of curiosity, what language do you think the Java,
             | JavaScript, and Python VMs and garbage collectors are
             | written in? If you can understand why the VM is typically
             | written in a systems programming language that doesn't
             | itself have a VM or garbage collector, then you can start
             | to think about why this is useful regardless of whether
             | performance matters or not (& Java and C# are generally
             | considered fairly high performance languages and VM
             | implementations with efficient garbage collectors - the
             | downsides may not matter to your problem domain).
        
               | neonsunset wrote:
               | To expand on GP's point: I believe it implies that
               | implementing a GC type for Rust _itself_, within its
               | constraints (and even LLVM is not perfect, if we skip to
               | LLVM IR), is bound to be worse than using a language with
               | a bespoke precise+tracing+moving garbage collector. Such a
               | collector always requires deep compiler integration so
               | that the "VM" has exact information about where gcrefs are
               | located at every safepoint (including registers!), can
               | collect objects as soon as they are no longer referenced
               | rather than when they go out of scope later, can determine
               | whether write (or, worse, read) barriers are required or
               | can be omitted, and can suspend execution to update object
               | references when moving objects to a different
               | generation/heap/etc.
               | 
               | All GC implementations in Rust that I've seen so far
               | relied on much more heavy-handed techniques, like making
               | GC<T> a double indirection, pushing references to a
               | thread-local queue, making GC pointers fat to pass
               | around metadata inline, etc. They have been closer to
               | modified RC, with the corresponding cost.
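
As a rough illustration of the indirection described above, here is a toy mark-and-sweep heap in safe Rust in which a `Gc` handle is an index into a slot table instead of a raw pointer. Every name here (`Gc`, `Object`, `Heap`, `collect`) is hypothetical and this is not how gc-arena or any other crate mentioned in the thread is implemented. It also omits generation counters, so a stale handle could alias a reused slot.

```rust
/// A handle is an index into the heap's slot table, so every access
/// pays for one extra lookup compared with a raw pointer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Gc(usize);

/// A toy object: a payload plus outgoing references to other objects.
struct Object {
    value: i64,
    children: Vec<Gc>,
}

#[derive(Default)]
struct Heap {
    slots: Vec<Option<Object>>,
    free: Vec<usize>,
}

impl Heap {
    /// Allocate an object, reusing a freed slot when one is available.
    fn alloc(&mut self, value: i64) -> Gc {
        let obj = Object { value, children: Vec::new() };
        match self.free.pop() {
            Some(i) => {
                self.slots[i] = Some(obj);
                Gc(i)
            }
            None => {
                self.slots.push(Some(obj));
                Gc(self.slots.len() - 1)
            }
        }
    }

    /// Look up a handle; returns None if the slot has been collected.
    fn get(&self, gc: Gc) -> Option<&Object> {
        self.slots.get(gc.0)?.as_ref()
    }

    /// Record a reference from `parent` to `child`.
    fn add_child(&mut self, parent: Gc, child: Gc) {
        if let Some(Some(obj)) = self.slots.get_mut(parent.0) {
            obj.children.push(child);
        }
    }

    /// Mark everything reachable from `roots`, then free the rest.
    /// Returns the number of objects freed.
    fn collect(&mut self, roots: &[Gc]) -> usize {
        let mut marked = vec![false; self.slots.len()];
        let mut stack: Vec<Gc> = roots.to_vec();
        while let Some(Gc(i)) = stack.pop() {
            if i >= marked.len() || marked[i] {
                continue;
            }
            if let Some(obj) = &self.slots[i] {
                marked[i] = true;
                stack.extend(obj.children.iter().copied());
            }
        }
        let mut freed = 0;
        for (i, slot) in self.slots.iter_mut().enumerate() {
            if slot.is_some() && !marked[i] {
                *slot = None;
                self.free.push(i);
                freed += 1;
            }
        }
        freed
    }

    /// Number of objects currently alive.
    fn live(&self) -> usize {
        self.slots.iter().filter(|s| s.is_some()).count()
    }
}
```

Unlike reference counting, this reclaims cycles, but the caller has to supply the roots by hand. That is the bookkeeping a compiler-integrated collector would derive automatically at safepoints.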
        
       | the_mitsuhiko wrote:
       | I strongly encourage people interested in this topic to look at
       | the piccolo VM that has been referenced in the post. It is a
       | fascinating exploration of how a stackless VM can work in Rust.
       | It uses "sequences" (think a bit like a future or promise) to
       | express control flow that yields or calls into other VM methods.
       | In particular, for a little while now it has had the concept of
       | an async_sequence [1], which lets one abuse (?) async/await: it
       | creates a shadow stack to make this code more readable.
       | 
       | I find all of this quite exciting because there are not that many
       | stackless VMs out there that are readable and not a massive
       | usability hazard if you do something wrong.
       | 
       | [1]:
       | https://github.com/kyren/piccolo/blob/master/tests/async_seq...
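
The "sequence" idea can be sketched as a small state machine driven by a trampoline loop. This is only an illustration under assumed names: the `Sequence` trait, the `Step` enum, `SumTo`, and `run` below are invented for this example and do not reproduce piccolo's actual API.

```rust
/// What a sequence asks the driver to do next.
enum Step {
    /// Run `callee` to completion, then resume the caller with its result.
    Call(Box<dyn Sequence>),
    /// This sequence is finished and produced a value.
    Return(u64),
}

/// A resumable unit of work. It never calls another sequence directly;
/// it hands the callee back to the driver instead.
trait Sequence {
    /// `resumed_with` is None on the first step and carries the callee's
    /// result when the sequence is resumed after a `Step::Call`.
    fn step(&mut self, resumed_with: Option<u64>) -> Step;
}

/// Computes 0 + 1 + ... + n by "calling" itself with n - 1.
struct SumTo {
    n: u64,
}

impl Sequence for SumTo {
    fn step(&mut self, resumed_with: Option<u64>) -> Step {
        match resumed_with {
            Some(rest) => Step::Return(self.n + rest),
            None if self.n == 0 => Step::Return(0),
            None => Step::Call(Box::new(SumTo { n: self.n - 1 })),
        }
    }
}

/// The trampoline: call depth lives in a heap-allocated Vec, so the
/// native stack stays flat however deep the logical call chain gets.
fn run(root: Box<dyn Sequence>) -> u64 {
    let mut stack: Vec<Box<dyn Sequence>> = vec![root];
    let mut value: Option<u64> = None;
    loop {
        let top = stack.last_mut().expect("stack is non-empty inside the loop");
        match top.step(value.take()) {
            Step::Call(callee) => stack.push(callee),
            Step::Return(v) => {
                stack.pop();
                if stack.is_empty() {
                    return v;
                }
                value = Some(v);
            }
        }
    }
}
```

Because each `step` returns to the driver, a logical depth of 100,000 calls via `run(Box::new(SumTo { n: 100_000 }))` does not grow the native stack. The driver could also stop between steps, for example to let a collector run.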
        
         | CyberDildonics wrote:
         | Why wouldn't I forget about stackless VMs, "sequences" and
         | yielding calls to VM methods and just make a normal native
         | program with a lock free queue?
        
           | the_mitsuhiko wrote:
           | Quite frankly I don't understand the question. The
           | assumption here is that your goal is to have/write a dynamic
           | language, so you will have to write a VM. If you write a VM,
           | a stackless model is the preferred one for a variety of
           | reasons. However, not having a stack causes pain for native
           | calls that can detour via native objects back into the VM,
           | which you often have when embedding these languages (which
           | is the goal).
           | 
           | So if you have that sort of setup, you will need to find
           | solutions.
           | 
           | What piccolo does is quite pretty and definitely beats some
           | other options out there. A lot of stackless VMs just crash
           | horribly if you do something naughty with re-entrant calls.
        
         | twiss wrote:
         | Companion blog post:
         | https://kyju.org/blog/piccolo-a-stackless-lua-interpreter/
        
       | keithalewis wrote:
       | Techniques for safe garbage collection in C++: unique_ptr and
       | right curly brace }. Linear types with delayed gratification.
        
         | slaymaker1907 wrote:
         | This is assuming people don't abuse it and pass around raw
         | references/pointers to it (they will). The safest construct,
         | though with a definite performance impact, is std::shared_ptr,
         | but people think they are smarter than they are and don't use
         | it.
        
         | SkiFire13 wrote:
         | This is neither safe (you can still get raw pointers/references
         | to its contents) nor garbage collection in the usual sense (you
         | can't even share it!)
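
The gap between scope-based freeing and garbage collection "in the usual sense" can be shown in Rust, using `Box` and `Rc` as rough analogues of `std::unique_ptr` and `std::shared_ptr`. This is a minimal sketch; the `Tracked` type and the helper functions are hypothetical and exist only to count drops.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Bumps a shared counter when dropped, so drops can be observed.
struct Tracked {
    drops: Rc<Cell<u32>>,
    next: RefCell<Option<Rc<Tracked>>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn new_tracked(drops: &Rc<Cell<u32>>) -> Tracked {
    Tracked { drops: Rc::clone(drops), next: RefCell::new(None) }
}

/// Unique ownership: the value is freed at the closing brace.
fn scoped_drop() -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let _owned = Box::new(new_tracked(&drops));
    } // `_owned` is dropped here
    drops.get()
}

/// Shared ownership with a cycle: reference counting never frees it.
fn cycle_leak() -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Rc::new(new_tracked(&drops));
        let b = Rc::new(new_tracked(&drops));
        *a.next.borrow_mut() = Some(Rc::clone(&b));
        *b.next.borrow_mut() = Some(Rc::clone(&a));
    } // both handles are gone, but each node still keeps the other alive
    drops.get()
}
```

Here `scoped_drop()` observes one drop, while `cycle_leak()` observes none: the two nodes are unreachable yet never freed (the leak is deliberate). Reclaiming that kind of garbage is what a tracing collector adds on top of scope-based or reference-counted cleanup.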
        
       ___________________________________________________________________
       (page generated 2024-08-25 09:00 UTC)