[HN Gopher] What should I know about garbage collection as a Jav...
___________________________________________________________________

What should I know about garbage collection as a Java developer?

Author : saikatsg
Score  : 37 points
Date   : 2023-01-10 05:38 UTC (17 hours ago)

(HTM) web link (www.azul.com)
(TXT) w3m dump (www.azul.com)

| turtledragonfly wrote:
| One thing that I hadn't fully understood until recently is that
| garbage collectors can actually allow you to write _more
| efficient_ code.
|
| Previously, I had the general understanding that you were trading
| convenience (not thinking about memory management or dealing with
| the related bugs) in exchange for performance (GC slows your
| program down).
|
| That's still broadly true, but there's an interesting class of
| algorithms where GC can give you a performance improvement:
| immutable data structures, typically used in high-concurrency
| situations.
|
| Consider a concurrent hash map: when you add a new key, the old
| revision of the map is left unchanged (so other threads can keep
| reading from it), and your additions create a new revision. Each
| revision of the map is immutable, and your "changes" to it are
| really creating new, immutable copies (with structural-sharing
| tricks to stay efficient).
|
| These data structures are great for concurrent performance, but
| there's a problem: how do you know when to clean up the memory?
| That is: how do you know when all users are done with the old
| revisions, so they can be freed?
|
| Using something like a reference count adds contention to this
| high-concurrency data structure, slowing it down. Threads have to
| fight over updating that counter, so you have now introduced
| shared mutable state, which was the very thing you were trying to
| avoid.
|
| But if there's a GC, you don't have to think about it. And the GC
| can choose a "good time" to do its bookkeeping in bulk, rather
| than making all of your concurrent accesses pay a price.
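The copy-on-write scheme described above can be sketched in Java. This is a hypothetical illustration, not code from the thread: `CowMap` is an invented name, and a real persistent map would share structure between revisions rather than copy the whole table on every write.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of a copy-on-write map. Readers grab the current revision with
// a single atomic read and never block; writers copy, modify, and publish
// a new revision with a CAS. Old revisions become garbage once the last
// reader drops them -- no reference count, so no contention on reads.
class CowMap<K, V> {
    private final AtomicReference<Map<K, V>> current =
            new AtomicReference<>(new HashMap<>());

    V get(K key) {
        // Lock-free read against one immutable revision.
        return current.get().get(key);
    }

    void put(K key, V value) {
        while (true) {
            Map<K, V> old = current.get();
            Map<K, V> next = new HashMap<>(old); // full copy for clarity;
            next.put(key, value);                // persistent maps share
            if (current.compareAndSet(old, next)) {
                return; // old revision is now unreachable garbage
            }
            // CAS failed: another writer published first; retry.
        }
    }
}
```

The key point is the last step: after the CAS, nobody has to decide when `old` is safe to free; the GC notices on its own once no reader still holds it.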
| So, if done properly, it's an overall performance win.
|
| Interestingly, a performant solution without using GC is "hazard
| pointers," which are essentially like adding a teeny tiny garbage
| collector devoted just to that data structure (concurrent map, or
| whatever).
| tadfisher wrote:
| Well put. I find it fascinating to watch memory-safe runtimes
| converge on automatic memory management (via GC or ARC) and
| owner/borrower models. I'm just not sure which I like better,
| or if I'm thinking too imperatively.
| bob1029 wrote:
| > But if there's a GC, you don't have to think about it. And
| the GC can choose a "good time" to do its bookkeeping in bulk,
| rather than making all of your concurrent accesses pay a price.
| So, if done properly, it's an overall performance win.
|
| In many environments, you can explicitly force a GC collection
| from application code. I've got a few situations where
| explicitly running GC helps reduce latency/jitter, since I can
| decide precisely where and how often it occurs.
|
| In my environment, calling GC.Collect more frequently than the
| underlying runtime would on its own typically results in the
| runtime-induced collections taking less time (and occurring
| less frequently). But there is a tradeoff: you are stopping the
| world more frequently (i.e. every frame or simulation tick),
| and theoretical max throughput drops off as a result.
|
| Batching is the best way to do GC, but it is sometimes
| catastrophic for the UX.
| mike_hearn wrote:
| Yeah, but it's actually deeper than just adding refcounts. The
| algorithms themselves can change in some cases.
|
| The issue is that the hardware can usually only do
| atomic/interlocked operations at the word level.
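In Java terms, that word-level operation is a single compareAndSet on an AtomicReference. A minimal hypothetical sketch (class and method names invented for illustration):

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the word-level atomic update being described: with a GC,
// replacing one immutable snapshot with another is a single CAS on a
// pointer-sized reference. No refcount has to change in the same atomic
// step -- the old snapshot simply becomes garbage once the last reader
// drops it.
class Snapshot {
    private final AtomicReference<int[]> data =
            new AtomicReference<>(new int[] {1, 2, 3});

    int[] read() {
        // One atomic word read; the returned snapshot stays valid for
        // as long as this thread holds it, courtesy of the GC.
        return data.get();
    }

    boolean replace(int[] expected, int[] next) {
        // One word-sized CAS; no lock, no refcount, no lock ordering.
        return data.compareAndSet(expected, next);
    }
}
```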
| If you have a GC then you can atomically update a pointer from
| one thing to another and not think about the thing that was
| being pointed to previously: an object becomes unreachable
| atomically due to the guarantees provided by the GC (either via
| global pauses or write barriers or both). If you don't have
| that, then you need to update both a pointer and a refcount
| atomically, which goes beyond what the hardware can easily do
| without introducing locks, and that in turn creates new
| problems, such as lock-ordering issues.
| zackangelo wrote:
| Most JVMs take advantage of a thread-local "bump" allocator[1]
| as well, to avoid having to cross JVM or kernel boundaries to
| allocate memory, which can result in huge speedups for
| memory-intensive use cases.
|
| [1] https://shipilev.net/jvm/anatomy-quarks/4-tlab-allocation/
| eldenring wrote:
| Bump allocators are incredibly fast, and are super efficient
| in generational GCs where compaction is cheap. However, almost
| all (maybe all) modern languages don't usually cross kernel
| boundaries when allocating memory, including C++'s malloc.
| Alifatisk wrote:
| I don't know exactly why, but I've always associated GC
| languages with slow performance. Today, I realized how wrong
| I was.
| mike_hearn wrote:
| Performance and GC is a tricky topic, partly because there are
| not many GC'd languages explicitly designed for performance
| above usability (maybe D would count? _maybe_ Go?). GC is
| normally chosen for usability reasons, and then the language
| has other usability features that reduce performance, and it
| gets difficult to disentangle them. Immutability is a common
| problem. GC makes allocating lots of objects easy, so people
| make immutable types (e.g.
| Java's String type), and that forces you to allocate lots of
| objects, which causes lots of cache misses as the young-gen
| pointer constantly moves forward, and that slows everything
| down, whereas a C++ dev might shorten a string by just
| inserting a NUL byte into the middle of it. Functional
| programming patterns are a common culprit because of their
| emphasis on immutability. You bleed performance in ways that
| don't show up on profiles because they're smeared out all
| over the program.
|
| Another complication is that people talk about the
| performance of languages, when often it's really about the
| performance of an implementation. The most stunning example
| of this is TruffleRuby, in which the GraalVM EE Ruby runtime
| often runs 50x faster or more than standard Ruby. Language
| design matters a lot, but how smart your runtime is matters
| a lot too.
|
| A final problem is that many people associate GC with
| scripting languages like Python, JavaScript, Ruby, PHP,
| etc., which often have poor or non-existent support for
| multi-threading. It's then hard to get good performance on
| modern hardware, of course, and that gets generalized to
| all GC languages.
| turtledragonfly wrote:
| Well, there's still truth to it in other cases, I think. One
| terrible thing GCs can do is make your performance
| _unpredictable_. In some performance-sensitive situations
| (e.g. video games), your worst-case perf matters more than
| your average case. Adding a GC can mess with that worst-case
| behavior, and in unpredictable ways.
|
| That being said, modern GCs are much better (less "stop the
| world" stuff) and more configurable. But it's still a real
| concern.
___________________________________________________________________
(page generated 2023-01-10 23:00 UTC)