[HN Gopher] Handles are the better pointers (2018)
       ___________________________________________________________________
        
       Handles are the better pointers (2018)
        
       Author : ibobev
       Score  : 161 points
       Date   : 2023-06-21 15:25 UTC (7 hours ago)
        
 (HTM) web link (floooh.github.io)
 (TXT) w3m dump (floooh.github.io)
        
       | stephc_int13 wrote:
       | Handles are great, and I agree that their use can help solve or
       | mitigate many memory management issues.
       | 
       | The thing is that Handles themselves can be implemented in very
       | different fashions.
       | 
       | I extensively use handles in my own framework, and I plan to
       | describe my implementation at some point, but so far I have seen
       | 3-4 different systems based on different ideas around handles.
       | 
        | We might need more precise naming to differentiate the flavors.
        
         | chubot wrote:
         | Yeah I agree, seems like most of the comments here are talking
         | about slightly different things, with different tradeoffs
         | 
         | Including memory safety
        
       | loeg wrote:
       | We use something like this in a newish C++ project I work on. Our
       | handle system just registers arbitrary pointers (we sometimes
       | follow the convention of grouping items of the same type into a
        | single allocation, and sometimes don't, and the
        | handle-to-arbitrary-pointer system tolerates both). It basically
        | makes it explicit
       | who owns a pointer and who doesn't; and non-owners need to
       | tolerate that pointer having been freed by the time they access
       | it.
        
       | the_panopticon wrote:
       | The UEFI spec leverages handles throughout its APIs. The
       | implementations from the sample in 1998 to today's EDKII at
       | tianocore.org use the address of the interface, or protocol, as
        | the handle value. Easy for a single-address-space environment
       | like UEFI boot firmware.
        
       | [deleted]
        
       | pphysch wrote:
       | It's interesting to see the convergence between ECS (primarily
       | game engine) and RDBMS.
       | 
       | "AoS" ~ tables
       | 
       | "systems" ~ queries
       | 
       | "handles" ~ primary keys
       | 
       | Makes me wonder if you could architect a "RDBMS" that is fast
       | enough to be used as the core of a game engine but also robust
       | enough for enterprise applications. Or are there too many
        | tradeoffs to be made there?
        
         | bitwize wrote:
         | IBM was, for a while, in the business of selling "gameframes"
         | -- mainframe systems that performed updates fast enough to run
          | an MMO game. They used Cell processors to provide the extra
         | needed compute, but it was the mainframe I/O throughput that
         | allowed them to serve a large number of players. It was backed
         | by IBM DB2 for a database, but it's entirely likely the game
         | updates were performed in memory and only periodically
         | committed to the database. Since, as you say, ECS entity
         | components closely resemble RDBMS tables, this could be
         | accomplished quickly and easily with low "impedance mismatch"
         | compared to an OO game system.
         | 
         | https://en.wikipedia.org/wiki/Gameframe
         | 
         | Back on Slashdot there was a guy called Tablizer who advocated
         | "table-oriented programming" instead of OOP. The rise of ECS
         | means that when it comes to gamedev, TOP is winning. Tablizer
         | might be surprised and delighted by that.
        
         | syntheweave wrote:
          | Essentially, a game engine is just these things:
         | 
         | * A tuned database for mostly-static, repurposable data
         | 
         | * I/O functions
         | 
          | * Compilation methods for assets at various key points (initial
         | load, levels of detail, rendering algorithm)
         | 
          | * Constraint solvers tuned for various subsystems (physics,
         | pathfinding, planning).
         | 
         | A lot of what drives up the database complexity for a game is
          | the desire to have large quantities of homogeneous things in
          | some contexts (particles, lights, etc.) and fewer but carefully
          | indexed things in others (mapping players to their connections
         | in an online game). If you approach it relationally you kill
         | your latency right away through all the indirection, and your
         | scaling constraint is the worst case - if the scene is running
         | maxed-out with data it should do so with minimal latency. So
         | game engines tend to end up with relatively flat, easily
         | iterated indexes, a bit of duplication, and homogenization
         | through componentized architecture. And that can be pushed
         | really far with static optimization to describe every entity in
         | terms of carefully packed data structures, but you have to have
         | a story for making the editing environment usable as well,
         | which has led to runtime shenanigans involving "patching"
         | existing structures with new fields, and misguided attempts to
         | globally optimize compilation processes with every edit[0].
         | 
         | Godot's architecture is a good example of how you can relax the
         | performance a little bit and get a lot of usability back: it
         | has a scene hierarchy, and an address system which lets you
         | describe relationships in hierarchical terms. The hierarchy
         | flattens out to its components for iteration purposes when
         | subsystems are running, but to actually describe what you're
         | doing with the scene, having a tree is a godsend, and
         | accommodates just about everything short of the outer-join type
         | cases.
         | 
         | [0] https://www.youtube.com/watch?v=7KXVox0-7lU
        
       | 10000truths wrote:
       | A really powerful design pattern is combining the use of handles
       | and closures in a memory allocator. Here's a simplified Rust
        | example:
        | 
        |     let mut myalloc = MyAllocator::<Foo>::new();
        |     let myfoohandle = myalloc.allocate();
        |     let myfoohandle2 = myalloc.allocate();
        |     myalloc.expose::<&mut Foo, &Foo>(myfoohandle, myfoohandle2,
        |         |myfoo, myfoo2| {
        |             myfoo.do_mutating_foo_method();
        |             println!("{:?}", myfoo2);
        |         });
        |     // Rearrange the allocated memory for myfoo and myfoo2
        |     myalloc.compact();
        |     myalloc.expose::<&mut Foo>(myfoohandle, |myfoo| {
        |         // Still valid!
        |         myfoo.do_mutating_foo_method();
        |     });
       | 
       | Internally, the allocator would use a map of handles to pointers
       | to keep track of things.
       | 
       | Because the closures strongly limit the scope of the referenced
       | memory, you can relocate the memory of actively used objects
       | without fear of dangling references. This allows the allocator to
       | perform memory compactions whenever the user wants (e.g. idle
       | periods).
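        | 
        | A minimal sketch of what that might look like internally (the
        | names are hypothetical, and a real allocator would likely add
        | generation counters and a proper slab):
        | 
        |     use std::collections::HashMap;
        | 
        |     #[derive(Clone, Copy, PartialEq, Eq, Hash)]
        |     struct Handle(u64);
        | 
        |     struct MyAllocator<T> {
        |         // (owning handle id, value); None marks a hole after a free
        |         slots: Vec<Option<(u64, T)>>,
        |         // handle id -> current slot index
        |         index_of: HashMap<u64, usize>,
        |         next_id: u64,
        |     }
        | 
        |     impl<T: Default> MyAllocator<T> {
        |         fn new() -> Self {
        |             Self { slots: Vec::new(), index_of: HashMap::new(), next_id: 0 }
        |         }
        | 
        |         fn allocate(&mut self) -> Handle {
        |             let id = self.next_id;
        |             self.next_id += 1;
        |             self.index_of.insert(id, self.slots.len());
        |             self.slots.push(Some((id, T::default())));
        |             Handle(id)
        |         }
        | 
        |         fn deallocate(&mut self, h: Handle) {
        |             if let Some(idx) = self.index_of.remove(&h.0) {
        |                 self.slots[idx] = None; // hole, reclaimed by compact()
        |             }
        |         }
        | 
        |         // References only live inside the closure, so they cannot
        |         // dangle across a later compact().
        |         fn expose<R>(&mut self, h: Handle, f: impl FnOnce(&mut T) -> R) -> Option<R> {
        |             let idx = *self.index_of.get(&h.0)?;
        |             let (_, value) = self.slots[idx].as_mut()?;
        |             Some(f(value))
        |         }
        | 
        |         // Slide live objects over the holes and patch the handle map.
        |         fn compact(&mut self) {
        |             let mut write = 0;
        |             for read in 0..self.slots.len() {
        |                 if self.slots[read].is_some() {
        |                     self.slots.swap(read, write);
        |                     let (id, _) = self.slots[write].as_ref().unwrap();
        |                     self.index_of.insert(*id, write);
        |                     write += 1;
        |                 }
        |             }
        |             self.slots.truncate(write);
        |         }
        |     }
        | 
        | (Only the single-handle expose is shown; the two-handle variant
        | from the example above would just do two lookups.)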
        
         | dralley wrote:
         | This is just manually implemented garbage collection, really.
        
           | 10000truths wrote:
           | Not really. Compaction is a _feature_ of many garbage
            | collectors, but the allocator I described doesn't impose any
           | particular way of deciding how and when to deallocate the
           | objects. You could do so explicitly with a
           | MyAllocator::deallocate() method, or use runtime reference
           | counting, or (if you're willing to constrain handle lifetimes
           | to the allocator lifetime) you could delete the allocator
           | entry when the handle goes out of scope.
        
           | vardump wrote:
           | Garbage collection is not necessarily memory compacting.
        
           | danhau wrote:
           | I kind of agree. A while ago I was thinking about how to
           | implement something like Roslyn's red-green-trees in Rust.
            | The solution I came up with, similar in principle to this
            | handle-and-allocator approach, did work, but needed
            | occasional cleanup to get rid of zombie objects. At that
            | point I realised that all I'd done was reinvent a poor man's
            | garbage collector.
        
       | ww520 wrote:
        | The handle described here sounds like the entity ID in an ECS setup.
        
       | SeenNotHeard wrote:
       | Reminds me of a startup I worked for in the 1990s. The C code
       | base was organized into "units" (modules). Each unit
       | allocated/destroyed its own data structures, but used the "ref"
       | unit to return a reference (really, a handle) instead of a
       | pointer. Each module used a "deref" function to convert the
       | handle to a typed pointer for internal use.
       | 
       | ref used many of the tricks described in the above post,
       | including an incrementing counter to catch stale handles.
       | 
       | All pointers were converted to handles in debug and testing
       | builds, but in release builds ref simply returned the pointer as
       | a handle, to avoid the performance penalty. As machines grew
       | faster, there was talk of never turning off ref.
       | 
       | Side win: There was a unit called "xref" (external ref) which
       | converted pointers to stable handles in all builds (debug and
       | release). It was the same code, but not compiled out. External
       | refs were used as network handles in the server's bespoke RPC
       | protocol.
        
         | OnlyMortal wrote:
          | The old Mac OS used Handles so that it could move memory around as
         | pressure mounted.
         | 
         | In "some" ways, it's a bit like a smart pointer as it's a hook
         | to allow the underlying system to "do things" in a hidden way.
        
           | cpeterso wrote:
           | Microsoft's Win16 memory allocator APIs (GlobalAlloc and
           | LocalAlloc) also returned handles so the OS could move memory
           | blocks to new addresses behind the scenes. Application code
           | would need to call GlobalLock/Unlock APIs to acquire a
           | temporary pointer to the memory block. The APIs still exist
           | in Win32 and Win64 for backwards compatibility, but now
           | they're thin wrappers around a more standard memory
           | allocator.
           | 
           | https://learn.microsoft.com/en-
           | us/windows/win32/memory/compa...
        
             | mmphosis wrote:
             | Historically, I think there was an overlap of APIs between
             | early Macintosh and early versions of Windows because
             | Microsoft was porting their software.
             | 
             |  _Microsoft released the first version of Excel for the
             | Macintosh on September 30, 1985, and the first Windows
             | version was 2.05 (to synchronize with the Macintosh version
             | 2.2) on November 19, 1987._
             | https://en.wikipedia.org/wiki/Microsoft_Excel#Early_history
        
           | natt941 wrote:
           | I kinda miss the fun days of using addresses to physical
           | memory. (Maybe I'm wrong but I've always assumed that
           | explicit use of handles went out of fashion because virtual
           | address lookup is, in effect, a handle.)
        
           | nahuel0x wrote:
            | This was used to avoid memory fragmentation:
            | https://en.wikipedia.org/wiki/Classic_Mac_OS_memory_manageme...
        
           | davepeck wrote:
           | If anyone's curious, here's an old article about working with
            | handles and the Macintosh memory manager:
            | http://preserve.mactech.com/articles/develop/issue_02/Mem_Mg...
           | 
           | (The article's examples are in Pascal, the original language
           | of choice for the Mac.)
           | 
           | ---
           | 
           | Update: wow, Apple still has bits of the original Inside
           | Macintosh books available online. Here's a section on the
           | memory manager, replete with discussions of the "A5 world"
            | and handle methods (MoveHHI, etc.) in Pascal:
            | https://developer.apple.com/library/archive/documentation/ma...
        
             | sroussey wrote:
             | Oh man, that brings back memories! Inside Macintosh.
             | Pascal. ...
             | 
              | The article got virtual memory a bit wrong... it got much
              | better over the years, and using relocatable handles fell by
              | the wayside in favor of plain pointers.
        
       | vintagedave wrote:
       | > - items are guaranteed to be packed tightly in memory, general
       | allocators sometimes need to keep some housekeeping data next to
       | the actual item memory
       | 
       | > - it's easier to keep 'hot items' in continuous memory ranges,
       | so that the CPU can make better use of its data caches
       | 
       | These are huge advantages. Memory managers tend to keep pools for
        | specific allocation ranges, e.g. a pool for < 24 bytes, a pool
        | for <= 64, etc., up to, say, a megabyte, after which allocations
        | might be delegated to the OS directly (such as VirtualAlloc on
        | Windows).
       | This is hand-wavy, I'm speaking broadly here :) This keeps
       | objects of similar or the same sizes together, but it does _not_
       | keep objects of the same type together, because a memory manager
       | is not type-aware.
       | 
       | Whereas this system keeps allocations of the same type
       | contiguous.
       | 
       | You can do this in C++ by overriding operator new, and it's
        | possible in many other languages too. I've optimised code by
        | several percent by taking over the allocator for specific key
        | types: I wrote a simple, probably unoptimised memory manager,
        | effectively a wrapper around multiple arrays of the object size,
        | which keeps object memory in per-type pools and therefore close
        | together in memory. I can go into more detail if anyone's
        | interested!
        
         | lanstin wrote:
          | In Go you can do this with a channel of unused structs: if
          | pulling from the channel hits the default branch of a select,
          | make a new one; likewise, if adding back to the channel hits
          | default, free it; else just stick them in the channel and pull
          | them off. It eases GC pressure in hot paths, but does put a
          | little scheduler pressure.
        
           | vore wrote:
           | Why not just do this with a regular stack and avoid having to
           | deal with the scheduler at all? Try pop from the stack and if
           | the stack is empty, do a new allocation.
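            | 
            | For instance, a minimal free-list-as-stack sketch (in Rust,
            | with made-up names):
            | 
            |     // Pop a previously returned object if one exists,
            |     // otherwise allocate a fresh one; push it back when done.
            |     struct Pool<T> {
            |         free: Vec<T>,
            |     }
            | 
            |     impl<T: Default> Pool<T> {
            |         fn get(&mut self) -> T {
            |             self.free.pop().unwrap_or_default()
            |         }
            | 
            |         fn put(&mut self, item: T) {
            |             self.free.push(item);
            |         }
            |     }
            | 
            | Unlike the channel version, this needs a mutex (or one pool
            | per thread) if it's shared across threads.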
        
       | GuB-42 wrote:
       | Another advantage of handles is that they can often be made
       | smaller than pointers, especially on 64-bit code.
       | 
       | In some cases, it can significantly lower memory consumption and
       | improve performance through more efficient cache use.
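        | 
        | A small illustration (the types are made up; sizes assume a
        | 64-bit target):
        | 
        |     // Two pointers cost 16 bytes; two 32-bit index handles cost 8,
        |     // so twice as many fit into a cache line.
        |     struct NodePtrs {
        |         parent: *const NodePtrs,
        |         first_child: *const NodePtrs,
        |     }
        | 
        |     #[derive(Clone, Copy)]
        |     struct Handle(u32); // index bits, possibly plus generation bits
        | 
        |     struct NodeHandles {
        |         parent: Handle,
        |         first_child: Handle,
        |     }
        | 
        |     fn main() {
        |         assert_eq!(std::mem::size_of::<NodePtrs>(), 16);
        |         assert_eq!(std::mem::size_of::<NodeHandles>(), 8);
        |     }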
        
       | Animats wrote:
       | WGPU, the cross-platform library for Rust, is in the process of
       | going in the other direction. They had index-based handles, and
       | are moving to reference counts.[1] The index tables required
       | global locks, and this was killing multi-thread performance.
       | 
       | [1] https://github.com/gfx-rs/wgpu/pull/3626
        
         | slashdev wrote:
         | Maybe that's a case of the wrong design? The index tables
         | shouldn't need global locks. It gets a little hairy if you need
          | to be able to reallocate or move them (i.e. grow them), but that
         | happens at most a small number of times and there are ways of
         | only taking the lock if that's happening.
         | 
         | I've implemented this pattern without locks or CAS in C++, and
         | it works just fine.
         | 
         | I'm currently using this pattern in rust (although with fixed
         | size) and it works really well. The best part is it bypasses
         | the borrow checker since an index isn't a reference. So no
         | compile time lifetimes to worry about. It's awesome for linked
         | lists, which are otherwise painful in rust. Also it can
         | sometimes allow a linked list with array like cache
         | performance, since the underlying layout is an array.
        
           | Animats wrote:
           | _" it bypasses the borrow checker since an index isn't a
           | reference"_
           | 
            | That's a bug, not a feature. The two times I've had to go
            | looking for a bug in the lower levels of Rend3/WGPU (which
           | are 3D graphics libraries), they've involved some index table
           | being corrupted. That's the only time I've needed a debugger.
        
       | fsckboy wrote:
       | this should be titled "handles are better implemented this way,
       | rather than as smart-pointers, which themselves aren't really
       | pointers"
       | 
       | and blaming cache misses from fragmentation on pointers is
       | whipping your old tired workhorse.
        
       | kazinator wrote:
       | > _Once the generation counter would 'overflow', disable that
       | array slot, so that no new handles are returned for this slot._
       | 
       | This is an interesting idea. When a slot becomes burned this way,
       | you still have lots of other slots in the array. The total number
       | of objects you can ever allocate is the number of slots in the
       | array times the number of generations, which could be tuned such
       | that it won't exhaust for hundreds of years.
       | 
        | You only have to care about total exhaustion: no free slot remains
       | in the array: all are either in use, or burned by overflow. In
       | that case, burned slots can be returned into service, and we hope
       | for the best.
       | 
       | If the total exhaustion takes centuries, the safety degradation
       | from reusing burned slots (exposure to undetected use-after-free)
       | is only academic.
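        | 
        | A minimal sketch of that scheme (names are hypothetical, and the
        | 'return burned slots to service on total exhaustion' fallback is
        | left out):
        | 
        |     // A handle carries (index, generation); a real implementation
        |     // would bit-pack them. When a slot's generation would wrap, the
        |     // slot is retired ("burned") instead of reused, so stale handles
        |     // to it can never validate again.
        |     const GEN_BITS: u32 = 16;
        |     const MAX_GEN: u32 = (1 << GEN_BITS) - 1;
        | 
        |     #[derive(Clone, Copy, PartialEq)]
        |     struct Handle { index: u32, generation: u32 }
        | 
        |     struct Slot<T> { generation: u32, value: Option<T> }
        | 
        |     struct Pool<T> {
        |         slots: Vec<Slot<T>>,
        |         free: Vec<u32>, // reusable slot indices; never contains burned slots
        |     }
        | 
        |     impl<T> Pool<T> {
        |         fn new(capacity: u32) -> Self {
        |             Pool {
        |                 slots: (0..capacity)
        |                     .map(|_| Slot { generation: 0, value: None })
        |                     .collect(),
        |                 free: (0..capacity).rev().collect(),
        |             }
        |         }
        | 
        |         fn insert(&mut self, value: T) -> Option<Handle> {
        |             let index = self.free.pop()?; // None = total exhaustion
        |             let slot = &mut self.slots[index as usize];
        |             slot.value = Some(value);
        |             Some(Handle { index, generation: slot.generation })
        |         }
        | 
        |         fn remove(&mut self, h: Handle) {
        |             let slot = &mut self.slots[h.index as usize];
        |             if slot.generation == h.generation && slot.value.take().is_some() {
        |                 if slot.generation < MAX_GEN {
        |                     slot.generation += 1;    // invalidate outstanding handles
        |                     self.free.push(h.index); // slot goes back into service
        |                 }
        |                 // else: generations exhausted, slot stays burned
        |             }
        |         }
        | 
        |         fn get(&self, h: Handle) -> Option<&T> {
        |             let slot = self.slots.get(h.index as usize)?;
        |             if slot.generation == h.generation { slot.value.as_ref() } else { None }
        |         }
        |     }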
        
       | stonemetal12 wrote:
       | Since he mentions C++, I would add you don't have to give up on
       | RAII to adopt this approach. You obviously can't use the standard
       | smart pointers, but developing similar smart handles isn't that
       | much extra effort.
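        | 
        | As a rough Rust analogue (using Drop where C++ would use a
        | destructor; the names are made up):
        | 
        |     use std::cell::RefCell;
        | 
        |     struct Pool { free: Vec<u32> }
        | 
        |     // An owning "smart handle": the slot goes back to the pool
        |     // when the handle goes out of scope.
        |     struct OwnedHandle<'p> {
        |         pool: &'p RefCell<Pool>,
        |         index: u32,
        |     }
        | 
        |     impl<'p> OwnedHandle<'p> {
        |         fn acquire(pool: &'p RefCell<Pool>) -> Option<Self> {
        |             let index = pool.borrow_mut().free.pop()?;
        |             Some(OwnedHandle { pool, index })
        |         }
        |     }
        | 
        |     impl Drop for OwnedHandle<'_> {
        |         fn drop(&mut self) {
        |             self.pool.borrow_mut().free.push(self.index);
        |         }
        |     }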
        
       | pie_flavor wrote:
       | I love this pattern! I make use of it all the time in Rust with
       | the slotmap library. Improves upon arrays by versioning the
       | indexes so values can be deleted without messing up existing
       | indexes and the space can be reused without returning incorrect
       | values for old indexes.
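        | 
        | A quick usage sketch, as far as I understand the slotmap API (the
        | keys carry the version internally):
        | 
        |     use slotmap::{DefaultKey, SlotMap};
        | 
        |     fn main() {
        |         let mut sm: SlotMap<DefaultKey, &str> = SlotMap::new();
        |         let key = sm.insert("player"); // key = versioned index
        |         assert_eq!(sm.get(key), Some(&"player"));
        | 
        |         sm.remove(key);                 // the slot may later be reused...
        |         let other = sm.insert("enemy");
        |         assert_eq!(sm.get(key), None);  // ...but the old key stays invalid
        |         assert_eq!(sm.get(other), Some(&"enemy"));
        |     }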
        
       | kgeist wrote:
       | The article basically describes the Entity-Component-System
       | architecture and it makes sense when your app is essentially a
       | stateful simulation with many independent subsystems (rendering,
       | physics, etc.) managing lots of similar objects in realtime (i.e.
       | games). I thought about how it could be used in other contexts
       | (for example, webdev) and failed to find uses for it outside of
       | gamedev/simulation software. It feels like outside of gamedev,
       | with this architecture, a lot of developer energy/focus will be,
       | by design, spent on premature optimizations and infrastructure-
       | related boilerplate (setting up object pools etc.) instead of
       | focusing on describing business logic in a straightforward,
       | readable way. Are there success stories of using this
       | architecture (ECS) outside of gamedev and outside of C++?
        
         | chocobor wrote:
         | I use something like that for controlling a cluster of vending
         | machines.
        
         | feoren wrote:
         | You are conflating object pools and ECS architecture. Yes, they
         | work very well together, but neither is required for the other.
         | 
         | The ECS architecture is about which parts of your application
         | are concerned with what, and it's an absolute godsend for
          | building complex and flexible business logic. I would be loath
         | to ever use anything else now that I have tasted this golden
         | fruit.
         | 
         | Object pooling is about memory management; in an OO language
         | this is about reducing pressure on the garbage collector,
         | limiting allocation and GC overhead, improving memory access
         | patterns, etc. -- all the stuff he talks about in this article.
         | I almost never use object pools unless I'm running a huge
         | calculation or simulation server-side.
         | 
         | Games use both, because they're basically big simulations.
        
           | noduerme wrote:
           | I write business logic and I use a fair amount of object
           | pooling. It's not quite the same as in game dev or e.g. a
           | socket pool or worker pool where you're constantly swapping
           | tons of things in and out, but it can still be helpful to
           | speed up the user experience, manage display resources and
           | decrease database load.
           | 
           | One example would be endless-scrolling calendar situations,
           | or data tables with thousands of rows, or anything where I'm
           | using broader pagination in the database calls than I want to
           | in the display chain; maybe I can call up 300 upcoming
           | reservations every time the calendar moves, erase the DOM and
           | redraw 300 nodes, but I'd rather call up 3,000 all at once
           | and use a reusable pool of 300 DOM nodes to display them.
           | 
           | Sure, it's not glamorous...
        
         | syntheweave wrote:
         | It's mostly a linguistic distinction: is your indirection a
         | memory address, or is it relative to a data structure? You gain
         | more than you lose in most instances by switching models away
         | from the machine encoding - ever since CPUs became pipelined,
         | using pointers directly has been less important than contriving
         | the data into a carefully packed and aligned array, because
         | when you optimize to the worst case you mostly care about the
         | large sequential iterations. And once you have the array, the
         | handles naturally follow.
         | 
         | The reason why it wouldn't come up in webdev is because you
         | have a database to do container indirection, indexing, etc.,
         | and that is an even more powerful abstraction. The state that
         | could make use of handles is presentational and mostly not
         | long-lived, but could definitely appear on frontends if you're
         | pushing at the boundaries of what could be rendered and need to
         | drop down to a lower level method of reasoning about graphics.
         | Many have noted a crossover between optimizing frontends and
         | optimizing game code.
        
         | guidoism wrote:
         | Really? The main point (use an index into an array instead of a
         | pointer into a blob of memory) gets you 99% of the benefit and
          | it isn't any more difficult than using pointers. I do this all
         | the time.
        
         | meheleventyone wrote:
          | It's more that the Entity in an ECS is a special case of a handle
         | that references into multiple other containers. Handles in
         | general are used all over the place outside of that context.
        
       | noduerme wrote:
       | Just wondering, for those who have implemented something like
       | this - do you still stuff these arrays with unique_ptr's instead
       | of raw pointers, at least to make it easier to manage / reset /
       | assert them within the "system" that owns them?
        
       | scotty79 wrote:
       | I've seen an argument that collections of objects should be
       | allocated field-wise. So an array of points would actually be two
       | arrays, one for x-es, one for y-s addressed by index of the
       | object in the collection.
       | 
       | I wonder how programming would look if that was the default mode
       | of allocation in C++. Pointers to objects wouldn't make much
       | sense then.
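        | 
        | A small sketch of that field-wise ("structure of arrays") layout,
        | with an index standing in for a pointer:
        | 
        |     // One array per field instead of an array of Point structs.
        |     // An object is identified by its index, not by its address.
        |     struct Points {
        |         xs: Vec<f32>,
        |         ys: Vec<f32>,
        |     }
        | 
        |     impl Points {
        |         fn push(&mut self, x: f32, y: f32) -> usize {
        |             self.xs.push(x);
        |             self.ys.push(y);
        |             self.xs.len() - 1 // the "handle" is just the index
        |         }
        | 
        |         // Touching one field walks a single contiguous array.
        |         fn translate_x(&mut self, dx: f32) {
        |             for x in &mut self.xs {
        |                 *x += dx;
        |             }
        |         }
        |     }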
        
       | addaon wrote:
       | > move all memory management into centralized systems (like
       | rendering, physics, animation, ...), with the systems being the
       | sole owner of their memory allocations
       | 
       | > group items of the same type into arrays, and treat the array
       | base pointer as system-private
       | 
       | > when creating an item, only return an 'index-handle' to the
       | outside world, not a pointer to the item
       | 
        | > in the index-handles, only use as many bits as needed for the
        | array index, and use the remaining bits for additional memory
        | safety checks
        | 
        | > only convert a handle to a pointer when absolutely needed, and
        | don't store the pointer anywhere
       | 
        | There are two separate ideas here, in my mind, and while they play
       | nicely together, they're worth keeping separate. The first one-
       | and-a-half points ("move all memory management" and "group
       | items") are the key to achieving the performance improvements
       | described and desired in the post, and are achievable while still
       | using traditional pointer management through the use of e.g.
       | arena allocators.
       | 
       | The remainder ("treat the array base pointer" on) is about
       | providing a level of indirection that is /enabled/ by the first
       | part, with potential advantages in safety. This indirection also
       | enables a relocation feature -- but that's sort of a third point,
       | independent from everything else.
       | 
       | There's also a head nod to using extra bits in the handle indexes
       | to support even more memory safety features, e.g. handle
       | provenance... but on modern 64-bit architectures, there's quite
       | enough space in a pointer to do that, so I don't think this
       | particular sub-feature argues for indexes.
       | 
       | I guess what I'm saying is that while I strongly agree with this
       | post, and have used these two patterns many times, in my mind
       | they /are/ two separate patterns -- and I've used arena
       | allocation without index handles at least as many times, when
       | that trade-off makes more sense.
        
         | loeg wrote:
         | Totally agree these are separate ideas. We use the system-
         | private part but not the index-only part in a handle system in
         | the product I work on.
        
       | [deleted]
        
       | adamnemecek wrote:
       | I too have come to this realization.
       | 
       | It does away with problems associated with child-parent
       | references.
       | 
       | Also, you might be able to use a bitset to represent a set of
       | handles as opposed to a set or intrusive booleans.
       | 
        | It plays nicely with GPUs too.
        | 
        | I don't know why this is not the default, given this is how
        | Handles work in Windows.
        
       | sylware wrote:
       | my handles are offsets into a mremap-able region of memory.
        
       | fisf wrote:
       | Yes, that's basically an entity component system. It still runs
       | into the 'fake memory leaks' problem the author describes for
       | obvious reasons, i.e. you still have to deal with "components"
       | attached to some handle somewhere (and deallocate them).
        
       | eschneider wrote:
       | As someone who did way too much with handles in early Mac and
       | Windows programming, I'll say they're definitely not 'better',
       | but for some (mostly memory constrained) environments, they have
       | some advantages. You get to compress memory, sure, but now you've
       | got extra locking and stale pointer complexity to deal with.
       | 
       | If you _need_ the relocatability and don't have memory mapping,
       | then maybe they're for you, but otherwise, there are usually
       | better options.
        
       | cmrdporcupine wrote:
        | Three things:
       | 
       | a) pointers _are_ handles, to parts of virtual memory pages. When
       | your process maps memory, those addresses are really just ...
       | numbers... referring to pages relative to your process only.
       | Things like userfaultfd or sigsegv signal handlers are even
       | capable of blurring the line significantly between self-managed
       | handles and pointers even by allowing user resolution of page
       | fault handling. Worth thinking about.
       | 
       | b) If performance is a concern, working through a centralized
       | handle/object/page table is actually far worse than you'd think
       | at first -- even with an O(1) datastructure for the mapping.
       | _Especially_ when you consider concurrent access. Toss a lock in
       | there, get serious contention. Throw in an atomic? Cause a heap
       | of L1 cache evictions. Handles _can_ mess up branch prediction,
       | they can mess up cache performance generally, and...
       | 
        | c) they can also confuse analysis tools like valgrind, debuggers,
       | etc. Now static analysis tools, your graphical debugger, runtime
       | checks etc don't have the same insight. Consider carefully.
       | 
       | All this to say, it's a useful pattern, but a blanket statement
       | like "Handles are the better pointers" is a bit crude.
       | 
       | I prefer that we just make pointers better; either through "smart
       | pointers" or some sort of pointer swizzling, or by improving
       | language semantics & compiler intelligence.
        
         | taeric wrote:
         | My hunch is that b is often mitigated by the fact that you were
         | having to touch many items? Such that you don't necessarily
         | even care that it is O(1) in lookup, you are iterating over the
         | items. (And if you aren't having to touch many of the items on
          | the regular, then you probably won't see a benefit to this
         | approach?)
        
         | jjnoakes wrote:
         | > pointers are handles
         | 
          | Pointers are quite limited handles - they would all be
          | handles to the same array with base pointer 0, but that removes
         | many of the useful benefits you get from having different
         | arrays with different base pointers.
        
       | adamrezich wrote:
       | I use a system much like this for entity management for games,
       | but with an additional property that I don't see outlined here:
       | 
       | when an entity is "despawned" (destroyed), it is not "freed"
       | immediately. instead, its id ("generation counter" in the
       | article) is set to 0 (indicating it is invalid, as the first
       | valid id is 1), and it's added to a list of entities that were
       | despawned this frame. my get-pointer-from-handle function returns
       | both a pointer, and a "gone" bool, which is true if the entity
       | the handle points to has an id of 0 (indicating it _used_ to
       | exist but has since been despawned), or if the id in the handle
        | and the id in the pointed-to entity don't match (indicating that
       | the entity the handle pointed at was despawned, and something
       | else was spawned in its place in memory). then, at the end of
       | each frame, the system goes through the list of despawning
       | entities, and it's _there_ that the memory is reclaimed to be
       | reused by newly-spawned entities.
       | 
       | in this system, it's up to the user of the get-pointer-from-
       | handle function to check "if gone", and handle things
       | accordingly. it's a bit cumbersome to have to do this check
       | everywhere that you want to get a pointer to an entity, but with
       | some discipline, you'll never encounter "use-after-free"
       | situations, or game logic errors caused by assuming something
       | that existed last frame is still there when it might be gone now
       | for any number of reasons--because you're explicitly writing what
       | fallback behavior should occur in such a situation.
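        | 
        | roughly, the lookup looks like this (a sketch with made-up names,
        | not the exact code):
        | 
        |     #[derive(Clone, Copy)]
        |     struct EntityHandle { slot: usize, id: u32 }
        | 
        |     struct Entity { id: u32 /* ...plus game state... */ }
        | 
        |     struct World { entities: Vec<Entity> }
        | 
        |     // the "pointer" plus the explicit "gone" flag the caller checks
        |     struct Lookup<'a> { entity: &'a mut Entity, gone: bool }
        | 
        |     impl World {
        |         fn get(&mut self, h: EntityHandle) -> Lookup<'_> {
        |             let entity = &mut self.entities[h.slot];
        |             // gone if despawned this frame (id == 0) or if the slot
        |             // was reused by something else (id mismatch)
        |             let gone = entity.id == 0 || entity.id != h.id;
        |             Lookup { entity, gone }
        |         }
        |     }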
        
       | hinkley wrote:
       | Java started out with handles. It seemed to be useful for getting
       | the compacting collector working right. Later on, around Java 5,
       | those went away, improving branch prediction. Then sometime
       | around Java 9 they came back with a twist. As part of concurrent
       | GC work, they needed to be able to move an object while the app
       | was still running. An object may have a handle living at the old
       | location, forwarding access to the new one while the pointers are
       | being updated.
       | 
       | That was about when I stopped paying attention to Java. I know
       | there have been two major new collectors since so I don't know if
       | this is still true. I was also never clear how they clone the
       | object atomically, since you would have to block all updates in
       | order to move it. I think write barriers are involved for more
       | recent GC's but I'm fuzzy on whether it goes back that far or
       | they used a different trick for the handles.
        
       | pavlov wrote:
       | Memory handles and their cousins, function suites. The last time
       | I used both of these must have been when writing an After Effects
       | plugin.
       | 
       | The "suite" is a bit like a handle but for code: a scoped
       | reference to a group of functions. It's a useful concept in an
       | API where the host application may not provide all possible
       | functionality everywhere, or may provide multiple versions.
       | Before, say, drawing on-screen controls, you ask the host for
       | "OSC Suite v1.0" and it returns a pointer to a struct containing
       | function pointers for anything you can do in that context with
       | that API version.
        
         | mananaysiempre wrote:
         | > Before, say, drawing on-screen controls, you ask the host for
         | "OSC Suite v1.0"
         | 
         | Or for IWebBrowser2, or for EFI_GRAPHICS_OUTPUT_PROTOCOL, or
         | for xdg-shell-v7, or for EGL_ANDROID_presentation_time. It's
         | not really an uncommon pattern, is my point. It can be awkward
         | to program against, but part of that is just programming
         | against multiple potential versions of a thing in general.
         | 
         | I can't see the connection with handles, though.
         | Suites/interfaces/protocol/etc. are an ABI stability tool,
         | whereas handles are at most better for testing in that respect
         | compared to plain opaque pointers.
        
           | pavlov wrote:
           | In my mind, the similarity is that both handles and function
           | suites allow the host runtime to change things around behind
           | your back because you're not holding direct pointers, but
           | instead the access is always within a scope.
           | 
            | With a memory handle there's usually an explicit closing call
            | like unlock:
            | 
            |     uint8_t *theData = LockHandle(h);
            |     ...
            |     UnlockHandle(h);
            |     theData = NULL;
           | 
            | With a function suite (the way I've seen them used anyway!),
            | the access might be scoped to within a callback, i.e. you are
            | not allowed to retain the function pointers beyond that:
            | 
            |     doStuffCb(mgr) {
            |         SomeUsefulSuite *suite = mgr->getSuite(USEFUL_V1);
            |         if (suite && suite->doTheUsefulThing)
            |             suite->doTheUsefulThing();
            |     }
        
             | mananaysiempre wrote:
             | Ah. Yes, that's quite a bit more specific than what I
             | imagined from your initial description.
             | 
             | Doesn't that mean that you have to unhook all your
             | references from the world at the end of the callback and
             | find everything anew at the start of a new one? (For the
             | most part, of course, that would mean that both the runtime
             | and the plugin would end up using as few references as
             | possible.) What could a runtime want to change in the
             | meantime that pays for that suffering?
             | 
             | I can only think of keeping callbacks active across
             | restarts and upgrades, and even then allowing for parts to
             | disappear seems excessive.
        
       | [deleted]
        
       | jiveturkey wrote:
       | conflates a few of the sub-topics and gets some of it wrong, but
       | a good read nonetheless.
       | 
       | my first experience with handles is from early classic macOS,
       | "system N" days. Most everything in the system API was ref'd via
       | handles, not direct pointers, and you were encouraged to do the
       | same for your own apps. Memory being tight as it was back then, I
        | imagine the main benefit was being able to repack memory, i.e.
       | nothing to do with performance as is mostly the topic of TFA.
       | 
       | I guess ObjC is unrelated to the macOS of olde, but does it also
       | encourage handles somehow? I understand that one reason apple
       | silicon is performant is that it has specific fast handling of
       | typical ObjC references? Like the CPU can decode a handle in one
       | operation rather than 2 that would normally be required. I can't
       | find a google reference to substantiate this, but my search terms
       | are likely lacking since I don't know the terminology.
        
       | chubot wrote:
       | Counterpoint: I read many of these "handles" articles several
       | years ago, and tried them in my code, and it was something of a
       | mistake. The key problem is that they punt on MEMORY SAFETY
       | 
       | An arena assumes trivial ownership -- every object is owned by
       | the arena.
       | 
       | But real world programs don't all have trivial ownership, e.g. a
       | shell or a build system.
       | 
       | Probably the best use case for it is a high performance game,
       | where many objects are scoped to either (1) level load time or
       | (2) drawing a single frame. You basically have a few discrete
       | kinds of lifetimes.
       | 
       | But most software is not like that. If you over-apply this
       | pattern, you will have memory safety bugs.
        
         | meheleventyone wrote:
         | Disagree, you can easily add validation to handle access with
         | generations. The classic example in games is reusing pooled
         | game elements where their lifetimes are very dependent on
          | gameplay. For example, reusing enemies from a pool, where their
          | lifetimes depend on when they get spawned and destroyed, which
          | can be chaotic. Here handles with generations
         | preserve memory safety by preventing access to stale data.
        
         | throwawaymaths wrote:
         | you'd be surprised how far you can get with this. Handles are
          | basically how things like the Erlang VM (and IIRC JavaScript VMs)
         | work.
        
       | charcircuit wrote:
       | Pointers are already handles if you are using virtual memory
        
       | jyscao wrote:
       | (2018)
       | 
       | Previous discussions:
       | 
       | 2018: https://news.ycombinator.com/item?id=17332638 (80 comments)
       | 
       | 2021: https://news.ycombinator.com/item?id=26676625 (88 comments)
        
       ___________________________________________________________________
       (page generated 2023-06-21 23:00 UTC)