[HN Gopher] ORC - Nim's new memory managment ___________________________________________________________________ ORC - Nim's new memory managment Author : mratsim Score : 165 points Date : 2020-12-08 14:45 UTC (8 hours ago) (HTM) web link (nim-lang.org) (TXT) w3m dump (nim-lang.org) | scottlamb wrote: | Couple questions about how this works. | | 1. Is "acyclic" checked by the compiler in any way? or is the | programmer basically saying "I'm responsible for making sure this | there are no cycles, and I accept that memory will leak if I'm | wrong"? I imagine the latter because the compiler check sounds | infeasible. (Similarly Rust's memory safety doesn't guarantee | there are no leaks.) | | 2. How does it work with multiple threads? I see a Q/A here [1] | which says | | > C: We do not do atomic reference counting, we have a shared | heap where you move subgraphs completely to a different thread. A | graph is always owned by a single thread, there is no need to | make the reference counting atomic. We have mechanisms in | development to ensure this at runtime or at compile-time, | inspired by the Pony programming language. | | and would be curious to read the one-page rather than one- | paragraph version of this... | | [1] https://forum.nim-lang.org/t/5734 | disruptek wrote: | You can read about the reasoning behind subgraph ownership | here: https://github.com/nim-lang/RFCs/issues/244 | scottlamb wrote: | Thanks. If I'm understanding correctly: with Isolated (which | is still being built), any given object is reachable from | only one thread. It can be used for message-passing but not | shared memory. | | It looks like they've talked about shared memory as important | for efficiency, eg https://nim-lang.org/araq/concurrency.html | (from back in 2013 apparently, and I don't know if it's a | "canonical" project view or not) says: | | > For real CPU efficiency on commodity hardware shared memory | is unavoidable. Message passing doesn't cut it and ultimately | doesn't address the same problems. | (http://www.yosefk.com/blog/parallelism-and-concurrency- | need-...) Also shared mutable memory is unavoidable in a | systems programming context. | | (I agree with that statement, fwiw.) | | It sounds like they haven't built support for shared memory | yet; correct? and as this Isolated construct is also in | progress, I think there's basically no threading support in | Nim yet? | | Are they still intending to build shared memory as well as | Isolated? Is it intended to be memory-safe (as in the Rust | terminology) or not? | treeform wrote: | 1. As I understand the {.acyclic.} pragma is way to telling the | compiler that you take full responsibility. Otherwise without | the {.acyclic.} pragma it will spend the extra time scanning | the objects for cycles. | | 2. It appears that Nim will use "move ownership of isolated | object graph" from one thread to the other. The cycle detection | code is very similar to freeing an object graph to moving of | the ownership of an object graph to a different thread. | | I would also like more clarification from the creators. The | above is just how I understand it right now. More simple | example code would be great. | elcritch wrote: | The isolation technique is an area I've been interested in | and following. It's not too well documented as that area is a | work in progress and not really useable currently. Almost | though! | | First it's good to note that ARC improves upon the standard | reference counting as in Obj-C or Swift by using move and | sink's which removes many redundant ref's/decr's. That is if | Swift hasn't added it. The compiler can infer when ownership | is moved or not. Simple ideas but put together in a powerful | fashion! | | As I understand the isolation check, it is a compile time | check that an object being moved to another thread is really | isolated with no other external references. This can | sometimes be done at compile time similar to acyclic | structures though I don't know the algorithm used. You can | also use a runtime check which, appears related to cycle | collection. Essentially you can count if the total internal | references account for all the references in a given object | graph -- probably equivalent in cost to an ORC cycle check | for that data sub-graph. Whether done at compile time or | runtime the plan is to wrap the object in a special 'Option' | like type 'Isolated[T]' that can be accepted by multithreaded | constructs. So channels or concurrent queues, etc can declare | that they only accept data that can safely be moved between | threads using the standard move/sink annotations. This also | lets users create their own. | | It boils down to 'Isolated[T]' just being a data check, and | you can move data between threads at your own risk. That's | what I'm doing currently since I'm doing multithreaded | embedded programming and using some C Queue constructs. The | move/sink and ARC handle the actual data movement. In theory | you could use Isolated runtime checks during testing (if your | data structures are too tricky for compile time checks), but | disable it for release. Similar to cycle checking. There's a | setting to use ORC to print detected cycles, then in release | you can turn off ORC. | wisty wrote: | If it's verifiable by the compiler, then it shouldn't need to | have a pragma, just optimise it automatically. (OK, it might be | useful as a hint). I'm guessing there's grey areas where this | would be some kind of halting problem type issue though - you | won't know until you run it. Constructing a graph that you know | to be a tree would be a trivial example. | | Runtime checks would be possible but then you'd have to check | every time you create an object. | throwaway_pdp09 wrote: | Pretty sure no efficent run-time algo for general cycle | detection exists. It has (AFAIK) been a big area in garbage | collection for a long while, so smart people have been all | over it. | | Not sure if one exists for compile-time either. | bobthebuilders wrote: | How do GCs collect cycles then? I'm pretty sure most modern | GCs handle cycles effortlessly. | Someone wrote: | It depends on what you call 'effortlessly'. | | And, technically, GCs collect non-garbage, and throw away | what remains. They don't care whether what they throw | away contains cycles or not, but have to take care that | they don't keep running in circles when they follow links | between objects while collecting the non-garbage. | Someone wrote: | I don't think so, either, but the only time a cycle can be | introduced is if one modifies an object to make it point to | a younger (created later in time) object. | | In some languages, that's relatively rare because modifying | objects after they are created is rare; immutable objects | can be part of cycles, but they can't be the cause of | cycles. That may be enough to make the overhead of cycle | detection small. | | I guess one could keep track of those 'dirty' links, and | use them to avoid having to detect cycles in most cases. | | It might be enough to add a single bit to each object-to- | object pointer indicating "this reference was set after | construction of the object" (could be an unused bit in a | virtual address, or the least significant bit, so it | wouldn't introduce memory overhead). | | With such bits, at first sight, one would only have to | check for cycles when P references a _possibly_ younger Q, | and destruction of some object (indirectly) decreases P's | ref count to 1. However, things probably would get hairy | when objects may have many outgoing and/or incoming | pointers (consider a set of 3 objects each referencing the | other two). | | I'll go and read the referenced paper | (https://researcher.watson.ibm.com/researcher/files/us- | bacon/...) to see how it handles this. | skulk wrote: | There's a lot of praise for ORC[1], but is there any discussion | of the downsides of this approach? One potential downside is | increased compile times, something that hasn't ever been a | problem for Nim. However, I'm not sure what kind of load ORC puts | on the compiler, and how that will scale to large projects. | | [1]: https://forum.nim-lang.org/t/6483 | planetis wrote: | I have switched to `--gc:arc`/`orc` for some time but I haven't | noticed any difference in compilation speed. | treeform wrote: | It's great that Nim is pushing boundaries in the memory | management space. I remember reading how Java and Go were able to | scan large heaps faster and faster. But what if you don't even | scan the heap? Brilliant. | disruptek wrote: | Also, we don't scan "live" memory, which matches your intuition | -- why would we? | int_19h wrote: | I may be missing something here, but reference counting + cycle | detector is what Python has been doing for many years now? | | In my experience, it's the worst of both worlds. You no longer | have the full determinism of reference counting - you only | retain it for non-cyclical structures, and in a large enough | program, it can be easy to introduce those without noticing | (with callbacks etc). But you still have the overhead of RC on | every reference assignment. | gok wrote: | Extremely cool. Reference counting with cycle collection might be | an idea whose time has finally come. ___________________________________________________________________ (page generated 2020-12-08 23:01 UTC)