[HN Gopher] Custom Allocators in Rust
___________________________________________________________________
Custom Allocators in Rust

Author : g0xA52A2A
Score  : 100 points
Date   : 2023-04-08 09:53 UTC (13 hours ago)

(HTM) web link (nical.github.io)
(TXT) w3m dump (nical.github.io)

| antimora wrote:
| Does anyone know a custom allocator that allocates memory for the
| same pattern usage? I am trying to find a good allocator for
| tensor operations used in Burn's framework
| (https://github.com/burn-rs/burn). It uses the same vectors
| without any varying size changes.
|
| foota wrote:
| There's been some side work towards defining dynamic scoping for
| Rust. E.g., allow functions to declare a dependency on some value
| or type, and then require callers to either pass that dependency
| explicitly, or have it on their dependencies.
|
| One of the proposed uses was for allocators.
|
| https://tmandry.gitlab.io/blog/posts/2021-12-21-context-capa...
| is one of the more cogent proposals.
|
| I don't think any of these has made substantial progress towards
| being approved though.
|
| forrestthewoods wrote:
| That's pretty neat. Seems very similar to Jai's implicit
| context. I'm definitely a fan of context objects. It solves so
| many painful problems related to allocators, logging, etc.
|
| fpoling wrote:
| The article assumes that storing an allocator together with a
| container inevitably bloats the container by 1-2 CPU words. This
| is not necessarily so. One can require that it should be possible
| to get the allocator from a pointer it allocates. With many
| allocator designs this will cost about one CPU word per CPU page,
| which is trivial.
|
| This still requires parametrizing the container on the type of
| the allocator, but there will be no penalty on the size of the
| containers and the API will be safe.
|
| phkahler wrote:
| How much better is Rust's default than C++?
| I would expect it to be better since they have all that nice
| lifetime information on every piece of data. It seems like some
| nice things could be done with that. Are they?
|
| LegionMammal978 wrote:
| By default, Rust just delegates to the ordinary libc malloc(),
| realloc(), and free(). In safe code, allocations are managed
| through RAII, just like in modern C++. Box works like
| unique_ptr, Arc works like shared_ptr, Vec works like vector,
| and so on. Lifetime annotations are just a way for the
| programmer to tell the compiler that none of the rules are
| being broken: they don't actually affect the semantics of the
| compiled program, they just make it fail to compile if there is
| an error. The underlying UB rules (except for those around
| unique references) are not much stricter than those required by
| C++; meanwhile, the lifetime rules are fairly strict, but the
| compiler cannot optimize based on them.
|
| At best, the lifetime system could allow you to use a faster
| design for your library, by placing tighter compile-time
| constraints on allowed use cases than you could have feasibly
| documented in a C++ library's interface. But examples of this
| are not too common in my experience, since most find it
| acceptable for C++ libraries not to accommodate sufficiently
| weird use cases.
|
| cormacrelf wrote:
| > _This one is probably closest to what one would write in
| languages like C++. The data structure just assumes the allocator
| will outlive it, and it is up to the user to either use a static
| allocator, or pretend to, using unsafe code to cast away the
| lifetime while making sure that the allocator outlives the data
| structure without the compiler's support._
|
| _I know many in the Rust community will frown upon this
| approach, but to be honest I don't think that it is a terrible
| solution in the context of an advanced feature like custom
| allocation strategies.
| Tessellator does not expose an unsafe API,
| but it documents that if its users were to break the rules they'd
| simply have to make sure the allocator outlives the data
| structure. In any other language with this kind of control over
| memory management, this contract would have to be manually upheld
| by users of the API and it is considered normal._
|
| ... no thanks.
|
| lenkite wrote:
| Well, I hope we get custom allocation soon in Rust. It's really
| strange not having the same in an advertised systems programming
| language.
|
| Can help in incidents like the below:
|
| https://www.svix.com/blog/heap-fragmentation-in-rust-applica...
|
| Diggsey wrote:
| You can already do custom allocation in Rust. These efforts
| are about bringing support for custom allocators to the types
| in `std` so that you can combine use of a normal `Vec` with a
| custom allocator, instead of needing to define your own
| `Vec` (or at least pull one in from a 3rd party crate).
|
| tomjakubowski wrote:
| You might want custom local allocators. You can already
| replace the global allocator with one designed to reduce
| fragmentation, as described in the article.
|
| andrepd wrote:
| The value proposition of Rust is precisely having zero-cost
| abstractions and highest-performance code with compile-time
| verification of correctness.
|
| That being said, I don't oppose the occasional judicious use of
| unsafe. If 99% of your code is verified, it's not 100%, but
| that's still a massive improvement over a C++ codebase.
|
| likeabbas wrote:
| I think `unsafe` would have been more aptly named
| `compiler_unverifiable`. IMO there would be less apprehension
| to using `unsafe` when it's needed.
|
| Yoric wrote:
| `unchecked` might be more palatable, but I agree that there
| is some uncomfortable mismatch between the meanings of
| "unsafe" and `unsafe`.
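
On tomjakubowski's point above that the global allocator can already be replaced on stable Rust: a minimal sketch of what that looks like, assuming a wrapper that only counts requested bytes and forwards to the system allocator. `CountingAlloc`, `ALLOCATED`, and `bytes_for_vec` are hypothetical names made up for illustration; a real fragmentation-oriented allocator would come from a crate rather than from this toy.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards every request to the system allocator
/// and keeps a running total of the bytes that were requested.
pub struct CountingAlloc;

/// Total bytes requested so far (never decremented on free).
pub static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        // SAFETY: the caller upholds GlobalAlloc's contract; we forward it unchanged.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: `ptr` was returned by `System.alloc` with this same layout.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every Box, Vec, String, etc. in the program now goes through CountingAlloc.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Returns how many bytes were requested while reserving room for `n` u64s.
/// Other threads may allocate concurrently, so treat the result as a lower bound.
pub fn bytes_for_vec(n: usize) -> usize {
    let before = ALLOCATED.load(Ordering::Relaxed);
    let v: Vec<u64> = Vec::with_capacity(n);
    black_box(&v); // discourage the optimizer from eliding the allocation
    let after = ALLOCATED.load(Ordering::Relaxed);
    after - before
}
```

This only covers the process-wide allocator. The per-container ("local") allocators the thread is mostly about are a separate matter, since the allocator has to be named in or passed to each collection.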
|
| ZephyrBlu wrote:
| As someone who mainly uses higher level languages, doesn't
| care for C/C++ and really likes Rust, I'm glad it was named
| so strongly and that safety and correctness is very
| important in the community.
|
| It creates a strong incentive to only write safe Rust,
| which is great for the vast majority of people.
|
| anonymoushn wrote:
| In this case it would be nice if operations that are
| type-safe and don't read or write memory at all, such as
| mm256_shuffle_epi8, were available in safe Rust.
|
| jsheard wrote:
| It's a work in progress, on nightly there is a safe and
| portable SIMD abstraction under development in std::simd.
|
| e.g. "mm256_shuffle_epi8" on X64+AVX2, ARM64+NEON and
| plain ARM: https://rust.godbolt.org/z/7rjKE93Kn
|
| It currently only works with constant shuffle masks but
| dynamic shuffles are on the to-do list.
|
| likeabbas wrote:
| I think the incentive to write compiler verifiable Rust
| would be the same as safe Rust, but with less fear for
| the situations where you do need to bypass the borrow
| checker such as with cyclical references in graphs
| (currently doing this right now by re-writing a basic NN
| in Rust). Even the standard library uses unsafe for
| certain situations.
|
| proto_lambda wrote:
| > with less fear for the situations where you do need to
| bypass the borrow checker such as with cyclical
| references in graphs
|
| That's a pretty tricky thing to get right, and with the
| consequence for getting it wrong being UB, at least a
| little fear is warranted.
|
| likeabbas wrote:
| `compiler_unverifiable` isn't risky enough for you?
|
| ZephyrBlu wrote:
| It doesn't have the same connotations as `unsafe`.
|
| likeabbas wrote:
| But it's the true definition being stated. Unsafe is
| subjective, compiler unverifiable isn't.
|
| ZephyrBlu wrote:
| So?
|
| This is like applications adding artificial delay to
| operations so users aren't surprised they complete so
| quickly.
|
| User understanding is more important than definitional
| correctness.
|
| peyton wrote:
| Ok Rust is about a lot of things, but correctness is not one
| of those things. Dafny is an example of a programming
| language that's about correctness.
|
| cormacrelf wrote:
| We can talk in specifics: this example was about making
| people use unsafe to avoid having to type <'static> in the
| general case. The author had been trying for a number of
| paragraphs to avoid polluting the type name with generics or
| lifetimes. This is what going too far looks like.
|
| The std APIs look like this:
|
|     struct Box<T, A: Allocator = Global>
|
| And it's been this way in stable releases for the last year
| or so. The same has been done for Vec and all the other
| std::collections. What percent of Rust programmers do you
| think even noticed at all? 1%? The most flexible design and
| the least impacting on regular users. You can use A = &'a dyn
| Allocator if you like, equally you can choose a ZST and not
| pay for 16 bytes of storage. The library author has no need
| to choose in advance at all, which is great if they're
| determined to make weirdly constrained choices, ultimately
| forcing most uses of the API to be unsafe. I stopped reading
| after that so I don't know which design they went for.
|
| mikepurvis wrote:
| And if the generated documentation is really the main
| sticking point, it would be perfectly possible to special
| case the allocator type parameter in rustdoc so that it
| hides or otherwise deemphasizes it.
|
| throwawaymaths wrote:
| > The most flexible design
|
| Not quite. For the most general case, you probably want
| allocator to be a proper parameter instead of a type
| parameter. For example, suppose you have a green thread
| that you want to have its own isolated heap. Then you can't
| assume that any given allocator is a singleton in its type;
| the struct itself must somehow be able to find its
| "owner" on release.
| In the green thread case you can't
| "just use threadlocal" because a green thread might not be
| sticky to an OS thread.
|
| twic wrote:
| Rust already has almost exactly this problem with keeping
| track of the current task in the async framework. As far
| as I know, the solution is indeed to use a thread local,
| and be scrupulous about updating it on task entry. It's
| ugly but it seems to work.
|
| stephc_int13 wrote:
| I think that avoiding global allocation and lifetime management
| is good design and a way to make memory management as simple and
| flexible as it should be.
|
| But, from my own work in the field, I came to the conclusion that
| there is a better way than passing custom allocators: passing
| containers instead.
|
| In practice this is almost the same thing (except that containers
| are made to be easily traversed) but semantically there is a
| subtle and positive difference.
|
| The goal is to avoid confusing the allocating policy (let's say
| arena versus heap) with the allocator instance.
|
| conaclos wrote:
| What do you mean by Containers?
|
| A good abstraction for passing allocators could be effect and
| effect handler.
|
| stephc_int13 wrote:
| A container is basically a structure holding a collection.
|
| In that case, the container also acts as an allocator (the
| feature can be embedded directly or not).
|
| The only practical difference is that a container should
| provide some facilities for traversal, while the allocator is
| only taking care of allocating and freeing.
|
| From my experience, I very rarely allocate things without
| keeping them in a kind of container, it can be something as
| simple as an array of references.
|
| Reitet00 wrote:
| Great to see some movement in this area for Rust. Compared to Zig
| which had this from day 1 it may be hard to adjust Rust (the way
| it was hard for Go to adopt generics).
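
One possible reading of stephc_int13's "pass containers instead of allocators" idea above, sketched in safe Rust. Everything here is an assumption made for illustration: `Pool`, `Handle`, and `build_squares` are hypothetical names, and the comment does not say that this is the design its author actually uses. The pool owns the storage (so it plays the allocator role), hands out index handles, and can be traversed.

```rust
/// Hypothetical typed index into a `Pool`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Handle(usize);

/// A container that doubles as an arena: values live in one Vec and are
/// all released together when the pool is dropped.
pub struct Pool<T> {
    items: Vec<T>,
}

impl<T> Pool<T> {
    pub fn new() -> Self {
        Pool { items: Vec::new() }
    }

    /// "Allocate" by moving `value` into the pool; returns a handle to it.
    pub fn alloc(&mut self, value: T) -> Handle {
        self.items.push(value);
        Handle(self.items.len() - 1)
    }

    /// Look a value up by handle; `None` if the handle is out of range.
    pub fn get(&self, handle: Handle) -> Option<&T> {
        self.items.get(handle.0)
    }

    /// The traversal facility a bare allocator would not offer.
    pub fn iter(&self) -> std::slice::Iter<'_, T> {
        self.items.iter()
    }
}

/// The callee is handed the container instance, not an allocation policy.
pub fn build_squares(pool: &mut Pool<u32>, n: u32) -> Vec<Handle> {
    (0..n).map(|i| pool.alloc(i * i)).collect()
}
```

Using indices rather than references sidesteps the lifetime question discussed elsewhere in the thread, at the cost of a bounds check on each `get`.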
|
| Diggsey wrote:
| I think most of the challenges are around making custom
| allocators _safe_ without harming the ergonomics, which Zig
| hasn't solved.
|
| If Rust was content to have use of custom allocators be unsafe
| it would be trivial to add them (since you could just add new
| variants of allocating methods that take an allocator as a
| parameter).
|
| hansvm wrote:
| Are you just reminding us that Rust does some checks that Zig
| doesn't, or are you saying that there's some particular
| footgun in their allocators above and beyond the fact that
| UAF and other memory bugs are writeable in general?
|
| Diggsey wrote:
| Neither, I was disagreeing with the comparison to Go and
| generics.
|
| Generics in Go don't add anything beyond what generics
| already do in other languages, so the challenge with
| bringing generics to Go is "how do we adapt the language to
| support a feature that already exists in other languages
| and is generally well understood". Bringing generics to a
| language that wasn't designed with them in mind has often
| resulted in a sub-optimal implementation (e.g. Java vs C#).
|
| On the other hand, "safe custom allocators" are not a
| feature that any language (to my knowledge) has solved.
| It's not as though this was an oversight in Rust's initial
| design: using a custom allocator in an unsafe context has
| always been possible in Rust, and it's too early to say
| whether bringing this feature into the language later will
| result in a similarly sub-optimal design: in order to be
| sub-optimal there would have to exist some better solution
| out there, and there currently doesn't.
|
| fpoling wrote:
| Zig approach essentially parametrizes instances of containers
| on the instance of the allocator. To properly support that in
| the type system one needs dependent types.
|
| In theory that can be done, but the consequences for the
| compilation time will be extreme as the compiler becomes
| essentially a generic theorem prover.
|
| Plus the noise from the proofs in the sources will be much
| bigger than that of type parameters.
|
| throwawaymaths wrote:
| I think realistically if you want to prove safety of memory
| provenance in Zig, you assume that allocators are sound,
| and you just track the lifetime of the memory from
| alloc/create to free/destroy and call it a day. This is
| probably "good enough", and in Rust you're assuming the
| allocation is sound as well, it's just implicit.
|
| throwaway17_17 wrote:
| I'm pretty sure the 'logic' required for parametrizing
| container types over allocators can be restricted in a wide
| majority of cases to a linear set which would allow for
| refinement types to be used. This would eliminate the
| requirement for full theorem proving.
|
| In fact, it would probably be acceptable to insist on
| keeping the scope of parameterization limited to the
| linearly determinable set of values and operations on those
| values.
|
| fpoling wrote:
| A Zig-style allocator that is passed as an argument
| requires introducing a dependency of the type of the
| container on the instance of the allocator.
|
| Perhaps it is possible to specialize a generic prover for
| that case, but I am sceptical that the effect on
| compilation timing will be minimal.
___________________________________________________________________
(page generated 2023-04-08 23:01 UTC)