[HN Gopher] The pervasive effects of C's malloc() and free() on ... ___________________________________________________________________ The pervasive effects of C's malloc() and free() on C APIs Author : todsacerdoti Score : 108 points Date : 2022-08-07 14:03 UTC (8 hours ago) (HTM) web link (utcc.utoronto.ca) (TXT) w3m dump (utcc.utoronto.ca) | greatgib wrote: | lpapez wrote: | _Sure, it is not for script kiddies, and for the usual new | 'web' developers that are now living in easy dev sandboxes. | But, now, most applications are really mediocre. Everything | uses hundreds of MB or GB of memory just for simple | programs..._ | | Do you realize how snobbish this sounds? | jeffbee wrote: | Well the example of the Linux kernel is an extremely bad one | because it is absolutely stuffed full of memory management bugs | on error paths. It will leak from `goto` beyond the | deallocation, or it will free memory that was never allocated | if it branched over the allocation. When pressed to its limits, | for example by running a container out of memory, the linux | kernel memory management falls to pieces. | eska wrote: | In practice this is mostly avoidable. There are many libraries | that do not allocate at all by forcing the caller to provide the | memory. The library may tell in advance how much memory is | necessary, or report in hindsight whether it was enough for the | operation. This leads to a style of allocating ahead of time and | also considering worst case memory size. | | Otherwise I prefer libraries that allow setting the allocation | function at least. | Sharlin wrote: | Of course this very paradigm, combined with the fact that C | doesn't have a proper pointer-and-length type, led to `gets` | and around n million other security disasters when bad APIs | would just take a pointer without a length and assume there's | "enough" space... 
| | Anyway, the whole "caller allocates" concept doesn't work too | well in the particular case of `gethostbyname`, which, as the | author mentions, returns a complex struct containing several | pointers and double pointers, with a potentially unlimited | number of different-length allocations one would have to make! | dathinab wrote: | There is a funny version of this: | | Caller provides memory and function fails if there is too little | memory, but writes the required (variable) amount of memory | to an `out length` pointer. | | Effectively leading to a common pattern of: | | 1. call function with empty buffer (or buffer of arbitrary | size) | | 2. allocate buffer | | 3. call function again with properly sized buffer | | 4. add a loop if between 1 & 3 the required buffer size can | change | | It's fascinating how well it works for some of its common use | cases and how subtly but badly broken it can be for other use | cases :=) | userbinator wrote: | That's classic "Microsoft style" and I've always hated APIs | like that because they add unnecessary complexity to the code | of all their callers. | severino wrote: | Please excuse my little experience with this topic, but when | the article says this: | | > If this structure is dynamically allocated by gethostbyname() | and returned to the caller, either you need an additional API | function to free it or you have to commit to what fields in the | structure have to be freed separately, and how | | With your approach, wouldn't you need to free all the fields in | the structure separately as well? Because, what's the | difference between the library allocating the memory -with the | problems the article points out- and allocating it yourself? | liuliu wrote: | Yes, it is taught that way. However, for cases that require | some scratch space, dynamic allocations can be easier to | manage, less error-prone and with less overall memory usage | (you don't need to preallocate the worst-case amount). 
| | That's why, even though people know this, many C APIs still return dynamically | allocated objects or simply let you inject malloc / free if you | want more control. | | This is a roundabout way to say: if you aspire to provide APIs | with zero dynamic allocation, go ahead. But if you find | yourself struggling with more complicated code as a result, | consider that allowing a little dynamic allocation may | help. | layer8 wrote: | This is the correct approach. Let the caller allocate the | memory, and if necessary provide a function to calculate the | required memory beforehand, and for the caller to specify the | size of the memory passed to the primary function. For | dynamically-sized results, another option is to provide | iteration functions (like when walking a directory tree) in | order to simplify memory sizing for each individual function | call. | stormbrew wrote: | The especially great thing about this approach is it lets you | put things on the stack when appropriate too. It's really | annoying when some library forces you to spam the heap with | short-lived, easily scoped objects. | eska wrote: | True. Also great for thread safety. And avoiding shared | access performance pitfalls | bArray wrote: | I've always wondered why there isn't a `free_all()` function | (that I'm aware of) for handling exactly this. Or | why you can't probe memory, with something like `is_alloc()` (is | allocated). | | I usually define my own version of `free()` to check whether the | pointer is NULL, free the memory if not, and then set the pointer | to NULL. That way if your pointer isn't NULL, it should be | pointing somewhere. I believe there are some caveats though, | specifically around OOM allocations as memory isn't truly | allocated until you go to access it. | | C memory is generally quite cool to work with, but those tripping | points really will trip you up. 
It's exceptionally easy to have | stuff lingering around indefinitely, even worse when it happens | in a loop. | c-smile wrote: | Such problem is pretty widespread and not limited by just | structures, C-style strings are there too. | | For example: const char* getenv(const char* | name); | | Do we need to free the string? And if "yes" then how? It is not | realistic to provide free** for each such API function... | | In Sciter API (https://sciter.com) I am solving this by callback | functions: typedef void string_receiver(const | char* s, size_t slen, void* tag); const char* | getenv(const char* name, string_receiver* r, void* tag); | | So getenv calls string_receiver and frees (if needed) stuff after | the call. | | This is a bit ugly on pure C side but it plays quite well with | C++ where you can define receiver for std::string for example and | define pure C++ version: std::string | getenv(const char* name); | | It would be nice for C to have code blocks a la Objective-C ( | https://www.tutorialspoint.com/objective_c/objective_c_block... | ), with them solution of returning data is trivial. | juped wrote: | You don't free the pointer returned by getenv(), because the | environment variables are already in memory and getenv() is | just giving you a pointer to one of their values. | | The most comfortable way to do a C API is to make the caller | allocate space for return values (that aren't something simply | copiable like int), and take a pointer to it as a parameter. | The few standard library functions that malloc things are | annoying, because you might not want to do that. | c-smile wrote: | getenv() is just a sample. But even with it... it puts some | limitation on potential API implementation and overall system | performance. E.g. const char* user_password() cannot store | the data anywhere, right? All that... | | > The most comfortable way to do a C API is to make the | caller allocate space for return values | | That's even worse. 
How will the caller know the size of the buffer | up front? | | With the callback approach that is trivial - you get the size | on call - no need to call the API function twice - once for the size | of the buffer and then for the real copy. | coliveira wrote: | In C the way to solve this is to look at the man page for the | function and see what it says about memory allocation. There | is no magic involved. | c-smile wrote: | Documentation solves just one problem: to free or not to | free. | | But there are performance, security and other issues. | | What if it is significantly more performant for getenv() (or | whatever) to fetch needed data using alloca (on stack, with | fallback to heap/malloc)? | | Returning a naked pointer is far from flexible, really. | smarks wrote: | I'm convinced there was another style of C API where the callee | would malloc a struct, populate it, and free it immediately | before returning a pointer to it. Of course, the only way the | caller can use the result is after it has been freed. | | Naturally there was a dependency on the exact behavior of the | allocator, specifically, that it had to leave a freed block of | memory untouched sufficiently long for the caller to be able to | use the results. I seem to recall the stipulation was that freed | memory was left untouched until the next memory allocation | operation. The caller also had to be careful about using or | copying the results immediately, before doing too much work. | | I have dim memories of people talking about this sort of thing at | university in the early 1980s; we were a 4.2 BSD shop. I also | recall debugging some old C source code (srogue, which also has | BSD heritage) decades later, and encountering use-after-free | crashes. There were several instances of this. There were too | many to be accidental; it seemed deliberate to me. | | I suspect the reason for this "technique" was to relieve the | caller of the burden of freeing the memory. 
It allowed the callee | to return variable-length data easily, which couldn't be done if | the pointer was to a static data area. And finally it relieved | the callee of defining an explicit "free" API. | | Frankly I think this is a terrible API style. However, code that | used it properly and that was sufficiently careful would actually | function properly. But it seems like an incredibly fragile and | sloppy way to design a system. | mtlmtlmtlmtl wrote: | Man this sounds like something some students came up with after | partaking in the ganja. And it coming out of Berkeley in the | 80s certainly tracks with that... | | Not a good way of doing things. I mean, have fun using | Valgrind. Or switching out libc, etc. And what about key | material? There you would still have to do a second step of | zeroing or junking the memory when done with it anyway. | smarks wrote: | Yes, grad students at Berkeley in the early 1980s. For some | reason I associate this technique with Bill Joy (who | obviously was a major influence on a lot of what went into | the 4BSD releases). However I have no evidence of this, nor | whether any or what kinds of substances might have been | involved. | bitwize wrote: | > I'm convinced there was another style of C API where the | callee would malloc a struct, populate it, and free it | immediately before returning a pointer to it. Of course, the | only way the caller can use the result is after it has been | freed. | | That's more than a bad API design, that's undefined behavior -- | squarely in nasal-demons territory. Depending on the compiler, | the callee, the caller, or the entire observable universe can | be optimized away into a no-op. | smarks wrote: | Oh yes, totally undefined. | | But consider the time frame, early 1980s K&R C on 4BSD Unix | on a VAX. This predates ANSI/ISO C and POSIX. It even | predates "nasal demons." There was no specification; or | perhaps the implementation _was_ the specification. 
The fact | was that at some point the BSD allocator did leave freed | memory untouched until the next memory allocation operation, | and so people wrote programs that relied on this. | | Again, I'm not defending this, but this seemed to be the way | that some people thought about things. I even remember | questioning some code that used memory after having freed it. | It was explained to me that this was "safe" because the | memory wouldn't be modified until the next malloc! | | Also, remember that BSD was the system where if you did | printf("%s", NULL); | | it would print "(null)" instead of getting SIGSEGV. And in | general, dereferencing a null pointer would return zero. The | rationale for this was that it "made programs more robust." | (Again, I disagree, don't argue with me about this!) | | One more common technique from the BSD era (srogue again, but | other programs did this too). To save the state of a program, | write to a file everything from the base of the data | segment to the "break" at the top of the data segment. To | restore, just sbrk() to the right size and read it all back | in, overwriting everything starting at the base of the data | segment. I always found it surprising that this worked, but | it worked often enough that people did sh!t like this. | Athas wrote: | Well, this sounds like it was before ANSI C, so there was no | defined notion of undefined behaviour - I think that term of | art came with the later standardisations. And if it was | written to run on a specific OS or compiler (4BSD), one can | argue that it was a really bad design, but it worked reliably | on what was essentially the implementation-defined platform | it was targeting. | giomasce wrote: | It also fails on multithdreaded programs, unless you add even | more assumptions on the allocator. | smarks wrote: | "Multidreaded programs" :-) | | This was at least a decade before multithreading in C and | Unix. 
But yes this "technique" would have failed miserably in | a multithreaded environment. | dragontamer wrote: | Sounds like just a common mistake where people used the "stack" | accidentally. | | Ex: struct someStruct* badfunc(){ | struct someStruct toReturn; toReturn.a = foo(); | toReturn.b = bar(); return &toReturn; } | | In most people's code, this would probably work... | struct someStruct a = *badfunc(); func2(); // This will | overwrite the "toReturn" // from the last | call, but as long as the // struct was copied | before any other function // call, you're | probably fine though in // undefined-behavior | land | | ----------------- | | Either that, or you're talking about strtok (and other non- | reentrant functions). | smarks wrote: | It was definitely malloc'd memory, as I remember removing | free() calls from the callee and adding them to the caller. | pitched wrote: | I've used this before in embedded code to save ROM space by | not having to include malloc. The stack _is_ the heap! Just | have to be very careful about when you can call another | function. | | The best version of this is where you allocate a block on | your stack then pass that as a pointer up to the next | function to use. The one who owns the memory is the one who | allocates it (Rust style?). Or having the linker allocate | global blocks works too. | tpoacher wrote: | Question, why would this "probably work"? I would have agreed | with your assessment if `a` was used for its one and only | purpose before `func2` was called ... but as it stands as | soon as `func2` is called, the content of `a` will most likely | be replaced by garbage (and therefore definitely will not | "work" in any reasonable sense of the word), no? | | Or do you mean something else by "probably work"? (like, in | the sense that it will output "something"). | 13of40 wrote: | Since func2 doesn't have any parameters, the only thing | that calling it will put on the stack is the return | address. 
In fact, if it's the last function call in the | code block, even that might get optimized out. The struct | should be intact and available via a local variable in | func2. | Jach wrote: | In GC languages a common approach is to have "finalizers" to | make something like this a possible and sometimes convenient | way of dealing with a foreign API; I wonder if what you saw was | something similar? The idea is to allocate the foreign memory, | then make a finalizer which is just a hook that will | (eventually) call the foreign free only when some object (like | a wrapper for the foreign memory) is collected. Something | similar could be hidden behind some preprocessor macros, with | guarantees only until the next OUR_MALLOC... | | The problems for the GC languages tend to be fewer but if the | wrapper is out of scope but someone grabbed and maintains a | hold on the foreign memory directly, they're playing with fire | as to when the GC will execute the finalizer hook and make | that memory invalid. It's also a frustrating technique when | foreign APIs -- particularly in certain graphics contexts -- | require allocation threads to be the same as freeing threads, | and of course depending on the implementation of free and the | GC it might be an expensive operation to have a bunch of them | suddenly happen at once when all you were expecting was a new | native object and not a bunch of GC work behind the scenes. | smarks wrote: | Definitely not GC. This was K&R C on BSD Unix around 1984. | pjmlp wrote: | That is why finalizers are yesterday's solution, most modern GC- | based languages have eventually caught up with Common Lisp | and offer region based resource management (try-with- | resources, use, using, defer, with,...), and in some cases | trailing lambdas, which completely hide the resource | management from the consumer. | | For scenarios like you're describing, .NET has SafeHandles | for example. 
| derefr wrote: | Sounds like it was just using the heap to badly imitate | returning variable-length data on the stack under a callee- | preserved calling convention. (Callee writes the variable- | length data inside their own stack frame, pops the stack frame | in the function epilogue, and "leaks" the pointer and size of | the data in caller-expected return registers. Caller uses the | dangling data -- carefully not pushing to the stack until it | has finished. Everything works out.) | giomasce wrote: | One solution I've sometimes seen in the wild is to mandate that | the library allocates just one big malloc chunk and arranges | pointers inside that chunk. So the caller has to free just one | thing. It's more inconvenient for the library, though. | Animats wrote: | Well, yes. It's 1970s C technology. | | There are several options. They all suck. | | - Pass in a buffer to be filled by the API. The API can't check | the buffer size you gave it. Be mentioned in a CERT security | advisory for creating a buffer overflow vulnerability. | | - Have the API give you a buffer. Reboot your system regularly to | recover the memory leaks. | | - Free the buffer before returning it, so the caller is using the | buffer after free. Debug memory corruption bugs when someone uses | an allocator which overwrites freed buffers. | | This is what move semantics are for. You call something, it gives | you a thing, and now it's yours to use and release. Needs | language support to work well, but is the right answer. | HarHarVeryFunny wrote: | Just one minor quibble: I'd say it's C++'s classes (not move | semantics) that made doing things like this much cleaner - | mostly by having a destructor that lets data structures | automatically clean up after themselves (release memory) when | they go out of scope. Now the developer doesn't need to know or | care about whether the structure they were given has internal | dynamically allocated components or not. 
| | Move semantics is "just" an optimization that makes passing | data-owning classes around more efficient. Pre C++11 you'd just | return the class by reference to avoid the inefficiency of | return by value, but with move semantics you can treat complex | types the same as simple ones, and not worry about the | efficiency of how you pass them around. | Sharlin wrote: | There's another way: pass a pointer to the cleanup function as | an out parameter - if it's a struct you're returning, just | return the pointer as one of the struct fields. This is, of | course, how "OO" as in methods and polymorphism is sometimes | simulated in C. This way you don't need to pollute your API | with a zillion different `free_foo` functions. | chjj wrote: | Just an interesting aside: returning structs directly (e.g. | `struct foo bar();`) was added in V7 UNIX (1979). That said, the | convention for it was pretty archaic: behind the scenes PCC used | a static return buffer and the caller knew to copy the struct | from the returned pointer afterwards. So what looked like thread- | safe code was actually totally broken by today's standards. | | GCC still supports this with -fpcc-struct-return[1] (though, the | modern man page doesn't seem to mention the static return | buffer). | | Also just because there were no threads back in the day doesn't | mean static return buffers were okay. In some cases, invoked | signal handlers could still call something and corrupt your | statically allocated return buffer. So making any system call | after receiving your static return pointer was a footgun to watch | out for: struct foo *bar = some_lib_func(); | time(0); /* potential breakage */ | | [1] | https://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Incompatibilities... | wwalexander wrote: | I'm not sure how common it is for libraries to return heap- | allocated memory like this, vs. taking a pointer to an | uninitialized value. 
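A minimal sketch of the "caller provides the memory" alternative raised in this thread, using the two-call size query that eska, dathinab and layer8 describe. The names `make_greeting` and `greeting_dup` are hypothetical, invented for illustration; only `snprintf`, `malloc` and `free` are real library calls. The library function never allocates and always reports how many bytes it needs, so the caller can choose stack or heap storage.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library call: formats "Hello, <name>!" into buf (capacity cap).
 * Never allocates. Returns the bytes required including the NUL terminator,
 * so a caller may pass (NULL, 0) first to learn the size. Returns 0 on error. */
static size_t make_greeting(char *buf, size_t cap, const char *name)
{
    int n = snprintf(buf, cap, "Hello, %s!", name);
    return n < 0 ? 0 : (size_t)n + 1;
}

/* Caller-side helper showing the two-call pattern with a heap buffer.
 * The caller owns the result and releases it with plain free(). */
static char *greeting_dup(const char *name)
{
    size_t need = make_greeting(NULL, 0, name);    /* 1. ask for the size */
    if (need == 0)
        return NULL;
    char *buf = malloc(need);                      /* 2. caller allocates */
    if (buf == NULL)
        return NULL;
    if (make_greeting(buf, need, name) != need) {  /* 3. fill the buffer */
        free(buf);
        return NULL;
    }
    return buf;
}
```

A caller that knows a reasonable upper bound can skip the first call and pass a stack array directly; comparing the return value with the array size tells it whether the output was truncated. If the required size can change between the two calls, step 3 would need to sit in a loop, as noted earlier in the thread.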
| userbinator wrote: | _This became a serious issue when Unix added threads (this static | area isn't thread safe)_ | | I'm not convinced it's "serious" --- thread-local-storage easily | solves that. | | _Since this structure contains embedded pointers (including two | that point to arrays of pointers), there could be quite a lot of | things for the caller to call free() on (and in the right | order)._ | | Again the solution is simple: Allocate everything at once, so | that free() need be called only once on the returned block. | | In some ways I think the relative difficulty of using dynamic | allocation in C compared to other languages is a good thing --- | it forces you to think whether it's really necessary before doing | so, and in many cases, it turns out not to be. That | encourages simpler, more efficient code. In contrast, other | languages which make it _very_ easy to dynamically allocate (or | even do it by default) tend to cause the default efficiency of | code written in them to be lower, because it's full of | unnecessary dynamic allocations. | Athas wrote: | > I'm not convinced it's "serious" --- thread-local-storage | easily solves that. | | What about other cases of exceptional control flow? What | happens if a signal arrives while that static area is being | used, and the signal handler also needs to use the static area? | userbinator wrote: | The set of functions that can be called from a signal handler | is very small (most of them corresponding to system calls or | otherwise non-stateful functions like strchr()); | gethostbyname() is not one of them, and neither are malloc() | nor free(). 
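The "allocate everything at once" suggestion in this subthread (also mentioned by giomasce) can be sketched roughly as follows. This is an illustration under stated assumptions, not how any real resolver library lays out its results: `struct name_list` and `name_list_create` are hypothetical names, and size-overflow checks are omitted for brevity. The header, the pointer array and the string bytes share one `malloc` block, so a single `free()` releases everything.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: a count plus an array of strings.
 * Header, pointer array and string bytes all live in one malloc block. */
struct name_list {
    size_t count;
    char **names;   /* points into the same block as the struct */
};

/* Copies `count` strings from src into a single allocation.
 * Returns NULL on allocation failure. The caller frees with free(). */
static struct name_list *name_list_create(const char *const *src, size_t count)
{
    size_t bytes = 0;
    for (size_t i = 0; i < count; i++)
        bytes += strlen(src[i]) + 1;

    /* Layout: [struct name_list][count pointers][string bytes]. */
    struct name_list *nl = malloc(sizeof *nl + count * sizeof(char *) + bytes);
    if (nl == NULL)
        return NULL;

    nl->count = count;
    nl->names = (char **)(nl + 1);          /* pointer array follows the header */
    char *p = (char *)(nl->names + count);  /* string bytes follow the pointers */
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(src[i]) + 1;
        memcpy(p, src[i], len);
        nl->names[i] = p;
        p += len;
    }
    return nl;
}
```

The cost is on the library side: it has to compute the total size up front and cannot grow one field later without moving the whole block, which is the inconvenience giomasce points out.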
| thxg wrote: | One slight variation on the getaddrinfo()/freeaddrinfo() approach | is what (among many others) GMP [1] and its derivatives do: For | every struct or custom type, you systematically get | void type_init(type *t, [...]); void type_clear(type *t, | [...]); | | This is essentially explicit constructors and destructors in C, | and one can legitimately argue that it is clunky, verbose and | error-prone. | | However, if we are constrained to a C API, it does have one | important practical quality in my experience: Because it is | always the same, it eases the mental load on both the API's user | and the API's implementer, especially if there are many such | types involved. | | [1] See e.g. https://gmplib.org/manual/Initializing-Integers | vlovich123 wrote: | CoreFoundation at Apple had a similar convention. Any memory | obtained by an API named Create or Copy would have similar | Delete method that always had to be called. | deathanatos wrote: | > _and one can legitimately argue that it is clunky, verbose | and error-prone._ | | Clunky and verbose, yes, error-prone no. The APIs that do that | are generally much more clear about ownership, and thus much | easier, IMO, to write correct code for. Much worse is the API | that returns you a pointer with no obvious mechanism to free | it. Is it tied to the lifetime of an input to the function that | returned it? Is it a global and this API is completely thread- | unsafe? Am I leaking memory? | throwawaymaths wrote: | Especially for certain generics (e.g. hashmaps) you might want | to have different allocators without creating a whole separate | type: for example a global one, a threadlocal arena, etc. | username223 wrote: | > explicit constructors and destructors in C | | Exactly, and that's the right way to do it. In a language | without implicit destructors/finalizers, you need a way for | callers to say "okay, I'm done with this thing." 
And even with | GC, you need finalizers to take care of non-memory resources. | This may be clunky in C, but that's what you get in a language | that makes you be explicit. | Someone wrote: | This is unavoidable in any language that (supports dynamic memory | allocation and) moves dynamic memory allocation into a library. | | And it goes even further than the article claims: even functions | that allocate a flat structure on behalf of the caller and return | it should provide a companion function to free it. Reason is that | the caller and the called function might have a different idea | about what _the_ memory allocator is. That's rare on unixes, but | was reasonably common on Windows with cross-DLL calls | (https://codereview.stackexchange.com/questions/153559/safely...) | | Also, say a DLL function returns a char pointer containing a | string. How would you know whether to call _free_ or _delete_ on | it? Or, maybe, the equivalent of _free_ in Frob, the language | that DLL happens to be written in? | masklinn wrote: | > How would you know whether to call free or delete on it? | | By making ownership part of the API (and ABI). | | Sadly C is unable to express this, and thus so are FFI layers. | legalcorrection wrote: | olliej wrote: | I'm unsure why you're saying there's rust spam here? | | "lifetime" is not a rust only concept, syntactic lifetimes | might be, but the idea of C APIs specifying the lifetime of | a returned value is not new, novel, or rust specific. | | Many C APIs have SomeLibraryCreateObject(...), | SomeLibraryRetainObject(...) and | SomeLibraryReleaseObject(...) - or a more basic but less | flexible SomeLibraryCreateFoo() SomeLibraryFreeFoo(...). | | The important thing is that the API specifies the lifetime | of the returned value, idiomatic APIs do stuff like | "SomeLibraryGet..." does not transfer ownership, | "SomeLibraryCreate..." "SomeLibraryCopy..." etc do. 
| Generally this works more robustly with some variation of | refcounting, but you can be copy-centric and say "if you | want to keep this data call SomeLibraryCopy(..)". | KerrAvon wrote: | C is unable to express this as a machine-readable attribute, | but you can certainly document it as part of the API contract | and teach the FFI layer about it. This doesn't scale, but an | FFI layer rarely translates directly into the language's | idiom without some manual effort. | pjmlp wrote: | On Windows it is clear: memory allocated by DLLs belongs to | them, and should be deallocated by APIs exposed by them, and to | play safe you should use Win32 APIs for memory management and | not rely on the free()/malloc() provided by the compiler. | josephcsible wrote: | > the free()/malloc() provided by the compiler | | Nitpick: they're provided by the C runtime library. | pjmlp wrote: | Which on non-UNIX platforms means the C compiler that one | bought, not necessarily from the OS vendor, as libc isn't | traditionally part of the OS APIs. | josephcsible wrote: | I don't think there's a perfect 1:1 correspondence | between compilers and libc versions on Windows today, but | even if there were, it's still a distinction worth | making. For example, if two libraries both statically | link the C runtime, those count as separate runtimes (and so | a malloc in one paired with a free in the other will | wreak havoc), even if they're the exact same version. | cesarb wrote: | > and to play safe you should use Win32 APIs for memory | management | | Which ones? HeapAlloc/HeapFree? LocalAlloc/LocalFree? | GlobalAlloc/GlobalFree? CoTaskMemAlloc/CoTaskMemFree? | VirtualAlloc/VirtualFree? Something else? If the answer is | HeapAlloc/HeapFree, which heap? Should you enable the low- | fragmentation heap or not? | pjmlp wrote: | Doesn't matter to the caller, because they aren't supposed | to be clever and call any of them instead of the APIs | exposed by the respective DLL for resource management. 
| | The DLL authors should know best which APIs to call | internally in their own code. | dathinab wrote: | While it's not common to run into bugs because of it on Unix, | semantically it's a problem all the time. | | While multiple statically linked C libs normally use the same | allocator, the moment you link in any other language in any way | (static,.so) the guarantee is gone. | | So your `dart:ffi.allocate`, `C-malloc` and Rust | `std::alloc::alloc` might in the end all use different | allocators or might happen to use the same allocator, but as | long as you don't carefully control all parts involved then all | bets are off. | | And it can make a lot of sense to use different allocators in | FFI-libraries in some use cases (mainly as a form of | optimization). | messe wrote: | > This is unavoidable in any language that (supports dynamic | memory allocation and) moves dynamic memory allocation into a | library. | | Not necessarily. The Zig[1] standard library forces callers to | provide at runtime an allocator to each data structure or | function that allocates. Freeing is then handled either by | calling .deinit()--a member function of the data structure | returned (a standard convention)--or, if the function returns a | pointer, using the same allocator you passed in to free the | returned buffer. C's problem here is it doesn't have namespaces | or member functions, so there's a mix of conventions for what | the freeing function should be called. | | C++ allows this as well for standard library containers, | although I've rarely seen it used. | | > Also, say a DLL function returns a char pointer containing a | string. How would you know whether to call free or delete on | it? Or, maybe, the equivalent of free in Frob, the language | that DLL happens to be written in? | | I have to concede this one. I can't see a way out of this other | than documentation. 
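The allocator-passing convention described above for Zig, and the "let you inject malloc / free" option mentioned earlier in the thread, can be approximated in C with a small struct of function pointers. This is a sketch only: `struct allocator`, `str_dup_with`, `counting_alloc` and `counting_release` are hypothetical names, not part of any real library. The point is that the memory is released through the same allocator that produced it, so caller and library cannot disagree about which heap is in use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator interface supplied by the caller.
 * ctx carries allocator state (an arena, a counter, a heap handle, ...). */
struct allocator {
    void *(*alloc)(void *ctx, size_t size);
    void  (*release)(void *ctx, void *ptr);
    void  *ctx;
};

/* Hypothetical library function: duplicates s using the caller's allocator.
 * The result must be released through the same allocator. */
static char *str_dup_with(const struct allocator *a, const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = a->alloc(a->ctx, len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* One possible allocator: forwards to malloc/free and counts live blocks
 * in the int that ctx points to. */
static void *counting_alloc(void *ctx, size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        ++*(int *)ctx;
    return p;
}

static void counting_release(void *ctx, void *ptr)
{
    if (ptr != NULL) {
        --*(int *)ctx;
        free(ptr);
    }
}
```

A caller would build a `struct allocator` from `counting_alloc`, `counting_release` and a pointer to an `int`, pass it to `str_dup_with`, and later hand the string back via `release`. As thechao notes further down for C++, storing such a struct inside every container costs extra words per object; passing it as a parameter avoids that.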
| | [1]: https://ziglang.org/ | thechao wrote: | If `push_back()` & friends took an (optional) extra allocator | parameter, it'd be pretty ideal. It'd be nicer if the | implementations were forced to be single-word containers like | Stepanov wanted... | messe wrote: | std::vector can take an allocator as a template parameter | though? For a list, sure I can see that having a separate | allocator per element could work, but for a vector, surely | you'd want the same allocator for the entire range? | | EDIT: assuming we're talking about C++, if not please don't | hesitate to correct me. | thechao wrote: | Stateful allocators require a word be dedicated to the | allocator in the header. In my use cases, I _always_ have | access to the allocator, but I need to create a _lot_ of | containers -- most staying empty, to boot! Paying the | extra overhead is morally objectionable. | nine_k wrote: | You provide a memory allocation interface either way: it may | be a special function per type, it may be a generic | allocator. | ErikCorry wrote: | Languages with GC generally have much cleaner and simpler APIs. | chrsig wrote: | Usually. Even in languages with a GC, you get into a Bring Your | Own Buffer (BYOB) situation when trying to eke out performance. | | As in real life, one of the best ways to cut down on waste (GC | load) is to recycle. | Lammy wrote: | Ruby 3.1 provides a built-in middle ground for this use case: | https://docs.ruby-lang.org/en/master/IO/Buffer.html | ErikCorry wrote: | I'm super-reluctant to go this way because it's often a bug | factory. Once you are expecting the programmer to do manual | GC and detect when a buffer can be reused you are losing some | of the benefit. A lot of the time the answer should be a better | GC and a smarter compiler. I realize that in practice that's | not always available. 
| chrsig wrote: | I want to say that it should only be used after profiling | and determining that it's a major cause of GC pressure, | however I think there are other times when it's obvious | that a private scratch buffer would be appropriate. | | For example, when marshaling an object before writing it to | a file, it makes sense to write it to a scratch buffer before | writing it to the file. It's generally on the order of | trivial to keep said buffer encapsulated to prevent any | caller from dealing with potential pitfalls. | | Having clear ownership of the buffer is a big benefit to | help reduce any potential issues. You're correct that as a | GC approaches perfection, there ceases to be a need for it. ___________________________________________________________________ (page generated 2022-08-07 23:00 UTC)