[HN Gopher] Copying data is wasteful, mutating data is dangerous
___________________________________________________________________

Copying data is wasteful, mutating data is dangerous

Author : feross
Score  : 134 points
Date   : 2020-01-08 20:04 UTC (1 day ago)

(HTM) web link (pythonspeed.com)
(TXT) w3m dump (pythonspeed.com)

| capableweb wrote:
| Seems like "both libraries make copies of the data, which means
| you're using even more RAM." is an assumption (that copying data
| leads to more RAM usage). I'm guessing the libraries (haven't
| used them myself, nor Python, so not sure how feasible this is)
| can be implemented like a persistent data structure
| (https://en.wikipedia.org/wiki/Persistent_data_structure) and
| therefore get both immutability (which leads to fewer bugs) and
| not that high RAM usage. Clojure is a famous example of a
| language that implements persistent data structures.
| perfunctory wrote:
| > can be implemented like a persistent data structure
|
| You don't get much benefit from persistent structures if every
| element of an array is modified, as in the examples in the
| post.
| pipeep wrote:
| If the data structure has a reference count of one, you can
| safely mutate the data structure instead. Many persistent
| immutable data structure libraries use this as an additional
| optimization.
| capableweb wrote:
| I missed that, sorry. Thanks for clarifying!
| vmchale wrote:
| I think Futhark approaches this with uniqueness types. A bit
| strange (maybe that's just me?) but it works well with arrays.
|
| Also it compiles to Python nicely.
| xg15 wrote:
| This seems like a problem that would be suited for an optimizing
| compiler.
|
| You could imagine some hypothetical language in which _all_
| objects are immutable, as far as the programmer is concerned -
| however the compiler is allowed to reuse objects behind the
| scenes if there is provably no way that a side-effect could be
| introduced.
|
| E.g., in such a language, the first variant would be the only
| legal form to write the normalizing function:
|
|     low = array1.min()
|     high = array1.max()
|     array2 = array1 - low
|     array3 = array2 / (high - low)
|     return array3
|
| However, this could be safely turned into the following during
| compilation:
|
|     low = array1.min()
|     high = array1.max()
|     array2 = array1 - low
|     array2 /= (high - low)
|     return array2
|
| ... because array2 is not visible outside the function and
| therefore could not cause side-effects.
| [deleted]
| zzzeek wrote:
| It is theoretically possible that a library like numpy or
| pandas could do this directly. That is, if you take each of
| these transformation operations and store them, but don't
| actually run them until needed - that is, when someone accesses
| the individual array values - then you only need to make one
| copy, and the programmer doesn't have to worry about any of
| this.
|
| I was doing something similar for SQLAlchemy recently.
| SQLAlchemy is all about capturing Python operators and operands
| into data structures that represent intent and can be invoked
| later.
|
| edit: here is a demonstration:
| https://gist.github.com/zzzeek/caa4a7ed94f326fbbc031acecb9d7...
|
| edit edit: it is actually possible with numpy by using the Dask
| library: https://dask.org/
| vmchale wrote:
| You can use Futhark; it compiles to PyOpenCL and can then be
| called from Python fairly fluently.
|
| Accelerate is a nice example of an array library with a JIT
| backend.
| masklinn wrote:
| The problem with the library doing this is you end up with
| the same issue as in Haskell: you accumulate thunks, and both
| memory use and the location of CPU hotspots become much harder
| to predict and troubleshoot.
| zzzeek wrote:
| so... turn off the "thunking" feature while you are
| debugging.
| JeremyBanks wrote:
| That won't help when you're debugging the performance
| implications of thunking.
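The deferred-execution idea zzzeek describes above can be sketched in a few lines of Python: queue up in-place operations and run them against a single private copy only when the values are actually needed. This is a toy illustration, not Dask's or SQLAlchemy's actual machinery; the `Deferred` class and its method names are invented for this sketch.

```python
import numpy as np

class Deferred:
    """Records in-place operations and runs them lazily on one copy."""

    def __init__(self, array):
        self._array = array
        self._ops = []  # queued in-place operations, not yet executed

    def __isub__(self, x):
        # Record "subtract x in place" instead of doing it now.
        self._ops.append(lambda a: np.subtract(a, x, out=a))
        return self

    def __itruediv__(self, x):
        # Record "divide by x in place" instead of doing it now.
        self._ops.append(lambda a: np.divide(a, x, out=a))
        return self

    def compute(self):
        # One copy total, no matter how many operations were queued.
        result = self._array.copy()
        for op in self._ops:
            op(result)
        return result

a = np.array([0.0, 5.0, 10.0])
d = Deferred(a)
d -= a.min()
d /= a.max() - a.min()
print(d.compute())  # [0.  0.5 1. ] -- the original `a` is untouched
```

The caller pays for exactly one allocation at `compute()` time, which is the upside; the downside is exactly what masklinn notes below, namely that memory use and CPU cost now show up far from the lines that logically caused them.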
| zzzeek wrote:
| Comparing memory / call counts / time spent with thunking
| turned on vs. off would give a good idea of it.
|
| I can say that for my side of things thunking is a _super_
| huge performance gain, as it allows for quick gathering of
| expression intent, with the actual computation on the other
| side of a cache.
| jerf wrote:
| "You could imagine some hypothetical language in which all
| objects are immutable, as far as the programmer is concerned -
| however the compiler is allowed to reuse objects behind the
| scenes if there is provably no way that a side-effect could be
| introduced."
|
| There are functional languages that do this. Like many other
| optimizations, you get less than your intuition might naturally
| expect, but it does work.
|
| A trivial case is "tail call optimization", which most
| immediately involves not creating a literal in-memory stack
| record for recursive invocations, but can also easily involve
| the compiler reusing the memory for the arguments. IIRC Erlang
| will do that as much as it can in tight recursive calls.
| oefrha wrote:
| Alas, CPython isn't an optimizing compiler; it faithfully
| compiles code into bytecode which is then faithfully executed.
|
| I don't know the state of NumPy on PyPy since I don't use PyPy.
| anoncake wrote:
| It does have a peephole optimizer.
| masklinn wrote:
| numba.jit might be able to optimise the extra copies away
| though; I'm reasonably sure working with numpy arrays is a
| pretty significant use case.
| toast0 wrote:
| > You could imagine some hypothetical language in which all
| objects are immutable, as far as the programmer is concerned -
| however the compiler is allowed to reuse objects behind the
| scenes if there is provably no way that a side-effect could be
| introduced.
|
| Erlang/BEAM fits your hypothetical to some degree. Erlang the
| language has immutable variables; however, BEAM, the virtual
| machine, has mutable registers.
| Whether or not a given code sample results in a copy or a
| mutation depends on what information is available to the
| compiler, of course.
| 333c wrote:
| Why does this blog post have a recaptcha?
|
| In addition, I just finished reading the Rust book last night,
| and I don't believe this is what interior mutability is (at
| least as used in Rust).
| nhumrich wrote:
| I see no recaptcha.
| 333c wrote:
| Here's a screenshot: https://i.imgur.com/a/84Hb9Jq.png
| itamarst wrote:
| I'm using ConvertKit for newsletter signups, and the choices
| are either:
|
| 1. Validate emails with recaptcha (which is why you see that
| thing in the corner).
|
| 2. No email validation at all, so random people get spammed
| with confirm-you-want-to-subscribe emails.
|
| I am not super-happy with having to make this choice :(
| nayuki wrote:
| How I would summarize the article: mutation should be confined
| to internal objects that have not been published yet or are
| temporaries that will be thrown away.
|
| Theoretically, you can make every API mutate data. If the user
| didn't want to mutate data, they can explicitly make a copy
| first. Whereas if there only exists an immutable API that
| returns new objects, then it's hard or impossible to get the
| benefits of mutating in place.
|
| As far as I know, only Rust, C++, and C give support at the
| type-checking level to denote which functions will modify the
| interior of their arguments. This way, you can distinguish the
| behaviors easily - for example in Rust:
|
|     impl Vector {
|         // Reads this current vector and returns a new one.
|         pub fn normalized(&self) -> Vector { ... }
|
|         // Reads and modifies this current vector in place.
|         pub fn normalize(&mut self) { ...
|         }
|     }
|
| The article gave this subtly flawed example in Python:
|
|     def normalize_in_place(array: numpy.ndarray):
|         low = array.min()
|         high = array.max()
|         array -= low
|         array /= high - low
|
|     def visualize(array: numpy.ndarray):
|         normalize_in_place(array)
|         plot_graph(array)
|
|     data = generate_data()
|     if DEBUG_MODE:
|         visualize(data)
|     do_something(data)
|
| In Rust, it would be a type error because visualize() takes a
| shared reference but normalize_in_place() takes a mutable
| reference:
|
|     fn normalize_in_place(array: &mut numpy::ndarray) {
|         let low = array.min();
|         let high = array.max();
|         array -= low;
|         array /= high - low;
|     }
|
|     fn visualize(array: &numpy::ndarray) {
|         normalize_in_place(array);  // ERROR
|         plot_graph(array);
|     }
|
|     let data: numpy::ndarray = generate_data();
|     if DEBUG_MODE {
|         visualize(&data);
|     }
|     do_something(&data);
|
| masklinn wrote:
| > As far as I know, only Rust, C++, and C give support at the
| type-checking level to denote which functions will modify the
| interior of their arguments.
|
| 1. C and C++ allow casting away constness, which may or may not
| be UB depending on how the parameter is defined.
|
| 2. Swift also provides this ability to an extent: _struct_
| parameters have to be flagged `inout` to be mutable.
|
| I'm sure there are others.
| nwallin wrote:
| Rust can throw away constness in unsafe blocks, which carries
| the same caveat emptor that const_cast carries.
| steveklabnik wrote:
| Only in certain circumstances, though.
| gdxhyrd wrote:
| Casting away constness in C++ is something that is virtually
| never done, and it has to be done very explicitly.
| IshKebab wrote:
| This is just a straightforward memory optimisation; nothing to
| do with interior mutability.
| bfield wrote:
| Specifically on the point about Pandas: last time I checked,
| the inplace flag did not save any memory. Under the hood, it
| creates a temporary copy which is copied back to the original
| dataframe, and then the temporary gets removed on the next
| garbage collection cycle.
| If that is no longer the case, I would love to know about it
| though!
| perfunctory wrote:
| > But mutation is a cognitive burden: you need to think much
| harder about what your code is doing.
|
| One way to alleviate the problem is to use a naming convention:
|
|     normalize(a)       # mutates in place. note: no return value.
|     b = normalized(a)  # no mutation. returns a new instance. note the adjective.
|
| Python does this in the standard library:
|
|     a.sort()
|     b = sorted(a)
|
| FridgeSeal wrote:
| It's the convention to do this in Julia as well; every package
| I've used that mutates marks the explicitly mutating function
| with a ! at the end.
| cestith wrote:
| This is consistent with Ruby. It often includes both in the
| core. There's .gsub() and .gsub!() for text substitution in
| the String class, for example, or .select() and .select!()
| for the Hash class.
| anamexis wrote:
| Unfortunately it isn't completely consistent, for example
| Array#delete and String#concat.
| kwertzzz wrote:
| This is interesting to know, because Julia uses exactly the
| same approach[1] (now I know where it comes from). To me it
| seems to be the ideal solution: use the in-place
| normalize!(X) where performance is important, otherwise use
| the more convenient allocating variant normalize(X). The
| exclamation mark makes it clear that the array is modified.
|
| [1] https://docs.julialang.org/en/v1/base/collections/#Base.sum!
| fasquoika wrote:
| > (now I know where it comes from)
|
| Both Julia and Ruby got this from Scheme, and it might be
| even older than that.
| benibela wrote:
| I use this convention whenever I can.
|
| I spent far too much time yesterday figuring out why nothing
| was being replaced when I called replace on a string in
| JavaScript.
| jszymborski wrote:
| In-place operations give me the heebie-jeebies.
|
| I think this is something that is slightly less obvious when
| dealing with classes.
| I see stuff like this in academic code all the time:
|
|     class SomeDataObj:
|         def __init__(self, data):
|             self.data = data
|
|         def normalize(self):
|             self.data = normalize(self.data)
|
| I'd much prefer there to be a variable `self.data_normalized`
| which is initialized to None, or even better a getter
| `getNormalizedData()` that computes/caches a normalized
| variable.
| UncleEntity wrote:
| The pythonic way would be to have sort() and sorted(), which
| either sort in place or return a sorted copy. Best of both
| worlds.
| kburman wrote:
| I like Ruby's way of declaring whether a method mutates the
| object or not: an exclamation mark at the end tells you it
| modifies the object.
|
|     a = a.sort  # copies
|     a.sort!     # mutates
|
| chewxy wrote:
| I write and maintain a generic multidimensional array package
| for Go, and I ran into the same problems as the OP.
|
| The solution was to dictate that the API is copying, but also
| to provide ways to allow the user to manage their own memory.
|
| e.g.
|
|     a := tensor.New(tensor.WithShape(2,2), tensor.Of(Float64))
|     b, err := a.Apply(sigmoid)                      // copies
|     c, err := a.Apply(sigmoid, tensor.WithReuse(a)) // mutates
|
| Docs: https://godoc.org/gorgonia.org/tensor#Dense.Apply
| DrFell wrote:
| I have been mutating data for decades without any unexpected
| behavior gremlins.
| benibela wrote:
| That is how strings work in Delphi/Pascal.
|
| All strings have a reference count. When you change a string
| and the ref count is not one, it is copied before the change.
| If the ref count is one, it is not copied.
| lsb wrote:
| Does the Numba JIT do this?
| This definitely seems like a compiler optimization that I want
| to mechanically run on my code.
| zackmorris wrote:
| Another option is for the language's runtime to use
| copy-on-write (COW):
|
| https://en.wikipedia.org/wiki/Copy-on-write
|
| So basically every buffer can be made immutable and then only
| copied when written to, using something like refcounts and diff
| trees to internally track mutations from an immutable parent
| buffer. Languages like Clojure do this to manage state. I
| believe Redux also does this internally, although I've only
| studied it and haven't had a chance to use it professionally.
|
| In practice, this looks like a language where everything is
| value-based instead of reference-based.
|
| So, unlike in C# where you pass objects by reference, a COW
| language passes everything by const value. Then an imperative
| language statically analyzes the code to detect mutation (my
| own theory is that this can never be accomplished with 100%
| certainty, at least not with a simple implementation). So
| basically buffers act like Git instead of byte arrays, creating
| a new snapshot with every mutation.
|
| Or functional languages can disallow mutation altogether, or
| reduce the code to its most minimal representation and handle
| mutation at special breaks in execution (like monads). I'm
| grossly oversimplifying all of this and probably using the
| wrong terminology, but am more than 50% confident that I can
| derive what I'm talking about, so will just go with it hah.
|
| I think that the complex imperative implementation I'm talking
| about is probably Rust. The simple functional implementation
| would look like a pure immutable Lisp running within the Actor
| model, handling state only through IO, and dropping monads
| altogether.
| The catch is that complex functional code might have so many
| breaks in execution that they would practically be happening on
| every single line, and it would begin to look like an
| imperative language with all its flaws. This is the primary
| reason why I shy away from pure functional languages like
| Haskell: I don't think they have solved this core issue of
| mutation clearly enough (not to mention, I think functional
| language syntax is generally write-only and not readable by
| others or yourself in 6 months hahah). As far as I'm concerned,
| this is still an open problem.
|
| In another life, I want to make an imperative language that's
| primarily immutable, that uses higher-order functions (or
| better yet, statically analyzes code for side effects and
| converts foreach loops into higher-order functions
| automagically), and transpiles to a purely immutable Lisp. It
| would be associative array-based and look like Javascript. The
| idea being that we could write code in the human-readable
| imperative style and let the computer distill it down to its
| pure functional equivalent.
| dualogy wrote:
| > _The idea being that we could write code in the human-
| readable imperative style and let the computer distill it down
| to its pure functional equivalent_
|
| Here we are, folks pulling their hair for decades over how to
| most efficiently trans/compile higher-order FP idioms to
| minimal-overhead imperative code --- and you want to go the
| other way around!
|
| Kidding aside, the "functional sub-language" of cell-lang.net
| has a feature like this: functions are pure, but you get the
| imperative sugars (loops, branches) while preserving purity
| guarantees like referential transparency.
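The copy-on-write idea above (and pipeep's refcount-of-one observation earlier in the thread) can be crudely sketched in CPython: copy only when someone else can still observe the array. This is illustrative only - it relies on CPython's reference counting, is fragile under debuggers and across interpreter versions, and the function name `cow_normalized` is invented here.

```python
import sys
import numpy as np

def cow_normalized(array):
    """Return `array` scaled to [0, 1], copying only when necessary."""
    # sys.getrefcount counts the caller's binding, our parameter, and
    # its own temporary argument; a value above 2 means someone else
    # still references the array, so in-place mutation would be visible.
    if sys.getrefcount(array) > 2:
        array = array.copy()  # copy-on-write: someone is watching
    low, high = array.min(), array.max()
    array -= low              # safe: we own this buffer now
    array /= high - low
    return array

data = np.array([1.0, 2.0, 3.0])
result = cow_normalized(data)   # `data` is still referenced: copied
print(data)                     # unchanged: [1. 2. 3.]
# A temporary argument may be normalized in place instead of copied
# (CPython-version dependent); the result is the same either way.
print(cow_normalized(np.array([4.0, 8.0])))  # [0. 1.]
```

The extra references only ever make the count higher, so the sketch can over-copy but never wrongly mutate data the caller can still see, which is the property that matters.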
| tome wrote:
| > not to mention, I think functional language syntax is
| generally write-only and not readable by others or yourself in
| 6 months
|
| This is rather tangential, and sorry to hijack your message to
| get on my soapbox, but I and many others who program in Haskell
| and a dynamic language (Python in my case) find that Haskell
| code we've written is _far_ easier to come back to than code
| we've written in the dynamic language.
| mrkeen wrote:
| I'm surprised no-one has mentioned Haskell's ST library yet. It
| hits this exact use-case, and comes with some pretty sweet
| safeguards from the type system.
|
| https://wiki.haskell.org/Monad/ST
|
| It allows you to mutate arrays and other stuff "on the inside"
| like this article suggests, but it's also able to wrap up the
| operation so it appears pure from the outside, so the caller
| doesn't have to guess or rely on naming conventions to figure
| out the mutability.
|
| It's your choice whether you want to do the copy on the way in
| ('thawing'), on the way out ('freezing'), or both, by using the
| 'in-place' variants of thaw and freeze.
| alexhutcheson wrote:
| In other languages this is less of an issue, because function
| signatures can include information about whether the argument
| can be modified or not. For example, in C++:
|
|     // Definitely doesn't modify 'array', because 'array' is copied.
|     vector<int> Normalize(vector<int> array);
|
|     // Almost certainly doesn't modify 'array'.
|     vector<int> Normalize(const vector<int>& array);
|
|     // Probably *does* modify 'array' (otherwise the author would
|     // have used a const reference instead of a pointer.)
|     void Normalize(vector<int>* array);
|
| In Python every function uses the 3rd approach, so you have to
| signal to the user in some other way whether the argument will
| be modified or not. In Ruby and some other languages the
| convention is to append "!" to the function name, but in Python
| most people seem to just mention it in the docstring.
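Since Python signatures can't carry this information, the sort()/sorted() naming pattern mentioned upthread is the usual substitute. A small sketch of what that pairing might look like for the article's normalization example (these helpers are hypothetical, not from the article):

```python
import numpy as np

def normalize(array):
    """Scale `array` to [0, 1] *in place*; returns None, like list.sort()."""
    low, high = array.min(), array.max()
    array -= low
    array /= high - low

def normalized(array):
    """Return a new array scaled to [0, 1]; input untouched, like sorted()."""
    low = array.min()
    result = array - low         # the one unavoidable copy...
    result /= array.max() - low  # ...then mutate the private copy in place
    return result

a = np.array([2.0, 4.0, 6.0])
b = normalized(a)  # a unchanged, b is [0. 0.5 1.]
normalize(a)       # a is now [0. 0.5 1.]
```

Like list.sort(), the mutating variant deliberately returns None so that a caller who writes `b = normalize(a)` notices the mistake immediately.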
| anaphor wrote:
| There are also data structures such as zippers which are
| immutable but minimize the amount of data you need to copy, at
| the cost of potentially using more space overall. Zippers are
| cool because they can be generalized to many different types of
| data structures, as long as they can be expressed as algebraic
| data types.
| DecoPerson wrote:
| I'm not super familiar with interior mutability, but I don't
| think this is the same "interior mutability" as used in Rust.
|
| Interior mutability:
|
| A, immutable, points to B.
|
| B, immutable, points to C.
|
| C, mutable, points to D.
|
| D, immutable, points to E.
|
| Even though A and B are immutable -- B points to C mutably, so
| you can modify its contents. Some "interior" part of A (and of
| B) is mutable. You can't change which C is pointed to by B
| (because B is immutable, or rather we only have an immutable
| reference to it), but it's open season on the contents of C.
|
| What this article describes:
|
| We have 2 operations.
|
| We are passed in Array A.
|
| We can either:
|
| - Apply both operations to A, modifying it even though the
| "owner" may not want this.
|
| - Copy before each operation (or as part of them), which leaves
| A untouched, but requires triple the memory usage.
|
| - Copy before (or as part of) the first operation only. The
| second operation can be applied in-place on the copy. This
| leaves A untouched and only requires double the memory.
|
| They're totally different concepts. I don't know what you'd
| call the second concept -- I think the important part is that A
| remains untouched, so... "Avoiding side effects while reducing
| memory usage", or "How to make performant pure functions"?
| vnorilo wrote:
| Reminds me of copy-on-write. An operation can reuse its input
| buffer if it holds the sole reference to it. In the scenario
| you described, you manually ensure and write this behavior for
| the rest of the pipeline.
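DecoPerson's third option - copy once, then apply every subsequent operation in place on the private copy - generalizes to any pipeline of operations. A minimal sketch (the `apply_pipeline` helper is invented for this illustration):

```python
import numpy as np

def apply_pipeline(array, *inplace_ops):
    """Run a chain of in-place operations on a single private copy.

    The caller's array is untouched, and peak extra memory is one
    array (2x total) instead of one fresh array per step (3x or more).
    """
    work = array.copy()  # the only allocation in the whole pipeline
    for op in inplace_ops:
        op(work)         # each op mutates `work` in place
    return work

a = np.array([0.0, 5.0, 10.0])
low, high = a.min(), a.max()
b = apply_pipeline(
    a,
    lambda w: np.subtract(w, low, out=w),     # w -= low
    lambda w: np.divide(w, high - low, out=w) # w /= (high - low)
)
# b is [0. 0.5 1.]; a is still [0. 5. 10.]
```

From the caller's point of view the whole pipeline is a pure function; the mutation is confined to `work`, which no one else can see.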
| edflsafoiewq wrote:
| Yeah, this is just "don't create large temporaries
| unnecessarily". I'm not sure it warrants a name.
| codebje wrote:
| It does if you want research done so compilers can implement
| results in this space and optimise away unnecessary copies
| without introducing errors.
| DecoPerson wrote:
| I also see this as a bit of a premature optimization. The
| triple-memory-usage version may be significantly faster than
| the in-place one -- and the memory usage is temporary, so
| unless you're in a constrained environment it's not a concern.
|
| Avoiding unintentional or unexpected side effects is a good
| thing. Pure functions are great. Do whatever you can to keep
| your visualization helper thing pure and avoid bugs.
|
| Then, once it works, if you have performance issues, do some
| profiling.
|
| If the profiling shows this helper method as a hotspot, then
| it's worth optimizing.
|
| At that point, you'd want to follow standard Python, numpy, and
| pandas optimization strategies -- which I'm not familiar with.
|
| Aimless optimization like this is a waste of time. Good program
| structure (data is available when you need it without tonnes of
| calls), the right choice of data structures and algorithms
| (hashmap vs list), the right choice of underlying tools (Julia
| vs Python) and targeted optimization (profiling) are the ways
| to go.
| kardos wrote:
| My experience is that the in-place optimization can indeed
| speed up numpy code. Even if you're not memory constrained,
| fewer arrays in play means less memory bandwidth needed.
|
| >> Many NumPy APIs include an out keyword argument, allowing
| you to write the results to an existing array, often
| including the original one.
|
| I'd wager that saving memory or memory bandwidth is the
| impetus for expanding the API to include this option.
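The `out=` argument kardos quotes lets you direct each step's result into a preallocated buffer, so a multi-step pipeline allocates at most one scratch array instead of one per operation:

```python
import numpy as np

a = np.arange(4, dtype=np.float64)  # [0. 1. 2. 3.]
buf = np.empty_like(a)              # allocated once, reused every step

low, high = a.min(), a.max()
np.subtract(a, low, out=buf)        # buf = a - low, no fresh array
np.divide(buf, high - low, out=buf) # normalize buf in place
# buf now holds the normalized values; `a` is unchanged
```

Reusing one buffer this way also addresses the garbage-pressure complaint downthread: in a hot loop, `buf` can be allocated once outside the loop and overwritten on every iteration.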
| philipkglass wrote:
| The article starts with "You have a large chunk of data -- a
| NumPy array, or a Pandas DataFrame -- and you need to do a
| series of operations on it."
|
| So it's suggesting to use these techniques when you _already
| know_ that you are working with a large chunk of data, either
| because it was obvious from the beginning or because profiling
| highlighted it. In scientific computing it is common to store
| gigabytes of data in a single array. In fact, it is common that
| even the compromise solution suggested in this article, which
| leads to 2x memory instead of 3x, would be too much extra
| overhead.
|
| But the compromise solution is nice if you _can_ afford 2x
| memory overhead, so I upvoted the article.
| codebje wrote:
| The second notion, where it's clear that the memory space can
| safely be reused, is why there's interest in linear and affine
| type systems. These systems describe types whose values can be
| read at most once (affine) or, for linear types, exactly once.
|
| On the surface that can seem a bit stupid, but the practical
| outcome is that compilers can determine when an array's
| contents can be mutated in place rather than copied.
|
| A variable you read over and over can be thought of
| theoretically as one you duplicate, then read one copy of.
| Since the act of reading the copy destroys it, a compiler
| implements this by merely reading the value directly.
|
| Passing an array into a subroutine, however, means the compiler
| either copies it, or forbids further access after passing it
| in. If it's the latter, the memory space is available for
| re-use inside the subroutine.
|
| Rust uses affine type theory for its borrow system to provide
| memory safety. Functional language research is looking at
| linear and affine types for a variety of applications,
| including performance.
| chewbacha wrote:
| > B points to C mutably
|
| I think you mean that B points to D mutably, because C is
| mutable.
| The pointer from B to C is locked in place because of the
| immutability of B. However, C is free to change its pointer to
| some other D.
| masklinn wrote:
| That's also the impression I have; the article really has
| nothing to do with Rust's _interior_ mutability. And Rust
| wouldn't have the issue in the first place, because whether it
| takes the parameter by reference, mutable reference or value
| would clearly explain the "contract" to the caller.
|
| It does feel a bit closer to Clojure's _transients_, which I
| guess is what it's referring to.
|
| However there are also a number of differences with Clojure:
|
| * Clojure _can not_ completely prevent the caller modifying
| callee data (at least using "normal" Clojure collections,
| which are persistent)
|
| * creating and reifying transients is O(1)
|
| * the operation is a fairly trivial mapping of
| #(/ (- % low) (- high low)), which is likely what you'd have
| done in Clojure
| eyelidlessness wrote:
| This definitely made me think of Clojure transients. For
| reference: https://clojure.org/reference/transients
|
| > If a pure function mutates some local data in order to
| produce an immutable return value, is that ok?
|
| The idea in Clojure is that data is immutable _by default_,
| but there are function-local mutation APIs specifically for
| optimization. The contract between caller and callee is still
| the same - a function never mutates _your_ data - but
| transients provide improved performance and memory
| characteristics invisible to the caller.
| shpongled wrote:
| I think the easier way to think about Rust's interior
| mutability is via an example (Cell is a mechanism for interior
| mutability):
|
|     struct S {
|         a: Cell<u64>
|     }
|
|     struct T {
|         b: u64
|     }
|
| To mutate field `b` on `T`, we need a mutable object or
| reference. However, we can mutate field `a` on `S` via an
| immutable reference.
| Typically this is used to encapsulate some kind of interior
| state that isn't visible outside of the struct's methods, and
| it helps in certain situations where using a mutable reference
| might cause some hair-pulling due to the borrow checker.
| wlib wrote:
| Isn't this solved perfectly with an affine or linear type-aware
| optimizing compiler? Where all data is immutable unless the
| types prove to the compiler that there are no mutation
| conflicts. I believe Rust implements this, and Idris 2 does so
| as well with quantitative type theory.
| masklinn wrote:
| Rust would be very different and the problem wouldn't exist in
| the first place, because the function would signal its
| behaviour and intent through taking the parameter by reference,
| mutable reference or value.
| wlib wrote:
| That's what I meant.
| ed_balls wrote:
| Another option is to have a data structure that you freeze,
| then unfreeze for the hot path, and freeze again when you exit
| it.
| TheFuntastic wrote:
| As a game developer working in managed languages (aka Unity/C#)
| this problem is one of my biggest headaches. I was hoping the
| article would provide divine insight; unfortunately the
| recommended solution doesn't really solve the problem (as I see
| it).
|
| Whilst a 2n array alloc is better than 3n, both are creating
| short-lived memory objects. Do that in a hot code path and
| suddenly you've got tremendous garbage pressure your garbage
| collector is going to choke on.
|
| One can optimise this by calculating results into a cached
| array, but that creates opportunities for bugs (results of
| previous caches persisting). I would dearly love to see a
| solution that allows me to code in a functional style without
| sacrificing performance.
| C4stor wrote:
| On certain use cases, you can use a pool of objects: create
| long-lived objects that are drawn from and returned to a pool,
| reusing the memory efficiently with minimal overhead.
| Of course, you need to be able to size such a pool to begin
| with.
|
| Disclaimer: I have no experience with game development
| whatsoever!
| vmchale wrote:
| > I would dearly love to see a solution that allows me to code
| in a functional style without sacrificing performance.
|
| Uniqueness types would make mutation easier to track: you'd
| still need to write in an imperative style, but there'd be less
| to worry about.
|
| I think the "functional-style" stuff in this area uses
| compilers (i.e. Accelerate or Futhark) that can analyze whether
| copying is necessary, because the programmer uses various
| high-level functional combinators. I think J does something
| like this as well (?)
| Athas wrote:
| If you have precise reference counts (which is often
| achievable in array languages because they don't have
| cycles), then you can dynamically do efficient in-place
| updates if the array you are updating has a count of 1.
|
| Uniqueness types are only necessary if you want to make the
| in-place update a static _guarantee_ (which can be crucial if
| the code is going to run in an environment where allocation
| is not possible).
| itamarst wrote:
| Are you familiar with data-oriented design? It's the
| game-developer equivalent of the array processing scientific
| programmers do.
|
| I suspect if you look at how they do things you'll find some
| domain-specific design patterns.
|
| https://en.wikipedia.org/wiki/Data-oriented_design
| unlinked_dll wrote:
| > results of previous caches persisting
|
| You clear the buffer or capture it inside an optional type.
| When you're done with the buffer you can reset it to the
| default state to be used again.
|
| Most of the issues are logical bugs though, not flaws with the
| pattern. You just need to reason about the computation you're
| doing to preallocate exactly how much data you need outside
| your hot loops.
|
| You can read some audio source code if you want; this is an
| extremely common pattern.
| Preallocate for your computation up front, by exactly how much
| you need (and often with cache-friendly data structures), and
| don't get crafty reusing buffers where you don't need to.
| hderms wrote:
| IMO you have to allocate _something_ else if you are going with
| an immutable paradigm. You can do clever sharing of data
| structures, but ultimately if you're modifying the whole thing,
| you're modifying the whole thing, and sharing can only go so
| far. It seems like further advances in this space either
| require giving the compiler a better understanding of what is
| "actually" shared and in what way (i.e. Rust and knowing when
| mutations are externally visible), or making copying cheaper.
|
| Mutability is always (at least as far as current computer
| architectures are concerned) going to be faster in a general
| sense.
| vmchale wrote:
| > It seems like further advances in this space either require
| giving the compiler a better understanding of what is
| "actually" shared
|
| In the context of arrays, it helps when programmers use
| higher-order combinators/functions, e.g. maps.
|
| There is a mutable way to do a map, and an immutable way, and
| which should be used in the compiled program depends on
| analysis of where the variable is used.
| agumonkey wrote:
| Do game devs write custom allocators?
| danielscrubs wrote:
| Almost always (for big-budget ones).
| twiceaday wrote:
| If you can't find a good implementation for an interface,
| perhaps the problem is the interface. Inline the function,
| refactor, then express the result via re-usable components.
| Ultimately it is a problem of code re-use vs performance. If
| you want divine insight, it is probably to use lazy evaluation
| - so for the example in the article, express normalize as a
| generator.
| strictfp wrote:
| Arena-style allocation helps. You still get the allocations,
| but GC is cheap.
| IMO both Java and the CLR are in desperate need of multiple
| heaps with separate GCs to solve these problems. One big heap
| simply doesn't scale well. Erlang is the only example I know of
| that does this right. I know that the golang GC is low-latency,
| but I'm not familiar with whether it can sustain high
| allocation rates.
| lostcolony wrote:
| Nope.
|
| Golang still uses one global heap; it just allows for stack-
| based allocation in situations where it can do it (avoiding
| putting stuff on the global heap), so copies _can_ be cheap
| re: GC.
| jen20 wrote:
| The CLR also allows stack allocation, and many of the newer
| .NET Core libraries have been built with a focus on
| eliminating unnecessary allocations.
| perfunctory wrote:
| > To reduce memory usage, you can use in-place operations like
| +=
|
| btw, I think it's a design flaw in Python that `a += b` is not
| always equivalent to `a = a + b`.
| notduncansmith wrote:
| As someone who has never used Python seriously I have to
| ask... why not?
| masklinn wrote:
| As an optimisation, you can override += separately from +. A
| pretty well-known oddity related to this is:
|
|     >>> a = ([0],)  # a tuple (immutable) containing a list (mutable)
|     >>> a[0] = a[0] + [1]
|     Traceback (most recent call last):
|       File "<stdin>", line 1, in <module>
|     TypeError: 'tuple' object does not support item assignment
|     >>> a
|     ([0],)
|     >>> a[0] += [1]
|     Traceback (most recent call last):
|       File "<stdin>", line 1, in <module>
|     TypeError: 'tuple' object does not support item assignment
|     >>> a
|     ([0, 1],)
|
| Though to be fair, _technically_ nothing prevents "+" from
| mutating either operand (it'd just get you tarred and
| feathered).
| Sohcahtoa82 wrote:
| Wait... did it throw an exception, yet still add the item to
| the list?
|
| EDIT: Tried it myself. Sure enough, it throws the
| exception, but still adds the item. This seems like a bug.
|
| EDIT Part 2: I understand now. `a[0] += [1]` is internally
| translated to `a[0] = a[0].__iadd__([1])`.
| The right-hand side is evaluated first, mutating a[0] in place
| as expected. But then the re-assignment happens, which throws
| the exception, since you're trying to re-assign a value in a
| tuple, which is immutable.
| falkaer wrote:
| Python defines += via the __iadd__ method, which is only
| _sometimes_ mutating - i.e. depending on whether the object
| being assigned to is mutable or not.
| perfunctory wrote:
| Because I once learned that `a += b` is just a shortcut for
| `a = a + b`, just syntax sugar. Now I have to constantly
| remind myself that that's not the case in Python.
|
| edit: I might have misunderstood your question. If you meant
| "why is it not equivalent", please see falkaer's answer.
| im3w1l wrote:
| They aren't equivalent in C++ either.
| perfunctory wrote:
| My C++ is a bit rusty. Does it manifest itself in the
| standard library as well? The problem with Python (in my
| view) is that this behaviour is implemented in the standard
| library and therefore propagates to 3rd-party libraries by
| convention.
| roywiggins wrote:
| This seems more like "defensive copying" than "interior
| mutability" to me.
|
| Ideally you'd have a compiler that could perform optimizing
| magic to detect when the copy isn't needed, but that's not
| going to happen in CPython.
___________________________________________________________________
(page generated 2020-01-09 23:00 UTC)