[HN Gopher] The Windows malloc() implementation from MSVCRT is slow
       ___________________________________________________________________
        
       The Windows malloc() implementation from MSVCRT is slow
        
       Author : blackhole
       Score  : 143 points
       Date   : 2022-07-02 20:08 UTC (2 hours ago)
        
 (HTM) web link (erikmcclure.com)
 (TXT) w3m dump (erikmcclure.com)
        
       | moonchild wrote:
       | > it basically represents control flow as a gigantic DAG
       | 
       | Control flow is not a DAG.
        
         | tick_tock_tick wrote:
         | For it to be a DAG you'd have to solve the halting problem,
         | wouldn't you?
        
         | spatulon wrote:
         | You're not wrong.
         | 
         | I guess they're just trying to say that LLVM's control-flow
         | graph is implemented as individually heap-allocated objects for
         | nodes, and pointers for edges. (I haven't looked at the LLVM
         | code, but that sounds plausible).
         | 
         | Even if those allocations are fast on Linux/Mac, I wonder
         | whether there are other downsides of that representation, for
         | example in terms of performance issues from cache misses when
         | walking the graph. Could you do better, e.g. with a bump
         | allocator instead of malloc? But who knows, maybe graph
         | algorithms are just inherently cache-unfriendly, no matter the
         | representation.
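
A minimal sketch, in C, of the bump-allocator idea raised above: graph nodes are carved out of one contiguous block instead of being malloc'd one by one, so nodes created together sit next to each other in memory. The names `Arena`, `arena_alloc`, `Node` and `node_new` are hypothetical and are not taken from LLVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical arena: one big block; each allocation just advances an offset. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} Arena;

/* Returns 1 on success, 0 if the backing block could not be allocated. */
int arena_init(Arena *a, size_t capacity) {
    a->base = malloc(capacity);
    a->capacity = capacity;
    a->used = 0;
    return a->base != NULL;
}

/* Bump allocation. align must be a power of two no larger than malloc's alignment. */
void *arena_alloc(Arena *a, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (a->used + align - 1) & ~(align - 1);  /* round offset up */
    if (start > a->capacity || size > a->capacity - start)
        return NULL;                                      /* arena exhausted */
    a->used = start + size;
    return a->base + start;
}

/* Individual objects are never freed; the whole arena is released at once. */
void arena_destroy(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}

/* Hypothetical graph node with up to two successors, e.g. a basic block. */
typedef struct Node {
    int id;
    struct Node *succ[2];
} Node;

Node *node_new(Arena *a, int id) {
    Node *n = arena_alloc(a, sizeof *n, _Alignof(Node));
    if (n != NULL) {
        n->id = id;
        n->succ[0] = NULL;
        n->succ[1] = NULL;
    }
    return n;
}
```

Edges are still pointers here, so walking the graph can still miss the cache. The sketch only removes the per-node allocator call and keeps nodes densely packed.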
        
         | pshirshov wrote:
         | Well, why?
        
           | AshamedCaptain wrote:
           | Directed _Acyclic_ Graph. Control flow graph has loops.
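
To make the point concrete, here is a small C sketch that runs a depth-first search over two toy control-flow graphs. The `Cfg` type and the two example graphs are made up for illustration: a `while` loop produces a back edge (so the graph is not a DAG), while an `if`/`else` diamond does not.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 8

/* Hypothetical CFG: succ[b][0..nsucc[b]-1] are the successors of basic block b. */
typedef struct {
    int nblocks;
    int nsucc[MAX_BLOCKS];
    int succ[MAX_BLOCKS][2];
} Cfg;

enum { WHITE, GREY, BLACK };  /* unvisited, on the DFS stack, finished */

static bool visit(const Cfg *g, int b, int *color) {
    color[b] = GREY;
    for (int i = 0; i < g->nsucc[b]; i++) {
        int s = g->succ[b][i];
        if (color[s] == GREY)
            return true;                       /* back edge: a cycle */
        if (color[s] == WHITE && visit(g, s, color))
            return true;
    }
    color[b] = BLACK;
    return false;
}

/* Returns true if the graph contains a directed cycle. */
bool cfg_has_cycle(const Cfg *g) {
    int color[MAX_BLOCKS] = {0};
    assert(g->nblocks <= MAX_BLOCKS);
    for (int b = 0; b < g->nblocks; b++)
        if (color[b] == WHITE && visit(g, b, color))
            return true;
    return false;
}

/* while loop: entry(0) -> header(1); header -> body(2) or exit(3); body -> header */
Cfg make_loop_cfg(void) {
    Cfg g = { .nblocks = 4,
              .nsucc = {1, 2, 1, 0},
              .succ = {{1}, {2, 3}, {1}, {0}} };
    return g;
}

/* if/else diamond: entry(0) -> then(1) or else(2); both -> join(3) */
Cfg make_diamond_cfg(void) {
    Cfg g = { .nblocks = 4,
              .nsucc = {2, 1, 1, 0},
              .succ = {{1, 2}, {3}, {3}, {0}} };
    return g;
}
```

`cfg_has_cycle` reports a cycle for the loop graph and none for the diamond.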
        
         | 3836293648 wrote:
         | Pretty sure they mean the AST is a DAG
        
       | evmar wrote:
       | The other inaccuracies in this article have already been covered.
       | I noticed there was also a weird rant about mimalloc in there
       | ("For some insane reason, mimalloc is not shipped in Visual
       | Studio").
       | 
       | My understanding is mimalloc is basically a one-person project[1]
       | from an MSR researcher in support of his research programming
       | languages. It sounds like it's pretty nice, but I also wouldn't
       | expect it to be somehow pushed as the default choice for Windows
       | allocators.
       | 
       | [1]: https://github.com/microsoft/mimalloc/graphs/contributors
        
       | bjourne wrote:
       | Well... Who told you to link to MSVCRT (the one in System32)? Not
       | Microsoft that's for sure. New software is supposed to link to
       | the Visual Studio C runtime it was compiled with and then ship
       | that library alongside the application itself. Even if you don't
       | compile with VS you can distribute the runtime library (freely
       | downloadable from some page on microsoft.com). Ostensibly, that
       | library contains an efficient malloc. If you willingly link to
       | the MSVCRT Microsoft for over a decade has stated is deprecated
       | and should be avoided you are shooting yourself in the foot.
       | 
       | "Windows is not a Microsoft Visual C/C++ Run-Time delivery
       | channel" https://devblogs.microsoft.com/oldnewthing/20140411-00/
        
         | spatulon wrote:
         | I suspect they are actually talking about the
         | modern/redistributable C/C++ runtime, not the old msvcrt.dll.
         | 
         | Microsoft refer to the modern libraries as the "Microsoft C and
         | C++ (MSVC) runtime libraries", so shortening that to MSVCRT
         | doesn't seem unreasonable.
        
         | sterlind wrote:
         | there's a weird licensing thing for the VC runtime where you
         | can't redistribute the dll alongside your code unless you have
         | a special license. instead, you have to install it as an MSI. I
         | have no idea why they created that restriction. I even work for
         | MS and it baffles me.
        
           | londons_explore wrote:
           | At one point MS was very keen for people to use MSIs rather
           | than building custom installers that jam things into
           | system32. Perhaps that was why.
        
       | bcbrown wrote:
       | Seeing someone refer to any piece of software technology as a
       | "trash fire" makes it harder for me to view them as credible.
       | It's unnecessarily divisive and insulting, and it means it's
       | unlikely they will have any appreciation of the tradeoffs present
       | during initial design and implementation.
        
         | dang wrote:
         | We've replaced the baity wording with more representative
         | language from the article, in keeping with the HN guideline: "
         | _Please use the original title, unless it is misleading or
         | linkbait; don't editorialize._"
         | 
         | https://news.ycombinator.com/newsguidelines.html
        
       | barrkel wrote:
       | Windows doesn't have a malloc. The API isn't libc like
       | conventional Unix and shared libraries on Windows don't generally
       | expect to be able to mutually allocate one another's memory.
       | Msvcrt as shipped is effectively a compatibility library and a
       | dependency for people who want to ship a small exe.
        
         | qsdf38100 wrote:
         | Note that Windows has HeapAlloc and HeapFree, which provide all
         | the functionality to trivially implement malloc and free.
         | 
         | The C runtime is doing exactly that, except it adds a bit of
         | bookkeeping on top of it IIRC. And in debug builds it adds
         | support for tracking allocations.
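
A rough sketch of the idea, assuming the thinnest possible wrapper: on Windows, `my_malloc` and `my_free` (hypothetical names) forward to `HeapAlloc` and `HeapFree` on the default process heap. The non-Windows branch is only a fallback so the example compiles elsewhere. A real C runtime layers more on top, such as the bookkeeping and debug tracking mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>

/* Thin malloc-like wrapper over the default process heap. */
void *my_malloc(size_t size) {
    return HeapAlloc(GetProcessHeap(), 0, size);
}

/* Like free(): passing NULL is a no-op. */
void my_free(void *p) {
    if (p != NULL) {
        BOOL ok = HeapFree(GetProcessHeap(), 0, p);
        assert(ok);   /* failure here means p did not come from this heap */
        (void)ok;     /* silence unused warning when NDEBUG is defined */
    }
}

#else
#include <stdlib.h>

/* Fallback so the sketch builds on non-Windows systems; not the point of the example. */
void *my_malloc(size_t size) { return malloc(size); }
void my_free(void *p) { free(p); }
#endif
```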
        
       | pjmlp wrote:
       | There is no Windows malloc(). Only UNIXes have the C API as part
       | of the OS API.
        
         | jart wrote:
         | malloc() isn't part of the Linux API which provides mmap().
        
           | pjmlp wrote:
           | Since we are getting pedantic, Linux isn't a UNIX.
        
             | jart wrote:
             | I thought it was. Some distro (I forget the name) paid to
             | be certified with The Open Group.
        
             | chrisseaton wrote:
             | > Linux isn't a UNIX
             | 
             | I think this isn't quite right - I think some distributions
             | are actually certified as UNIX.
             | 
             | https://www.opengroup.org/openbrand/register/
        
             | CyberDildonics wrote:
             | It's absurd you would call someone pedantic for saying
             | malloc is in a library on Linux after trying to say that
             | malloc is in a library on Windows.
        
           | plorkyeran wrote:
           | Libc being just a library is indeed one of the ways that
           | Linux is unlike Unix.
        
             | leajkinUnk wrote:
             | What do you mean by "Unix"? Are you talking about some
             | specific Unix version, or is there something in the POSIX
             | spec that says that libc isn't a library?
        
               | naniwaduni wrote:
               | It's not that libc is supposed to not be a library, but
               | those functions _are the POSIX-defined interfaces_ to the
               | OS. Linux is unusual in that it defines its stable
               | interfaces in terms of the syscall ABI, enabling
               | different implementations of the libc that can work semi-
               | reliably across kernel versions.
        
       | rayiner wrote:
       | I wonder how much of this is the development culture at MS.
       | https://www.theregister.com/2022/05/10/jeffrey_snover_said_m...
       | ("When I was doing the prototype for what became PowerShell, a
       | friend cautioned me saying that was the sort of thing that got
       | people fired.")
       | 
       | In that environment I can imagine nobody wants to be on the hook
       | for messing with something fundamental like malloc().
       | 
       | The complete trash fire that is O365 and Teams--for some reason
       | the new Outlook kicks you out to a web app just to manage your
       | todos--suggests to me that Microsoft may be suffering from a
       | development culture that's more focused on people protecting
       | fiefdoms than delivering the best product. I saw this with Nortel
       | before it went under. It was so sclerotic that they would
       | outsource software development for their own products to third
       | party development shops because there was too much internal
       | politics to execute them in house.
        
         | munch117 wrote:
         | You shouldn't read too much into the PowerShell story. Creating
         | your own programming language is in most cases a frivolous
         | vanity project. Spending company resources on your own
         | frivolous vanity projects is the sort of thing that can get you
         | fired.
        
           | sterlind wrote:
           | ! I disagree.
           | 
           | CMD badly needed replacing. MS needed a new shell language. A
           | functional company would connect people with a passion for X
           | with the resources to achieve X, if X has a chance of helping
           | the company.
           | 
           | Windows Terminal and WSL show how far MS has come from the PS
           | days.
           | 
           | (Disclaimer: I work for MS)
        
             | londons_explore wrote:
             | I think the smart move would have been to make an official
             | port of bash...
        
               | hughw wrote:
               | Maybe that's how we got WSL
        
             | 13of40 wrote:
             | The way I remember it, the need for a new shell language
             | for system administration was something that lots of people
             | in Windows Server were trying to solve. Ballmer talked
             | about it, we had a push to add a handful of new command
             | line tools (like tasklist.exe I think) that you could use
             | under CMD, and there was a proof of concept where MMC could
             | be used to output some kind of macro language when users
             | did things in the UI. PowerShell was the thing that
             | eventually won, and I think it was largely because it stood
             | on the shoulders of .Net so had a ton of capability right
             | out of the gate. (And TBH, I think it's a little bit weird
             | that we have this mythos today where Snover sat down at his
             | computer one morning and invented it out of thin air, when
             | even the v1 feature team had something like 30 engineers
             | and PMs on it.)
        
         | sterlind wrote:
         | (I work for MS, though in core Azure rather than Office or
         | Windows.)
         | 
         | I think that PowerShell story was how old MS worked, back in
         | the days of stack ranking, hatred of Linux and the Longhorn
         | fiasco. things inside the company are a lot more functional
         | now. I saw internal politics drama at my first position, but
         | once I moved everything was chill, and experimentation and
         | contributing across team boundaries was actively encouraged and
         | rewarded.
         | 
         | I suspect Office suffers from a ton of technical debt, along
         | with being architecturally amorphous and dating from a pre-
         | cloud era. as for Windows, the amount of breakage I see in the
         | betas suggests they're not afraid of making deep changes, it's
         | probably that MSVCRT is a living fossil and has to support old
         | programs monkeypatching the guts of malloc or something.
        
           | lkfjasdlkjfsad wrote:
           | any idea what the hell is going on with Teams?
           | 
           | why can't i simply scroll up in my own conversations? let
           | alone search them. the sticky sludge of communication in
           | something as simple as chat has cost me hours since i was
           | forced to use teams. outlook search is so superior to teams
           | i'd easily prefer to have lync back. this one thing
           | absolutely cripples communication. there are a list of other
           | very basic issues that make communicating code blocks
           | frustrating. i see new app features here and there, i saw
           | some feature added the other day which won't help anyone. i
           | just don't understand the prioritization of issues
           | 
           | i don't expect a direct answer to this, although i hope to
           | read an explanation one day
           | 
           | EDIT: i removed content from this comment that was missing
           | the point
        
           | rayiner wrote:
           | It was interesting to see them switch back to Win32 after
           | all of these greenfield alternatives that quickly died (WPF,
           | WinRT, etc.). Makes you wonder what was going on during that
           | time. Contrast Apple, which has stuck with Cocoa, itself an
           | evolution of NeXTSTEP.
        
       | fguerraz wrote:
       | "Don't use spinlocks in user-land."
        
       | denkshom wrote:
       | This rant was rather devoid of relevant technical detail.
       | 
       | I mean, why exactly is the malloc of the compatibility msvcrt so
       | slow compared to newer allocators? What is it doing?
       | 
       | An analysis of that would have been some actual content of
       | interest.
        
       | DHowett wrote:
       | I'm curious whether the "new"(ish) segment heap would address
       | some of the author's issues.
       | 
       | It's poorly documented, so I can't find a reference explaining
       | what it is on MSDN save for a snippet on the page about the app
       | manifests[1]. There's some better third-party "documentation"[2]
       | that gets into some specifics of how it works, but even that is
       | light on the real-world operational details that would be helpful
       | here.
       | 
       | Chrome tried it out and found[3] it to be less than suitable due
       | to its increased CPU cost, which might presage what Erik would
       | see if they enabled it.
       | 
       | [1] https://docs.microsoft.com/en-
       | us/windows/win32/sbscs/applica...
       | 
       | [2] (PDF warning)
       | https://www.blackhat.com/docs/us-16/materials/us-16-Yason-Wi...
       | 
       | [3]
       | https://bugs.chromium.org/p/chromium/issues/detail?id=110228...
        
         | MarkSweep wrote:
         | The one other piece of "documentation" that I know of is this
         | blog post:
         | 
         | https://blogs.windows.com/windowsexperience/2020/05/27/whats...
         | 
         | It mentions that the segment heap is used by default for UWP
         | apps and reduces memory usage of Edge.
        
       | chrisseaton wrote:
       | So why is it a trash fire? It's just slow? Or is there something
       | else wrong with it? I thought the author was going to say it did
       | something insane or was buggy somehow.
        
         | Someone wrote:
         | Also, is it slow because it's badly implemented, or is it
         | better than other mallocs in some other respect? Maybe, dating
         | from decades ago, it's better in the memory usage front?
        
       | KerrAvon wrote:
       | Has everyone forgotten that Unix is the common ancestor of Linux
       | and every other Unixlike? I'm seeing an uptick of people writing
       | nonsensical comments like "this was written for Linux (or Mac OS
       | X, which implements POSIX and is therefore really Linux in
       | drag)".
        
         | jchw wrote:
         | No... That's why they had the parenthetical. The problem is,
         | your computer probably doesn't boot the common ancestor. If
         | you're writing UNIX-like stuff, most likely it boots macOS or
         | Linux. If you're cool maybe it's one of the other modern BSD
         | variants aside macOS. In practice there's a pretty low
         | probability that your code also runs on all POSIX-compliant
         | operating systems, and more honest/experienced people often
         | don't kid themselves into thinking that they're seriously
         | targeting that. Even if you believe it, you probably have some
         | dependency somewhere that doesn't care, like Qt for example.
         | Saying something like "Linux (or macOS, which is similar)" is a
         | realization that you're significantly more likely to be
         | targeting both Linux and macOS than you are to even test on
         | BSD. And to solidify that point, note that lots of modern CI
         | platforms don't even have great BSD support to begin with.
         | 
         | Of course, there is a semantic point here. macOS nominally
         | really _is_ UNIX, except for when someone finds out it's not
         | actually POSIX compliant due to a bug somewhere every year or
         | so. Still, it IS UNIX. But what people mostly run with that
         | capability, is stuff that mostly targets Linux. So... yeah.
         | 
         | Of course it is true that some people really think macOS is
         | actually Linux, but that misunderstanding is quite old by this
         | point.
         | 
         | addendum: I feel like I haven't really done a good job putting
         | my point across. What I'm really saying is, I believe most
         | developers targeting macOS or Linux today only care about POSIX
         | or UNIX insofar as they result in similarities between macOS
         | and Linux. That macOS is truly UNIX makes little difference; if
         | it happened to differ in some way, developers would happily
         | adjust to handle it, just like they do for Linux which
         | definitely isn't UNIX.
        
         | naniwaduni wrote:
         | Well, a pretty big part of the point of Linux is that it's
         | _not_ a Unix-descendant, just a Unix-clone.
        
           | pcl wrote:
           | What's the distinction between those two?
        
             | asgeir wrote:
             | Part of it is whether the code can be traced back to
             | original AT&T code. Which would be true for e.g. BSD
             | variants (which includes MacOS).
             | https://i.redd.it/kgv4ckmz3zb51.jpg
             | 
             | Another part is the trademark and certification fee.
             | https://www.opengroup.org/openbrand/Brandfees.htm
        
             | sterlind wrote:
             | Darwin forked BSD, and BSD is a fork of the original Unix
             | source. Linux is a fresh implementation of POSIX, and
             | doesn't directly inherit any code from Unix.
             | 
             | Linux is as much Unix as WSL1 was Linux - i.e. not at all,
             | just clones.
        
             | dwheeler wrote:
             | "Unix" implies paying annual fees for use of the "Unix"
             | trademark, and/or at least direct descent from the original
             | Unix code.
             | 
             | According to: https://kb.iu.edu/d/agat "To use the Unix
             | trademark, an operating system vendor must pay a licensing
             | fee and annual trademark royalties to The Open Group.
             | Officially licensed Unix operating systems (and their
             | vendors) include macOS (Apple), Solaris (Oracle), AIX
             | (IBM), IRIX (SGI), and HP-UX (Hewlett-Packard). Operating
             | systems that behave like Unix systems and provide similar
             | utilities but do not conform to Unix specification or are
             | not licensed by The Open Group are commonly known as Unix-
             | like systems."
             | 
             | Many will include the *BSDs as a Unix, because their code
             | _does_ directly descend from the original Unix code. But
             | Linux distros generally do not meet either definition of
             | "Unix".
        
         | avgcorrection wrote:
         | I just call Linux+Mac+BSDs Unix (not "Unix-like" and certainly
         | not that "*nix" nonsense). I don't respect Unix enough to be
         | perfectly precise with it.
        
         | copperx wrote:
         | Apparently yes, because all I ever hear is "macOS is like
         | Linux" and even "macOS is really Linux behind the scenes" from
         | less enlightened people.
        
       | Sesse__ wrote:
       | Just wait until you try to use it from multiple threads at the
       | same time!
        
       | TonyTrapp wrote:
       | Not that it helps here, but Microsoft never considered the MSVCRT
       | that ships with Windows to be public API. This is not the
       | "Windows allocator", this is the (very) old MSVC runtime
       | library's allocator. Of course that doesn't keep anyone from
       | using this library because it's present on any Windows system,
       | unlike the newer MSVC versions' runtime library. Using the
       | allocator from a later MSVC's runtime library would provide much
       | better results, as would writing a custom allocator on top of
       | Windows' heap implementation.
       | 
       | MSVCRT basically just exists for backwards compatibility. It's
       | impossible to improve this library at this point.
        
         | garaetjjte wrote:
         | >but Microsoft never considered the MSVCRT that ships with
         | Windows to be public API
         | 
         | It was in the past. At first msvcrt.dll was the runtime library
         | used up to Visual C++ 6. Later, VC++ moved to their own
         | separate dlls, but you could still link with system msvcrt.dll
         | using corresponding DDK/WDK up to Windows 7.
         | 
         | I'm also not sure that this is just ancient library left for
         | compatibility, some system components still link to it, and
         | msvcrt.dll itself seems to link with UCRT libraries.
        
           | TonyTrapp wrote:
           | > It was in the past. At first msvcrt.dll was the runtime
           | library used up to Visual C++ 6.
           | 
           | At that time it was already a big mess, because at first it
           | was the runtime library of Visual C++ 4 in fact! The gory
           | details are here: https://devblogs.microsoft.com/oldnewthing/
           | 20140411-00/?p=12...
           | 
           | > some system components still link to it
           | 
           | Some system components themselves are very much ancient and
           | unmaintained and only exist for backwards compatibility as
           | well.
        
             | garaetjjte wrote:
             | Ancient or not, I don't think it really matters for
             | allocation performance: malloc in both msvcrt.dll and
             | ucrtbase.dll after some indirection ends up calling
             | RtlAllocateHeap in ntdll.dll
        
         | Sesse__ wrote:
         | Win32 has an allocator (HeapAlloc), and it is similarly slow
         | and low-concurrent. Even if you enable the newer stuff like
         | LFH.
        
         | jart wrote:
         | It's effectively mandatory. Microsoft provides about twelve
         | different C Runtimes. But if you're building something like an
         | open source library, you can't link two different C runtimes
         | where you might accidentally malloc() memory with one and then
         | free() with the other. If you want to be able to pass pointers
         | around your dynamic link libraries, you have to link the one C
         | runtime everyone else uses, which is MSVCRT. Also worth
         | mentioning that on Windows 10 last time I checked ADVAPI32
         | links MSVCRT. So it's pretty much impossible to not link.
        
           | TonyTrapp wrote:
           | It isn't mandatory. I have never actively linked against
           | MSVCRT on Windows. From my experience it's mostly software
           | that isn't built with Visual Studio that uses MSVCRT, or
           | software that takes extreme care of its binary size
           | (e.g. 64k intros). MSVCRT is not even an up-to-date C runtime
           | library. You wouldn't be able to use it for writing software
           | requiring C11 library features without implementing them
           | somewhere on top of it.
           | 
           | It's true that you cannot just happily pass pointers around
           | and expect someone else to be able to safely delete your
           | pointer - but that is why any serious library with a C
           | interface provides its own function to free objects you
           | obtained from the library. Saying that this is impossible
           | without MSVCRT implies that _every_ software needs to be
           | built with it, which is not even remotely the case. If I
           | wanted, I could build all the C libraries I use with LLVM and
           | still link against them in my application compiled with the
           | latest MSVC runtime or UCRT.
           | 
           | The much bigger problem is mixing C++ runtimes in the same
           | piece of software, there you effectively must guarantee that
           | each library uses the same runtime, or chaos ensues.
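
The "library provides its own free function" pattern can be sketched in C as follows. The `Greeting` type and the `greeting_*` functions are hypothetical. The point is that allocation and deallocation both happen inside the library, so the caller's C runtime never has to free memory that another runtime allocated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Public header would expose only this opaque type and the three functions. */
typedef struct Greeting Greeting;

/* Private to the library: callers never see the layout. */
struct Greeting {
    char *text;
};

/* Allocates with the library's own runtime. Returns NULL on failure. */
Greeting *greeting_create(const char *name) {
    assert(name != NULL);
    static const char prefix[] = "hello, ";
    Greeting *g = malloc(sizeof *g);
    if (g == NULL)
        return NULL;
    /* sizeof prefix already counts the terminating NUL */
    g->text = malloc(sizeof prefix + strlen(name));
    if (g->text == NULL) {
        free(g);
        return NULL;
    }
    memcpy(g->text, prefix, sizeof prefix - 1);
    strcpy(g->text + sizeof prefix - 1, name);
    return g;
}

const char *greeting_text(const Greeting *g) {
    return g->text;
}

/* Frees with the same runtime that allocated. Callers use this, never free(). */
void greeting_destroy(Greeting *g) {
    if (g == NULL)
        return;
    free(g->text);
    free(g);
}
```

A caller built against a different runtime would use it as `Greeting *g = greeting_create("hn"); ... greeting_destroy(g);` and never pass `g` or its text to its own `free`.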
        
             | kazinator wrote:
             | If you're writing in C++ on Windows, expose only COM (or at
             | least COM-style) interface called through virtual functions
             | on an object pointer. Then you can use whatever C++ run-
             | time you want, internally. What you don't want is the other
             | library calling C++ functions by name. Like you pass it
             | some ostream object and it calls ostream::put or whatever,
             | where that symbolically resolves to the wrong one.
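
The same shape can be shown in plain C, where a COM-style interface is just an object pointer whose first member points to a table of function pointers. This is a sketch under that assumption: `Counter`, `CounterVtbl` and `counter_create` are hypothetical names, and real COM adds reference counting and `QueryInterface` on top. The caller reaches everything through the table, so no runtime function is ever resolved by name across the library boundary.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Counter Counter;

/* The interface: a table of function pointers, fixed by the library's ABI. */
typedef struct {
    int  (*increment)(Counter *self);
    void (*release)(Counter *self);
} CounterVtbl;

/* All a caller may rely on: the object starts with a vtable pointer. */
struct Counter {
    const CounterVtbl *vtbl;
};

/* Private implementation; base must be the first member. */
typedef struct {
    Counter base;
    int value;
} CounterImpl;

static int impl_increment(Counter *self) {
    assert(self != NULL);
    CounterImpl *impl = (CounterImpl *)self;
    return ++impl->value;
}

/* Memory is released by the library that allocated it. */
static void impl_release(Counter *self) {
    free(self);
}

static const CounterVtbl impl_vtbl = { impl_increment, impl_release };

/* The single exported entry point. */
Counter *counter_create(void) {
    CounterImpl *impl = malloc(sizeof *impl);
    if (impl == NULL)
        return NULL;
    impl->base.vtbl = &impl_vtbl;
    impl->value = 0;
    return &impl->base;
}
```

Callers write `c->vtbl->increment(c)` and `c->vtbl->release(c)`, which is roughly what a C++ virtual call compiles down to.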
        
           | plonk wrote:
           | Our programs ship their DLL dependencies in their own
           | installer anyway, like most others on Windows. Just ship your
           | FOSS library with a CMake configuration and let the users
           | build it with whatever runtime they want.
        
           | kazinator wrote:
           | C applications targeting Windows must provide their own C
           | library with malloc and free (if they are using the "hosted
           | implementation" features of C).
           | 
           | MSVCRT.DLL isn't the library "everyone" uses; just Microsoft
           | programs, and some misguided freeware built with MinGW.
           | 
           | Even if ADVAPI32.DLL uses MSVCRT.DLL, it's not going to
           | mistakenly call the malloc that you provide in your
           | application; Windows DLL's don't even have that sort of
           | global symbol resolution power.
           | 
           | I would be very surprised if any public API in ADVAPI32
           | returns a pointer that the application is required to
           | directly _free_, or accept a pointer that the application
           | must _malloc_. If that were the case, you'd have to attach
           | to MSVCRT.DLL with LoadLibrary, look up those functions with
           | GetProcAddress and call them that way.
           | 
           | Windows has non-malloc allocators for sharing memory that way
           | among DLL's: the "Heap API" in KERNEL32. One component can
           | HeapAlloc something which another can HeapFree: they have to
           | agree on the same heap handle, though. You can use
           | GetProcessHeap to get the default heap for the process.
           | 
           | It may be that the MSVCRT.DLL malloc uses this; or else it's
           | based on VirtualAlloc directly.
        
         | FreakLegion wrote:
         | There's also UCRT, which ships with the OS since Windows 10.
         | The logic of this rant was a real head-scratcher. If you _must_
         | blame one side, it's LLVM. Fragmentation of C runtimes is
         | annoying but inescapable. Glibc for example isn't any better.
        
           | ChrisSD wrote:
           | The UCRT has even been present since Windows 7, if users keep
           | up with updates. Or if applications bundle the UCRT installer
           | with their own.
        
           | leajkinUnk wrote:
           | Could you elaborate why Glibc isn't any better?
           | 
           | I remember some funny problems with Glibc, like, 20 years
           | ago, but it's been invisible to me (as a user) since then.
           | You get a new Glibc, old binaries still work, it's fine.
        
             | usrn wrote:
             | I'm pretty sure I've run into binaries breaking on new
             | versions of Glibc but maybe it's because the architecture
             | or calling convention changed. I've never really gotten the
             | sense that GNU cares much about binary compatibility (which
             | makes sense, they argue that sharing binaries is mostly
             | counter productive.)
        
             | FreakLegion wrote:
             | Just like with Windows the challenges affect developers
             | rather than users.
             | 
             |  _> You get a new Glibc, old binaries still work, it's
             | fine._
             | 
             | Indeed, but when you need to build for an older glibc it's
             | not so simple. This is a common use case, since e.g. AWS's
             | environments are on glibc 2.26.
             | 
             | Ideally you'd like to build for all targets, including
             | older systems, from a single, modern environment (this is
             | trivial in Windows) -- and you can do some gymnastics to
             | make that happen[1] -- but in practice it's easier to just
             | manage different build environments for different targets.
             | This is partly why building Linux wheels is so convoluted
             | for Python[2].
             | 
             | Hardly a world-ending problem, but my point is simply that
             | C runtimes are a pain everywhere.
             | 
             | 1. https://stackoverflow.com/questions/2856438/how-can-i-
             | link-t...
             | 
             | 2. https://github.com/pypa/manylinux
        
       | somerando7 wrote:
       | > I was taught that to allocate memory was to summon death itself
       | to ruin your performance. A single call to malloc() during any
       | frame is likely to render your game unplayable. Any sort of
       | allocations that needed to happen with any regularity required
       | writing a custom, purpose-built allocator, usually either a
       | fixed-size block allocator using a freelist, or a greedy
       | allocator freed after the level ended.
       | 
       | Where do people get their opinions from? It seems like opinions
       | now spread like memes - someone you respect/has done something in
       | the world says it, you repeat it without verifying any of their
       | points. It seems like gamedev has the highest "C++ bad and we
       | should all program in C" community out there.
       | 
       | If you want a good malloc impl just use tcmalloc or jemalloc and
       | be done with it
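
For readers who have not seen one, the "fixed-size block allocator using a freelist" from the quoted passage looks roughly like the C sketch below. The `Pool` type, the 64-byte block size and the function names are assumptions chosen for illustration. Allocation and release are each a couple of pointer operations and never touch the system allocator after setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A block is either user payload or, while free, a link in the free list. */
typedef union Block {
    union Block *next;
    unsigned char payload[64];   /* hypothetical fixed block size */
} Block;

typedef struct {
    Block *storage;     /* one up-front allocation holding every block */
    Block *free_list;   /* singly linked list of unused blocks */
    size_t count;
} Pool;

/* Returns 1 on success, 0 if the backing storage could not be allocated. */
int pool_init(Pool *p, size_t count) {
    p->storage = malloc(count * sizeof(Block));
    p->free_list = NULL;
    p->count = count;
    if (p->storage == NULL)
        return 0;
    /* Push in reverse so block 0 is handed out first. */
    for (size_t i = count; i-- > 0; ) {
        p->storage[i].next = p->free_list;
        p->free_list = &p->storage[i];
    }
    return 1;
}

/* Pops the head of the free list; NULL when the pool is exhausted. */
void *pool_alloc(Pool *p) {
    Block *b = p->free_list;
    if (b == NULL)
        return NULL;
    p->free_list = b->next;
    return b;
}

/* Pushes the block back onto the free list. */
void pool_free(Pool *p, void *ptr) {
    Block *b = ptr;
    assert(b >= p->storage && b < p->storage + p->count);
    b->next = p->free_list;
    p->free_list = b;
}

void pool_destroy(Pool *p) {
    free(p->storage);
    p->storage = NULL;
    p->free_list = NULL;
    p->count = 0;
}
```

The trade-off is that every object must fit the one block size and the pool's capacity is fixed at setup, which is why this was typically paired with per-system or per-level pools.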
        
         | morelisp wrote:
         | Aside from the performance implications being very real (even
         | today, the best first step to micro-optimize is usually to
         | kill/merge/right-size as many allocations as possible), up
         | through ~2015 the dominant consoles still had very little
         | memory and no easy way to compact it. Every single non-
         | deterministic malloc was a small step towards death by
         | fragmentation. (And every deterministic malloc would see major
         | performance gains with no usability loss if converted to e.g. a
         | per-frame bump allocator, so in practice any malloc you were
         | doing was non-deterministic.)
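A per-frame bump allocator of the kind mentioned above could look roughly like the following C++ sketch. `FrameArena` and its methods are hypothetical names used for illustration; the sketch assumes a single thread and that nothing allocated from the arena outlives the frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-frame bump (linear) allocator, illustrative only.
// Allocation just advances an offset; everything is released at once
// by reset() at the end of the frame.
class FrameArena {
public:
    explicit FrameArena(std::size_t capacity) : buffer_(capacity) {}

    // Returns nullptr when the arena is full. align must be a power of two.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const auto base = reinterpret_cast<std::uintptr_t>(buffer_.data());
        // Round the current position up to the requested alignment.
        const std::uintptr_t aligned =
            (base + offset_ + align - 1) & ~(std::uintptr_t(align) - 1);
        const std::size_t start = static_cast<std::size_t>(aligned - base);
        if (start > buffer_.size() || size > buffer_.size() - start) {
            return nullptr;
        }
        offset_ = start + size;
        return buffer_.data() + start;
    }

    // Call once per frame: frees every allocation in O(1).
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_; // allocated once, reused every frame
    std::size_t offset_ = 0;
};
```

There is no per-object free, so nothing can fragment; the trade-off is that destructors are not run by `reset()`, which makes this best suited to trivially destructible scratch data.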
        
         | [deleted]
        
         | TonyTrapp wrote:
         | As always there is some truth to it - the problem of the MSVCRT
         | malloc described in this blog article is the living proof of
         | that - but these days it's definitely not a rule that will be
         | true in 100% of cases. Modern allocators are really fast.
        
         | forrestthewoods wrote:
         | Strong agree. I recently wrote a semi-popular blog post about
         | this. https://www.forrestthewoods.com/blog/benchmarking-malloc-
         | wit...
         | 
         | It's interesting that LLVM is suffering so horrifically using
         | default malloc. I really wish the author did a deeper
         | investigation into why exactly.
        
           | dang wrote:
           | Discussed here:
           | 
           |  _Benchmarking Malloc with Doom 3_ -
           | https://news.ycombinator.com/item?id=31631352 - June 2022 (30
           | comments)
        
         | charles_kaw wrote:
         | If this person was taught game dev any time before about 2005,
         | that would have still been relevant knowledge. Doing a large
         | malloc or causing paging could have slaughtered game execution,
         | especially during streaming.
         | 
         | >If you want a good malloc impl just use tcmalloc or jemalloc
         | and be done with it
         | 
         | This wasn't applicable until relatively recently.
        
           | jcelerier wrote:
           | > Doing a large malloc or causing paging could have
           | slaughtered game execution, especially during streaming.
           | 
           | ... it still does? I had a case a year or so ago (on then-
           | latest Linux / GCC / etc.) where a very sporadic allocation
           | of 40-something bytes (very exactly, inserting a couple of
           | int64 in an unordered_map at the wrong time) in a real-time
           | thread was enough to go from "ok" to "unusable".
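One way to keep such an insertion off the system allocator, sketched here as an assumption and not as what the commenter actually did, is to back the map with a preallocated buffer through C++17's `std::pmr`. The type name `PreallocatedMap`, the 64 KiB buffer size, and the `reserve(256)` figure are arbitrary illustrative choices.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory_resource>
#include <unordered_map>

// Hypothetical wrapper (illustrative only): an unordered_map whose nodes
// and bucket array come from a fixed buffer instead of malloc.
struct PreallocatedMap {
    // Backing storage, reserved once when the object is created.
    alignas(std::max_align_t) std::array<std::byte, 64 * 1024> buffer{};

    // Bump-style resource over the buffer. The null upstream resource
    // means exhaustion throws std::bad_alloc instead of silently
    // falling back to the heap.
    std::pmr::monotonic_buffer_resource pool;

    std::pmr::unordered_map<std::int64_t, std::int64_t> map;

    PreallocatedMap()
        : pool(buffer.data(), buffer.size(),
               std::pmr::null_memory_resource()),
          map(&pool) {
        // Size the bucket array ahead of time so that later inserts
        // do not trigger a rehash at an inconvenient moment.
        map.reserve(256);
    }
};
```

Note that `monotonic_buffer_resource` never reuses memory released by `erase`, so this only fits maps that mostly grow and are discarded or rebuilt at a known point. A plain `reserve()` on a default `std::unordered_map` would not be enough on its own, because each insertion still allocates a node.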
        
       | softwaredoug wrote:
       | My knowledge is like 10 years old - For a long time, Microsoft's
       | STL implementation was based on their licensing of Dinkumware's
       | STL (https://www.dinkumware.com/). Not something maintained in
       | house. It seemed to work OK'ish - giving lowest common
       | denominator functionality. However, it was pretty easy to create
       | higher performing specialized data structures for your use case
       | than what seemed like simple uses of Dinkumware STL.
        
         | garaetjjte wrote:
         | malloc is not related to the STL. That said, a big issue with
         | Microsoft's STL is that it is atrociously slow in debug builds.
        
       | oddity wrote:
       | If you're depending on the performance of malloc, you're either
       | using the language incorrectly or using the wrong language. There
       | is no such thing as a general purpose _anything_ when you care
       | about performance, there's only good enough. If you are 1)
       | determined to stick with malloc and 2) want something predictable
       | and better, then you are necessarily on the market for one of the
       | alternatives to the system malloc _anyway_.
        
         | mwcampbell wrote:
         | The whole point of the article, though, was that the system
         | malloc was good enough on Linux and Darwin.
        
           | oddity wrote:
           | This misses the point of my comment. When you put faith in
           | malloc, you're putting hope in a lot of heuristics that may
           | or may not degenerate for your particular workload. Windows
           | is an outlier with how bad it is, but that should largely be
           | irrelevant because the code should have already been
           | insulated from the system allocator anyway.
           | 
           | An over-dependence on malloc is one of the first places I
           | look when optimizing old C++ codebases, even on Linux and
           | Darwin. Degradation on Linux + macOS is still there, but more
           | insidious _because_ the default is so good that simple apps
           | don't see it.
        
             | dzaima wrote:
             | Except that I'd guess that there is no "good" case for
             | MSVCRT's malloc. You shouldn't assume malloc is
             | free, but you should also be able to assume it won't be
             | horrifyingly slow. Just as much as you should be able to
             | rely on "x*y" not compiling to an addition loop over 0..y
             | (which might indeed be very fast when y is 0)
        
       ___________________________________________________________________
       (page generated 2022-07-02 23:00 UTC)