[HN Gopher] Fedora 38 LLVM vs. Team Fortress 2
       ___________________________________________________________________
        
       Fedora 38 LLVM vs. Team Fortress 2
        
       Author : st_goliath
       Score  : 88 points
       Date   : 2023-04-24 18:54 UTC (4 hours ago)
        
 (HTM) web link (airlied.blogspot.com)
 (TXT) w3m dump (airlied.blogspot.com)
        
       | DannyBee wrote:
       | Fedora 38 includes the LLVM15 libs to maintain backwards
       | compatibility.
       | 
       | Why is this automatically using a new, incompatible solib,
       | instead of a versioned solib?
        
         | AnssiH wrote:
         | The LLVM dependency is in the HW-specific driver solib which is
         | loaded by the OpenGL library, which has the same soname as
         | before.
        
       | IceWreck wrote:
       | Isn't this the reason why people recommend using the flatpak
       | version of Steam ?
        
         | PlutoIsAPlanet wrote:
         | Yes, especially on Fedora.
         | 
         | This isn't something Fedora is doing wrong, unfortunately some
         | games build against older libraries or are built against
         | Debian/Ubuntu and the Flatpak runtimes generally have better
         | compatibility.
        
       | exabrial wrote:
       | I thought TF2 was pretty much 100% hacked... like no legit non-
       | hackers playing except at LAN parties.
        
       | sosodev wrote:
       | It's unfortunate but the Steam experience on Linux seems to be
       | progressively getting worse (outside of Steam Deck ofc). The
       | Steam client is often borderline unusable for Linux users. You
       | can find many issue threads on GitHub reporting client freezes
       | and crashes.
       | 
       | It seems like a big part of the issues is a lack of maintenance.
       | TF2 would actually run better on Linux via Proton but VAC isn't
       | enabled so you can't join the vast majority of servers.
       | 
       | Valve also has existing Source engine tooling that allows Linux
       | ports to drop OpenGL entirely (dxvk-native as used by Portal 2
       | and L4D2) but they haven't added it to TF2... :(
        
         | ho_schi wrote:
         | I can't see how native Linux support is getting worse. Linux
         | users are good at bug reporting. Maybe some developers should
         | care more about compatibility. And yes, especially the
         | heterogeneous setups used by some makes support difficult.
         | 
         | I'm worried that Valve puts too many resources into Proton
         | (derivative of WINE) instead of tooling for native ports. Yes,
         | Proton is needed to provide initial compatibility. But Proton
         | is another layer of complexity (more bugs, integration, system
         | resources) which requires more programming. I started playing
         | CS again after it was ported natively in 2014, it runs well and
         | all issues with WINE were gone.
         | 
         | If Proton becomes too "good" we end up in a situation with a
         | high maintenance burden for Valve. Game developers will rely on
         | it and Valve has all the constant work. Instead game developers
         | should treat Linux as first-class platform for AAA-Titles, for
         | which they need appropriate APIs, compatibility and tooling. As
         | Valve does itself support Linux as first-class platform from
         | HL2 to CSGO. The target shall be official support from the very
         | first day.
         | 
         | Anyway. Looks like Valve has chosen a special implementation
         | for TF2? What I miss here is a link to a bug report. Ideally
         | opened months ago :)
        
           | danbolt wrote:
           | I think Valve has a financial incentive to keep Proton
           | compatibility in a positive state, as it increases sales of
           | the Steam Deck and encourages players to remain in their
           | ecosystem. Or, I think it's more likely than the majority of
           | AAA game developers having a financial incentive to maintain
           | Linux versions of their products.
        
           | pjmlp wrote:
           | Game studios already know Linux distributions quite well on
           | the server, and most AAA games on Android are basically only
           | using the NDK, meaning ISO C and C++, OpenGL ES, Vulkan,
           | OpenSL.
           | 
           | Besides that, PlayStation OS is based on FreeBSD. Even if the
           | 3D API is different, it is just yet another backend.
           | 
           | They don't port them, because the QA and support aren't worth
           | the sales, that is about it.
        
         | zamalek wrote:
         | > You can find many issue threads on GitHub reporting client
         | freezes and crashes.
         | 
         | The fact that these are happening does not necessarily mean the
         | client is getting worse. For example, it _could_ mean that more
         | people are installing Steam for Linux. There is no baseline to
         | say it's getting worse, because nobody opens an issue saying
         | "all working here."
         | 
         | In my experience, the _only_ issue I have on Wayland is this:
         | https://github.com/ValveSoftware/steam-for-linux/issues/7245
         | (workaround: disable animated avatars) (edit: all AMD machine)
         | 
         | > outside of Steam Deck ofc
         | 
         | There is nothing special about the Steam Deck. It's just
         | another Linux machine.
         | 
         | > TF2
         | 
         | I don't play any Source games, but I could see TF2 having
         | issues because it's in maintenance mode. If it is bjorked that
         | has nothing to do with Steam.
        
           | sosodev wrote:
           | True, I don't have enough data to really make that claim. I
           | can say that my own hardware hasn't changed in ~4 years and
           | I've been using Steam for Linux since I built this machine.
           | It's only within the last year or so that I started having
           | major issues with the client.
           | 
           | > there is nothing special about the steam deck
           | 
           | How is first party support for the hardware and software
           | stack "nothing special"?
           | 
           | > If it is bjorked that has nothing to do with Steam.
           | 
           | Maybe it's not directly related to the rest of my comment but
           | it's related to the OP. I also think it's indicative of
           | Valve's issues with Linux.
        
             | zamalek wrote:
             | > How is first party support for the hardware and software
             | stack "nothing special"?
             | 
             | Because the vast majority of that stack (the kernel, GPU
             | driver, window manager, and so forth) has nothing to do
             | with Valve. They might contribute drivers to the kernel
             | (I'm not sure if they actually do - I would expect AMD to
             | be doing that), but otherwise it's an Arch-based distro
             | with the same Steam client and Proton runtime that everyone
             | else is using.
        
               | sosodev wrote:
               | Yes, the Steam Deck is using a fairly standard stack and
               | the default client but you're missing the point. Valve
               | directly tests the Steam Deck and prioritizes bug fixes
               | for it. When users report issues with other setups it
               | often takes months for the identified bug to be fixed if
               | it ever is.
        
           | mariusor wrote:
           | > There is nothing special about the Steam Deck. It's just
           | another Linux machine.
           | 
           | That's not true. It's a read-only linux on a fixed hardware
           | platform, which is a vaaastly different beast than the myriad
           | of hardware/software combinations that exist out there in the
           | wild.
        
             | zamalek wrote:
             | > It's a read-only linux on a fixed hardware platform
             | 
             | I have heard that argument about macOS a lot, and this is
             | nothing like that. There isn't some "special sauce er...
             | Source" that they apply to their platform. It's just GPL
             | Linux. They may have avoided bad decisions like relying on
             | NVIDIA for Linux gaming, but that's hardly the level of
             | ownership that you see with other vertical integrations. If
             | I use an AMD CPU (or Intel, which would be arguably better)
             | and AMD GPU, there is no reason why my PC couldn't be just
             | as "first-party" as the Steam Deck.
             | 
             | Wine/Proton ultimately access the GPU through DRM, that
             | remains the same for Valve hardware or custom-built
             | hardware. Both Steam and Wine/Proton currently render via
             | X11 (via XWayland if necessary), on both my PC and the
             | Steam Deck.
             | 
             | I feel like there is a gap of understanding how a HAL works
             | here.
        
               | mlyle wrote:
               | > It's just GPL Linux.
               | 
               | "Just" GPL Linux encompasses myriad library versions,
               | kernel versions, driver versions and varied hardware.
               | 
               | > I feel like there is a gap of understanding how a HAL
               | works here.
               | 
               | Just because you have a HAL doesn't mean that you don't
               | get different behavior and crashes with different numbers
               | of CPUs/concurrency or other hardware beneath. Modern
               | GPUs are also pretty complicated beasts, and assuming
               | that's fully abstracted is a mistake.
               | 
               | And this all leaves aside the myriad of other problems
               | you can have with the ensemble of software running on the
               | machine that interacts with the game (directly or
               | indirectly).
               | 
               | Being able to test and make one restricted platform work
               | well is a far different beast than covering the huge mass
               | of variation users create on their own machines.
        
               | admax88qqq wrote:
               | I feel like there is a gap in understanding of how
               | commercial software deployment goes.
               | 
               | When you have a platform like Steam Deck, it's the
               | platform that gets tested by QA, and the platform that
               | most of your devs are building for every day.
        
               | mariusor wrote:
               | Sure, a linux machine is made out of just a CPU and a
               | GPU. Even if that would be the case, what about the
               | software combinations that can exist and that the
               | SteamDeck simplifies?
               | 
               | In the gamedev world I heard a lot of people not wanting
               | to support linux because they never know which glibc
               | version to support, which mesa version to support, which
               | hardware GPUs to support, which graphical API to support,
               | etc.
               | 
               | Cutting down that matrix (and I just mentioned the most
               | egregious examples) to only one element is invaluable in
               | ensuring your users have a bug free experience.
        
         | Arnavion wrote:
         | I run Steam in a Docker container of Ubuntu 22.04 for reasons
         | like this. Also my actual system isn't polluted with 32-bit
         | libs, Steam can't rm-rf my home directory and games can't steal
         | files from my home directory (homedir inside the container is a
         | separate directory on the host), and access to X and dbus is
         | restricted (dbus socket not forwarded, X socket is from a
         | nested Xephyr instance) so nothing can be stolen from there
         | either.
         | 
         | Edit: More details in
         | https://news.ycombinator.com/item?id=34634854
        
           | sosodev wrote:
           | Is there a guide for this? I'd really like to isolate Steam
           | from the rest of my system
        
             | [deleted]
        
         | Entinel wrote:
         | Devil's advocate, I use Steam on Fedora and have had 0 issues.
         | Very rarely freezes or crashes. It's probably the most stable
         | application I use daily.
        
           | 2OEH8eoCRo0 wrote:
           | I use Steam on Fedora as well and I notice a lot of jank with
           | the Steam client (Nvidia 1080ti). Dropdown menus popping
           | through windows, sound may or may not work for videos,
           | freezing, etc. It's usable but it's not very pleasant.
        
             | Entinel wrote:
             | Just for GPU comparison I'm on an AMD RX card so it could
             | be an Nvidia issue which is known to be jank on Linux.
        
           | sosodev wrote:
           | It seems to work perfectly for some people. I've regularly
           | had issues with the client not rendering at all, freezing,
           | and crashing on Pop_OS 22.04 LTS with an nvidia GTX 1660ti.
        
           | WaffleIronMaker wrote:
           | I also use Steam on Fedora, and I've not had any issues with
           | Stardew Valley, Factorio, Celeste, N++, Undertale, and
           | others. I remember having a brief issue with Portal, but I
           | was able to resolve it. Overall, I've had a good experience.
        
       | amluto wrote:
       | It should be straightforward to make a little LD_PRELOAD shim to
       | implement the new operator new on top of old overloads and thus
       | restore proper functioning.
       | 
       | It would be a gross kludge, though.
        
         | olliej wrote:
         | I'm not sure that's sound. You can't just redirect an aligned
         | new to the unaligned operator new as you may get an unaligned
         | result. It _sounds_ like what is happening is
         | 
         |     a = ::operator new(some size, some alignment)
         |     ...
         |     ::operator delete(a);
         | 
         | where delete is dropping the align_val_t parameter that would
         | guarantee it hits the same allocator family. There are a
         | variety of ways this can happen, and let's just take it as
         | given that it is.
         | 
         | The problem is that if operator new(size_t, align_val_t) is
         | called then the struct has an alignment annotation. That can
         | lead to codegen that reasonably assumes alignment, even without
         | any source level decisions that depend on alignment. The result
         | of having some equivalent of (either at runtime or link time)
         | 
         |     void * operator new(size_t sz, align_val_t a) {
         |       if (operator new(size_t) has been overridden)
         |         return ::operator new(sz);
         |       ...
         |     }
         | 
         | could be an "aligned" allocation returning an unaligned value,
         | causing crashes later on.
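
The split described in the comment above can be reproduced in a small, safe form. This is a minimal sketch, not TF2's or Mesa's actual code: it assumes a program that replaces only the pre-C++17 `operator new(size_t)` and `operator delete(void*)` (the counter and helper names are hypothetical), and it assumes a standard library whose default aligned `operator new` does not forward to the replaced unaligned one, which is my understanding of libstdc++ and libc++. It only shows that an over-aligned type bypasses the replaced allocator; it does not trigger the crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Assumption: stand-in for an application that bundles an older allocator
// and replaces only the pre-C++17 global allocation functions.
static std::size_t g_replaced_new_calls = 0;

void* operator new(std::size_t size) {
    ++g_replaced_new_calls;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}

void operator delete(void* p) noexcept { std::free(p); }

// Over-aligned type: its alignment exceeds the default new alignment, so a
// new-expression selects operator new(size_t, align_val_t) instead.
struct alignas(64) OverAligned {
    char data[64];
};

// Keeps pointers observable so the optimizer does not elide the allocations.
static void* volatile g_sink = nullptr;

// Returns how many times the replaced operator new ran for a plain request.
std::size_t replaced_calls_for_plain() {
    const std::size_t before = g_replaced_new_calls;
    void* p = ::operator new(sizeof(int));
    g_sink = p;
    ::operator delete(p);
    return g_replaced_new_calls - before;
}

// Same measurement for an over-aligned object; also reports its alignment.
std::size_t replaced_calls_for_over_aligned(bool* is_aligned) {
    const std::size_t before = g_replaced_new_calls;
    OverAligned* p = new OverAligned;
    g_sink = p;
    *is_aligned =
        reinterpret_cast<std::uintptr_t>(p) % alignof(OverAligned) == 0;
    delete p;  // resolves to an aligned operator delete overload
    return g_replaced_new_calls - before;
}
```

Under those assumptions the plain request is counted once and the over-aligned one is not counted at all, which is the "two allocator families" situation: anything that later frees such a pointer through the replaced unaligned `operator delete` hands it to the wrong allocator.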
        
           | viraptor wrote:
           | If you don't mind wasting a bit of time, you could forward
           | size+alignment to the allocator, return the aligned version
           | and keep a record of aligned-to-allocation mapping. (For
           | freeing later)
           | 
           | But as the other comment mentioned - it should be a problem
           | for tf2 in the first place since that's not the behaviour
           | they're after.
        
             | olliej wrote:
             | > If you don't mind wasting a bit of time, you could
             | forward size+alignment to the allocator, return the aligned
             | version and keep a record of aligned-to-allocation mapping.
             | (For freeing later)
             | 
             | I'm unsure what you're proposing here - the only methods
             | you know in the replacement allocator are operator
             | new(size_t) and operator delete(void *). The two possible
             | failure paths are:
             | 
             |     a = ::operator new(some size)
             |     ...
             |     ::operator delete(a, alignment)
             | 
             | and
             | 
             |     a = ::operator new(some size, some alignment)
             |     ...
             |     ::operator delete(a)
             | 
             | In the first case what you could do is say "if I did not
             | allocate this pointer, optimistically forward it to
             | operator delete(void *)", in the latter case you can
             | identify that a different operator new(size_t) exists but
             | you have no idea how to make that allocator produce an
             | aligned allocation. What I guess you could do is round the
             | size up to a multiple of the specified alignment, and then
             | just repeatedly allocate in the hope that you will
             | eventually get a correctly aligned value out. But that
             | would not be guaranteed.
        
               | nneonneo wrote:
               | The latter suggestion assumes that there's enough entropy
               | in the allocation process to make this work. But that's
               | not guaranteed! Suppose that your allocator doesn't pad
               | allocations (e.g. because it uses a bitmap), and that it
               | only guarantees 0x10 alignment. If the top of the heap
               | happens to be unaligned with respect to your desired
               | alignment (e.g. address ends in 0x10 when you want 0x20
               | alignment), you might wind up just repeatedly allocating
               | unaligned blocks off the top of the heap forever.
               | 
               | This is not an easy problem to solve, unfortunately. On
               | MacOS I believe they solve this problem using the two-
               | level namespace: symbol references include the library
               | name, so "operator new(size_t)" from libstdc++ is
               | distinct from "operator new(size_t)" from libtcmalloc.
               | 
               | Symbol versioning also seems like it should solve the
               | problem: have the new interfaces explicitly declared with
               | a newer ABI version (e.g. @@LIBCXX_17) and link only to
               | those new versions from code that expects them. Of
               | course, symbol versioning comes with its own set of nasty
               | drawbacks, but in this case it seems like a solution that
               | might work?
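
The "retry until aligned" failure case from the first paragraph of the comment above can be modelled without touching a real heap. This is a toy sketch under stated assumptions: `BumpAllocator` and `retry_until_aligned` are hypothetical names, addresses are simulated integers, and the allocator is assumed to hand out consecutive blocks with no padding at 0x10 granularity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of an allocator that hands out consecutive blocks with
// no padding and only guarantees 0x10 alignment. Addresses are simulated.
struct BumpAllocator {
    std::uintptr_t top;

    std::uintptr_t allocate(std::size_t size) {
        const std::uintptr_t result = top;
        // Advance by the size rounded up to the 0x10 granularity.
        top += (size + 0xF) & ~static_cast<std::uintptr_t>(0xF);
        return result;
    }
};

// Rounds size up to a multiple of alignment, then retries up to max_attempts
// times. Returns the first aligned address, or 0 if no attempt was aligned.
std::uintptr_t retry_until_aligned(BumpAllocator& heap, std::size_t size,
                                   std::size_t alignment, int max_attempts) {
    const std::size_t rounded = (size + alignment - 1) / alignment * alignment;
    for (int i = 0; i < max_attempts; ++i) {
        const std::uintptr_t p = heap.allocate(rounded);
        if (p % alignment == 0) return p;
    }
    return 0;
}
```

Starting from a top of 0x1010 and asking for 0x20 alignment, every block lands on an address ending in 0x10, 0x30, 0x50 and so on, so the loop never succeeds no matter how many attempts it gets. Starting from 0x1000 the very first attempt is aligned.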
        
               | olliej wrote:
               | > The latter suggestion assumes that there's enough
               | entropy in the allocation process to make this work. But
               | that's not guaranteed!
               | 
               | Oh absolutely, there's no guarantee it's ever aligned:
               | the allocator could wrap an aligned allocator but include
               | a pointer sized prefix (a la array allocations) so you
               | would be _guaranteed_ to never be more than pointer size
               | aligned :D
               | 
               | As you say versioning and namespacing is super
               | problematic, but I'm not sure they'd even work here.
               | 
               | At its core the problem is that some code is compiling
               | with the knowledge it has aligned allocations, so can
               | assume alignment, and some parts are not. There are a
               | bunch of options that ensure that the allocator is
               | consistent, but they devolve to either ignoring the
               | new+delete overrides, or having the aligned allocators
               | detect the override and forward to unaligned allocators
               | while hoping nothing depended on correct alignment.
        
               | viraptor wrote:
               | > and then just repeatedly allocate in the hope that you
               | will eventually get a correctly aligned value out
               | 
               | If you preload something that patches all the new/delete
               | interfaces, you can do this without guesswork.
               | 
               |     new(size, alignment) ->
               |         res=alloc(size+alignment)
               |         res_aligned=res+...
               |         offsets[res_aligned] = res
               | 
               |     new(size) ->
               |         alloc(size)
               | 
               |     free(ptr) ->
               |         free(offsets[ptr] || ptr)
               |         offsets.del(ptr)
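
The pseudocode in the comment above could look roughly like this in C++. It is a sketch under stated assumptions, not a working LD_PRELOAD shim: `base_alloc` and `base_free` are hypothetical stand-ins for the application's alignment-unaware allocator (backed by `malloc` here), `alignment` is assumed to be a power of two, and the side table is neither thread-safe nor safe to use from inside a real `operator new` replacement, since `std::unordered_map` allocates memory itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <unordered_map>

// Hypothetical stand-ins for the replaced, alignment-unaware allocator.
inline void* base_alloc(std::size_t size) { return std::malloc(size); }
inline void base_free(void* p) { std::free(p); }

// Maps each aligned pointer handed out to the pointer base_alloc returned.
static std::unordered_map<void*, void*> g_offsets;

// Over-allocates by `alignment` bytes, rounds the address up to the next
// multiple of `alignment`, and records the mapping for the later free.
void* shim_aligned_alloc(std::size_t size, std::size_t alignment) {
    void* raw = base_alloc(size + alignment);
    if (raw == nullptr) return nullptr;
    const auto addr = reinterpret_cast<std::uintptr_t>(raw);
    const auto mask = static_cast<std::uintptr_t>(alignment) - 1;
    void* aligned = reinterpret_cast<void*>((addr + mask) & ~mask);
    g_offsets[aligned] = raw;
    return aligned;
}

// Frees either kind of pointer: aligned ones are translated back to the
// original allocation first, everything else is passed straight through.
void shim_free(void* p) {
    const auto it = g_offsets.find(p);
    if (it == g_offsets.end()) {
        base_free(p);
        return;
    }
    void* raw = it->second;
    g_offsets.erase(it);
    base_free(raw);
}
```

The trade-off is the one the comment hints at: every aligned allocation wastes up to `alignment` bytes and costs a table lookup on free, in exchange for never depending on the underlying allocator's luck.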
        
               | amluto wrote:
               | See my comment above. tcmalloc implements the _C_ API as
               | well, including aligned_alloc().
        
           | jenadine wrote:
           | That's not sound in general, but it is "probably" going to
           | work for this specific case because the previous build was
           | built with an allocator that did not support this alignment,
           | meaning that they did not need extra alignment. This is
           | pretty rare actually. And you had anyway to use a custom
           | allocator already with previous C++ versions to make it work.
        
             | olliej wrote:
             | While I do agree with you, and think it's probably worth
             | seeing if detecting the override and falling back to
             | unaligned allocation works, the problem is not that the
             | code in TF is compiling assuming/requiring over aligned
             | data.
             | 
             | The problem is that there is system code that they are
             | calling that is making use of over aligned allocation, so
             | therefore could be generating code dependent on said
             | alignment. The failure mode can very easily be
             | 
             |     someSystemLibrary.so`someFunction:
             |       alignedThing = ::operator new(size, alignment)
             |       ...
             |       i_dunno_dma_memcpy_or_something(a, somewhere else)
             |       ...
             |       ::operator delete(a)
             | 
             | With no interaction with TF code at all. _Except_ TF has
             | replaced operator delete so that fails due to the allocator
             | mismatch. If you make ::operator new(size_t, align_val_t)
             | redirect to ::operator new(size_t) if it detects an
             | override then the aligned operation can fail. The above
             | example is moderately difficult to induce so it's more
             | likely that there's an explicit split where the system is
             | doing one half of new/delete and TF is doing the other, but
             | the important thing is that it implies the system code is
             | built aware of alignment and it depends on the alignment
             | even if TF does not.
        
           | Asooka wrote:
           | The C interface for aligned memory allocation is
           | aligned_alloc(). The returned pointers are always freed with
           | free(). So what is probably happening is that aligned new
           | calls aligned_alloc(), and then aligned delete simply calls
           | the regular delete, expecting to end up in free(), which by
           | design should work with both kinds of pointers.
           | 
           | I _think_ the problem here is partly with the implementation
           | of aligned new/delete. Since one is free to override only
           | the old versions, the ones supplied by the standard library
           | should make sure not to fall back to functions that may be
           | partially overridden.
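
The C-level contract mentioned in the comment above, where aligned and ordinary allocations share one `free()`, can be shown in a few lines. This is a minimal sketch assuming a C++17 toolchain that provides `std::aligned_alloc`; the helper names are hypothetical, and the size is rounded up because the C standard has historically required it to be a multiple of the alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Allocates `size` bytes aligned to `alignment` (a power of two). The result
// is released with std::free, exactly like memory from std::malloc.
void* allocate_aligned(std::size_t alignment, std::size_t size) {
    const std::size_t rounded = (size + alignment - 1) / alignment * alignment;
    return std::aligned_alloc(alignment, rounded);
}

// True when p is a multiple of `alignment`.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

This only holds while both calls resolve to the same allocator. If an application replaces `free()` or `operator delete` but not the aligned entry point, or the other way round, the pairing breaks in the way the thread describes.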
        
           | amluto wrote:
           | As pure speculation, one could forward to aligned_alloc and
           | still free with ::delete. I haven't tested this, nor have I
           | looked at the code.
        
         | zokier wrote:
         | LD_PRELOAD would probably run afoul with VAC though?
        
           | Polycryptus wrote:
           | Steam on Linux already uses LD_PRELOAD under-the-hood to load
           | the overlay. Valve signs the overlay SO files, so they could
           | be making an exception for Valve-signed-preloads in VAC, but
           | it's also possible that VAC does something else to check for
           | suspicious libraries loaded in.
        
       | mmh0000 wrote:
       | I loved the premise of the article, though I really wish the
       | author had gone into detail about how he discovered the root
       | cause.
        
       | olliej wrote:
       | This is a predictable outcome of overriding the global operator
       | new. It remains annoying that this was ever allowed, and is a
       | constant source of pain for c++ standard library implementations.
        
         | phkahler wrote:
         | It seems more like the app and driver are mixing their
         | new/delete pairs. That seems like a bug to me. Maybe even an
         | API design issue if it's supposed to happen.
        
         | DannyBee wrote:
         | It actually should still work, since fedora38 includes the
         | llvm15 versioned libs.
         | 
         | The only way to make this break is if something is loading
         | random unversioned solibs or whatever the latest one it can
         | find is, and expecting this to work forever.
         | 
         | If it actually used a versioned solib, it would get llvm 15
         | just like it did before.
         | 
         | This is the whole point of versioned solibs.
        
       | Karliss wrote:
       | Whole graphics drivers using LLVM in the backend has caused
       | countless issues. The way I look at it one of the main problems
       | is that graphic API libraries shouldn't leak symbols from
       | implementation details like them using LLVM. They should expose
       | only the graphics API and nothing more.
        
         | vchuravy wrote:
         | Don't ask me about GNU_UNIQUE...
         | 
         | Due to some wonderful C++ features the dynamic linker is forced
         | to unify symbols across shared libraries, even if those symbols
         | have different versions.
         | 
         | This utterly breaks loading multiple libLLVM's except if you
         | build the copy you care about with -no-gnu-unique (or whatever
         | the flag was called)
         | 
         | I have seen wonderful things like the initializers of an
         | already loaded libLLVM being rerun when a new one is loaded.
        
       | admax88qqq wrote:
       | Unfortunately this is exactly the type of stuff that makes
       | supporting commercial apps on linux a nightmare. Weird crashes
       | due to weird linking of system libraries.
       | 
       | Common distros are very adamant about dynamic linking everything
       | in order to support the use case of "core library has
       | vulnerability, upgrade it in place without rebuilding consuming
       | apps." Along with a desire to avoid "dll hell" and force a single
       | canonical version of every library systemwide. This leads to
       | these sorts of issues.
       | 
       | Windows gets around it by letting applications put the DLLs they
       | care about beside the executable, and having it check there first
       | by default.
        
         | eikenberry wrote:
         | Isn't this exactly the use case for which flatpaks are
         | designed? Isn't Redhat/Fedora in the process of adopting them
         | as the primary way to support third party/proprietary graphical
         | apps like Steam? Doesn't the current Steam flatpak avoid this
         | issue?
         | 
         | TLDR; isn't this already addressed?
        
         | doublepg23 wrote:
         | The funny thing is that on Fedora in 2023 I don't feel like I'm
         | missing out on most software.
        
         | ho_schi wrote:
         | Aehm. That is what a lot of closed-source applications do on
         | Linux. And Valve does that, too.
         | 
         | The open-source ones are maintained in the packaging system and
         | kept lean.
        
         | gabcoh wrote:
         | Can Linux not trivially do the same thing as windows with
         | LD_PRELOAD? If so why is this more of an issue on Linux than
         | Windows? Is it really less a technical challenge and more just
         | a matter of Linux getting less support from upstream
         | developers?
        
           | bravetraveler wrote:
           | I was thinking/wondering this myself. Not to reinvent the
           | wheel - more toss an idea around, but a _' venv for
           | LD_PRELOAD'_ sounds like it'd deal with this pretty handily
           | 
           | Not... in a way I'd use as a distribution/release maintainer.
           | _Probably_ as an administrator [of my LAN]
        
             | gabcoh wrote:
             | Such things already exist. Eg. Appimage or even docker.
        
               | lnxg33k1 wrote:
               | and even that has been managed to be split between snap
               | appimage and flatpak :D
               | 
               | (sorry not meant to offend, long time linux day-to-day
               | user here, but it was just ironic for me to point out
               | fragmentation of fragmentation ^^)
        
               | bravetraveler wrote:
               | Right, but I don't really want to get into a distribution
               | model - the hack suits me fine :)
               | 
               | More an exercise in curiosity than anything
               | 
               | Flatpak (or Snap, ew) probably deals with it fine today,
               | Steam's there
        
           | stabbles wrote:
           | LD_PRELOAD is too global to be useful, it's hard to scope it
           | to one process (and not child processes). macOS is better in
           | the sense that it clears DYLD_* variables when the dynamic
           | linker has done its work and the process starts. (Although
           | that can also be painful when you want to run a shell script
           | and set DYLD_* outside)
        
             | nly wrote:
             | You can compile binaries with additional relative library
             | paths in to them that will take priority over /usr/lib64
        
           | aidenn0 wrote:
           | This sounds like it's an interaction with the GPU driver
           | though, which could also happen on windows...
        
       | stryan wrote:
       | Valve does this for a couple of their games, see a similar issue
       | with Dota 2[0].
       | 
       | [0] https://github.com/ValveSoftware/Dota-2/issues/2285
        
       ___________________________________________________________________
       (page generated 2023-04-24 23:00 UTC)