[HN Gopher] Fedora 38 LLVM vs. Team Fortress 2
___________________________________________________________________
Fedora 38 LLVM vs. Team Fortress 2

Author : st_goliath
Score  : 88 points
Date   : 2023-04-24 18:54 UTC (4 hours ago)

(HTM) web link (airlied.blogspot.com)
(TXT) w3m dump (airlied.blogspot.com)

| DannyBee wrote:
| Fedora 38 includes the LLVM15 libs to maintain backwards
| compatibility.
|
| Why is this automatically using a new, incompatible solib,
| instead of a versioned solib?
| AnssiH wrote:
| The LLVM dependency is in the HW-specific driver solib which is
| loaded by the OpenGL library, which has the same soname as
| before.
| IceWreck wrote:
| Isn't this the reason why people recommend using the flatpak
| version of Steam?
| PlutoIsAPlanet wrote:
| Yes, especially on Fedora.
|
| This isn't something Fedora is doing wrong; unfortunately, some
| games build against older libraries or are built against
| Debian/Ubuntu, and the Flatpak runtimes generally have better
| compatibility.
| exabrial wrote:
| I thought TF2 was pretty much 100% hacked... like no legit
| non-hackers playing except at LAN parties.
| sosodev wrote:
| It's unfortunate but the Steam experience on Linux seems to be
| progressively getting worse (outside of Steam Deck ofc). The
| Steam client is often borderline unusable for Linux users. You
| can find many issue threads on GitHub reporting client freezes
| and crashes.
|
| It seems like a big part of the issues is a lack of maintenance.
| TF2 would actually run better on Linux via Proton but VAC isn't
| enabled so you can't join the vast majority of servers.
|
| Valve also has existing Source engine tooling that allows Linux
| ports to drop OpenGL entirely (dxvk-native as used by Portal 2
| and L4D2) but they haven't added it to TF2... :(
| ho_schi wrote:
| I can't see how native Linux support is getting worse. Linux
| users are good at bug reporting. Maybe some developers should
| care more about compatibility.
| And yes, especially the heterogeneous setups used by some make
| support difficult.
|
| I'm worried that Valve puts too many resources into Proton
| (derivative of WINE) instead of tooling for native ports. Yes,
| Proton is needed to provide initial compatibility. But Proton
| is another layer of complexity (more bugs, integration, system
| resources) which requires more programming. I started playing
| CS again after it was ported natively in 2014; it runs well and
| all issues with WINE were gone.
|
| If Proton becomes too "good" we end up in a situation with a
| high maintenance burden for Valve. Game developers will rely on
| it and Valve has all the constant work. Instead game developers
| should treat Linux as a first-class platform for AAA titles, for
| which they need appropriate APIs, compatibility and tooling.
| Valve itself supports Linux as a first-class platform, from
| HL2 to CSGO. The target should be official support from the very
| first day.
|
| Anyway. Looks like Valve has chosen a special implementation
| for TF2? What I miss here is a link to a bug report. Ideally
| opened months ago :)
| danbolt wrote:
| I think Valve has a financial incentive to keep Proton
| compatibility in a positive state, as it increases sales of
| the Steam Deck and encourages players to remain in their
| ecosystem. Or, I think it's more likely than the majority of
| AAA game developers having a financial incentive to maintain
| Linux versions of their products.
| pjmlp wrote:
| Game studios already know Linux distributions quite well on
| the server, and most AAA games on Android are basically only
| using the NDK, meaning ISO C and C++, OpenGL ES, Vulkan,
| OpenSL.
|
| Besides that, PlayStation OS is based on FreeBSD. Even if the
| 3D API is different, it is just yet another backend.
|
| They don't port them, because the QA and support aren't worth
| the sales, that is about it.
| zamalek wrote:
| > You can find many issue threads on GitHub reporting client
| freezes and crashes.
|
| The fact that these are happening does not necessarily mean the
| client is getting worse. For example, it _could_ mean that more
| people are installing Steam for Linux. There is no baseline to
| say it's getting worse, because nobody opens an issue saying
| "all working here."
|
| In my experience, the _only_ issue I have on Wayland is this:
| https://github.com/ValveSoftware/steam-for-linux/issues/7245
| (workaround: disable animated avatars) (edit: all-AMD machine)
|
| > outside of Steam Deck ofc
|
| There is nothing special about the Steam Deck. It's just
| another Linux machine.
|
| > TF2
|
| I don't play any Source games, but I could see TF2 having
| issues because it's in maintenance mode. If it is bjorked that
| has nothing to do with Steam.
| sosodev wrote:
| True, I don't have enough data to really make that claim. I
| can say that my own hardware hasn't changed in ~4 years and
| I've been using Steam for Linux since I built this machine.
| It's only within the last year or so that I started having
| major issues with the client.
|
| > there is nothing special about the steam deck
|
| How is first party support for the hardware and software
| stack "nothing special"?
|
| > If it is bjorked that has nothing to do with Steam.
|
| Maybe it's not directly related to the rest of my comment but
| it's related to the OP. I also think it's indicative of
| Valve's issues with Linux.
| zamalek wrote:
| > How is first party support for the hardware and software
| stack "nothing special"?
|
| Because the vast majority of that stack (the kernel, GPU
| driver, window manager, and so forth) has nothing to do
| with Valve. They might contribute drivers to the kernel
| (I'm not sure if they actually do - I would expect AMD to
| be doing that), but otherwise it's an Arch-based distro
| with the same Steam client and Proton runtime that everyone
| else is using.
| sosodev wrote:
| Yes, the Steam Deck is using a fairly standard stack and
| the default client but you're missing the point. Valve
| directly tests the Steam Deck and prioritizes bug fixes
| for it. When users report issues with other setups it
| often takes months for the identified bug to be fixed, if
| it ever is.
| mariusor wrote:
| > There is nothing special about the Steam Deck. It's just
| another Linux machine.
|
| That's not true. It's a read-only linux on a fixed hardware
| platform, which is a vaaastly different beast than the myriad
| of hardware/software combinations that exist out there in the
| wild.
| zamalek wrote:
| > It's a read-only linux on a fixed hardware platform
|
| I have heard that argument about macOS a lot, and this is
| nothing like that. There isn't some "special sauce er...
| Source" that they apply to their platform. It's just GPL
| Linux. They may have avoided bad decisions like relying on
| NVIDIA for Linux gaming, but that's hardly the level of
| ownership that you see with other vertical integrations. If
| I use an AMD CPU (or Intel, which would be arguably better)
| and AMD GPU, there is no reason why my PC couldn't be just
| as "first-party" as the Steam Deck.
|
| Wine/Proton ultimately access the GPU through DRM; that
| remains the same for Valve hardware or custom-built
| hardware. Both Steam and Wine/Proton currently render via
| X11 (via XWayland if necessary), on both my PC and the
| Steam Deck.
|
| I feel like there is a gap in understanding of how a HAL
| works here.
| mlyle wrote:
| > It's just GPL Linux.
|
| "Just" GPL Linux encompasses myriad library versions,
| kernel versions, driver versions and varied hardware.
|
| > I feel like there is a gap of understanding how a HAL
| works here.
|
| Just because you have a HAL doesn't mean that you don't
| get different behavior and crashes with different numbers
| of CPUs/concurrency or other hardware beneath.
| Modern GPUs are also pretty complicated beasts, and assuming
| that's fully abstracted is a mistake.
|
| And this all leaves aside the myriad of other problems
| you can have with the ensemble of software running on the
| machine that interacts with the game (directly or
| indirectly).
|
| Being able to test and make one restricted platform work
| well is a far different beast than covering the huge mass
| of variation users create on their own machines.
| admax88qqq wrote:
| I feel like there is a gap in understanding of how
| commercial software deployment goes.
|
| When you have a platform like Steam Deck, it's the
| platform that gets tested by QA, and the platform that
| most of your devs are building for every day.
| mariusor wrote:
| Sure, a linux machine is made out of just a CPU and a
| GPU. Even if that were the case, what about the
| software combinations that can exist and that the
| Steam Deck simplifies?
|
| In the gamedev world I heard a lot of people not wanting
| to support linux because they never know which glibc
| version to support, which mesa version to support, which
| hardware GPUs to support, which graphical API to support,
| etc.
|
| Cutting down that matrix (and I just mentioned the most
| egregious examples) to only one element is invaluable in
| ensuring your users have a bug-free experience.
| Arnavion wrote:
| I run Steam in a Docker container of Ubuntu 22.04 for reasons
| like this. Also my actual system isn't polluted with 32-bit
| libs, Steam can't rm -rf my home directory and games can't steal
| files from my home directory (homedir inside the container is a
| separate directory on the host), and access to X and dbus is
| restricted (dbus socket not forwarded, X socket is from a
| nested Xephyr instance) so nothing can be stolen from there
| either.
|
| Edit: More details in
| https://news.ycombinator.com/item?id=34634854
| sosodev wrote:
| Is there a guide for this?
| I'd really like to isolate Steam from the rest of my system
| [deleted]
| Entinel wrote:
| Devil's advocate, I use Steam on Fedora and have had 0 issues.
| Very rarely freezes or crashes. It's probably the most stable
| application I use daily.
| 2OEH8eoCRo0 wrote:
| I use Steam on Fedora as well and I notice a lot of jank with
| the Steam client (Nvidia 1080ti). Dropdown menus popping
| through windows, sound may or may not work for videos,
| freezing, etc. It's usable but it's not very pleasant.
| Entinel wrote:
| Just for GPU comparison I'm on an AMD RX card so it could
| be an Nvidia issue which is known to be jank on Linux.
| sosodev wrote:
| It seems to work perfectly for some people. I've regularly
| had issues with the client not rendering at all, freezing,
| and crashing on Pop!_OS 22.04 LTS with an nvidia GTX 1660ti.
| WaffleIronMaker wrote:
| I also use Steam on Fedora, and I've not had any issues with
| Stardew Valley, Factorio, Celeste, N++, Undertale, and
| others. I remember having a brief issue with Portal, but I
| was able to resolve it. Overall, I've had a good experience.
| amluto wrote:
| It should be straightforward to make a little LD_PRELOAD shim to
| implement the new operator new on top of old overloads and thus
| restore proper functioning.
|
| It would be a gross kludge, though.
| olliej wrote:
| I'm not sure that's sound. You can't just redirect an aligned
| new to the unaligned operator new as you may get an unaligned
| result. It _sounds_ like what is happening is
|
|     a = ::operator new(some size, some alignment)
|     ...
|     ::operator delete(a);
|
| where delete is dropping the align_val_t parameter that would
| guarantee it hits the same allocator family. There are a
| variety of ways this can happen, and let's just take it as
| given that it is.
|
| The problem is that if operator new(size_t, align_val_t) is
| called then the struct has an alignment annotation.
| That can lead to codegen that reasonably assumes alignment, even
| without any source level decisions that depend on alignment. The
| result of having some equivalent of (either at runtime or link
| time)
|
|     void * operator new(size_t sz, align_val_t a) {
|       if (operator new(size_t) has been overridden)
|         return ::operator new(sz);
|       ...
|     }
|
| could be an "aligned" allocation returning an unaligned value,
| causing crashes later on.
| viraptor wrote:
| If you don't mind wasting a bit of time, you could forward
| size+alignment to the allocator, return the aligned version
| and keep a record of aligned-to-allocation mapping. (For
| freeing later)
|
| But as the other comment mentioned - it shouldn't be a problem
| for tf2 in the first place since that's not the behaviour
| they're after.
| olliej wrote:
| > If you don't mind wasting a bit of time, you could
| forward size+alignment to the allocator, return the aligned
| version and keep a record of aligned-to-allocation mapping.
| (For freeing later)
|
| I'm unsure what you're proposing here - the only methods
| you know in the replacement allocator are operator
| new(size_t) and operator delete(void *). The two possible
| failure paths are:
|
|     a = ::operator new(some size)
|     ...
|     ::operator delete(a, alignment)
|
| and
|
|     a = ::operator new(some size, some alignment)
|     ...
|     ::operator delete(a)
|
| In the first case what you could do is say "if I did not
| allocate this pointer, optimistically forward it to
| operator delete(void *)"; in the latter case you can
| identify that a different operator new(size_t) exists but
| you have no idea how to make that allocator produce an
| aligned allocation. What I guess you could do is round the
| size up to a multiple of the specified alignment, and then
| just repeatedly allocate in the hope that you will
| eventually get a correctly aligned value out. But that
| would not be guaranteed.
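Editorial sketch, not from the thread: the bookkeeping viraptor describes above (over-allocate through the plain allocator, hand out an aligned address inside the block, remember which block it came from) might look roughly like this. Everything here is an assumption for illustration: `plain_alloc` and `plain_free` are hypothetical stand-ins for an application's replaced `operator new(size_t)` and `operator delete(void*)`, backed by `malloc`/`free` so the example is self-contained, and `aligned_via_plain`, `release_via_plain` and `offsets` are made-up names. It only helps if the shim sees both the allocation and the free, which is exactly the condition olliej questions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <unordered_map>

// Hypothetical stand-ins for an application's replaced, alignment-unaware
// allocator. A real shim would call the overridden operator new/delete.
inline void* plain_alloc(std::size_t size) { return std::malloc(size); }
inline void plain_free(void* p) { std::free(p); }

// Maps each aligned pointer handed out to the block it was carved from.
// Not thread-safe; a real shim would need locking and would have to avoid
// re-entering the allocator it wraps when the table itself allocates.
inline std::unordered_map<void*, void*>& offsets() {
    static std::unordered_map<void*, void*> table;
    return table;
}

// Over-allocate by `alignment` bytes, then round the address up inside the
// block. `alignment` must be a power of two.
inline void* aligned_via_plain(std::size_t size, std::size_t alignment) {
    void* raw = plain_alloc(size + alignment);
    if (raw == nullptr) return nullptr;
    const auto addr = reinterpret_cast<std::uintptr_t>(raw);
    const auto mask = static_cast<std::uintptr_t>(alignment) - 1;
    void* result = reinterpret_cast<void*>((addr + mask) & ~mask);
    offsets()[result] = raw;  // remember the real block for freeing later
    return result;
}

// Free either kind of pointer: use the recorded block if there is one,
// otherwise assume the pointer came straight from plain_alloc.
inline void release_via_plain(void* p) {
    auto& table = offsets();
    const auto it = table.find(p);
    if (it != table.end()) {
        void* raw = it->second;
        table.erase(it);
        plain_free(raw);
    } else {
        plain_free(p);
    }
}
```

The cost is up to `alignment` wasted bytes plus a table entry per aligned allocation, and a pointer freed by code that bypasses the shim would still reach the wrong allocator.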
| nneonneo wrote:
| The latter suggestion assumes that there's enough entropy
| in the allocation process to make this work. But that's
| not guaranteed! Suppose that your allocator doesn't pad
| allocations (e.g. because it uses a bitmap), and that it
| only guarantees 0x10 alignment. If the top of the heap
| happens to be unaligned with respect to your desired
| alignment (e.g. address ends in 0x10 when you want 0x20
| alignment), you might wind up just repeatedly allocating
| unaligned blocks off the top of the heap forever.
|
| This is not an easy problem to solve, unfortunately. On
| macOS I believe they solve this problem using the
| two-level namespace: symbol references include the library
| name, so "operator new(size_t)" from libstdc++ is
| distinct from "operator new(size_t)" from libtcmalloc.
|
| Symbol versioning also seems like it should solve the
| problem: have the new interfaces explicitly declared with
| a newer ABI version (e.g. @@LIBCXX_17) and link only to
| those new versions from code that expects them. Of
| course, symbol versioning comes with its own set of nasty
| drawbacks, but in this case it seems like a solution that
| might work?
| olliej wrote:
| > The latter suggestion assumes that there's enough
| entropy in the allocation process to make this work. But
| that's not guaranteed!
|
| Oh absolutely, there's no guarantee it's ever aligned:
| the allocator could wrap an aligned allocator but include
| a pointer sized prefix (a la array allocations) so you
| would be _guaranteed_ to never be more than pointer size
| aligned :D
|
| As you say versioning and namespacing is super
| problematic, but I'm not sure they'd even work here.
|
| At its core the problem is that some code is compiling
| with the knowledge it has aligned allocations, so can
| assume alignment, and some parts are not.
| There are a bunch of options that ensure that the allocator is
| consistent, but they devolve to either ignoring the
| new+delete overrides, or having the aligned allocators
| detect the override and forward to unaligned allocators
| while hoping nothing depended on correct alignment.
| viraptor wrote:
| > and then just repeatedly allocate in the hope that you
| will eventually get a correctly aligned value out
|
| If you preload something that patches all the new/delete
| interfaces, you can do this without guesswork.
|
|     new(size, alignment) ->
|       res=alloc(size+alignment)
|       res_aligned=res+...
|       offsets[res_aligned] = res
|     new(size) ->
|       alloc(size)
|     free(ptr) ->
|       free(offsets[ptr] || ptr)
|       offsets.del(ptr)
|
| amluto wrote:
| See my comment above. tcmalloc implements the _C_ API as
| well, including aligned_alloc().
| jenadine wrote:
| That's not sound in general, but it is "probably" going to
| work for this specific case because the previous build was
| built with an allocator that did not support this alignment,
| meaning that they did not need extra alignment. This is
| pretty rare actually. And with previous C++ versions you
| already had to use a custom allocator to make it work anyway.
| olliej wrote:
| While I do agree with you, and think it's probably worth
| seeing if detecting the override and falling back to
| unaligned allocation works, the problem is not that the
| code in TF is compiling assuming/requiring over-aligned
| data.
|
| The problem is that there is system code that they are
| calling that is making use of over-aligned allocation, so
| therefore could be generating code dependent on said
| alignment. The failure mode can very easily be
|
|     someSystemLibrary.so`someFunction:
|       alignedThing = ::operator new(size, alignment)
|       ...
|       i_dunno_dma_memcpy_or_something(alignedThing, somewhere else)
|       ...
|       ::operator delete(alignedThing)
|
| With no interaction with TF code at all. _Except_ TF has
| replaced operator delete so that fails due to the allocator
| mismatch.
| If you make ::operator new(size_t, align_val_t)
| redirect to ::operator new(size_t) if it detects an
| override then the aligned operation can fail. The above
| example is moderately difficult to induce so it's more
| likely that there's an explicit split where the system is
| doing one half of new/delete and TF is doing the other, but
| the important thing is that it implies the system code is
| built aware of alignment and it depends on the alignment
| even if TF does not.
| Asooka wrote:
| The C interface for aligned memory allocation is
| aligned_alloc(). The returned pointers are always freed with
| free(). So what is probably happening is that aligned new
| calls aligned_alloc(), and then aligned delete simply calls
| the regular delete, expecting to end up in free(), which by
| design should work with both kinds of pointers.
|
| I _think_ the problem here is partly with the implementation
| of aligned new/delete. Since one is free to override only
| the old versions, the ones supplied by the standard library
| should make sure not to fall back to functions that may be
| partially overridden.
| amluto wrote:
| As pure speculation, one could forward to aligned_alloc and
| still free with ::delete. I haven't tested this, nor have I
| looked at the code.
| zokier wrote:
| LD_PRELOAD would probably run afoul of VAC though?
| Polycryptus wrote:
| Steam on Linux already uses LD_PRELOAD under the hood to load
| the overlay. Valve signs the overlay SO files, so they could
| be making an exception for Valve-signed preloads in VAC, but
| it's also possible that VAC does something else to check for
| suspicious libraries loaded in.
| mmh0000 wrote:
| I loved the premise of the article, though I really wish the
| author had gone into detail about how he discovered the root
| cause.
| olliej wrote:
| This is a predictable outcome of overriding the global operator
| new.
| It remains annoying that this was ever allowed, and is a
| constant source of pain for C++ standard library implementations.
| phkahler wrote:
| It seems more like the app and driver are mixing their
| new/delete pairs. That seems like a bug to me. Maybe even an
| API design issue if it's supposed to happen.
| DannyBee wrote:
| It actually should still work, since Fedora 38 includes the
| LLVM15 versioned libs.
|
| The only way to make this break is if something is loading
| random unversioned solibs or whatever the latest one it can
| find is, and expecting this to work forever.
|
| If it actually used a versioned solib, it would get LLVM 15
| just like it did before.
|
| This is the whole point of versioned solibs.
| Karliss wrote:
| Graphics drivers using LLVM in the backend have caused
| countless issues. The way I look at it, one of the main problems
| is that graphics API libraries shouldn't leak symbols from
| implementation details like them using LLVM. They should expose
| only the graphics API and nothing more.
| vchuravy wrote:
| Don't ask me about GNU_UNIQUE...
|
| Due to some wonderful C++ features the dynamic linker is forced
| to unify symbols across shared libraries, even if those symbols
| have different versions.
|
| This utterly breaks loading multiple libLLVMs except if you
| build the copy you care about with -fno-gnu-unique (or whatever
| the flag was called)
|
| I have seen wonderful things like the initializers of an
| already loaded libLLVM being rerun when a new one is loaded.
| admax88qqq wrote:
| Unfortunately this is exactly the type of stuff that makes
| supporting commercial apps on linux a nightmare. Weird crashes
| due to weird linking of system libraries.
|
| Common distros are very adamant about dynamic linking everything
| in order to support the use case of "core library has
| vulnerability, upgrade it in place without rebuilding consuming
| apps."
| Along with a desire to avoid "dll hell" and force a single
| canonical version of every library systemwide. This leads to
| these sorts of issues.
|
| Windows gets around it by letting applications put the DLLs they
| care about beside the executable, and having it check there first
| by default.
| eikenberry wrote:
| Isn't this exactly the use case for which flatpaks are
| designed? Isn't Red Hat/Fedora in the process of adopting them
| as the primary way to support third party/proprietary graphical
| apps like Steam? Doesn't the current Steam flatpak avoid this
| issue?
|
| TLDR; isn't this already addressed?
| doublepg23 wrote:
| The funny thing is that on Fedora in 2023 I don't feel like I'm
| missing out on most software.
| ho_schi wrote:
| Aehm. That is what a lot of closed-source applications do on
| Linux. And Valve does that, too.
|
| The open-source ones are maintained in the packaging system and
| kept lean.
| gabcoh wrote:
| Can Linux not trivially do the same thing as Windows with
| LD_PRELOAD? If so why is this more of an issue on Linux than
| Windows? Is it really less a technical challenge and more just
| a matter of Linux getting less support from upstream
| developers?
| bravetraveler wrote:
| I was thinking/wondering this myself. Not to reinvent the
| wheel - more toss an idea around, but a _'venv for
| LD_PRELOAD'_ sounds like it'd deal with this pretty handily
|
| Not... in a way I'd use as a distribution/release maintainer.
| _Probably_ as an administrator [of my LAN]
| gabcoh wrote:
| Such things already exist. E.g. AppImage or even Docker.
| lnxg33k1 wrote:
| and even that has managed to be split between snap,
| appimage and flatpak :D
|
| (sorry not meant to offend, long time linux day-to-day
| user here, but it was just ironic for me to point out
| fragmentation of fragmentation ^^)
| bravetraveler wrote:
| Right, but I don't really want to get into a distribution
| model - the hack suits me fine :)
|
| More an exercise in curiosity than anything
|
| Flatpak (or Snap, ew) probably deals with it fine today,
| Steam's there
| stabbles wrote:
| LD_PRELOAD is too global to be useful; it's hard to scope it
| to one process (and not child processes). macOS is better in
| the sense that it clears DYLD_* variables when the dynamic
| linker has done its work and the process starts. (Although
| that can also be painful when you want to run a shell script
| and set DYLD_* outside)
| nly wrote:
| You can compile binaries with additional relative library
| paths baked into them that will take priority over /usr/lib64
| aidenn0 wrote:
| This sounds like it's an interaction with the GPU driver
| though, which could also happen on Windows...
| stryan wrote:
| Valve does this for a couple of their games, see a similar issue
| with Dota 2[0].
|
| [0] https://github.com/ValveSoftware/Dota-2/issues/2285
___________________________________________________________________
(page generated 2023-04-24 23:00 UTC)