[HN Gopher] Dynamic linking ___________________________________________________________________ Dynamic linking Author : scrollaway Score : 198 points Date : 2020-06-26 17:12 UTC (5 hours ago) (HTM) web link (drewdevault.com) (TXT) w3m dump (drewdevault.com) | iveqy wrote: | Suckless has a project to get a fully statically compiled Linux | environment. Unfortunately I don't know how far that has come | __s wrote: | https://sta.li | mforney wrote: | I don't think stali has seen any activity in several years. | | As far as I know, the only completely statically linked Linux | distribution that is actively developed is my own project | (inspired by stali), oasis: https://github.com/oasislinux/oasis | loeg wrote: | Do you static-link Linux? :-) | mforney wrote: | I'm not sure what you mean here. Are you asking whether I | build my kernel drivers as modules or built-in? Personally, | I build my kernels without modules, but I've never heard of | that technique being called "static-linking Linux". | cryptonector wrote: | The thing that's missing is static link semantics that don't | suck. | pengaru wrote: | Back in the 90s we'd statically link the most frequently executed | programs on busy servers for a significant performance boost. | | Dynamic linking is not a performance feature, it's a decoupling | feature. | binarycrusader wrote: | It is a performance feature if you are memory constrained. | Shared libraries are "shared" for a reason. On a server with | high multi-tenancy the savings can be significant. | jeffbee wrote: | Not really. Dynlinking solves a memory resource problem we | had in the 80s. These days the only people with the problem | are the very smallest embedded systems. | | As for "high multi-tenancy" there's nobody out there with | server occupancy as high as Google's (see the recent Borg | scheduler traces for concrete data) and they statically link | everything. | binarycrusader wrote: | _Not really._ | | Yes, really.
| | _Dynlinking solves a memory resource problem we had in the | 80s. These days the only people with the problem are the | very smallest embedded systems._ | | Definitely not true. Every bit of memory that could be | shared memory but isn't is a reduction in memory available | for filesystem caches, etc. | | _As for "high multi-tenancy" there's nobody out there with | server occupancy as high as Google's (see the recent Borg | scheduler traces for concrete data) and they statically | link everything._ | | Cloud vendor computing models are not generally the | computing model of the rest of the world. Comparison to | their environment is not relevant to the general populace. | | I worked for a "big iron" OEM vendor until late 2017 and | the savings were definitely still significant then both for | their customers and the vendor themselves. | | There are numerous benefits to shared linking; that doesn't | mean it's always the appropriate solution, but it is not | correct to claim that there are no performance benefits. | | Especially on more memory-constrained consumer devices, the | shared memory benefits of dynamic linking are still | significant. | cryptonector wrote: | Dynamic linking boosted performance for Solaris back around | 2004. | AshamedCaptain wrote: | First we claim that dynamic linking does not provide any memory | savings because the libc we used is small, and later on we use | our lack of dynamic linking to justify having a small libc. | Smart, very smart. | kccqzy wrote: | That's what happens when you decouple components and make | decisions for each component separately rather than | holistically for the system as a whole. | saagarjha wrote: | As always, static and dynamic linking both have their advantages | and drawbacks. The usual arguments for dynamic linking are | brought up in the article, and as others have mentioned here, the | analysis is a bit lacking so the conclusions aren't generally | true.
Static linking has its own, fairly straightforward | benefits as well. It's no surprise that those who push one or the | other usually do so because of their specific needs. Sometimes we | even see some interesting hybrid solutions: one recent example is | Apple introducing dyld shared caches on macOS, which (while being | a pain to reverse engineer) are basically all the system | (dynamic) libraries statically linked together and presented | dynamically, with some linker tricks to make it appear seamless. | Likewise, a lot of statically linked binaries are only partially | statically linked, still using things like libc or graphics | libraries. The moral really is to try both and pick whichever | is better for your use case, and perhaps even | consider a mix of both to give you the most flexibility in which | tradeoffs you'd like to make. | cryptonector wrote: | Specifically for C, static linking is a trap due to its poor | semantics. Don't do it. Or fix the C link-editors, then static | linking C will be fine. | sigjuice wrote: | What would be examples of poor static linking semantics? | Thanks! | cryptonector wrote: | https://news.ycombinator.com/item?id=23656173 | andyjpb wrote: | This is an interesting analysis. For my part (anecdata), in my | /usr/bin (Debian) I have 2,956 files under 9MiB in size, one of | 13MiB and one of 39MiB. Most of the files are (much) under | 1.0MiB. | | On the other hand, I have three statically linked binaries for | CloudFoundry in my home directory. One for Linux, one for MacOS | and one for Windows. They are between 24MiB and 27MiB each. | butterisgood wrote: | ldd won't get you all the programs that dynamically load with | dlopen... but these are still interesting results. | gok wrote: | Why stop there? Package every binary in its own container, too! | Most programs only use a few system calls, so you're not really | getting anything by sharing a single kernel for the entire | system.
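The ldd caveat raised a little earlier in the thread is easy to demonstrate: ldd only reports the DT_NEEDED entries recorded at link time, so anything pulled in at runtime via dlopen() is invisible to it. A minimal sketch (all file and symbol names here are hypothetical):

```shell
# A library we will load only at runtime.
cat > mylib.c <<'EOF'
int answer(void) { return 42; }
EOF
cc -shared -fPIC mylib.c -o libmylib.so

# The consumer never links against libmylib.so; it dlopen()s it instead.
cat > plugin_user.c <<'EOF'
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    void *h = dlopen("./libmylib.so", RTLD_NOW);
    if (!h) return 1;
    int (*fn)(void) = (int (*)(void))dlsym(h, "answer");
    printf("%d\n", fn());
    return 0;
}
EOF
cc plugin_user.c -o plugin_user -ldl

ldd ./plugin_user   # libmylib.so is nowhere in this output...
./plugin_user       # ...yet it loads and runs: prints 42
```

Any dependency survey built on ldd alone will therefore undercount plugin-style consumers (PAM and NSS modules, GTK input methods, language runtimes' extension modules, and so on).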
| pengaru wrote: | Static linking has been such a nuisance for the libSDL folks that | they implemented dynamic loading of _itself_ [0], controlled via | an environment variable, as an escape hatch from executables | w/libSDL linked statically. | | It's understandable that games, especially proprietary ones, | distribute statically-linked binaries ensuring any third-party | dependencies will be present and be a compatible version. But the | value of that decision tends to diminish with time, as those | external dependencies are frequently the pieces interfacing with | the system/outside world, which keeps changing, leaving such | dependencies behind to atrophy at best or become | vulnerable/incompatible/broken at worst. | | I don't personally think it makes sense to approach this so | dogmatically. Static linking makes sense in the right | circumstances; so does dynamic linking. For general-purpose | operating systems, it seems obvious to me that you'd want most | higher-order userspace programs dynamically linked. I want my | openssl updates to touch a single library and affect all | installed ssl-using programs, for example. | | Having said that, I do wish the average linux distro still | statically linked everything in /bin and /sbin. It was nice to | still be able to administrate the system even when the dynamic | libraries were hosed. At some point it was changed to just a | single static binary; sln for static ln IIRC, assuming you'd be | able to fix your dynamic libraries with some symlinks if they | were broken, if you happened to have a shell running and could | navigate using just builtins. It was already an impossible | situation, but even that seems to be gone nowadays. | | It's a more nuanced issue; taking an "everything dynamically | linked!" or "everything statically linked!" approach strikes me | as just another form of ignorant extremism. | | [0] | http://hg.libsdl.org/SDL/file/2fabbbee604c/src/dynapi/SDL_dy... | cryptonector wrote: | > [...]
Having said that, I do wish the average linux distro | still statically linked everything in /bin and /sbin. It was | nice to still be able to administrate the system even when the | dynamic libraries were hosed. [...] | | This argument came up back when Solaris 10 was in development | and the project to get rid of static link archives for system | libraries was proposed (search for Solaris "unified process model"). | The disposition of this argument was that if your libraries are | damaged (e.g., someone unlinked them or renamed them out of the | way, or maybe ld.so.1 itself), well, the dependent utilities in | /bin and /sbin themselves could have been damaged too, so you | can't know the extent of the damage, and it's not safe to | continue -- you have to use boot media to repair the damage, or | reinstall. And, of course, the packaging system has to be safe, | but that's not a lot to expect of a packaging system (is it??). | | To my knowledge there were no subsequent customer calls about | this. | jart wrote: | Sometimes solutions give rise to new categories of issues, | and it's difficult to connect the dots to the root cause. If | you believe dynamic linking hasn't introduced an even broader | array of difficulties for C coders needing to support both, | then please read Ulrich Drepper's DSO tutorial which gives a | pretty good rundown: https://software.intel.com/sites/default/files/m/a/1/e/dsoho... If I remember correctly, it was | largely SCO Group that pushed UNIX vendors back in the 1990s | to switch to a WIN32 linking model. I didn't find their | arguments that compelling, to be honest, since they didn't | cite the alternatives they considered. | jart wrote: | There's nothing wrong with taking a principled approach to | builds. For example, it's not just a question of static or | dynamic. Linking of vendored/static/pinned/etc. | sources/artifacts/etc. in general comes with the benefit of | hermeticity.
That enables tools to work better, since it gives | them perfect knowledge of the things that exist. It also | entails a certain degree of responsibility for writing your own | scripts that can generate automated alerts should issues arise | in the upstream sources. | dwheeler wrote: | Not convinced. | | First, this analysis was done on Arch Linux, a source-based | distribution. Since you know at compile time what your | environment is, I would expect the benefits to be smaller. And of | course, this means you're willing to do a lot of recompiles. I'd | like to see analysis done on more traditional (& common) binary | distros. | | Second, the arguments seem a little cherry-picked. "Over half of | your libraries are used by fewer than 0.1% of your executables." | is cute. But modern systems have a lot of executables, so 0.1% of | them is still more than one executable, so that still matters. | And what about the other half? | | Finally, we're already having serious problems getting containers | to upgrade when a security vulnerability is found. Requiring | recompilation of all transitive users is not likely to win any | update speed contests. If it's _completely_ automated then it | would work, but any rocks in the process will leave people | endlessly vulnerable. See the Android ecosystem, etc. | kick wrote: | This comment hits all of the boxes for everything that is wrong | with Hacker News. | | "First, false statement. Since you know false assumption, I | would false conclusion. And of course, this means false | conclusion. I'd like to see analysis done on what you did them | on." | | "Second, the arguments seem a little cherry-picked. "Quote from | article about W and Z" is cute. But modern systems have a lot | of Z, and Obviously You Didn't Consider This." | | "Finally, we're already having serious problems getting Thing | the Author Almost Always Rags on for Sucking to upgrade when a | security vulnerability is found.
Requiring recompilation of all | transitive users who the author doesn't care about and who the | author has already told are wrong is not likely to win any | update speed contests for a use-case the author thinks is | invalid. If it's _completely_ automated then the perceived | invalid use case would still be viable, but any rocks in the | process will leave people with perceived invalid use case | endlessly vulnerable. See Notoriously Bad Ecosystem That Isn't | Relevant to the Article, etc." | cryptonector wrote: | Actually, this is a very serious topic, and TFA is wrong in a | number of ways. HN commenters being able to say so is | everything that's right with HN. My own commentary on TFA is | all over the comments on this HN post. I put more effort and | detail into my comments on this post than GP, but GP is not | wrong. | | Commenting that a comment is exemplary of all that's wrong | with HN is... popular, I'll grant you that, but it grates, | and it's not really right. It's a "shut up" argument. Just | downvote what you don't like, upvote what you do like, and | reply with detailed commentary where you have something | important to say. No need to disparage commenters. | saagarjha wrote: | I know it's a popular Hacker News trope, but when done well I | actually appreciate comments of the style "the author does a | comparison and has some results, but do also keep in mind xyz | that is non-obvious". I know there is a problem with people | who go through articles and try to pick things that could | affect the results and treat that as invalidating them, but | more often than not simple comparisons like these _do_ have | more to say about them than what is presented and I don't | want to discourage those. | [deleted] | BeetleB wrote: | > First, this analysis was done on Arch Linux, a source-based | distribution. Since you know at compile time what your | environment is, I would expect the benefits to be smaller.
And | of course, this means you're willing to do a lot of recompiles. | | I run Gentoo on my Linux box. It takes a long time to compile | stuff as it is (and I have a 22-core machine). Being forced to | compile a lot more because everything is linked statically | would be a nightmare. | | This is definitely not a plus for source-based distributions. | bjourne wrote: | Running Gentoo and complaining about long compile times is | like cutting off your leg and complaining that walking hurts. | pengaru wrote: | > Running Gentoo and complaining about long compile times | is like cutting off your leg and complaining that walking | hurts. | | I fail to see the relevance of this comparison. | | It's more like wearing fitted clothing and (rationally) | objecting to requiring a visit to your tailor every time | you change your socks. | BeetleB wrote: | I'm not complaining, and I can live with the compile times | I currently have. | | I'm pointing out that saying this will benefit source-based | distributions more is nonsense. | eikenberry wrote: | If compilers weren't so pathetically slow this wouldn't be | that much of an issue. If this became widespread it might | have the positive impact on projects like LLVM and get them | to pay some attention to compile times, not just | optimization. | btilly wrote: | The problem with compile times is mostly not the compiler's | fault. It is with the project setup. | | See http://www.real-linux.org.uk/recursivemake.pdf for a | classic explanation. | eikenberry wrote: | But we haven't learned from this as modern, non-make | based build systems still suffer from terribly slow | compiles. Take Rust for example. I don't know a single | project that uses make to build (they all use cargo), yet | Rust suffers from extremely slow compile times. Much of | this (as far as I understand) comes from the LLVM | compiler, which is why I was picking on compilers. | eeereerews wrote: | What do you mean by "source-based distribution"?
| riquito wrote: | One without packages shipped as prebuilt binaries: Gentoo [1] | would be a good example. You compile every binary yourself | (sometimes you can use prebuilt binaries for the biggest | software, if you are so inclined and the distro allows it). | | [1] https://www.gentoo.org/ | danShumway wrote: | My Arch install doesn't compile anything locally unless I'm | installing from AUR. | | Maybe it's doing it behind the scenes or something, but I'm | doubtful, because otherwise I think my upgrades would take | a lot longer. | lordleft wrote: | Doesn't Arch use AUR / pacman for binaries / package | management? I wasn't aware that like Gentoo you had to | compile most software. | ddevault wrote: | No, he's wrong. Arch Linux is primarily a binary | distribution. | legulere wrote: | How is Arch Linux a source-based distro? | abjKT26nO8 wrote: | _> First, this analysis was done on Arch Linux, a source-based | distribution. Since you know at compile time what your | environment is, I would expect the benefits to be smaller. And | of course, this means you're willing to do a lot of | recompiles. I'd like to see analysis done on more traditional | (& common) binary distros._ | | Arch Linux is not Gentoo. And AUR is only a secondary method of | installing software. So I'm not sure what you mean. | bjourne wrote: | Either something is wrong with the testing script or my computer | is way faster than I thought:
 |     ./test.sh | awk 'BEGIN { sum = 0 } { sum += $2-$1 } END { print sum / NR }'
 |     -698915
 | bryanlarsen wrote: | To get these stats, Drew used 5 different scripts in 5 different | languages: Awk, sh, C, Go & Python. Well, the C program isn't a | script; it's a test artifact. Drew must subscribe to the "best | tool for the job" philosophy rather than the "use what you know" | philosophy. | dan-robertson wrote: | I'm curious about the supposed memory advantages of dynamic | linking: on average how many different executables share each | page of memory?
What about when memory is in high demand? How | high does that average become? What is the probability that a | page of a shared library is already in memory (cache or | otherwise) when it needs to be loaded (and in particular the | probability that it is there because another program loaded it)? | | My guess is that apart from eg libc, the average is pretty low | (ie 1 for the pages that aren't free). | saagarjha wrote: | I think this depends heavily on your platform and how often | libraries are reused. On macOS, for example, most libraries | aside from libc are loaded R/O or CoW quite literally hundreds | of times, because every app shares AppKit and WebKit and | Security and the dozens of other platform frameworks (and their | private dependencies!) that are basically "free" to use and | ship with the system and so have very high adoption. On more | "coherent" Linux distributions I'm sure things like GTK, glib, | OpenSSL, zlib are used by a lot of things too. Sure, there's | going to be a lot of one-off dynamic libraries too, but there's | a lot of duplication with the popular dependencies and then a | long tail. | sitzkrieg wrote: | these benchmarks don't seem to take into account the page cache | pm215 wrote: | I think a more interesting analysis of "security vulnerability | costs for static linking" would look not at just "how many bytes | does the end user download" but "what are the overall costs to | the _distro_ to support a fully statically linked setup", | looking at eg CPU costs of doing the rebuild or how much total | elapsed time it would take to do a full rebuild of every affected | package. | mcguire wrote: | Not to mention that fixing a security vulnerability in, say, | libm or libc becomes an amount of work equivalent to a | distribution upgrade with all the associated risks. | | Or am I the only one who has occasional problems when replacing | all the binaries on my system?
| yjftsjthsd-h wrote: | That sounds like a separate issue; normally, having to do a | full dist upgrade means that you changed versions of lots of | core components (kernel, init, libc, gcc), which is indeed | traumatic. If you replace every binary but the only change is | bumping the statically-linked libc from x.y.0 to x.y.1, it | should be just as boring as making the same change with a | dynamically-linked libc. | albertzeyer wrote: | > Will security vulnerabilities in libraries that have been | statically linked cause large or unmanageable updates? | | This is maybe not an issue for open source packages which are | managed by your distribution package manager, assuming they | update all the dependent packages once some library gets updated | (which would lead to a lot more updates all the time). | | However, the maybe more critical issue is about other | independently installed software, or maybe closed source | software, where you will not automatically get an update once | some library gets updated. | k__ wrote: | What are the counterarguments? | old-gregg wrote: | Run htop or similar, sort by "shared memory" column and see how | much more memory you'd need per process if shared linking did | not exist. | | I think the author's using the wrong method to make a point. | Dynamic linking feels out of place for most long-running | server-side apps (typical SaaS workload). One can argue that in | a mostly-CLI environment there's also not much benefit. | | But even an empty Ubuntu desktop runs ~400 processes and | dynamic linking makes perfect sense. libc alone would have to | exist in hundreds of incarnations consuming a hundred-plus | megabytes of RAM and I'm not even talking about much, much | heavier GTK+ / cairo / freetype / etc. libraries needed for GUI | applications. | a1369209993 wrote: | > Run htop or similar, sort by "shared memory" | | The top two entries are 80 and 36 kB respectively.
RES is 2.3 | _giga_-bytes (over four orders of magnitude larger) between | them. Even multiplying the top SHR by your ~400 processes | gives 32 MB (still two orders of magnitude off). That is not | a counterargument; that is agreement that dynamic linking | is useless. | | Edit: RES, not VIRT. | ddevault wrote: | That is an extremely misleading figure. Shared memory is | page-aligned entire libraries dropped into RAM. Statically | linking would, as the article shows, only use on average | about 4% of the symbols available from the libraries, and the | majority of this would not end up in RAM with your statically | linked binary. And if you used a more selective approach, | dynamically linking to no more than perhaps a dozen | high-impact libraries and statically linking the rest, you'd get a | lot of the benefits and few of the drawbacks. | | Put the cold hard numbers right in front of someone's face | and still the cargo cult wins out. | loeg wrote: | > Put the cold hard numbers right in front of someone's | face and still the cargo cult wins out. | | Come on. | | You did not actually measure the figure GP mentioned and | which you are disputing. Your methodology and assumption -- | that 4% external symbol use translates into 4% size used -- | is a plausible guess, but you haven't supported it with | data. | | Even if you had measured the figure you're accusing GP of | ignoring, the tone of your remark is just aggressively | condescending and inappropriate. Tone it down. | | To address your other claims: | | > Shared memory is page-aligned entire libraries dropped | into RAM. | | There is a good reason to page- or superpage-align code | generally; it burns some virtual memory but reduces TLB | overhead and therefore misses / invalidations, which are | very costly. You would want to do the same with executable | code in a static-linked binary.
| | > the majority of [the small fraction of static linked | library used] would not end up in RAM with your statically | linked binary. | | Huh? Why do you claim that? | | > And if you used a more selective approach, dynamically | linking to no more than perhaps a dozen high-impact | libraries and statically linking the rest, you'd get a lot | of the benefits and few of the drawbacks. | | I think that claim is plausible! But it wasn't an option | presented on your blog post, nor was it discussed by GP. | Prior to that comment, discussion was only around 100% vs | 0% dynamic linking. | bjourne wrote: | > There is a good reason to page- or superpage-align code | generally; it burns some virtual memory but reduces TLB | overhead and therefore misses / invalidations, which are | very costly. You would want to do the same with | executable code in a static-linked binary. | | But most code isn't performance critical. Thus trying to | align functions to page boundaries is just wasting | memory. Even in performance critical code, aligning to | cache line sizes is enough and aligning to page | boundaries doesn't provide any advantage. | MaulingMonkey wrote: | Not only is it wasting memory, but I wouldn't be | convinced it'd not be outright hurting performance by | increasing cache line aliasing, increasing TLB overhead | and misses/invalidations! _Avoiding_ page alignment can | be a performance gain! | https://pvk.ca/Blog/2012/07/30/binary-search-is-a- | pathologic... | | Like, sure, mapping your executable's code section in at | a page boundary is probably fine, but I think trying to | align individual functions to page boundaries would be a | counterproductive mistake as a general strategy. | old-gregg wrote: | _That is an extremely misleading figure. Shared memory is | page-aligned entire libraries dropped into RAM_ | | First of all, I seriously doubt it (haven't looked closely, | but if library-loading is similar to mmap, it should only | count actually used segments). 
| | But that shouldn't even matter, as most of these libraries | are fully utilized. All of libc is used collectively by | 400+ processes, that is also true for the complex multi- | layer GUI machinery. The output of ldd $(which gnome- | calculator) is terrifying, run it under a profiler and see | how many functions get hit, you'll be amazed. | | _Put the cold hard numbers right in front of someone 's | face and still the cargo cult wins out._ | | The coldness of your numbers did not impress. And calling a | reasonable engineering trade-off "cargo cult" doesn't get | you any points either. | | Static linking is better than dynamic linking. Sometimes. | And vice versa. That's how engineering is different from | science, there are no absolute truths, only trade-offs. | cryptonector wrote: | @ddevault: performance is only part of the story. The | poverty of semantics of C static linking is atrocious -- it | could be fixed, but until it is we should not statically | link C programs. | jnwatson wrote: | Go executables are statically linked. It makes deployment a | breeze. | | I think you overestimate how much saving you get from | dynamically linking libc. Each executable uses only a small | portion of libc, so the average savings is going to be in the | handful of kilobytes per executable. | andoma wrote: | In theory yes. 
However, in practice static linking with | glibc pulls in a lot of dead weight; musl comes to the | rescue, though:
 |
 | test.c:
 |     #include <stdio.h>
 |     int main(int argc, char **argv) { printf("hello world\n"); return 0; }
 |
 | Dynamic linking (glibc):
 |     $ gcc -O2 -Wl,--strip-all test.c
 |     $ ls -sh a.out
 |     8.0K a.out
 |
 | Static linking (glibc):
 |     $ gcc -O2 --static -Wl,--strip-all test.c
 |     $ ls -sh a.out
 |     760K a.out
 |
 | Static linking (musl):
 |     $ musl-gcc --static -O2 -Wl,--strip-all test.c
 |     $ ls -sh a.out
 |     8.0K a.out
 | jart wrote:
 | Static linking (https://github.com/jart/cosmopolitan):
 |     jart@debian:~/cosmo$ make -j12 CPPFLAGS+=-DIM_FEELING_NAUGHTY MODE=tiny o/tiny/examples/hello.com
 |     jart@debian:~/cosmo$ ls -sh o/tiny/examples/hello.com
 |     20K o/tiny/examples/hello.com
 | | Note: Output binary runs on Windows, Mac, and BSD too. | glouwbug wrote: | True, but a couple GitHub imports and 10 lines of code | generate a 100 MB binary. But, to be fair, I guess we're | okay with shipping huge binaries nowadays because we're | literally shipping whole environments with docker anyway. | yjftsjthsd-h wrote: | To be fair, you're not really supposed to be shipping a | full system in a docker image if you can help it; you're | supposed to layer your application over the smallest base | that will support it (whether that's scratch, distroless, | alpine, or a minimal debian base). Of course, I'll be the | first to agree that "supposed to" and reality have little | in common; if I had a dollar for every time I've seen a | final image that still included the whole compiler | chain... | cryptonector wrote: | https://news.ycombinator.com/item?id=23656173 | k__ wrote: | Are these also issues for languages like Go and Rust? | kccqzy wrote: | Of course not. It's not even a problem for C++ if you use | namespaces pervasively. This is purely a problem for C. | cryptonector wrote: | No, not for Rust, and not for Golang, IIRC.
The reason is | that they have proper namespacing functionality in the | language, and generally modern languages record dependency | metadata in the direct dependents. | | This is strictly a C problem. | joosters wrote: | _Over half of your libraries are used by fewer than 0.1% of your | executables_ | | That's a very misleading reference and graph. First of all, what | did they expect to find? As you add more executables, _of course_ | the % usage of a library will decrease. | | e.g. say I have a networking library on my computer, and, in a | perfect world, all my installed network tools link against it. | But now I install Gnome, and my machine has hundreds more | binaries. Not all of the binaries will do networking stuff, so | the % usage of the networking library goes down. But that doesn't | mean that the networking library is not being shared as well as | it could be. | | A much better metric would be to count, for each shared library | on a machine, the number of programs that link against it. If | only one program uses a shared library, then that means the | 'shared-ness' is not being used. If more than one program uses it, | then the library is being effectively shared. But the actual | count, whether it is 200 users or 20, doesn't mean anything more. | That's why comparing all libraries against libc's usage shows | nothing useful. | jjoonathan wrote: | Yeah. Loading the GUI widget toolkit once rather than 30 times | is pretty nifty. | cryptonector wrote: | Even better would be to measure sharing in the totality of | programs you use directly and indirectly from boot. | eeereerews wrote: | >Do your installed programs share dynamic libraries? | | >Findings: not really | | >Over half of your libraries are used by fewer than 0.1% of your | executables. | | Findings: Yes, lots, but mostly the most common ones. Dynamically | linking against something in the long tail is pretty pointless | though.
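The metric proposed above, counting for each shared library how many installed programs link against it, is easy to approximate from the shell. A rough sketch (the /usr/bin scope is an assumption; adjust for your distro, and the sonames are taken from the `=>` lines ldd prints):

```shell
# For each executable, record (binary, soname) pairs from ldd, dedupe,
# then count how many distinct binaries reference each soname.
for bin in /usr/bin/*; do
    [ -x "$bin" ] || continue
    ldd "$bin" 2>/dev/null | awk -v b="$bin" '/=>/ { print b, $1 }'
done | sort -u | awk '{ print $2 }' | sort | uniq -c | sort -rn | head -20
```

A soname with a count of 1 gets no sharing benefit at all; and note that, per the ldd caveat raised earlier in the thread, this still misses anything loaded via dlopen.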
| rumanator wrote: | > Dynamically linking against something in the long tail is | pretty pointless though. | | I disagree. Dynamic linking, in the context of an OS which | offers a curated list of packages in the form of an official | package repository, means that a specialized third party is | able to maintain a subcomponent of your system. | | This means you and me and countless others are able to reap the | benefit of bugfixes and security fixes provided by a third | party without being actively engaged in the process. | | In the context of an OS where the DLL hell problem hasn't been | addressed and all software packages are forced to ship all | their libraries that are shared with no one at all, indeed it's | pretty pointless. | eeereerews wrote: | It can also cut the other way though. Bugs can be introduced, | compatibility can be broken, users may not find the library | in their package manager, or they may find too new a | version. The danger of this is smaller for popular libraries, | but goes up as you move to the long tail. | wahern wrote: | But this also highlights the benefit of community | packaging. Debian packagers often backport security fixes | into older versions of libraries that are no longer | maintained upstream. That's a big part of their job--not | just to bang out a build and walk away, but to keep an eye | on things. This is why it's important to only use distro-packaged | libraries as much as you can, even when statically | linking. | | Getting off the treadmill of integrating interface-breaking | upstream changes is one of the biggest _practical_ reasons | people prefer static linking and directly adding upstream | source repositories into their build. It's at least as | important, IME, as being able to use newer versions of | libraries unavailable in an LTS distro.
It can work well | for large organizations, such as Google with their | monolithic build, because they can and often do substitute | the army of open source packagers with their own army of | people to curate and backport upstream changes. For | everybody else it's quite risky, and if containerization is | any indication, we're definitely worse off given the | staleness problems with even the most popular | containers.[1] | | [1] I wouldn't be surprised if an open source project | emerged to provide regularly rebuilt and possibly patched | upstream containers, recapitulating the organizational | evolution of the traditional FOSS distribution ecosystem. | sprash wrote: | Static linking allows LTO with aggressive inlining and is | therefore able to achieve far superior performance beyond just | the startup time. Arguing that dynamic vs. static has better RAM | utilization or not is pointless because nowadays we have plenty | of RAM but single-core performance has been stagnating for almost a | decade already. Moore's law might give us more transistors but | single-thread performance is still more or less bound by the | clock and transistor switching frequency. Sooner or later static | linking will become the only way to move forward and the | conveniences dynamic linking offers will not be worth the costs. | cryptonector wrote: | > Arguing that dynamic vs. static has better RAM utilization or | not is pointless because nowadays we have plenty of RAM ... | | Using more RAM means having lower cache hit ratios. If dynamic | linking means using less RAM, you win. But it's not a clear-cut | thing -- it will depend a lot on the surrounding ecosystem. | | In any case, for C, the problem with static linking is about | semantics. Until those are fixed I'll be resolutely against | static linking for C, and for everything else, well, do | whatever performs best. | zelly wrote: | Dynamic linking provides encapsulation and security benefits.
An | application with a statically linked OpenSSL can be vulnerable if | a CVE comes out for that version of OpenSSL, whereas a | dynamically linked OpenSSL could be patched immediately without | recompilation (which may be impossible if the software is | proprietary). The vendor of the shared library can update the | implementation without requiring all downstream consumers to | recompile. This should not be downplayed. | | Pure static linking makes sense when you are deploying code you | control onto an environment you control. | | While I like the ease of deploying Go, where binaries are one big | blob that just works, it makes it closer to the Java style than | the UNIX modular-tool-does-one-thing-well tradition. | pwdisswordfish2 wrote: | With NetBSD, it is relatively easy to compile the entire system | as static. Is it easy to do this with Arch Linux? (Maybe Void | Linux?) | cryptonector wrote: | By far the most important reason for dynamic linking for C is | semantics: static linking semantics are stuck in 1978 and suck | (more on that below), while dynamic linking semantics make C a | much better language. | | In particular, static linking for C has two serious problems: | | 1. symbol collisions -> accidental interposition (and crashes); | | 2. you have to flatten the dependency tree into a topological | sort at the final link-edit. | | Both of these are related, and they are disastrous. They are also | related to the lack of namespaces in C.
| | Besides fixing these issues, the C dynamic linking universe also | enables things like: | | - run-time code injection via LD_PRELOAD and intended | interposition | | - run-time code loading/injection via dlopen(3) | | - audit (sotruss) | | - reflection | | - filters (which allow one to move parts of a library's contents to | other libraries without forcing re-links and without forcing | build systems to change to add new -lfoo arguments to link-edits) | | - use of dladdr(3) to find an object's install location, and then | that to find related assets' install locations relative to the | first, which then yields code that can be relocated at deploy | time (sure, "don't do that" is a great answer, but if you | statically link then you think you can, and now you just can't | have assets to load at run-time) | | - use of weak symbols to detect whether a process has some | library loaded | | and others. | | C with those features is a far superior language -- a different | language, really -- to C without them. | | (EDIT: A lot of the semantics of ELF could be brought to static | linking. Static link archives could have a .o that has metadata | like dependencies, "rpaths", exported/protected symbols, | interposer symbols, etc. The link-editor would write and consume | that metadata. However, it's 2020, and the static link ecosystem | is stuck in 1980 because no one has bothered, and no one has | bothered because dynamic linking is pretty awesome. Still, it | could be done, and once in a while I think I ought to do it to | help save people from themselves who want static linking.) | | > Do your installed programs share dynamic libraries? | | > Findings: not really | | > Over half of your libraries are used by fewer than 0.1% of your | executables. | | The C library most certainly gets shared, as well as libm and | such. The rest, it's true, not so much, but it does depend on | what you're measuring. Are you measuring C++ apps?
Yeah, C++ | monomorphization leads to essentially static linking. Are you | measuring Java apps with no significant JNI usage? You won't find | much outside the libraries the JVM uses. | | > Is loading dynamically linked programs faster? | | > Findings: definitely not | | Dynamically-linked programs will load faster when their | dependencies are already loaded in memory, and slower otherwise. | The biggest win here is the C library. | | > Will security vulnerabilities in libraries that have been | statically linked cause large or unmanageable updates? | | > Findings: not really | | Correct. But, being able to update libc or some such and not have | to worry about updating consumers you might not even know about | is a very nice feature. | cryptonector wrote: | In a containerized world, static linking will generally be | faster, unless the OS does page de-duping for text/rodata. So | doing the work to make static linking semantics not suck seems | likely to be important. | mforney wrote: | You bring up some good points here. Here are some of my | experiences with these problems when working on oasis (my | static linux distro). | | > 1. symbol collisions -> accidental interposition (and | crashes); | | I've encountered symbol collisions only twice, but both | resulted in linker errors due to multiple function definitions. | I'm not sure how this could happen accidentally. Maybe you are | referring to variables in the common section getting merged | into a single symbol? Recent gcc enables -fno-common by | default, so those will be caught by the linker as well. | | > 2. you have to flatten the dependency tree into a topological | sort at the final link-edit. | | Yes, this is pretty annoying. pkg-config can solve this to some | degree with its --static option, but that only works if your | libraries supply a .pc file (this is often the case, though).
| | I think libtool can also handle transitive dependencies of | static libraries, but it tries hard to intercept the -static | option before it reaches the compiler so it links everything | but libc statically. You can trick it by passing `-static | --static`. | | For oasis, I use a separate approach to linking involving RSP | files (i.e. linking with @libfoo.rsp), which really are just | lists of other libraries they depend on. | | > Besides fixing these issues, the C dynamic linking universe | also enables things like: | | > - run-time code injection via LD_PRELOAD and intended | interposition | | Yes, this can be a problem. I wanted to do this recently to | test out the new malloc being developed for musl libc, but | ended up having to manually integrate it into the musl sources | instead of just using LD_PRELOAD. | | > - run-time code loading/injection via dlopen(3) | | In particular, this is a big problem for scripting languages | that want to use modules written in compiled languages, as well | as OpenGL which uses dlopen to load a vendor-specific driver. | | > Dynamically-linked programs will load faster when their | dependencies are already loaded in memory, and slower | otherwise. The biggest win here is the C library. | | But doesn't the dynamic linker still have to do extra work to | resolve the relocations in the executable, even when the | dependency libraries are already loaded? | cryptonector wrote: | > > 1. symbol collisions -> accidental interposition (and | crashes); | | > I've encountered symbol collisions only twice, but both | resulted in linker errors due to multiple function | definitions. I'm not sure how this could happen accidentally. | Maybe you are referring to variables in the common section | getting merged into a single symbol? Recent gcc enables | -fno-common by default, so those will be caught by the linker as | well. | | No, this comes up all the time.
Try building an all-in-one | busybox-style program, and you'll quickly run into conflicts. | | If static link archives had all the metadata that ELF files | have, then the link-editor could resolve conflicts correctly. | That is the correct fix, but no one is putting effort into | it. The static linkers haven't changed much since symbol | length limits were raised from 14 bytes! | | > > 2. you have to flatten the dependency tree into a | topological sort at the final link-edit. | | > Yes, this is pretty annoying. pkg-config can solve this to | some degree with its --static option, but that only works if | your libraries supply a .pc file (this is often the case, | though). | | pkg-config alleviates the problem, but it's not enough. Among | other things, building a build system that can build with | both static and dynamic linking is a real pain. But more | importantly, this flattening of dependency trees loses | information and makes it difficult for link-editors to | resolve symbol conflicts correctly (see above). | | > > Dynamically-linked programs will load faster when their | dependencies are already loaded in memory, and slower | otherwise. The biggest win here is the C library. | | > But doesn't the dynamic linker still have to do extra work | to resolve the relocations in the executable, even when the | dependency libraries are already loaded? | | It's still faster than I/O. (Or at least it was back in the | days of hard drives. But I think it's still true even in the | days of SSDs.) | devit wrote: | A better way to do this analysis would be to build a Linux | distribution with everything statically linked and compare to the | normal version with dynamic linking, looking at disk space used, | startup time, memory used, and time to launch specific | applications both cold and hot. | weinzierl wrote: | > A better way to do this analysis would be to build a Linux | distribution with everything statically linked [..]
| | Here you go: Stali | | https://dl.suckless.org/htmlout/sta.li/ | | _"Stali distribution smashes assumptions about Linux"_ | | https://www.infoworld.com/article/3048737/stali-distribution... | devit wrote: | For a useful comparison you need the static and dynamic | distributions to be otherwise the same, i.e. you want to pick | a mainstream distribution and build it from scratch with both | static and dynamic linking and compare. | a1369209993 wrote: | I don't know of any mainstream distribution that doesn't | make full-static-from-scratch builds gratuitously painful | and difficult. Personally I gave up after trying to | blunt-force-trauma glibc into linking correctly, though, so | someone with more internals knowledge might have better | success at it. | yjftsjthsd-h wrote: | Depends on your ideas of "mainstream"; I expect nixos and | gentoo are both happy to do such rebuilds for you. But, | as you note, the real pain is that glibc really doesn't | want you to do static builds... I wonder how good the musl | support in gentoo and/or nixos is... | cryptonector wrote: | When during Solaris 10 development the "unified process model" | was introduced, and static link archives for all system | libraries removed, boot times improved dramatically because all | the programs that run at boot time were then dynamically-linked | and started faster than their previous statically-linked selves | because the C library and such were already loaded. Later of | course we had even more dramatic boot time improvements via SMF | (the Solaris/Illumos predecessor to Linux's systemd). | | That was an apples-to-apples comparison of pre- and | post-process-model-unification performance, and it was a win. | | Now, this was back in... I want to say 2003 or 2004 -- before | S10 shipped. And it's possible that the same experiment today | would not have the same result.
| | I'm not sure how easy it would be to construct a distro with | only dynamically-linked executables (at least for core | libraries, like the C library) and then the same distro with | only statically-linked executables. The S10 work was done by | Roger Faulkner (RIP) and it was a huge change. | setr wrote: | I believe the original reasoning for dynamic linking wasn't | performance gains, but security gains -- someone described the | driving story to me as essentially this: a vulnerability found in a | very common library required updating and re-compiling | _everything_ on _every system_, scarring sysadmins globally and | permanently; the space saving and performance aspects came up as | later "bonuses". | | I have little memory of the details of the story, and I'm not | 100% sure it's true, but it's a much more satisfying and | reasonable argument for dynamic linking than performance/space. | | Of course, the more modern solution would probably be a good | package manager -- if it's trivial to recompile things, and track | what needs to be recompiled, then dynamic linking seems to gain | little, but bring in a lot of its own headaches (as we know | today). | cryptonector wrote: | The original reasoning for shared libraries was reducing memory | footprint -- back then memory was scarce. We're talking back in | the days of SVR2, the mid-80s. | | The original reasoning for ELF was that static link semantics | suck. ELF's semantics are far superior | (https://news.ycombinator.com/item?id=23656173). | bjourne wrote: | Stallman and other GNU folks have claimed that they stuck with | dynamic linking to foster cooperation between hackers. They | knew that it wasn't a win performance-wise but they wanted to | encourage hackers to help each other. The idea was that if | someone found a bug in a library he or she depended on they | would be "forced" to send the patch upstream rather than just | fixing the bug in their local copy of the library.
Thus they | made dynamic linking the default in gcc and kept static linking | as something of an afterthought. | | I read it on a mailing list a long time ago so I don't have a | source. | hknapp wrote: | I appreciate the brevity of this article. | earthboundkid wrote: | Say we have program X with dependency Y. X+Y is either dynamic or | static. X can either have responsive maintainers or unresponsive | maintainers. Y can either change to fix a bug or change to add a | bug. (With Heartbleed, I remember our server was fine because we | were on some ancient version of OpenSSL.) Here are the scenarios: | | - dynamic responsive remove bug: Positive/neutral. Team X would | have done it anyway. | | - dynamic unresponsive remove bug: Positive. | | - dynamic responsive add bug: Negative. Team X will see the bug | but only be able to passively warn users not to use Y version | whatever. | | - dynamic unresponsive add bug: Negative. Users will be impacted | and have to get Y to fix the error. | | - static responsive remove bug: Positive/neutral. Team X will | incorporate the change from Y, although possibly somewhat slower | (but safer). | | - static unresponsive remove bug: Negative. Users will have to | fork X or goad them into incorporating the fix. | | - static responsive add bug: Positive. Users will not get the bad | version of Y. | | - static unresponsive add bug: Positive. Users will not get the | bad version of Y. | | Overall, dynamic is positive 1, neutral 1, negative 2, and static | is positive 2, neutral 1, negative 1. Unless you can rule out Y | adding bugs, static makes more sense. Dynamic is best if | "unresponsive remove bug" is likely, but if X is unresponsive, | maybe you should just leave X anyway.
| dan-robertson wrote: | If you're a Linux distribution and you have the source code to | the software you distribute then the responsiveness of the | maintainers doesn't matter so much: any change to a library | which is compatible with the previous version (in the sense | that you could dynamically link to the new version instead of | the old version) is going to be compatible under static linking | too. | danShumway wrote: | That's a really good point that I hadn't considered. All of | the software I'm pulling from Pacman is being compiled from | source, the authors aren't uploading binaries. | | So the difference in effort between changing the static | library and changing the dynamic library is... maybe not | nothing, but not nearly as high as I was assuming. | furbyhater wrote: | I don't agree with this: | | - static responsive add bug: Positive. Users will not get the | bad version of Y. | | On the contrary, it's negative because the maintainers of X | will update Y and introduce the bug to their users (I'm | assuming they don't thoroughly audit the source of Y each time | they do a version upgrade). | | Also, "unresponsive remove bug" should be given more weight | since it's more likely to happen IMO. | petters wrote: | Sure, but I think the most important reason people have in mind | when they argue for dynamic linking is that they will receive | upstream bugfixes. I don't think your eight cases are equally | important. | | You are of course right, though, and writing it down like this | can be useful. | IshKebab wrote: | Why would you not receive upstream bug fixes with statically | linked programs? Assuming you are using an Apt-style package | manager then the statically linked program would be rebuilt | and updated too. 
| | If you are not using an apt-style package manager then the | program must include all of its dependencies (except ones | that are guaranteed to be present on the platform, which is | none on Linux and a few on Mac/Windows), and you will receive | bug fixes when that program is updated, whether or not it | uses static linking. | | Static/dynamic linking does not affect how likely you are to | get bug fixes in any way as far as I can tell. | saagarjha wrote: | The first hurdle is that you require source code to do any | of this. Then you need to actually rebuild everything. | Vogtinator wrote: | > On average, dynamically linked executables use only 4.6% of the | symbols on offer from their dependencies. | | That's correct, but also very misleading and leads to the wrong | conclusion. | | The dynamically linked library has references to itself, | externally visible or not. It would be wrong to claim that | Application.run(); only uses a single symbol of a library. | | > A good linker will remove unused symbols. | | With LTO or -f{function,data}-sections + --gc-sections any linker | will do. Without those options no linker is allowed to. I believe | that this is the reason why static libraries are usually shipped as | separate object files (.o) within ar archives (.a), as those were | only linked in on demand. | iforgotpassword wrote: | > I believe that this is the reason why static libraries are | usually shipped as separate object files (.o) within ar | archives (.a), as those were only linked in on demand. | | Yep. One function per C/obj file for smallest static binary | possible. | arcticbull wrote: | Can't you achieve the same result with -ffunction-sections? | Someone wrote: | Also, how many symbols it uses is relevant for the time linking | takes, but not for memory usage. | | Worse, I think it would have been easier to measure the size in | bytes of binaries, or, with a bit more effort, resident memory | size.
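The effect of those flags is easy to see with a two-function object file (a sketch of my own; the --gc-sections behavior described in the comments is for GNU ld/lld -- other linkers vary):

```shell
# Two functions in one translation unit; only one is ever called.
cat > lib.c <<'EOF'
int used(void)   { return 1; }
int unused(void) { return 2; }
EOF
cat > main.c <<'EOF'
int used(void);
int main(void) { return used(); }
EOF

# Default build: linking the .o directly pulls in everything, dead or not.
cc -c lib.c main.c
cc -o plain main.o lib.o
nm plain | grep unused                        # symbol still present

# One section per function lets the linker discard what's unreferenced.
cc -ffunction-sections -fdata-sections -c lib.c main.c
cc -Wl,--gc-sections -o gc main.o lib.o
nm gc | grep unused || echo "discarded"       # section (and symbol) dropped
```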
| wahern wrote: | Do any Linux/glibc or Linux/musl systems support static PIE | binaries, yet? Without static PIE support you don't benefit from | ASLR (at least not fully). This 2018 article seems like a good | breakdown of the issues: | https://www.leviathansecurity.com/blog/aslr-protection-for-s... | | OpenBSD has supported static PIE since 2015; not just supported, | but all system static binaries (e.g. /bin and /sbin) are built as | static PIEs, and I believe PIE is the default behavior when using | `cc -static`. The reasons for the switch and the required | toolchain changes are summarized in this presentation: | https://www.openbsd.org/papers/asiabsdcon2015-pie-slides.pdf | | Also, simply checking the "static PIE" box isn't the end of the | story. There are different ways to accomplish it, and some are | better than others in terms of ASLR, W^X, and other exploit | mitigations. It's been a couple of years since I last looked into | and had a hold on all the issues simultaneously[1], but the basic | takeaway is that dynamic linking in system toolchains and system | runtimes is far more mature than static linking. | | [1] static PIE issues are a nexus of exploit mitigation | techniques, so if you want to deep dive into exploit mitigation | or even just linking issues then chasing the static PIE rabbit is | a good approach. | Ericson2314 wrote: | Nixpkgs / NixOS does. | mforney wrote: | Most Linux/musl systems support it. On Alpine, gcc is built | with `--enable-default-pie` so all static libraries can be | linked into a static PIE. | | On vanilla gcc (since version 8 when static PIE was | upstreamed), `-static` means non-PIE static executable and | there is a separate flag for `-static-pie` for static PIE. | Alpine patches gcc so that `-static` and `-pie` are independent | flags, so both `cc -static` and `cc -static-pie` will produce a | static PIE. 
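Whether your own toolchain has static PIE wired up is easy to check (a sketch assuming gcc >= 8 with a static libc installed; on glibc distros `-static-pie` additionally needs a glibc built with --enable-static-pie, so the second compile may simply fail):

```shell
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
EOF

gcc -static -o hello-static hello.c       # classic static: fixed load address
gcc -static-pie -o hello-spie hello.c     # static PIE: relocatable, so ASLR applies

# The ELF header type is the telltale -- only ET_DYN images get their
# base address randomized by the kernel:
readelf -h hello-static | grep 'Type:'    # EXEC (Executable file)
readelf -h hello-spie | grep 'Type:'      # DYN
```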
| monocasa wrote: | I imagine there's some way to, because the kernel is able to | apply ASLR to the dynamic linker itself. | wahern wrote: | It's not a matter of whether it's possible, but whether the work has | been done throughout the toolchain and runtime stacks. | | When OpenBSD implemented static PIE, -static and -pie, the | literal options and the code generation aspects more | generally, were still mutually exclusive in GCC and clang. | GCC required patching to enable both static linking and PIE | generation. In fact, my local GCC 9.3 man page still says | that -static overrides -pie; to build static PIE seems to | require the special option -static-pie, in addition to | -fPIE/-fPIC when building the object code. But it's not | enough for the compiler and compile-time linker to support | it. libc and libstdc++/libc++ also need support, both | internally as well as in the built libraries. And I think | libdl might need support if you want dlopen to work from a | static PIE binary. Likewise for libpthread. Supporting static | PIE requires the cooperation of many moving parts. And that's | just to _support_ it. Making it easy, let alone the default | behavior, so that it doesn't require many carefully | coordinated, obscure flags without the risk of any misstep | silently disabling some exploit mitigation, is yet another | story. | | It can be done. It should be done. But to what extent _has_ | it been done? I'm not sure, though some quick Googling | suggests GCC and clang are mostly there, at least in terms of | nominal support (which, again, is distinct from fully | supporting all the same mitigation measures). glibc seems to | have gotten some support (e.g. --enable-static-pie in glibc's | build), though I'm not sure whether it's available in | distros.
And I would guess that musl libc support is pretty | far along, and presumably more mature than glibc's given that | Rich Felker started experimenting with static PIE several | years ago, shortly after OpenBSD did their work. See | https://www.openwall.com/lists/musl/2015/06/01/12 | monocasa wrote: | For sure. My comment was directed at Linux the kernel | rather than Linux the ecosystem, but it's a totally valid | point that for practical purposes that distinction can | border on meaningless. | saagarjha wrote: | I'm not sure I understand the first link you posted: if a | binary is statically linked, why does it need a GOT? It's | literally calling functions in its own binary... ___________________________________________________________________ (page generated 2020-06-26 23:00 UTC)