[HN Gopher] Musl 1.2.4 adds TCP DNS fallback
___________________________________________________________________
Musl 1.2.4 adds TCP DNS fallback

Author : goranmoomin
Score  : 164 points
Date   : 2023-07-30 16:39 UTC (6 hours ago)

(HTM) web link (www.openwall.com)
(TXT) w3m dump (www.openwall.com)

| joelverhagen wrote:
| Really happy to see this. This caused random NuGet package restore issues when the CNAME chain for api.nuget.org exceeded a certain length.
|
| https://github.com/NuGet/NuGetGallery/issues/9396
|
| Our CDN provider ended up having a shedding mode in some hot areas that made the chain exceed the limit from time to time. Our multi-CDN setup saved us, so we could do geo-specific failovers.
|
| [deleted]
|
| [deleted]
|
| InvaderFizz wrote:
| Glad to see this finally come to fruition.
|
| This has been an issue plaguing Alpine for years where the musl maintainer basically said the standard says may fallback, not must fallback. Let the rest of the internet change, we don't feel this is important. We're standards compliant.
|
| The change gained traction with last year's DNS RFC, which made TCP fallback mandatory [0]. The cloak of standards compliance could no longer be used.
|
| 0: https://datatracker.ietf.org/doc/html/rfc9210
|
| LexiMax wrote:
| > This has been an issue plaguing Alpine for years where the musl maintainer basically said the standard says may fallback, not must fallback. Let the rest of the internet change, we don't feel this is important. We're standards compliant.
|
| I've never really understood the underlying mentality that causes maintainers to bend over backwards to _not_ provide popular functionality like this.
|
| Reminds me a bit of the strlcpy/strlcat debacle with glibc.
|
| yjftsjthsd-h wrote:
| > I've never really understood the underlying mentality that causes maintainers to bend over backwards to _not_ provide popular functionality like this.
|
| Probably trying to prevent feature creep; musl's home page describes it as
|
| > musl is lightweight, fast, simple, free, and strives to be correct in the sense of standards-conformance and safety
|
| And every single feature they add makes those things harder.
|
| tptacek wrote:
| In this case it's not so much "popular functionality" as "core mechanism of the protocol". TCP DNS isn't a quality of life feature; it's necessary in order to look up record sets with e.g. lots of IPv6 addresses, because UDP DNS has a sharply limited response size.
|
| geocar wrote:
| Why would anyone want an RRset with more than 20 AAAA RRs?
|
| What would they be doing with it that they couldn't do better (faster/less bugs/risk) another way?
|
| marcus0x62 wrote:
| I'm sure they have their reasons. Some might even be good. Here's something I _know_ I want: for my resolver to resolve valid DNS queries, no matter how stupid I believe the remote zone's maintainers to be.
|
| [deleted]
|
| tptacek wrote:
| Sorry, I missed the spot in the DNS RFCs where they said RRsets are limited to 20 records. Got a reference? Thanks!
|
| akira2501 wrote:
| People confuse purity with quality all the time.
|
| baq wrote:
| And since we're talking about Debian, stability with stagnation.
|
| nineteen999 wrote:
| The MX records for major email providers at the time (e.g. Yahoo) didn't even fit into a UDP DNS packet back in 2002.
|
| That they only just implemented this is a joke, and Alpine/musl users are the punchline.
|
| yjftsjthsd-h wrote:
| > The cloak of standards compliance could no longer be used
|
| Or in other words they've always been standards-compliant, and when the standard changed they updated to match it.
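
For readers following the UDP-versus-TCP argument above, here is a minimal Python sketch of what "TCP fallback" means for a stub resolver. It assumes the classic DNS wire format: a 12-byte header whose flags word carries a "truncated" (TC) bit, a 512-byte UDP payload limit when EDNS0 is not used, and a 2-byte length prefix on every message sent over TCP. The function names (`build_query`, `is_truncated`, `frame_for_tcp`, `query`) are hypothetical and chosen for illustration; this is not how musl implements the feature.

```python
import socket
import struct

TC_FLAG = 0x0200         # "truncated" bit in the DNS header flags word
CLASSIC_UDP_LIMIT = 512  # largest UDP payload without EDNS0


def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Encode a single-question query (qtype 1 = A, 28 = AAAA, 15 = MX)."""
    # Header: id, flags (RD=1), one question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def is_truncated(response: bytes) -> bool:
    """True when the server set TC: the answer did not fit in the UDP reply."""
    (flags,) = struct.unpack_from("!H", response, 2)
    return bool(flags & TC_FLAG)


def frame_for_tcp(message: bytes) -> bytes:
    """DNS over TCP prefixes each message with its length (2 bytes, big-endian)."""
    return struct.pack("!H", len(message)) + message


def _recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes from a stream socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        data += chunk
    return data


def query(server: str, message: bytes, timeout: float = 3.0) -> bytes:
    """Ask over UDP first; repeat the same question over TCP if TC is set."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(message, (server, 53))
        response, _ = udp.recvfrom(CLASSIC_UDP_LIMIT)
    if not is_truncated(response):
        return response
    with socket.create_connection((server, 53), timeout=timeout) as tcp:
        tcp.sendall(frame_for_tcp(message))
        (length,) = struct.unpack("!H", _recv_exact(tcp, 2))
        return _recv_exact(tcp, length)
```

A resolver without the second half of `query` has only the truncated UDP answer to work with, which is the situation the commenters describe for long CNAME chains and large record sets.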
|
| fanf2 wrote:
| TCP support has been MUST implement for stub resolvers since 2016, so musl is several years late https://www.rfc-editor.org/rfc/rfc7766#page-6
|
| kelnos wrote:
| Once I started reading about these issues a few years ago, I stopped using Alpine as a container base image, and started using Debian (the 'debian-slim' variant). Slim is still larger than Alpine, but not by a lot, and does contain some extra functionality in the base image that's useful for debugging (most of which can be fairly easily removed for security hardening). Debugging random DNS issues is difficult enough; there's no need to make it harder by using intentionally-faulty software.
|
| While I wouldn't call myself a fan of Postel's Law (I think "being liberal with what you accept" can allow others to get away with sloppy implementations over long time periods, diluting the usefulness of standards and specifications), I think at some point you have to recognize the reality of how things are implemented in the real world, and that refusing to conform to the de-facto standard hurts your users.
|
| The fact that the maintainer only caved because the TCP fallback behavior is finally being made mandatory, and not because he's (very belatedly) recognizing he's harming his users with his stubbornness, also speaks volumes... and not in a good way.
|
| fefe23 wrote:
| [flagged]
|
| xyzzy_plugh wrote:
| Given the choice between maintaining a fork of a C standard library implementation and switching to an implementation that doesn't have this issue, the choice is pretty clear.
|
| It was pretty clear that musl was unwilling to support adding any TCP fallback code path regardless of a patch existing.
|
| Anyways, your comment is inflammatory and is full of straw men. Try being less of a dick.
|
| fefe23 wrote:
| [flagged]
|
| xyzzy_plugh wrote:
| I don't use musl, so I don't have a horse in this race, but I wouldn't use it anyways because of issues like this one.
|
| Do they owe me anything? Of course not. Is anyone here claiming anything to the contrary? No. So what are you on about, exactly?
|
| This is the comment section on a website. Here we are discussing that the project maintainers kinda cocked this one up from a reputation and user trust perspective. If reputation, user trust and thus adoption are not a concern or goal of the project, then cool, good for them. It doesn't mean, however, that there isn't some reputation or user trust that was eroded. And it certainly doesn't mean we can't discuss it.
|
| For you to reply "what are you a child, just fork it and grow a pair, fix it yourself" is not constructive nor does it contribute to the discourse. Or did I put words in your mouth? Hm I wonder how that feels.
|
| vidarh wrote:
| Yikes. The first time I ran into an issue where email delivery for a major provider _was impossible_ without TCP fallback was 23 years ago. To treat this as optional this long is ridiculous.
|
| brmgb wrote:
| For all the good things Musl does, its authors can often be downright ridiculous out of a dogmatism which is very unpleasant to deal with as a user. See also their stubborn refusal to give you an easy way to detect that the libc is Musl while compiling.
|
| yjftsjthsd-h wrote:
| Well, if musl _is_ standards compliant, then wouldn't it make sense to detect when you're building on something different rather than musl? Like, if the standard didn't require TCP fallback, then in theory the application should handle that and only skip it if it detects itself being built on a libc with non-standardized extensions to obviate the effort.
|
| zajio1am wrote:
| Standards are not laws, they are reasonably well-written documents describing behavior that is supposed to be interoperable. If, for some reason, that is not true (perhaps because the standard is ambiguous, obsolete or some other reason), ensuring that there is interoperability is more important than just keeping it to the letter of the standard.
|
| See RFC 1925, The Twelve Networking Truths:
|
| (1) It has to work
|
| brmgb wrote:
| Because as we all know the standard never changes potentially introducing new features, allows no variability and bugs don't exist.
|
| It must be nice living in your theoretical world. Here in the real world, I like libraries to actually help me do what I want to do rather than be needlessly annoying to take a stand no one actually cares about. But that's me. Clearly it serves Musl right given the massive amount of use it's seeing. Hmm, wait a minute...
|
| yjftsjthsd-h wrote:
| Okay, so nobody uses it and you can stop complaining because it'll never come up.
|
| brmgb wrote:
| Well I would like to use it. Its code is mostly sane, easier to understand than glibc and it would make sense in a lot of places where performance is not critical. That's why I am so annoyed by the nonsense its developers keep pulling out. It's an incomprehensible situation because they get absolutely nothing out of it. They are basically annoying people for the sake of it. It's incredible that changes like the one we are commenting on, which are just sanity being restored, have to come from an evolution of the standard.
|
| 0xcde4c3db wrote:
| > See also their stubborn refusal to give you an easy way to detect that the libc is Musl while compiling.
|
| I suspect that the primary use cases for that would be applying half-baked workarounds and refusing to work with an "unsupported" libc, much like HTTP User-Agent.
|
| brmgb wrote:
| Or, you know, taking into account the stupid things Musl does like not having TCP DNS fallback.
|
| I'm only half joking. Feature detection in Musl is generally a pain. I don't blame developers for sometimes taking the path of least resistance and not trying to support it.
|
| geocar wrote:
| Email servers cannot use gethostbyname anyway; they never would have been affected by this issue.
|
| vidarh wrote:
| Musl also includes res_send() etc., which _can_ be used for MX records, and e.g. Qmail _did_ use that back then. Here's the commit adding TCP support to those functions in Musl.
|
| http://git.musl-libc.org/cgit/musl/commit/src/network/res_ms...
|
| AOL was the specific case we ran into that at the time (ca. 2000) returned either MX or A records (can't remember which caused the problem) that required more than 512 bytes.
|
| geocar wrote:
| I wasn't aware musl implemented libres. I don't have a problem with libres doing TCP.
|
| Qmail's issue was a hardcoded buffer size so this patch to musl wouldn't have helped.
|
| kstrauser wrote:
| I'm skeptical of that. Why do you say so?
|
| vidarh wrote:
| gethostbyname() only returns an address, it can't be used to e.g. query for MX records. You also lose control over retries to different addresses if you use gethostbyname(), which some mail server software will also care about.
|
| However as I noted in my other reply, musl also implements the res_* functions, like res_send() etc., and _those_ can be used to address both.
|
| robinhoodexe wrote:
| We're currently consolidating all container images to run on Debian-slim instead of a mixture of Debian, Ubuntu and alpine. Sure, alpine is small, but with 70% of our 500 container images being used for Python, R or node the final image is so large (due to libraries/packages), that the difference between alpine (~30 MB) and debian-slim (~80) is negligible.
| We've been experiencing the weird DNS behaviour of alpine and other issues with musl as well. Debian is rock solid and upgrading to bookworm from bullseye and even buster in many cases didn't cause any problems at all.
|
| I will admit though, that Debian-slim still has some non-essential stuff that usually isn't needed at runtime, a shell is still neat for debugging or local development. This trade off could be considered a security risk, but it's rather simple to restrict other stuff at runtime (such as run as non-privileged, non-root user with all capabilities dropped and a read-only file system except for /tmp).
|
| It's a balancing act between ease-of-use and security. I don't think I'd get popular with the developers by forcing them to use "FROM scratch" and let them figure out exactly what their application needs at runtime and what stuff to copy over from a previous build stage.
|
| adolph wrote:
| > the difference between alpine (~30 MB) and debian-slim (~80)
|
| Given that it's a different layer, your container runtime isn't going to redownload the layer anyway, right?
|
| kstrauser wrote:
| And even if it did, in an ancient data center that only uses gigabit ethernet, that's only a .5s longer download. And even a $4 DigitalOcean server comes with 10GB of storage, so that 50MB is only 1/200th of the instance's store. (I'd also bet that nearly no one uses instances that tiny for durable production work where 50MB is going to make a difference.)
|
| robinhoodexe wrote:
| Exactly. Part of the appeal to consolidate all of our container images to use Debian-slim is the ability to optimise the caching of layers, both in our container registry but also on our kubernetes cluster's nodes (which can be done in a consistent manner with kube-fledged[1]).
|
| [1] https://github.com/senthilrch/kube-fledged
|
| robertlagrant wrote:
| Thanks for that - that operator sounds extremely useful!
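
The back-of-the-envelope figures in kstrauser's comment above can be checked with a few lines of Python. This is only a sketch under stated assumptions: decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), a link that actually sustains its nominal rate, and no protocol or registry overhead. The helper name `download_seconds` is made up for illustration.

```python
from fractions import Fraction


def download_seconds(size_mb: float, link_gbit_per_s: float) -> float:
    """Ideal transfer time for size_mb megabytes over a link of the given speed."""
    bits = size_mb * 1_000_000 * 8
    return bits / (link_gbit_per_s * 1_000_000_000)


# Approximate image sizes quoted in the thread: ~80 MB versus ~30 MB.
extra_mb = 80 - 30

# Extra download time on gigabit Ethernet, in seconds.
print(download_seconds(extra_mb, 1.0))

# Share of a 10 GB disk taken up by the extra 50 MB.
print(Fraction(extra_mb, 10 * 1000))
```

Under those assumptions the extra 50 MB costs roughly 0.4 seconds on a gigabit link and 1/200 of a 10 GB disk, in line with the estimates in the comment.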
|
| leononame wrote:
| Can you point me to where to look for more details on securing a container? I'm a developer myself, and for me, the main benefit of containers is being able to deploy the app myself easily because I can bundle all the dependencies.
|
| What would you suggest I restrict at runtime and can you point me to a tutorial or an article where I can go have a deeper read on how to do it?
|
| vbezhenar wrote:
| Basically you need to work as unprivileged user and with immutable file system (of course you can have ephemeral /tmp or persistent /data, but generally the entire system should be treated as read-only).
|
| robinhoodexe wrote:
| What you want to read is the kubernetes pod security context fields[1].
|
| In your Dockerfile, add a non-root user with UID and GID 1000, then at the end of your Dockerfile, right before a CMD or ENTRYPOINT you change to that user.
|
| In your kubernetes yaml manifest, you can now set runAsNonRoot to true and runAsUser & runAsGroup to 1000.
|
| Then there's the privileged and allowPrivilegeEscalation fields, these can nearly always be set to false unless you actually need the extra privileges (such as using a GPU) on a shared node.
|
| Then there's seccomp profiles and the system capabilities. If you can run your container as the non-root user you've created, and you don't need the extra privileges then these can safely also be set to the most restricted. Running non-privileged and non-root is the same as having all capabilities dropped.
|
| The tricky one is the readOnlyRootFilesystem field. This includes /tmp, which is considered a global writeable directory, so the workaround is to make an in-memory volume and mount it at /tmp to make it writable.
| Likewise, your $HOME/.cache and $HOME/.local directories (for the user you created in your Dockerfile) are usually used by third party packages, so creating mounts here can be useful as well (if for some reason you can't point it to /tmp instead).
|
| [1] https://kubernetes.io/docs/tasks/configure-pod-container/sec...
|
| 4oo4 wrote:
| This was really helpful for me:
|
| https://cheatsheetseries.owasp.org/cheatsheets/Docker_Securi...
|
| Rapzid wrote:
| This here. Honestly most orgs with uhh.. Let's say a more mature sense of ROI tradeoffs were doing this from pretty much the very beginning.
|
| Also, Ubuntu 22.04 is only 28.17MB compressed right now so it looks equiv to debian-slim. There are also these new image lines, I can't recall the funky name for them, that are even smaller.
|
| kstrauser wrote:
| I'm pushing to go back to Debian from Ubuntu. Canonical's making decisions lately that don't appeal to me, and especially on the server I don't see a clear advantage for Ubuntu vs good ol' rock solid Debian.
|
| senknvd wrote:
| > There are also these new image lines, I can't recall the funky name for them, that are even smaller.
|
| You might be thinking of the chiselled images. An interesting idea but very much incomplete[1].
|
| [1]: https://github.com/canonical/chisel-releases/issues/34
|
| synergy20 wrote:
| why is it so much smaller than debian slim?
|
| yjftsjthsd-h wrote:
| Fewer features (including less "we've supported doing that for 20 years and we're not cutting it now"), packages separated into parts (ex. where debian might separate "foo" into "foo" for the main package and "foo-dev" for headers, alpine will also break out "foo-doc" with manpages and such), general emphasis on being small rather than full-featured.
|
| Rapzid wrote:
| It's not, Debian slim is ~28.8MB compressed.
|
| kachnuv_ocasek wrote:
| Do you have any tips regarding building R-based container images?
|
| robinhoodexe wrote:
| R is kinda difficult and I haven't cracked this one. Currently we're using the rocker based ones[1] but they are based on Ubuntu and include a lot of stuff we don't need at runtime. I'll look into creating a more minimal R base image that's based on Debian-slim.
|
| [1] https://github.com/rocker-org/rocker-versioned2
|
| nickjj wrote:
| I made the switch too around 4ish years ago. It has worked out nicely and I have no intention of moving away from Debian Slim. Everything "just works" and you get to re-use any previous Debian knowledge you may have picked up before using Docker.
|
| galangalalgol wrote:
| I've run into the same thing for large dev images, but using pure rust often means that musl allows for a single executable and a config file in a from scratch container for deployment. In cases where a slab or bump allocator is used, musl's deficiencies seem minimized.
|
| That means duplication of musl in lots of containers, but when they are all less than 10MB it's less of an issue. Linking against gnu libraries might get the same executable down to less than 2MB but you'll add that back and more for even the tiniest gnu base images.
|
| synergy20 wrote:
| What is pure rust? Rust needs its own libraries, which are a few MB as I recall.
|
| galangalalgol wrote:
| By pure, I just meant no dependencies on c or c++ bindings other than libc. If that is the case you can do a musl build that has no dynamic dependencies, as all rust dependencies are static. So then your only dependency is the kernel, which is provided via podman/docker. A decent sized rust program with hundreds of dependencies I can get to compile down to 1.5MB. But that is depending on gnu. So if you had 4 or 5 of those on a node, it might be less data to use one gnu base image that is really small like rhel micro, and build rust for gnu.
| But if you have cpu hungry services like I do, then you usually have only a couple per node, so from scratch musl can be a bit smaller.
|
| phamilton wrote:
| > cpu hungry services
|
| Have you benchmarked musl vs glibc in any way? Data I've seen is all over the place and I'm curious about your experience.
|
| jonwest wrote:
| In the same boat here as well. Especially when you're talking about container images using JavaScript or other interpreted languages that are bundling in a bunch of other dependencies, the developer experience is much better in my experience given that more developers are likely to have had experience working in a Debian based distro than an Alpine based one.
|
| Especially when you're also developing within the container as well, having that be unified is absolutely worth the convenience, and honestly security and reliability as well. I realize that a container with less installed on it is inherently more secure, but if the only people who are familiar with the system are a small infrastructure/platform/ops type of team, things are more likely to get missed.
|
| richardwhiuk wrote:
| This was nearly three months ago?
|
| Arnavion wrote:
| Yes, and for the people who link to musl dynamically in their Alpine containers, it's also in Alpine 3.18
|
| yjftsjthsd-h wrote:
| And https://alpinelinux.org/posts/Alpine-3.18.0-released.html puts that also nearly 3 months ago, FWIW.
|
| jake_morrison wrote:
| I use distroless images based on Debian or Ubuntu, e.g., https://github.com/cogini/phoenix_container_example
|
| The result is images the same size as Alpine, or smaller, without the incompatibilities. I think Alpine is a dead end.
|
| suprjami wrote:
| I hadn't heard of "distroless" before. Confusing name for a container with just main process runtimes, but neat idea.
|
| https://github.com/GoogleContainerTools/distroless
|
| ecliptik wrote:
| While I'm glad this is finally addressed, this limits the usefulness of one of my favorite interview questions.
|
| Asking about Alpine in a production environment was always a good way of telling those who have container experience, of watching C-Beams glitter in the dark, from those who only just read a "10 Docker Container Tricks; #8 will blow your mind!" blog post from 2017.
|
| vbezhenar wrote:
| I've been using alpine containers for two years on a moderately sized cluster and I've yet to encounter any issues caused by it.
|
| baq wrote:
| Famous last words right here.
|
| It's usually the case that everything works until it doesn't. When it's DNS that doesn't work, good luck debugging it unless you've got war stories to tell.
|
| InvaderFizz wrote:
| It's still going to be pretty common for at least a few years, and the now incorrect assumption that it is still broken I'm sure will persist for a decade or more among those who have been burned and thus moved on from Alpine and do not follow it.
|
| DNS is a fun rabbit hole for interviews, for sure.
|
| My favorite one to see on a resume is NIS. If you are listing NIS and don't have horror stories or other things to say about NIS, that's a really good indicator of the value of your resume.
|
| I intentionally list NIS on my resume because it is such a fun conversation topic to go on about how security models changed over time, all the ways NIS is terrible, but also how simple and useful it was.
|
| ecliptik wrote:
| NIS is a good one. I have UUCP as a skill on my resume as an easter egg but no one ever asks about it.
|
| For DNS, my favorite interview question goes like this,
|
| _How would you verify DNS is resolving from within a pod on Kubernetes?_
|
| After listening to the answer, add some constraints:
|
| 1. Common networking utilities like ping, nslookup, dig, etc are not available
|
| 2. Container user is unprivileged
|
| 3. su/sudo do not work
|
| This can lead to some elaborate k8s troubleshooting or the simple, and correct, answer of _getent hosts_.
|
| cmeacham98 wrote:
| After constraint 1, this devolves to a weird game of "does the interviewee realize I don't consider `getent hosts` to be a 'common networking utility' so it's still available?"
|
| geocar wrote:
| I still use NIS because hosts files are faster than DNS.
|
| yjftsjthsd-h wrote:
| > Asking about Alpine in a production environment was always a good way of telling those who have container experience, of watching C-Beams glitter in the dark, from those who only just read a "10 Docker Container Tricks; #8 will blow your mind!" blog post from 2017.
|
| I dunno, I've been running containers in prod for a while now and I don't recall Alpine being a problem. Maybe it varies by your workload?
|
| stefan_ wrote:
| glibc also has some fun behavior that few people know about because (1) distributions have been patching it and nobody ever actually ran the upstream version and/or (2) downstream software is papering it over:
|
| https://github.com/golang/go/issues/21083
|
| nathants wrote:
| i've stopped using containers because they are annoying.
|
| i've started using alpine on my laptop and ec2 because it's not.
|
| different strokes, different folks.
|
| develatio wrote:
| IIRC this was causing some exotic problems when deploying docker images based on musl.
|
| tyingq wrote:
| I think there are also still some potential problems because it still does some things differently than glibc. Musl defaults to parallel requests if you define more than one nameserver (multiple --dns=, for example, for the docker daemon)...where glibc uses them in the order you provide them.
|
| To be clear, that's not "wrong", but just different maybe from what docker was expecting.
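
The appeal of the `getent hosts` answer in ecliptik's question above is that it asks the libc resolver rather than relying on a separate DNS client, so it exercises the same code path the application uses. A rough Python equivalent, assuming a Python interpreter happens to be present in the container (often not the case in minimal images), is sketched below. The function name `resolve` is hypothetical; `socket.getaddrinfo` is the standard-library wrapper around the C library's `getaddrinfo`.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the addresses the system resolver gives for hostname, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Resolution failed: unknown name, no reachable nameserver, and so on.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the textual address is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, resolve(name))
```

Because the lookup goes through the libc the image ships, the result reflects that libc's behaviour, including the nameserver handling differences tyingq mentions.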
|
| LaLaLand122 wrote:
| I really hope it does some things differently than glibc: https://sourceware.org/bugzilla/show_bug.cgi?id=19643
|
| In any case glibc uses NSS, so what glibc does depends on the configuration. It may well just forward the request to systemd-resolved.
___________________________________________________________________
(page generated 2023-07-30 23:00 UTC)