hngopher.com

       [HN Gopher] Musl 1.2.4 adds TCP DNS fallback
       ___________________________________________________________________
        
       Author : goranmoomin
       Score  : 164 points
       Date   : 2023-07-30 16:39 UTC (6 hours ago)
        
 (HTM) web link (www.openwall.com)
 (TXT) w3m dump (www.openwall.com)
        
       | joelverhagen wrote:
       | Really happy to see this. This caused random NuGet package
       | restore issues when the CNAME chain for api.nuget.org exceeded a
       | certain length.
       | 
       | https://github.com/NuGet/NuGetGallery/issues/9396
       | 
       | Our CDN provider ended up having a shedding mode in some hot
       | areas that made the chain exceed the limit from time to time. Our
       | multi CDN set up saved us so we could do geo specific failovers.
        
       | InvaderFizz wrote:
       | Glad to see this finally come to fruition.
       | 
       | This has been an issue plaguing Alpine for years where the musl
       | maintainer basically said the standard says may fall back, not
       | must fall back. Let the rest of the internet change, we don't
       | feel this is important. We're standards compliant.
       | 
       | The change gained traction with the latest RFC for DNS last
       | year, which made TCP fallback mandatory [0]. The cloak of
       | standards compliance could no longer be used.
       | 
       | 0: https://datatracker.ietf.org/doc/html/rfc9210
        
         | LexiMax wrote:
         | > This has been an issue plaguing Alpine for years where the
         | musl maintainer basically said the standard says may fall back,
         | not must fall back. Let the rest of the internet change, we
         | don't feel this is important. We're standards compliant.
         | 
         | I've never really understood the underlying mentality that
         | causes maintainers to bend over backwards to _not_ provide
         | popular functionality like this.
         | 
         | Reminds me a bit of the strlcpy/strlcat debacle with glibc.
        
           | yjftsjthsd-h wrote:
           | > I've never really understood the underlying mentality that
           | causes maintainers to bend over backwards to _not_ provide
           | popular functionality like this.
           | 
           | Probably trying to prevent feature creep; musl's home page
           | describes it as
           | 
           | > musl is lightweight, fast, simple, free, and strives to be
           | correct in the sense of standards-conformance and safety
           | 
           | And every single feature they add makes those things harder.
        
           | tptacek wrote:
           | In this case it's not so much "popular functionality" as
           | "core mechanism of the protocol". TCP DNS isn't a quality of
           | life feature; it's necessary in order to look up record sets
           | with e.g. lots of IPv6 addresses, because UDP DNS has a
           | sharply limited response size.
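
A minimal sketch of the mechanism described above, not musl's actual code: a stub resolver sends a query over UDP, and if the reply header has the TC (truncated) bit set, it repeats the same query over TCP, where each message carries a 2-byte length prefix. The function names and the hand-made response header are assumptions for illustration; the header layout and TC bit follow RFC 1035.

```python
import struct

TC_FLAG = 0x0200  # "truncated" bit in the 16-bit DNS header flags word


def build_query(name: str, qtype: int = 28, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (QTYPE 28 = AAAA, class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def is_truncated(message: bytes) -> bool:
    """Return True if the DNS message header has the TC bit set."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than its 12-byte header")
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_FLAG)


def frame_for_tcp(message: bytes) -> bytes:
    """DNS over TCP prefixes each message with a 2-byte big-endian length."""
    return struct.pack("!H", len(message)) + message


if __name__ == "__main__":
    query = build_query("example.com")
    # Hand-made response header with QR, TC, RD and RA set (0x8380).
    fake_reply = struct.pack("!HHHHHH", 0x1234, 0x8380, 1, 0, 0, 0)
    if is_truncated(fake_reply):
        print("reply truncated, would retry over TCP with",
              len(frame_for_tcp(query)), "bytes")
```

A resolver without the TCP path can only return whatever fit in the UDP reply, or fail, which is the behaviour the thread is complaining about.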
        
             | geocar wrote:
             | Why would anyone want an RRset with more than 20 AAAA RRs?
             | 
             | What would they be doing with it that they couldn't do
             | better (faster/less bugs/risk) another way?
        
               | marcus0x62 wrote:
               | I'm sure they have their reasons. Some might even be
               | good. Here's something I _know_ I want: for my resolver
               | to resolve valid DNS queries, no matter how stupid I
               | believe the remote zone's maintainers to be.
        
               | tptacek wrote:
               | Sorry, I missed the spot in the DNS RFCs where they said
               | RRsets are limited to 20 records. Got a reference?
               | Thanks!
        
           | akira2501 wrote:
           | People confuse purity with quality all the time.
        
             | baq wrote:
             | And since we're talking about Debian, stability with
             | stagnation.
        
         | nineteen999 wrote:
         | The MX records for major email providers at the time (eg.
         | Yahoo) didn't even fit into a UDP DNS packet back in 2002.
         | 
         | That they only just implemented this is a joke, and Alpine/musl
         | users are the punchline.
        
         | yjftsjthsd-h wrote:
         | > The cloak of standards compliance could no longer be used
         | 
         | Or in other words they've always been standards-compliant, and
         | when the standard changed they updated to match it.
        
           | fanf2 wrote:
           | TCP support has been MUST implement for stub resolvers since
           | 2016, so musl is several years late:
           | https://www.rfc-editor.org/rfc/rfc7766#page-6
        
         | kelnos wrote:
         | Once I started reading about these issues a few years ago, I
         | stopped using Alpine as a container base image, and started
         | using Debian (the 'debian-slim' variant). Slim is still larger
         | than Alpine, but not by a lot, and does contain some extra
         | functionality in the base image that's useful for debugging
         | (most of which can be fairly easily removed for security
         | hardening). Debugging random DNS issues is difficult enough;
         | there's no need to make it harder by using intentionally-faulty
         | software.
         | 
         | While I wouldn't call myself a fan of Postel's Law (I think
         | "being liberal with what you accept" can allow others to get
         | away with sloppy implementations over long time periods,
         | diluting the usefulness of standards and specifications), I
         | think at some point you have to recognize the reality of how
         | things are implemented in the real world, and that refusing to
         | conform to the de-facto standard hurts your users.
         | 
         | The fact that the maintainer only caved because the TCP
         | fallback behavior is finally being made mandatory, and not
         | because he's (very belatedly) recognizing he's harming his
         | users with his stubbornness, also speaks volumes... and not in
         | a good way.
        
         | fefe23 wrote:
         | [flagged]
        
           | xyzzy_plugh wrote:
           | Given the choice between maintaining a fork of a C standard
           | library implementation and switching to an implementation
           | that doesn't have this issue, the choice is pretty clear.
           | 
           | It was pretty clear that musl was unwilling to support adding
           | any TCP fallback code path regardless of a patch existing.
           | 
           | Anyways, your comment is inflammatory and is full of straw
           | men. Try being less of a dick.
        
             | fefe23 wrote:
             | [flagged]
        
               | xyzzy_plugh wrote:
               | I don't use musl, so I don't have a horse in this race,
               | but I wouldn't use it anyways because of issues like this
               | one.
               | 
               | Do they owe me anything? Of course not. Is anyone here
               | claiming anything to the contrary? No. So what are you on
               | about, exactly?
               | 
               | This is the comment section on a website. Here we are
               | discussing that the project maintainers kinda cocked this
               | one up from a reputation and user trust perspective. If
               | reputation, user trust and thus adoption are not a
               | concern or goal of the project, then cool, good for them.
               | It doesn't mean, however, that there isn't some
               | reputation or user trust that was eroded. And it
               | certainly doesn't mean we can't discuss it.
               | 
               | For you to reply "what are you a child, just fork it and
               | grow a pair, fix it yourself" is not constructive nor
               | does it contribute to the discourse. Or did I put words
               | in your mouth? Hm I wonder how that feels.
        
         | vidarh wrote:
         | Yikes. The first time I ran into an issue where email delivery
         | for a major provider _was impossible_ without TCP fallback was
         | 23 years ago. To treat this as optional this long is
         | ridiculous.
        
           | brmgb wrote:
           | For all the good things Musl does, its authors can often be
           | downright ridiculous out of a dogmatism which is very
           | unpleasant to deal with as a user. See also their stubborn
           | refusal to give you an easy way to detect that the libc is
           | Musl while compiling.
        
             | yjftsjthsd-h wrote:
             | Well, if musl _is_ standards compliant, then wouldn't it
             | make sense to detect when you're building on something
             | different rather than musl? Like, if the standard didn't
             | require TCP fallback, then in theory the application should
             | handle that and only skip it if it detects itself being
             | built on a libc with non-standardized extensions to obviate
             | the effort.
        
               | zajio1am wrote:
               | Standards are not laws, they are reasonably well-written
               | documents describing behavior that is supposed to be
               | interoperable. If, for some reason, that is not true
               | (perhaps because the standard is ambiguous, obsolete or
               | some other reason), ensuring that there is
               | interoperability is more important than just keeping it
               | to the letter of the standard.
               | 
               | See RFC 1925, The twelve networking truths:
               | 
               | (1) It has to work
        
               | brmgb wrote:
               | Because as we all know the standard never changes
               | potentially introducing new features, allows no
               | variability and bugs don't exist.
               | 
               | It must be nice living in your theoretical world. Here
               | in the real world, I like libraries to actually help me
               | do what I want to do rather than be needlessly annoying
               | to take a stand no one actually cares about. But that's
               | me. Clearly it serves Musl right given the massive amount
               | of use it's seeing. Hmm, wait a minute...
        
               | yjftsjthsd-h wrote:
               | Okay, so nobody uses it and you can stop complaining
               | because it'll never come up.
        
               | brmgb wrote:
               | Well I would like to use it. Its code is mostly sane,
               | easier to understand than glibc and it would make sense
               | in a lot of places where performance is not critical.
               | That's why I am so annoyed by the nonsense its
               | developers keep pulling out. It's an incomprehensible
               | situation because they get absolutely nothing out of it.
               | They are basically annoying people for the sake of it.
               | It's incredible that changes like the one we are
               | commenting on which are just sanity being restored have
               | to come from an evolution of the standard.
        
             | 0xcde4c3db wrote:
             | > See also their stubborn refusal to give you an easy way
             | to detect that the libc is Musl while compiling.
             | 
             | I suspect that the primary use cases for that would be
             | applying half-baked workarounds and refusing to work with
             | an "unsupported" libc, much like HTTP User-Agent.
        
               | brmgb wrote:
               | Or, you know, taking into account the stupid things Musl
               | does like not having TCP DNS fallback.
               | 
               | I'm only half joking. Feature detection in Musl is
               | generally a pain. I don't blame developers for sometimes
               | taking the path of least resistance and not trying to
               | support it.
        
           | geocar wrote:
           | Email servers cannot use gethostbyname anyway; they never
           | would have been affected by this issue.
        
             | vidarh wrote:
             | Musl also includes res_send() etc., which _can_ be used for
             | MX records, and e.g. Qmail _did_ use that back then. Here's
             | the commit adding TCP support to those functions in Musl.
             | 
             | http://git.musl-libc.org/cgit/musl/commit/src/network/res_ms...
             | 
             | AOL was the specific case we ran into that at the time (ca.
             | 2000) returned either MX or A records (can't remember which
             | caused the problem) that required more than 512 bytes.
        
               | geocar wrote:
               | I wasn't aware musl implemented libres. I don't have a
               | problem with libres doing TCP.
               | 
               | Qmail's issue was a hardcoded buffer size so this patch
               | to musl wouldn't have helped.
        
             | kstrauser wrote:
             | I'm skeptical of that. Why do you say so?
        
               | vidarh wrote:
               | gethostbyname() only returns an address, it can't be used
               | to e.g. query for MX records. You also lose control over
               | retries to different addresses if you use
               | gethostbyname(), which some mail server software will
               | also care about.
               | 
               | However as I noted in my other reply, musl also
               | implements the res_* functions, like res_send() etc., and
               | _those_ can be used to address both.
        
       | robinhoodexe wrote:
       | We're currently consolidating all container images to run on
       | Debian-slim instead of a mixture of Debian, Ubuntu and alpine.
       | Sure, alpine is small, but with 70% of our 500 container images
       | being used for Python, R or node the final image is so large (due
       | to libraries/packages), that the difference between alpine (~30
       | MB) and debian-slim (~80) is negligible. We've been experiencing
       | the weird DNS behaviour of alpine and other issues with musl as
       | well. Debian is rock solid and upgrading to bookworm from
       | bullseye and even buster in many cases didn't cause any problems
       | at all.
       | 
       | I will admit though, that Debian-slim still has some non-
       | essential stuff that usually isn't needed at runtime, a shell is
       | still neat for debugging or local development. This trade off
       | could be considered a security risk, but it's rather simple to
       | restrict other stuff at runtime (such as run as non-privileged,
       | non-root user with all capabilities dropped and a read-only file
       | system except for /tmp).
       | 
       | It's a balancing act between ease-of-use and security. I don't
       | think I'd get popular with the developers by forcing them to use
       | "FROM scratch" and let them figure out exactly what their
       | application needs at runtime and what stuff to copy over from a
       | previous build stage.
        
         | adolph wrote:
         | > the difference between alpine (~30 MB) and debian-slim (~80)
         | 
         | Given that it's a different layer, your container runtime
         | isn't going to redownload the layer anyway, right?
        
           | kstrauser wrote:
           | And even if it did, in an ancient data center that only uses
           | gigabit ethernet, that's only a .5s longer download. And even
           | a $4 DigitalOcean server comes with 10GB of storage, so that
           | 50MB is only 1/200th of the instance's store. (I'd also bet
           | that nearly no one uses instances that tiny for durable
           | production work where 50MB is going to make a difference.)
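
The arithmetic behind those figures can be checked with a short sketch. It assumes an ideal gigabit link (1000 Mbit/s, no protocol overhead), 1 GB = 1000 MB, and the ~30 MB and ~80 MB image sizes quoted earlier in the thread; the function names are made up for illustration.

```python
def transfer_seconds(size_mb: float, link_mbit_per_s: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and registry latency."""
    return size_mb * 8 / link_mbit_per_s


def fraction_of_disk(size_mb: float, disk_gb: float) -> float:
    """Share of a disk taken up by size_mb, using 1 GB = 1000 MB."""
    return size_mb / (disk_gb * 1000)


if __name__ == "__main__":
    extra_mb = 80 - 30  # debian-slim minus alpine, sizes as quoted in the thread
    print("extra download time (s):", transfer_seconds(extra_mb, 1000))
    print("share of a 10 GB disk:", fraction_of_disk(extra_mb, 10))
```

Under these assumptions the extra 50 MB costs about 0.4 s on the wire, in line with the rough ".5s" above, and takes 1/200th of a 10 GB disk.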
        
           | robinhoodexe wrote:
           | Exactly. Part of the appeal to consolidate all of our
           | container images to use Debian-slim is the ability to
           | optimise the caching of layers, both in our container
           | registry but also on our kubernetes cluster's nodes (which
           | can be done in a consistent manner with kube-fledged[1]).
           | 
           | [1] https://github.com/senthilrch/kube-fledged
        
             | robertlagrant wrote:
             | Thanks for that - that operator sounds extremely useful!
        
         | leononame wrote:
         | Can you point me to where to look for more details on securing
         | a container? I'm a developer myself, and for me, the main
         | benefit of containers is being able to deploy the app myself
         | easily because I can bundle all the dependencies.
         | 
         | What would you suggest I restrict at runtime and can you point
         | me to a tutorial or an article where I can go have a deeper
         | read on how to do it?
        
           | vbezhenar wrote:
           | Basically you need to work as unprivileged user and with
           | immutable file system (of course you can have ephemeral /tmp
           | or persistent /data, but generally the entire system should
           | be treated as read-only).
        
           | robinhoodexe wrote:
           | What you want to read is the kubernetes pod security context
           | fields[1].
           | 
           | In your Dockerfile, add a non-root user with UID and GID
           | 1000, then in the end of your Dockerfile, right before a CMD
           | or ENTRYPOINT you change to that user.
           | 
           | In your kubernetes yaml manifest, you can now set
           | runAsNonRoot to true and runAsUser & runAsGroup to 1000.
           | 
           | Then there's the privileged and allowPrivilegeEscalation
           | fields, these can nearly always be set to false unless you
           | actually need the extra privileges (such as using a GPU) on a
           | shared node.
           | 
           | Then there's seccomp profiles and the system capabilities. If
           | you can run your container as the non-root user you've
           | created, and you don't need the extra privileges then these
           | can safely also be set to the most restricted values. Running
           | as a non-privileged, non-root user is the same as having all
           | capabilities dropped.
           | 
           | The tricky one is the readOnlyRootFilesystem field. This
           | includes /tmp, which is considered a global writeable
           | directory, so the workaround is to make an in-memory volume
           | and mount it at /tmp to make it writable. Likewise, your
           | $HOME/.cache and $HOME/.local directories (for the user you
           | created in your Dockerfile) are usually used by third party
           | packages, so creating mounts here can be useful as well (if
           | for some reason you can't point it to /tmp instead).
           | 
           | [1] https://kubernetes.io/docs/tasks/configure-pod-container/sec...
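
The settings listed above can be sketched as a manifest. To keep the example checkable it is built as a Python dictionary and printed as JSON, which kubectl also accepts. The pod name, image name and the `hardened_pod` helper are hypothetical. The security context field names are the Kubernetes ones, but treat the exact combination as a starting point rather than a vetted policy.

```python
import json


def hardened_pod(name: str, image: str, uid: int = 1000) -> dict:
    """Build a Pod manifest with a restrictive security context."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Pod-level settings: run as the non-root user created in the image.
            "securityContext": {
                "runAsNonRoot": True,
                "runAsUser": uid,
                "runAsGroup": uid,
                "seccompProfile": {"type": "RuntimeDefault"},
            },
            "containers": [{
                "name": name,
                "image": image,
                # Container-level settings: no extra privileges, read-only root.
                "securityContext": {
                    "privileged": False,
                    "allowPrivilegeEscalation": False,
                    "readOnlyRootFilesystem": True,
                    "capabilities": {"drop": ["ALL"]},
                },
                # /tmp stays writable through an in-memory volume.
                "volumeMounts": [{"name": "tmp", "mountPath": "/tmp"}],
            }],
            "volumes": [{"name": "tmp", "emptyDir": {"medium": "Memory"}}],
        },
    }


if __name__ == "__main__":
    # "example.invalid/demo:latest" is a placeholder image reference.
    print(json.dumps(hardened_pod("demo", "example.invalid/demo:latest"), indent=2))
```

Further writable paths such as $HOME/.cache would need their own entries in `volumes` and `volumeMounts`, following the same pattern as /tmp.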
        
           | 4oo4 wrote:
           | This was really helpful for me:
           | 
           | https://cheatsheetseries.owasp.org/cheatsheets/Docker_Securi...
        
         | Rapzid wrote:
         | This here. Honestly most orgs with uhh.. Let's say a more
         | mature sense of ROI tradeoffs were doing this from pretty much
         | the very beginning.
         | 
         | Also, Ubuntu 22.04 is only 28.17MB compressed right now so it
         | looks equiv to debian-slim. There are also these new image
         | lines, I can't recall the funky name for them, that are even
         | smaller.
        
           | kstrauser wrote:
           | I'm pushing to go back to Debian from Ubuntu. Canonical's
           | making decisions lately that don't appeal to me, and
           | especially on the server I don't see a clear advantage for
           | Ubuntu vs good ol' rock solid Debian.
        
           | senknvd wrote:
           | > There are also these new image lines, I can't recall the
           | funky name for them, that are even smaller.
           | 
           | You might be thinking of the chiselled images. An interesting
           | idea but very much incomplete[1].
           | 
           | [1]: https://github.com/canonical/chisel-releases/issues/34
        
           | synergy20 wrote:
           | why is it so much smaller than debian slim?
        
             | yjftsjthsd-h wrote:
             | Fewer features (including less "we've supported doing that
             | for 20 years and we're not cutting it now"), packages
             | separated into parts (ex. where debian might separate "foo"
             | into "foo" for the main package and "foo-dev" for headers,
             | alpine will also break out "foo-doc" with manpages and
             | such), general emphasis on being small rather than full-
             | featured.
        
             | Rapzid wrote:
             | It's not, Debian slim is ~28.8MB compressed.
        
         | kachnuv_ocasek wrote:
         | Do you have any tips regarding building R-based container
         | images?
        
           | robinhoodexe wrote:
           | R is kinda difficult and I haven't cracked this one.
           | Currently we're using the rocker based ones[1] but they are
           | based on Ubuntu and include a lot of stuff we don't need at
           | runtime. I'll look into creating a more minimal R base images
           | that's based on Debian-slim.
           | 
           | [1] https://github.com/rocker-org/rocker-versioned2
        
         | nickjj wrote:
         | I made the switch too around 4ish years ago. It has worked out
         | nicely and I have no intention of moving away from Debian Slim.
         | Everything "just works" and you get to re-use any previous
         | Debian knowledge you may have picked up before using Docker.
        
         | galangalalgol wrote:
         | I've run into the same thing for large dev images, but using
         | pure rust often means that musl allows for a single executable
         | and a config file in a from scratch container for deployment.
         | In cases where a slab or bump allocator is used, musl's
         | deficiencies seem minimized.
         | 
         | That means duplication of musl in lots of containers, but when
         | they are all less than 10MB it's less of an issue. Linking
         | against gnu libraries might get the same executable down to
         | less than 2MB but you'll add that back and more for even the
         | tiniest gnu base images.
        
           | synergy20 wrote:
           | what is pure rust, rust needs its own libraries which is a few
           | MB as I recall
        
             | galangalalgol wrote:
             | By pure, I just meant no dependencies on c or c++ bindings
             | other than libc. If that is the case you can do a musl
             | build that has no dynamic dependencies, as all rust
             | dependencies are static. So then your only dependency is
             | the kernel, which is provided via podman/docker. A decent
             | sized rust program with hundreds of dependencies I can get
             | to compile down to 1.5MB. But that is depending on gnu. So
             | if you had 4 or 5 of those on a node, it might be less data
             | to use one gnu base image that is really small like rhel
             | micro, and build rust for gnu. But if you have cpu hungry
             | services like I do, then you usually have only a couple per
             | node, so from scratch musl can be a bit smaller.
        
               | phamilton wrote:
               | > cpu hungry services
               | 
               | Have you benchmarked musl vs glibc in any way? Data I've
               | seen is all over the place and I'm curious about your
               | experience.
        
         | jonwest wrote:
         | In the same boat here as well. Especially when you're talking
         | about container images using JavaScript or other interpreted
         | languages that are bundling in a bunch of other dependencies,
         | the developer experience is much better in my experience given
         | that more developers are likely to have had experience working
         | in a Debian based distro than an Alpine based one.
         | 
         | Especially when you're also developing within the container as
         | well, having that be unified is absolutely worth the
         | convenience, and honestly security and reliability as well. I
         | realize that a container with less installed on it is
         | inherently more secure, but if the only people who are familiar
         | with the system are a small infrastructure/platform/ops type of
         | team, things are more likely to get missed.
        
       | richardwhiuk wrote:
       | This was nearly three months ago?
        
         | Arnavion wrote:
         | Yes, and for the people who link to musl dynamically in their
         | Alpine containers, it's also in Alpine 3.18
        
           | yjftsjthsd-h wrote:
           | And https://alpinelinux.org/posts/Alpine-3.18.0-released.html
           | puts that also nearly 3 months ago, FWIW.
        
       | jake_morrison wrote:
       | I use distroless images based on Debian or Ubuntu, e.g.,
       | https://github.com/cogini/phoenix_container_example
       | 
       | The result is images the same size as Alpine, or smaller, without
       | the incompatibilities. I think Alpine is a dead end.
        
         | suprjami wrote:
         | I hadn't heard of "distroless" before. Confusing name for a
         | container with just main process runtimes, but neat idea.
         | 
         | https://github.com/GoogleContainerTools/distroless
        
       | ecliptik wrote:
       | While I'm glad this is finally addressed, this limits the
       | usefulness of one of my favorite interview questions.
       | 
       | Asking about Alpine in a production environment was always a good
       | way of telling those who have container experience of watching
       | C-Beams glitter in the dark from those who only just read a "10
       | Docker Container Tricks; #8 will blow your mind!" blog post from
       | 2017.
        
         | vbezhenar wrote:
         | I've been using alpine containers for two years on a moderately sized
         | cluster and I've yet to encounter any issues caused by it.
        
           | baq wrote:
           | Famous last words right here.
           | 
           | It's usually the case that everything works until it doesn't.
           | When it's DNS that doesn't work, good luck debugging it
           | unless you've got war stories to tell.
        
         | InvaderFizz wrote:
         | It's still going to be pretty common for at least a few years,
         | and the now incorrect assumption that it is still broken I'm
         | sure will persist for a decade or more among those who have
         | been burned and thus moved on from Alpine and do not follow it.
         | 
         | DNS is a fun rabbit hole for interviews, for sure.
         | 
         | My favorite one to see on a resume is NIS. If you are listing
         | NIS and don't have horror stories or other things to say about
         | NIS, that's a really good indicator of the value of your
         | resume.
         | 
         | I intentionally list NIS on my resume because it is such a fun
         | conversation topic to go on about how security models changed
         | over time, all the ways NIS is terrible, but also how simple
         | and useful it was.
        
           | ecliptik wrote:
           | NIS is a good one. I have UUCP as a skill on my resume as an
           | easter egg but no one ever asks about it.
           | 
           | For DNS, my favorite interview question goes like this,
           | 
           |  _How would you verify DNS is resolving from within a pod on
           | Kubernetes?_
           | 
           | After listening to the answer, add some constraints:
           | 
           | 1. Common networking utilities like ping, nslookup, dig, etc
           | are not available
           | 
           | 2. Container user is unprivileged
           | 
           | 3. su/sudo do not work
           | 
           | This can lead to some elaborate k8s troubleshooting or the
           | simple, and correct, answer of _getent hosts_.
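
For comparison, here is a rough equivalent of the _getent hosts_ check in Python. It assumes an interpreter happens to be present in the image, which the constraints above do not guarantee. `socket.getaddrinfo` goes through the C library resolver, so it exercises the same path the application would use. The `resolve` helper name is made up for illustration.

```python
import socket


def resolve(host: str) -> list[str]:
    """Return the distinct addresses the system resolver gives for host.

    An empty list means the lookup failed.
    """
    try:
        infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    addresses: list[str] = []
    for info in infos:
        address = info[4][0]  # sockaddr tuple starts with the address string
        if address not in addresses:
            addresses.append(address)
    return addresses


if __name__ == "__main__":
    # "kubernetes.default.svc" is the usual in-cluster API service name.
    for name in ("kubernetes.default.svc", "example.com"):
        print(name, resolve(name) or "lookup failed")
```

It needs no elevated privileges, so constraints 2 and 3 do not get in the way.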
        
             | cmeacham98 wrote:
             | After constraint 1, this devolves to a weird game of "does
             | the interviewee realize I don't consider `getent hosts` to
             | be a 'common networking utility' so it's still available?"
        
           | geocar wrote:
           | I still use NIS because hosts files are faster than DNS.
        
         | yjftsjthsd-h wrote:
         | > Asking about Alpine in a production environment was always a
         | good way of telling those who have container experience of
         | watching C-Beams glitter in the dark from those who only just
         | read a "10 Docker Container Tricks; #8 will blow your mind!"
         | blog post from 2017.
         | 
         | I dunno, I've been running containers in prod for a while now
         | and I don't recall Alpine being a problem. Maybe it varies by
         | your workload?
        
         | stefan_ wrote:
         | glibc also has some fun behavior that few people know about
         | because (1) distributions have been patching it and nobody ever
         | actually ran the upstream version and/or (2) downstream
         | software is papering it over:
         | 
         | https://github.com/golang/go/issues/21083
        
       | nathants wrote:
       | i've stopped using containers because they are annoying.
       | 
       | i've started using alpine on my laptop and ec2 because it's not.
       | 
       | different strokes, different folks.
        
       | develatio wrote:
       | IIRC this was causing some exotic problems when deploying docker
       | images based on musl.
        
         | tyingq wrote:
         | I think there are also still some potential problems because it
         | still does some things differently than glibc. Musl defaults to
         | parallel requests if you define more than one nameserver
         | (multiple --dns=, for example, for the docker daemon)...where
         | glibc uses them in the order you provide them.
         | 
         | To be clear, that's not "wrong", but just different maybe from
         | what docker was expecting.
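
The difference described above can be illustrated with a toy simulation. This is not how either libc is implemented: the "nameservers" are plain Python functions with artificial delays, and every name in the sketch is hypothetical. It only shows why the two strategies can return answers from different servers for the same configuration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_nameserver(label: str, delay: float):
    """Return a stand-in for a nameserver that answers after `delay` seconds."""
    def query(name: str) -> str:
        time.sleep(delay)
        return f"{name} answered by {label}"
    return query


def resolve_in_order(name: str, servers) -> str:
    """Ordered strategy: use the first server, move on only if it fails.

    A real resolver would also move on after a timeout.
    """
    for server in servers:
        try:
            return server(name)
        except Exception:
            continue
    raise RuntimeError("no nameserver answered")


def resolve_in_parallel(name: str, servers) -> str:
    """Parallel strategy: ask every server at once, take the first answer."""
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = [pool.submit(server, name) for server in servers]
        for future in as_completed(futures):
            if future.exception() is None:
                return future.result()
    raise RuntimeError("no nameserver answered")


if __name__ == "__main__":
    # The first listed server is slow, the second is fast.
    servers = [fake_nameserver("ns1", 0.3), fake_nameserver("ns2", 0.0)]
    print(resolve_in_order("example.com", servers))
    print(resolve_in_parallel("example.com", servers))
```

If the listed servers do not all serve the same data (for example an internal resolver first and a public one second), the parallel strategy can surface answers that the ordered one would never return.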
        
           | LaLaLand122 wrote:
           | I really hope it does some things differently than glibc:
           | https://sourceware.org/bugzilla/show_bug.cgi?id=19643
           | 
           | In any case glibc uses NSS, so what glibc does depends on the
           | configuration. It may well just forward the request to
           | systemd-resolved.
        
       ___________________________________________________________________
       (page generated 2023-07-30 23:00 UTC)