[HN Gopher] The trouble with symbolic links
       ___________________________________________________________________
        
       The trouble with symbolic links
        
       Author : jwilk
       Score  : 196 points
       Date   : 2022-07-22 09:04 UTC (13 hours ago)
        
 (HTM) web link (lwn.net)
        
       | forrestthewoods wrote:
       | Hard linking files isn't useful in my experience because it
       | requires every tool working with that file to never delete it and
       | recreate it. However, that's exactly what many tools do. So the
       | only way to reliably "share" a file is with a symlink. At least
       | this is true for my workflows.
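
The delete-and-recreate behaviour described above can be sketched as follows. This is a minimal illustration, not taken from the thread: the helper names (save_in_place, save_by_replace) and file names are hypothetical, and it assumes a POSIX filesystem that supports both hard links and symlinks.

```python
import os
import tempfile


def save_in_place(path, data):
    """Overwrite the existing inode; every hard link sees the new data."""
    with open(path, "w") as f:
        f.write(data)


def save_by_replace(path, data):
    """Write a new file and rename it over the old name (a common tool pattern)."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(data)
    os.replace(tmp, path)  # 'path' now names a brand-new inode


def read_text(path):
    with open(path) as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        save_in_place(original, "v1")
        os.link(original, hard)      # second name for the same inode
        os.symlink(original, soft)   # a name that points at the path

        save_in_place(original, "v2")
        after_in_place = (read_text(hard), read_text(soft))

        save_by_replace(original, "v3")
        after_replace = (read_text(hard), read_text(soft))
        still_same_file = os.path.samefile(original, hard)
        return after_in_place, after_replace, still_same_file
```

After the rename-based save, the hard link still names the old inode with the old contents, while the symlink follows the path to the new file.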
        
       | jmillikin wrote:
       | IMO the article's conclusion is backwards. There's nothing wrong
       | with symlinks when files are opened with openat(), since in
       | principle the program (or controlling program) should always be
       | in control of the filesystem layout. It's open() that causes
       | problems, and complex interactions with symlinks in attacker-
       | controlled directories are just one of them.
       | 
       | The POSIX file API was designed before the concept of capability
       | passing (arguably, before the concept of computer security in
       | general). A modern replacement would look more like Fuchsia,
       | where child processes are provided file access scoped to their
       | parent process's authority. This same scoping can also be used
       | within a process, for example to implement a server that can
       | "self-chroot".
       | 
       | > So, more functions following the pattern of openat() had to be
       | created
       | > [...]
       | > Some are still missing, like getxattrat() and setxattrat().
       | 
       | The functions to get/set xattrs on a file descriptor (rather than
       | a path) are fgetxattr() and fsetxattr(). They're not usable for
       | the specific case of a file descriptor opened with O_PATH, but
       | that restriction is both documented and reasonable -- O_PATH
       | doesn't allow operations that inspect the state of the file
       | itself, such as reading/writing.
       | 
       | A better example might have been listxattr() vs flistxattr(),
       | because the former works on a file without read permissions, but
       | the latter fails on a descriptor opened with O_PATH.
       | 
       |     listxattr("xattr-chmod000.txt", NULL, 0) = 14
       |     listxattr("xattr-chmod000.txt", "user.testattr\0", 14) = 14
       |     getxattr("xattr-chmod000.txt", "user.testattr", NULL, 0) = -1 EACCES (Permission denied)
       | 
       | vs
       | 
       |     openat(AT_FDCWD, "xattr-chmod000.txt", O_RDONLY|O_PATH) = 3
       |     flistxattr(3, NULL, 0) = -1 EBADF (Bad file descriptor)
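
For readers who have not used openat(), here is a small sketch of the pattern via Python's os.open(..., dir_fd=...), which maps to openat() on platforms that support it. The helper name open_in_dir and the file names are made up for illustration. Note that O_NOFOLLOW only refuses a symlink in the final path component, so this is not a complete defence on its own.

```python
import errno
import os
import tempfile


def open_in_dir(dir_fd, name, flags=os.O_RDONLY):
    """openat()-style open: resolve 'name' relative to an already-open directory.

    O_NOFOLLOW makes the call fail if the final component is a symlink.
    """
    return os.open(name, flags | os.O_NOFOLLOW, dir_fd=dir_fd)


def demo():
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "data.txt"), "w") as f:
            f.write("hello")
        os.symlink("data.txt", os.path.join(d, "link.txt"))

        dir_fd = os.open(d, os.O_RDONLY | os.O_DIRECTORY)
        try:
            # A regular file opens normally, relative to the directory fd.
            fd = open_in_dir(dir_fd, "data.txt")
            try:
                content = os.read(fd, 100)
            finally:
                os.close(fd)
            # A symlink as the final component is refused.
            try:
                os.close(open_in_dir(dir_fd, "link.txt"))
                symlink_errno = None
            except OSError as e:
                symlink_errno = e.errno
        finally:
            os.close(dir_fd)
        return content, symlink_errno
```

On Linux the refused open reports ELOOP; some other systems report a different errno for the same condition.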
        
         | jerf wrote:
         | "There's nothing wrong with symlinks when files are opened with
         | openat(), since in principle the program (or controlling
         | program) should always be in control of the filesystem layout.
         | It's open() that causes problems, and complex interactions with
         | symlinks in attacker-controlled directories are just one of
         | them."
         | 
         | One of my minor annoyances with new languages is the continued
         | persistence of open-based file APIs, with openat APIs shoved
         | off to the side if they are even implemented at all. If you
         | start from scratch with an openat-based API, it's not even that
         | hard; you basically get a file object, just one with some
         | different attributes and methods (or appropriate local ideas),
         | most of which you don't care about, and it's not that hard to
         | work with if you start with that from day one. It can be quite
         | hard to backport something deeply based on string-based path
         | manipulation into the *at-based APIs, though.
         | 
         | I haven't deeply studied it but you ought to be able to
         | simulate an openat-based API on a conventional filesystem that
         | doesn't support it. It may not immunize you to security issues,
         | but at least the code ought to be as portable as any other code
         | that starts getting detailed about its interactions with
         | filesystems, which is already "kinda, not very, some elbow
         | grease required"... it's not like the bar is sky high because
         | all that stuff already works perfectly across all platforms and
         | filesystems anyhow.
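
As a rough idea of what an openat-first API might look like, here is a sketch of a directory handle object. The Dir class and its method names are hypothetical and are not any existing library's API. It relies on the dir_fd support in Python's os module, which is available on Linux and most Unix-like systems.

```python
import os
import tempfile


class Dir:
    """Hypothetical directory handle: every operation is relative to its fd."""

    def __init__(self, path, dir_fd=None):
        self.fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY, dir_fd=dir_fd)

    def subdir(self, name):
        # Resolved relative to this handle, not to the process's cwd.
        return Dir(name, dir_fd=self.fd)

    def mkdir(self, name):
        os.mkdir(name, dir_fd=self.fd)

    def open(self, name, mode="r"):
        flags = {"r": os.O_RDONLY,
                 "w": os.O_WRONLY | os.O_CREAT | os.O_TRUNC}[mode]
        fd = os.open(name, flags, 0o644, dir_fd=self.fd)
        return os.fdopen(fd, mode)

    def listdir(self):
        return sorted(os.listdir(self.fd))

    def close(self):
        os.close(self.fd)


def demo():
    with tempfile.TemporaryDirectory() as d:
        root = Dir(d)
        root.mkdir("sub")
        sub = root.subdir("sub")
        with sub.open("note.txt", "w") as f:
            f.write("hi")
        with sub.open("note.txt") as f:
            text = f.read()
        names = (root.listdir(), sub.listdir())
        sub.close()
        root.close()
        return text, names
```

Only the first Dir is created from a string path; everything after that is addressed through handles.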
        
       | ziml77 wrote:
       | Interestingly, Windows actually did exactly what's proposed at
       | the end when MS added them in Vista. To minimize the security
       | issues with symlinks, you had to elevate to admin to create them.
       | 
       | It was only during the life of Windows 10 that they even added
       | the option to not have to elevate to create them. It was done
       | specifically because symlinks are often shared across systems
       | since they end up in places like git repos and npm packages:
       | https://blogs.windows.com/windowsdeveloper/2016/12/02/symlin...
        
         | faragon wrote:
         | You could create symlinks since Windows Vista without
         | permission elevation, if you had disabled UAC ("User Account
         | Control").
        
           | ziml77 wrote:
           | Disabling UAC meant there was nowhere to elevate to because
           | you just had admin rights all the time.
        
             | faragon wrote:
             | You're right, thank you.
        
         | alerighi wrote:
         | There is an option (that is proposed when you install git) to
         | allow normal users to create symlinks. The fact is that
         | symlinks are very useful to a developer, and a lot of
         | development tools make use of them.
        
       | alerighi wrote:
       | Well there is the solution: work with file descriptors and not
       | with paths. POSIX should be extended to make sure all functions
       | that take a path also have a version that takes a file
       | descriptor (to avoid the /proc/self/fd/%d hack, which is not
       | portable to non-Linux OSes that don't have /proc, and on Linux
       | requires /proc to be mounted, which is not always the case, for
       | example in sandboxes and chroots).
       | 
       | Also, the problem isn't limited to symlinks: it applies to any
       | path-based operation. For example, it is wrong to check by path
       | whether a file exists and then do something with it, because in
       | the meantime it can be deleted, modified, etc. You have to work
       | with file descriptors, and use only one function (open) to
       | resolve the path into a descriptor one time (that is also more
       | efficient, since resolving a path is computationally expensive,
       | especially on modern filesystems).
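
The "resolve the path once, then use the descriptor" advice could look like this in practice. It is a sketch only: read_regular_file is a made-up helper, and it deliberately ignores special files such as FIFOs, where a plain open() can block.

```python
import os
import stat
import tempfile


def read_regular_file(path):
    """Resolve the path once, then ask every further question of the descriptor."""
    try:
        fd = os.open(path, os.O_RDONLY)   # the only path lookup
    except FileNotFoundError:
        return None
    try:
        info = os.fstat(fd)               # describes the file we actually opened
        if not stat.S_ISREG(info.st_mode):
            return None
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.bin")
        with open(path, "wb") as f:
            f.write(b"payload")
        return (read_regular_file(path),
                read_regular_file(os.path.join(d, "missing")),
                read_regular_file(d))
```

Compare this with the racy pattern of calling os.path.exists(path) or os.stat(path) and then opening the same path again: the name may refer to a different file by the second lookup.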
        
       | bhawks wrote:
       | A posix filesystem by itself is not a defensible security
       | perimeter. Symlinks introduce security problems but there are
       | other sources as well. If you have a system where processes with
       | different trust profiles share a common view of a file system you
       | have to assume one can manipulate the filesystem state to subvert
       | the other.
       | 
       | Android has dealt with this via locking down and isolating apps
       | to their own filesystems. Cross app communication and data
       | sharing utilizes IPC primitives that have rich caller/callee
       | information that can be used to build capabilities and
       | authn/authz checks.
       | 
       | The posix filesystem just doesn't have the
       | abstractions/expressiveness one would need to build a robust
       | security perimeter between untrusted apps.
        
         | philsnow wrote:
         | > The posix filesystem just doesn't have the
         | abstractions/expressiveness one would need to build a robust
         | security perimeter between untrusted apps.
         | 
         | This, absolutely, but I think it's even worse than that; in my
         | mind the value prop of k8s is twofold: declarative
         | configuration and isolation that forces apps to be able to
         | interact over a very small boundary, the network overlay.
        
       | stjohnswarts wrote:
       | Does anyone else read something like this
       | 
       |  _Jeremy Allison gave a talk titled "The UNIX Filesystem API is
       | profoundly broken: What to do about it?"._
       | 
       | and immediately shut down on the person saying it because we all
       | know it's extremely hyperbolic given what is happening in
       | reality?
       | 
       | It's great to not like something and point out its flaws as far
       | as you use them and "here's the great idea to fix those issues"
       | but to try to offend your audience and the Unix community as the
       | first words you see in a talk is a great way to have people shut
       | off and think "oh great another neckbeard with an overinflated
       | sense of self"
        
       | yyyk2 wrote:
       | No, there is absolutely nothing broken about symbolic links. What
       | is the issue here is accessing user files as root. That is
       | inherently unsafe in POSIX and also affects hardlinks as well
       | (even more so, since you have to "follow" hardlinks
       | http://michael.orlitzky.com/articles/posix_hardlink_heartach...).
        
       | marcosdumay wrote:
       | I really think that an "open the file as this user or group (also
       | constrained to the current user's permissions)" option will solve
       | the security problems better than the "open the file relative to
       | this root" one. Or, failing that, a usable capability system
       | (not SELinux).
       | 
       | I don't think just checking for symlinks really solves anything.
       | It may make your bugs harder to exploit, which is always good, but
       | people use symlinks, so you have to support them, so the bugs
       | will stay there.
        
       | smelbe wrote:
       | There doesn't seem to be a way to batch together operations that
       | involve walking through directories and symlinks to do something
       | to a file. This seems to be a major source of complexity.
        
         | rwmj wrote:
         | I always thought Unix v7+ should have added some kind of way to do
         | atomic groups of syscalls, eg:
         | 
         |     begin_transaction ();
         |     lstat ("/path", ...);
         |     lstat ("/path/foo", ...);
         |     commit ();
         | 
         | In Unix v7 mkdir was not a system call. It was a setuid program
         | implemented using mknod + link. That was racy so the mkdir(2)
         | system call was added. But it could have been solved more
         | generally (and more elegantly) by adding transactions.
         | 
         | It could also solve the whole thing with ending up with zero-
         | length files because you didn't use the right incantation to
         | update a file atomically on ext4
         | (https://thunk.org/tytso/blog/2009/03/12/delayed-
         | allocation-a...).
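
The "right incantation" for replacing a file without risking a zero-length result is usually given as: write a temporary file in the same directory, fsync it, rename it over the target, then fsync the directory. The sketch below follows that recipe. The atomic_write name is hypothetical, and the durability guarantees still depend on the filesystem and its mount options.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace 'path' so readers see either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for rename to be atomic.
    # mkstemp creates it with mode 0600.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())      # data reaches disk before the rename
        os.replace(tmp, path)         # atomic swap of the name
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    # Persist the directory entry as well.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def demo():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "config.txt")
        atomic_write(target, b"old")
        atomic_write(target, b"new")
        with open(target, "rb") as f:
            return f.read(), sorted(os.listdir(d))
```

This covers a single file only; it does not provide the multi-call transaction suggested above.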
        
           | bhawks wrote:
           | A general purpose transactional interface widens the error
           | space to include cross process deadlocks / denial of service
           | not to mention performance issues.
        
           | amaranth wrote:
           | Wasn't making userspace handle these kinds of things a big
           | part of "worse is better"?
        
             | masklinn wrote:
             | Turns out when facing adversarial actors worse is just
             | worse.
        
         | klodolph wrote:
         | Could you elaborate? Seems like there's a bunch of things that
         | can't be batched together on an ordinary file, without
         | involving symlinks.
        
           | masklinn wrote:
           | What they (and TFA) are saying is that there is no
           | transactional view of the FS. If you could work in
           | "repeatable read" (only and always see the state of the FS
           | before you started the transaction) symlink races wouldn't be
           | possible.
        
             | klodolph wrote:
             | Right, but there is no transactional view with or without
             | symlinks.
        
               | masklinn wrote:
               | Without symlinks it doesn't matter, because by definition
               | a symlink race requires a symlink to be involved.
        
       | js2 wrote:
       | I'm sorry symlinks are a thorn in Jeremy's side, but they are
       | useful from a user's perspective. Hard links don't fill the same
       | need. You can't normally hard link directories. If a file has
       | multiple links, finding them all normally requires scanning the
       | entire file system, so deleting a file now becomes harder. A file
       | with multiple links doesn't have an obvious canonical path.
       | 
       | As an example of all these issues, I manage a bunch of Mac build
       | hosts with multiple Xcode versions installed. We only retain the
       | most recent patch release of each major.minor version, but drop
       | compatibility symlinks in place for the other versions. On macOS,
       | an application is just a directory. So for example we'll have:
       | 
       |     Xcode-13.0.app -> Xcode-13.0.1.app
       |     Xcode-13.0.1.app
       | 
       | From a simple "ls" it's obvious which versions are installed and
       | which are just compatibility shims. Symlinks are just so damn
       | convenient for use cases like this. Hard links don't cut the
       | mustard here.
       | 
       | So there are more reasons for symlinks than just "hard links are
       | restricted to linking within the same filesystem", but yes, that
       | too.
       | 
       | Probably I'm just lacking imagination and there's a solution that
       | offers the advantages of symlinks with none of the downsides, but
       | in my experience, we see this sort of indirection all over the
       | computing landscape, so it seems like there's a fundamental need
       | for it.
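
The "obvious from a simple ls" property is also easy to get programmatically. As a small sketch, the hypothetical classify helper below reports which entries in a directory are symlink shims and where they point. The directory names mirror the example above, but the layout is recreated in a temporary directory.

```python
import os
import tempfile


def classify(directory):
    """Map each entry name to its symlink target, or None for a real entry."""
    result = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_symlink():
                result[entry.name] = os.readlink(entry.path)
            else:
                result[entry.name] = None
    return result


def demo():
    with tempfile.TemporaryDirectory() as d:
        os.mkdir(os.path.join(d, "Xcode-13.0.1.app"))
        os.symlink("Xcode-13.0.1.app", os.path.join(d, "Xcode-13.0.app"))
        return classify(d)
```

With hard links there is no equivalent query: neither name would be marked as the shim.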
        
         | lxgr wrote:
         | > You can't normally hard link directories.
         | 
         | That's only to avoid loops, as far as I understand. Symlinks do
         | allow loops, but require application programmers to handle
         | them. So maybe we just need better APIs/API contracts around
         | loops, rather than two types of links?
         | 
         | > If a file has multiple links, finding them all normally
         | requires scanning the entire file system
         | 
         | Couldn't this pretty easily be solved at the file system level?
         | Just store a back pointer from a file to each of its names.
         | 
         | The fact that it's possible to break symlinks very easily by
         | deleting the pointed-to file (name) is a problem as well:
         | Wouldn't application developers usually, or at least sometimes,
         | want to know about the fact that they are about to break a link
         | (or conversely, not deleting the final copy of a file and not
         | just a reference to it)?
         | 
         | > So there are more reasons for symlinks than just "hard links
         | are restricted to linking within the same filesystem"
         | 
         | I think this might be the only real (technical/historical)
         | limitation. The rest could probably be worked around, but maybe
         | we ended up with two distinct types of links, with these other
         | binary decisions (allowing loops, making deletion explicit vs.
         | a matter of reference counting) more or less arbitrarily
         | bucketed into those two types based on what was easier to
         | implement.
        
           | XorNot wrote:
           | Hard links don't have a canonical name though - they're all
           | equally the same file, and this is really a problem: opening
           | and editing a file in one location, edits it in all of them
           | without you knowing what those locations might be.
           | 
           | Symlinks at least explicitly declare the dependency and how
           | it should mutate.
           | 
           | A classic being /etc/resolv.conf symlinks - if I'm untarring
           | and restoring a symlink for it, I'm currently saying the file
           | should have content from somewhere else on the system - not
           | that the file _is_ specific content.
        
             | masklinn wrote:
             | > Hard links don't have a canonical name though - they're
             | all equally the same file, and this is really a problem:
             | opening and editing a file in one location, edits it in all
             | of them without you knowing what those locations might be.
             | 
             | That is something the filesystem could store tho, in the
             | same way it stores the number of links to a file it could
             | be a bit more capable and store the links themselves
             | (possibly in a xattr).
             | 
             | > Symlinks at least explicitly declare the dependency and
             | how it should mutate.
             | 
             | They only declare one dependency one way, it's not like a
             | symlink gives you all the other symlinks to the terminal
             | location it will affect.
        
             | GoblinSlayer wrote:
             | Symlinks do that too even inevitably: no matter how you
             | change the file, it changes at all links and you can't
             | prevent it; systemd uses this feature when it creates
             | dependency references (the linked dependency must never
             | differ from the source, which hard links don't ensure).
        
           | mzs wrote:
           | ELOOP errno
        
           | jandrese wrote:
           | > Couldn't this pretty easily be solved at the file system
           | level? Just store a back pointer from a file to each of its
           | names.
           | 
           | In theory yes, but no filesystem does this as far as I know.
        
           | js2 wrote:
           | I tried to construct my argument to make it clear that I'm
           | aware there are ways to solve the issues with hard links, but
           | they have their own sets of trade-offs.
           | 
           | For hard links, it's not only that they can cause loops.
           | There are the other issues I outlined (linking across file
           | systems, no single canonical representation of the file in
           | the file system, finding all the links to the file, etc).
           | 
           | There's no "just store a back pointer." That will obviously
           | introduce its own set of complexities and trade-offs. Where
           | do you store the pointers? What's the API for viewing them?
           | What's the CLI for viewing them? Is it a new switch to `ls`?
           | A new CLI entirely? How do you keep the pointers up to date?
           | What sort of locking is needed when updating the pointers?
           | What about `fsck`? How do you get this implemented across the
           | multitude of Unix and Unix-like OS's and file systems?
           | 
           | (As an aside, I've been really trying to stop using the word
           | "just" lately as I've learned that things are rarely so
           | simple to justify the word.)
           | 
           | Again, I'm not saying there isn't a better solution, but I
           | don't think it's patching up hard links. I think it's
           | something outside the box of both hard links and symbolic
           | links.
        
             | lxgr wrote:
             | > [...] I think it's something outside the box of both hard
             | links and symbolic links.
             | 
             | Absolutely agreed - given your examples and all the other
             | challenges around backwards compatibility with decades of
             | application code, I'd also assume it would be something new
             | entirely.
             | 
             | But my guess is that it would be able to meet the existing
             | use cases of both.
        
             | DiggyJohnson wrote:
             | Re: Symlink analysis: Well said.
             | 
             | > (As an aside, I've been really trying to stop using the
             | word "just" lately as I've learned that things are rarely
             | so simple to justify the word.)
             | 
             | Me too! I realized how it immediately frustrated me to hear
             | it used about my domains. I'm constantly having to work to
             | not seem as short/blunt/know-it-all as I _feel_. I think
             | this word is a connotation trap, because when I use it
             | feels inoffensive, but when I hear it seems blunt and
             | dismissive and I'm quick to assume the person doesn't
             | understand or empathize with the complexities of the
             | situation. That's a long way of saying I really enjoyed
             | your aside.
        
           | lloeki wrote:
           | >> You can't normally hard link directories.
           | 
           | > That's only to avoid loops, as far as I understand
           | 
           | Later HFS+ does support directory hard links, a feature
           | introduced for Time Machine IIRC, but generally unavailable
           | to the user.
        
           | kazinator wrote:
           | Symlink loops are handled in the pathname resolution function
           | in the kernel. Too many indirections of symlinks (typically
           | around forty or so?) result in the resolution bailing with an
           | ELOOP errno.
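
The ELOOP behaviour is easy to observe from user space. A minimal sketch, assuming a POSIX system; the two link names are arbitrary:

```python
import errno
import os
import tempfile


def demo():
    """Create a two-link cycle and return the errno from resolving it."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a")
        b = os.path.join(d, "b")
        os.symlink("b", a)   # a -> b
        os.symlink("a", b)   # b -> a
        try:
            os.stat(a)       # stat() follows symlinks, so it walks the cycle
        except OSError as e:
            return e.errno
        return None
```

The kernel gives up after its indirection limit rather than detecting the cycle as such, so a long enough chain of non-looping symlinks produces the same error.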
        
             | gweinberg wrote:
             | I at first read "errno" as "emo" and was trying to picture
             | what that would look like.
        
         | waynesonfire wrote:
         | symlinks are great, I don't see why we would remove such
         | feature. The author pointed out a bunch of issues around atomic
         | operations related to symlinks which in my view are valid.
         | Similar TOCTOU race exists with PIDs, see
         | https://lwn.net/Articles/773459/
         | 
         | Not sure whether the pid issue was ever resolved, haven't
         | checked in on that in a while.
        
           | IshKebab wrote:
           | Symlinks are great from a "just make it work!" point of view
           | but they're absolutely terrible from a "make it robust, sane
           | and secure" point of view.
           | 
           | All of the points in the article are valid but there's even
           | simpler stuff like the fact that you can't canonicalise paths
           | (resolve ..) without reading the filesystem.
           | 
           | This should be required reading:
           | https://9p.io/sys/doc/lexnames.html
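
The point about ".." can be shown in a few lines: a purely lexical cleanup and a filesystem-aware resolution disagree as soon as a symlink is involved. This sketch uses only the standard os.path functions; the directory layout is invented for the example.

```python
import os
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as d:
        d = os.path.realpath(d)   # avoid surprises if the temp dir is itself a symlink
        os.makedirs(os.path.join(d, "a", "b"))
        os.symlink(os.path.join("a", "b"), os.path.join(d, "link"))  # link -> a/b

        path = os.path.join(d, "link", "..")
        lexical = os.path.normpath(path)   # string manipulation only
        actual = os.path.realpath(path)    # consults the filesystem
        return os.path.relpath(lexical, d), os.path.relpath(actual, d)
```

The lexical result says "link/.." is the top directory, while the kernel would take you to "a", the parent of the symlink's target.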
        
           | GoblinSlayer wrote:
           | https://man7.org/linux/man-pages/man2/pidfd_open.2.html
        
         | OnlyMortal wrote:
         | I take it an "alias" isn't good in this case? An alias does
         | follow a move of the original - usually.
        
           | yjftsjthsd-h wrote:
           | Interesting; how does that work under the hood?
        
             | OnlyMortal wrote:
             | https://en.m.wikipedia.org/wiki/Alias_(Mac_OS)
        
             | js2 wrote:
             | It's macOS magic that requires HFS/APFS and doesn't work at
             | the POSIX layer. It would not work for my use case, no.
             | 
             | An alias is like a hybrid between a symbolic link and a
             | hard link. Like a symbolic link, it's its own file type
             | whose contents point to the original, but like a hard link
             | it points to the original using its ID, not its path. So an
             | alias works even if the original is moved, but it does not
             | increase the original's link count and is its own distinct
             | entity in the file system.
        
       | rob_c wrote:
       | The trouble with symbolic links is users who don't understand/use
       | linking coming from alternate platforms and developing on/for
       | UNIX.
       | 
       | This is a feature in the same way that Shellshock was used as a
       | feature for many years by experts. Thankfully I'm expecting
       | something like POSIX to save us this time.
        
       | cosmiccatnap wrote:
       | Features add complexity and robust features add robust
       | complexity. Robust features that span a core component like
       | filesystem handling span many utilities.
       | 
       | Maybe we should do the same search on samba vulnerabilities and
       | have him take a look in the mirror...
        
       | jhallenworld wrote:
       | Another problem with hard links: you can not hard link a
       | directory. It would make ".." ambiguous.
       | 
       | Also with hard link directories, you would want to be able to
       | "rmdir" non-empty directories, just to delete the link. But then
       | you have the problem of reference loops, so how do you reclaim
       | space reliably? You would need a garbage collection algorithm to
       | find data not reachable by root.
        
         | jmillikin wrote:
         | Whether directories can be hardlinked depends on the filesystem
         | and OS. When macOS switched from HFS+ to APFS, one of the
         | changes was that they dropped support for directory hardlinks.
         | 
         | https://developer.apple.com/library/archive/documentation/Fi...
        
           | jhallenworld wrote:
           | I'm not a macOS user, but these sure seem to add complexity:
           | 
           | https://stackoverflow.com/questions/80875/what-is-the-
           | unix-c...
           | 
           | The filesystem has to check for and disallow loops.
        
         | samatman wrote:
         | A filesystem can afford a very, very slow, mark and sweep.
        
         | nine_k wrote:
         | The GC algorithms for hardlinked directories can be the same as
         | for hardlinked files: reference counting.
        
           | jhallenworld wrote:
           | Only if there are no loops..
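
The disagreement above can be made concrete with a toy model. The sketch below treats directories as nodes in a graph (a dict from name to child names, which is a simplification and not how any real filesystem stores links) and compares what reference counting frees after an unlink with what a mark-and-sweep from the root finds unreachable.

```python
def freed_by_refcount(edges, root, removed_edge):
    """Remove one link; free nodes whose incoming-link count drops to zero."""
    edges = {node: list(children) for node, children in edges.items()}
    counts = {node: 0 for node in edges}
    for children in edges.values():
        for child in children:
            counts[child] += 1
    counts[root] += 1                 # the root is always held
    parent, child = removed_edge
    edges[parent].remove(child)
    freed = set()
    pending = [child]                 # nodes that just lost one incoming link
    while pending:
        node = pending.pop()
        counts[node] -= 1
        if counts[node] == 0:
            freed.add(node)
            pending.extend(edges[node])   # its outgoing links go away too
            edges[node] = []
    return freed


def unreachable_after_unlink(edges, root, removed_edge):
    """Remove one link; mark everything reachable from root, return the rest."""
    edges = {node: list(children) for node, children in edges.items()}
    parent, child = removed_edge
    edges[parent].remove(child)
    marked = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in marked:
            continue
        marked.add(node)
        stack.extend(edges[node])
    return set(edges) - marked


tree = {"root": ["a"], "a": ["b"], "b": []}
loop = {"root": ["a"], "a": ["b"], "b": ["a"]}   # b links back to a
```

For the tree, both approaches agree. For the loop, reference counting frees nothing after root's link to "a" is removed, because "a" and "b" keep each other alive, while the sweep reports both as garbage.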
        
       | mongol wrote:
       | "X is fundamentally broken" is a tired trope. To me, something is
       | broken if it is no longer working as intended. It used to work,
       | but now it does not - it is broken.
       | 
       | If something works as intended, but its utility is limited, and
       | it can be improved, it is not broken.
        
         | joosters wrote:
         | Symlinks work as intended, but they cause a lot of unintended
         | security vulnerabilities, i.e. they break lots of otherwise-
         | functioning code.
         | 
         | You can play with your words and redefine their meanings, but
         | the vulnerabilities remain.
        
           | pif wrote:
           | > they break lots of otherwise-functioning code.
           | 
           | There is no otherwise! POSIX has symbolic links: if your
           | software does not function with symbolic links, it does not
           | function on POSIX.
        
             | IshKebab wrote:
             | Of course there is otherwise. Windows (more or less)
             | doesn't have symlinks. Plan9 doesn't have symlinks. You
             | don't _have_ to have symlinks.
             | 
             | Can you not imagine anything other than POSIX?
        
           | mongol wrote:
           | They have been around for 40+ years, they don't break code
           | unless we are talking about code predating their
           | introduction. It is not me playing with words, I am just
           | pointing out a tired trope.
        
           | indymike wrote:
           | > You can play with your words and redefine their meanings,
           | but the vulnerabilities remain.
           | 
           | I don't think OP was trying to play with words, I do think
           | there's an absolute "this symlink thing is a vuln" vs an
           | absolute "I use symlinks to make X work" argument. Symlinks
           | have always been at the line between the absolutes. They do
           | enable a great deal of functionality but they can be a
           | security risk, and source of bugs when developers don't
           | handle them correctly. That said, they are a heavily used
           | feature on Unix-like OSes. My /usr/bin on Ubuntu has 48 of
           | them (most were put there by apt installed packages).
        
           | rob_c wrote:
           | > but they cause a lot of unintended security vulnerabilities
           | 
           | No, bad coders on UNIX platforms do this.
           | 
           | The code may be valid code, but if it's intending to support
           | running on UNIX it should do it properly not assuming it's on
           | a FAT32 filesystem in 2022.
           | 
           | Or, even better, run the code you don't trust on fat32
           | filesystems, see how far that gets.
        
             | joosters wrote:
             | Don't victim blame.
             | 
             | Do you really think that using open(), stat(), lstat() (!),
             | realpath(), mkdir(), rename() etc etc etc is a sign of a
             | bad coder? The problem is that the APIs set you up for
             | unexpected failure, and even some of the provided
             | workarounds to 'safely' handle symlinks don't do it well
             | enough.
             | 
             | In the case of symlinks, I think it's fair to blame the
             | tools rather than the workman.
        
               | zx8080 wrote:
               | An API is always a simplification and is not supposed to be
               | used without understanding the concepts and reality under
               | the hood.
               | 
               | Example: you want to show 1M POIs in a browser on some small
               | territory. The Openmaps/googlemaps API allows that, no prob.
               | Looks good, yeah? Sorry, doesn't work, because 1M is too
               | many to show and the browser gets stuck.
               | 
               | An API does not prevent _all_ the kinds of legshooting
               | engineers invent.
        
               | rob_c wrote:
               | Don't worry, they don't listen to speaking out against
               | bigG or others for being bad
        
               | rob_c wrote:
               | This is not victim blame. RTFM READ IT!!!
               | 
               | Most sane languages and low level tools describe what you
               | want and how to work correctly.
               | 
               | If you don't want this feature in the filesystem, move to
               | one that doesn't support it, or better yet submit a patch
               | to run the filesystem you want with this feature
               | deactivated for "security concerns".
               | 
               | Demanding a whole OS change the way it works for
               | bad/lazy/inept coders is akin to 2 people getting blind
               | drunk and blaming the other person or the drink for the
               | stupid things they did. Take some responsibility.
        
               | com2kid wrote:
               | > Demanding a whole OS change the way it works for
               | bad/lazy/inept coders is akin to 2 people getting blind
               | drunk and blaming the other person or the drink for the
               | stupid things they did. Take some responsibility.
               | 
               | And if people were just more careful, none of rust's
               | memory safety stuff is needed! Also, why do modern
               | languages hand hold multi-threading so much, just give
               | developers some mutex primitives and let them have at it,
               | the good coders will be just fine!
               | 
               | Of course the rest of us will have to deal with machines
               | getting pwned due to security bugs, but hey, at least the
               | "well written" programs won't have those problems...
        
               | rob_c wrote:
               | Stop defending poor coding and lack of skill.
               | 
               | Everything you're saying is an excusory situation for
               | hiring poor coders at minimum wage who can't or won't
               | read documentation. This is 80IQ points South of frankly
               | most of the conversations on here.
               | 
               | Yes the rest of us cope with security incidents. There
               | will always be security incidents. Stop defending
               | practices that lead to them.
        
               | com2kid wrote:
               | > Stop defending poor coding and lack of skill.
               | 
               | This is a hopelessly elitist attitude. Also it is a
               | useless one, over 1000 CVEs, yelling "be better at your
               | job!" is just going to result in another 1000 CVEs. That
               | is exactly what happened for decades with buggy C code,
               | buffer overflows and use after frees, for a long time the
               | refrain was "just do better!".
               | 
               | Well millions of dollars of damages later, it turns out
               | berating people to "just do better" doesn't actually make
               | things any better. A combination of static analysis and
               | runtime tooling, and then the eventual creation of new
               | programming languages that allow for correct modeling of
               | memory ownership, is what the industry en masse has
               | decided on.
               | 
               | For APIs that get misused? The solution is to provide
               | higher level APIs that allow programmers to easily
               | accomplish the correct thing in a secure manner.
               | 
               | As an aside, and in general, when designing software, I
               | want to maximize the amount of brain power I am
               | dedicating to solving the business problem at hand.
               | Dealing with poorly designed insecure APIs detracts from
               | me getting my actual job done.
               | 
               | > Stop defending practices that lead to them.
               | 
               | The practice in this case is the direct use of filesystem
               | APIs that were designed in the 1970s for a very different
               | security ecosystem than what exists today.
               | 
               | Lots of things designed in the 1970s are not secure by
               | default. Heck, most things designed in the 1970s, outside
               | of maybe some IBM mainframe stuff, were not designed to be
               | secure by default.
               | 
               | What you are arguing is that instead of buying a fire
               | extinguisher to put in the kitchen of an old house,
               | people should just try and not set things on fire.
               | 
               | I mean, yeah, sure, good goal, but _buy the fire
               | extinguisher anyway_.
        
               | rob_c wrote:
               | How is it hopelessly elitist to call out insecure code as
               | being INSECURE!!!
               | 
               | My whole point is the same as yours: fix it at the source.
               | You seem to think hacking off the hands of some coders is
               | safer (I may agree). But why not try to EDUCATE THEM?!?
               | 
               | Education costs 1000s of dollars at most rather than your
               | hypothetical billion dollar APPLICATION LEVEL hack.
               | 
               | Why are you so elitist to assume people can't cope with
               | these concepts?
               | 
               | My whole point is that they need to be taught they're
               | running on a Unix server rather than a 1998 SD card. The
               | rest of your complaining is either you don't understand
               | this or are trying to excuse bad or insecure practice as
               | acceptable. If this is your case. RUN THE CODE IN AN
               | ENVIRONMENT WHERE THIS CAN'T HAPPEN. Seriously there are
               | filesystems and options for this.
               | 
               | Calling for these features to be removed from extX, ZFS
               | or other shows you don't understand storage technologies
               | well enough.
        
               | com2kid wrote:
               | > How is it hopelessly elitist to call out insecure code
               | as being INSECURE!!!
               | 
               | If a lot of code, written by a lot of different
               | engineers, all ends up being insecure, it is worth
               | asking, why is code dealing with this particular domain
               | so often insecure?
               | 
               | > But why not try to EDUCATE THEM?!?
               | 
               | You can do that, and of course we should, but here is the
               | thing about security:
               | 
               | The good guys have to write secure code _every_ time, or
               | else the attackers win.
               | 
               | Eternal vigilance is inhumanly hard to maintain. A better
               | solution is to write higher level APIs or API wrappers
               | that don't have these flaws.
               | 
               | > Why are you so elitist to assume people can't cope with
               | these concepts?
               | 
               | Sure they can, but how many concepts can people cope with
               | at once? Humans have a limit for how much they can juggle
               | in their head. A huge part of software engineering is
               | picking what abstraction layer to operate at. If I am
               | writing code that deals with tons of string parsing and
               | manipulation, I'd be a fool to write it in C or C++. Now
               | I've done that when I needed the performance, but
               | managing a massive number of strings in native code is
               | easily 5x the work compared to using a GC language that
               | also automatically tracks string length.
               | 
               | C is the wrong abstraction there. And indeed an obscene
               | number of security holes have historically centered
               | around string processing in C. That is because on top of
               | managing all the business logic (which may be obscenely
               | complicated by itself!) engineers now have to do so in a
               | language that is _really_ bad at dealing with strings and
               | they have to do a lot of mental work to ensure the code
               | is correct.
               | 
               | If I am manually flipping bits in hardware, well, JS can
               | do it (I have seen it!) but honestly, that shouldn't be
               | anyone's language of choice for directly interfacing with
               | hardware.
               | 
               | (Doing that in C, really fun!)
               | 
               | > Calling for these features to be removed from extX, ZFS
               | or other shows you don't understand storage technologies
               | well enough.
               | 
               | I am not saying that. I am saying that the original POSIX
               | APIs make writing secure code around symlinks hard, and I
               | am saying that solely based on the fact that a bunch of
               | security holes around POSIX APIs and symlinks exist!
               | 
               | This isn't some shocking statement. The original POSIX
               | APIs make a lot of things hard.
        
             | the8472 wrote:
             | Part of the problem is that handles on directories, on which
             | one can then use the *at family of syscalls, are not
             | first-class citizens in many programming languages. Which
             | in turn might be due to portability concerns with Windows,
             | e.g. Java's SecureDirectoryStream isn't available there[0].
             | Apparently Windows does have an openat-like API[1], but
             | it's low-level.
             | 
             | Programmers aren't using them because the language standard
             | libraries point them in the wrong direction.
             | 
             | [0] https://github.com/google/guava/wiki/Release21#user-content-...
             | [1] https://github.com/rust-lang/rust/blob/1c63ec48b8cbf553d291a...
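
A minimal sketch of the directory-handle pattern discussed in this comment, assuming Linux and Python's `os.open(..., dir_fd=...)`, which maps to `openat()`. The helper name `open_beneath` and the file names are hypothetical and chosen only for illustration; this is not a hardened implementation.

```python
import os
import tempfile


def open_beneath(dir_fd: int, name: str) -> int:
    """Open `name` relative to an already-open directory handle (openat-style)."""
    if "/" in name or name in ("", ".", ".."):
        raise ValueError("expected a single path component")
    # O_NOFOLLOW makes the open fail if the final component is a symlink.
    return os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=dir_fd)


with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "data.txt"), "w") as f:
        f.write("hello")
    # A symlink an attacker might plant inside the directory.
    os.symlink("/etc/passwd", os.path.join(root, "evil"))

    # Resolve the directory once; later lookups are relative to this handle,
    # not to a path string that could be re-resolved differently.
    dir_fd = os.open(root, os.O_RDONLY | os.O_DIRECTORY)
    try:
        fd = open_beneath(dir_fd, "data.txt")
        with os.fdopen(fd) as f:
            content = f.read()
        try:
            os.close(open_beneath(dir_fd, "evil"))
            symlink_rejected = False
        except OSError:
            symlink_rejected = True
    finally:
        os.close(dir_fd)
```

Because the directory is pinned by a descriptor, renaming or replacing the directory's path afterwards does not change which directory the relative opens refer to.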
        
               | rob_c wrote:
               | This is a serious issue in the language's support for
               | running on a UNIX system.
               | 
               | Frankly, blaming the kernel/filesystem for this is like
               | saying I want my 720p monitor to display 8k correctly, it
               | must be the display driver's fault...
        
               | ziml77 wrote:
               | Weird that you need to dip down to an Nt* function to
               | open a child given a parent handle. It's not like it's
               | unheard of to be able to do that in the Windows API. The
               | registry is also a hierarchical system and opening any
               | key requires passing in a parent key handle (the root
               | handles are predefined).
        
               | ChrisSD wrote:
               | It's a quirk of history, imho. If Win32 had been written
               | without concern for what came before, it probably would
               | have more closely followed NT conventions.
               | 
               | But it wasn't. It followed on from Win16 and DOS so, to
               | an extent, it emulated DOS-style path and file handling.
               | After all, that's what developers and users were familiar
               | with. The Windows registry did not have all this baggage
               | so it followed the style of the NT kernel.
               | 
               | Though this doesn't explain why Win32 never added
               | CreateFileRelativeToDirectoryHandleW
        
         | littlestymaar wrote:
         | That "X - designed in the 70s when we had no idea of anything
         | regarding computers - is fundamentally broken" isn't so
         | surprising after all.
         | 
         | In fact, computers are probably the only place in the entire
         | technology landscape where we keep using almost unmodified
         | stuff from the 70s and decided we cannot change it because
         | there are too many things relying on it.
         | 
         | I don't like breaking everything all the time any more than
         | anyone else, but maybe once every 20 or 30 years is OK...
        
           | SiempreViernes wrote:
           | I take it that here "technology landscape" means something
           | like "the gui portion of the software stack", right?
        
           | Beltalowda wrote:
           | > In fact, computers are probably the only place in the
           | entire technology landscape where we keep using almost
           | unmodified stuff from the 70s and decided we cannot change it
           | because there are too many things relying on it.
           | 
           | Bridges and buildings from the 1970s (and much older) are
           | still working fine today.
           | 
           | The thing is, if I do decide to replace my bridge or building
           | because it's old and outdated then I can just replace that
           | one thing without affecting much else. With computers, that
           | is obviously not the case: you need to replace the entire
           | city.
           | 
           | Plus, it's not really the case that we "keep using almost
           | unmodified stuff from the 70s"; while many concepts remained
           | the same and things remained compatible, things have been
           | greatly extended and modified since; it's like those old
           | buildings that were built during the middle ages (or
           | sometimes even earlier) that have been changed and upgraded
           | extensively throughout the centuries to the point you really
           | need to know where to look to see it's actually a centuries-
           | old building.
        
             | mmis1000 wrote:
             | > Bridges and buildings from the 1970s (and much older) are
             | still working fine today.
             | 
             | I am not sure about that. Floods here now go past the
             | 200-year average line that was assumed when many bridges and
             | buildings were designed, and they actually break a lot of
             | buildings.
             | 
             | Climate change these days is just as unexpected to who we
             | were back then as hackers are.
        
             | Shorel wrote:
             | It would be like upgrading the train network to increase
             | the distance between rails, to increase the size of cargo
             | that can be transported.
             | 
             | Not just one city needs the upgrade, all of them will need
             | it, and all the related infrastructure like bridges and
             | tunnels too.
        
             | com2kid wrote:
             | > Bridges and buildings from the 1970s (and much older) are
             | still working fine today.
             | 
             | With modern earthquake straps added, and I bet the locks
             | got replaced a few times over, also the building likely had
             | its insulation improved, better venting added, a sprinkler
             | system, fire exits, and a wheelchair ramp put in at some
             | point.
             | 
             | Are there a few quaint stone bridges from 1700 still in
             | use? Sure, going over the neighborhood creek. But all the
             | bridges around me have undergone serious upgrades or
             | retrofitting over the decades.
        
           | toast0 wrote:
           | Internal plumbing is largely unchanged. Sure, there's more
           | flexible pipe, and a lot more plastic pipe, and a lot more
           | quarter turn valves, but thread pitch and pipe diameters are
           | largely unchanged and unchangeable.
        
           | kllrnohj wrote:
           | I don't think that's entirely true. There are plenty of major
           | systems that have made fundamentally incompatible breaking
           | changes in order to move things forward. Windows did that
           | with Vista, Android did that with Linux (e.g., app sandboxing
           | per UID, heavily restricted filesystem access to shared
           | directories), etc.
           | 
           | It's kinda mainly desktop/server Linux where there's this
           | inability to move forward.
        
         | nine_k wrote:
         | If it used to work without letting others pwn your account or
         | box, and now it fails to, it's broken.
        
         | 3pt14159 wrote:
         | The word "broken" came from before we had constant arms races
         | in technology. Obviously we don't call clubs broken because we
         | now have rapid artillery, but there was enough time between
         | clubs and swords, and swords and guns to allow transitions
         | away.
         | 
         | When I'm exposed to a core OS feature I expect by default that
         | it should not come with expected, critical security
         | vulnerabilities. It is rational to expect a user to use the
         | basic features of an OS and expect them not to cause severe
         | issues. You can say it's not broken, and that's true inasmuch
         | as they're still _functional_, but if by broken one means the
         | larger question of "is this reliable and safe?" then I think
         | the answer is pretty clear that symlinks are broken.
        
           | ploxiln wrote:
           | In the modern world, the demand seems to be that every tool
           | be perfectly safe in every situation no matter what you do
           | (and it seems practically nothing lives up to this demand,
           | given the ever increasing river of silly CVEs for almost
           | every component, like regex DoS on build tools).
           | 
           | It's important to understand the scope of the issue. If you
           | create and operate on your own symlinks in your own folders,
           | there is no problem. The problem is when a more privileged
           | user operates on folders that can be written to by less
           | privileged users, for example system daemons (like a /tmp
           | cleanup, or a web server serving /home/*/www), or suid
           | binaries. These things need to be written very carefully, it
           | is now clear.
           | 
           | But if I'm working with my own files, media, source code,
           | build tools, web pages, etc, in my own folders, then symlinks
           | are still fine.
           | 
           | And there is an existing setting to mitigate a couple common
           | forms of the issue that does exist when accessing folders
           | other users can write to:
           | https://www.kernel.org/doc/html/latest/admin-
           | guide/sysctl/fs...
        
             | 3pt14159 wrote:
             | I appreciate the reply, but after thinking about it I think
             | it's more akin to someone having been sold a house only to
             | be told eight years later that the seller of the house knew
             | that if someone tied a shoelace to the front door and
             | pulled on it, then the entire house would explode.
             | 
             | Could we consider it a broken doorknob Pierce?
        
             | seoaeu wrote:
             | > In the modern world, the demand seems to be that every
             | tool be perfectly safe in every situation no matter what
             | you do
             | 
             | The problem is that there is such a huge number of tools in
             | widespread use that each one causing even a few security
             | vulnerabilities means that the ecosystem overall is
             | constantly vulnerable
        
       | shadowgovt wrote:
       | Don't hard links suffer from the issue that, because they're
       | actually links to a specific file, not path pointers, you can
       | replace the target file thinking you have updated something in
       | the system and instead have stale hard links lying around,
       | referencing the older version when you intended to replace the
       | older version for all users?
       | 
       | I think that in a hypothetical world where symlinks worked like
       | hard links, we'd be swapping out the security complaints in this
       | post for articles about how hard it is to upgrade a POSIX system
       | properly, tools and tricks you can use to make sure you truly
       | replaced all instances of a binary with a known vulnerability,
       | and so on.
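
The stale-hard-link scenario can be reproduced in a few lines. This is only an illustrative sketch: the file names are made up, and it assumes the "update" is done the common way, by writing a new file and renaming it over the old name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "tool")
    hard = os.path.join(d, "tool-hardlink")
    soft = os.path.join(d, "tool-symlink")

    with open(target, "w") as f:
        f.write("v1")
    os.link(target, hard)      # second name for the same inode
    os.symlink(target, soft)   # pointer to the *path* "tool"

    # Upgrade: write the new version elsewhere, then rename it into place.
    new = os.path.join(d, "tool.new")
    with open(new, "w") as f:
        f.write("v2")
    os.replace(new, target)

    with open(hard) as f:
        via_hardlink = f.read()   # still the old inode
    with open(soft) as f:
        via_symlink = f.read()    # follows the path to the new inode
```

After the rename, the hard link still reads the old contents while the symlink reads the new ones, which is the upgrade hazard the comment describes.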
        
         | jhallenworld wrote:
         | Hardlinks exist, so you already have this problem. Tar has to
         | keep track of inode numbers to recreate hard links for example.
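
A rough sketch of that kind of bookkeeping: group regular files by `(st_dev, st_ino)` to discover which names are hard links to one another. The function name `find_hardlink_groups` is hypothetical, and this is not a claim about how any particular tar implementation does it.

```python
import os
import stat
import tempfile
from collections import defaultdict


def find_hardlink_groups(root: str) -> list:
    """Return lists of paths under `root` that share the same inode."""
    by_inode = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Only regular files with more than one link are candidates.
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                by_inode[(st.st_dev, st.st_ino)].append(path)
    return [sorted(paths) for paths in by_inode.values() if len(paths) > 1]


with tempfile.TemporaryDirectory() as root:
    a = os.path.join(root, "a")
    b = os.path.join(root, "b")
    c = os.path.join(root, "c")
    with open(a, "w") as f:
        f.write("shared")
    os.link(a, b)
    with open(c, "w") as f:
        f.write("alone")
    groups = find_hardlink_groups(root)
    names = [[os.path.basename(p) for p in g] for g in groups]
```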
        
           | shadowgovt wrote:
           | What I mean is that it's pretty SOP to have a package that
           | installs a new command to work by installing to /opt/my-
           | package and then symlinking /usr/bin/my-cmd to /opt/my-
           | package/my-cmd.
           | 
           | In the absence of symlinks, that link would be hard and dpkg
           | et al. would have to do package management by deleting the
           | /usr/bin/my-cmd link and re-creating it instead of letting it
           | ride, trusting that it will point to the correct thing when
           | the update completes because the target bin will have
           | changed.
        
       | partialzero wrote:
       | Maybe I'm being naive, but I don't get how "pathnames as a
       | concept are now utterly broken in POSIX". Isn't this "merely" a
       | problem that the resolution of the path name is dynamic and can
       | change between inspection and use? Wouldn't a practice of
       | resolving pathnames once (recursively, atomically, whatever) into
       | an immutable, opaque, direct handle, such as a file descriptor,
       | before use solve this issue? I realize what I just said may be
       | tantamount to "all file io ops taking path strings are broken" -
       | but that seems like a problem with the initial API design, not
       | with the concept of having a level of indirection in path name
       | resolution itself.
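
A small sketch of "resolve once, then check and use the handle", assuming Linux and Python's `os` module. The helper `read_regular_file` is hypothetical. Note that `O_NOFOLLOW` only guards the final path component, so this illustrates the idea rather than providing a complete defense.

```python
import os
import stat
import tempfile


def read_regular_file(path: str) -> bytes:
    """Open `path` once, then validate and read through the same descriptor."""
    # The name is resolved exactly once, here.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        # fstat() inspects the object we actually opened, so there is no
        # window between the check and the use for the path to change.
        st = os.fstat(fd)
        if not stat.S_ISREG(st.st_mode):
            raise ValueError("not a regular file")
        with os.fdopen(fd, "rb", closefd=False) as f:
            return f.read()
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as d:
    real = os.path.join(d, "real")
    link = os.path.join(d, "link")
    with open(real, "wb") as f:
        f.write(b"data")
    os.symlink(real, link)

    data = read_regular_file(real)
    try:
        read_regular_file(link)
        symlink_rejected = False
    except OSError:
        symlink_rejected = True
```

The contrast is with a `stat(path)` followed by `open(path)`, where the two calls each resolve the name and may see different objects.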
        
         | bityard wrote:
         | This is basically what I was going to say. The article spends a
         | lot of time arguing that TOCTOU patterns introduce security
         | vulnerabilities, which I think all programmers (should!)
         | already know, but then comes to the weird conclusion that we'd
         | just be better off without symlinks instead of designing an API
         | to work with them atomically.
         | 
         | Kinda reminds me of how a lot of UX changes happen: "This
         | really popular feature is a bit kludgy and hard to maintain,
         | let's just rewrite the whole app without it! (Instead of doing
         | the work required to make it not suck.)"
        
           | drdec wrote:
           | Almost all the TOCTOU examples given in the article could be
           | modified not to involve symlinks and still be valid.
        
       | amluto wrote:
       | I personally think that hard links should go away. I have trouble
       | thinking of any use for hard links that isn't better served by
       | CoW links. Hard links have quite surprising properties with
       | respect to chmod, they are awkward to handle in archival tools,
       | and even reliably identifying them is awkward to impossible in
       | general.
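
One chmod behavior that may be what is meant here (this is an assumption, the comment does not spell it out): permissions belong to the inode, so changing the mode through one name silently changes it for every other hard link. A short demonstration with made-up file names:

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a")
    b = os.path.join(d, "b")
    with open(a, "w") as f:
        f.write("x")
    os.chmod(a, 0o644)
    os.link(a, b)

    # Tighten permissions via the second name only.
    os.chmod(b, 0o600)

    # The first name sees the change too, because both names share one inode.
    mode_a = stat.S_IMODE(os.stat(a).st_mode)
    same_inode = os.stat(a).st_ino == os.stat(b).st_ino
```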
        
       | kazinator wrote:
       | > _An application running as root may try to check that
       | /data/mydir is a regular directory (not a symlink) before opening
       | the file /data/mydir/passwd. In between the time the program does
       | the directory check and the file open, an attacker could replace
       | the mydir directory with a symlink to /etc, and now the file
       | opened is, unexpectedly, /etc/passwd. This is a kind of race
       | condition known as a time-of-check-to-time-of-use (TOCTOU) race._
       | 
       | That application is doing the wrong check; it should be
       | validating that every component of the path is a directory which
       | is only writable to root.
       | 
       | First you stat("/"). OK, that is a directory and writable only to
       | root: so no non-root process can put a symlink there. Next we
       | check "/data". OK, that's a directory, and since we know / is
       | owned by root and not world-writable, /data cannot be replaced by
       | a symlink.
       | 
       | And so on ...
       | 
       | This can easily be made into a function like
       | safe_path("/data/dir/path/to/mypasswd") which returns true only
       | if no pathname component is something which a user other than the
       | caller, or root, could tamper with to point to a different file.
       | 
       | The open system call should have a flag for this, O_SAFE. That
       | would alter the behavior of the name resolution function
       | (traditionally, "namei") to do these checks along the path.
       | 
       | The path could have symlinks, if they are not tamperable from the
       | POV of the calling user.
       | 
       | Typically superuser applications in Unix rely on filesystem
       | structure. They set environment variables like PATH carefully,
       | and stick to accessing data in known directories that had better
       | be safe. If /data/mydir/passwd is something that is manipulated
       | by a root application, then the system is misconfigured if any of
       | these is writable to a non-root user: /, /data, /data/mydir or
       | /data/mydir/passwd.
       | 
       | If that is the case, you don't need symlinks to wreak havoc on
       | the application. You can, for instance, write your own password
       | into that password file and then falsely authenticate with that
       | app.
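
A user-space sketch of the `safe_path()` idea, under stated assumptions: Python on a POSIX system, only classic mode bits are considered (ACLs and group membership are ignored), and symlinks are rejected outright, which is stricter than the proposal above. The helper `component_is_safe` is hypothetical, and `O_SAFE` remains a proposed flag, not an existing one.

```python
import os
import stat


def component_is_safe(st: os.stat_result, trusted_uids: set) -> bool:
    """True if a trusted uid owns the entry and nobody else can write to it."""
    if stat.S_ISLNK(st.st_mode):
        return False  # simplification: refuse symlinks instead of vetting them
    if st.st_uid not in trusted_uids:
        return False
    return not (st.st_mode & (stat.S_IWGRP | stat.S_IWOTH))


def safe_path(path: str, trusted_uids=None) -> bool:
    """Check every component from "/" down to `path` for tamper resistance."""
    if trusted_uids is None:
        trusted_uids = {0, os.geteuid()}  # root and the caller
    parts = [p for p in os.path.abspath(path).split("/") if p]
    current = "/"
    prefixes = [current]
    for part in parts:
        current = os.path.join(current, part)
        prefixes.append(current)
    for i, prefix in enumerate(prefixes):
        try:
            st = os.lstat(prefix)
        except OSError:
            return False
        if not component_is_safe(st, trusted_uids):
            return False
        # Every component except the last must be a real directory.
        if i < len(prefixes) - 1 and not stat.S_ISDIR(st.st_mode):
            return False
    return True
```

The walk goes top-down on purpose: once a parent is known to be writable only by trusted users, nobody else can swap out the child being checked next, which is the argument the comment makes.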
        
       | GoblinSlayer wrote:
       | >Clients that have write access to the exported part of the file
       | system under a share via SMB1 unix extensions or NFS can create
       | symlinks that
       | 
       | So, it's not symlinks that are broken, but the SMB1 Unix
       | extensions, when exposed to the world for symlink creation. AIUI
       | this feature doesn't even serve Windows interoperability. And if
       | the author wanted to disable all symlinks altogether, what is the
       | purpose of these extensions?
        
       ___________________________________________________________________
       (page generated 2022-07-22 23:00 UTC)