Zaibatsu upgrade, setuid woes
-----------------------------

Last week I upgraded the OS at the Zaibatsu (if any sundogs have seen scary warnings from ssh/scp about changed keys, this is the reason why! It's safe to connect - email me if you are extra paranoid and want to confirm the new fingerprint).

We had been running the stable release of Debian 9 (Stretch) since the server was set up over two years ago. Stretch is pretty darn old now, and with Debian stable typically shipping slightly old software from day one, we were well behind the times. Which, for the most part, I imagine was absolutely fine by most people involved. We're not exactly all about being at the cutting edge here! But with very old versions of programming languages installed it was becoming increasingly common for new projects to not run smoothly, and upgrades to security libraries like OpenSSL are always a good thing.

So I've been ready to upgrade for a while, but have had to wait for the VPS provider RamNode to be ready as well. The Zaibatsu is an OpenVZ VPS, meaning it shares a running Linux kernel with several other VPSes (virtualisation magic keeps the processes, filesystems etc. of one VPS invisible from the others). Because we don't have our own independent kernel, we can't upgrade whenever we like (or fiddle with most sysctl settings, either). We can only run the kernels that RamNode provide. However, RamNode semi-recently updated their OpenVZ infrastructure to a non-legacy version, meaning they were at last able to offer some more modern OS options, including Debian 10 (Buster).

Even so, there's no possibility of a typical "in place" upgrade via the standard `apt` tooling. I basically had to back the whole thing up, re-image the Zaibatsu with Buster, reinstall all the software and restore our files. The recent May Day long weekend gave me a good opportunity to do this with enough time to hopefully fix any issues if things really went south.

For the most part, things went pretty smoothly.
No data was lost, and I prioritised getting the gopher and mail servers running again ASAP so that the visible downtime to the rest of the world with regard to our basic services was very short. I'm glad I did this with a lot of spare time available, though, because it took me what seemed like an eternity to get our BBS working again.

The Circumlunar BBS works as follows (it's not proprietary software with unknown inner workings like SDF's BBOARD is!): all posts are stored in a board/thread/post directory structure in /var/bbs, which is owned by a "bbs" user. The content is world readable, so if you really want you can poke around and read everything with cd, ls and cat/less/more. Or write your own read-only client, or search engine, or whatever. To post, you have to use one of the approved clients, which are setuid and owned by the bbs user, enabling them to create new files in /var/bbs. These clients ensure that people's correct usernames are attached to posts - if we just made /var/bbs world-writable, people could impersonate other users, or vandalise other users' posts. Not that I think any sundogs would do that, but at some point we hoped other pubnixes would pick this software up, so we didn't want to necessarily assume a small, high-trust society.

There's nothing super innovative about this - take a look in /usr/games and you'll find many of those binaries are setgid and have their group set to "games". This is precisely to allow any user on a multiuser unix system to update a shared high score file in a controlled way. Even though the games (and our BBS clients) are free software, if a user cloned the repo and modified the game to give themselves free points, without root access they can't make their new hax0red binary owned by the "games" group, so they can't cheat their way to the top of the scoreboard.

The main BBS client we use is written in Lua, because of its low memory footprint (we have 128MB of RAM to share amongst all logged in users!).
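
To make the username guarantee a bit more concrete: inside a setuid program, the real uid is still that of whoever ran it, while the effective uid is that of the file's owner ("bbs" in our case). What follows is not the actual BBS client code (that is Lua), just a minimal C sketch of the idea; the function name `invoking_username` is made up for the example.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy the login name of the user who *invoked* the program into buf.
 *
 * In a setuid binary, getuid() still returns the caller's real uid,
 * while geteuid() returns the uid of the file's owner. Looking up the
 * real uid therefore yields a name the caller cannot simply pass in
 * as an argument and lie about.
 *
 * Returns 0 on success, or -1 if the real uid has no passwd entry or
 * the name (plus terminating NUL) does not fit in buflen bytes.
 * (invoking_username is a hypothetical name used for illustration.)
 */
int invoking_username(char *buf, size_t buflen)
{
    struct passwd *pw = getpwuid(getuid());
    if (pw == NULL || pw->pw_name == NULL)
        return -1;

    int n = snprintf(buf, buflen, "%s", pw->pw_name);
    if (n < 0 || (size_t)n >= buflen)
        return -1;

    return 0;
}
```

A client built along these lines would call `invoking_username(name, sizeof name)` once at startup and stamp every post with the result, rather than trusting anything the user types.
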

Now, running interpreted programs (or shell scripts) setuid or setgid is not straightforward. Because setuid programs which are owned by the root user are potentially big security holes, and because the security of shell scripts in particular is hard to guarantee (users can easily influence their behaviour via environment variables, aliases, etc.), most modern unixes ignore the setuid bit on interpreted scripts, and this extends to scripts run by the Lua interpreter.

The way around this is to write a small "wrapper" in a compiled language like C, which does nothing but call the interpreter you want with the required arguments. The resulting binary wrapper can be made setuid/setgid, no problem. People online will scream at you that you are going directly to sysadmin hell if you attempt this, but they are being lazy: many, many people seem to have forgotten that setuid binaries are not necessarily owned by root. If they are owned by a low-privilege user/group like "games" or "bbs" then the risk is comparatively minimal and there's no reason to freak out. Most likely, in this day and age where using unix as an actual, genuine multi-user system is seen more as quaint historical re-enactment than serious computing, they're unable to conceive of what the point would be in a non-root setuid program. Heck, maybe they don't even have a /usr/games/ directory!!!

Anyway, we have just such a setuid binary wrapper for our BBS client, and for whatever reason it was not working correctly after the upgrade, leaving the BBS in a read-only state. I went nuts trying to figure out why it had stopped working, verifying that it was owned by the correct user, that the setuid bit was set, etc. Eventually I discovered that between the versions of dash (Debian's default /bin/sh) shipped with Stretch and with Buster, a change was made where the first thing the shell does is check whether its real and effective uid are equal and, if they aren't, drop privileges so that they are.
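
The kind of check described here can be sketched in a few lines of C. This is not dash's actual source, only an illustration of the behaviour; all function names are invented for the example. The second policy function shows the gentler alternative of only dropping privileges when the effective uid is root.

```c
#include <sys/types.h>
#include <unistd.h>

/* Strict policy, as described in the text: drop privileges whenever
 * the real and effective uid differ, whoever the effective user is. */
int should_drop_strict(uid_t real, uid_t effective)
{
    return real != effective;
}

/* Hypothetical gentler policy: only drop when the borrowed identity
 * is root (uid 0), leaving non-root setuid programs alone. */
int should_drop_root_only(uid_t real, uid_t effective)
{
    return real != effective && effective == 0;
}

/* Reset the effective gid and uid to the real ones. The group is
 * handled first, because changing it may no longer be permitted once
 * the uid has been given up. Returns 0 on success, -1 on failure. */
int drop_to_real_ids(void)
{
    if (setgid(getgid()) != 0)
        return -1;
    if (setuid(getuid()) != 0)
        return -1;
    return 0;
}

/* What a shell following the strict policy might do at startup. */
int drop_privileges_if_needed(void)
{
    if (should_drop_strict(getuid(), geteuid()))
        return drop_to_real_ids();
    return 0;
}
```

With the strict policy, a process running as real uid 1000 and effective uid "bbs" loses its bbs identity; with the root-only policy it would keep it.
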
Because the binary wrapper used the C standard library's `system()` function to launch the Lua interpreter, and because `system()` uses /bin/sh to launch things, we were ending up with a non-privileged Lua. This change to dash was made "for security reasons". It really pisses me off that nobody thought to implement this feature in such a way that the shell checks whether its effective uid is 0 and only drops privileges in such a case, realising that the threat from non-root setuids is minor.

At first I thought this would be an easy fix - I changed the wrapper to use one of the `exec()` functions instead of `system()` so that /bin/sh was not invoked. This resulted in a privileged Lua interpreter! However, because Lua has very little built-in filesystem support, the client relies crucially on launching standard unix tools like cp, mv, rm, chown and chmod to maintain all the correct permissions in /var/bbs in a secure way. And Lua launches those tools...using `system()`, and hence /bin/sh. So the problem was only partially solved.

In the end, I hunted down and installed an older version of dash which didn't drop privileges and everything immediately just worked. It's not an ideal solution, but the BBS is a central part of our community and I wanted it up and running ASAP.

However, this situation does pretty well rule out the notion of this system becoming widely deployed at other pubnixes. I will have to see if I can either get sudo to function as an alternative binary wrapper (this is widely advised online, but I have tried previously to apply it to our situation and although I've forgotten the details I recall that there was a very real and principled reason why it didn't work for us), or if we can reimplement the client in a language which compiles to a binary. I'm not optimistic on that last part, though. Nobody wants to write something like this in C in this day and age.
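
For the curious, an `exec()`-style wrapper of the sort mentioned above might look roughly like the sketch below. It is not the real wrapper: both paths are hypothetical placeholders, and the function names are invented. The point is only that `execv()` replaces the current process directly, so no /bin/sh sits in between to drop privileges.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Both paths are hypothetical placeholders, not the real locations. */
#define LUA_PATH    "/usr/bin/lua"
#define CLIENT_PATH "/usr/local/lib/bbs/client.lua"

/* Build the argument vector for the interpreter:
 *   { LUA_PATH, CLIENT_PATH, argv[1], ..., argv[argc-1], NULL }
 * The caller owns the returned array (but not the strings in it).
 * Returns NULL if memory cannot be allocated. */
char **build_client_argv(int argc, char **argv)
{
    int extra = argc > 1 ? argc - 1 : 0;
    char **out = malloc((size_t)(extra + 3) * sizeof *out);
    if (out == NULL)
        return NULL;

    out[0] = LUA_PATH;
    out[1] = CLIENT_PATH;
    for (int i = 0; i < extra; i++)
        out[i + 2] = argv[i + 1];
    out[extra + 2] = NULL;
    return out;
}

/* Replace the current process with the Lua interpreter running the
 * client script. Unlike system(), execv() does not go through
 * /bin/sh. This function only returns if something went wrong. */
int exec_client(int argc, char **argv)
{
    char **child_argv = build_client_argv(argc, argv);
    if (child_argv == NULL) {
        perror("malloc");
        return 1;
    }

    execv(LUA_PATH, child_argv);

    /* Only reached if execv() failed. */
    perror("execv");
    free(child_argv);
    return 1;
}
```

The wrapper's `main(int argc, char **argv)` would then consist of nothing but `return exec_client(argc, argv);`. A wrapper meant for wider use would probably also want to tidy up the environment before handing over to the interpreter.
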
However, modern compiled languages tend to have extremely smart and high-powered toolchains (in order to manage the dependency hell that modern software development fashion seems to eagerly embrace) which don't play well in a low-resource pubnix environment. Rust wants every single user to have a multi-hundred megabyte copy of the toolchain in their home directory, and "Hello, world" has about 250 dependencies - one of which was always released only yesterday. Go can be installed system wide in the traditional sane manner, but 128MB of RAM is not enough to compile anything I've tried. Sigh...

Aside from this saga, all other problems have been pretty minor. Systemd adoption is much more widespread in Buster than it was in Stretch, but I'm trying to see this as a learning opportunity rather than a hassle. We are no longer using xinetd to run some things that we previously were; they are now implemented as systemd socket-activated services. I still think the death of /etc/rc.local as a dirt-simple place-of-last-resort to stick system initialisation code is a tragedy, but I'll spare any further complaining until a later post at least.

Basically I think we are now right back to where we were, in terms of everything working, but with a much more modern environment which isn't frighteningly close to EOL. Hopefully I don't have to do this again for another few years!