[HN Gopher] POSIX v. reality: A position on O_PONIES (2009)
___________________________________________________________________
POSIX v. reality: A position on O_PONIES (2009)

Author : nolist_policy
Score  : 53 points
Date   : 2023-11-22 12:58 UTC (1 days ago)

(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)

| 082349872349872 wrote:
| Would that more PMs had had ponies as children: then they might
| understand that some shiny must-have features are the sort where
| (a) one occasionally has to call in expensive consultants, just
| to return them to the (b) steady state where they merely require
| constant feeding and cleaning-up-after.
|   h2odragon wrote:
|   There's a narrow window of "has ponies" and "has to care for
|   the ponies" that learns those lessons.
| nolist_policy wrote:
| By the way, Valerie has more great articles on lwn.net:
| https://lwn.net/Archives/GuestIndex/#Aurora_Henson_Valerie
| dale_glass wrote:
| I believe the solution is actually pretty simple, though maybe
| not easily implementable:
|
| Provide new API calls with precisely defined semantics.
|
| Rather than have this rigamarole with fsync and rename, provide an
| actual syscall with the actual effect the userspace developers
| are looking for. E.g., an atomic_replace() syscall that ensures
| that either a file is replaced with a new, fully written to disk
| version, or nothing happens.
|
| The main problem I see is that this of course would be Linux
| specific, so of course somebody would build a library to either
| invoke the syscall or do the fsync/rename mess underneath, and
| this would of course run into the same exact problem on those
| systems.
|   IshKebab wrote:
|   I don't think there's really any issue with it being Linux
|   specific. There are plenty of Linux specific APIs that people
|   happily use already.
|
|   Seems to me the bigger issue is the deification of POSIX and
|   UNIX. There's a stupidly large contingent of people that think
|   that they are flawless and must be followed unthinkingly.
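
For reference, the fsync/rename sequence that dale_glass would like to see replaced by a single call can be sketched roughly as below. This is a minimal sketch under stated assumptions, not a real syscall: atomic_replace here is a hypothetical userspace helper that only borrows its name from the comment above, and it assumes a POSIX system where a directory can be opened and fsynced. How durable the result really is still depends on the filesystem and its mount options, which is the point the thread is arguing about.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Hypothetical helper: replace `path` with `data` via temp file + rename."""
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temporary file in the same directory, so the final
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the kernel to push the new contents to stable storage
            # before the new name becomes visible.
            os.fsync(tmp.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # fsync the containing directory so the rename itself is persisted
    # (assumption: POSIX-style directory file descriptors).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The temporary file is created with mkstemp's default owner-only permissions, so the replaced file does not inherit the old file's mode or ownership; a fuller version would copy those over before the rename.
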
|     anonacct37 wrote:
|     I've noticed that this week. After spending a little time
|     just reading on how to correctly write posix compliant code
|     that was also guaranteed to do what I want, it's hard to come
|     to any other conclusion than "posix is a last ditch attempt
|     to slap a bandaid on Unix fragmentation by attempting to
|     retcon some lowest common denominator behavior from a couple
|     popular unixish operating systems and calling that a spec".
|
|     It's not what I would actually call designed. The closest
|     analogy I can think of for non-c programmers is that posix is
|     like if we decided that people should only make websites
|     using javascript that was mutually interpretable by IE6 and
|     Netscape navigator and we occasionally made updates every
|     decade or two.
|
|     There's really nothing particularly noble or good or correct
|     or elegant about all the API calls with implementation
|     defined semantics. At best it's a necessary evil for
|     compatibility. You can admire the cleverness required to get
|     a single simple c codebase that works correctly on multiple
|     operating systems and future operating systems that conform
|     to posix in creative ways, but only in the way I admire those
|     old school zines that are simultaneously a PDF and a jpeg and
|     a shell script. Clever and ingenious but not actually good
|     engineering design.
|   Gibbon1 wrote:
|   I think that was the problem with how Linux implemented
|   pselect() circa 10 years ago. I think pselect fixes the race
|   condition you get when you set a timer then call select().
|   Linux's implementation was a function that set a timer and then
|   called select(). ... slam head on keyboard.
|
|   So I think that fear is valid that they'll implement the API
|   without the guarantees.
|
|   Myself I'm annoyed with the reckless push always to remove
|   guarantees in return for 'performance'.
|     leoc wrote:
|     I'm starting to come round to the belief that the only robust
|     remedy is to release the chaos monkey: have the kernel delete
|     all the state of the filesystem driver at randomly-chosen
|     intervals, a few hours apart on average, and force the driver
|     to recover itself.
|       bcrl wrote:
|       That is pretty close to what those of us building robust
|       storage systems in the real world have to do. At a previous
|       employer, we had test suites that exercised all kinds of
|       corner cases by triggering failover and system reboots in
|       the middle of heavy persistent messaging workloads.
|
|       Filesystems have other horrors that you learn about during
|       testing. I had one test case where it would take ext4 80
|       seconds to write out an 8MB file after a fresh mount of an
|       8TB filesystem. Free space was fragmented in just the right
|       way that we hit the single threaded reading of block groups
|       and bitmaps in the kernel and it took _forever_ to get that
|       data off the disk array.
|         Gibbon1 wrote:
|         I've done the same with embedded drivers, randomly flip
|         bits in the driver's state and see what happens. Does it
|         recover, throw fault, hang, or explode like a bomb. I got
|         that from a hardware designer talking about robust state
|         machines. Any random illegal state should sequence back
|         to a known good state.
|   tedunangst wrote:
|   The problem is every filesystem would implement a different
|   version of fsync_that_works or rename_for_safety and then
|   applications will have to call all of them, and then people
|   will start building a compat framework that tries to assemble
|   the optimal sequence of calls for each filesystem except it
|   won't work in every case and so a new fsync_but_no_cheating
|   function will be proposed.
___________________________________________________________________
(page generated 2023-11-23 23:00 UTC)