[HN Gopher] POSIX v. reality: A position on O_PONIES (2009)
       ___________________________________________________________________
        
       POSIX v. reality: A position on O_PONIES (2009)
        
       Author : nolist_policy
       Score  : 53 points
       Date   : 2023-11-22 12:58 UTC (1 days ago)
        
 (HTM) web link (lwn.net)
 (TXT) w3m dump (lwn.net)
        
       | 082349872349872 wrote:
       | Would that more PMs had had ponies as children: then they might
       | understand that some shiny must-have features are the sort where
       | (a) one occasionally has to call in expensive consultants, just
       | to return them to the (b) steady state where they merely require
       | constant feeding and cleaning-up-after.
        
         | h2odragon wrote:
         | There's a narrow window of "has ponies" and "has to care for
         | the ponies" that learns those lessons.
        
       | nolist_policy wrote:
       | By the way, Valerie has more great articles on lwn.net:
       | https://lwn.net/Archives/GuestIndex/#Aurora_Henson_Valerie
        
       | dale_glass wrote:
       | I believe the solution is actually pretty simple, though maybe
       | not easily implementable:
       | 
       | Provide new API calls with precisely defined semantics.
       | 
       | Rather than have this rigmarole with fsync and rename, provide an
       | actual syscall with the actual effect the userspace developers
       | are looking for. Eg, an atomic_replace() syscall that ensures
       | that either a file is replaced with a new, fully written to disk
       | version, or nothing happens.
       | 
       | The main problem I see is that this of course would be Linux
       | specific, so of course somebody would build a library to either
       | invoke the syscall or do the fsync/rename mess underneath, and
       | this would of course run into the same exact problem on those
       | systems.
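
For reference, the userspace "fsync/rename" sequence dale_glass mentions usually looks something like the sketch below. This is not the proposed syscall, only the conventional workaround; the function name `atomic_replace` is borrowed from the comment and is hypothetical, and the directory fsync assumes a POSIX-like platform where a directory can be opened read-only.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace `path` with `data` (bytes) via write + fsync + rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the contents to disk
        os.replace(tmp_path, path)  # rename over the old file
    except BaseException:
        # Remove the temporary file if we failed before the rename happened.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the containing directory so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that `tempfile.mkstemp` creates the file with owner-only permissions, so this sketch does not preserve the mode or ownership of the file it replaces. Whether each step gives the durability an application expects still depends on the filesystem, which is the point of the thread.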
        
         | IshKebab wrote:
         | I don't think there's really any issue with it being Linux
         | specific. There are plenty of Linux specific APIs that people
         | happily use already.
         | 
         | Seems to me the bigger issue is the deification of POSIX and
         | UNIX. There's a stupidly large contingent of people that think
         | that they are flawless and must be followed unthinkingly.
        
           | anonacct37 wrote:
           | I've noticed that this week. After spending a little time
           | just reading on how to correctly write posix compliant code
           | that was also guaranteed to do what I want, it's hard to come
           | to any other conclusion than "posix is a last ditch attempt
           | to slap a bandaid on Unix fragmentation by attempting to
           | retcon some lowest common denominator behavior from a couple
           | popular unixish operating systems and calling that a spec".
           | 
           | It's not what I would actually call designed. The closest
           | analogy I can think of for non-C programmers is that posix is
           | like if we decided that people should only make websites
           | using javascript that was mutually interpretable by IE6 and
           | Netscape Navigator and we occasionally made updates every
           | decade or two.
           | 
           | There's really nothing particularly noble or good or correct
           | or elegant about all the API calls with implementation
           | defined semantics. At best it's a necessary evil for
           | compatibility. You can admire the cleverness required to get
           | a single simple C codebase that works correctly on multiple
           | operating systems and future operating systems that conform
           | to posix in creative ways, but only in the way I admire those
           | old school zines that are simultaneously a PDF and a jpeg and
           | a shell script. Clever and ingenious but not actually good
           | engineering design.
        
         | Gibbon1 wrote:
         | I think that was the problem with how Linux implemented
         | pselect() circa 10 years ago. I think pselect fixes the race
         | condition you get when you set a timer then call select().
         | Linux's implementation was a function that set a timer and then
         | called select(). ... slam head on keyboard.
         | 
         | So I think that fear is valid that they'll implement the API
         | without the guarantees.
         | 
         | Myself, I'm annoyed with the constant, reckless push to remove
         | guarantees in return for 'performance'.
        
           | leoc wrote:
           | I'm starting to come round to the belief that the only robust
           | remedy is to release the chaos monkey: have the kernel delete
           | all the state of the filesystem driver at randomly-chosen
           | intervals, a few hours apart on average, and force the driver
           | to recover itself.
        
             | bcrl wrote:
             | That is pretty close to what those of us building robust
             | storage systems in the real world have to do. At a previous
             | employer, we had test suites that exercised all kinds of
             | corner cases by triggering failover and system reboots in
             | the middle of heavy persistent messaging workloads.
             | 
             | Filesystems have other horrors that you learn about during
             | testing. I had one test case where it would take ext4 80
             | seconds to write out an 8MB file after a fresh mount of an
             | 8TB filesystem. Free space was fragmented in just the right
             | way that we hit the single threaded reading of block groups
             | and bitmaps in the kernel and it took _forever_ to get that
             | data off the disk array.
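
The crash-in-the-middle testing bcrl describes can be illustrated with a toy model. The sketch below is not any real filesystem's behaviour: `ToyDisk` and `crash_outcomes` are hypothetical names, and the model assumes that file contents become durable only on fsync while a rename becomes durable immediately. It crashes after every prefix of a replace sequence and collects what a rebooted system would find in the target file.

```python
class ToyDisk:
    """Toy model: contents are durable only after fsync; renames are
    assumed durable at once (a deliberately pessimistic ordering)."""

    def __init__(self):
        self.cache = {}   # name -> bytes as seen by running programs
        self.stable = {}  # name -> bytes that would survive a crash

    def write(self, name, data):
        self.cache[name] = data
        # The file now exists on disk, but its contents are not durable yet.
        self.stable.setdefault(name, b"")

    def fsync(self, name):
        self.stable[name] = self.cache[name]

    def rename(self, src, dst):
        self.cache[dst] = self.cache.pop(src)
        self.stable[dst] = self.stable.pop(src)

    def crash(self):
        """Return what a freshly rebooted system would see."""
        return dict(self.stable)


def crash_outcomes(use_fsync, old=b"old", new=b"new"):
    """Crash after each prefix of the steps; collect recovered contents of 'f'."""
    outcomes = set()
    n_steps = 3 if use_fsync else 2
    for crash_after in range(n_steps + 1):
        disk = ToyDisk()
        disk.write("f", old)
        disk.fsync("f")
        steps = [lambda: disk.write("f.tmp", new)]
        if use_fsync:
            steps.append(lambda: disk.fsync("f.tmp"))
        steps.append(lambda: disk.rename("f.tmp", "f"))
        for run_step in steps[:crash_after]:
            run_step()
        outcomes.add(disk.crash()["f"])
    return outcomes
```

Under these assumptions, `crash_outcomes(True)` yields only the old or the new contents, while `crash_outcomes(False)` can yield an empty file, which is the zero-length-file failure that prompted the original article.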
        
               | Gibbon1 wrote:
               | I've done the same with embedded drivers: randomly flip
               | bits in the driver's state and see what happens. Does it
               | recover, throw a fault, hang, or explode like a bomb? I got
               | that from a hardware designer talking about robust state
               | machines. Any random illegal state should sequence back
               | to a known good state.
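
A minimal sketch of the idea Gibbon1 describes, assuming a three-state machine held in an 8-bit register. The states, the `step` and `fuzz` helpers, and the 10% corruption rate are all made up for illustration. Any raw value that is not a legal state sequences back to `IDLE`, and the fuzz loop flips random bits to check that the machine never gets stuck outside the legal set.

```python
import random
from enum import IntEnum


class State(IntEnum):
    IDLE = 0
    BUSY = 1
    DONE = 2


# Legal transitions; any raw value outside State is treated as corruption.
NEXT = {State.IDLE: State.BUSY, State.BUSY: State.DONE, State.DONE: State.IDLE}


def step(raw_state):
    """Advance one tick; an unrecognised raw value falls back to IDLE."""
    try:
        state = State(raw_state)
    except ValueError:
        return State.IDLE  # known good state
    return NEXT[state]


def flip_random_bit(value, rng, width=8):
    """Simulate corruption by flipping one bit of the state register."""
    return value ^ (1 << rng.randrange(width))


def fuzz(iterations, seed=0):
    """Run the machine, corrupting the state on roughly 10% of ticks."""
    rng = random.Random(seed)
    raw = int(State.IDLE)
    for _ in range(iterations):
        if rng.random() < 0.1:
            raw = flip_random_bit(raw, rng)
        raw = int(step(raw))
        # After every step the machine must be back in a legal state.
        assert raw in (State.IDLE, State.BUSY, State.DONE)
    return raw
```

A bit flip can also land on another legal state, which this sketch cannot detect. Catching that would need redundancy such as a checksum or widely spaced state encodings, which is outside what the comment describes.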
        
         | tedunangst wrote:
         | The problem is every filesystem would implement a different
         | version of fsync_that_works or rename_for_safety and then
         | applications will have to call all of them, and then people
         | will start building a compat framework that tries to assemble
         | the optimal sequence of calls for each filesystem except it
         | won't work in every case and so a new fsync_but_no_cheating
         | function will be proposed.
        
       ___________________________________________________________________
       (page generated 2023-11-23 23:00 UTC)