[HN Gopher] Leaky Abstractions
       ___________________________________________________________________
        
       Leaky Abstractions
        
       Author : jakub_g
       Score  : 227 points
       Date   : 2021-06-02 18:35 UTC (4 hours ago)
        
 (HTM) web link (textslashplain.com)
 (TXT) w3m dump (textslashplain.com)
        
       | vidarh wrote:
        | This kind of thing doesn't surprise me at all - a surprising
        | number of developers miss this sort of thing, and a common
        | blind spot seems to be checking for situations where you end
        | up with an excessive number of calls each doing very little
        | work.
       | 
       | E.g. at one point (long time ago), MySQL's C client library would
       | call read() for 4 bytes to read a length indicator, and then read
       | exactly the number of bytes indicated by the protocol, which led
       | to a _lot_ of time spent context-switching vs. reading into a
       | larger user-space buffer and copying out of that.
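The buffering fix described above can be sketched in Python (an illustration of the pattern, not MySQL's actual C client). A counting wrapper stands in for the kernel, and the stream is made of 4-byte-length-prefixed frames: reading straight from the "file descriptor" costs two reads per frame, while a user-space buffer amortizes them down to a handful of large reads.

```python
import io
import struct

class CountingRaw(io.RawIOBase):
    """Stands in for the kernel: counts how many read() calls hit it."""
    def __init__(self, data):
        self._buf = io.BytesIO(data)
        self.calls = 0
    def readable(self):
        return True
    def readinto(self, b):
        self.calls += 1
        chunk = self._buf.read(len(b))
        b[:len(chunk)] = chunk
        return len(chunk)

def read_exact(f, n):
    out = b""
    while len(out) < n:
        chunk = f.read(n - len(out))
        if not chunk:
            raise EOFError("short read")
        out += chunk
    return out

def read_frames(f):
    """Read 4-byte-length-prefixed frames until EOF."""
    frames = []
    while True:
        header = f.read(4)
        if not header:
            break
        (length,) = struct.unpack("<I", header)
        frames.append(read_exact(f, length))
    return frames

# A stream of 1000 tiny frames, like many small protocol packets.
data = b"".join(struct.pack("<I", 5) + b"hello" for _ in range(1000))

raw = CountingRaw(data)
read_frames(raw)              # one "syscall" per header + one per payload
unbuffered_calls = raw.calls

raw = CountingRaw(data)
read_frames(io.BufferedReader(raw, buffer_size=64 * 1024))
buffered_calls = raw.calls    # just a couple of large reads

print(unbuffered_calls, buffered_calls)
```

With real sockets each of those small reads is a context switch, which is where the time went in the MySQL case.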
       | 
        | On Linux, plain "strace" remains one of my favourite first
        | steps for finding performance problems, in part because these
        | things are almost immediately apparent if you strace a
        | process: the really pathological cases often end up dominating
        | the output so much that you'll spot them right away.
       | 
       | Another "favourite" issue that shows up often when you use strace
       | like this is e.g. excessive include paths - running strace on MRI
       | Ruby with rubygems enables and lots of gems pulled in is a good
       | way of seeing that in action - like this zip problem it's an
       | example that seems totally reasonable when the number of gems is
       | small, and that first becomes apparent when you test with lots of
       | gems, and look at what's actually happening under the hood.
       | 
        | A similar example of testing with too-small datasets and/or on
        | a machine large enough to hide what becomes immediately
        | obvious once you trace execution was a Type 1 font reference
        | library made available by Adobe (no idea if they wrote it or
        | if it originally came from another source), which exhibited
        | another typical pathological behaviour: on loading a font it
        | would call malloc() hundreds of times, allocating 4 bytes at a
        | time instead of allocating larger buffers. (strace won't catch
        | malloc(), but ltrace does.) To avoid messing too much with the
        | code, we replaced the calls to malloc() with calls to a simple
        | arena allocator; load speed shot through the roof and memory
        | usage dropped massively.
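The arena fix can be sketched like this (a Python illustration of the idea; the original fix was in C, and this Arena class is hypothetical): instead of one malloc() per 4-byte request, a bump allocator hands out slices of one large chunk, and the whole font's memory is released in a single operation.

```python
class Arena:
    """Bump allocator: one large allocation serves many small requests."""
    def __init__(self, chunk_size=64 * 1024):
        self.chunk_size = chunk_size
        self.chunks = []          # each chunk is one "real" allocation
        self.offset = 0

    def alloc(self, n):
        # Start a new chunk when the current one can't fit the request.
        if not self.chunks or self.offset + n > len(self.chunks[-1]):
            self.chunks.append(bytearray(max(n, self.chunk_size)))
            self.offset = 0
        view = memoryview(self.chunks[-1])[self.offset:self.offset + n]
        self.offset += n
        return view

    def release_all(self):
        # Freeing everything the font needed is one operation, not hundreds.
        self.chunks.clear()
        self.offset = 0

arena = Arena()
# Hundreds of tiny 4-byte requests -> one underlying 64 KiB allocation.
views = [arena.alloc(4) for _ in range(500)]
print(len(arena.chunks))  # 1
```

The trade-off is that individual allocations can't be freed early, which is fine for load-then-discard workloads like font parsing.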
       | 
       | Too few people trace execution of their code. I know that because
       | if most developers did, issues like the above would get caught
       | much sooner, and would be rarer in published code.
        
         | formerly_proven wrote:
         | > Another "favourite" issue that shows up often when you use
         | strace like this is e.g. excessive include paths - running
          | strace on MRI Ruby with rubygems enabled and lots of gems
         | pulled in is a good way of seeing that in action - like this
         | zip problem it's an example that seems totally reasonable when
         | the number of gems is small, and that first becomes apparent
         | when you test with lots of gems, and look at what's actually
         | happening under the hood.
         | 
         | When you add e.g. Python's standard library as a ZIP to the
         | Python search path, the thing will open/read/read/read/close
         | that file approximately a gazillion times on startup. That's OK
         | on Linux/Unix, where that is fairly cheap. Guess which OS
         | doesn't like that pattern at all?
        
       | mrkramer wrote:
       | "Unfortunately, the code hasn't really been updated in a while. A
       | long while. The timestamp in the module claims it was last
       | updated on Valentine's Day 1998"
       | 
        | When I saw this timestamp some time ago on my PC I thought it
        | was a joke?! C'mon, 1998, like WTF!
        
       | osipov wrote:
       | The programmer responsible for this implementation successfully
       | inverted a red-black tree during the job interview and explained
       | a Big-Oh optimal solution for doing so.
        
       | [deleted]
        
       | Sol- wrote:
        | Interesting, I didn't even know you could cut files from a zip
        | in that Windows zip file viewer. I would have thought it was
        | some read-only filesystem, like viewing a mounted CD or so.
       | 
       | In that context I wonder if you could cut from rewritable CD-RWs
       | as well back in the day (can't remember) - that seems like
       | another abstraction that's similarly slow in reality.
        
         | jakub_g wrote:
          | I never tried cutting from a CD-RW, but AFAIR each burn
          | would append a non-trivial header (like 20 MB or so), so
          | that would be a pretty expensive thing to do :)
          | 
          | AFAIR when you "copied" onto a CD-RW the files would show up
          | semi-transparent (pending) and you'd have to click a button
          | to proceed with burning. Probably the same for cutting, I
          | guess.
        
           | TeMPOraL wrote:
            | Speaking of old Windows and CDs, Windows had some crazy
            | trick to turn CD-R (_not_ CD-RW) into a rewriteable
            | medium. I'm guessing they simulated a regular file system
            | on top of an append-only representation. I never dug into
            | the details back in the day, because I never used this
            | feature. Unfortunately, IIRC, this trickery was enabled by
            | default when copying files to CDs - which was a problem,
            | because nothing else could read it. This caused me an
            | unending stream of calls from friends and relatives, who
            | all tried to burn some family/vacation photos onto a CD-R
            | in order to view them on their TVs, but the CD readers for
            | TVs couldn't parse the format.
        
             | marcosdumay wrote:
              | Oh, the trick of placing the table of contents at the
              | outermost edge of the written area, instead of at the
              | beginning of the CD.
              | 
              | I think you were supposed to write a pointer at the
              | original place, so the drive would know to look again,
              | but from the outside in. Video CD players often didn't
              | support that pointer.
              | 
              | But actually, Windows was quite a latecomer to that
              | feature. It's just that its UX was so horrible that
              | people would never know which option they chose; on
              | other OSes (and in other Windows software) people had to
              | actually decide to use it.
        
           | bluGill wrote:
            | It's been years, but the ISO 9660 file system says the
            | file system index starts 16 sectors past the start of the
            | last track (the first 16 sectors are reserved for
            | booting). Then come however many sectors you need for the
            | index - which depends on how many files you have and how
            | long the filenames are. Note that neither CD-R nor CD-RW
            | allows you to write sectors in the middle of a track, so
            | you need to know where every file will be on disc before
            | you can write anything - you can't build the index on the
            | fly. (I remember this part because I was writing backups,
            | so I didn't know how big the file would be until I was
            | done and thus couldn't write the index first - I
            | eventually got around the rule by creating a new track
            | with just the index.)
        
       | zbendefy wrote:
       | Another similar and good article:
       | https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-a...
        
         | an1sotropy wrote:
         | Another common leaky abstraction: floating-point numbers as an
         | abstraction of the real numbers. It works nearly all the time,
         | until you have to really know about numerical precision, or
         | where a NaN came from.
         | 
         | https://www.johndcook.com/blog/2009/04/06/numbers-are-a-leak...
        
       | yurishimo wrote:
       | These are the kinds of articles I love to see on HN!
       | 
        | I think it's so interesting to see how different OSes have
        | approached these long-lived operations over the years.
        | Arguably, *nix OSes have been able to move past these problems
        | since package management there is much more of a first-party
        | concern. "Oh, you want to upgrade this utility? Well, I rely
        | on these other things, so you should update them too."
        | 
        | I'm a webdev, so a lot of this is foreign to me in terms of
        | how it all actually functions, but it's cool to read about
        | these problems and speculate how/why they came to be and why
        | they may only exist on certain platforms.
        
       | bentcorner wrote:
       | I wonder why it's implemented as a per-file copy+delete instead
       | of a "copy all files" then "delete all files".
       | 
       | I also have a gut feeling that doing similar operations to a
       | connected android phone (e.g., moving photos from your phone to
       | your PC over USB) is also slow, probably for similar reasons.
        
         | genpfault wrote:
         | > doing similar operations to a connected android phone
         | 
         | MTP is terrible[1].
         | 
         | [1]:
         | https://en.wikipedia.org/wiki/Media_Transfer_Protocol#Perfor...
        
           | jakub_g wrote:
            | Oh boy. I was thinking USB 2.0 was the main reason
            | Android<>Windows copying of photos sucks so much, but the
            | rabbit hole is much deeper.
            | 
            | It's sad that with the cloud being the solution for
            | everything these days, this will probably never be
            | improved within the next decade.
        
         | vidarh wrote:
          | Probably because someone worked at a high enough level of
          | abstraction not to spot it, then tested it on files small
          | enough not to see enough of a performance issue to dig into
          | what actually happened.
        
           | TeMPOraL wrote:
           | Perhaps the common approach to performance testing is wrong
           | (forget for a moment that most shops don't think about it at
           | all; let's just consider quality developers/companies).
           | Instead of just monitoring whether the product is fast enough
           | for minimal/typical/peak expected usage, maybe it would be
           | good to focus on determining a boundary. E.g. how much data
           | does it take for the program to run 60 seconds? Or, in
           | general, 10x longer than the maximum you'd consider
           | acceptable? How much data does it take for it to run out of
           | memory?
           | 
            | These determinations can be made with a few tests each.
            | Start with some reasonable amount of data, and keep
            | doubling it until the test fails. Then binary-search
            | between the last success and the first failure.
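The doubling-then-bisecting procedure described above can be sketched as follows (runs_ok is a hypothetical predicate standing in for "the run stayed within budget"):

```python
def find_breaking_size(runs_ok, start=1, max_size=1 << 40):
    """Find the smallest data size at which `runs_ok(size)` starts
    failing (too slow / out of memory).

    Phase 1: keep doubling until a failure is seen.
    Phase 2: binary-search between last success and first failure.
    """
    size = start
    last_good = 0
    # Phase 1: exponential probing.
    while size <= max_size and runs_ok(size):
        last_good = size
        size *= 2
    if size > max_size:
        return None  # never failed within the tested range
    lo, hi = last_good, size  # runs_ok(lo) is True, runs_ok(hi) is False
    # Phase 2: binary search for the boundary.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if runs_ok(mid):
            lo = mid
        else:
            hi = mid
    return hi  # smallest failing size found

# Toy stand-in: pretend the program blows its budget above 12345 units.
boundary = find_breaking_size(lambda n: n <= 12345)
print(boundary)  # 12346
```

Each real `runs_ok` probe is one timed run of the program, so the whole search costs only logarithmically many runs.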
           | 
           | The results may come out surprising. Performance does not
           | scale linearly - just because your program takes 1 second for
           | 1 unit of data, and 2 seconds for 2 units of data, doesn't
           | mean it'll take 20 seconds for 20 units of data. It may as
           | well take an hour. Picking a threshold far above what's
           | acceptable, and continuously monitoring how much load is
           | required to reach it, will quickly identify when something is
           | working much slower than it should.
        
         | duxup wrote:
          | I find it's kind of hell moving files from both my Android
          | AND my iPhone ... it feels like a 20-year-old process in
          | terms of working in fits and starts with random failures.
        
         | lmilcin wrote:
          | It is easier to abstract (at least naively). First, you
          | abstract moving a single file; then you create an
          | abstraction for "all files" basically by repeating the same
          | operation for all files. You could do this for a subset of
          | files, and so on.
         | 
         | As to slow operations... that is more likely because of
         | synchronous implementation.
         | 
          | The popular, naive implementation is, as above, to repeat
          | the same simple operation over and over again: read from
          | source, write to destination, read from source, write to
          | destination.
         | 
          | A better implementation (what I would do) would be to
          | pipeline the operations. A basic pipeline would have three
          | components, each streaming data to and/or from a buffer:
         | 
         | 1: Read from source to pipeline buffer
         | 
         | 2: Read from pipeline buffer to write to destination, write
         | information about files to delete to another pipeline buffer
         | 
         | 3: Read files to delete from pipeline buffer and execute
         | deletions.
         | 
          | Using a *nix shell you could do something like that in a
          | single line:
         | 
         | 1: tar -c files to output
         | 
         | 2: pipe output to tar -xv, write files to destination producing
         | list of written files, pipe written files to output
         | 
         | 3: read piped list of written files and remove them from input
         | dir
         | 
          | Now, this is not perfect because we waste performance on
          | creating a tar stream that we immediately discard, but you
          | get the picture.
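The three-stage pipeline could also be sketched with threads and bounded queues instead of tar (a Python illustration of the same read/write/delete split; pipelined_move is a made-up name):

```python
import os
import queue
import tempfile
import threading

def pipelined_move(src_dir, dst_dir):
    """Three-stage move: stage 1 reads source files, stage 2 writes them
    to the destination and forwards the names of written files, stage 3
    deletes the sources. Bounded queues let the stages overlap instead
    of running strictly in turn, and only successfully written files are
    ever deleted."""
    to_write = queue.Queue(maxsize=8)
    to_delete = queue.Queue(maxsize=8)
    DONE = object()

    def reader():
        for name in sorted(os.listdir(src_dir)):
            with open(os.path.join(src_dir, name), "rb") as f:
                to_write.put((name, f.read()))
        to_write.put(DONE)

    def writer():
        while True:
            item = to_write.get()
            if item is DONE:
                to_delete.put(DONE)
                break
            name, data = item
            with open(os.path.join(dst_dir, name), "wb") as f:
                f.write(data)
            to_delete.put(name)  # delete only what was written

    def deleter():
        while True:
            name = to_delete.get()
            if name is DONE:
                break
            os.remove(os.path.join(src_dir, name))

    threads = [threading.Thread(target=t) for t in (reader, writer, deleter)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

# Demo on throwaway directories.
src = tempfile.mkdtemp()
dst = tempfile.mkdtemp()
for i in range(5):
    with open(os.path.join(src, f"f{i}.txt"), "w") as f:
        f.write(f"payload {i}")
pipelined_move(src, dst)
print(sorted(os.listdir(src)), sorted(os.listdir(dst)))
```

Real code would stream large files in chunks rather than reading each one whole, but the staging structure is the same.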
        
           | tetha wrote:
           | > It is easier to abstract (at least nively). First, you
           | abstract moving a single file, then you create abstraction
           | for "all files" basically by repeating same operation for all
           | files. You could do this for a subset of files and so on.
           | 
            | A little bit of a rant, but I see this SO MUCH in database
           | layers in applications. Implement an operation for one row,
           | slap a for loop around it, it works kinda quickly on the
           | small test data set... and then prod has an intimate
           | conversation with a brick wall. It has been 0 days at work
           | since that happened.
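A small illustration of the row-at-a-time trap, using an in-memory SQLite database (the timings are indicative only; the same shape applies to any database driver):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")
rows = [(i, f"name-{i}") for i in range(50_000)]

# One-row operation with a for loop around it: one statement per row.
t0 = time.perf_counter()
for row in rows:
    conn.execute("INSERT INTO items VALUES (?, ?)", row)
per_row = time.perf_counter() - t0

conn.execute("DELETE FROM items")

# Same work expressed as a batch: per-statement overhead is paid once,
# not 50,000 times.
t0 = time.perf_counter()
conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
batched = time.perf_counter() - t0

print(f"per-row: {per_row:.3f}s  batched: {batched:.3f}s")
```

Against a networked database the gap is far larger, since each per-row statement also pays a round trip.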
           | 
           | > As to slow operations... that is more likely because of
           | synchronous implementation.
           | 
            | Interestingly, I think the answer is a solid maybe, and it
            | depends on the storage and how you issue your I/O
            | operations. Flash storage will increase performance if you
            | increase parallel operations, up to a point. However - and
            | this code apparently was written 20 years ago - on
            | spinning drives, parallel I/O operations slow you down if
            | the OS does not merge them. So it's entirely non-obvious.
        
             | lmilcin wrote:
              | On a single drive you run a variation where you read X
              | MB of data into a buffer, then write X MB of data out,
              | then execute the deletes, and so on. This lets you avoid
              | some of the problems with small files. Not all, because
              | small files are unlikely to be consecutive, so the head
              | will have to jump a lot, and then you still need to do a
              | lot of small writes to the filesystem (to remove the
              | files).
              | 
              | There are obvious improvements you could make. On some
              | filesystems you can just remove an entire folder rather
              | than removing the files individually before removing the
              | parent folder.
        
           | TeMPOraL wrote:
           | > _The popular, naive implementation is, as above, to repeat
           | same simple operation over and over again: read from source,
           | write to destination, read from source, write to
           | destination._
           | 
           | Reminds me of the kind of patterns functional programming
           | languages introduce, where you process data by describing
           | operations on individual items and assembling them into "a
           | stream". I'm always wary of those - without a good
           | implementation and some heavy magic at the language level,
           | they tend to become the kind of context-switching performance
           | disaster you describe.
        
             | lmilcin wrote:
                | Yes. The functional world is not impervious to leaky
                | abstractions.
                | 
                | I am personally of the opinion that, to be a good
                | developer, you have to have a mental model of what
                | happens underneath. If you are programming in a high-
                | level language it is easy to forget about the fact
                | that your program runs on real hardware.
                | 
                | I know, because I work mostly on Java projects, and
                | trying to talk to Java developers about real hardware
                | is useless.
                | 
                | I have an interview question where I ask "what
                | prevents one process from dereferencing a pointer
                | written out by another process on the same machine",
                | and I get all sorts of funny answers; only 5-10% of
                | candidates even have the beginnings of an
                | understanding of what is going on. Most don't know
                | what virtual memory is, or are surprised that two
                | processes can resolve different values under the same
                | pointer.
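A POSIX-only sketch of what the interview question probes (assuming CPython, where id() is the object's address; fork() gives both processes the same virtual addresses, backed by copy-on-write physical pages):

```python
import os
import struct

# After fork(), parent and child share the same virtual addresses, but
# each gets its own physical pages on write - so the "same pointer"
# (here: CPython's id(), the object's address) can hold different values
# in the two processes.
value = bytearray(b"parent")
addr = id(value)

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child: mutate the object in place, report its (unchanged) address.
    value[:] = b"child!"
    os.write(w, struct.pack("<Q", id(value)) + bytes(value))
    os._exit(0)

os.close(w)
data = b""
while len(data) < 14:  # 8-byte address + 6-byte payload
    data += os.read(r, 14 - len(data))
os.waitpid(pid, 0)

(child_addr,) = struct.unpack("<Q", data[:8])
child_value = data[8:]

print(child_addr == addr)          # same virtual address in both...
print(child_value, bytes(value))   # ...but different contents per process
```

Hardware page tables translate that shared virtual address to different physical frames for each process, which is the answer the question is fishing for.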
        
               | MaxBarraclough wrote:
               | > Most don't know what virtual memory is or are surprised
               | that two processes can resolve different values under
               | same pointer.
               | 
               | Yikes. How many of them have a degree in computer
               | science?
        
               | yurishimo wrote:
                | Probably not many of them, because it doesn't usually
                | take a graduate to write Java, especially if your
                | experience with Java is limited to tools like Spring.
                | I think the question is kind of pointless unless
                | you're specifically hiring people to work on an
                | application that needs that sort of throughput. Most
                | apps don't need it.
        
         | toast0 wrote:
          | Copy + delete one at a time makes a lot of sense if you're
          | working on a filesystem without a way to move without
          | copying (I don't think you can actually move a file in
          | FAT32), because copy-all could require more space than is
          | available.
          | 
          | The same could be true here, where you're moving from a zip
          | file to what is probably the same filesystem the zip file is
          | on, if removing a file from the zip is actually an in-place
          | move of the data followed by a truncate. The problem, of
          | course, is that removing a file from the zip file is
          | tremendously expensive. Reading the file with one syscall
          | per byte doesn't help (especially with post-Spectre
          | workarounds that make syscalls more expensive).
        
           | codetrotter wrote:
            | This is the real answer IMO. For example, if you have a
            | 2TB drive with 50GB of space left and are trying to move
            | 1TB of data, and the move requires copying, but each
            | individual file is less than 50GB in size, then I'd be
            | pretty upset if my computer was unable to move the files
            | just because it refused to delete anything until the end.
            | 
            | But ideally I'd want the system to delete at the end if
            | possible, and otherwise to delete as needed, instead of
            | either deleting only at the end or only after every single
            | file.
        
         | worewood wrote:
        | It's so that if the move fails midway you won't have to start
        | from scratch.
        
         | slver wrote:
        | In theory, all software would be coded using a "many by
        | default" approach, so that every time batching matters, we'd
        | take those batching opportunities automatically due to the way
        | the software is coded.
        | 
        | In practice we only batch when it starts hurting. It doesn't
        | hurt to delete files one by one on a normal file system - it's
        | made for that. So the API wasn't "many by default", and that's
        | how it works for zip files as well.
        
         | formerly_proven wrote:
          | IIRC explorer/shell namespaces are built on top of IStorage
          | / IStream, and those don't know about bulk operations.
        
       | dsego wrote:
       | Unless I'm mistaken, this seems to be the original programmer on
       | youtube:
       | 
       | Dave's Garage - Secret History of Windows ZIPFolders
       | 
       | https://youtu.be/aQUtUQ_L8Yk
        
         | codetrotter wrote:
         | It is. I have seen a few of his videos before. They are
         | generally interesting and worth checking out.
        
       ___________________________________________________________________
       (page generated 2021-06-02 23:00 UTC)