[HN Gopher] Leaky Abstractions ___________________________________________________________________ Leaky Abstractions Author : jakub_g Score : 227 points Date : 2021-06-02 18:35 UTC (4 hours ago) (HTM) web link (textslashplain.com) (TXT) w3m dump (textslashplain.com) | vidarh wrote: | This kind of thing doesn't surprise me at all - a surprising | number of developers miss these types of things, and one thing a | lot of people miss seems to be checking for situations where you | end up with an excessive number of calls each doing little work. | | E.g. at one point (a long time ago), MySQL's C client library | would call read() for 4 bytes to read a length indicator, and | then read exactly the number of bytes indicated by the protocol, | which led to a _lot_ of time spent context-switching vs. reading | into a larger user-space buffer and copying out of that. | | On Linux, simple, plain "strace" remains one of my favourite | first steps for finding performance problems, in part because | these things are near-immediately apparent if you strace a | process: the really pathological cases often end up dominating | the output so much that you'll spot them right away. | | Another "favourite" issue that shows up when you use strace like | this is excessive include paths - running strace on MRI Ruby | with rubygems enabled and lots of gems pulled in is a good way | of seeing that in action - like this zip problem, it's an | example that seems totally reasonable when the number of gems is | small, and that only becomes apparent when you test with lots of | gems and look at what's actually happening under the hood.
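The MySQL anecdote above is the classic shape of this problem. As a hedged sketch (the function names are mine, and `raw_read` is a fake stand-in for read(2) that counts its calls - the number strace would show you), here is why a userspace buffer collapses the syscall count for a length-prefixed protocol:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* `raw_read` stands in for read(2) and counts how many times it is
 * called. Messages are a 4-byte little-endian length + payload. */

static size_t raw_reads; /* how many "syscalls" we issued */

static size_t raw_read(const uint8_t *src, size_t srclen, size_t *pos,
                       uint8_t *dst, size_t want) {
    size_t n = srclen - *pos;
    if (n > want) n = want;
    memcpy(dst, src + *pos, n);
    *pos += n;
    raw_reads++;
    return n;
}

/* Naive client: one tiny read for the length, one for the payload --
 * two context switches per message. */
size_t count_naive(const uint8_t *buf, size_t len) {
    size_t pos = 0, msgs = 0;
    uint8_t hdr[4], payload[256]; /* sketch assumes payloads < 256 bytes */
    while (raw_read(buf, len, &pos, hdr, 4) == 4) {
        uint32_t plen = hdr[0] | hdr[1] << 8 | hdr[2] << 16
                      | (uint32_t)hdr[3] << 24;
        if (plen > sizeof payload ||
            raw_read(buf, len, &pos, payload, plen) != plen)
            break;
        msgs++;
    }
    return msgs;
}

/* Buffered client: one big read, then parse entirely in userspace. */
size_t count_buffered(const uint8_t *buf, size_t len) {
    static uint8_t big[65536];
    size_t pos = 0;
    size_t got = raw_read(buf, len, &pos, big, sizeof big);
    size_t at = 0, msgs = 0;
    while (at + 4 <= got) {
        uint32_t plen = big[at] | big[at + 1] << 8 | big[at + 2] << 16
                      | (uint32_t)big[at + 3] << 24;
        at += 4;
        if (at + plen > got) break;
        at += plen;
        msgs++;
    }
    return msgs;
}
```

With three 5-byte messages, the naive loop issues seven raw reads (two per message plus the final empty header read) while the buffered one issues a single read; in strace output that difference is impossible to miss.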
| | A similar example of testing with too-small datasets and/or on | a large enough machine not to spot what becomes immediately | obvious if you trace execution was a type 1 font reference | library made available by Adobe (no idea if they wrote it or if | it came from another source originally), which exhibited another | typical pathological behaviour: it would call malloc() hundreds | of times on loading a font, allocating 4 bytes at a time instead | of allocating larger buffers. (strace won't catch malloc(), but | ltrace does.) To avoid messing too much with the code, we | replaced the calls to malloc() with calls to a simple arena | allocator; load speed shot through the roof and memory usage | dropped massively. | | Too few people trace the execution of their code. I know that | because if most developers did, issues like the above would get | caught much sooner, and would be rarer in published code. | formerly_proven wrote: | > Another "favourite" issue that shows up when you use strace | like this is excessive include paths - running strace on MRI | Ruby with rubygems enabled and lots of gems pulled in is a good | way of seeing that in action - like this zip problem, it's an | example that seems totally reasonable when the number of gems is | small, and that only becomes apparent when you test with lots of | gems and look at what's actually happening under the hood. | | When you add e.g. Python's standard library as a ZIP to the | Python search path, the thing will open/read/read/read/close | that file approximately a gazillion times on startup. That's OK | on Linux/Unix, where that is fairly cheap. Guess which OS | doesn't like that pattern at all? | mrkramer wrote: | "Unfortunately, the code hasn't really been updated in a while. A | long while. The timestamp in the module claims it was last | updated on Valentine's Day 1998" | | When I saw this timestamp some time ago on my PC I thought it | was a joke?! C'mon, 1998, like WTF!
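The arena-allocator fix vidarh mentions fits in a few dozen lines. This is a generic bump allocator with names of my own choosing (`arena_alloc` etc.), a hedged sketch rather than the actual code used with the Adobe library: one malloc() per large block instead of hundreds of 4-byte mallocs, and everything is freed at once when the font is unloaded.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Minimal bump/arena allocator: carve small allocations out of big
 * malloc'd blocks, free them all in one go at the end. */
typedef struct arena_block {
    struct arena_block *next;
    size_t used, cap;
    uint8_t data[];          /* the actual storage, bump-allocated */
} arena_block;

typedef struct { arena_block *head; size_t block_size; } arena;

static arena_block *block_new(size_t cap) {
    arena_block *b = malloc(sizeof *b + cap);
    if (b) { b->next = NULL; b->used = 0; b->cap = cap; }
    return b;
}

void *arena_alloc(arena *a, size_t n) {
    n = (n + 7) & ~(size_t)7;             /* keep 8-byte alignment */
    if (!a->head || a->head->used + n > a->head->cap) {
        size_t cap = n > a->block_size ? n : a->block_size;
        arena_block *b = block_new(cap);  /* one malloc per block */
        if (!b) return NULL;
        b->next = a->head;
        a->head = b;
    }
    void *p = a->head->data + a->head->used;
    a->head->used += n;
    return p;
}

void arena_free_all(arena *a) {
    for (arena_block *b = a->head, *next; b; b = next) {
        next = b->next;
        free(b);
    }
    a->head = NULL;
}
```

A thousand 4-byte allocations then cost a handful of malloc() calls (one per 4 KiB block here) rather than a thousand, which is exactly the difference ltrace makes visible.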
| osipov wrote: | The programmer responsible for this implementation successfully | inverted a red-black tree during the job interview and explained | a Big-Oh-optimal solution for doing so. | [deleted] | Sol- wrote: | Interesting, I didn't even know you could cut files from a zip in | that Windows zip file viewer. I would have thought it was some | read-only filesystem, like viewing a mounted CD or so. | | In that context I wonder if you could cut from rewritable CD-RWs | as well back in the day (can't remember) - that seems like | another abstraction that's similarly slow in reality. | jakub_g wrote: | I never tried cutting from a CD-RW, but AFAIR each burn would | append a non-trivial header (like 20MB or so), so that would be | a pretty expensive thing to do :) | | AFAIR when you "copied" onto a CD-RW the files would show up | semi-transparent (pending) and you'd have to click a button to | proceed with burning. Probably the same for cutting, I guess. | TeMPOraL wrote: | Speaking of old Windows and CDs, Windows had some crazy trick | to turn a CD-R (_not_ CD-RW) into a rewritable medium. I'm | guessing they simulated a regular file system on top of an | append-only representation. I never dug into the details back | in the day, because I never used this feature. Unfortunately, | IIRC, this trickery was enabled by default when copying files | to CDs - which was a problem, because nothing else could read | it. This caused me an unending stream of calls from friends | and relatives, who all tried to burn family/vacation photos | onto a CD-R in order to view them on their TVs, but the CD | readers for TVs couldn't parse the format. | marcosdumay wrote: | Oh, the trick of placing the table of contents on the | outermost part of the written area, instead of at the | beginning of the CD. | | I think you were supposed to write a pointer at the | original place, so the driver would know to look again, but | from the outside in. Video CD players often didn't support | that pointer.
| | But actually, Windows was quite a latecomer to that | feature. It's just that its UX was horrible, so people would | never know which option they chose; on other OSes (and | other Windows software) people had to actually decide to | use it. | bluGill wrote: | It's been years, but the ISO file system says the file system | index starts 16 sectors past the start of the last track (the | first 16 sectors are reserved, e.g. for booting), then takes | however many sectors you need for the index - which depends | on how many files you have and how long the filenames are. | Note that neither CD-R nor CD-RW lets you write sectors in | the middle of a track, so you need to know where every file | will be on disc before you can write anything - you can't | build the index on the fly. (I remember this part because I | was writing backups, so I didn't know how big the file would | be until I was done and thus couldn't write the index first - | I eventually got around the rule by creating a new track with | just the index.) | zbendefy wrote: | Another similar and good article: | https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-a... | an1sotropy wrote: | Another common leaky abstraction: floating-point numbers as an | abstraction of the real numbers. It works nearly all the time, | until you really have to know about numerical precision, or | where a NaN came from. | | https://www.johndcook.com/blog/2009/04/06/numbers-are-a-leak... | yurishimo wrote: | These are the kinds of articles I love to see on HN! | | I think it's so interesting to see how different OSes have | approached these long-lived operations over the years. Arguably, | *nix OSes have been able to move past these problems since | package management there is much more of a first-party concern: | "Oh, you want to upgrade this utility? Well, I rely on these | other things, so you should update them too."
| | I'm a webdev, so a lot of how this actually works under the | hood is foreign to me, but it's cool to read about these | problems and speculate about how/why they came to be and why | they may only exist on certain platforms. | bentcorner wrote: | I wonder why it's implemented as a per-file copy+delete instead | of a "copy all files" then "delete all files". | | I also have a gut feeling that doing similar operations with a | connected Android phone (e.g., moving photos from your phone to | your PC over USB) is also slow, probably for similar reasons. | genpfault wrote: | > doing similar operations to a connected android phone | | MTP is terrible[1]. | | [1]: | https://en.wikipedia.org/wiki/Media_Transfer_Protocol#Perfor... | jakub_g wrote: | Oh boy. I was thinking USB 2.0 was the main reason for | Android<>Windows copying of photos to suck so much, but the | rabbit hole is much deeper. | | It's sad that with the cloud being the solution for everything | these days, this will probably never be improved within the next | decade. | vidarh wrote: | Probably because someone worked at a high enough level of | abstraction not to spot it, then tested it on small enough files | not to see enough of a performance issue to dig into what | actually happened. | TeMPOraL wrote: | Perhaps the common approach to performance testing is wrong | (forget for a moment that most shops don't think about it at | all; let's just consider quality developers/companies). Instead | of just monitoring whether the product is fast enough for | minimal/typical/peak expected usage, maybe it would be good to | focus on determining a boundary. E.g. how much data does it take | for the program to run for 60 seconds? Or, in general, 10x | longer than the maximum you'd consider acceptable? How much data | does it take for it to run out of memory?
| | These determinations can be made with a few tests each. Start | with some reasonable amount of data, keep doubling it until the | test fails, then binary-search between the last success and the | last failure. | | The results may come out surprising. Performance does not | scale linearly - just because your program takes 1 second for | 1 unit of data and 2 seconds for 2 units of data doesn't mean | it'll take 20 seconds for 20 units of data. It may as well take | an hour. Picking a threshold far above what's acceptable, and | continuously monitoring how much load is required to reach it, | will quickly identify when something is working much slower than | it should. | duxup wrote: | I find it's kind of hell moving files from both my Android AND | iPhone... it feels like a 20-year-old process in terms of | working in fits and starts with random failures. | lmilcin wrote: | It is easier to abstract (at least naively). First you abstract | moving a single file, then you create an abstraction for "all | files" basically by repeating the same operation for all files. | You could do this for a subset of files, and so on. | | As for slow operations... that is more likely because of a | synchronous implementation. | | The popular, naive implementation is, as above, to repeat the | same simple operation over and over again: read from source, | write to destination, read from source, write to destination. | | A better implementation (what I would do) would be to pipeline | the operations. A basic pipeline would have three components, | each streaming data to and/or from a buffer: | | 1: Read from the source into a pipeline buffer | | 2: Read from the pipeline buffer and write to the destination, | writing information about files to delete into another pipeline | buffer | | 3: Read files to delete from that pipeline buffer and execute | the deletions.
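As a hedged, synchronous stand-in for the pipeline idea (no threads; `move_batched` and `copy_file` are names of my own invention, not any real API): process the file list in batches, copying a whole batch and only then executing that batch's deletions, so deletes are decoupled from individual copies.

```c
#include <stdio.h>

/* Copy src to dst through a large buffer (few syscalls per file). */
static int copy_file(const char *src, const char *dst) {
    FILE *in = fopen(src, "rb");
    if (!in) return -1;
    FILE *out = fopen(dst, "wb");
    if (!out) { fclose(in); return -1; }
    char buf[1 << 16];                  /* 64 KiB chunks */
    size_t n;
    int rc = 0;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        if (fwrite(buf, 1, n, out) != n) { rc = -1; break; }
    if (ferror(in)) rc = -1;
    fclose(in);
    if (fclose(out) != 0) rc = -1;
    return rc;
}

/* Move files batch by batch: copy `batch` files, then delete those
 * sources together -- rather than interleaving a delete after every
 * single copy, or deferring all deletes to the very end (which would
 * need free space for a full extra copy of the data). */
int move_batched(const char *srcs[], const char *dsts[],
                 int count, int batch) {
    for (int start = 0; start < count; start += batch) {
        int end = start + batch > count ? count : start + batch;
        for (int i = start; i < end; i++)
            if (copy_file(srcs[i], dsts[i]) != 0)
                return -1;   /* this batch's sources are still intact */
        for (int i = start; i < end; i++)
            remove(srcs[i]);
    }
    return 0;
}
```

The batch size trades free-space headroom against how much work a failed move would redo - batch = count gives "copy all, then delete all", batch = 1 gives the per-file behaviour the thread is complaining about.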
| | Using a *nix shell you could do something like that in a | single line: | | 1: tar -c the files to output | | 2: pipe the output to tar -xv, writing the files to the | destination and producing a list of written files; pipe that | list to output | | 3: read the piped list of written files and remove them from the | input dir | | Now, this is not perfect because we are wasting performance on | creating a tar stream that we immediately discard, but you get | the picture. | tetha wrote: | > It is easier to abstract (at least naively). First you | abstract moving a single file, then you create an abstraction | for "all files" basically by repeating the same operation for | all files. You could do this for a subset of files, and so on. | | A little bit of a rant, but I see this SO MUCH in database | layers in applications. Implement an operation for one row, | slap a for loop around it, it works kinda quickly on the | small test data set... and then prod has an intimate | conversation with a brick wall. It has been 0 days at work | since that happened. | | > As for slow operations... that is more likely because of a | synchronous implementation. | | Interestingly, I think the answer is a solid maybe, and it | depends on the storage and how you issue your I/O operations. | Flash storage will increase performance as you increase | parallel operations, up to a point. However - and this code | apparently was written 20 years ago - on spinning drives, | parallel I/O operations slow you down if the OS does not merge | them. So it's not at all obvious. | lmilcin wrote: | On a single drive you run a variation where you read X MB of | data into a buffer, then write X MB of data out, then execute | the deletes, and so on. This lets you avoid some of the | problems with small files. Not all, because small files are | unlikely to be consecutive, so the head will have to jump a | lot, and then you still need to do a lot of small writes to | the filesystem (to remove the files). | | There are obviously improvements you could make.
For some | filesystems you can just remove entire folders rather than | removing the files individually just to remove the parent | folder. | TeMPOraL wrote: | > _The popular, naive implementation is, as above, to repeat the | same simple operation over and over again: read from source, | write to destination, read from source, write to | destination._ | | Reminds me of the kind of patterns functional programming | languages introduce, where you process data by describing | operations on individual items and assembling them into "a | stream". I'm always wary of those - without a good | implementation and some heavy magic at the language level, they | tend to become the kind of context-switching performance | disaster you describe. | lmilcin wrote: | Yes. The functional world is not impervious to leaky | abstractions. | | I am personally of the opinion that, to be a good developer, | you have to have a mental model of what happens beneath. If you | are programming in a high-level language it is easy to forget | that your program runs on real hardware. | | I know, because I work mostly on Java projects, and trying to | talk to Java developers about real hardware is useless. | | I have an interview question where I ask "what prevents one | process from dereferencing a pointer written out by another | process on the same machine", and I get all sorts of funny | answers; only 5-10% of candidates even have the beginning of an | understanding of what is going on. Most don't know what virtual | memory is, or are surprised that two processes can resolve | different values under the same pointer. | MaxBarraclough wrote: | > Most don't know what virtual memory is, or are surprised | that two processes can resolve different values under the | same pointer. | | Yikes. How many of them have a degree in computer | science? | yurishimo wrote: | Probably not many of them, because it doesn't usually take a | graduate to write Java.
Especially if your | experience with Java is also limited to tools like Spring. I | think the question is kind of pointless unless you're | specifically hiring people to work on an application that needs | that sort of throughput. Most apps don't need it. | toast0 wrote: | Copy + delete one at a time makes a lot of sense if you're | working on a filesystem without a way to move without copying (I | don't think you can actually move a file in FAT32), because | copying everything first could require more space than is | available. | | The same could be true here, where you're moving from a zip file | to probably the same filesystem the zip file is on - if | removing a file from the zip file is actually an in-place | move-data-then-truncate. The problem, of course, is that | removing a file from the zip file is tremendously expensive. | Reading the file with one syscall per byte doesn't help either | (especially with post-Spectre workarounds that make syscalls | more expensive). | codetrotter wrote: | This is the real answer IMO. For example, if you have a 2TB | drive with 50GB of available space left and are trying to move | 1TB of data, and the move requires copying, but all individual | files on their own are less than 50GB in size, then I'd be | pretty upset if my computer was unable to move the files just | because it refused to delete anything until the end. | | Ideally, though, I'd want the system to delete at the end if | possible, and otherwise to delete as needed, instead of doing | only all-at-the-end or only after-every-single-file. | worewood wrote: | It's so that if the move fails midway you won't need to start | from scratch. | slver wrote: | In theory, all software would be coded using a "many by default" | approach, so every time batching matters we'd take those | batching opportunities automatically, due to the way the | software is coded. | | In practice we only batch when it starts hurting. It doesn't | hurt to delete files one by one on a normal file system. It's | made for that.
So the API wasn't | "many by default", and that's how it works for zip files as | well. | formerly_proven wrote: | IIRC explorer/shell namespaces are built on top of IStorage / | IStream, and those don't know about bulk operations. | dsego wrote: | Unless I'm mistaken, this seems to be the original programmer on | YouTube: | | Dave's Garage - Secret History of Windows ZIPFolders | | https://youtu.be/aQUtUQ_L8Yk | codetrotter wrote: | It is. I have seen a few of his videos before. They are | generally interesting and worth checking out. ___________________________________________________________________ (page generated 2021-06-02 23:00 UTC)