[HN Gopher] The Wrong Abstraction (2016)
___________________________________________________________________
The Wrong Abstraction (2016)

Author : signa11
Score  : 124 points
Date   : 2023-05-13 10:40 UTC (12 hours ago)

(HTM) web link (sandimetz.com)
(TXT) w3m dump (sandimetz.com)

| samsquire wrote:
| I feel there's a beautiful representation of (many) problems that
| is waiting to be found that makes particular software problems
| easy to read and understand. When the mental model of the
| software just makes sense. I don't want to reject other people's
| models of how programs work but I want to understand them.
|
| Unfortunately, I feel the original problem can be obfuscated
| by adding ideas to the existing problem who now need to
| understand your ideas (or mental model) of the problem to
| understand your code. I need to understand how you think to read
| your code. And if your way of thinking is more advanced than mine
| or incomplete or not great, then my work is harder.
|
| Mental models are how I understand software; there is what I call
| a "critical insight" that makes the code obvious and easy to
| understand. I don't want to be deciphering and spend days
| investigating code to understand how to change it or build upon
| it or use it. I want the APIs to reflect their expected usage and
| behaviours.
|
| My perspective is that computers are adding and arrangement
| machines - they add and do operations on numbers and move things
| around to different locations. My mental model of computers is
| that it comes down to LOGISTICS and arrangement/ordering
| problems. Unfortunately, APIs and data structures are nested and
| ordered and obfuscate the underlying movement of things between
| places or addition to different things.
|
| Everything that obfuscates the rules of the computation means
| getting the behaviour you want from the code is harder.
|
| I've often thought about "commutative computation" where we
| specify what we want to be true to the computer and the computer
| works out how to arrange all its existing computations to satisfy
| that additional invariant. I often think of software as a series
| of behaviours rather than functional or imperative.
|
| Think of a materialised view: we have an existing behaviour of
| the computer and we want to customise the behaviour. You could
| work out where you need to insert your code snippet, but
| that's really hard. Or you could add an invariant to the system
| that the system now satisfies.
| kqr wrote:
| > Unfortunately, I feel the original problem can be
| obfuscated by adding ideas to the existing problem who now need
| to understand your ideas (or mental model) of the problem to
| understand your code.
|
| What do you think are the steps someone goes through when they
| obfuscate the problem by adding ideas? Like why do you think
| they do it?
| gbear0 wrote:
| Things get obfuscated because someone's viewing the problems
| from a different abstraction lens, and they're building a
| system onto that lens.
|
| Eg. Iterate through an array:
|
|     const arr = [1, 2, 3];
|     for (let i = 0, l = arr.length; i < l; ++i) {
|       console.log(arr[i]);
|     }
|
| Let's model it differently using an iterator:
|
|     const arr = [1, 2, 3];
|     const arrIter = arr[Symbol.iterator]();
|     let i = arrIter.next();
|     while (!i.done) {
|       console.log(i.value);
|       i = arrIter.next();
|     }
|
| At this level it's still pretty obvious what's going on, but
| you can still see that there's a level of abstraction between
| an array access vs calling 'next/value', and that obfuscates
| what is actually happening at the computation/instruction
| level.
|
| If I extend this another level then I'm going to start
| modelling problems using an iterable and not an array/index.
| New requirements come in and we extend to use an async
| iterable.
| Everything still works nicely, but in some
| scenarios where the actual iterable is just an array, now
| there's a lot of extra overhead to just do an index lookup.
|
| Using the iterator allows the code to be reused in more
| scenarios, but there's usually a cost to switching the lens
| of abstraction so that it fits into a problem modeled
| differently.
| samsquire wrote:
| I can tell you what I want to believe of myself that I think
| I do.
|
| I try to think of the simplest, most elegant, beautiful solution
| to the problem that allows the minimum of code and minimal
| cleverness and complexity to be used to solve the problem with
| a trivial loop, map, hash lookup or traversal or association.
|
| That usually comes with trying to see the problem in a
| different light, to reframe the problem as a different kind
| of problem, which can obfuscate the original problem.
| veyh wrote:
| (2016)
| pachico wrote:
| I agree 100%, but here's a plot twist:
|
| - Don't fall into the trap of early abstraction...
|
| - and abstracting one single use case is very hard, if not
| impossible...
|
| - but you are writing your app in Go and interfaces are the only
| way to properly test your stuff...
|
| - The end.
| kubanczyk wrote:
| > in Go and interfaces are the only way to properly test
|
| Adding tests using interfaces is solely adding more code (and
| the method signatures are simply duplicated) so it's the exact
| opposite of OP's problem.
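
For what it's worth, the trade-off gbear0 describes above fits in a few lines. This is only a sketch: `sumByIndex`, `sumByIterator` and `countTo` are made-up names, not anything from the thread or the article. The index version is tied to arrays; the iterator version accepts any iterable, and the price is that each element goes through the iterator protocol (a `next()` call and a result object, at least conceptually; engines may optimise much of that away).

```javascript
// Hypothetical helpers for illustration; none of these names come from the thread.

// Index-based: only works on array-likes, but maps directly onto "read element i".
function sumByIndex(arr) {
  let total = 0;
  for (let i = 0, l = arr.length; i < l; ++i) {
    total += arr[i];
  }
  return total;
}

// Iterator-based: works on anything that implements the iterable protocol
// (arrays, Sets, generators), because for...of only relies on next()/value/done.
function sumByIterator(iterable) {
  let total = 0;
  for (const value of iterable) {
    total += value;
  }
  return total;
}

// A generator has no indexes at all, so only the iterator version can consume it.
function* countTo(n) {
  for (let i = 1; i <= n; ++i) yield i;
}

console.log(sumByIndex([1, 2, 3]));             // 6
console.log(sumByIterator([1, 2, 3]));          // 6
console.log(sumByIterator(new Set([1, 2, 3]))); // 6
console.log(sumByIterator(countTo(3)));         // 6
```

Whether the extra generality is worth it depends on whether anything other than an array ever shows up, which is roughly the point being made.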
| dustingetz wrote:
| Agree with the big idea, but the problem is - If you are not a
| computer scientist and current with the latest papers, you
| certainly have the "wrong" abstraction
|
| So I'd collapse this whole article down to 1 bit - "software
| development is hard"
| [deleted]
| jsnell wrote:
| Significant past discussions:
|
| https://news.ycombinator.com/item?id=11032296
|
| https://news.ycombinator.com/item?id=12061453
|
| https://news.ycombinator.com/item?id=17578714
|
| https://news.ycombinator.com/item?id=23739596
|
| https://news.ycombinator.com/item?id=27095503
| revskill wrote:
| I think what people want is Design Pattern, not abstraction
| "implementation".
|
| Design pattern is real abstraction, because it's about thinking
| and designing. Abstraction is not related to a specific
| implementation.
|
| So, duplication is fine until you figure out the real Design
| Pattern to be used.
| seo-speedwagon wrote:
| > Existing code exerts a powerful influence. Its very presence
| argues that it is both correct and necessary.
|
| I've probably paraphrased this line to every junior engineer I've
| mentored. It's such a succinct and pithy insight.
| scrubs wrote:
| It's what I call small 'c' culture, which regrettably gets
| confused with 'C' culture. In software, 'C'ulture is:
|
| - know your customer, and their use cases
|
| - continuous improvement
|
| - specialists have got to know the big picture and engage in it
|
| - good cross functional coordination
|
| - CS fundamentals: DBs, algorithms, UNIX, functional
| programming, C/C++, CI/CD etc.
|
| That never ages away.
|
| However, stuff like "we use and have always used Kafka (read the
| code!) for messaging, so we're not doing kernel bypass to move
| data now" is small 'c' culture.
|
| Small 'c' culture is the kind of stuff that, if you abrogate
| it, a small army of people will come out of the woodwork and
| browbeat you for it. Browbeating to keep you in line is not
| engineering. It's nagging.
|
| Tradition, when it's small 'c', is stifling. Don't fall for it.
| kgeist wrote:
| In our team we have this rule that you shouldn't even think about
| introducing an abstraction unless there are at least 3 real use
| cases to consider. You're most likely to create a wrong
| abstraction if there's only 1 use case; 2 cases may be just a
| coincidence (2 business rules look similar on the surface but
| have nothing to do with each other really). 3 is a heuristic but
| it saves us from investing too much time on most likely useless
| abstractions which only get in your way.
| sodapopcan wrote:
| You are describing Rule of Three :)
|
| https://en.m.wikipedia.org/wiki/Rule_of_three_(computer_prog...
| dllthomas wrote:
| I think the rule of 3 tends to lead to reasonable results, but
| asking "is this really the same piece of knowledge I'm encoding
| in N places" (as the original formulation of DRY suggests) is
| going to be a little better. Sometimes it's two places but it's
| really clear they'll always change together, sometimes it's ten
| places but each is going to evolve independently (which, to be
| fair, might well be the determination made when "think[ing]
| about introducing an abstraction" in your formulation).
|
| To push back a bit on naive misapplication of DRY I've been
| saying we should call collapsing things that are just
| coincidentally similar (and likely to change independently)
| "Huffman coding".
| Tade0 wrote:
| More often than not, when I tried to employ the strategy
| explained in the post, the sunk-cost people would try to shut me
| down.
|
| Fortunately my current project is different, because the team is
| very small and we have silos of responsibility, so we don't
| really get in each other's way that much.
|
| It appears that the largest obstacle here is not the lack of
| ability, but agency.
| [deleted]
| croes wrote:
| >duplication is far cheaper than the wrong abstraction
|
| Isn't that a pretty useless sentence?
| Of course duplication is cheaper, because aren't the higher costs
| one of the reasons why it's a wrong abstraction?
|
| Reminds me of this sketch from Fry and Laurie
|
| https://youtu.be/XewVicFzRxw
|
| >Hugh: Yes but too much is bad for you.
|
| > Stephen: Well of course too much is bad for you, that's what
| "too much" means you blithering twat. If you had too much water
| it would be bad for you, wouldn't it? "Too much" precisely means
| that quantity which is excessive, that's what it means. Could you
| ever say "too much water is good for you"? I mean if it's too
| much it's too much. Too much of anything is too much. Obviously.
| Jesus.
| jakelazaroff wrote:
| Duplication and abstraction aren't the same, though.
| Abstraction is a tool for reducing duplication. The point of
| the post is that if the abstraction is wrong, it's worse than
| just leaving the seemingly duplicated code.
| yxhuvud wrote:
| I'd disagree. An abstraction is a way to reason about a
| problem. Often that reduces duplication, but it is a side
| effect from a better understanding of the problem.
| croes wrote:
| Of course because the higher costs are what make the
| abstraction wrong in the first place.
|
| That's like saying don't do the wrong thing.
| jakelazaroff wrote:
| Oh I see what you're saying. No, "the wrong abstraction"
| doesn't intrinsically mean it's more costly than duplicated
| code. A lot of people argue that the wrong abstraction is
| still better than having duplicated code. She's saying
| that's not the case.
| williamcotton wrote:
| Exactly, the Church of DRY. Heathens! The truth is found
| in the Church of the Rule of Three. ;)
| gpderetta wrote:
| We
| scrubs wrote:
| To add to the OP's post:
|
| 1. Organizations must value continuous improvement. We want to
| avoid two extremal behaviors that sour individuals. First, the
| lethargic in-bred sterility of: hey, it worked before you got
| here, and it's fine now. Play-it-off is not wisdom.
| On the other extreme is frustration gone wrong. Sure, you can
| see a problem AND be right about it. But whining and constant
| criticism sours. Everybody's problem is there are 10,000 things
| that could be worked on, and resources only for 1000. You better
| make sure you're customer driven so you pick the right 1000.
|
| 2. Duplication is better for the medium term ... if you stay with
| the problem for a while, you are better able to distill the big
| picture into a more coherent new abstraction. Here you can cite a
| problem, cite a solution, and stick to your guns. You are better
| positioned to effect change without being a whiner. Now, problems
| are working for you, not against.
| kmac_ wrote:
| [flagged]
| bcrosby95 wrote:
| Whenever this article is posted it amazes me. People seem to only
| reply to the title, and ignore the substance of it. The point is
| not to "not abstract" or "rule of 3". The point is requirements
| change, features are added, and _when_ an abstraction becomes
| wrong, tear it out.
| carlivar wrote:
| Yes, and then potentially rebuild it based on what you know
| now! No one reads that part.
| morrvs wrote:
| > The point is requirements change, features are added, and
| when an abstraction becomes wrong, tear it out.
|
| I like this phrasing a lot, thanks for this!
|
| I'm still wondering if there's also potential in _avoiding_ the
| wrong abstractions in the first place. For that we'd need a
| "cheap" way to decide whether an abstraction is
| good/bad/something else.
|
| Are there generally applicable, widely accepted principles or
| research around this? A quick search only revealed random blog
| posts; nothing I'd consider widely accepted.
| [deleted]
| krona wrote:
| > Are there generally applicable, widely accepted principles
| or research around this
|
| J. Ousterhout is gaining traction, at least in my corner of
| the industry.
| https://web.stanford.edu/~ouster/cgi-bin/cs190-winter18/lect...
| hamdouni wrote:
| In all the article, no reference to the business those
| abstractions or duplications are made for. I mean the way to
| decide if it is a "good duplication" is to ask ourselves if it is
| a coincidence that it is the same code: 2 business rules having
| the same implementation does not mean it is a duplication.
| kristiandupont wrote:
| This is part of the reason why I like Tailwind-style utility
| classes and TypeScript union/intersection types. The simple fact
| that I am (often) spared the intellectual effort of coming up
| with a name. I wrote this:
| https://itnext.io/and-naming-things-tailwind-css-typescript-...
| layer8 wrote:
| For limited occurrences that's correct, but if you find
| yourself having the same `foo | bar | baz` all over the code,
| you're going to want to introduce a shortcut term for it. Even
| just to be able to efficiently talk about it.
|
| The other thing is that unions/intersections are not an
| abstraction, because they don't hide any details. The purpose
| of an abstraction is to separate essential properties of
| whatever is being modeled (the interface) from current details
| that may change later, or that client code shouldn't depend on
| (the implementation).
| ivalm wrote:
| Yes, one of the reasons I like using mypy for Python and
| TypeScript for frontend is that it forces me to recognise
| opportunities for abstractions. If some input/return type is
| getting really complicated or reappears in many places in the
| code then likely it's a good candidate for an abstraction.
| kristiandupont wrote:
| In case it's unclear, I agree with you completely.
| Introducing the right abstraction into a code base can feel
| like someone switching on the lights. Far more benefit than
| just DRY.
|
| Conversely, I am currently working with a frontend code base
| that is using "classic CSS", and it's striking to me how
| frustrating it is to have to think up what the "semantics" of
| this and that particular <div> can be said to be, when there
| very often aren't any.
| pkolaczk wrote:
| While I agree with the statements made in the original post, I'm
| afraid this thinking can be used as an excuse for avoiding any
| attempts at finding proper abstractions. Similarly to how the
| term "premature optimisation" is so frequently used by people
| unable to write efficient code to excuse their lack of skill
| or laziness, even though the context and the times when those
| words were first used were vastly different and the author meant
| something else.
|
| IMHO abstraction should not be guided by the desire to remove
| duplication. Duplication is not even the only (and far from the
| worst) result of insufficient abstraction.
|
| Insufficient abstraction leads to increased complexity, not just
| duplication.
|
| Example: just this week I've been working on some code that has
| to deal with arbitrary ranges of ordered values. Typically when
| you think of a range, you think of a pair of bounds - the lower
| and the upper bound. However, the input is allowed to have only
| half-ranges so that one of the ends might be unbounded. So in the
| code I inherited there are 3 cases: a range with both lower and
| upper bounds defined, a range with only a lower bound, and a
| range with only an upper bound. All code processing those ranges
| has to deal with that optionality of either end, thus making it
| way more complex than needed - lots of if ladders or switch
| statements. And it multiplies very quickly when you deal with
| more than one range at a time. It is insufficiently abstract,
| even though it doesn't have any obvious duplication.
| The proper abstraction would be to transform the half-ranges to
| full ranges by introducing special open-end items (always smaller
| or greater than every possible value) which would allow one
| simple type of range to cover all possible cases.
| auggierose wrote:
| I'd say this can serve as an example where triplication is
| better than your abstraction. What are these special open-ended
| items? How do you need to extend comparison to account for
| them? Etc. Whereas the three cases are perfectly clear and easy
| to understand.
| JadeNB wrote:
| > How do you need to extend comparison to account for them?
|
| The post already says: -∞ < x < ∞ for all numerical x.
| (And the mathematician in me clarifies that that's all _real_
| numerical x.)
| pkolaczk wrote:
| I don't remember who said that, but mathematics is all
| about building abstractions, not about computation. So many
| times mathematics helped me make code simpler.
| auggierose wrote:
| The post I am replying to made no assumptions about the
| domain the order is defined on. If it is over the reals,
| sure, you can use -∞ and ∞. If it is over the integers,
| you can use MIN_VALUE and MAX_VALUE, sacrificing some of
| your domain (which might be a problem, depending on the
| context), or you can use Option[Int], which comes with
| performance issues.
|
| Or, you can use a range which is a sum of three/four cases,
| and not worry about any of that.
| Rumudiez wrote:
| > What are these special open-ended items?
|
| > If it is over the reals, sure, you can use -∞ and
| ∞. If it is over the integers, you can use MIN_VALUE
| and MAX_VALUE
|
| If you already knew the answer, why did you ask?
| pkolaczk wrote:
| The three cases force _every code_ using ranges to deal with
| them. Apply this way of thinking many times for multiple
| concepts and you end up with a spaghetti of multiply nested
| if statements that's near impossible to analyze for
| correctness.
| Because now you have to read all code instead of just a tiny
| subset.
|
| > What are these special open-ended items? How do you need to
| extend comparison to account for them?
|
| The whole point of abstraction is to make those decisions
| once and isolate the complexity in one place instead of
| having it spread over N places in the code, forcing everybody
| to solve the same problems again and again.
| hitchdev wrote:
| [dead]
| kqr wrote:
| Do ranges not support a limited set of operations through
| which the rest of the code can interact with them, instead
| of manipulating the endpoints directly?
|
| I would think of the range itself as the abstraction, and
| then it matters less how it's implemented since any
| potential problems are local to the implementation and
| cheap-ish to fix.
| pkolaczk wrote:
| Sure you can probably also do it, but this is not how
| the code was originally written. The original ranges
| present the bounds in their public API, and most code
| just operated on them.
| kqr wrote:
| Then I would say that's the problem. Whatever
| implementation you leak, the problem is the lack of
| implementation hiding, not that the implementation looks
| this way or that way.
| pkolaczk wrote:
| But that's still insufficient abstraction and my general
| point holds.
| auggierose wrote:
| Yes, this is correct. The range is the abstraction, and
| then you can choose how to represent it. Not much
| difference between the three range cases, and the single
| case with special endpoints, except that the three cases
| are more general, as no special values are needed.
| travisjungroth wrote:
| When your three cases run into someone else's two you have
| six, and it can even get much worse than that. Or an object
| running into itself and getting nine cases.
|
| Intervals are nice for representing acceptable ranges. Half
| intervals mean greater/less than. If you stick infinities on
| the ends, everything likely works.
| You then expose methods or functions for all your operations.
| From the outside, you don't have to care if it's a half interval
| or not (unless that is what you're particularly checking). On
| the inside you don't really, either.
|
| If you're messing with intervals in a business setting, it's
| worth considering if you need multi intervals, non-continuous
| regions.
|
| These are all great for handling uncertainty. Like if you add
| two weights that have +/- values, you can have the sum have
| those and be correct. The math is all well defined and rather
| easy. Wikipedia has good pages on it.
| JadeNB wrote:
| > The proper abstraction would be to transform the half-ranges
| to full ranges by introducing special open-end items (always
| smaller or greater than every possible value) which would allow
| one simple type of range to cover all possible cases.
|
| You wouldn't even need to create anything new--both math and C
| already provide this abstraction in the form of -∞/-inf and
| ∞/inf.
| pkolaczk wrote:
| I wasn't talking specifically about real (float) numbers, but
| yes - this is that abstraction. And it generalizes to any
| type with ordering (can work with integral types as well).
| hakunin wrote:
| Problems like these often come from the pressure to ship fast,
| and not writing code 2-3 times to find a good way to express
| something. If you're going to rush through abstracting things
| away, I'd rather you duplicated. If you will take time to
| express it well, then I'd prefer a good abstraction.
| dgb23 wrote:
| Very good example of how even a small scale piece of logic can
| have very messy effects down the line.
|
| Both insufficient and wrong abstractions are viral. They infect
| everything they touch, which can snowball into large parts
| being more complex, harder to understand and debug and often
| also slower.
|
| The wrong abstraction is wrong, insufficient abstraction is
| wrong.
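
An aside on the range example discussed above: a rough sketch of the "infinities on the ends" idea, in JavaScript. The `Range` class and its method names are hypothetical, not the code pkolaczk inherited, and the sketch assumes closed bounds over plain numbers, where `-Infinity` and `Infinity` already compare correctly. As auggierose points out, other ordered types would need their own sentinel values.

```javascript
// Hypothetical Range type for illustration only; assumes closed bounds over JS numbers.
// A missing bound is normalised exactly once, here, to -Infinity or +Infinity,
// so no other code has to distinguish "lower only", "upper only" and "both".
class Range {
  constructor(lower = -Infinity, upper = Infinity) {
    this.lower = lower;
    this.upper = upper;
  }

  // True if x lies within the range, whether or not an end is unbounded.
  contains(x) {
    return this.lower <= x && x <= this.upper;
  }

  // True if the two ranges share at least one point.
  overlaps(other) {
    return this.lower <= other.upper && other.lower <= this.upper;
  }

  // The common part of two ranges, or null if they do not overlap.
  intersect(other) {
    const lower = Math.max(this.lower, other.lower);
    const upper = Math.min(this.upper, other.upper);
    return lower <= upper ? new Range(lower, upper) : null;
  }
}

const atLeast5 = new Range(5);             // only a lower bound
const atMost10 = new Range(undefined, 10); // only an upper bound
const both = atLeast5.intersect(atMost10);

console.log(both.lower, both.upper);       // 5 10
console.log(atLeast5.contains(1e9));       // true
console.log(atMost10.overlaps(new Range(11, 12))); // false
```

Callers never branch on whether an end is missing; that decision is made once, in the constructor. Following kqr's point, code that only goes through `contains`, `overlaps` and `intersect` would not notice if the internals were swapped for the three-case representation.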
|
| Really the only weapons against complexity we have as
| programmers are decomposition and abstraction. We have to take
| things apart, like in your example it would be the meaning of
| each parameter, and then we put them together in such a way that
| the details below our abstraction can mostly be ignored.
|
| I say that all with a caveat: I tend to prefer less,
| insufficient or no abstraction over the wrong one. The former
| few options can lead to code that is hard to understand as a
| whole and can be brittle, but the latter drives you into a
| corner: The only way out is either trying to patch over it or
| starting from scratch - choose your poison...
| marcosdumay wrote:
| > the latter drives you into a corner: The only way out is
| either trying to patch over it or starting from scratch
|
| Often enough, the way out is going back. Why are developers
| collectively so reluctant to go back? (Myself included.)
| epiccoleman wrote:
| I'm encountering a lot of these types of small abstraction
| projects in a React project I'm working on. It's a music theory
| "explorer" app and, maybe unsurprisingly if you know any music
| theory, getting a good abstraction that doesn't fall victim to
| lots of weird little edge cases is tricky.
|
| I'm using Tonal which makes it easier, because I can mostly
| push weirdness into wrappers for individual Tonal calls. It's
| honestly been a great little challenge because the scope is so
| small that it doesn't take all that much analysis or thought to
| see where abstractions break down. Fun little exercise in code
| design.
| _a_a_a_ wrote:
| IDK anything about music theory but I wonder, if you're
| having trouble finding a good abstraction to express
| theory, perhaps the theory is at fault.
|
| I mean, at a high level, theory _is_ the abstraction isn't
| it?
| pm wrote:
| The time dimension is often forgotten when applying these
| maxims.
| When we see code, we often fail to consider the journey it's
| taken to arrive at that point in time, and where it might be
| headed in future.
|
| In the example you set, it's the right time to apply an
| abstraction, so it's no longer premature. Perhaps the maxim
| should be labelled as "premature abstraction", rather than
| "premature optimisation".
| nathias wrote:
| Wrong abstraction is a type of premature optimization, it's an
| anti-pattern that's very common among the senior and supersenior
| coders that 'already know everything in advance' and that
| knowledge turns out to be false.
| BulgarianIdiot wrote:
| Why is this article recognizing only two extremes? Either
| everyone uses the same instance, or there's NO SHARING AT ALL:
|
| "Re-introduce duplication by inlining the abstracted code back
| into every caller."
|
| Or maybe if there are, say, 10 places dependent on the shared
| code, we can make them 5 places dependent on one version and 5
| places dependent on another?
|
| Forking and merging is part of business as usual in programming
| and we should be used to it. We should not be shocked that
| sometimes you have to fork a function because adding more
| parameters is not feasible, but nor should we declare sharing is
| therefore wrong or harmful.
|
| Also, how you design parameters is extremely important. One
| callback parameter may be worth a hundred "normal" ones.
| inimino wrote:
| The point she is making is about choosing to go back, towards
| less abstraction, rather than forward. So I expect the answer
| to your question is that two endpoints are enough to establish
| both directions and make the intended point.
|
| If midpoints are introduced then comments like yours "but what
| about..." can always be made until the entire abstraction tower
| is fully described, and that's not the blog post (or book) the
| author wanted to write.
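
A minimal sketch of BulgarianIdiot's remark above that one callback parameter can be worth many "normal" ones. Everything here is made up for illustration: `formatNamesWithFlags`, `formatNames` and the null-means-skip convention are assumptions, not anything from the article. The first function grows a flag and a branch per caller requirement; the second keeps only the shared iteration and joining, and lets each caller supply the per-item behaviour.

```javascript
// Hypothetical formatter; all names and conventions here are illustrative.

// Flag-driven version: each new requirement adds a parameter and a conditional.
function formatNamesWithFlags(names, upperCase, skipEmpty, prefix) {
  const out = [];
  for (const name of names) {
    if (skipEmpty && name === "") continue;
    let s = upperCase ? name.toUpperCase() : name;
    if (prefix) s = prefix + s;
    out.push(s);
  }
  return out.join(", ");
}

// Callback version: the shared code owns only the loop and the join.
// The caller's transform decides what each item becomes; returning null drops it.
function formatNames(names, transform = (name) => name) {
  const out = [];
  for (const name of names) {
    const s = transform(name);
    if (s !== null) out.push(s);
  }
  return out.join(", ");
}

const names = ["ann", "", "bob"];

console.log(formatNamesWithFlags(names, true, true, "@"));
// @ANN, @BOB

console.log(formatNames(names, (n) => (n === "" ? null : "@" + n.toUpperCase())));
// @ANN, @BOB
```

The callback version is not automatically the better abstraction: it moves the variation into the callers, which only helps if their needs really do differ.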
| BulgarianIdiot wrote:
| I wish every time someone established two extreme points,
| everyone were like you, automatically interpolating an entire
| space of endless possibilities, countless shades of gray. But
| this is decidedly NOT how we think, because dichotomies are
| simpler to mentally process, and in fact ultimatums or
| "single right solution" situations are easiest to process.
|
| Have you ever seen an online argument? If someone is right,
| and someone is not AS right as they are, they are a "left
| shill" and vice versa. If you promote solution A, and someone
| promotes solution B, then they're "wrong". Not establishing
| points, just "wrong".
|
| So I think establishing two directions is best accomplished
| not by marking up two extreme points and leaving the rest to
| the imagination as our imagination is apparently quite poor.
|
| It's more correct to describe the next _step_ in a direction,
| and let us take things step by step and know that nuance is
| inherent to our success, not optional.
___________________________________________________________________
(page generated 2023-05-13 23:01 UTC)