[HN Gopher] DRY Is a Trade-Off
       ___________________________________________________________________
        
       DRY Is a Trade-Off
        
       Author : soopurman
       Score  : 73 points
       Date   : 2020-12-17 19:44 UTC (1 days ago)
        
 (HTM) web link (orbifold.xyz)
 (TXT) w3m dump (orbifold.xyz)
        
       | Ma8ee wrote:
       | I see similar arguments come up over and over at HN, and I say
       | they stem from a fundamental misunderstanding of DRY. What we
       | must not repeat is not lines of code, but how we do certain
       | things in the code. That is, it's fine to repeat syntactically
       | identical sections of code if their semantic meaning is
       | different. But if the semantic meaning is the same, they must
       | never be repeated because we must never have several definitions
       | of the same thing in the same program. This is similar to the
       | concept of normalisations in DBs.
        
         | takeiteasyy wrote:
         | Could you give an example of syntactically identical sections
         | of code with different semantic meanings?
        
           | hknapp wrote:
           | y = m * x + b
           | 
           | can refer to the distance a car has traveled over time at a
           | fixed speed with a starting point as well as the cost of
           | ordering a certain quantity of an item with a fixed shipping
           | cost
           | 
           | that's my best attempt at an example
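A minimal sketch of the example above. The function names and numbers are illustrative assumptions, not from the thread. Both functions have the shape y = m * x + b, but each encodes a different rule, so they are kept as two definitions.

```python
def distance_travelled(speed, hours, start_position):
    """Position of a car moving at a fixed speed (hypothetical example)."""
    return speed * hours + start_position


def order_cost(unit_price, quantity, shipping_fee):
    """Cost of an order with a flat shipping fee (hypothetical example)."""
    return unit_price * quantity + shipping_fee


print(distance_travelled(60, 2, 10))
print(order_cost(4, 5, 3))
```

If pricing later gains a bulk discount, only order_cost has to change. A shared linear helper would have coupled the two rules for no semantic reason.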
        
             | Ma8ee wrote:
             | That is better than I managed to come up with on short
             | notice.
        
               | hknapp wrote:
               | Thanks. It was inspired by an Uncle Bob example.
        
           | Ma8ee wrote:
           | Only contrived ones without supplying quite a bit of context.
        
       | shhsshs wrote:
       | Someone else's comment [1] I saved from an older post also about
       | DRY
       | 
       | I've usually heard this phenomenon called "incidental
       | duplication," and it's something I find myself teaching junior
       | engineers about quite often.
       | 
       | There are a lot of situations where 3-5 lines of many methods
       | follow basically the same pattern, and it can be aggravating to
       | look at. "Don't repeat yourself!" Right?
       | 
       | So you try to extract that boilerplate into a method, and it's
       | fine until the very next change. Then you need to start passing
       | options and configuration into your helper method... and before
       | long your helper method is extremely difficult to reason about,
       | because it's actually handling a dozen cases that are
       | superficially similar but full of important differences in the
       | details.
       | 
       | I encourage my devs to follow a rule of thumb: don't extract
       | repetitive code right away, try and build the feature you're
       | working on with the duplication in place first. Let the code go
       | through a few evolutions and waves of change. Then one of two
       | things are likely to happen:
       | 
       | (1) you find that the code doesn't look so repetitive anymore,
       | 
       | or, (2) you hit a bug where you needed to make the same change to
       | the boilerplate in six places and you missed one.
       | 
       | In scenario 1, you can sigh and say "yeah it turned out to be
       | incidental duplication, it's not bothering me anymore." In
       | scenario 2, it's probably time for a careful refactoring to pull
       | out the bits that have proven to be identical (and, importantly,
       | must be identical across all of the instances of the code).
       | 
       | [1]
       | https://news.ycombinator.com/reply?id=22022603&goto=item%3Fi...
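A rough sketch of the failure mode described above, assuming a hypothetical helper named load_record that has absorbed one option per new requirement. The two inline versions below it (also hypothetical) do the same work at their call sites and are arguably easier to reason about.

```python
def load_record(raw, *, strip=True, lowercase=False, default=None,
                allow_empty=False):
    """Hypothetical shared helper after several rounds of 'just add an option'."""
    value = raw.strip() if strip else raw
    if lowercase:
        value = value.lower()
    if not value and not allow_empty:
        return default
    return value


# The same behaviour written inline at two call sites (also hypothetical).
def parse_username(raw):
    # Usernames are trimmed and lowercased; blank input becomes None.
    return raw.strip().lower() or None


def parse_comment(raw):
    # Comments keep their whitespace and may be empty.
    return raw
```

Each inline function states one rule. The helper states all of them at once, and every caller has to know which combination of flags reproduces its own rule.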
        
         | snapcore wrote:
         | This sounds similar, but different to the three strike rule:
         | 
         | https://en.wikipedia.org/wiki/Rule_of_three_(computer_progra...
        
         | systemvoltage wrote:
         | > Then one of two things are likely to happen
         | 
         | You forgot the third most common case - 90% of those repeated
         | methods will never have a bug and now you've got a code base
         | that will start smelling bad and no one is going to want to
         | work on it.
         | 
         | I agree with some of the things here, but allowing multiple
         | places of repeated code _unless_ there isn't a bug sounds like
         | a terrible idea. A lot of these small methods will never have a
         | bug and they'll continue to rot the codebase.
        
           | lostcolony wrote:
           | Can you get the trade off right 100% of the time? Because I
           | can tell you, every time I've worked on a codebase that
           | repeated itself, it has been a freakin' -delight-, compared
           | to the times where DRY was taken as a commandment from on
           | high.
           | 
           | The former when something broke, I could just...fix it. And
           | it would be fixed. Would other, similar situations, still be
           | broken? Sure! And when those would be raised up we'd fix them
           | too, and compare them with the other changes, and possibly
           | refactor. Fixing one bug = one bug less.
           | 
           | The latter? Oh God. Something is broken? We'd fix it. Aaaand,
           | now there'd be two bugs. Fixing one bug = more bugs.
           | 
           | Perfectly balanced code, yes, fixing one bug = multiple bugs
           | fixed. That's the goal. But you won't get it right if you do
           | it pre-emptively. Which of those other two options would you
           | prefer?
        
           | dhruvkar wrote:
           | Actually asking, as I've never worked on a large existing
           | codebase:
           | 
           | If there is no bug, why would it continue to rot the
           | codebase?
        
             | systemvoltage wrote:
             | https://en.wikipedia.org/wiki/Code_smell
        
               | oxfeed65261 wrote:
               | Will you be here all week?
        
               | systemvoltage wrote:
               | Not sure if I follow?
        
               | oxfeed65261 wrote:
               | I thought that your response was hilarious, and deserved
               | a "Thank you, I'll be here all week." :)
        
           | [deleted]
        
           | hombre_fatal wrote:
           | These kinds of abstract code discussions almost become
           | immediately absurd because, for it to work, we have to be
           | imagining the same hypothetical codebase. Yet we never bust
           | out concrete code. It's funny.
           | 
           | But the outcome that you're lambasting so rudely (really?
           | "terrible idea" when we aren't even looking at code?) is
           | still often the best outcome.
           | 
           | Some rotting, bugless, duplicated code is some of the easiest
           | code to work with. It's the code you wish you had when you're
           | debugging the complicated failing abstraction that GP wanted
           | to avoid. The most damning thing you yourself could say about
           | it is that it was taking up space.
           | 
           | In fact, you seem to be making the exact reverse argument of
           | GP: that some duplicated, unabstracted, bugless code poses
           | such a risk ("rot") that it's a "terrible idea" to not
           | immediately merge it into one frankenabstraction.
           | 
           | When this happens in an argument, usually you both are
           | imagining an absurd extreme that's opposite of the other's
           | chosen extreme. And you actually are in agreement, as you'd
           | both go "oh well yeah, if you go to _that_ extreme, then I'd
           | definitely agree with you."
        
         | leni536 wrote:
         | Rule of thumb: If you can't give the thing a name then maybe
         | don't extract it. What you extract becomes an abstraction. No
         | abstraction is better than a bad abstraction.
        
           | axaxs wrote:
           | Not a bad rule, but from what I've seen mainly in corporate
           | Java, you can give everything and anything a name.
        
             | leni536 wrote:
             | I should have written a "good name".
        
               | loopz wrote:
               | GoodNameAbstractionFactory to the rescue!
        
           | hi_hello wrote:
           | As a chronic over-generalizer, you have no idea how much time
           | I've wasted trying to think of the right name for meta-
           | monstrosities.
        
             | kylegill wrote:
             | My friend told me at their company they'd commonly convene
             | the "variable naming committee" for such occasions, and I
             | can't help but think of it every time I find myself in the
             | same place.
        
           | dcolkitt wrote:
           | In principle I agree, but in practice don't think this leads
           | to great results. Naming things well is really cognitively
           | challenging. The human brain is naturally lazy and easily
           | makes excuses to avoid thinking hard.
           | 
           | When something like this is adopted, the average person will
           | look at something then quickly throw up their hands and
           | declare that they can't name it without really trying. Many
           | things can be named well, but it takes 60 seconds of hard
           | thought and focus to realize it.
        
             | mrits wrote:
             | Being at a startup for 10 years had the advantage that I'd
             | frequently run into some of my original code. My favorite
             | real life naming example:
             | 
             | IDataProcessor. And if you need the raw data it will be
             | several layers deep in dataProcessor.data.data.data.
        
               | gabereiser wrote:
               | I read that last line in the theme of Super Mario Bros
               | World 1-2 song. Data.data.data... Data.data.data...
        
               | boogies wrote:
               | https://invidious.site/watch?v=OsnfYn_ZFdE
        
           | bluSCALE4 wrote:
           | This is absolutely the rule. The problem I see a lot is that
           | someone else will come in and want to add an edge case to my
           | generalized code or will try to include edge cases in their
           | "generalized" code. Personally, I have 0 problems with DRY
           | code. I don't even treat it like a requirement but sometimes,
           | I'm trying to code something and I KNOW the code needs to be
           | generalized or at least part of it and that's when I'll spend
           | an hour on a 15 min fix. Effective DRY comes with experience.
        
         | js8 wrote:
         | In imperative languages without macros, it might be difficult
         | to abstract structural similarity, like a common if statement.
         | 
         | In functional languages, where every statement is a function,
         | this is often easier.
        
           | jayd16 wrote:
           | You can write your method in a functional style in the
           | popular imperative languages. This can stretch how DRY you
           | can be a bit further.
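One way to read the suggestion above, as a small sketch: a recurring "if the value is present, transform it, else use a fallback" statement can be captured as a higher-order function. The name when_present is a hypothetical choice for illustration.

```python
def when_present(value, transform, fallback):
    """Apply transform only if value is not None; otherwise return fallback."""
    return transform(value) if value is not None else fallback


greeting = when_present("ada", str.title, "Guest")
port = when_present(None, int, 8080)
print(greeting, port)
```

Whether this reads better than the plain if statement depends on how often the pattern recurs and whether the name carries real meaning for the team.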
        
       | city41 wrote:
       | Happy to see people questioning DRY. Hard to argue with some of
       | these points.
       | 
       | Dan Abramov also wrote on this fairly recently:
       | https://overreacted.io/the-wet-codebase/
        
       | danparsonson wrote:
       | "...putting common lines into functions, _without careful thought
       | about abstractions_, is never a good idea..."
       | 
       | (emphasis mine)
       | 
       | I think this is the crucial part. DRY works fine and in fact
       | arises naturally if the code is well factored to isolate areas of
       | commonality - as the author points out though, this is very
       | difficult to do and I think that's the core problem.
        
       | lock-free wrote:
       | Not sure if anyone else shares this anecdote, but I've noticed
       | the most DRY-hard programmers tend to be the most resistant to
       | things like functional programming, monads, and other generic
       | approaches which are the ultimate realization of DRY. And often
       | the most inscrutable.
       | 
       | Another anecdote on DRY: I'm currently refactoring two
       | interrelated systems that share a single function, and very
       | horrible bad things happen if the systems disagree on the return
       | value of that function. However _sharing the same code_ is more
       | complex than duplicating it and this has a nontrivial impact on
       | how the systems are distributed. So today I'm undoing the
       | original work I did to make it DRY - turns out that sometimes,
       | you need to copy/paste.
        
       | bob1029 wrote:
       | Duplication is usually a safe default course of action because
       | you are not locking yourself in to any particular consensus of
       | the problem domain. Obviously, too much of this will render a
       | codebase a nightmare to maintain as bugfixes and feature
       | enhancements have to be applied in multiple places.
       | 
       | I have found that starting with duplication is by far the easiest
       | and most flexible way to work through problem domains that are
       | complex. Once you have a really good grasp of the modeling, then
       | you can iterate and decide on normalization where appropriate.
       | 
       | Thinking about this from an analytical perspective - If you build
       | your application with duplication by default (i.e. define a
       | domain model for each logical use case/scenario), then you will
       | have an excellent analysis already in front of you regarding
       | which business types should be normalized and which ones might be
       | a little bit trickier to make common. Many times it is impossible
       | to fully explore a problem domain until you have already written
       | software against its entire extent.
        
         | xnx wrote:
         | And often the process of de-duping repeated code later on isn't
         | as bad as DRY purists make it out to be, especially since it
         | can be done incrementally. Example: If you have similar
         | functions in 7 places, you can consolidate them one by one. If
         | you have 1 function used in 7 places, you have to consider all
         | implications for all of those code paths.
        
           | Ma8ee wrote:
           | But when you set out to do that you first have to make sure
           | they really are identical, and since everything from
           | indentation to names often differs it might not be so
           | automatic. And then you do find differences, and you have to
           | figure out if they are there by accident because someone
           | forgot to implement a fix or a change in some places or if
           | the differences are intentional.
           | 
           | On the other hand, if you realise your abstraction was bad,
           | duplicating a function is always trivial.
        
           | chousuke wrote:
           | In my experience, the "later" may come too late.
           | 
           | I've seen code copy-pasted and slightly modified over a
           | _dozen_ times, sometimes without even eliminating dead code!
           | Copying a function or even a whole file is fine if you
           | _actually_ take a moment to consider whether it needs
           | refactoring or not, but more often than not people will just
           | copy and bash at the code until things work without actually
           | making a conscious choice to duplicate code over refactoring.
        
           | azundo wrote:
           | This seems like an equivalent duality to me but in the case
           | of seven independent functions you're much more likely to
           | miss considering a case. If behavior changes for one you
           | likely should be considering if that change applies to all
           | the others.
        
       | layer8 wrote:
       | A better formulation of DRY is SPOT -- Single Point Of Truth. In
       | the event that the logic is changed in one copy, should the other
       | copy always be updated accordingly? If the answer is yes, combine
       | them into a single copy, so that they don't diverge and create
       | ambiguous "sources of truth" in the future. Conversely, if it is
       | likely that the logic in the two copies will need to diverge in
       | the future, due to having a different context, then do not
       | combine them, because they represent different "truths" that just
       | currently happen to have the same form.
       | 
       | Of course, the answer to that question can change over time, and
       | one has to combine or duplicate accordingly. This also serves to
       | document the intent that "yes, these two occurrences are expected
       | to evolve identically", or "no, these two occurrences are
       | expected to evolve independently, even though they currently
       | happen to look the same".
       | 
       | The article is correct though that there is a trade-off in terms
       | of the complexity created by the abstraction, and in how
       | important the "common truth" is. Sometimes a source comment
       | pointing out the dependency is better than introducing a
       | nontrivial abstraction.
       | 
       | The book "A Philosophy of Software Design" argues that there are
       | two sources of complexity in software: dependencies and
       | obscurity. Combining two similar pieces of logic into one can
       | reduce dependencies (of one having to be changed when the other
       | is changed), but can increase obscurity due to the added
       | abstraction. If the combining was done for the wrong reasons (the
       | two occurrences actually need to evolve independently), then the
       | dependencies are increased instead of reduced.
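A minimal sketch of the SPOT test described above. All names and values here are assumptions chosen for illustration. The password rule is treated as one fact that must change everywhere at once, while the two limits only happen to be equal today.

```python
# Assumption: the password rule is one business fact shared by signup and
# password reset, so it lives in exactly one place.
MIN_PASSWORD_LENGTH = 12


def is_valid_password(password):
    return len(password) >= MIN_PASSWORD_LENGTH


# These two limits are equal today, but they answer different questions and
# are expected to evolve independently, so they stay separate on purpose.
MAX_UPLOAD_RETRIES = 3
MAX_LOGIN_ATTEMPTS = 3
```

The comments do the documenting that the comment above mentions: they record which occurrences are expected to evolve together and which are not.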
        
         | camgunz wrote:
         | Love this, yeah. I've heard it phrased like "how do we answer
         | questions" so like, "how do we answer questions about a job's
         | status", "how do we answer questions about a user's bank
         | account balance". Either way, once you have those kind of
         | product requirements in place you can build to that spec, and
         | then start iterating as you gain more knowledge.
        
       | salmonellaeater wrote:
       | It helps a lot to only extract common code that has a clear
       | purpose. If I find myself naming a method
       | "doThisThingThisOtherThingAndLogSomeStuff" then it's not a good
       | refactor. On the other hand, "markStaleRows" or
       | "enforceTreeInvariants" have clear purposes that help future
       | engineers make intelligent decisions about how to use these
       | functions and how they should evolve. I want to build in to the
       | common code answers to the questions "should I call markStaleRows
       | or roll my own code?" and "should I make this code change in the
       | calling function or in markStaleRows?".
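As a sketch of what such a clearly named function might look like (written in Python, so mark_stale_rows rather than markStaleRows): the 30-day threshold and the row layout are assumptions, not details from the comment.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=30)  # assumed threshold, not from the thread


def mark_stale_rows(rows, now):
    """Set row['stale'] for every row, based on its 'updated_at' timestamp.

    This is meant to be the only place that knows what 'stale' means.
    """
    for row in rows:
        row["stale"] = now - row["updated_at"] > STALE_AFTER
    return rows
```

With a name and docstring like this, both questions have an answer: call it whenever rows need a staleness flag, and change it when the definition of stale changes.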
        
         | Ma8ee wrote:
         | And if you have a function that is called markStaleRows,
         | everyone better use it, because when I need to change how stale
         | rows are marked, I will change this function and nothing else.
         | I won't go through the whole codebase trying to find if someone
         | might have written some inline code somewhere else trying to do
         | the same thing.
        
       | __jem wrote:
       | Abstractions are addicting for many developers, including myself.
       | I switch between Go and Java. Go is the language I want my
       | coworkers to use. I'd rather read "bad" Go code than "bad" Java
       | code all day long. Bad Java can be truly excruciating to read and
       | review, particularly due to the poor choice of abstractions.
       | Whereas, Go mostly gets out of the way and may be written poorly
       | but is straightforwardly written poorly.
       | 
       | Still, there's a certain sense of aesthetic beauty that I just
       | can't derive from Go, and why I kind of hate working in it.
       | There's lots of things about Java and OO that I don't love, but
       | reading a perfectly factored Java code base can be just
       | beautiful. Mostly due to good choice of interfaces.
       | 
       | Now, those code bases might be rare and not worth the lift of a
       | million bad abstractions. I'd probably agree at this point, but
       | still, I find it odd that most Go code bases just feel dirty and
       | thrown together to me. Hacking stuff together in a mostly
       | procedural language with a good deployment story is probably the
       | right way to write for-profit code, but I'm not sure I love it.
        
       | Osiris wrote:
       | The way I like to work is first to write out all the code I need
       | to make something work correctly, then I go back over the code to
       | see if there's anything that could be simplified or split into
       | separate functions, etc.
       | 
       | I really like to see DRY code, but if you have to make a helper
       | function that takes a bunch of parameters with a bunch of
       | conditionals to do something slightly different, you might be better
       | off just sticking the specific logic you need in each place.
       | 
       | The worst case of copy-pasta I saw in a codebase I came into was
       | a function that was 1000 lines long, duplicated 3 times with < 10
       | lines of it different for each copy. That's a classic case for
       | DRY to be applied.
        
       | kazinator wrote:
       | Hey, you mean "DRY IA TO"! Don't repeat common phrases like "is
       | a" and "trade-off", damn it! Define an acronym and use that
       | instead.
       | 
       | DRY is literally impossible. If something has to be performed or
       | evaluated two or more times, and you factor that out under a
       | definition, you still have to invoke the definition multiple
       | times. I.e. you are still repeating yourself, just using an
       | abbreviation.
       | 
       | What you are doing is called "compression". Classic data
       | compression algorithms like LZ77 work by abbreviating.
       | 
       | "LZ77 algorithms achieve compression by replacing repeated
       | occurrences of data with references to a single copy of that data
       | existing earlier in the uncompressed data stream. " - Wikipedia
       | 
       | Outside of alcoholic drinks, that's the ultimate DRY.
       | 
       | Thus, the argument against DRY is obvious: it's a form of
       | compression, and excessive compression destroys readability: or
       | else we would all be able to read source code that has been put
       | through LZ77.
       | 
       | Only mild compression improves readability. Mild compression
       | improves readability largely because it's easier to see that two
       | brief invocations of a definition are exactly the same, than to
       | see that two repetitions of a code block are exactly the same.
       | When we see that two code blocks are exactly the same, we don't
       | have to understand them separately.
       | 
       | Basically, brainless repetition and verbosity hinder
       | readability, as does dense, thorough compression. One extreme
       | might be represented by reams of Java boilerplate; the other by
       | IOCCC entries.
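A small illustration of the compression analogy above, using Python's zlib module as a stand-in. zlib implements DEFLATE, which combines LZ77-style back-references with Huffman coding, so it is not pure LZ77. The sample text is an assumption.

```python
import zlib

# Highly repetitive "source code": the same line fifty times.
text = b"total = total + price * quantity\n" * 50
packed = zlib.compress(text)

print(len(text), len(packed))
# The packed form is much shorter, and unreadable without decompressing it.
```

The repetition is what makes the input compressible, and the output shows the other half of the argument: maximal deduplication is not something a person can read.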
        
       | BeetleB wrote:
       | Predictably, another article that doesn't know what DRY is,
       | leading to slaying of strawmen.
       | 
       | I believe DRY was coined in _The Pragmatic Programmer_, and probably
       | none of the examples in the article are instances of DRY.
       | 
       | DRY is about knowledge/requirements, not similar code. It is
       | about ensuring that a given _requirement_ is not duplicated in
       | multiple places in the code. It is _not_ about similar looking
       | code, which often involves differing requirements but just happens
       | to be coded similarly. The latter leads to coupling if you unify
       | it into one piece of code.
        
         | deathanatos wrote:
         | I've referred to this in the past as "semantic duplication"
         | (code that is the same by definition/requirement) vs.
         | "syntactic duplication" (code that just happens to do the same
         | thing today, but there is no requirement that requires both
         | copies to _remain_ the same).
        
         | Ensorceled wrote:
         | I sort of agree with you. A better title/approach would have
         | been "These aren't DRY, they're silly 'no common code'
         | bigotry". I find the article resonates because I've seen all of
         | these anti-patterns defended with "because DRY". I agree, it's
         | not DRY; but so many people get stuck on the "no code
         | duplication" part. I'm not sure if the "This isn't DRY" is the
         | best fight or "Sometimes DRY is not the best".
         | 
         | My favourite block of "DRY" code was a method that had a triple
         | nested loop (for all objects A, for all objects A.B, for all
         | objects B.C) with a bunch of flags (like 15 different bools,
         | ints, dates and arrays) that changed the ORM filters for A, for
         | A->B, for B->C, and then changed what operations were done on
         | A, B, C. Basically, at the end, the only similarity was the
         | foreach part. The comment on the block of code had "Keep this
         | loop together for DRY", as if they knew this was going wrong
         | but not sure why. It ended up being 3 or 4 much simpler
         | methods, based as you say, on requirements NOT code "shape".
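A sketch of the "split by requirement, not by shape" outcome described above. The data layout (accounts, projects, tasks) and both function names are hypothetical. Each function repeats part of the nested traversal, but each answers exactly one question and takes no flags.

```python
# Hypothetical data shape: accounts -> projects -> tasks.
def count_open_tasks(accounts):
    """Requirement 1: how many tasks are still open?"""
    return sum(
        1
        for account in accounts
        for project in account["projects"]
        for task in project["tasks"]
        if not task["done"]
    )


def archived_project_names(accounts):
    """Requirement 2: which projects are archived?"""
    return [
        project["name"]
        for account in accounts
        for project in account["projects"]
        if project["archived"]
    ]
```

The loops look alike, but a change to one requirement touches one function, which is the property the flag-driven version had lost.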
        
           | BeetleB wrote:
           | > I'm not sure if the "This isn't DRY" is the best fight or
           | "Sometimes DRY is not the best".
           | 
           | The problem is that when people believe this is DRY, they
           | then tend to oppose the "real" DRY as well.
           | 
           | Likely we'll have to give another name to the "real" DRY
           | principle. In general, I've always felt that catchy
           | names/acronyms are a bad idea for anything (e.g. free
           | software, open source, pro-life/choice, etc). Almost all of
           | them end up being used in ways that differ from the
           | original intent.
        
             | Ensorceled wrote:
             | I guess I have to agree, in the example I gave, I actually
             | had to fight with members of the team because they were
             | sure the new multiple function approach (one for user,
             | company and analytics) wasn't "DRY" in their eyes.
        
         | Ma8ee wrote:
         | Yes, thank you! Exactly what I was trying to express in another
         | comment. And if you do it the right way, none of the problems
         | with DRY usually brought up will be relevant. The whole notion
         | of looking at code and trying to spot similarities to find
         | abstractions is completely backwards.
        
       | chrisweekly wrote:
       | DRY -> AHA
       | 
       | ("Avoid Hasty Abstractions")
        
       | weavejester wrote:
       | Clearly we need a good acronym for "duplication is better than
       | the wrong abstraction".
        
         | bluetwo wrote:
         | "Abstraction extraction is an expensive transaction."
        
       | blackbear_ wrote:
       | You definitely shouldn't strive to DRY for everything. The
       | tricky part is to understand when duplicated code can be
       | abstracted away and when it is duplicated "by chance", i.e. the
       | code is the same but there is no real underlying abstraction.
       | 
       | The latter tends to happen for business logic: suddenly
       | requirements change and your beautiful "abstraction" falls like a
       | castle of cards.
        
       | gauravphoenix wrote:
       | So is WET (write everything twice). The right answer typically
       | lies somewhere between the two extremes.
        
       | camgunz wrote:
       | The problem isn't DRY, the problem is "helpers". Helpers are an
       | anti-pattern, they don't fit in your architecture, they have no
       | mental model, they're difficult (impossible) to name and
       | organize, and they're extremely resistant to refactoring.
       | Effectively they're spaghetti code.
       | 
       | The example I always come back to is auth. If you're doing the
       | same thing like "parse a cookie header, get the session, make a
       | DB connection, look up the session info, etc. etc.", consider how
       | you could architect the layers of your application using a mental
       | model that people would find easy to reason about. That might be
       | some OO, middleware, or even a macro, but the point is that it's
       | thought about, designed, engineered, and documented.
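A minimal sketch of the "designed layer" option mentioned above, written as a decorator that acts like middleware. The in-memory SESSIONS dict stands in for the database lookup, and every name here is an assumption for illustration.

```python
import functools

SESSIONS = {"abc123": {"user": "alice"}}  # stand-in for a session store


def parse_session_id(cookie_header):
    """Extract the 'session' value from a Cookie header, or None."""
    for part in cookie_header.split(";"):
        name, _, value = part.strip().partition("=")
        if name == "session":
            return value
    return None


def require_session(handler):
    """Auth layer: wrapped handlers only ever see an authenticated session."""
    @functools.wraps(handler)
    def wrapper(request):
        session = SESSIONS.get(parse_session_id(request.get("cookie", "")))
        if session is None:
            return {"status": 401}
        return handler(request, session)
    return wrapper


@require_session
def profile(request, session):
    return {"status": 200, "user": session["user"]}
```

The difference from a loose helper is the mental model: handlers sit behind the auth layer and never parse cookies themselves, so there is one documented place where authentication happens.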
       | 
       | The reason helpers are more prevalent than thoughtful
       | architecture is that humans are a lot better at prioritizing the
       | short term "I improved it" fix from factoring into helpers over
       | doing the long term work of architecture. If you want to change
       | this, it starts with cultural values that prioritize long term
       | sustainability.
        
       | indymike wrote:
       | A lot of programming maxims like "stay DRY" are rules of thumb,
       | and often are dangerous, or at least lead to unexpected results
       | when treated like laws of nature. I had a developer who drank
       | functional flavored Koolaid and refactored any single line of
       | code that appeared more than once in an application. Was about
       | 38K lines of code. When he was done, it was still about 38K lines
       | of code. Was it functionally pure? yes. Was it very difficult to
       | debug? Yes... you had to step into sometimes five or six
       | functions to get to a single line of logic.
        
         | bluetwo wrote:
         | I agree. I also think that when building a prototype,
         | flexibility is more important than stability. I allow myself to
         | repeat code when I think two similar things will be different
         | by the time requirements are better known. Later I'll usually
         | re-write in a more stable fashion.
        
       | recursivedoubts wrote:
       | _> Loss of locality_
       | 
       | Yes. I am trying to popularize the term Locality of Behavior
       | (LoB) to capture this software concept:
       | 
       | https://htmx.org/essays/locality-of-behaviour/
       | 
       | There are always trade-offs in development, and the older you get
       | the more you value clarity of code over other potential concerns,
       | and locality of behaviour is a big part of that.
       | 
       | I can't find the original source (it was somewhere on medium) but
       | this graph struck me:
       | 
       | https://ibb.co/qm4cLW7
        
       | thisiszilff wrote:
       | I think this misses the point of DRY a little bit. DRY isn't
       | about not copy pasting code, it's about ensuring that knowledge
       | isn't repeated. If two parts of the system need to know the same
       | thing (for example, who the currently logged in user is, or what
       | elasticsearch instance to send queries to, etc.), then there
       | should be a single way to "know" that fact. Put that way, DRY
       | violations are repetitions of knowledge and make the system more
       | complex because different parts know the same fact but in
       | different ways and you need to maintain all of them, understand
       | all of them, etc. etc.
       | 
       | Code blocks that look to be syntactically the same are the lowest
       | expression of "this might be the same piece of knowledge" insofar
       | as they express knowledge about "how to do X", but the key is
       | identifying the knowledge that is duplicated and working from
       | there. Sometimes it comes out that the "duplication" is something
       | like "this is a for loop iterating over the elements of this list
       | in this field in this object" and that is the kind of code block
       | that contains very little knowledge in terms of our system. But
       | supposing that that list had a special structure (ie, maybe we've
       | parsed text into tokens and have information about whitespace,
       | punctuation, etc in that list) and we start to notice we're
       | repeating code to iterate over elements of the list and ignore
       | the whitespace, punctuation elements in it, then we've got a
       | piece of knowledge worth DRYing out given that all the clients
       | now need to know what whitespace & punctuation look like even
       | when they'd like to filter them out.
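       | 
       | A minimal sketch of the token case above. The Token shape, the
       | kind names, and every function name here are hypothetical; the
       | point is only that one helper knows which tokens count as noise.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


# Hypothetical token model; the thread does not specify one.
@dataclass(frozen=True)
class Token:
    text: str
    kind: str  # assumed values: "word", "whitespace", "punctuation"


def content_tokens(tokens: Iterable[Token]) -> Iterator[Token]:
    """The single place that knows what whitespace and punctuation look like."""
    for token in tokens:
        if token.kind not in ("whitespace", "punctuation"):
            yield token


def word_count(tokens: Iterable[Token]) -> int:
    # Clients iterate over content tokens without re-stating the filter.
    return sum(1 for _ in content_tokens(tokens))


def longest_word(tokens: Iterable[Token]) -> str:
    return max((t.text for t in content_tokens(tokens)), key=len, default="")


tokens = [
    Token("Hi", "word"),
    Token(",", "punctuation"),
    Token(" ", "whitespace"),
    Token("world", "word"),
]
```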
       | 
       | It's worth pointing out that DRYing out something isn't
       | necessarily "abstracting", it is more like consolidating
       | knowledge into one place.
        
         | majormajor wrote:
         | I haven't seen this "don't repeat knowledge" take before, it's
         | pretty interesting. I see why you don't want various mutated
         | versions of the same information all over the place, but you
         | still have dangers.
         | 
         | Especially if you "overly reduce" your knowledge. If your
         | common recipe is "do A, B, C, D, E" and you reduce that to just
         | "do X," for instance.
         | 
         | I've seen this often turn into "now, instead of the knowledge
         | being repeated in several places, it's hidden in one place and
         | only one person knows it." Everybody else just relies on the
         | library doing its magic, and when someone needs to do something
         | differently, they have this huge mountain to climb to figure
         | out how to modify the code to also do "J" for certain cases
         | without breaking everyone else.
        
           | thisiszilff wrote:
           | There is definitely a spectrum of "knowledge" at play when it
           | comes to these considerations. The most obvious DRY
           | violations are those kinds of things that you go "oh I need
           | to test for this case" because that is usually an indication
           | of some knowledge you need to know when interacting with a
           | piece of code. EG, if you ever use -1 as a sentinel value
           | then the knowledge of "what -1 means" should be consolidated
           | together, otherwise all clients will have to know that -1 is
           | a sentinel, what it means and at best you'll have duplicate
           | code, at worst those interpretations won't align and you
           | might have a subtle bug where that -1 is doing something
           | somewhere (ie it is supposed to mean "No information
           | provided" but somewhere something is keeping an arithmetic
           | mean of this field and those -1s are now screwing up your
           | metrics and you don't really notice).
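           | 
           | A rough sketch of the -1 case, assuming a hypothetical integer
           | field called a "reading". One function owns the meaning of the
           | sentinel, so the arithmetic mean cannot silently include it.

```python
from typing import Iterable, Optional

# Hypothetical sentinel as stored in the raw data.
NO_INFORMATION = -1


def parse_reading(raw: int) -> Optional[int]:
    """The one place that knows -1 means "no information provided"."""
    return None if raw == NO_INFORMATION else raw


def average_reading(raw_values: Iterable[int]) -> Optional[float]:
    # Missing values are dropped before averaging instead of counted as -1.
    known = [v for v in map(parse_reading, raw_values) if v is not None]
    return sum(known) / len(known) if known else None


def naive_average(raw_values: Iterable[int]) -> float:
    # What a client that never learned about the sentinel would compute.
    values = list(raw_values)
    return sum(values) / len(values)
```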
           | 
           | When we think about the knowledge of "how to _do_ something"
           | that's where things can get confusing. 9/10 times I'd say
           | that the right move is to look for common assumptions or facts.
           | IE it isn't just "doing something" that is important, but the
           | assumptions made in the process of doing it:
           | 
           | As an example, consider finding the average word length in
           | some piece of text. We might start writing that feature like:
           | 
           |     def count_words(text: str) -> int:
           |         return len(text.split(' '))
           | 
           |     def average_word_length(text: str) -> float:
           |         num_words = count_words(text)
           |         word_lengths = []
           |         for word in text.split(' '):
           |             word_lengths.append(len(word))
           |         return sum(word_lengths) / num_words
           | 
           | then the piece of knowledge they share is "what a word is"
           | and the DRY refactoring would pull out that piece of
           | knowledge into its own function
           | 
           |     def words(text: str) -> List[str]:
           |         return text.split(' ')
           | 
           | that might be code you write when starting to write a feature
           | and that's the kind of "ding ding ding there's common
           | knowledge here" that should guide refactoring. The system has
           | a concept of a "word" that we've introduced and it's important
           | that knowledge about "what a word is" lives in one place. For DRY
           | things it frequently doesn't make any sense for there to be
           | multiple statements of "what a word is" where the system
           | wants to use the same concept.
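           | 
           | For completeness, a sketch of how the two callers might look
           | once they both go through words(). The float return type is my
           | assumption, since the division does not produce an int.

```python
from typing import List


def words(text: str) -> List[str]:
    # The single statement of "what a word is" (same split as above).
    return text.split(' ')


def count_words(text: str) -> int:
    return len(words(text))


def average_word_length(text: str) -> float:
    # Both the count and the lengths come from the same definition of a word.
    ws = words(text)
    return sum(len(w) for w in ws) / len(ws)
```

           | If "a word" later has to handle tabs or newlines, only
           | words() changes and both callers stay in agreement.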
           | 
           | Kind of orthogonal to this is abstraction where the focus is
           | on "usefulness" and that is where 100% you can abstract
           | incorrectly, prematurely, get screwed over by requirement
           | changes, write a library that hides everything and makes
           | people angry. The example you provide seems more like an
           | error in abstraction where things that should be close
           | together are too far apart in the system (ie, some "fact" is
           | hidden away and another part of the system wants to know it),
           | but the consolidation and DRYing of those facts, I'd argue,
           | is a lot easier once we've figured out how to identify them
        
             | majormajor wrote:
             | Yeah, I like this approach, because the "what is a word"
             | knowledge is a nice piece of common functionality that
             | doesn't make sense to repeat. It's unlikely to change for
             | just one of those two functions.
             | 
             | In my example, it's less a "core piece of knowledge" that
             | people are trying to DRY, and more just a "common
             | sequence." Someone sees a bunch of different places where
             | we have a sequence of calls like A, B, C, D.. and says "oh
             | this is a shared method I can extract" even if there's
             | plenty of ways that in the future you might want to do A,
             | B, C, E without D. And so then you pass in a bool, then
             | another one, and you have a centralized mess...
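             | 
             | A toy version of that drift, with made-up steps that only
             | record that they ran. run_sequence is the extracted shared
             | method after two flags; run_steps is one possible
             | alternative where each caller spells out its own sequence.

```python
from typing import Callable, List

Step = Callable[[List[str]], None]


# Hypothetical steps; each just appends its name to a log.
def step_a(log: List[str]) -> None:
    log.append("A")


def step_b(log: List[str]) -> None:
    log.append("B")


def step_c(log: List[str]) -> None:
    log.append("C")


def step_d(log: List[str]) -> None:
    log.append("D")


def step_e(log: List[str]) -> None:
    log.append("E")


def run_sequence(log: List[str], skip_d: bool = False, add_e: bool = False) -> None:
    """The shared method: every new variation becomes another flag."""
    step_a(log)
    step_b(log)
    step_c(log)
    if not skip_d:
        step_d(log)
    if add_e:
        step_e(log)


def run_steps(log: List[str], steps: List[Step]) -> None:
    """Alternative: callers list their own steps, so no flags are needed."""
    for step in steps:
        step(log)
```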
        
         | cogman10 wrote:
         | These things need to be balanced. I live in an ecosystem of DRY
         | gone amok and it's not pleasant.
         | 
         | There's a standard library to connect to databases. There's a
         | huge hierarchy setup just to start an app running.
         | 
         | All of these super dry infrastructure changes have,
         | unfortunately, come with a huge cost. We are still stuck on
         | ubuntu 14.04 because our super dry puppet framework we invented
         | can't be ported to puppet 6.
         | 
         | We are stuck talking to MS-SQL, because our super dry database
         | connection management library can't handle establishing other
         | database interactions.
         | 
         | We are still stuck on Tomcat 7 because our super dry Jersey
         | libraries don't work with newer versions of Jersey (which has
         | locked us into older versions of tomcat!).
         | 
         | Consolidation is a decent goal, but it really needs to be
         | measured. For me, it is FAR more important to consolidate on
         | the how to do things and not the what does things. In
         | other words, rather than making an "elasticsearch connection
         | library" specify "This environment variable is the
         | elasticsearch host/credentials" and let the apps move from
         | there.
         | 
         | That's because, when it comes right down to it, configuration
         | code is super easy to write and it really doesn't matter if
         | it's duplicated. You want your libraries consolidating
         | knowledge to be for things that are easy to get wrong (such as
         | checking who is currently logged in or how to authenticate).
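         | 
         | A sketch of the convention-over-library idea. The variable
         | name ELASTICSEARCH_HOST and the helper are assumptions made
         | for illustration; the only shared knowledge is where the
         | host lives, and each app builds its own client from it.

```python
import os
from typing import Mapping

# Hypothetical variable name; the thread does not specify one.
ES_HOST_VAR = "ELASTICSEARCH_HOST"


def elasticsearch_host(env: Mapping[str, str] = os.environ) -> str:
    """Read the host by convention and fail loudly if it is missing."""
    try:
        return env[ES_HOST_VAR]
    except KeyError:
        raise RuntimeError(f"{ES_HOST_VAR} is not set") from None
```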
        
           | thisiszilff wrote:
           | > Consolidation is a decent goal, but it really needs to be
           | measured. For me, it is FAR more important to consolidate on
           | the how to do things and not the what does things. In
           | other words, rather than making an "elasticsearch connection
           | library" specify "This environment variable is the
           | elasticsearch host/credentials" and let the apps move from
           | there.
           | 
           | I think we're in agreement here. Config is the most basic
           | kind of knowledge because when something wants to know about
           | the elastic credentials, it almost never makes sense to have
           | it in two places if those two places are supposed to be the
           | same thing.
           | 
           | How to actually connect to elastic -- that's the part that is
           | more iffy. If there is some knowledge we've added there, then
           | it makes sense to DRY it up, but the knowledge of "this is
           | how you pass credentials to this elastic search client" isn't
           | the kind of system knowledge we care about. If, for example,
           | there were some kind of parameters that we had to set on each
           | connection and we claimed it as a piece of knowledge that all
           | of our connections to this service are of this specific TYPE
           | and have these specific parameters, then we've started to add
           | some additional systemic knowledge that might need to get
           | consolidated. If someone were to start working on a piece of
           | code and I feel the need to tell them "Don't forget about X"
           | then that is the kind of situation where DRY comes into play.
           | If it's just a vanilla connection to a database and we don't
           | care about the connections made, then I don't think we
           | have a violation of DRY given that there isn't an important
           | piece of knowledge that's repeated.
           | 
           | At some point, especially when we pay too much attention to
           | copy-pasted code, we end up abstracting. Abstracting is hard,
           | more general, very difficult to do right, almost always done
           | too early. DRYing out knowledge is easier and almost always
           | improves things.
        
         | citrin_ru wrote:
         | IMHO it is not the author who misses the point of DRY, but
         | countless developers who make code less readable only to reduce
         | visible repetition or to avoid copy-n-paste. Maybe DRY is just
         | a bad name.
        
           | thisiszilff wrote:
           | Yeah, I'd agree. When the principle was introduced it was
           | stated as:
           | 
           | > The DRY principle is stated as "Every piece of knowledge
           | must have a single, unambiguous, authoritative representation
           | within a system"
           | 
           | (from wikipedia)
           | 
           | It feels like the name really took over the intention and it
           | became about code repetition instead of knowledge repetition.
        
             | gen220 wrote:
             | I agree that the name took over. The intention sounds
             | synonymous with bounded contexts of DDD.
             | 
             | I find the vocabulary of DDD to have more explanatory
             | power. Especially with people who don't grok the difference
             | between removing repetition and consolidating models.
             | 
             | I think repetition is a symptom that a code base may be
             | afflicted with interwoven domains, but the existence of
             | repetition is not sufficient for the diagnosis, IMO.
        
           | Tyr42 wrote:
           | One Source of Truth or OSoT doesn't sound as nice as DRY
           | though.
        
         | [deleted]
        
         | ThrustVectoring wrote:
         | > ensuring that knowledge isn't repeated
         | 
         | The most fun bug I've encountered as a web developer is of this
         | category. Two pages, both check for a logged-in user and
         | redirect to the other if found or not found, respectively. The
         | bug was a subtle difference in how these were calculated, the
         | details of which are unfortunately lost to the sands of time.
         | The end result was that if you sat on one of the pages and
         | waited for your user session to time out, you'd get stuck in a
         | redirect loop between the "logged in" and "please log in"
         | versions of the page.
         | 
         | Anyhow, the point of this is that when you calculate the same
         | fact two different ways, you will occasionally build something
         | that makes an unwarranted assumption that because it's the
         | "same fact" you wind up with the same answer. This is an entire
         | category of easily missed and often subtle bugs.
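         | 
         | The original details are lost, so this is only a
         | hypothetical reconstruction of the shape of such a bug. The
         | Session fields and page names are invented: one page checks
         | expiry, the other does not, and an expired session bounces
         | between them until both use the same is_logged_in().

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical session shape; not the real application's model.
@dataclass
class Session:
    user_id: Optional[int]
    expires_at: float


def members_page_next(session: Session, now: float) -> Optional[str]:
    # This page's private rule: a user id plus an unexpired session.
    logged_in = session.user_id is not None and now < session.expires_at
    return None if logged_in else "login"


def login_page_next(session: Session, now: float) -> Optional[str]:
    # This page's private rule: a user id alone is enough.
    logged_in = session.user_id is not None
    return "members" if logged_in else None


def is_logged_in(session: Session, now: float) -> bool:
    """One definition of the fact, shared by both pages."""
    return session.user_id is not None and now < session.expires_at


def members_page_next_shared(session: Session, now: float) -> Optional[str]:
    return None if is_logged_in(session, now) else "login"


def login_page_next_shared(session: Session, now: float) -> Optional[str]:
    return "members" if is_logged_in(session, now) else None


# A session that timed out while the user sat on the page.
expired = Session(user_id=42, expires_at=100.0)
```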
        
       | pydry wrote:
       | It's not just DRY. _Every_ axiomatic or semi-axiomatic principle
       | of software development ends up being a trade off. Good code lies
       | at a local minimum where multiple competing concerns are all
       | balanced against one another.
       | 
       | At least, I've yet to see one which isn't.
        
         | kens wrote:
         | I agree about the importance of tradeoffs. Looking at the
         | historical perspective, though, the reason that every principle
         | is a tradeoff is that the principles that are uniformly worse
         | get discarded.
         | 
         | For instance, structured programming (building code out of
         | blocks with structured control flow rather than a pile of
         | gotos) was victorious in the 1970s. Nowadays nobody considers
         | the tradeoffs of using if-then-else versus gotos; structured
         | code is the automatic choice.
         | 
         | Self-modifying code was very popular in the 1950s (since it was
         | the only way to get many things done), but essentially nobody
         | uses it now.
         | 
         | Modularity is another victorious principle of software
         | development, winning out over big blobs of code with global
         | variables.
         | 
         | Using a stack for subroutine calls used to have tradeoffs, but
         | now nobody would consider an alternative.
         | 
         | Looking at the long perspective, there is real progress in
         | software development (although slower than I'd hope).
        
       ___________________________________________________________________
       (page generated 2020-12-18 23:01 UTC)