[HN Gopher] Deming's Red Bead Experiment (2002)
       ___________________________________________________________________
        
       Deming's Red Bead Experiment (2002)
        
       Author : gkop
       Score  : 111 points
       Date   : 2022-02-04 15:40 UTC (7 hours ago)
        
 (HTM) web link (maaw.info)
 (TXT) w3m dump (maaw.info)
        
       | dang wrote:
       | A related video from around 1994:
       | http://www.youtube.com/watch?v=C5Io2WweTxQ
       | 
       | (via https://news.ycombinator.com/item?id=5193898, but no
       | comments there)
        
         | cpach wrote:
         | If anyone wants to know more about Deming, I can warmly
         | recommend this blog post by Avery Pennarun:
         | https://apenwarr.ca/log/20161226
        
           | hitekker wrote:
           | I'd be careful about taking that particular author's
           | interpretation of management literature at face value.
           | 
           | His understanding of _High Output Management_ (a seminal book
           | by the CEO of Intel) was so flawed that the CEO of Dropbox
           | had to correct him.
           | 
           | https://news.ycombinator.com/item?id=21088425
        
             | ignoramous wrote:
             | Drew, fwiw, pointed out that Avery Pennarun (who is,
             | frankly, phenomenal at distilling ideas in a given context)
             | was right about a bunch of things except the TLDR.
             | 
             | Though Avery does hint that "output" itself is a function
             | of values/principles execs ought to instill in their org:
             | 
             | > _What executives need to do is come up with
             | organizational values that indirectly result in the
             | strategy they want._
             | 
             | > _That is, if your company makes widgets and one of your
             | values is customer satisfaction, you will probably end up
             | with better widgets of the right sort for your existing
             | customers. If one of your values is to be environmentally
             | friendly, your widget factories will probably pollute less
             | but cost more. If one of your values is to make the tools
             | that run faster and smoother, your employees will probably
             | make less bloatware and you'll probably hire different
             | employees than if your values are to scale fast and capture
             | the most customers in the shortest time._
             | 
             | It remains to be seen if Avery ends up building a larger
             | company than Drew. I'm willing to bet all of $100 in my
             | depleting bank account that they will.
        
               | hitekker wrote:
               | Drew Houston disagrees with you and Avery. Not just the
               | TL;DR but in the critical details.
               | 
               | > Contrary to what the post suggests, HOM does not say
               | that the job of an executive is to wave some kind of
               | magic culture or "values" wand and rubber-stamp whatever
               | emergent strategy and behavior results from that. CEOs
               | and executives absolutely do (and must) make important
               | decisions of all kinds, break ties, and set general
               | direction.
               | 
               | Notwithstanding that Drew is a billionaire who founded a
               | billion dollar company and Tailscale has yet to crack a
               | large valuation, Avery basically misinterpreted Andy
               | Grove. That misinterpretation, "CEO as a passive referee
               | whose job is to set the culture", is only sensible to
               | people who've never managed a large group of people.
        
       | Tomte wrote:
       | The other well-known experiment Deming used is the funnel:
       | https://www.2uo.de/deming/#the-funnel-experiment
        
       | jeffreyrogers wrote:
       | Interesting experiment. I don't think this applies to knowledge
       | work in the same way it does to manufacturing.
        
         | kqr wrote:
         | You're right. It doesn't. But it's a matter of degree, not
         | kind. Knowledge work has many times the variability of
         | manufacturing (by design: if you remove variability from
         | knowledge work, you're no longer producing anything new each
         | time.)
         | 
         | In other words, this applies even more severely to knowledge
         | work.
         | 
         | More concretely: in manufacturing you can have a process that
         | yields 9-14 % defective or whatever. The variation is
         | relatively small; say a CV of 10 %. In knowledge work, you'll
         | be looking at processes that generate somewhere between 0.1 and
         | 100 defective ideas for every really good idea. This variation
         | is enormous: 1000 % or so.
        
         | com2kid wrote:
         | Software engineer is put on a team that has more legacy code.
         | If management judges by # of incidents, they are
         | underperforming.
         | 
         | Heck I can tell you from experience that if you want to get
         | promoted fast, new product teams are the way to go. You get to
         | file lots of patents, architect huge new systems, and look like
         | a rock star.
         | 
         | Another example: Partner teams upstream keep pushing breaking
         | API changes, downstream teams look bad because their services
         | are the ones having the outage. You do your due diligence, your
         | code is defect free, well tested. Doesn't matter, you are
         | spending half your day putting out fires caused by someone
         | else. Meanwhile another co-worker starts on a team where their
         | upstream services are written to be robust against bad incoming
         | data and have APIs that maintain back compat. Your co-worker
         | puts out buggy poorly tested code, but the upstream services
         | are robust enough that everything keeps chugging along.
         | 
         | Management doesn't see any of this. They just see your team has
         | poor performance, and this other team has great performance.
         | Heck maybe that other team has a higher "velocity" because they
         | can turn out features faster.
        
         | riskable wrote:
         | In knowledge work the skill of the worker matters vastly more
         | because "the process" mostly takes place in their head. You can
         | optimize a workspace and tools for productivity and the
         | reduction of errors but ultimately that has a minimal impact on
         | "the process" that's taking place in the knowledge worker's
         | mind.
         | 
         | If the process was the problem (or perfect), adding (or
         | replacing) a worker would have minimal impact, but we know this
         | is not true. You could have the best documentation, training,
         | and absolutely stellar code yet one person can turn everything
         | to shit quite quickly! The opposite is true as well: Bringing
         | on a fantastic new worker can make your existing team look like
         | a bunch of inefficient laggards.
         | 
         | Neither of these situations can be fixed by improving processes
         | (maybe hiring processes? Though I doubt it). It'd be like
         | having one magic blue bean in the box that--if found--can
         | either drastically improve or degrade the final productivity by
         | 90%. Would the optimum process improvement then be to try to
         | eliminate magic beans entirely? Sure seems like it (i.e. hire
         | the lowest common denominator and don't try to optimize for the
         | 1%). That way you reduce the likelihood of taking on the "bad
         | 1%"--even though it reduces your chances of obtaining the
         | perfect magic bean.
        
       | kqr wrote:
       | If you are in any sort of leadership position -- either as a
       | formal manager or because you have the respect of your peers --
       | I strongly urge you to read Deming.
       | 
       | There are few authors that have taught me so much about people,
       | motivation, systems, quality, statistics, what high-leverage
       | effort looks like, and so on.
       | 
       | I first picked up a book by Deming a few years ago, and not a
       | single day has passed that I have not had use for what he taught
       | me through his writing.
       | 
       | The things he says are only becoming more and more relevant with
       | every year. I honestly think it ought to be compulsory reading in
       | school. The world would be a much better place that way; kinder,
       | more efficient, and less superstitious.
        
         | Litost wrote:
         | Thanks for the suggestion. How does what he says stack up
         | against those who came after? I ask because one of my favourite
         | management talks is by Russell Ackoff [1], and he mentions Dr
         | Deming, so I assume he was influenced by, worked with, or
         | studied with him. Given their relative ages, I wondered whether
         | his work might be valuable to start with.
         | 
         | [1] https://www.youtube.com/watch?v=OqEeIG8aPPk
        
         | marbex7 wrote:
         | Which book?
        
           | kqr wrote:
           | I started with The New Economics, then read Out of the
           | Crisis, and finally A Theory of Sampling or whatever its name
           | is.
        
             | marbex7 wrote:
             | Thanks.
        
       | hencq wrote:
       | There's a brilliant little book Four Days with Dr. Deming[0] that
       | goes over the red bead experiment among other things. It
       | basically follows the format of a four day seminar that Dr.
       | Deming used to do. It's full of wisdom like this and it does a
       | painfully good job making you recognize all the ineffective
       | things still going on in companies today.
       | 
       | [0]
       | https://www.goodreads.com/en/book/show/34987.Four_Days_with_...
        
       | buescher wrote:
       | Years ago I found a discussion of this on the web that involved a
       | deep dive into optimal strategies for getting white beads,
       | variations in paddle construction, root cause analysis on bead
       | size and weight and hole depth in the paddles, and so on. It was
       | a six sigma nightmare come to life and missed the point so
       | profoundly I wish I could find it again to use as an example of
       | how easily Deming is misunderstood.
       | 
       | Related: "A bad system beats a good person any time" does not
       | mean "having any system, no matter how bad, is better than having
       | even the best people and no apparent system".
        
         | AnIdiotOnTheNet wrote:
         | Wait a bit and HN will probably provide you a similar
         | discussion.
        
         | krallja wrote:
         | "beats" in the sense of "the beatings will continue until
         | morale improves," right?
        
           | buescher wrote:
           | It certainly isn't positive.
        
         | hencq wrote:
         | Oh boy, I'd love to see that too. Unfortunately it's all too
         | common to see this stuff in reality as well.
         | 
         | > Related: "A bad system beats a good person any time" does not
         | mean "having any system, no matter how bad, is better than
         | having even the best people and no apparent system".
         | 
         | I'm a big fan of sociotechnical systems [0] where the motto is
         | to give people complex jobs in simple organizations.
         | Unfortunately in practice you usually see the tendency to do
         | exactly the opposite.
         | 
         | [0] https://en.wikipedia.org/wiki/Sociotechnical_system
        
       | vintermann wrote:
       | Deming made me realize that there is actually management
       | literature out there that isn't just fads and slogans.
        
         | openknot wrote:
         | Would there be any other similar recommendations to Deming's
         | books? I would think that Eliyahu M. Goldratt's books sound
         | similar (specifically, the "Theory of Constraints,"
         | alternatively presented through a fictional story in "The
         | Goal").
        
           | mark_undoio wrote:
           | I'm a fan of Womack & Jones's "Lean Thinking". This is all
           | about Lean manufacturing, which I think is partly rooted in
           | Deming's work. The focus is more on how to optimise the
           | overall system than the management of individuals.
           | 
           | The content of that book isn't directly applicable to e.g.
           | software companies but if you think a bit you can see quite a
           | lot of analogous situations (e.g. warehoused inventory is
           | incomplete projects or not-yet-shipped code, "monuments"
           | could be inappropriate central test / build systems, etc).
        
             | kqr wrote:
             | If you want someone to translate it to software for you,
             | Reinertsen's Principles of Product Development Flow is
             | about adapting the philosophy to knowledge work.
             | 
             | Ward's Lean Product and Process Development is also a good
             | take on those ideas.
        
           | m104 wrote:
           | I recommend Russell Ackoff's writings as somewhat related and
           | more to do with how systems of people and processes work (or
           | don't). Here's a great place to start:
           | https://thesystemsthinker.com/a-lifetime-of-systems-thinking...
        
       | larrydag wrote:
       | The key to doing this experiment well is having the right
       | facilitator that brings the attitude. A good facilitator will
       | roleplay a leader/manager/exec that will praise when measures are
       | good and berate when measures are bad. The idea of this
       | experiment is to show how management can harm the process even
       | when there is inherent variability, good or bad.
       | 
       | Here is Dr. Deming himself performing the experiment
       | https://www.youtube.com/watch?v=7pXu0qxtWPg
        
         | Rickasaurus wrote:
         | Thanks for sharing this, it's amazing to see it in action.
        
         | laserlight wrote:
         | What an amazing demonstration. It's unfortunate that these
         | lessons haven't been learned decades later.
        
       | curiouscats wrote:
       | The W. Edwards Deming Institute Blog https://deming.org/blog/
       | 
       | Deming on various management topics
       | https://deming.org/category/deming-on-management/
       | 
       | More resources on Deming's ideas
       | https://deming.org/online-resources-on-w-edwards-demings-man...
        
       | pakitan wrote:
       | I don't get it. This "experiment" could have been replicated by a
       | simple computer simulation, given that worker output is entirely
       | random. The supposed moral of the story is that system design
       | defines outcome, not individual performance, but how does that
       | even count as "science" when you don't have control and
       | experimental groups? He designed a system with inherent flaws and,
       | surprise, it has flaws. We can see there is variance in
       | "productivity" but we have no idea how this same variance would
       | have affected output if workers actually had agency.
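
The remark above that the exercise "could have been replicated by a simple
computer simulation" can be sketched in a few lines. This is only a minimal
sketch, not Deming's procedure: the bead counts (3200 white, 800 red), the
50-hole paddle, six workers and four days are assumptions commonly quoted for
the demonstration rather than figures given in this thread, and
`run_red_bead_experiment` is a hypothetical name.

```python
import random


def run_red_bead_experiment(workers=6, days=4, paddle_holes=50,
                            white=3200, red=800, seed=0):
    """Return one list of daily red-bead counts per worker.

    All sizes are assumed values for illustration. Workers have no
    influence on the result: every count is a random draw from the box.
    """
    rng = random.Random(seed)
    box = [True] * red + [False] * white  # True marks a red bead
    results = []
    for _ in range(workers):
        # Each day the worker dips the paddle into the full box once.
        daily = [sum(rng.sample(box, paddle_holes)) for _ in range(days)]
        results.append(daily)
    return results


results = run_red_bead_experiment()
for number, daily in enumerate(results, start=1):
    print(f"worker {number}: daily red beads {daily}, total {sum(daily)}")
```

Ranking the printed totals produces a "best" and a "worst" worker even though
the code contains nothing that distinguishes one worker from another.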
        
         | lupire wrote:
         | It's a demo experiment, like when the physics teacher swings a
         | heavy pendulum at their own nose, or shoots a BB gun at a
         | falling toy.
        
         | function_seven wrote:
         | That's the point.
         | 
         | So first, not all science requires an RCT. Dividing
         | experimental subjects into study and control groups is one way
         | of doing science. It's not the only way.
         | 
         | In this case, this is a concrete demonstration of just how much
         | variance can emerge from a "statistically neutral" process. The
         | systemic flaws are part of the demonstration. What appear at
         | first glance to be identical tools, inputs, and processes are
         | in fact subtly different. The demo shows management types that
         | their charts and graphs cannot always be relied upon to
         | differentiate performance levels among staff. The system itself
         | must also be scrutinized. If Bob's ad campaigns are
         | outperforming Alice's by 20% in the first quarter, it doesn't
         | necessarily mean Bob is a marketing genius and Alice needs a
         | PIP.
         | 
         | A computer simulation would not have nearly as powerful an
         | effect on most people as a live demonstration using real beads.
         | And the imperfections in the paddles are something that
         | naturally arises when they're physically made, but would have
         | to be tuned by the programmer building the simulation, which
         | would lead to questions about "just how did they decide what
         | variances would come into play?"
        
           | jiggawatts wrote:
           | A real world example is that I do programming on a high-end
           | workstation laptop. My coworkers use old budget laptops.
           | 
           | This is not in their control -- they're victims of corporate
           | policy.
           | 
           | Does it influence quality in complex and hard to quantify
           | ways?
           | 
           | Most assuredly...
        
           | pakitan wrote:
           | I think I get it now. The point of the experiment is to ELI5
           | the concept of variance to management types who skipped
           | statistics classes :) Could be useful for some bosses I had
           | :)
        
             | function_seven wrote:
             | So I'm watching a video of this right now[0], and it's even
             | more enlightening than I figured it would be! Deming makes
             | comments throughout the demonstration that I swear I've
             | heard in the real world. For example, one worker--whose
             | previous results put him on probation (he had 12 red
             | beads)--managed to have only 6 the next day. "Looks like
             | probation worked".
             | 
             | Meanwhile another worker--previously scoring 5, and getting
             | a merit-based raise from it--did poorly with 12. The
             | remark: "That raise went to his head. He's getting lazy".
             | 
             | So yeah, the value of this is in the actual doing of it.
             | 
             | [0] https://www.youtube.com/watch?v=7pXu0qxtWPg
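
The "probation worked" and "raise went to his head" remarks describe what
regression to the mean looks like from the manager's chair. A rough sketch,
under stated assumptions: each of 50 paddle holes independently holds a red
bead with probability 0.2 (a binomial approximation, not the physical box),
the cut-offs of 12 and 5 red beads are borrowed from the anecdote above, and
the function names are hypothetical.

```python
import random
from statistics import mean


def two_day_counts(workers=2000, paddle_holes=50, red_fraction=0.2, seed=0):
    """Return (day1, day2) red-bead counts per worker, driven by chance only."""
    rng = random.Random(seed)

    def draw():
        # Assumed model: every hole is red with probability red_fraction,
        # regardless of who is holding the paddle.
        return sum(rng.random() < red_fraction for _ in range(paddle_holes))

    return [(draw(), draw()) for _ in range(workers)]


def summarize(pairs, probation_at=12, raise_at=5):
    """Average both days for workers 'on probation' and workers 'given a raise'."""
    probation = [p for p in pairs if p[0] >= probation_at]
    raised = [p for p in pairs if p[0] <= raise_at]
    return {
        "probation_day1": mean(d1 for d1, _ in probation),
        "probation_day2": mean(d2 for _, d2 in probation),
        "raise_day1": mean(d1 for d1, _ in raised),
        "raise_day2": mean(d2 for _, d2 in raised),
    }


summary = summarize(two_day_counts())
for label, value in summary.items():
    print(f"{label}: {value:.2f}")
```

Because day 2 is independent of day 1 in this model, both groups drift back
toward the overall average on the second day, which is easy to misread as the
probation or the raise having had an effect.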
        
         | advisedwang wrote:
         | This experiment is something he did in classes etc so people
         | could _experience_ the obvious idiocy in trying to manage
         | individuals for system behaviour. His point is that ALL actual
         | work is also dominated by system behaviour, just more subtly,
         | and managers must focus on the systems and not worker
         | performance.
        
           | pakitan wrote:
           | > His point is that ALL actual work is also *dominated* by
           | system behaviour
           | 
           | If that's his point, it seems obviously wrong. Some work,
           | like in the experiment is dominated by system behavior. For
           | others, system would play a much smaller role. For example, 2
           | people cranking code in a startup. No matter what system you
           | apply, if they are not good programmers, nothing of value
           | will come out.
        
             | kqr wrote:
             | That's also missing the point somewhat. Put the best two
             | programmers in the world in a shitty system that rewards
             | them for the wrong things and they will produce garbage.
             | Put mediocre programmers in a fantastic system that brings
             | out the best in their collaboration and you might actually
             | get to market sooner and better than the other group.
        
             | salawat wrote:
             | Something will most certainly come out. You're just not
             | defining the system. If I define the system as "one
             | programmer is responsible for looking at the desired
             | product and writing specifications" and the other
             | programmer is to translate specification into programming
             | code, and never shall one do the other's job, I assure you,
             | the best programmers in the world will produce shit over
             | time.
             | 
             | Randomly swap in two new actors with different life
             | experiences into the same spots to do the same work, and
             | you'll still get shit. If in the unlikely event, you get
             | amazing work, it's not that the people doing it were
             | special; it's just another outlier in the data stream. Add
             | in the emotional toll of working as hard as possible to
             | succeed but never being able to meet prescribed quality
             | levels?
             | 
             | A system is perfectly tuned to produce the results it does.
             | Want different results? Change the system. That is Deming's
             | point. We have a tendency to blame variance in a system on
             | the human actors immediately proximal, instead of paying
             | attention to the actual significant constraints. This is an
             | important lesson to management types, as they are to
             | process/system what a programmer is to a computer.
             | 
             | The planners cast the dice for downstream long before
             | downstream can do anything about it, and in many corporate
             | setups, top down works just fine, but bottom up never gets
             | any attention.
        
       | pierrebai wrote:
       | I find the experiment skewed. Or more precisely, that it is not
       | meant to investigate human behaviour or psychology. It is rather
       | precisely designed to support a chosen result to support a given
       | world view. The fact that it has been run for 50 years is a
       | strong indication of this.
       | 
       | IOW, the experimenter wanted to be able to arrive at the
       | conclusion that difference in performance was unrelated to
       | workers and designed the experiment so it would give this result.
       | In short, this demonstrates few things outside of a very
       | artificially set-up situation, where the workers have no say and
       | the job is predestined to fail.
       | 
       | Anyone who worked anywhere knows very well that there are
       | actually vast differences between two workers.
        
         | cool_dude85 wrote:
         | >IOW, the experimenter wanted to be able to arrive at the
         | conclusion that difference in performance was unrelated to
         | workers and designed the experiment so it would give this
         | result.
         | 
         | That's the whole point. The experiment is not that we're
         | supposed to be surprised that the workers did not affect
         | performance - in fact, that's the subtext of the whole thing!
         | We know it from the start cause he explains exactly how the
         | process works and we can all see that individuals cannot affect
         | their output.
         | 
         | The point is, if we are unaware that we're in such a situation,
         | we can still find metrics to allow us to rank workers, fire low
         | performers, give out raises, etc. When we myopically focus on
         | such metrics, and disregard the system that makes them
         | worthless, we're making all our decisions on random chance,
         | even though we have a clear process, data collection, the whole
         | thing.
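
One common way to check whether a set of counts shows anything beyond chance
is a control chart. The sketch below uses the standard np-chart limits (centre
line n*p_bar, plus or minus three times sqrt(n*p_bar*(1 - p_bar))). The red-bead
counts are made-up illustrative data, the 50-hole paddle is an assumption, and
the function names are hypothetical.

```python
from math import sqrt


def np_chart_limits(red_counts, paddle_holes=50):
    """Return (lower, centre, upper) three-sigma limits for an np chart."""
    p_bar = sum(red_counts) / (len(red_counts) * paddle_holes)
    centre = paddle_holes * p_bar
    sigma = sqrt(paddle_holes * p_bar * (1 - p_bar))
    lower = max(0.0, centre - 3 * sigma)  # a count cannot go below zero
    upper = centre + 3 * sigma
    return lower, centre, upper


def points_outside_limits(red_counts, paddle_holes=50):
    """Return the counts that fall outside the control limits."""
    lower, _, upper = np_chart_limits(red_counts, paddle_holes)
    return [c for c in red_counts if c < lower or c > upper]


# Hypothetical data: 6 workers x 4 days of red-bead counts.
counts = [9, 5, 15, 4, 10, 11, 6, 12, 8, 13, 9, 7,
          10, 12, 11, 9, 14, 8, 10, 6, 11, 9, 12, 7]

lower, centre, upper = np_chart_limits(counts)
print(f"limits: {lower:.2f} .. {upper:.2f}, centre {centre:.2f}")
print("points outside limits:", points_outside_limits(counts))
```

When no point falls outside the limits, the spread between the "best" and
"worst" counts is consistent with the system alone, so ranking individuals on
those numbers amounts to ranking them on chance.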
        
           | pierrebai wrote:
           | That's also my whole point: this is not an experiment but an
           | elaborate artificial argument designed to prove a point of
           | view decided in advance. That is why I find it unsavory.
        
             | Jtsummers wrote:
             | The point of the experiment is to be extreme, but after
             | reading (a very large portion of) Deming's work, I don't
             | think he'd disagree with your initial assertion that there
             | are differences between workers.
             | 
             | The broader points he makes, related to this experiment at
             | least: There are individual and systemic issues that
             | influence the outcome of a process. The actual ratio will
             | vary depending on what kinds of processes are involved.
             | 
             | If the job is to be a literal screw turner on an assembly
             | line, then there is relatively little difference between
             | the majority of people (assuming they are generally able
             | bodied, sighted, and have decent coordination), the
             | _system_ (tempo, length of shift, accessibility of the
             | thing being screwed together, tools being used) will have a
             | much larger impact than the individual's skill. The system
             | of the assembly line will influence the outcome more than
             | the individual's skill (at least above a basic threshold, a
             | supremely uncoordinated individual could flounder even with
             | the slowest pace of work). Switch to more skilled work and
             | you will find, increasingly, more differences in outcome
             | based on individual performance versus the system of the
             | work, but even there the system matters.
             | 
             | Look at software development offices that still favor
             | things like manual build processes, code versioning
             | control, testing, and deployment over automation. They
             | provide many opportunities for human error (even just
             | simple miskeying of data) that can reduce everyone's
             | effectiveness no matter how skilled. (Fortunately these
             | kinds of places are increasingly rare, at least outside of
             | US defense contractors.)
             | 
             | The experiment, then, is an artificial construct (like most
             | classroom experiments) meant to illustrate a point by
             | showing one extreme. This acts as a counterpoint to the
             | more conventional wisdom that the individual, and not the
             | system, is what actually matters for the outcome. The
             | conventional wisdom, of course, being wrong in many
             | circumstances since it tends to place too strong a weight
             | on the individual performance and too weak a weight on the
             | system.
             | 
             | It would be unsavory if he had said, "See, stop evaluating
             | individuals; their contribution doesn't matter." But he
             | never did say that (in anything I read, at least), and
             | anyone who looks at this experiment and draws that
             | conclusion would be an idiot.
        
       | emeraldd wrote:
       | I wonder what the limits of this are? From a naive point of view
       | there has to be a point where training/skill/physical
       | endurance/etc. come into play. The bead experiment seems to fit a
       | fixed rate, assembly line style of work. While I would agree that
       | numeric/performance ranking is mostly meaningless, everyone knows
       | that one somebody you go to when no one else can fix a problem.
        
         | IggleSniggle wrote:
         | I see what you mean, but I also think that's encapsulated in
         | the idea of "ready willing workers."
         | 
         | Obviously there are differences between people, and better and
         | worse teams. But the lesson here is about how the environment
         | factors in, and how management can accidentally arbitrarily
         | suppress innovation or reward luck within normal bounds of
         | success. Or hamper themselves to failure by insisting on a
         | broken process.
         | 
         | Could it be the case that "everybody goes to Jim," and as a
         | result, Jim gets good at helping people? Could it be that if
         | everybody just went to Kim for 2 weeks, that her fixes might
         | turn out to be a better yet completely orthogonal method of
         | solving the problem?
         | 
         | The Red Bead experiment is an antidote to rigid process and the
         | praise/blame game as based on inspection of results. It's a
         | story intended for management to hear, not an absolution or
         | dismissal of personal responsibility.
         | 
         | If you've hired "ready willing workers," then looking at the
         | results doesn't necessarily show you who was killing it and who
         | wasn't.
         | 
         | That worker who is always "killing it" may be good at scooping
         | up projects that always look great. That worker who is always
         | underperforming might be maintaining essential infrastructure
         | without which the system would fall apart.
         | 
         | The worker who's killing it may be doing so by spending all
         | their time "buttering up" a customer. The worker who appears
         | underperforming may appear so because they spend all their time
         | "buttering up" a customer, but someone else always lands the
         | sale.
         | 
         | It's a meditation on imperfect knowledge.
        
         | kqr wrote:
         | As you have observed already, this experiment is set up
         | specifically to eliminate the effect of training/skill/physical
         | endurance etc, and YET when it's performed in real life with a
         | good facilitator, people who are unlucky start to feel like
         | they're underperforming and need to step it up, while people
         | who are lucky start to feel like they deserve the praise for
         | doing well.
         | 
         | I've read about people who go for days after the experiment and
         | feel bad about their subpar performance because they feel like
         | they've let down or brought shame to their company and wonder
         | if they couldn't have done something better.
         | 
         | And this is an experiment that's set up to remove any trace of
         | individual agency whatsoever! People still beat themselves up
         | over it.
         | 
         | When you experience this experiment for real, you start to
         | forget that it's actually designed to eliminate any sort of
         | skill.
         | 
         | In other words, the experiment shows how hard it is to
         | recognise when we're judging the system and not the people in
         | it. The experiment shows that even when you think you're seeing
         | individual performance, it's very plausible you're not.
        
         | ziggus wrote:
         | Focusing on the type of work being done is a bit of a bike
         | shed, since the experiment isn't about the work per se, but the
         | measurement of the work as a function of the employee alone -
         | ie, without the context of the systems in which the employee
         | functions.
         | 
         | A good example of the type of mismeasurement done in non-
         | manufacturing contexts is the ridiculously stupid burn-down
         | chart.
        
           | webmaven wrote:
           | _> A good example of the type of mismeasurement done in non-
           | manufacturing contexts is the ridiculously stupid burn-down
           | chart._
           | 
           | Bad management can find a misuse for any tool, I don't think
           | burn-down charts are a particularly attractive nuisance in
           | that regard.
        
       | candyman wrote:
       | I was lucky enough to do this with the man himself at NYU. He had
       | trouble speaking then but the class was dead silent and hung on
       | his every word. Profound thinker.
        
       | mark-r wrote:
       | I saw a link to this in a discussion of another topic, I'm glad
       | somebody pushed it to the top level. Definitely worth the read.
        
       ___________________________________________________________________
       (page generated 2022-02-04 23:00 UTC)