[HN Gopher] Deming's Red Bead Experiment (2002)
___________________________________________________________________
Deming's Red Bead Experiment (2002)

Author : gkop
Score  : 111 points
Date   : 2022-02-04 15:40 UTC (7 hours ago)

(HTM) web link (maaw.info)
(TXT) w3m dump (maaw.info)

| dang wrote:
| A related video from around 1994:
| http://www.youtube.com/watch?v=C5Io2WweTxQ
|
| (via https://news.ycombinator.com/item?id=5193898, but no
| comments there)
| cpach wrote:
| If anyone wants to know more about Deming, I can warmly
| recommend this blog post by Avery Pennarun:
| https://apenwarr.ca/log/20161226
| hitekker wrote:
| I'd be careful about taking that particular author's
| interpretation of management literature at face value.
|
| His understanding of _High Output Management_ (a seminal book
| by the CEO of Intel) was so flawed that the CEO of Dropbox
| had to correct him.
|
| https://news.ycombinator.com/item?id=21088425
| ignoramous wrote:
| Drew, fwiw, pointed out that Avery Pennarun (who is,
| frankly, phenomenal at distilling ideas in a given context)
| was right about a bunch of things except the TLDR.
|
| Though Avery does hint that "output" itself is a function
| of values/principles execs ought to instill in their org:
|
| > _What executives need to do is come up with
| organizational values that indirectly result in the
| strategy they want._
|
| > _That is, if your company makes widgets and one of your
| values is customer satisfaction, you will probably end up
| with better widgets of the right sort for your existing
| customers. If one of your values is to be environmentally
| friendly, your widget factories will probably pollute less
| but cost more.
| If one of your values is to make the tools
| that run faster and smoother, your employees will probably
| make less bloatware and you'll probably hire different
| employees than if your values are to scale fast and capture
| the most customers in the shortest time._
|
| It remains to be seen if Avery ends up building a larger
| company than Drew. I'm willing to bet all of $100 in my
| depleting bank account that they will.
| hitekker wrote:
| Drew Houston disagrees with you and Avery. Not just the
| TL;DR but in the critical details.
|
| > Contrary to what the post suggests, HOM does not say
| not that the job of an executive is to wave some kind of
| magic culture or "values" wand and rubber-stamp whatever
| emergent strategy and behavior results from that. CEOs
| and executives absolutely do (and must) make important
| decisions of all kinds, break ties, and set general
| direction.
|
| Notwithstanding that Drew is a billionaire who founded a
| billion dollar company and Tailscale has yet to crack a
| large valuation, Avery basically misinterpreted Andy
| Grove. That misinterpretation, "CEO as a passive referee
| whose job is to set the culture", is only sensible to
| people who've never managed a large group of people.
| Tomte wrote:
| The other well-known experiment Deming used is the funnel:
| https://www.2uo.de/deming/#the-funnel-experiment
| jeffreyrogers wrote:
| Interesting experiment. I don't think this applies to knowledge
| work in the same way it does to manufacturing.
| kqr wrote:
| You're right. It doesn't. But it's a matter of degree, not
| kind. Knowledge work has many times the variability of
| manufacturing (by design: if you remove variability from
| knowledge work, you're no longer producing anything new each
| time.)
|
| In other words, this applies even more severely to knowledge
| work.
|
| More concretely: in manufacturing you can have a process that
| yields 9-14 % defective or whatever.
| The variation is relatively small; say a CV (coefficient of
| variation) of 10 %. In knowledge work, you'll be looking at
| processes that generate somewhere between 0.1 and 100 defective
| ideas for every really good idea. This variation is enormous:
| 1000 % or so.
| com2kid wrote:
| Software engineer is put on a team that has more legacy code.
| If management judges by # of incidents, they are
| underperforming.
|
| Heck I can tell you from experience that if you want to get
| promoted fast, new product teams are the way to go. You get to
| file lots of patents, architect huge new systems, and look like
| a rock star.
|
| Another example: Partner teams upstream keep pushing breaking
| API changes, downstream teams look bad because their services
| are the ones having the outage. You do your due diligence, your
| code is defect free, well tested. Doesn't matter, you are
| spending half your day putting out fires caused by someone
| else. Meanwhile another co-worker starts on a team where their
| upstream services are written to be robust against bad incoming
| data and have APIs that maintain back compat. Your co-worker
| puts out buggy poorly tested code, but the upstream services
| are robust enough that everything keeps chugging along.
|
| Management doesn't see any of this. They just see your team has
| poor performance, and this other team has great performance.
| Heck maybe that other team has a higher "velocity" because they
| can turn out features faster.
| riskable wrote:
| In knowledge work the skill of the worker matters vastly more
| because "the process" mostly takes place in their head. You can
| optimize a workspace and tools for productivity and the
| reduction of errors but ultimately that has a minimal impact on
| "the process" that's taking place in the knowledge worker's
| mind.
|
| If the process was the problem (or perfect), adding a new worker
| (or replacing one) would have minimal impact, but we know this
| is not true.
| You could have the best documentation, training,
| and absolutely stellar code yet one person can turn everything
| to shit quite quickly! The opposite is true as well: Bringing
| on a fantastic new worker can make your existing team look like
| a bunch of inefficient laggards.
|
| Neither of these situations can be fixed by improving processes
| (maybe hiring processes? Though I doubt it). It'd be like
| having one magic blue bean in the box that--if found--can
| either drastically improve or degrade the final productivity by
| 90%. Would the optimum process improvement then be to try to
| eliminate magic beans entirely? Sure seems like it (i.e. hire
| the lowest common denominator and don't try to optimize for the
| 1%). That way you reduce the likelihood of taking on the "bad
| 1%"--even though it reduces your chances of obtaining the
| perfect magic bean.
| kqr wrote:
| If you are in any sort of leadership position -- either a formal
| manager, or in that you have respect from your peers, I strongly
| urge you to read Deming.
|
| There are few authors that have taught me so much about people,
| motivation, systems, quality, statistics, what high-leverage
| effort looks like, and so on.
|
| I first picked up a book by Deming a few years ago, and not a
| single day has passed that I have not had use for what he taught
| me through his writing.
|
| The things he says are only becoming more and more relevant with
| every year. I honestly think it ought to be compulsory reading in
| school. The world would be a much better place that way; kinder,
| more efficient, and less superstitious.
| Litost wrote:
| Thanks for the suggestion. How does what he says stack up
| against those who came after? I ask because one of my favourite
| management talks is by Russell Ackoff [1], and he mentions
| Dr Deming, so I assume he was influenced by/worked/studied with
| him. Given their relative ages, I wondered if his work might be
| valuable to start with?
|
| [1] https://www.youtube.com/watch?v=OqEeIG8aPPk
| marbex7 wrote:
| Which book?
| kqr wrote:
| I started with The New Economics, then read Out of the
| Crisis, and finally A Theory of Sampling or whatever its name
| is.
| marbex7 wrote:
| Thanks.
| [deleted]
| hencq wrote:
| There's a brilliant little book Four Days with Dr. Deming[0] that
| goes over the red bead experiment among other things. It
| basically follows the format of a four day seminar that Dr.
| Deming used to do. It's full of wisdom like this and it does a
| painfully good job making you recognize all the ineffective
| things still going on in companies today.
|
| [0]
| https://www.goodreads.com/en/book/show/34987.Four_Days_with_...
| [deleted]
| buescher wrote:
| Years ago I found a discussion of this on the web that involved a
| deep dive into optimal strategies for getting white beads,
| variations in paddle construction, root cause analysis on bead
| size and weight and hole depth in the paddles, and so on. It was
| a six sigma nightmare come to life and missed the point so
| profoundly I wish I could find it again to use as an example of
| how easily Deming is misunderstood.
|
| Related: "A bad system beats a good person any time" does not
| mean "having any system, no matter how bad, is better than having
| even the best people and no apparent system".
| AnIdiotOnTheNet wrote:
| Wait a bit and HN will probably provide you a similar
| discussion.
| krallja wrote:
| "beats" in the sense of "the beatings will continue until
| morale improves," right?
| buescher wrote:
| It certainly isn't positive.
| hencq wrote:
| Oh boy, I'd love to see that too. Unfortunately it's all too
| common to see this stuff in reality as well.
|
| > Related: "A bad system beats a good person any time" does not
| mean "having any system, no matter how bad, is better than
| having even the best people and no apparent system".
|
| I'm a big fan of sociotechnical systems [0] where the motto is
| to give people complex jobs in simple organizations.
| Unfortunately in practice you usually see the tendency to do
| exactly the opposite.
|
| [0] https://en.wikipedia.org/wiki/Sociotechnical_system
| vintermann wrote:
| Deming made me realize that there is actually management
| literature out there that isn't just fads and slogans.
| openknot wrote:
| Would there be any other similar recommendations to Deming's
| books? I would think that Eliyahu M. Goldratt's books sound
| similar (specifically, the "Theory of Constraints,"
| alternatively presented through a fictional story in "The
| Goal").
| mark_undoio wrote:
| I'm a fan of Womack & Jones's "Lean Thinking". This is all
| about Lean manufacturing, which I think is partly rooted in
| Deming's work. The focus is more on how to optimise the
| overall system than the management of individuals.
|
| The content of that book isn't directly applicable to e.g.
| software companies but if you think a bit you can see quite a
| lot of analogous situations (e.g. warehoused inventory is
| incomplete projects or not-yet-shipped code, "monuments"
| could be inappropriate central test / build systems, etc).
| kqr wrote:
| If you want someone to translate it to software for you,
| Reinertsen's Principles of Product Development Flow is
| about adapting the philosophy to knowledge work.
|
| Ward's Lean Product and Process Development is also a good
| take on those ideas.
| m104 wrote:
| I recommend Russell Ackoff's writings as somewhat related and
| more to do with how systems of people and processes work (or
| don't). Here's a great place to start:
| https://thesystemsthinker.com/a-lifetime-of-systems-thinking...
| [deleted]
| larrydag wrote:
| The key to doing this experiment well is having the right
| facilitator that brings the attitude.
| A good facilitator will
| roleplay a leader/manager/exec that will praise when measures are
| good and berate when measures are bad. The idea of this
| experiment is to show how management can harm the process even
| when there is inherent variability, good or bad.
|
| Here is Dr. Deming himself performing the experiment
| https://www.youtube.com/watch?v=7pXu0qxtWPg
| Rickasaurus wrote:
| Thanks for sharing this, it's amazing to see it in action.
| laserlight wrote:
| What an amazing demonstration. It's unfortunate that these
| lessons haven't been learned decades later.
| curiouscats wrote:
| The W. Edwards Deming Institute Blog https://deming.org/blog/
|
| Deming on various management topics
| https://deming.org/category/deming-on-management/
|
| More resources on Deming's ideas
| https://deming.org/online-resources-on-w-edwards-demings-man...
| pakitan wrote:
| I don't get it. This "experiment" could have been replicated by a
| simple computer simulation, given that worker output is entirely
| random. The supposed moral of the story is that system design
| defines outcome, not individual performance but how does that
| even count as "science" when you don't have control and
| experimental groups. He designed a system with inherent flaws and,
| surprise, it has flaws. We can see there is variance in
| "productivity" but we have no idea how this same variance would
| have affected output if workers actually had agency.
| lupire wrote:
| It's a demo experiment, like when the physics teacher swings a
| heavy pendulum at their own nose, or shoots a BB gun at a
| falling toy.
| function_seven wrote:
| That's the point.
|
| So first, not all science requires an RCT. Dividing
| experimental subjects into study and control groups is one way
| of doing science. It's not the only way.
|
| In this case, this is a concrete demonstration of just how much
| variance can emerge from a "statistically neutral" process. The
| systemic flaws are part of the demonstration.
| What appear at
| first glance to be identical tools, inputs, and processes are
| in fact subtly different. The demo shows management types that
| their charts and graphs cannot always be relied upon to
| differentiate performance levels among staff. The system itself
| must also be scrutinized. If Bob's ad campaigns are
| outperforming Alice's by 20% in the first quarter, it doesn't
| necessarily mean Bob is a marketing genius and Alice needs a
| PIP.
|
| A computer simulation would not have nearly as powerful an
| effect on most people as a live demonstration using real beads.
| And the imperfections in the paddles are something that
| naturally arises when they're physically made, but would have
| to be tuned by the programmer building the simulation. Which
| would lead to questions about "just how did they decide what
| variances would come into play?"
| jiggawatts wrote:
| A real world example is that I do programming on a high-end
| workstation laptop. My coworkers use old budget laptops.
|
| This is not in their control -- they're victims of corporate
| policy.
|
| Does it influence quality in complex and hard to quantify
| ways?
|
| Most assuredly...
| pakitan wrote:
| I think I get it now. The point of the experiment is to ELI5
| the concept of variance to management types who skipped
| statistics classes :) Could be useful for some bosses I had
| :)
| function_seven wrote:
| So I'm watching a video of this right now[0], and it's even
| more enlightening than I figured it would be! Deming makes
| comments throughout the demonstration that I swear I've
| heard in the real world. For example, one worker--whose
| previous results put him on probation (he had 12 red
| beads)--managed to have only 6 the next day. "Looks like
| probation worked".
|
| Meanwhile another worker--previously scoring 5, and getting
| a merit-based raise from it--did poorly with 12. The
| remark: "That raise went to his head. He's getting lazy".
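The day-to-day swings described above (12 red beads one day, 6 the next) can be reproduced in a few lines of Python. This is only a sketch: the thread does not give the bead counts, so the 800 red beads out of 4000 and the 50-bead paddle below are assumptions, and `scoop` and `run_experiment` are hypothetical names chosen for illustration. Every simulated worker is identical, so any difference between them is chance alone.

```python
import random
import statistics

# Assumed setup (not stated in the thread): 800 red beads among 4000,
# and a paddle that scoops 50 beads per draw.
RED_BEADS = 800
TOTAL_BEADS = 4000
PADDLE_SIZE = 50


def scoop(rng):
    """Count the red beads in one paddle drawn without replacement."""
    drawn = rng.sample(range(TOTAL_BEADS), PADDLE_SIZE)
    # Bead indices below RED_BEADS stand for the red beads.
    return sum(1 for bead in drawn if bead < RED_BEADS)


def run_experiment(workers, days, seed=None):
    """Return {worker: [red count per day]} for identical, skill-free workers."""
    rng = random.Random(seed)
    return {name: [scoop(rng) for _ in range(days)] for name in workers}


if __name__ == "__main__":
    results = run_experiment(["A", "B", "C", "D", "E", "F"], days=4, seed=1)
    for name, counts in results.items():
        print(name, counts, "mean:", statistics.mean(counts))
```

Under these assumed numbers the expected count is 10 red beads per scoop with a standard deviation of roughly 2.8, so scores of 5 or 12 both sit within about two standard deviations of the mean, with no probation or raise involved.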
|
| So yeah, the value of this is in the actual doing of it.
|
| [0] https://www.youtube.com/watch?v=7pXu0qxtWPg
| advisedwang wrote:
| This experiment is something he did in classes etc so people
| could _experience_ the obvious idiocy in trying to manage
| individuals for system behaviour. His point is that ALL actual
| work is also dominated by system behaviour, just more subtly,
| and managers must focus on the systems and not worker
| performance.
| pakitan wrote:
| > His point is that ALL actual work is also *dominated* by
| system behaviour
|
| If that's his point, it seems obviously wrong. Some work,
| like in the experiment is dominated by system behavior. For
| others, system would play a much smaller role. For example, 2
| people cranking code in a startup. No matter what system you
| apply, if they are not good programmers, nothing of value
| will come out.
| kqr wrote:
| That's also missing the point somewhat. Put the best two
| programmers in the world in a shitty system that rewards
| them for the wrong things and they will produce garbage.
| Put mediocre programmers in a fantastic system that brings
| out the best in their collaboration and you might actually
| get to market sooner and better than the other group.
| salawat wrote:
| Something will most certainly come out. You're just not
| defining the system. If I define the system as "one
| programmer is responsible for looking at the desired
| product and writing specifications" and the other
| programmer is to translate specification into programming
| code, and never shall one do the other's job, I assure you,
| the best programmers in the world will produce shit over
| time.
|
| Randomly swap in two new actors with different life
| experiences into the same spots to do the same work, and
| you'll still get shit. If in the unlikely event, you get
| amazing work, it's not that the people doing it were
| special; it's just another outlier in the data stream.
| Add
| in the emotional toll of working as hard as possible to
| succeed but never being able to meet prescribed quality
| levels?
|
| A system is perfectly tuned to produce the results it does.
| Want different results? Change the system. That is Deming's
| point. We have a tendency to blame variance in a system on
| the human actors immediately proximal, instead of paying
| attention to the actual significant constraints. This is an
| important lesson to management types, as they are to
| process/system what a programmer is to a computer.
|
| The planners cast the dice for downstream long before
| downstream can do anything about it, and in many corporate
| setups, top down works just fine, but bottom up never gets
| any attention.
| pierrebai wrote:
| I find the experiment skewed. Or more precisely, that it is not
| meant to investigate human behaviour or psychology. It is rather
| precisely designed to support a chosen result to support a given
| world view. The fact that it has been run for 50 years is a
| strong indication of this.
|
| IOW, the experimenter wanted to be able to arrive at the
| conclusion that difference in performance was unrelated to
| workers and designed the experiment so it would give this result.
| In short, this demonstrates few things outside of a very
| artificially set up situation, where the workers have no say and
| the job is predestined to fail.
|
| Anyone who worked anywhere knows very well that there are
| actually vast differences between two workers.
| cool_dude85 wrote:
| >IOW, the experimenter wanted to be able to arrive at the
| conclusion that difference in performance was unrelated to
| workers and designed the experiment so it would give this
| result.
|
| That's the whole point. The experiment is not that we're
| supposed to be surprised that the workers did not affect
| performance - in fact, that's the subtext of the whole thing!
| We know it from the start cause he explains exactly how the
| process works and we can all see that individuals cannot affect
| their output.
|
| The point is, if we are unaware that we're in such a situation,
| we can still find metrics to allow us to rank workers, fire low
| performers, give out raises, etc. When we myopically focus on
| such metrics, and disregard the system that makes them
| worthless, we're making all our decisions on random chance,
| even though we have a clear process, data collection, the whole
| thing.
| pierrebai wrote:
| That's also my whole point: this is not an experiment but an
| elaborate artificial argument designed to prove a point of
| view decided in advance. That is why I find it unsavory.
| Jtsummers wrote:
| The point of the experiment is to be extreme, but after
| reading (a very large portion of) Deming's work, I don't
| think he'd disagree with your initial assertion that there
| are differences between workers.
|
| The broader points he makes, related to this experiment at
| least: There are individual and systemic issues that
| influence the outcome of a process. The actual ratio will
| vary depending on what kinds of processes are involved.
|
| If the job is to be a literal screw turner on an assembly
| line, then there is relatively little difference between
| the majority of people (assuming they are generally able
| bodied, sighted, and have decent coordination), the
| _system_ (tempo, length of shift, accessibility of the
| thing being screwed together, tools being used) will have a
| much larger impact than the individual's skill. The system
| of the assembly line will influence the outcome more than
| the individual's skill (at least above a basic threshold, a
| supremely uncoordinated individual could flounder even with
| the slowest pace of work).
| Switch to more skilled work and
| you will find, increasingly, more differences in outcome
| based on individual performance versus the system of the
| work, but even there the system matters.
|
| Look at software development offices that still favor
| things like manual build processes, code versioning
| control, testing, and deployment over automation. They
| provide many opportunities for human error (even just
| simple miskeying of data) that can reduce everyone's
| effectiveness no matter how skilled. (Fortunately these
| kinds of places are increasingly rare, at least outside of
| US defense contractors.)
|
| The experiment, then, is an artificial construct (like most
| classroom experiments) meant to illustrate a point by
| showing one extreme. This acts as a counterpoint to the
| more conventional wisdom that the individual, and not the
| system, is what actually matters for the outcome. The
| conventional wisdom, of course, being wrong in many
| circumstances since it tends to place too strong a weight
| on the individual performance and too weak a weight on the
| system.
|
| It would be unsavory if he had said, "See, stop evaluating
| individuals, their contribution doesn't matter." But he
| never did say that (in anything I read, at least), and
| anyone who looks at this experiment and draws that
| conclusion would be an idiot.
| emeraldd wrote:
| I wonder what the limits of this are? From a naive point of view
| there has to be a point where training/skill/physical
| endurance/etc. come into play. The bead experiment seems to fit a
| fixed rate, assembly line style of work. While I would agree that
| numeric/performance ranking is mostly meaningless, everyone knows
| that one somebody you go to when no one else can fix a problem.
| IggleSniggle wrote:
| I see what you mean, but I also think that's encapsulated in
| the idea of "ready willing workers."
|
| Obviously there are differences between people, and better and
| worse teams.
| But the lesson here is about how the environment
| factors in, and how management can accidentally arbitrarily
| suppress innovation or reward luck within normal bounds of
| success. Or hamper themselves to failure by insisting on a
| broken process.
|
| Could it be the case that "everybody goes to Jim," and as a
| result, Jim gets good at helping people? Could it be that if
| everybody just went to Kim for 2 weeks, that her fixes might
| turn out to be a better yet completely orthogonal method of
| solving the problem?
|
| The Red Bead experiment is an antidote to rigid process and the
| praise/blame game as based on inspection of results. It's a
| story intended for management to hear, not an absolution or
| dismissiveness of personal responsibility.
|
| If you've hired "ready willing workers," then looking at the
| results doesn't necessarily show you who was killing it and who
| wasn't.
|
| That worker who is always "killing it" may be good at scooping
| up projects that always look great. That worker who is always
| underperforming might be maintaining essential infrastructure
| without which the system would fall apart.
|
| The worker who's killing it may be doing so by spending all
| their time "buttering up" a customer. The worker who appears
| underperforming may appear so because they spend all their time
| "buttering up" a customer, but someone else always lands the
| sale.
|
| It's a meditation on imperfect knowledge.
| kqr wrote:
| As you have observed already, this experiment is set up
| specifically to eliminate the effect of training/skill/physical
| endurance etc, and YET when it's performed in real life with a
| good facilitator, people who are unlucky start to feel like
| they're underperforming and need to step it up, while people
| who are lucky start to feel like they deserve the praise for
| doing well.
|
| I've read about people who go for days after the experiment and
| feel bad about their subpar performance because they feel like
| they've let down or brought shame to their company and wonder
| if they couldn't have done something better.
|
| And this is an experiment that's set up to remove any trace
| of individual agency whatsoever! People still beat themselves up
| over it.
|
| When you experience this experiment for real, you start to
| forget that it's actually designed to eliminate any sort of
| skill.
|
| In other words, the experiment shows how hard it is to
| recognise when we're judging the system and not the people in
| it. The experiment shows that even when you think you're seeing
| individual performance, it's very plausible you're not.
| ziggus wrote:
| Focusing on the type of work being done is a bit of a bike
| shed, since the experiment isn't about the work per se, but the
| measurement of the work as a function of the employee alone -
| ie, without the context of the systems in which the employee
| functions.
|
| A good example of the type of mismeasurement done in
| non-manufacturing contexts is the ridiculously stupid burn-down
| chart.
| webmaven wrote:
| _> A good example of the type of mismeasurement done in
| non-manufacturing contexts is the ridiculously stupid burn-down
| chart._
|
| Bad management can find a misuse for any tool, I don't think
| burn-down charts are a particularly attractive nuisance in
| that regard.
| candyman wrote:
| I was lucky enough to do this with the man himself at NYU. He had
| trouble speaking then but the class was dead silent and hung on
| his every word. Profound thinker.
| mark-r wrote:
| I saw a link to this in a discussion of another topic, I'm glad
| somebody pushed it to the top level. Definitely worth the read.
___________________________________________________________________
(page generated 2022-02-04 23:00 UTC)