[HN Gopher] New study disavows marshmallow test's predictive powers ___________________________________________________________________ New study disavows marshmallow test's predictive powers Author : npalli Score : 73 points Date : 2022-02-21 20:36 UTC (2 hours ago) (HTM) web link (anderson-review.ucla.edu) (TXT) w3m dump (anderson-review.ucla.edu) | karaterobot wrote: | A good test for whether a psychological or sociological study may | turn out to be hard to replicate is: does it make a sweeping | claim about something as complex as human beings? If it's not | tentative, incremental, wrapped in caveats and conditionals, I | don't put much weight in it anymore. | goatlover wrote: | Pretty much this. | dang wrote: | All: if you're going to post here, can you please make sure you're | not posting a shallow dismissal? Those are the quickest and | easiest reactions to post, but they're repetitive and boring. | This site is supposed to be for _interesting_ conversation, and | that requires new information--not things we've all heard | before. | | Hint: if you're making a strong, large statement--e.g. an | emphatic claim about an entire category of things--then it's most | likely a shallow comment. | | https://news.ycombinator.com/newsguidelines.html | awb wrote: | Here's a 2011 meta analysis that reports that DRD (delayed reward | discounting -- basically putting lower importance on delayed | gratification and instead putting greater importance on immediate | rewards) is highly associated with addictive personalities: | | https://addictions.psych.ucla.edu/wp-content/uploads/sites/1... | | > _Conclusions_ These results provide strong evidence of greater | DRD in individuals exhibiting addictive behavior in general and | particularly in individuals who meet criteria for an addictive | disorder. | erichocean wrote: | The article, and especially the headline, are extremely | misleading. 
| | The actual result: measures of self-control either weakly or | strongly predict positive life outcomes, depending on the measure | and how much adjusting was done, e.g. | | > _[The study] created a new measure of the time each original | preschooler waited before taking a bite (or getting the reward) | to adjust for variables such as age, gender and experiment | conditions._ | | This study found that the "marshmallow test"--as a single measure | --is no more or less predictive than a basket of other measures | of self-control the study tested, or any of those other measures | of self-control taken alone. | | Despite the misleading article and headline, the study itself | seems well-designed (e.g. pre-registered), but the conclusion in | the headline is utterly wrong as that is not what the study | found: self-control matters, can be measured, and those measures | weakly or strongly predict positive life outcomes. | | Here's an accurate headline: Self-control still predicts positive | life outcomes, Marshmallow Test creator finds. | antonfire wrote: | > This study found that the "marshmallow test"--as a single | measure--is no more or less predictive than a basket of other | measures of self-control the study tested, or any of those | other measures of self-control taken alone. | | Are we looking at the same study? I don't see where "no more or | less predictive than a basket" comes from, specifically where | "no less predictive" comes from. | | My reading of the abstract is that, the study found that a | measure based on the "marshmallow test" ("preschool delay of | gratification", RND in the article body), is not predictive of | the outcomes they measured (11 capital formation outcomes). | | It also found that a basket of measures of self-control | (collected at various ages, RNSRI/RNCCQ in the article body) | _is_ predictive of the outcomes, whether you include the | preschool measurement or not. 
| | So from skimming the study without even reading the article, it | sounds to me like they found that the preschool measure doesn't | predict the outcomes they're measuring by itself, and it | doesn't contribute predictive power when it's used as part of | an index of self-control measured at a variety of ages. | oraphalous wrote: | Headline seems accurate to me. The headline: | | New Study Disavows Marshmallow Test's Predictive Powers. | | And it does. The marshmallow test as a single measure is | referred to as RND, which is a test that measures | gratification wait times and is applied in pre-school. Their | hypothesis regarding RND: | | hyp2: On its own, RND (measured around age four) will have only | a very small correlation with the measures of mid-life capital | formation. | | And they report a confirmation of this hypothesis. | | The other hypothesis refers to RNSRI, the rank-normalized | self-regulatory index - which is 4 different components measured at | different ages - each component is RND + 86 other measures! | This is reported as having a "modest" impact on outcomes - not | "strong" as you say. So your reporting seems the more | inaccurate to me. | | But this is irrelevant anyway with respect to your claim about | the headline, which is only referring to the paper's disavowal | of RND. | | Further evidence of the headline's accuracy and the paper's | disavowal of RND is that they also looked at RNCCQ - which is | RNSRI minus the inclusion of the RND test from each of the four | components. They found that including RND did not improve RNSRI | over RNCCQ in terms of their predictive power. | KerrAvon wrote: | The original headline isn't inaccurate, though it is clumsily | worded and neither it nor your proposed headline fully describes | the results of the study. The study itself says the following | (quoting verbatim). Note that the second point is more or less | what the headline says. 
| | - Self-regulation composite (preschool & ages 17-37) predicts | capital formation at 46. | | - Preschool delay of gratification alone does not predict | capital formation at 46. | | - The composite is more predictive partly because it consists | of many items. | | - No evidence of more predictive power for self-regulation | reported later in life. | feanaro wrote: | Yes, but how would you then inanely riff on psychology, which | all the cool kids are doing nowadays? | [deleted] | nefitty wrote: | Thank you for the clarification. You've thought about this a | lot. If a friend asked you for advice on how to help their | struggling teenage son improve his self-control, what do you | think you would say? | renewiltord wrote: | Oh interesting. Self-regulation is correlated with good outcomes | but the marshmallow test is a poor test of self regulation. Okay, | interesting. | | What I would enjoy, I think, is taking a monthly test battery and | uploading that to a central database with other self-researchers | and then looking at that in a historic sense to derive ideas to | study. Obviously, since one is post-hoc slicing one will find | many spurious correlations but perhaps these correlations will | yield interesting areas to search around. Does anyone know of | anything like this? | acchow wrote: | From the Journal Article: "They included a total of 550 students | from Stanford's Bing Nursery School, aged about 4 years old | (ranging from 2 to 6). Many of the participants are children of | Stanford faculty and staff." | | https://www.sciencedirect.com/science/article/pii/S016726811... | | How can we conclude anything at all about the general population | using a sample of Stanford kids? | gkop wrote: | This isn't my field, but I think this is par for the course | unfortunately and a manifestation of a larger issue. Eg | https://journals.plos.org/plosone/article/file?id=10.1371/jo... 
| dahart wrote: | I like to ask this whenever Dunning Kruger comes up; DK was a | sample of Cornell undergrads, and the study was one tenth the | size of the Stanford study. DK participants were volunteers of | a psych class who got extra credit. Presumably they needed and | could use extra credit, which may have excluded the A and the F | students. It's hard to imagine ways to start with more bias, or | how we can possibly accept this sample as representative of | humanity. | ren_engineer wrote: | Social sciences are all a sham, now think about the trillions | of dollars of government spending that are based on that same | sham science as justification. People wonder why so many | government programs fail, it's because they are built on a | rotten foundation | | https://en.wikipedia.org/wiki/Replication_crisis | | >How can we conclude anything at all about the general | population using a sample of Stanford kids? | | it's a well known problem that is rarely brought up, WEIRD | bias. Most social science research participants are college | students being bribed with extra credit or gift cards | | https://en.wikipedia.org/wiki/Psychology#WEIRD_bias | brimble wrote: | Good social science is possible, but is difficult and | (sometimes) expensive. If you can get the same _personal_ | outcome by doing something cheap and easy instead, of course | that's what most people are going to do. Fixing that seems | to be the Big Problem for most of science, for at least the | last few decades (though, yes, particularly social science). | | > People wonder why so many government programs fail | | This, though, I'm not so sure about. Do government programs | fail at a rate greater than those undertaken by other large | organizations, like corporations or non-profits? | scotuswroteus wrote: | David Brooks in shambles | learn_more wrote: | Sounds like it is predictive. Just not when: | | > Controlling for differences such as household income and | cognitive abilities ... 
| | So it's a (predictive) IQ test. | | Perhaps it disavows the prior assumed basis of "deferred | gratification", but not the predictive power of the test. | api wrote: | Am I overreacting to consider these kinds of psychometric studies | to be not much better than phrenology? | | "Behavioral phrenology" maybe? | lr4444lr wrote: | Delayed gratification AFAIK has solid research as a trait | predictive of many things. That a child's ability to do it at 4 or 5 | is predictive of their adult self is something else, | though. | TrinaryWorksToo wrote: | There could easily be confounders to that though. Like people | who are wealthy might be able to delay satisfaction better | than poor, because their needs are more satisfied. | dahart wrote: | Indeed, and the article mentions this. "The Watts study | findings support a common criticism of the marshmallow | test: that waiting out temptation for a later reward is | largely a middle or upper class behavior. If you come from | a place of shortages and broken promises, eating the treat | in front of you now might be the better bet than trusting | there will be more later." | fancifalmanima wrote: | To say this more explicitly, even the idea of waiting for | the second marshmallow being the "preferred" behavior is | somewhat classist. | | Sounds more like the test is just testing for an | adaptation that happens to be well suited to living in an | upper-middle class to wealthy environment. If resources | are scarce, the kid that takes what they can get now | rather than trusting other people will do better in the | long run. | nostrademons wrote: | A lot of observable phenomena function as positive feedback | loops, simply because positive feedback loops are usually | needed to generate effect sizes that become "observable" | beyond individual variation. It's very likely that being | able to delay gratification makes you wealthy, which makes | you better able to delay gratification, which makes you | wealthier, and so on. 
And that's why we have discernible | social classes, where mobility from one to another becomes | very difficult. | | Breaking the feedback loop usually involves doing something | farsighted, risky, and irrational - for example, risking | getting fired from your retail job by studying programming | and applying to software engineering jobs in your downtime, | or quitting your stable corporate job to found a startup. | dahart wrote: | Predictive is synonymous with correlated in a research | setting, but lay use of that word seems like it runs the risk | of implying causation. This may be the primary problem with | the Stanford Marshmallow experiment, right? - that delayed | gratification is highly correlated with socioeconomic status, | which is well known to be an excellent predictor of future | socioeconomic status. | civilized wrote: | It's an interesting idea. But to me the marshmallow test at | least had some plausible connection to personality. | | But maybe in the days of phrenology, people thought a hooked | nose* had a plausible connection to personality as well? | | Weird to think about. | | *Sorry, this is physiognomy not phrenology. The same basic | point stands though. | frgtpsswrdlame wrote: | >But maybe in the days of phrenology, people thought a hooked | nose had a plausible connection to personality as well? | | Phrenology is bumps on the head right? I think hook nose | would be physiognomy. But yes, the idea was that your | behavior was due to your brain and your brain was composed of | many different parts that each controlled different | propensities or abilities. Then it was just a matter of | identifying where those propensities lived, in relation to | the head and then you could feel for the differences from | person to person across the surface of the head. From the | naive viewpoint it _is_ plausible, oh, you say the back right | section of my head, above the ear is a bit larger so the | self-control portion of my brain is well developed? 
I've | always thought so! | | Setting aside the marshmallow test, you can easily see how | scientific theories about this sort of thing, both right and | wrong, easily integrate. | well_i_guess wrote: | I think that the issue is that there is no true metric for | "highly marketable talents/traits." One generation's genius | could be another generation's average worker, solely because | market forces eliminate the competitive advantage of certain | things. Many, many authors seem to lament the distractibility | of the current generations yet I would bet you many of the | most famous people to Gen Z are incredibly attention-fickle. | Whereas, 20 years ago, focus would probably be an essential | skill for key performers. | fancifalmanima wrote: | Focus is almost surely an essential skill for key | performers. Even among the most famous Gen Z -- you don't | think they focus on their social media presence and what | they do? What is an 8 hour photo shoot if not focusing? A | lot of work goes into what social media influencers post, | it's not all done on a whim. There's also plenty of Gen Z | doing other more traditional work (almost everyone of that | generation, really). If anything, they've probably had to | develop coping mechanisms from an extremely early age to | deal with distraction, compared to prior generations. | brimble wrote: | I'm reminded of the SlateStarCodex post that mulls over | the difference between "real" ADD and just having totally | ordinary (but pretty great) difficulty focusing on the | exact same boring crap on a computer screen day, after | day, after day--especially if, in the latter case, a lot | of the people these folks are comparing themselves to, | when deciding that they might have ADD, are _already_ on | ADD meds (or coke...) for exactly that reason. 
| | If our society needs 1% of the population to be | accountants (to pick an example) but only 0.1% of the | population either have incredible focus abilities or | don't find accounting brain-meltingly dull, then at least | 90% of accountants are going to feel like they have a lot | of trouble focusing at work. Once enough start medicating | (legally or otherwise) it's gonna feel to others like | they really do have a condition that most don't, but they | both kinda do (in a practical sense, they _do_ need to | focus better to keep up with their peers) and kinda | don't (in that it's sort of our society that's sick, not | them--they're just acting like _most_ people would, in | that situation). | jrumbut wrote: | My poorly informed impression is that the key challenge of | any data driven investigation is striking the balance | between how hard something is to measure and how close it | is to what you really want to know. | | The marshmallow test was so appealing because it was | incredibly easy to perform and seemed like it was pretty | close to a measure of the kind of self-control and | discipline that's needed to succeed in a variety of life's | most important challenges. | DeusExMachina wrote: | Given the current replication crisis, I would say, not much. | | https://en.wikipedia.org/wiki/Replication_crisis | gumby wrote: | Yes, I think it's better, for the reason this article explains: | people are following up and revisiting the conclusions. | | Nutrition studies are more like phrenology in that they are so | hard to do with so many confounding factors that you can't | really trust any macro conclusion. | yboris wrote: | I think your distrust in nutrition studies might stem from | the fact there are nefarious entities publishing things. | Various industries have a financial interest in making it | look like their product, because it contains substance X, is | beneficial to people. 
So they can design the most flimsy | experiment with no pre-registration, and re-run it numerous | times until they get the result they want. | | Lots of conclusions from nutrition studies (especially meta | analyses) are robust _and_ useful to follow. | | Consider the _NutritionFacts_ website as a good starting | point: a non-profit which has no ads, no industry | "partners", etc - focused on distilling well-designed studies | to see what everyday people can use from them. | | https://nutritionfacts.org/ | gumby wrote: | I was not even considering the issue of bad actors. Simply | that longitudinal, multi-variate studies of sufficient | scale are essentially impossible to conduct. | | Even though nutrition is one of the very oldest, and | perhaps _the_ very oldest, fields of human study, it still | remains in the "butterfly collecting" phase of development | as a science. It's very very hard. I'm glad some people | try. | brimble wrote: | I've got some pretty good predictive powers, myself. | | I predict that in twenty years, no matter how thoroughly this is | debunked, I'll still see this treated as true _constantly_, and, | even when what's under discussion is _taking action_ based on its | being true, I'll only get eye-rolls and head-pats and plain | disapproval/loss-of-face for bringing up that it's questionable | at best, then everyone will go on treating it as true. | suzzer99 wrote: | I've never understood the marshmallow test. I'm supposed to sit | there and stare at a delicious marshmallow for some indeterminate | amount of time in order to get _one extra marshmallow_? Offer me | a whole bag and we'll talk. | | I've always wondered if this test measures more of the child's | willingness to please the researcher, and not so much their | capacity for delayed gratification. | mansoon wrote: | This deserves better study. | jl2718 wrote: | "Adding the marshmallow test results to the index does virtually | nothing to the prognosis, the study finds." 
| | This does not mean that the test is not predictive. It means that | the index (a bunch of measurements) contains statistical | dependencies. From a practical view, the marshmallow test result | depends on many cognitive factors unrelated to self-control. The | child must understand the instructions, remember them for the | duration of the test, trust the provider, value the second | marshmallow, and then make a decision. To be of any value, it | should have been tested against a standard cognitive battery, | which it almost certainly would have failed to improve upon. | Cognitive tests have worked extremely well to predict life | outcomes for decades now if not centuries. | ramesh31 wrote: | Sure. And fifty years' worth of other studies have shown its | effectiveness. This is meaningless noise in the absence of meta | analysis. | awb wrote: | Related meta analysis: | | https://addictions.psych.ucla.edu/wp-content/uploads/sites/1... | | > _Conclusions_ These results provide strong evidence of | greater DRD in individuals exhibiting addictive behavior in | general and particularly in individuals who meet criteria for | an addictive disorder. | | They don't draw conclusions about causative success, just a | correlation with addiction. | rilezg wrote: | I think the 'golden goose award' page from 2015 (linked in the | article) gives a better overview of the original research than | the article: | https://www.goldengooseaward.org/01awardees/marshmallowtest | | A small quote: "But this is not a story of fate - of children's | long-term success being determined by their self-control as | four-year-olds. It is a story about how children can change: those who | are "low delayers" can in fact learn to be "high delayers," and | gain the life benefits that self-control imparts." | | So this is more olds than news, but perhaps it is good to be | reminded that we all have room to grow (or shrink) from who we | were at age 4. 
I personally would bet high-delayers can also | learn to become low-delayers, and I also would bet there are | times in life when you would be better off eating the marshmallow | now instead of investing it for another 30 years at 5% because | the man in the suit told you to. ___________________________________________________________________ (page generated 2022-02-21 23:00 UTC)