[HN Gopher] Replace peer review with "peer replication" (2021)
       ___________________________________________________________________
        
       Replace peer review with "peer replication" (2021)
        
       Author : dongping
       Score  : 340 points
       Date   : 2023-08-06 12:33 UTC (10 hours ago)
        
 (HTM) web link (blog.everydayscientist.com)
 (TXT) w3m dump (blog.everydayscientist.com)
        
       | geysersam wrote:
        | Both review and replication have their place. The mistake is
       | treating researchers and the scientific community as a machine:
       | "pull here, fill these forms, comment this research, have a gold
       | star"
       | 
       | Let people review what they want, where they want, how they want.
        | Let people replicate what they find interesting and motivating to
        | work on.
        
       | SonOfLilit wrote:
       | My first thought was "this would never work, there is so much
       | science being published and not enough resources to replicate it
       | all".
       | 
       | Then I remembered that my main issue with modern academia is that
       | everyone is incentivized to publish a huge amount of research
       | that nobody cares about, and how I wish we would put much more
       | work into each of much fewer research directions.
        
       | tines wrote:
       | "Replace peer code review with 'peer code testing.'"
       | 
       | Probably not gonna catch on.
        
         | dongping wrote:
         | "peer code testing" is already the job of the CI server. As it
         | is nothing new, it probably is not going to catch on.
        
       | fastneutron wrote:
       | As much as I agree with the sentiment, we have to admit it isn't
       | always practical. There's only one LIGO, LHC or JWST, for
       | example. Similarly, not every lab has the resources or know-how
       | to host multi-TB datasets for the general public to pick through,
       | even if they wanted to. I sure didn't when I was a grad student.
       | 
       | That said, it infuriates me to no end when I read a Phys. Rev.
       | paper that consists of a computational study of a particular
       | physical system, and the only replicability information provided
       | is the governing equation and a vague description of the
       | numerical technique. No discretized example, no algorithm, and
       | sure as hell no code repository. I'm sure other fields have this
       | too. The only motivation I see for this behavior is the desire
       | for a monopoly on the research topic on the part of authors, or
       | embarrassment by poor code quality (real or perceived).
        
       | fabian2k wrote:
       | I don't see how this could ever work, and non-scientists seem to
       | often dramatically underestimate the amount of work it would be
       | to replicate every published paper.
       | 
       | This of course depends a lot on the specific field, but it can
       | easily be months of effort to replicate a paper. You save some
       | time compared to the original as you don't have to repeat the
       | dead ends and you might receive some samples and can skip parts
       | of the preparation that way. But properly replicating a paper
       | will still be a lot of effort, especially when there are any
       | issues and it doesn't work on the first try. Then you have to
       | troubleshoot your experiments and make sure that no mistakes were
       | made. That can add a lot of time to the process.
       | 
       | This is also all work that doesn't benefit the scientists
       | replicating the paper. It only costs them money and time.
       | 
       | If someone cares enough about the work to build on it, they will
       | replicate it anyway. And in that case they have a good incentive
       | to spend the effort. If that works this will indirectly support
       | the original paper even if the following papers don't
        | specifically replicate the original results. Though this part is
        | much more problematic if the follow-up experiments fail, because
        | then the result will likely remain entirely unpublished. But the
        | solution here unfortunately isn't as simple as just publishing
        | negative results; it takes far more work to create a solid
        | negative result than to just try the experiments and abandon them
        | if they're not promising.
        
         | ebiester wrote:
         | It's simple but not easy: You create another path to tenure
         | which is based on replication, or on equal terms as a part of a
         | tenure package. (For example, x fewer papers but x number of
         | replications, and you are expected to have x replications in
         | your specialty.) You also create a grant funding section for
         | replication which is then passed on to these independent
         | systems. (You would have to have some sort of randomization
         | handled as well.) Replication has to be considered at the same
         | value as original research.
         | 
         | And maybe smaller faculties at R2s pivot to replication hubs.
         | And maybe this is easier for some sections of biology,
         | chemistry and psychology than it is for particle physics. We
         | could start where cost of replication is relatively low and
         | work out the details.
         | 
         | It's completely doable in some cases. (It may never be doable
         | in some areas either.)
        
           | SkyMarshal wrote:
           | _> x fewer papers but x number of replications, and you are
           | expected to have x replications in your specialty._
           | 
            | Could it be simplified even further to say x number of
           | papers, but they only count if they're replicated by others
           | in the field?
        
             | nine_k wrote:
             | No, the idea is that the same researcher should produce _k_
             | papers and _n_ replications, instead of just _k + n_
             | published papers.
             | 
              | I'd argue that since replication is somewhat faster than
             | original research, the requirement would count a
             | replication somewhat lower than an original paper (say, at
             | 0.75).
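              | 
              | A rough sketch of how such a weighted score might be
              | computed (the 0.75 weight and the numbers below are purely
              | illustrative, not part of any real tenure rule):
              | 
              |   def tenure_score(originals: int, replications: int,
              |                    replication_weight: float = 0.75) -> float:
              |       """Count k original papers plus n replications,
              |       with each replication weighted below an original."""
              |       return originals + replication_weight * replications
              | 
              |   # e.g. 4 originals + 4 replications ~ 7 "paper units",
              |   # versus 8 under today's raw count of k + n = 8 papers.
              |   print(tenure_score(4, 4))   # 7.0
              | 
              | A committee could then require a minimum score plus a
              | minimum number of replications, rather than raw paper
              | counts.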
        
               | ebiester wrote:
                | That is my idea... If we opened it up, there are probably
               | more interesting iterations, such as requiring pre-
               | registration for all papers, having papers with pre-
               | registration count as some portion of a full paper even
               | if they fail so long as the pre-registration passed
               | scrutiny, having non-replicated papers count as some
               | portion of a fully replicated paper, and having
               | replication as a separate category such that there is a
               | minimum k, a minimum n, and a minimum k+n.
               | 
                | The non-easy part is that once we start making changes to
                | the criteria for tenure, it opens the door to people
                | trying to stuff in solutions for all of the problems that
                | everyone already knows about. (See above.) Would someone
                | try to stuff in a code-availability requirement for CS
                | conference papers, for example? What does it mean for a
                | poster session? At what point are papers released as
                | pre-prints? What does it mean for the tenure clock or the
                | Ph.D. clock? Does it mean that pre-tenure faculty can't
                | depend on studies that take time to replicate? What do we
                | do with longitudinal studies?
               | 
               | I think you're looking at a 50 year transition where you
               | would have to start simple and iterate.
        
               | harimau777 wrote:
               | Is tenure really as mechanical as "publish this many
               | papers and you get it"? My impression was that it took
               | into account things like impact factor and was much more
               | subjective. If that were the case, then wouldn't you run
               | into problems with whoever decides tenure paying lip
               | service to counting replication or failed pre-registered
               | papers but in practice being biased in favor of original
               | research?
        
           | rapjr9 wrote:
           | Another approach I've seen actually used in Computer Science
           | and Physics is to make replication a part of teaching to
           | undergrads and masters candidates. The students learn how to
           | do the science, and they get a paper out of replicating the
           | work (which may or may not support the original results), and
           | the field benefits from the replication.
        
           | Eddy_Viscosity2 wrote:
            | It's not easy because it isn't simple. How do you get all of
            | the
           | universities to change their incentives to back this?
        
             | ebiester wrote:
             | We agree - the "simple not easy" turn of phrase is speaking
             | to that point. It is easy once implemented, but it isn't
             | easy to transition. (I am academia-adjacent by marriage but
             | closer to the humanities, so I understand the amount of
             | work it would take to perform the transition.)
        
               | MichaelZuo wrote:
               | This isn't just not easy, it would probably be extremely
               | political to change the structure of the NSF, National
               | Labs, all universities and colleges, etc., so
               | dramatically.
        
           | tnecniv wrote:
           | Your proposal has a whole slew of issues.
           | 
           | First, people that want to be professors normally do so
           | because they want to steer their research agenda, not repeat
           | what other people are doing without contribution. Second, who
           | works in their lab? Most of the people doing the leg work in
           | a lab are PhD students, and, to graduate, they need to do
           | something novel to write up in their dissertation. Thus, they
           | can't just replicate three experiments and get a doctorate.
           | Third, you underestimate how specialized lab groups are --
            | both in terms of the incredibly expensive equipment they are
            | equipped with and the expertise within the lab. Even folks in
           | the same subfield (or even in the same research group!) often
           | don't have much in common when it comes to interests,
           | experience, and practical skills.
           | 
           | For every lab doing new work, you'd basically need a clone of
           | that lab to replicate their work.
        
             | majormajor wrote:
             | > First, people that want to be professors normally do so
             | because they want to steer their research agenda, not
             | repeat what other people are doing without contribution.
             | 
             | If we're talking about weird incentives and academia you
             | hit on one of the worst ones right here, I think, since
             | nothing there is very closely connected to helping students
             | learn.
             | 
             | I know that's a dead horse, but it's VERY easy to find
             | reasons that we shouldn't be too closely attached to the
             | status quo.
             | 
             | > For every lab doing new work, you'd basically need a
             | clone of that lab to replicate their work.
             | 
             | Hell, that's how startup funding works, or market economies
             | in general. Top-down, non-redundant systems are way more
             | fragile than distributed ecosystems. If you don't have the
             | competition and the complete disconnection, you so much
             | more easily fall into political games of "how do we get
             | this published even if it ain't great" vs "how do we find
             | shit that will survive the competition"
        
           | harimau777 wrote:
            | I think that there are also a lot of
            | psychological/cultural/political issues that would need to be
            | worked out:
           | 
           | If someone wins the Nobel Prize, do the people who replicated
           | their work also win it? When the history books are written do
           | the replicators get equal billing to the people who made the
           | discovery?
           | 
           | When selecting candidates for prestigious positions, are they
           | really going to consider a replicator equal to an original
           | researcher?
        
         | kergonath wrote:
         | > I don't see how this could ever work, and non-scientists seem
         | to often dramatically underestimate the amount of work it would
         | be to replicate every published paper.
         | 
         | They also tend to over-estimate the effect of peer review
         | (often equating peer review with validity).
         | 
         | > If someone cares enough about the work to build on it, they
         | will replicate it anyway. And in that case they have a good
         | incentive to spend the effort. If that works this will
         | indirectly support the original paper even if the following
         | papers don't specifically replicate the original results.
         | Though this part is much more problematic if the following
         | experiments fail, then this will likely remain entirely
         | unpublished.
         | 
         | It can also remain unpublished if other things did not work
         | out, even if the results could be replicated. A half-fictional
         | example: a team is working on a revolutionary new material to
         | solve complicated engineering problems. They found a material
         | that was synthesised by someone in the 1980s, published once
         | and never reproduced, which they think could have the specific
         | property they are after. So they synthesise it, and it turns
         | out that the material exists, with the expected structure but
         | not with the property they hoped. They aren't going to write it
         | up and publish it; they're just going to scrap it and move on
         | to the next candidate. Different teams might be doing the same
         | thing at the same time, and nobody coming after them will have
         | a clue.
        
           | techdragon wrote:
            | This waste of effort from duplicating unpublished negative
            | results is a big factor in why replicated results deserve to
            | be rated more highly than results that have not been
            | replicated, regardless of the prestige of the researchers or
            | the institutions involved... if no one can prove your work
            | was correct, how much can anyone trust it?
            | 
            | I have gone down the rabbit hole of engineering research
            | before, and 90% of the time I've managed to find an anecdote,
            | a footnote in subsequent research, or an actual follow-up
            | publication that substantially invalidated the lofty claims
            | of the engineers of the 70s or 80s (whose work, despite this,
            | remains a genuine treasure trove of unused and sometimes
            | useful aerospace research and development). Unfortunately,
            | outside the few proper publications, a lot of those
            | invalidations never properly cite the material they
            | invalidate, and I can spend a week cross-referencing before I
            | spot the link and realise that the unnamed work a paper
            | claims to prove wrong is actually a footnote containing the
            | only published data (before the new paper) on some old work,
            | sitting as a bad scan on the NASA NTRS server under an
            | obscure title with no keywords related to the topic the
            | research is notionally about...
            | 
            | Academic research can genuinely suck sometimes...
            | particularly when you want to actually apply it.
        
           | vibrio wrote:
           | "They also tend to over-estimate the effect of peer review
           | (often equating peer review with validity)."
           | 
              | In my experience, scientists are comfortably cynical about
              | peer review - even those that serve as reviewers and editors -
           | except maybe junior scientists that haven't gotten burned
           | yet.
        
             | renonce wrote:
              | I don't know how scientists handle peer review, but aren't
              | they fighting with peer review to get their papers
              | published, and then applying for PhDs, tenure, grants, etc.
              | with these publications?
        
             | kergonath wrote:
             | Yes, because we know how the metaphorical sausage is made:
             | with unpaid reviewers who have many other, more interesting
             | things to do and often an axe to grind. That is, if they
             | don't delegate the review to one of their post-docs.
        
               | aftoprokrustes wrote:
               | Post doc? In what kind of utopian field did you work? In
               | my former institute virtually all papers were written by
               | PhD candidates, and reviewed by PhD candidates. With the
               | expected effect on quality (due to lack of experience and
               | impostor-syndrome-induced "how can I propose to reject?
               | They are likely better than me"). But the Prof-to-
               | postdoc-to-PhD-ratio was particularly bad (1-2-15).
        
               | kelipso wrote:
               | I was reviewing papers starting second semester of grad
               | school with my advisor just signing off on it, so not
               | even PhD candidates, and it was the same for my lab mates
               | too.
               | 
               | Initially we spent probably a few hours on a paper for
               | peer review because we were relatively unfamiliar with
               | the field but eventually I spent maybe a couple of hours
               | doing the review. Wouldn't say peer review is a joke but
               | it's definitely overrated by the public.
        
             | jakear wrote:
             | It's the general public that equates "peer reviewed" with
             | "definitely correct, does not need to be questioned".
        
         | dongping wrote:
         | While it is a lot of work, I tend to think that one can then
         | always publish preprints if they can't wait for the
         | replication. I don't understand why a published paper should
         | count as an achievement (against tenure or funding) at all
         | before the work is replicated. The current model just creates
         | perverse incentives to encourage lying, P-hacking, and cherry-
         | picking. This would at least work for fields like machine
         | learning.
         | 
         | This is, of course, a naive proposal without too much thought
         | into it. But I was wondering what I would have missed here.
        
           | i_no_can_eat wrote:
           | and in this proposal, who will be tasked with replicating the
           | work?
        
             | dongping wrote:
              | In some fields, replication is already a prerequisite for
              | benchmarking the SoTA. So the incentive question boils down
              | to publishing those replications along with negative
              | results. Or, as some have suggested, make it mandatory for
              | PhD candidates to replicate.
              | 
              | Though it seems possible to game the system by intentionally
              | producing positive or negative replications to collude with
              | or harm the authors.
        
         | omgwtfbyobbq wrote:
         | What about a system where peer replication is required if the
         | number of citations exceeds some threshold?
        
           | p1esk wrote:
           | Who will be replicating it? Why would I want to set aside my
           | own research to replicate some claim someone made? How would
           | this help my career?
        
             | Knee_Pain wrote:
             | Academia's values are not objective. Why is it that
                | replicating or refuting a study is not seen as on par with
                | being a co-author of said study? There is nothing set in
                | stone
             | preventing this, only the current academic culture.
        
               | p1esk wrote:
               | Because I want to do original research, and be known for
               | doing original research. Only if I fail at that, I might
               | settle for being a guy who reproduces others' work (which
               | basically means the transition from a researcher to an
               | engineer).
        
               | omgwtfbyobbq wrote:
               | Whether or not you would be doing original research
               | depends on whether the cited work can be replicated.
               | 
               | If the cited work is unable to be replicated, and you try
               | to replicate but get different results, then you would be
               | doing original research, and then you can base further
               | work on your initial original study that came to a
               | different result.
               | 
               | On the flip side, if you are able to replicate it, then
               | you are doing extra work initially, but after replicating
               | the work you've cited, the work you've done is more
               | likely to be reproducible by someone else.
               | 
               | The amount of citations needed to require replication
               | could itself be a function of how easy it is to replicate
               | work across an entire field.
               | 
               | A field where there's a high rate of success in
               | replicating work could have a higher threshold for
               | requiring replication compared to a field where it's
               | difficult to replicate work.
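                | 
                | As an illustration of that last point, the threshold
                | could be a simple function of a field's historical
                | replication success rate (the numbers here are made up):
                | 
                |   def citation_threshold(success_rate: float,
                |                          base: int = 10,
                |                          scale: int = 90) -> int:
                |       """Citations a paper may accumulate before a
                |       replication is required; fields where replication
                |       usually succeeds get more slack."""
                |       return base + round(scale * success_rate)
                | 
                |   print(citation_threshold(0.9))  # 91
                |   print(citation_threshold(0.3))  # 37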
        
             | omgwtfbyobbq wrote:
             | I dunno. Offhand, I guess whoever is citing the work would
             | need to replicate it, but only if it's cited sufficiently
             | (overall number of citations, considered foundational,
             | etc...)
             | 
                | This could help your career by increasing the probability
                | that the work you're citing is accurate, and as a result,
                | that your own work is also accurate.
        
               | RoyalHenOil wrote:
               | A typical paper may cite dozens or hundreds of other
               | papers. This does not sound feasible. It honestly seems
               | like it would worsen the existing problem and force
               | scientists to focus even more on their own original
               | research in isolation from others, to avoid the expense
               | of running myriad replication experiments that they
               | likely don't have the funding and personnel to do.
        
         | boxed wrote:
         | > I don't see how this could ever work, and non-scientists seem
         | to often dramatically underestimate the amount of work it would
         | be to replicate every published paper.
         | 
         | I don't see how the current system works really either. Fraud
          | is rampant, and a replication crisis is the norm in most
          | fields.
         | 
         | Basically the current system is failing at finding out what is
         | true. Which is the entire point. That's pretty damn bad.
        
           | tptacek wrote:
           | Fraud seems rampant because you hear about cases of fraud,
           | but not about the tens of thousands of research labs plugging
           | away day after day.
        
             | mike_hearn wrote:
             | Unfortunately there's a lot of evidence that fraud really
             | is very prevalent and we don't hear about it anywhere near
             | enough. It depends a lot on the field though.
             | 
             | One piece of evidence comes from software like GRIM and
             | SPRITE. GRIM was run over psychology papers and found
             | around 50% had impossible means in them (that could not be
             | arrived at by any combination of allowed inputs) [1]. The
             | authors generally did not cooperate to help uncover the
             | sources of the problems.
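              | 
              | The GRIM check itself is simple arithmetic: with n
              | integer-valued responses, any achievable mean is an integer
              | sum divided by n. A minimal sketch (not GRIM's actual
              | code):
              | 
              |   def grim_consistent(reported_mean: float, n: int,
              |                       decimals: int = 2) -> bool:
              |       """Can a mean reported to `decimals` places arise
              |       from n integer-valued responses?"""
              |       target = round(reported_mean, decimals)
              |       k = round(reported_mean * n)  # nearest integer sum
              |       return any(round(c / n, decimals) == target
              |                  for c in (k - 1, k, k + 1))
              | 
              |   print(grim_consistent(3.48, 20))  # False: 69/20 = 3.45,
              |                                     # 70/20 = 3.50
              |   print(grim_consistent(3.45, 20))  # True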
             | 
             | Yet another comes from estimates by editors of well known
             | journals. For example Richard Horton at the Lancet is no
             | stranger to fraud, having published and promoted the
             | Surgisphere paper. He estimates that maybe 50% of medical
             | papers are making untrue claims, which is interesting in
             | that this intuition matches the number obtained in a
             | different field by a more rigorous method. The former
             | editor of the New England Journal of Medicine stated that
             | it was "no longer possible to believe much of the medical
             | research that is published".
             | 
             | 50%+ is a number that crops up frequently in medicine. The
             | famous Ioannidis paper, "Why most published research
             | findings are false" (2005) has been cited over 12,000
             | times.
             | 
             | Marc Andreessen has said in an interview that he talked to
             | the head of a very large government grant agency, and asked
             | him whether it could really be true that half of all
             | biomedical research claims were fake? The guy laughed and
             | said no it's not true, it's more like 90%. [2]
             | 
             | Elizabeth Bik uncovers a lot of fraud. Her work is behind
             | the recent resignation of the head of Stanford University
             | for example. Years ago she said, _" Science has a huge
             | problem: 100s (1000s?) of science papers with obvious
             | photoshops that have been reported, but that are all swept
             | under the proverbial rug, with no action or only an author-
             | friendly correction ... There are dozens of examples where
             | journals rather accept a clean (better photoshopped?)
             | figure redo than asking the authors for a thorough
             | explanation."_ In reality there seem to be far more than
             | mere thousands, as there are companies that specialize in
             | professionally producing fake scientific papers, and whole
             | markets where they are bought and sold.
             | 
             | So you have people who are running the scientific system
             | saying, on the record, that they think science is overrun
              | with fake results. And there is some quantitative data to
             | support this. And it seems to happen quite often now that
             | presidents of entire universities are being caught having
             | engaged in or having signed off on rule breaking behavior,
             | like image manipulation or plagiarism, implying that this
             | behavior is at least rewarded or possibly just very common.
             | 
             | There are also whole fields in which the underlying
             | premises are known to be false so arguably that's also
             | pretty deceptive (e.g. "bot studies"). If you include those
             | then it's quite likely indeed that most published research
             | is simply untrue.
             | 
             | [1] https://peerj.com/preprints/2064v1/
             | 
             | [2] https://www.richardhanania.com/p/flying-x-wings-into-
             | the-dea...
        
             | lliamander wrote:
             | I agree that most labs are probably not out to defraud
             | people. But without replication I don't think it's
             | reasonable to have much confidence in what is published.
        
               | magimas wrote:
                | Replication happens over time. For example, when I did my
                | PhD I wanted to grow TaS2 monolayers on a graphene layer
                | on an iridium crystal. So I took published growth
                | recipes for related materials, adapted them to our setup,
                | and then fine-tuned the recipe for TaS2. This way I
                | basically "peer replicated" the growth of the original
                | paper. I then took those samples to a measurement device
                | and modified the sample in situ by evaporating Li atoms
                | on top (which was the actual paper, but I needed a sample
                | to modify first). I published the paper with the growth
                | recipe and the modification procedure, and other
                | colleagues then took those instructions to grow their own
                | samples for their own studies (I think it was MoS2 on
                | graphene on cobalt that they grew).
               | 
                | This way papers are peer replicated in an emergent manner,
                | because the knowledge is passed from one group to another
                | and they apply parts of that knowledge to their own
                | research. You have to look at this from a more holistic
                | perspective. Individual papers don't mean too much; it's
                | their overlap that generates scientific consensus.
               | 
                | In contrast, requiring some random reviewer to instead
                | replicate my full paper would be an impossible task.
                | He/she would not have the required equipment (because
                | there are only two lab setups in the whole world with the
                | necessary equipment), he/she would probably not have the
                | required knowledge (because our research only partially
                | overlaps - e.g. we're researching the same materials, but
                | I use angle-resolved photoemission experiments and he's
                | doing electronic transport), and he/she would need to
                | spend weeks first adapting the growth recipe to the point
                | where his sample quality is the same as mine.
        
               | tptacek wrote:
               | That's not what publication is about. Publication is a
               | conversation with other researchers; it is part of the
               | process of reaching the truth, not its endpoint.
        
               | cpach wrote:
               | People in general (at least on da Internetz) seem to
                | focus way too much on single studies, and way too little
               | on meta-studies.
               | 
                | AFAICT meta-studies are the level where we as a society
               | really can try to say something intelligent about how
               | stuff works. If an important question is not included in
               | a meta-study, we (i.e. universities and research labs)
               | probably need to do more research on that topic before we
               | really can say that much about it.
        
               | lliamander wrote:
               | Sure, and scientists need a place to have such
               | conversations.
               | 
               | But publication is not a closed system. The "published,
               | peer-reviewed paper" is frequently an artifact used to
               | decide practical policy matters in many institutions both
               | public and private. To the extent that Science (as an
               | institution in its own right) wants to influence policy,
               | that influence needs to be grounded in reproducible
               | results.
               | 
               | Also, I would not be surprised if stronger emphasis on
               | reproducibility improved the quality of conversation
               | among scientists.
        
               | vladms wrote:
               | Maybe replication should (and probably does) happen when
               | the published thing is relevant to some entity and also
               | interesting.
               | 
                | I've never seen papers as "truth", but more as
               | "possibilities". After many other "proofs" (products,
               | papers, demos, etc.) you can assign some concepts/ideas
               | the label "truth" but one/two papers from the same group
               | is definitely not enough.
        
               | tnecniv wrote:
               | Yeah passing peer review doesn't mean that the article is
               | perfect and to be taken as truth now (and remember, to
               | err is human; any coder on here has had some long
               | standing bug that went mostly unnoticed in their code
                | base). It means it passed the journal's standards for
                | novelty, interest, and rigor based on the described
                | methods, as determined by the editor / area chair and peer
                | reviewers who are selected for being knowledgeable on
                | the topic.
               | 
               | Implicit in this process is that the authors are acting
               | in good faith. To treat the authors as hostile is both
               | demoralizing for the reviewers (who wants to be that
               | cynical about their field) and would require extensive
               | verification of each statement well beyond what is
               | required to return the review in a timely manner.
               | 
               | Unless your paper has mathematical theory (and mistakes
               | do slip through), a publication should not be taken as
               | proof of something on its own, but a data point. Over
               | time and with enough data points, a field builds evidence
               | to turn a hypothesis into a scientific theory.
        
         | majormajor wrote:
         | I think the current system is just measuring entirely the wrong
         | thing. Yes, fewer papers would be published. But today's goal
         | is "publish papers" not "learn and disseminate truly useful and
         | novel things", and while this doesn't solve it entirely, it
         | pushes incentives further away from "publish whatever pure crap
         | you can get away with." You get what you measure -> sometimes
         | you need to change what/how you measure.
         | 
         | > If someone cares enough about the work to build on it, they
         | will replicate it anyway.
         | 
         | That's duplicative at the "oh maybe this will be useful to me"
         | stage, with N different people trying to replicate. And with
         | replication not a first-class part of the system, the effort of
         | replication (e_R) is high. For appealing things, N is probably
          | greater than 2, so the total effort is N x e_R.
         | 
          | If you move the burden to the "replicate to publish" stage, you
          | can fix the number of replicas needed so N=2 (or whatever)
          | _and_ you incentivize the original researchers to make e_R lower
          | (which will improve the quality of their research _even before
          | the submit-for-publication stage_).
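          | 
          | A toy comparison of the two regimes (the effort numbers are
          | invented just to show the shape of the argument):
          | 
          |   # Today: N interested groups each re-derive the work at high
          |   # effort e_R because replication is an afterthought.
          |   N, e_R_today = 3, 6          # person-months each
          |   total_today = N * e_R_today  # 18
          | 
          |   # "Replicate to publish": fixed at 2 replicas, and authors are
          |   # pushed to lower e_R by documenting methods and code.
          |   replicas, e_R_required = 2, 3
          |   total_required = replicas * e_R_required  # 6
          | 
          |   print(total_today, total_required)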
         | 
         | I've been in the system, I spent a year or two chasing the tail
         | of rewrites, submissions, etc, for something that was
         | detectable as low-effect-size in the first place but I was told
         | would still be publishable. I found out as part of that that it
         | would only sometimes yield a good p-value! And everything in
         | the system incentivized me to hide that for as long as
         | possible, instead of incentivizing me to look for something
         | else or make it easy for others to replicate and judge for
         | themselves.
         | 
          | Hell, do something like "give undergrads the opportunity to
          | earn a Master's on top of their BSes, say, by replicating (or
          | blowing holes in) other people's submissions." I would've eaten
          | up an opportunity like that to go _really really deep_ in some
          | specialized area in exchange for a master's degree, in a less-
          | structured way than "just take a bunch more courses."
        
         | DoctorOetker wrote:
         | > [...] non-scientists seem to often dramatically underestimate
         | the amount of work it would be to replicate every published
         | paper
         | 
         | Either "peer reviewed" articles describe progress of promising
         | results, or they don't. If they don't the research is
         | effectively ignored (at least until someone finds it
         | promising). So let's consider specifically output that
          | describes promising results.
         | 
         | After "peer review" any apparently promising results prompt
          | other groups to build on them, using them as a step or
         | building block.
         | 
         | It can take many failed attempts by independent groups before
         | anyone dares publish the absence of the proclaimed
          | observations, since they may try it multiple times, thinking
          | they must have botched it somewhere.
         | 
         | On paper it sounds more expensive to require independent
          | replication, but only because the costs of replication attempts
          | are typically hidden until rather late.
         | 
         | Is it really more expensive if the replication attempts are in
         | some sense mandatory?
         | 
         | Or is it perhaps more expensive to pretend science has found a
         | one-shot "peer reviewed" method, resulting in uncoordinated
         | independent reproduction attempts that may go unannounced
         | before, or even after failed replications?
         | 
         | The pseudo-final word, end of line?
         | 
          | What about the "in some sense mandatory" replication? Perhaps
          | roll provable dice for each article, and use in-domain sortition
          | to randomly assign replicators. Every scientist would then spend
          | a certain fraction of their time replicating the research of
          | others. The types of acceptable excuses for shirking these
          | duties should be scrutinized and controlled. But some excuses
          | should be very valid, for example _conscientious objection_: if
          | you are tasked to reproduce some of Dr. Mengele's works, you can
          | cop out on condition that you thoroughly motivate your ethical
          | concerns and objections. This could also bring a lot of healthy
          | criticism to practices that are otherwise just ignored and
          | glossed over for fear of harming future career opportunities.
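          | 
          | A minimal sketch of such a verifiable draw (the hashing scheme
          | and lab names are hypothetical, just to show that anyone could
          | audit the assignment):
          | 
          |   import hashlib, random
          | 
          |   def assign_replicators(article_id, labs, public_seed, k=2):
          |       """Deterministically draw k in-domain labs for an article;
          |       re-running with the published seed verifies the draw."""
          |       digest = hashlib.sha256(
          |           f"{public_seed}:{article_id}".encode()).hexdigest()
          |       rng = random.Random(int(digest, 16))
          |       return rng.sample(sorted(labs), k)
          | 
          |   labs = ["lab-A", "lab-B", "lab-C", "lab-D", "lab-E"]
          |   print(assign_replicators("10.1234/example.42", labs,
          |                            "lottery-2021-07"))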
        
         | jofer wrote:
         | Also, don't forget that a lot of replication would
         | fundamentally involve going and collecting additional samples /
         | observations / etc in the field area, which is often expensive,
         | time consuming, and logistically difficult.
         | 
         | It's not just "can we replicate the analysis on sample X", but
         | also "can we collect a sample similar to X and do we observe
         | similar things in the vicinity" in many cases. That alone may
         | require multiple seasons of rather expensive fieldwork.
         | 
         | Then you have tens to hundreds of thousands of dollars in
          | instrument time to pay for running the various analyses that are
          | needed in parallel with the field observations.
         | 
         | It's rarely the simple data analysis that's flawed and far more
         | frequently subtle issues with everything else.
         | 
         | In most cases, rather than try to replicate, it's best to test
         | something slightly different to build confidence in a given
         | hypothesis about what's going on overall. That merits a
         | separate paper and also serves a similar purpose.
         | 
         | E.g. don't test "can we observe the same thing at the same
         | place?", and instead test "can we observe something
         | similar/analogous at a different place / under different
         | conditions?". That's the basis of a lot of replication work in
         | geosciences. It's not considered replication, as it's a
         | completely independent body of work, but it serves a similar
         | purpose (and unlike replication studies, it's actually
         | publishable).
        
         | b59831 wrote:
         | [dead]
        
         | kshahkshah wrote:
         | When I looked into this, more than 15 years ago, I thought the
         | difficult portion wasn't sharing the recipe, but the
         | ingredients, if you will - granted I was in a molecular biology
         | lab. Effectively the Material Transfer Agreements between
         | Universities all trying to protect their IP made working with
         | each other unbelievably inefficient.
         | 
         | You'd have no idea if you were going down a well trodden path
         | which would yield no success because you have no idea it was
         | well trod. No one publishes negative results, etc.
        
         | RugnirViking wrote:
          | Let's be brutally honest with ourselves.
         | 
         | 99% of all papers mean nothing. They add nothing to the
         | collective knowledge of humanity. In my field of robotics there
         | are SOOO many papers that are basically taking three or four
         | established algorithms/machine learning models, and applying
         | them to off-the-shelf hardware. The kind of thing any person
         | educated in the field could almost guess the results exactly.
          | Hundreds of such iterations for any reasonably popular problem
          | space (prosthetics, drones for wildfires, museum guide robots),
          | etc., every month. Far more than could possibly be useful to
         | anyone.
         | 
         | There should probably be some sort of separate process for
         | things that actually claim to make important discoveries. I
         | don't know what or how that should work. In all honesty maybe
          | there should just be fewer papers, however that could be
         | achieved.
        
           | indymike wrote:
           | > 99% of all papers mean nothing. They add nothing to the
           | collective knowledge of humanity.
           | 
            | A lot of papers are done as part of the process of getting
            | a degree, or of keeping or getting a job. The value is mostly
            | the candidate showing they have the acumen to produce a paper
            | of sufficient quality to meet the publisher's and peer
            | reviewers' requirements. In some cases, it is to show a future
            | employer some level of accomplishment or renown. The knowledge
            | gained for humanity is mostly the author's ability to get
            | published.
        
             | RugnirViking wrote:
             | well yes. But these should go somewhere else than the
             | papers that may actually contain significant results. The
             | problem we have here is that there is an enormous quantity
             | of such useless papers mixed in with the ones actually
             | trying to do science.
             | 
             | I understand that part of the reason for that is that
             | people need to appear as though they are part of the
             | "actually trying" crowd to get the desired job effects. But
             | it is nonetheless a problem, and a large one very worth at
             | least trying to solve.
        
           | staunton wrote:
            | 99% of science is a waste of time, not just the papers. We
            | just don't know which 1% will turn out not to be; the point
            | is that this is how progress gets made. As such, these 99%
            | definitely _are_ adding to the collective knowledge. Maybe
            | they add very little, and maybe it's not worth the effort, but
            | it's not nothing. I think one of the effects of AI progress
            | will be allowing us to extract much more of the little value
            | such publications have (the 99% of papers might not be worth
            | reading but are good enough for feeding the AI).
        
         | [deleted]
        
         | throwaway4aday wrote:
         | What's the value in publishing something that is never
         | replicated? If no one ever reproduces the experiment and gets
         | the same results then you don't know if any interpretations
         | based on that experiment are valid. It would also mean that
         | whatever practical applications could have come from the
         | experiment are never realized. It makes the entire pursuit seem
         | completely useless.
        
           | geysersam wrote:
           | It still has value if we assume the experiment was done by
            | competent, honest people who are unlikely to try to fool us on
            | purpose and unlikely to have made errors.
           | 
           | It would be even better if it was replicated of course.
           | 
           | Depending on what certainty you need you might have to wait
           | for the result of one or several replications, but that is
           | application dependent.
        
           | wizofaus wrote:
           | > What's the value in publishing something that is never
           | replicated?
           | 
           | Because it presents an experimental result to other
           | scientists that they may consider worth trying to replicate?
        
             | dongping wrote:
             | Then those unconfirmed results are better put on arxiv,
             | instead of being used to evaluate the performance of
             | scientists. Tenure and grant committees should only
             | consider replicated work.
        
               | geysersam wrote:
                | I don't agree. A published article should not be taken
                | for God's truth, no matter whether it's replicated or peer
                | reviewed.
                | 
                | Lots of "replicated", "peer-reviewed" research has been
                | found to be wrong. That's fine; it's part of the process
                | of discovery.
               | 
               | A paper should be taken for what it is: a piece of
               | scientific work, a part of a puzzle.
        
         | justinpombrio wrote:
         | > If someone cares enough about the work to build on it, they
         | will replicate it anyway.
         | 
         | Well, the trouble is that hasn't been the case in practice. A
         | lot of the replication crisis was attempting for the first time
         | to replicate a _foundational_ paper that dozens of other papers
         | took as true and built on top of, and then seeing said
         | foundational paper fail to replicate. The incentives point
         | toward doing new research instead of replication, and that
         | needs to change.
        
           | p1esk wrote:
           | It is the case in my field (ML): if I care enough about a
           | published result I try to replicate it.
        
             | tnecniv wrote:
              | This is something very sensible in ML since you likely
             | want to use that algorithm for something else (or to extend
             | / modify it), so you need to get it working in your
             | pipeline and verify it works by comparing with the
             | published result.
             | 
             | In something like psychology that is likely harder, since
             | the experiment you want to do might be related to but
             | differ significantly from the prior work. I am no
             | psychologist, but I'd like to think that they don't take
             | one study as ground truth for that reason but try to
             | understand causal mechanisms with multiple studies as data
             | points. If the hypothesis is correct, it will likely
             | present in multiple ways.
        
         | brightball wrote:
         | > I don't see how this could ever work, and non-scientists seem
         | to often dramatically underestimate the amount of work it would
         | be to replicate every published paper.
         | 
          | The alternative is a bunch of stuff being published that people
          | believe is "science" but that doesn't hold up under scrutiny,
          | which undermines the reliability of science itself. The current
         | approach simply gives people reason to be skeptical.
        
           | ImPostingOnHN wrote:
           | I'm not convinced this proposed alternative is better than
           | the status quo. It's simply not feasible, no matter how many
           | benefits one might imagine.
           | 
           | the concern about skepticism is not irrelevant, but many of
           | these skeptics also are skeptical of the earth being round,
           | or older than a few thousand years, or not created by an
           | omnipotent skylord, and I'm not sure it's actually a
           | significant concern given the current number and expertise of
           | those who are skeptical
           | 
           | so, we can hear their arguments for their skepticism, but
           | that doesn't mean the arguments are valid to warrant the
           | skepticism exhibited. And in the end, that's what matters:
           | skepticism warranted by valid arguments, not just any Cletus
           | McCletus's skepticism of heliocentrism, as if his opinion is
           | equal to that of an astrophysicist (it isn't). And you know
           | what? It isn't necessary to convince a ditch digger that the
           | earth goes around the sun, if they feel like arguing about
           | it.
        
         | backtoyoujim wrote:
         | Yes it would indeed mean slowing down and having more
         | scientists.
         | 
         | It would mean disruption is no longer a useful tool for human
         | development.
        
           | brnaftr361 wrote:
           | It may not be. I would be willing to argue that there was a
           | tipping point and we've long exceeded its boundary - progress
           | and disruption now is just making finding an equilibrium in
           | the future increasingly difficult.
           | 
           | So entering into a paradigm where we test the known space -
           | especially presently - would 1) help reduce cruft; 2) abate
            | undesirable forward progress; 3) train the next
           | generation(s) of scientists to be more diligent and better
           | custodians of the domain.
        
           | ebiester wrote:
           | I don't necessarily think it would mean more scientists, but
           | it would mean more expense. You have a moderate number of low
           | impact papers that people are doing for tenure today - papers
           | for the purpose of cranking out papers. We are talking about
           | redirecting efforts but increasing quality of what you have.
        
         | jononomo wrote:
         | If it is not replicated it shouldn't be published, other than
         | as a provisional draft. I don't care if it hurts your feelings.
        
         | sqrt_1 wrote:
          | FYI there is at least one science journal that only publishes
         | reproduced research:
         | 
          | Organic Syntheses: "A unique feature of the review process is
         | that all of the data and experiments reported in an article
         | must be successfully repeated in the laboratory of a member of
         | the editorial board as a check for reproducibility prior to
         | publication"
         | 
         | https://en.wikipedia.org/wiki/Organic_Syntheses
        
         | throwawaymaths wrote:
         | > I don't see how this could ever work,
         | 
         | http://www.orgsyn.org/
         | 
         | > All procedures and characterization data in OrgSyn are peer-
         | reviewed and checked for reproducibility in the laboratory of a
         | member of the Board of Editors
         | 
         | Never is a strong word.
        
         | indymike wrote:
         | > This is also all work that doesn't benefit the scientists
         | replicating the paper. It only costs them money and time.
         | 
         | Maybe this is what needs to change. If we only reward discovery
         | and success, then the incentive is to only produce discovery
         | and success.
        
         | johnnyworker wrote:
         | > If someone cares enough about the work to build on it, they
         | will replicate it anyway.
         | 
          | Does it really deserve to be called _work_ if it doesn't
          | include a full, working set of instructions that, if
          | followed to a T, allow it to be replicated? To me that's more
         | like pollution, making it someone else's problem. I certainly
         | don't see how "we did this, just trust us" can even be
         | considered science, and that's not because I don't understand
         | the scientific method, that's because I don't make a living
         | with it, and have no incentive to not rock the boat.
        
           | davidktr wrote:
           | You just described the majority of scientific papers. A
           | "working set of instructions" is not really feasible in most
           | cases. You can't include every piece of hard- and software
           | required to replicate your own setup.
        
             | lliamander wrote:
             | Sounds like a problem worth solving.
        
             | johnnyworker wrote:
             | Then don't call it science, since it doesn't contribute
             | anything to the body of human knowledge.
             | 
             | I think it's fascinating that we can at the same time hold
             | things like "one is none" to be true, or that you should
             | write tests first, but with science we already got so used
             | to a lack of discipline that we just declare it fine.
             | 
             | It's not hard to not climb a tower you can't get down from.
             | It's the default, actually. You start with something small
             | where you can describe everything that goes into
             | replicating it. Then you replicate it yourself, based on
             | your own instructions. Before that, you don't bother anyone
             | else with it. Once that is done, and others can replicate
             | as well, it "actually exists".
             | 
             | And if that means the majority of stuff has to be thrown
             | out, I'd suggest doing that sooner rather than later,
             | instead of just accumulating scientific debt.
        
               | davidktr wrote:
               | Imagine two scientists, Bob and Alice. Bob has spent the
               | last 5 years examining a theory thoroughly. Now he can
               | explain down to the last detail why the theory does not
               | hold water, and why generations of researchers have been
               | wrong about the issue. Unfortunately, he cannot offer an
                | alternative, and nobody else can follow his long-winded
               | arguments anyway.
               | 
               | Meanwhile, Alice has spent the last 5 years making the
               | best possible use of the flawed theory, and published a
               | lot of original research. Sure, many of her publications
               | are rubbish, but a few contain interesting results.
               | Contrary to Bob, Alice can show actual results and has
               | publications.
               | 
               | Who do you believe will remain in academia? And,
               | according to public perception, will seem more like an
               | actual scientist?
        
               | tnecniv wrote:
               | Then Bob has failed.
               | 
               | Academic science isn't just the doing science part but
               | the articulation and presentation of your work to the
               | broader community. If Bob knows this space so well, he
               | should be able to clearly communicate the issue and,
               | ideally, present an easily understandable counter example
               | to the existing theory.
               | 
               | Technical folks undervalue presentation when writing
               | articles and presenting at conferences. The burden of
               | proof is on the presenter, and, unless there's some
               | incredible demonstration at the end, most researchers
               | won't have the time or attention to slog through your
               | mess of a paper to decipher it. There's only so much time
               | in the day and too many papers to read.
               | 
               | In my experience, the best researchers are also the best
               | presenters. I've been to great talks out of my domain
               | that I left feeling like I understood the importance of
               | their work despite not understanding the details. I've
               | also seen many talks in my field that I thought were
               | awful because the presentation was convoluted or they
               | didn't motivate the importance of their problem / why
               | their work addressed it
        
               | johnnyworker wrote:
               | I disagree that Bob doesn't produce actual results, or
               | that something that is mostly rubbish, but partly
               | "interesting" is an actual result. We know the current
               | incentives are all sorts of broken, across the board.
               | Goodhart's law and all that. To me the question isn't who
               | remains in academia given the current broken model, but
               | who would remain in academia in one that isn't as broken.
               | 
               | To put a point on it, if public distrust of science
               | becomes big enough, it all can go away before you can say
               | "cultural revolution" or "fascist strongman". Then
               | there'd be no more academia, and its shell would be
               | inhabited by party members, so to speak. I'd gladly
               | sacrifice the ability of Alice and others like her to
               | live off producing "mostly rubbish" to at least have a
               | _chance_ to save science itself.
        
               | cycomanic wrote:
               | This is a very simplistic view. Why do you believe QC
               | departments exist? Even in an industrial setting,
               | companies make the same thing at the same place on the
               | same equipment after sometimes years of process
               | optimisation of well understood technology. This is
               | essentially a best case scenario and still results fail
               | to reproduce. How are scientists who work at the cutting
               | edge of technology with much smaller budgets supposed to
               | give instructions that can be easily reproduced on first
               | go? Moreover how are they supposed to easily reproduce
               | other results?
               | 
               | That is not to say that scientists should not document the
               | process to their best ability so it can be reproduced in
               | principle. I'm just arguing that it is impossible to
               | easily reproduce other people's results. Again when
               | chemical/manufacturing companies open another location
               | they often spend months to years to make the process work
               | in the new factory.
        
               | johnnyworker wrote:
               | > companies make the same thing at the same place on the
               | same equipment after sometimes years of process
               | optimisation of well understood technology. This is
               | essentially a best case scenario and still results fail
               | to reproduce.
               | 
               | We're not talking about 1 of 10 reproduction attempts
               | failing, we're talking about 100%. And no, companies
               | don't time and time again try to reproduce something that
               | has never been reproduced and fail, to then try again,
               | endlessly. That's just not a thing.
               | 
               | > it is impossible to easily reproduce other people's
               | results
               | 
               | We're also not talking about "easily" reproducing
               | something, but _at all_. And in principle doesn't cut
               | it; it needs to be reproduced in practice.
        
             | johngladtj wrote:
             | You should.
        
           | MrJohz wrote:
           | I work with code, which is about as reproducible as it is
           | possible to get - the artifacts I produce are literally just
           | instructions on how to reproduce the work I've done again,
           | and again, and again. And still people come to me with some
           | bug that they've experienced on their machine, that I cannot
           | reproduce on my machine, despite the two environments being
           | as identical as I can possibly make them.
           | 
           | I agree that reproduction in scientific work is important,
           | but it is also apparently impossible in the best possible
           | circumstances. When dealing with physical materials, inexact
           | measurements, margins of error, etc, I think we have to
           | accept that there is no set of instructions that, if followed
           | to a T, will ever ensure perfect replication.
        
             | johnnyworker wrote:
             | > And still people come to me with some bug that they've
             | experienced on their machine, that I cannot reproduce on my
             | machine
             | 
             | But this is the other way around. Have you ever written a
             | program that doesn't run _anywhere_ except a single machine
             | of yours? Would you release it and advertise it and
             | encourage other people to use it as a dependency in their
             | software?
             | 
             | If it only runs on one machine of yours, you don't even
             | know whether it's your code doing something, or something
             | else in the machine/OS. Or in terms of science, whether the
             | research says something about the world, or just about the
             | research setup.
        
               | MrJohz wrote:
               | I think you misunderstand the point of scientific
               | publication here (at least in theory, perhaps less so in
               | practice). The purpose of a paper is typically to say "I
               | have achieved these results in this environment (as far
               | as I can tell)", and encourages reproduction. But the
               | original result is useful in its own right - it tells us
               | that there may be something worth exploring. Yes, it may
               | just be a measurement error (I remember the magic faster
               | than light neutrinos), but if it is exciting enough, and
               | lots of eyes end up looking, then flaws are typically
               | found fairly quickly.
               | 
               | And yes, there are often overly excited press releases
               | that accompany it - the "advertise it and encourage
               | others to use it as a dependency" part of the analogy - but
               | this is typically just noise in the context of scientific
               | research. If that is your main problem with scientific
               | publishing, you may want to be more critical of science
               | journalism instead.
               | 
               | Fwiw, yes of course I've written code that only runs on
               | my machine. I imagine everyone has, typically
               | accidentally. You do it, you realise your mistake, you
               | learn something from it. Which is exactly what we expect
               | from scientific papers that can't be reproduced.
        
               | johnnyworker wrote:
               | > But the original result is useful in its own right - it
               | tells us that there may be something worth exploring.
               | 
               | I disagree. It shows that when someone writes something
               | in a text editor and publishes it, others can read the
               | words they wrote. That's all it shows, by itself. Just
               | like someone writing something on the web only tells us
               | that a textarea accepts just about any input.
               | 
               | And even if it did show more than that, when someone
               | "explores" it, is the result is more of that, something
               | that might be true, might not be, but "is worth
               | exploring"? Then at what point does falsifiability enter
               | into it? Why not right away? To me it's just another
               | variation of making it someone else's problem, kicking
               | the can down the road.
               | 
               | > if it is exciting enough, and lots of eyes end up
               | looking, then flaws are typically found fairly quickly.
               | 
               | If that was true, there wouldn't even be a replication
               | issue, much less a replication crisis. It's like saying
               | open source means a lot of people look at the code, if
               | it's important enough. Time and time again that's proven
               | wrong, e.g. https://www.zdnet.com/article/open-source-
               | software-security-...
               | 
               | > yes of course I've written code that only runs on my
               | machine. I imagine everyone has
               | 
               | I wouldn't even know how to go about doing that. Can you
               | post something that only runs on one of your machines,
               | and you don't know why? Note I didn't say your machine, I
               | said _one_ machine of yours. Would you publish something
               | that runs on one machine of yours but not a single other
               | one, other than to ask  "can anyone tell me why this only
               | runs on this machine"? I doubt it.
        
               | varjag wrote:
               | > Note I didn't say your machine, I said one machine of
               | yours.
               | 
               | This thread discusses _peer_ replication, this is not
               | even an analogy.
        
               | johnnyworker wrote:
               | If you can't _even_ replicate it yourself, what makes you
               | think peers could? We are talking about something not
               | being replicated, not even by the original author. The
               | most extreme version would be something that you could
               | only get to run once on the same machine, and never on
               | any other machine.
        
               | MrJohz wrote:
               | I think you may be seeing the purpose of these papers
               | differently to me, which may be the cause of this
               | confusion.
               | 
               | The way you're describing a scientific publication is as
               | if it were the end result of the scientific act. To use
               | the software analogy, you're describing publication like
               | a software release: all tests have been performed, all CI
               | workflows have passed, QA have checked everything, and
               | the result is about to be shipped to customers.
               | 
               | But talking to researchers, they see publishing more like
               | making a new branch in a repository. There is no
               | expectation that the code in that branch already be
               | perfect (hence why it might only run on one machine, or
               | not even run at all, because sometimes even something
               | that doesn't work is still worth committing and exploring
               | later).
               | 
               | And just like in software, where you might eventually
               | merge those branches and create a release out of it, in
               | the scientific world you have metastudies or other forms
               | of analysis and literature reviews that attempt to glean
               | a consensus out of what has been published so far. And
               | typically in the scientific world, this is what happens.
               | However, in journalism, this isn't usually what happens,
               | and one person's experimental, "I've only tested this on
               | my machine" research is often treated as equivalent to
               | another person's "release branch" paper evaluating the
               | state of a field and identifying which findings are
               | likely to represent real, universal truths.
               | 
               | Which isn't to say that journalists are the only ones at
               | fault here - universities that evaluate researchers
               | primarily on getting papers into journals, and prestige
               | systems that make it hard to go against conventional
               | wisdom in the field both cause similar problems by
               | conflating different levels of research or adding
               | competing incentives to researchers' work. But I don't
               | think that invalidates the basic idea of published
                | research: to present a found result (or non-result),
               | provide as much information as possible about how to
               | replicate the result again, and then let other people use
               | that information to inform their work. It just requires
               | us to be mindful of how we let that research inform us.
        
               | johnnyworker wrote:
               | > But talking to researchers, they see publishing more
               | like making a new branch in a repository.
               | 
               | Well some do, others don't. Like the one who wrote the
               | article this is a discussion of.
               | 
               | https://en.wikipedia.org/wiki/Replication_crisis
               | 
               | > Replication is one of the central issues in any
               | empirical science. To confirm results or hypotheses by a
               | repetition procedure is at the basis of any scientific
               | conception. A replication experiment to demonstrate that
               | the same findings can be obtained in any other place by
               | any other researcher is conceived as an
               | operationalization of objectivity. It is the proof that
               | the experiment reflects knowledge that can be separated
               | from the specific circumstances (such as time, place, or
               | persons) under which it was gained.
               | 
               | Or, in short, "one is none". One _might_ turn into more
               | than one, it might not. Until it does, it's not real.
               | 
               | more snippets from the above WP article:
               | 
               | > This experiment was part of a series of three studies
               | that had been widely cited throughout the years, was
               | regularly taught in university courses
               | 
               | > what the community found particularly upsetting was
               | that many of the flawed procedures and statistical tools
               | used in Bem's studies were part of common research
               | practice in psychology.
               | 
               | > alarmingly low replication rates (11-20%) of landmark
               | findings in preclinical oncological research
               | 
               | > A 2019 study in Scientific Data estimated with 95%
               | confidence that of 1,989 articles on water resources and
               | management published in 2017, study results might be
               | reproduced for only 0.6% to 6.8%, even if each of these
               | articles were to provide sufficient information that
               | allowed for replication
               | 
               | I'm not saying it couldn't be fine to just publish things
               | because they "could be interesting". But the overall
               | situation seems like quite the dumpster fire to me. As
               | does software, FWIW.
        
         | techas wrote:
         | Well, you could create incentives to make replication attractive.
         | Give credit for replication. Give money to the researchers
         | doing the replication/review. Today we pay an average of
         | 2000EUR per article, reviewers get 0EUR, and the publisher keeps
         | it all for putting a PDF online. I would say there is margin there
         | to invest in improving the review process.
        
           | mandmandam wrote:
           | It's wild to me that although we _know_ that it was Ghislaine
           | Maxwell's daddy who started this incredibly corrupt system,
           | people hardly mention this fact.
           | 
           | The US system, and others, even attack people who dare to try
           | and make science more open. RIP Aaron Swartz, and long live
           | Alexandra Elbakyan.
        
         | sebzim4500 wrote:
         | >I don't see how this could ever work, and non-scientists seem
         | to often dramatically underestimate the amount of work it would
         | be to replicate every published paper.
         | 
         | I think it would be fine to halve the productivity of these
         | fields, if it means that you can reasonably expect papers to be
         | accurate.
        
           | dmarchand90 wrote:
           | I believe that, contrary to popular belief, the
           | implementation of this system would lead to a substantial
           | increase in productivity in the long run. Here's why:
           | 
           | Currently, a significant proportion of research results in
           | various fields cannot be reproduced. This essentially means
           | that a lot of work turns out to be flawed, leading to wasted
           | efforts (you can refer to the 'reproducibility crisis' for
           | more context). Moreover, future research often builds upon
           | this erroneous information, wasting even more resources. As a
           | result, academic journals get cluttered with substandard
           | work, making them increasingly difficult to monitor and
           | comprehend. Additionally, the overall quality of written
           | communication deteriorates as emphasis shifts from the
           | accurate transfer and reproduction of knowledge to the
           | inflated portrayal of novelty.
           | 
           | Now consider a scenario where 50% of all research is
           | dedicated to reproduction. Although this may seem to
           | decelerate progress in the short term, it ensures a more
           | consistent and reliable advancement in the long term. The
           | quality of writing would likely improve to facilitate
           | replication. Furthermore, research methodology would be
           | disseminated more quickly, enhancing overall research
           | effectiveness.
        
             | matthewdgreen wrote:
             | In the current system scientists allocate reproduction
             | efforts to results that they intend to build on. So if
             | you've claimed a breakthrough technique for levitating
             | widgets -- and I think this widget technique can be used to
             | build spacecraft (or if I think your technique is wrong) --
             | then I will allocate precious time and resources to
             | reproducing your work. By contrast if I don't think your
             | work is significant and worth following up on, then I
             | allocate my efforts somewhere else. The advantage is that
             | more apparently-significant results ("might cure cancer")
             | tend to get a bigger slice of very limited resources, while
             | dead-end or useless results ("might slightly reduce
             | flatulence in cats") don't. This distributed
             | entrepreneurial approach isn't perfect, but it works better
             | than central planning. By contrast you could adopt a
             | Soviet-like approach where cat farts and cancer both share
             | replication resources, but this seems like it would be bad
             | for everyone (except the cats.)
        
           | advisedwang wrote:
           | It would be more than just halving productivity. Not only do you
           | have to do the work twice, but you add the delay of someone
           | else replicating before something can be published and built
           | upon by others. If you are a developer, imagine how much your
           | productivity would drop going from a 3 minute build to a 1
           | day build.
        
             | orangepurple wrote:
             | Terrible analogy. It might take months to come up with an
             | idea, but someone else should be able to follow your method and
             | implement it much more quickly than it took you to come up
             | with the concept and implement it.
        
               | magimas wrote:
               | Horrible take. Taking the LK99 situation as an example:
               | simply copying and adapting a well-described growth
               | recipe to your own setup and lab conditions may take
               | weeks. And how would you address situations where
               | measurement setups only exist once on the earth? How
               | would you do peer replication of LHC measurements? Wait
               | for 50 years till the next super-collider is built and
               | someone else can finally verify the results? On a smaller
               | scale: If you need measurements at a synchrotron
               | radiation source to replicate a measurement, is someone
               | supposed to give up his precious measurement time to
               | replicate a paper he isn't interested in? And is the
               | original author of a paper that's in the queue for peer
               | replication supposed to wait for a year or two till the
               | reviewer gets a beamtime on an appropriate measurement
               | station? Even smaller: I did my PhD in a lab with a
               | specific setup that only a single other group in the
               | world had an equivalent to. You simply would not be able
               | to replicate these results.
               | 
               | Peer replication is completely unfeasible in experimental
               | fields of science. The current process of peer review is
               | alright, people just need to learn that single papers
               | standing by themselves don't mean too much. The "peer
               | replication" happens over time anyway when others use the
               | same tools, samples, techniques on related problems and
               | find results in agreement with earlier papers.
        
               | evandrofisico wrote:
               | Usually coming up with an idea is the _easy_ part. For
               | example, in my PhD project, I started with an idea from
               | my advisor that he had in the early 2000s.
               | 
               | Implementing the code for the simulation and analysis of
               | the data? four months, at most. Running the simulation?
               | almost three years until I had data with good enough
               | resolution for publishing.
        
               | tnecniv wrote:
               | It's also very easy to come up with bad ideas -- I did
               | plenty of that and I still do, albeit less than I used
               | to. Finding an idea that is novel, interesting, and
               | tractable given your time, skills, resources, and
               | knowledge of the literature is hard, and maybe the most
               | important skill you develop as a researcher.
               | 
               | For a reductive example, the idea to solve P vs NP is a
               | great one, but I'm not going to do that any time soon!
        
               | cycomanic wrote:
               | I think you don't understand how much work is involved in
               | just building the techniques and expertise to pull some
               | experiments off (let's not even talk about the
               | equipment).
               | 
               | Even if someone meticulously documents their process, it
               | could still take months to replicate the results.
               | 
               | I'm familiar with lithography/nanofabrication and I know
               | that it is typically the case that a process developed in
               | one clean room cannot be directly applied to a different
               | clean room; instead one has to develop a new process
               | based on the other lab's results.
               | 
               | Even in the same lab it can often happen that if you come
               | back to a process after a longer time, things don't
               | work out anymore and quite a bit of troubleshooting
               | ensues (maybe a supplier for some chemical changed, and
               | even though it should be the same formula it behaves
               | slightly differently).
        
               | RoyalHenOil wrote:
               | Months. Haha.
               | 
               | I previously worked in agricultural research (in the
               | private sector), and we spent YEARS trying to replicate
               | some published research from overseas. And that was
               | research that had previously been successfully
               | replicated, and we even flew in the original scientists
               | and borrowed a number of their PhD students for several
               | months, year after year, to help us try to make it work.
               | 
               | We never did get it to fully replicate in our country. We
               | ended up having to make some pretty extreme changes to
               | the research to get similar (albeit less reliable)
               | results here.
               | 
               | We never did figure out why it worked in one part of the
               | world but not another, since we controlled for every
               | other factor we could think of (including literally
               | importing the original team's lab supplies at great
               | expense, just in case there was some trace contaminant on
               | locally sourced materials).
        
           | harimau777 wrote:
           | The issue that I see is: even if halving productivity is
           | acceptable to the field as a whole, how do you incentivize a
           | given scientist to put in the effort?
           | 
           | This seems particularly problematic because it is already
           | notoriously hard to get tenure and academia is already
           | notoriously unrewarding to researchers who don't have tenure.
        
           | hoosieree wrote:
           | Half is wildly optimistic.
        
           | ImPostingOnHN wrote:
           | half would only be possible if, for every single paper
           | published by a given team, there exists a second team just as
           | talented as the original team, skilled in that specific
           | package of techniques, just waiting to replicate that paper
        
         | coding123 wrote:
         | Maybe doing an experiment twice, even with a cost that is
         | double, makes more sense so that we don't all throw away our
         | coffee when coffee is bad, or throw away our gluten when gluten
         | is bad, etc. (those are trivial examples). Basically, the cost
         | to perform the science is in many cases minuscule in scale
         | compared to how it could affect society.
        
           | pvaldes wrote:
           | One. Doing experiments is already difficult and painful enough.
           | 
           | Two. This drain of resources can't be done for free. Somebody
           | will need to pay twice for half of the research [1], and
           | faster. Peers will need to be hired and paid, maybe out of the
           | writer's grants. Researchers can't justify giving their own
           | funds to other teams without a profound change in regulation,
           | and even in that case they would be harming their own projects.
           | 
           | [1] as the valuable experts are now stuck validating things
           | instead of doing their own work
           | 
           | It would also open a door for foul play: bogging competitor
           | teams down in molasses by throwing them secondary, silly problems
           | that they know are a dead end, while the other team works
           | on the real deal and takes the advantage to win the patent.
        
         | mattkrause wrote:
         | Longer, even!
         | 
         | Some experiments that study biological development or trained
         | animals can take a year or more of fairly intense effort to
         | _start_ generating data.
        
           | Maxion wrote:
           | A year? Some data sets take decades to build up before
           | significant papers can be published on their data.
           | Replication of the dataset is just not feasible.
           | 
           | This whole thread just shows how little the average HNer
           | knows about the academic sciences.
        
           | tnecniv wrote:
           | I know people that had to take a 6+ month trip to Antarctica
           | for part of their work and others that had to share time on a
           | piece of experimental equipment with a whole department --
           | they got a few weeks per year to run their experiment and had
           | to milk that for all it's worth. Even if they had funding,
           | that machine required large amounts of space and staff to
           | keep it running and they aren't off the shelf products --
           | only a few exist at large research centers.
        
       | seventytwo wrote:
       | There would need to be an incentive structure where the first
       | replications get (nearly) the same credit as the original
       | publisher.
        
       | j45 wrote:
       | Can everything be replicated in every field?
        
         | User23 wrote:
         | That's the defining characteristic of engineering. If you can't
         | reliably replicate everything in an engineering discipline then
         | it's not an engineering discipline.
        
       | Hiromy wrote:
       | Hi, I love you
        
       | jimmar wrote:
       | How do you replicate a literature review? Theoretical physics? A
       | neuro case? Research that relies upon natural experiments? There
       | are many types of research. Not all of them lend themselves to
       | replication, but they can still contribute to our body of
       | knowledge. Peer review is helpful in each of these instances.
       | 
       | Science is a process. Peer review isn't perfect. Replication is
       | important. But it doesn't seem like the author understands what
       | it would take to simply replace peer review with replication.
        
         | janalsncm wrote:
         | I don't think the existence of papers that are difficult to
         | replicate undermines the value of replicating those that are
         | easier.
        
       | freeopinion wrote:
       | My mind automatically swapped out the words "peer" for "code". It
       | took my brain to interesting places. When I came back to the
       | actual topic, I had accidentally built a great way to contrast
       | some of the discussion offered in this thread.
        
         | dongping wrote:
         | In the sense of replicating the results, we do have CI servers
         | and even fuzzers running for our "code replication".
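         | 
         | A minimal sketch of what that automated "replication" can look
         | like, assuming a hypothetical dedupe() function and the
         | hypothesis library for property-based testing as a lightweight
         | stand-in for the fuzzer idea; a CI server would re-run this
         | check on every commit:
         | 
         |     from hypothesis import given, strategies as st
         | 
         |     def dedupe(xs):
         |         # hypothetical function under test: drop duplicates, keep order
         |         seen, out = set(), []
         |         for x in xs:
         |             if x not in seen:
         |                 seen.add(x)
         |                 out.append(x)
         |         return out
         | 
         |     @given(st.lists(st.integers()))
         |     def test_dedupe(xs):
         |         # checked against hundreds of generated inputs per CI run
         |         once = dedupe(xs)
         |         assert dedupe(once) == once
         |         assert set(once) == set(xs)
         | 
         | Review says the code looks right; a run like this says it still
         | behaves the same on someone else's machine.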
        
           | freeopinion wrote:
           | I don't want to derail the science discussion too much, but
           | what if you actually had to reproduce the code by hand? Would
           | that process produce anything of value? Would your habit of
           | writing i+=1 instead of i++ matter? Or iteration instead of
           | recursion?
           | 
           | Would code replication result in fewer use-after-free or off-
           | by-one bugs than code review? Or would it mostly be a waste of
           | resources including time?
        
       | abnry wrote:
       | If scientists are going to complain that it's too hard or too
       | expensive to replicate their studies, then that just shows their
       | work is BS.
        
         | fodkodrasz wrote:
         | I guess if software developers complain that it's too hard
         | or too expensive to thoroughly test their code to ensure
         | exactly zero bugs at release[1], then that just shows their
         | work is BS.
         | 
         | [1]: if you have delivered telco code to Softbank you may have
         | heard this sentence
        
           | abnry wrote:
           | Replication is not the same thing as zero bugs in software.
        
         | alsodumb wrote:
         | Nah, it doesn't. It just shows that it's time consuming and
         | expensive to replicate their studies.
        
           | abnry wrote:
           | If that's the case, then don't claim confidence in the work
           | or make policy decisions based off of it. If there is no
           | epistemological humility, then yes, it is still BS.
        
             | Levitz wrote:
             | If any study costs X, the study and the replication cost
             | somewhere in the ballpark of 2*X. This is not trivial.
        
               | abnry wrote:
               | But this is science we are talking about. A one-off lucky
               | novel result should not be good enough. Why should our
               | standards and our funding be so low?
        
         | Maxion wrote:
         | Something in Switzerland called the Large Hadron Collider comes
         | to mind.
         | 
         | I guess we should not talk about the Higgs before someone else
         | builds a second one and replicates the papers.
        
           | abnry wrote:
           | Physics is generally better since they have good statistical
           | models and can get six sigma (or whatever) results.
           | 
           | And replication can be done by the same party (although an
           | independent party is better), and that may mean many trials.
           | 
           | And do we even set policy based on the existence or non-
           | existence of Higgs bosons?
           | 
           | I am particularly unhappy with soft sciences in terms of
           | replication.
        
         | azan_ wrote:
         | What if it REALLY is too expensive? You do realize that there
         | are studies which literally cost millions of dollars? Getting
         | funding for original studies is hard enough, good luck securing
         | additional funds for replication.
        
         | snitty wrote:
         | >If scientists are going to complain that it's too hard or too
         | expensive to replicate their studies, then that just shows
         | their work is BS.
         | 
         | 1 mg of anti-rabbit antibody (a common thing to use in a lot of
         | biology experiments) is $225 [1]. Outside of things like
         | standard buffers and growth medium for prokaryotes, this is
         | going to be the cheapest thing you use in an experiment.
         | 
         | 1/10th of that amount for anti-flagellin antibody is $372. [2]
         | 
         | A kit to prep a cell for RNA sequencing is $6-10 per use.
         | That's JUST isolation of the RNA. Not including reverse
         | transcribing it to cDNA for sequencing, or the sequencing
         | itself. [3]
         | 
         | Let's not even get into things like materials science where you
         | may be working on an epitaxial growth paper, and there are only
         | a handful of labs where they could even feasibly repeat the
         | experiment.
         | 
         | Or say something with a BSL-3 lab where there are literally
         | only 15 labs in the US that could feasibly do the work,
         | assuming they aren't working on their own stuff. [4]
         | 
         | [1] - https://www.thermofisher.com/antibody/product/Goat-anti-
         | Rabb... [2] https://www.invivogen.com/anti-flagellin [3]
         | https://www.thermofisher.com/order/catalog/product/12183018A
         | [4] https://www.niaid.nih.gov/research/tufts-regional-
         | biocontain...
        
       | NalNezumi wrote:
       | Imo, a more realistic thing to do is "replicability review"
       | and/or a requirement to submit a "methodology map" with each paper.
       | 
       | The former would be a back and forth with a reviewer who
       | inquires and asks questions (based on the paper) with the goal of
       | being able to _reproduce the result_, but without having to
       | actually reproduce it. This is usually good for finding missing
       | details in the paper that the writer just took for granted
       | everyone in the field knows (I've met bio PhDs who have wasted
       | months of their lives tracking down experimental details not
       | mentioned in a paper).
       | 
       | The latter would be the result of the former. Instead of having a
       | pages-long "appendix" section in the main paper, you produce
       | another document with meticulous details of the
       | experiment/methodology, with every stone turned together with a
       | peer reviewer. Stamp it with the peer reviewer's name so they can't
       | get away with a hand-wavy review.
       | 
       | I've read too many papers where important information to
       | reproduce the result is omitted (for ML/RL). If the code is
       | included, I've countless times found implementation details
       | that are not mentioned in the paper. As a matter of fact, there are
       | even results suggesting that those details are the make-or-break
       | of certain algorithms. [1] I've also seen breaking details only
       | mentioned in code comments...
       | 
       | Another atrocious thing I've witnessed is a paper claiming they
       | evaluated their method on a benchmark and if you check the
       | benchmark, the task they evaluated on doesn't exist! They forked
       | the benchmark and made their own task without being clear about
       | it! [2]
       | 
       | Shit like this makes me lose faith in certain science directions.
       | And I've seen a couple of junior researchers give it all up
       | because they concluded it's all just a house of cards.
       | 
       | [1] https://arxiv.org/abs/2005.12729
       | 
       | [2] https://arxiv.org/abs/2202.02465
       | 
       | Edit: also if you think that's too tedious/costly, reminder that
       | publishers rake in record profits so the resources are already
       | there https://youtu.be/ukAkG6c_N4M
        
         | kergonath wrote:
         | > I've met bio PhDs who have wasted months of their lives
         | tracking down experimental details not mentioned in a paper
         | 
         | Same. Now, when I review manuscripts, I pay much more attention
         | to whether there is enough information to replicate the
         | experiment or simulation. We can put out a paper with wrong
         | interpretations and that's fine because other people will
         | realise that when doing their own work. We cannot let papers
         | get published if their results cannot be replicated.
         | 
         | > The latter would be the result of the former. Instead of
         | having a pages-long "appendix" section in the main paper, you
         | produce another document with meticulous details of the
         | experiment/methodology, with every stone turned together with a
         | peer reviewer. Stamp it with the peer reviewer's name so they
         | can't get away with a hand-wavy review
         | 
         | Things that take too much space to go in the experimental
         | section should go to an electronic supplementary information
         | document. But then it would be nice if the ESI were appended to
         | the article when we download a PDF because tracking them is a
         | pain in the backside. Some fields are better than others about
         | this, for example in materials characterisation studies it's
         | very common to have ESI with a whole bunch of data and details.
         | 
         | Large datasets should go to a repository or a dataset journal;
         | that way the method is still peer reviewed and the dataset has
         | a DOI and is much easier to re-use. It's also a nice way of
         | doubling a student's papers count by the end of their PhD.
         | 
         | > Another atrocious thing I've witnessed is a paper claiming
         | they evaluated their method on a benchmark and if you check the
         | benchmark, the task they evaluated on doesn't exit! They forked
         | the benchmark and made their own task without being clear about
         | it! [2]
         | 
         | That's just evil!
        
           | Maxion wrote:
           | > Large datasets should go to a repository or a dataset
           | journal; that way the method is still peer reviewed and the
           | dataset has a DOI and is much easier to re-use.
           | 
           | This may be possible in some sciences, but not in
           | epidemiology or biomed. Often the study is based on tissue
           | samples owned by some entity, with permission granted only to
           | one specific entity.
           | 
           | Datasets in epidemiology are often full of PII, and cannot be
           | shared publicly for many reasons.
        
       | infogulch wrote:
       | I like the idea of splitting "peer review" into two, and then
       | having a citation threshold standard where a field agrees that a
       | paper should be replicated after a certain number of citations.
       | And journals should have a dedicated section for attempted
       | replications.
       | 
       | 1. Rebrand peer review as a "readability review" which is what
       | reviewers tend to focus on today.
       | 
       | 2. A "replicability statement", a separately published document
       | where reviewers push authors to go into detail about the
       | methodology and strategy used to perform the experiments,
       | including specifics that someone outside of their specialty may
       | not know. Credit NalNezumi ITT
        
         | analog31 wrote:
         | Every experimental paper I've ever read has contained an
         | "Experimental" section, where they provide the details on how
         | they did it. Those sections tend to be general enough, albeit
         | concise.
         | 
         | In some fields, aside from specialized knowledge, good
         | experimental work requires what we call "hands." For instance,
         | handling air sensitive compounds, or anything in a condensed or
         | crystalline state. In my thesis experiment, some of the
         | equipment was hand made, by me.
         | 
         | Sometimes specialized facilities are needed. My doctoral thesis
         | project used roughly 1/2 million dollars of gear, and some of
         | the equipment that I used was obsolete and unavailable by the
         | time I finished.
        
           | ahmadmijot wrote:
           | > My doctoral thesis project used roughly 1/2 million dollars
           | of gear,
           | 
           | Wow I envy you. My doctoral thesis project spent like...
           | USD2.5k directly on gear (half of it just to buy Lego
           | bricks to build our own instrument, exactly because we couldn't
           | afford to buy a commercial one lol)
        
             | xioxox wrote:
             | I used a 3 billion dollar space telescope. I don't think
             | NASA are going to launch another to replicate some of my
             | results.
        
           | janalsncm wrote:
           | "Concise" isn't good enough. If other scientists are trying
           | to read through the tea leaves at what you're trying to say
           | you did, that defeats the entire point of a paper. The
           | purpose of science is to create knowledge _that other people
           | can use_ and if people can't replicate your work that's not
           | science.
        
             | analog31 wrote:
             | I think the point is you don't have to give a complete BOM
             | that includes where you got the power cables. Each
             | scientist has to decide what amount of information needs to
             | be conveyed. Of course this can be abused, or done
             | sloppily, like anything else.
             | 
             | A place where you can spread out more is in dissertations.
             | Mine contained an entire chapter on the experiment, another
             | on the analysis, and appendices full of source code,
             | schematics, etc. I happily sent out copies, at my expense.
             | My setup was replicated roughly 3 times.
        
       | User23 wrote:
       | One thing that everyone needs to remember about "peer review" is
       | that it isn't part of the scientific method, but rather that it
       | was imposed on the scientific enterprise by government funding
       | authorities. It's basically JIRA for scientists.
        
       | ahmadmijot wrote:
       | Quite related: nowadays there is this movement within scientific
       | research, i.e. Open Science, where the (raw) data from one's
       | research is open source. Even methods for in-house
       | fabrication and development, together with their source code, are
       | open source (open hardware and open software).
        
       | waynecochran wrote:
       | I spent a lot of my graduate years in CS implementing the details
       | of papers only to learn that, time and time again, the paper
       | failed to mention all the shortcomings and failure cases of the
       | techniques. There are great exceptions to this.
       | 
       | Due to the pressure of "publish or die" there is very little
       | honesty in research. Fortunately there are some who are
       | transparent with their work. But for the most part, science is
       | drowning in a sea of research that lacks transparency and
       | falls short on replication.
        
         | janalsncm wrote:
         | I had a very similar experience in my master's. Really made me
         | think: what exactly are the peers "reviewing" if they don't
         | even know whether the technique works in the first place?
        
           | waynecochran wrote:
           | I have reviewed many papers and there is never the time to
           | recreate the work and test it. That is why I love the Papers
           | with Code site. I think every published CS paper should require a
           | git repo with all their code and experimental data.
        
         | cptskippy wrote:
         | You'll quickly discover when you enter the workforce that the
         | reasons we have CI/CD, Docker, and virtualization are because
         | of a similar problem: the dreaded "it works on my machine"
         | response.
         | 
         | CI/CD forces people to codify exactly how to build and deploy
         | something in order for it to get into a production environment.
         | Docker and VMs are ways around this by giving people a "my
         | machine" that can be copied and shared easily.
        
       | titzer wrote:
       | In the PL field, conferences have started to allow authors to
       | submit packaged artifacts (typically, source code, input data,
       | training data, etc) that are evaluated separately, typically
       | post-review. The artifacts are evaluated by a separate committee,
       | usually graduate students. As usual, everything is volunteer.
       | Even with explicit instructions, it is hard enough to even get
       | the same _code_ to run in a different environment and give the
       | same results. Would  "replication" of a software technique
       | require another team to reimplement something from scratch? That
       | seems unworkable.
       | 
       | I can't even _imagine_ how hard it would be to write instructions
       | for another lab to successfully replicate an experiment at the
       | forefront of physics or chemistry, or biology. Not just the
       | specialized equipment, but we're talking about the frontiers of
       | Science with people doing cutting-edge research.
       | 
       | I get the impression that suggestions like these are written by
       | non-scientists who do not have experience with the peer review
       | process of _any_ discipline. Things just don't work like that.
        
         | Maxion wrote:
         | > I get the impression that suggestions like these are written
         | by non-scientists who do not have experience with the peer
         | review process of any discipline. Things just don't work like
         | that.
         | 
         | Not to mention that the cutting edge in many sciences is
         | perhaps two or three research groups of 5-30 individuals each at
         | various research institutions around the world.
        
         | mike_hearn wrote:
         | Is PL theory actually science? Although we call it computer
         | science, I don't personally think CS is actually a science in
         | the sense of studying nature to understand it. Computers are
         | artificial constructs. CS is a lot closer to engineering than
         | science. Indeed it's kind of nonsensical to talk about
         | replicating an experiment in programming language theory.
         | 
         | For the "hard" sciences, replication often isn't so difficult
         | it seems. LK-99 being an interesting study in this, where
         | people are apparently successfully replicating an experiment
         | described in a rushed paper that is widely agreed to lack
         | sufficient details. It's cutting edge science but replication
         | still isn't a problem. Most science isn't the LHC.
         | 
         | The real problems with replication are found in the softer
         | fields. There it's not just an issue of randomness or
         | difficulty of doing the experiments. If that's all there was to
         | it, no problem. In these fields it's common to find papers or
         | entire fields where none of the work is replicable even in
         | principle. As in, the people doing it don't think other people
         | being able to replicate their work is even important at all,
         | and they may go out of their way to _stop_ people being able to
         | replicate their work (most frequently by gathering data in non-
         | replicable ways and then withholding it deliberately, but
         | sometimes it's just due to the design of the study). The most
         | obvious inference when you see this is that maybe they don't
         | want replication attempts because they know their claims
         | probably aren't true.
         | 
         | So even if peer reviewers or journals were just checking really
         | basic things like, is this claim even replicable in principle,
         | that would be a good start. You would still be left with a lot
         | of papers that replicate fine but their conclusions are still
         | wrong because their methodology is illogical, or papers that
         | replicate because their findings are obvious. But there's so
         | much low hanging fruit.
        
       | staunton wrote:
       | Let's get people to publish their data and code first, shall we?
       | That's sooo much easier than demanding that whole studies be
       | replicated... and people still don't do it!
        
       | ayakang31415 wrote:
       | One of the Nobel prizes in Physics was for the discovery of the
       | Higgs boson at the LHC. It cost billions of dollars just to build
       | the facility, and required hundreds of physicists working on it
       | just to conduct the experiment. You can't replicate this, although
       | I fully agree that replication must come first when it is
       | reasonably doable.
        
       | TrackerFF wrote:
       | Seems to have been hugged to death.
       | 
       | But - a quick counterexample - as far as replication goes: What
       | if the experiments were run on custom made or exceedingly
       | expensive equipment? How are the replicators supposed to access
       | that equipment? Even in fields which are "easy" to replicate -
       | like machine learning - we are seeing barriers of entry due to
       | expensive computing power. Or data collection. Or both.
       | 
       | But then you move over to physics, and suddenly you're also
       | dealing with these one-off custom setups, doing experiments which
       | could be close to impossible to replicate (say you want to
       | conduct experiments on some physical event that only occurs every
       | xxxx years or whatever)
        
       | pajushi wrote:
       | Why shouldn't we hold science more accountable?
       | 
       | "Science needs accounting" is a search I had saved for months
       | which really resonates with the idea of "peer replication."
       | 
       | In accounting, you always have checks and balances; you are never
       | counting money alone. In many cases, accountants duplicate their
       | work to make sure that it is accurate.
       | 
       | Auditors are the counterpart to the peer review process. They're
       | not there to redo your work, but to verify that your methods and
       | processes are sound.
        
       | paulpauper wrote:
       | This would not apply to math or something subjective such as
       | literature. Only experimental results need to be replicated.
        
       | Nevermark wrote:
       | Reproducibility would become a much higher priority if electronic
       | versions of papers are required (by their distributors, archives,
       | institutions, ...) to have reproduction sections, which the
       | authors are encouraged to update over time.
       | 
       | UPDATABLE COVER PAGE:
       | 
       |     Title, Authors
       |     Abstract: Blah, blah, ...
       |     State of reproduction: Not reproduced.
       |     Successful reproductions: ...citations...
       |     Reproduction attempts: ...citations...
       |     Countering reproductions: ...citations...
       | 
       | UPDATABLE REPRODUCTION SECTION ATTACHED AT END:
       | 
       |     Reproduction resources: Data, algorithms, processes,
       |     materials, ...
       |     Reproduction challenges: Cost, time, one-off events, ...
       | 
       | Making this stuff more visible would help reproducers validate
       | the value of reproduction to their home and funding institutions.
       | 
       | Having a standard section for this, with an initial state of "Not
       | reproduced" provides more incentive for original workers to
       | provide better reproduction info.
       | 
       | For algorithm and math work the reproduction could be served best
       | with downloadable executable bundle.
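       | 
       | A rough sketch of what such a machine-readable reproduction
       | record could look like (the field names here are purely
       | hypothetical, not an existing standard):
       | 
       |     from dataclasses import dataclass, field
       |     from typing import List
       | 
       |     @dataclass
       |     class ReproductionRecord:
       |         """Hypothetical per-paper metadata, updated after publication."""
       |         state: str = "not reproduced"  # later: "reproduced", "contested"
       |         successful: List[str] = field(default_factory=list)  # DOIs
       |         attempts: List[str] = field(default_factory=list)    # DOIs
       |         countering: List[str] = field(default_factory=list)  # DOIs
       |         resources: List[str] = field(default_factory=list)   # data, code, ...
       |         challenges: str = ""  # cost, time, one-off events, ...
       | 
       |     record = ReproductionRecord(
       |         attempts=["doi:10.0000/example.attempt"],
       |         challenges="requires beam time on a one-off instrument",
       |     )
       |     print(record.state)  # "not reproduced" until a confirmation lands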
        
       | gordian-not wrote:
       | The incentive should be to clear the way for the tenure track.
       | 
       | Junior faculty would clear out the rotten apples at the top by
       | finding flaws in their research, and in return would win the
       | tenure that was lost.
       | 
       | This will create a nice political atmosphere and improve science.
        
       | user6723 wrote:
       | I remember showing someone raw video of a SAFIRE plasma chamber
       | keeping the ball of plasma lit for several minutes. They said
       | they would need to see a peer-reviewed paper. The presumption
       | brought about by the Enlightenment era that everyone should get a
       | vote was a mistake.
        
       | dongping wrote:
       | https://web.archive.org/web/20230130143126/https://blog.ever...
        
       | moelf wrote:
       | I wish we could replicate the LHC
        
         | Maxion wrote:
         | No talking about the Higgs before that happens, apparently.
        
         | kergonath wrote:
         | We will, don't worry.
        
       | janalsncm wrote:
       | For a while Reddit had the mantra "pics or it didn't happen".
       | 
       | At least in CS/ML there needs to be a "code or it didn't happen".
       | Why? Papers are ambiguous. Even if they have mathematical
       | formulas, not all components are defined.
       | 
       | Peer replication in these fields is easy, low-hanging fruit that
       | could set an example for other fields of science.
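       | 
       | As a toy illustration of that ambiguity (everything below is
       | invented), the single sentence "features were normalized" already
       | admits several incompatible readings that only released code can
       | settle:
       | 
       |     import numpy as np
       | 
       |     x = np.array([1.0, 2.0, 3.0, 4.0])
       | 
       |     # Reading 1: min-max scaling to [0, 1]
       |     minmax = (x - x.min()) / (x.max() - x.min())
       | 
       |     # Reading 2: z-score standardization
       |     zscore = (x - x.mean()) / x.std()
       | 
       |     # Reading 3: scaling to unit L2 norm
       |     unit = x / np.linalg.norm(x)
       | 
       |     print(minmax)  # roughly [0, 0.33, 0.67, 1]
       |     print(zscore)  # roughly [-1.34, -0.45, 0.45, 1.34]
       |     print(unit)    # roughly [0.18, 0.37, 0.55, 0.73]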
        
         | simlan wrote:
         | That is too simplistic. You underestimate the depth of
         | academia. Sure, the latest breakthrough Alzheimer's study or
         | related research would benefit from a replication, which is
         | done out of commercial interest anyway.
         | 
         | But your run-of-the-mill niche topic will not have the dollars
         | behind it to replicate everyone's research. Just because CS/AI
         | research is very convenient to replicate does not mean this can
         | be extended to all research being done.
         | 
         | That is exactly why peer review exists: to weed out implausible
         | and low-effort, low-relevance work. It is not fraud-proof
         | because it was not designed to be.
        
       | hedora wrote:
       | The website dies if I try to figure out who the author ("sam")
       | is, but it sounds like they are used to some awful backwater of
       | academia.
       | 
       | They have this idea that a single editor screens papers to decide
       | if they are uninteresting or fundamentally flawed, then they want
       | a bunch of professors to do grunt work litigating the correctness
       | of the experiments.
       | 
       | In modern (post industrial revolution) branches of science, the
       | work of determining what is worthy of publication is distributed
       | amongst a program committee, which is composed of reviewers. The
       | editor / conference organizers pick the program committee. There
       | are typically dozens of program committee members, and authors
       | and reviewers both disclose conflicts. Also, papers are
       | anonymized, so the people that see the author list are not
       | involved in accept/reject decisions.
       | 
       | This mostly eliminates the problem where work is suppressed for
       | political reasons, etc.
       | 
       | It is increasingly common for paper PDFs to be annotated with
       | badges showing the level of reproducibility of the work, and
       | papers can win awards for being highly reproducible. The people
       | that check reproducibility simply execute directions from a
       | separate reproducibility submission that is produced after the
       | paper is accepted.
       | 
       | I argue the above approach is about 100 years ahead of what the
       | blog post is suggesting.
       | 
       | Ideally, we would tie federal funding to double blind review and
       | venues with program committees, and papers selected by editors
       | would not count toward tenure at universities that receive public
       | funding.
        
         | jltsiren wrote:
         | The computer science practice you describe is the exception,
         | not the norm. It causes a lot of trouble when evaluating the
         | merits of researchers, because most people in academia are
         | not familiar with it. In many places, conference papers don't
         | even count as real publications, putting CS researchers at a
         | disadvantage.
         | 
         | From my point of view, the biggest issue is accepting/rejecting
         | papers based on first impressions. Because there is often only
         | one round of reviews, you can't ask the authors for
         | clarifications, and they can't try to fix the issues you have
         | identified. Conferences tend to follow fashionable topics, and
         | they are often narrower in scope than what they claim to be,
         | because it's easier to evaluate papers on topics the program
         | committee is familiar with.
         | 
         | The work done by the program committee was not even supposed to
         | be proper peer review but only the first filter. Old conference
         | papers often call themselves extended abstracts, and they don't
         | contain all the details you would expect in the full paper. For
         | example, a theoretical paper may omit key proofs. Once the
         | program committee has determined that the results look
         | interesting and plausible and the authors have presented them
         | in a conference, the authors are supposed to write the full
         | paper and submit it to a journal for peer review. Of course,
         | this doesn't always happen, for a number of reasons.
        
       | cycomanic wrote:
       | While I agree with the general sentiment of the paper and
       | creating incentives for more replication is definitely a good
       | idea, I do think the approach is flawed in several ways.
       | 
       | The main point is that the paper seriously underestimates the
       | difficulty and time it takes to replicate experiments in many
       | experimental fields. Who will decide which work needs to be
       | replicated? Should capable labs somehow become bogged down with
       | just doing replication work, even if they don't find the results
       | interesting?
       | 
       | In reality if labs find results interesting enough to replicate
       | they will try to do so. The current LK-99 hurrah is a perfect
       | example of that, but it happens on a much smaller scale all the
       | time. Researchers do replicate and build on other work all the
       | time, they just use that replication to create new results (and
       | acknowledge the previous work) instead of publishing a "we
       | replicated it" paper.
       | 
       | Where things usually fail is in publication of "failed
       | replication" studies, and those are tricky. It is not always
       | clear if the original research was flawed or the people trying to
       | reproduce made an error (again just have a look at what's
       | happening with LK-99 at the moment). Moreover, it can be
       | politically difficult to try to publish a "failed to reproduce"
       | result if you are a small, unknown lab and the original result
       | came from a big, well-known group. Most people will believe that
       | you are the one who made the error (and unfortunately big egos
       | might get in the way, and the small lab will have a hard time).
       | 
       | More generally, in my opinion the lack of replication of results
       | is just one symptom of a bigger problem in science today. We (as
       | in society) have essentially made the scientific environment
       | increasingly competitive, under the guise of "value for taxpayer
       | money". Academic scientists now have to constantly compete for
       | grant funding and publish to keep the funding going. It's incredibly
       | competitive to even get in ... At the same time they are supposed
       | to constantly provide big headlines for university press
       | releases, communicate their results to the general public and
       | investigate (and patent) the potential for commercial
       | exploitation. No wonder we see less cooperation.
        
       | eesmith wrote:
       | > the real test of a paper should be the ability to reproduce its
       | findings in the real world. ...
       | 
       | > What if all the experiments in the paper are too complicated to
       | replicate? Then you can submit to [the Journal of Irreproducible
       | Results].
       | 
       | Observational science is still a branch of science even if it's
       | difficult or impossible to replicate.
       | 
       | Consider the first photographs of a live giant squid in its
       | natural habitat, published in 2005 at
       | https://royalsocietypublishing.org/doi/10.1098/rspb.2005.315... .
       | 
       | Who seriously thinks this shouldn't have been published until
       | someone else had been able to replicate the result?
       | 
       | Who thinks the results of a drug trial can't be published until
       | they are replicated?
       | 
       | How does one replicate "A stellar occultation by (486958) 2014
       | MU69: results from the 2017 July 17 portable telescope campaign"
       | at
       | https://ui.adsabs.harvard.edu/abs/2017DPS....4950403Z/abstra...
       | which required the precise alignment of a star, the trans-
       | Neptunian object 486958 Arrokoth, and a region in Argentina?
       | 
       | Or replicate the results of the flyby of Pluto, or flying a
       | helicopter on Mars?
       | 
       | Here's a paper I learned about from "In The Pipeline"; "Insights
       | from a laboratory fire" at
       | https://www.nature.com/articles/s41557-023-01254-6 .
       | 
       | """Fires are relatively common yet underreported occurrences in
       | chemical laboratories, but their consequences can be devastating.
       | Here we describe our first-hand experience of a savage laboratory
       | fire, highlighting the detrimental effects that it had on the
       | research group and the lessons learned."""
       | 
       | How would peer replication be relevant?
        
         | phpisthebest wrote:
         | I think in some of those cases you have conclusions drawn from
         | raw data that could be replicated or reviewed. For example, many
         | teams use the same raw data from large colliders, or JWST, or
         | other large science projects to reach competing conclusions.
         | 
         | Yes in a perfect world we would also replicate the data
         | collection but we do not live in a perfect world
         | 
         | The same is true for drug trials: there is always a battle over
         | getting the raw data from drug trials, as the companies claim
         | that data is a trade secret, so independent verification of
         | drug trials is very expensive. But if the FDA required not just
         | the release of redacted conclusions and supporting redacted
         | data but 100% of all data gathered, it would be a lot better
         | IMO.
         | 
         | For example, the FDA says it will take decades to release the
         | raw data from the COVID vaccine trials... Why? And that is
         | after being forced to do so via a lawsuit.
        
           | eesmith wrote:
           | > For example, many teams use the same raw data from large
           | colliders, or JWST, or other large science projects to reach
           | competing conclusions.
           | 
           | Yes, but why must the first team wait until the second is
           | finished before publishing?
           | 
           | What if you are the only person in the world with expertise
           | in the fossil record of an obscure branch of snails? You
           | spend 10 years developing a paper knowing that the next
           | person with the right training to replicate the work might
           | not even be born yet.
           | 
           | Other paleontologists might not be able to replicate the
           | work, but they can still tell if it's publishable - that's
           | what they do now, yes?
           | 
           | > but we do not live in a perfect world
           | 
           | Alternatively, we don't live in a perfect world which is why
           | we have the current system instead of requiring replication
           | first.
           | 
           | Since the same logic works for both cases, I don't think it's
           | persuasive logic.
           | 
           | > the FDA says it will take decades
           | 
           | Well, that's a tangent. The FDA is charged with protecting
           | and promoting public health, not improving the state of
           | scholarly literature.
           | 
           | And the FDA is only one of many public health organizations
           | which carried out COVID vaccine trials.
        
         | msla wrote:
         | With some of the things, but admittedly not most of the things
         | you mentioned, there's a dataset (somewhere) and some code run
         | on that dataset (somewhere) and replication would mean someone
         | else being able to run that code on that dataset and get the
         | same results.
         | 
         | Would this require labs to improve their software environments
         | and learn some new tools? Would this require labs to give up
         | whatever used to be secret sauce? That's. The. Point.
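         | 
         | A minimal sketch of what that bar could look like in practice
         | (the file names, hash, and command below are invented
         | placeholders): pin the data by hash, rerun the released
         | analysis, and compare the key numbers against what the paper
         | reports:
         | 
         |     import hashlib, json, subprocess
         | 
         |     def verify_replication(data_path, expected_sha256, rerun_cmd,
         |                            published_path, tol=1e-6):
         |         """Check dataset integrity, rerun the analysis, compare results."""
         |         with open(data_path, "rb") as f:
         |             actual = hashlib.sha256(f.read()).hexdigest()
         |         if actual != expected_sha256:
         |             return "dataset differs from the archived one"
         |         # assumes the released analysis script prints its results as JSON
         |         rerun = json.loads(subprocess.check_output(rerun_cmd))
         |         with open(published_path) as f:
         |             published = json.load(f)
         |         ok = all(abs(rerun[k] - published[k]) <= tol for k in published)
         |         return "results reproduced" if ok else "results differ"
         | 
         |     # Hypothetical usage; every argument here is a placeholder:
         |     # verify_replication("dataset.csv", "9f2c...",
         |     #                    ["python", "analysis.py"], "published.json")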
        
           | counters wrote:
           | In practice this is happening in many disciplines, for most
           | research, on a daily basis. What _isn't_ happening is that
           | the results of these replications are being independently
           | peer reviewed, because that isn't incentivized. However, when
           | replication fails for whatever reason, it usually leads to
           | insights that themselves lead to stronger scientific work and
           | better publications later on.
        
           | eesmith wrote:
           | > someone else being able to run that code on that dataset
           | and get the same results.
           | 
           | I think when people talk about "replicate" they mean
           | something more than that.
           | 
           | The dataset could contain coding errors, and the analysis
           | could contain incorrect formulas and bad modeling.
            | Reproducing a bad analysis, successfully, provides no
            | corrective feedback.
           | 
           | I know for one paper I could replicate the paper's results
           | using the paper's own analysis, but I couldn't replicate the
           | paper's results using my analysis.
           | 
           | > Would this require labs to give up whatever used to be
           | secret sauce? That's. The. Point.
           | 
           | That seems to be a very different Point.
           | 
           | Newton famously published results made from using his secret
           | sauce - calculus - by recasting them using more traditional
           | methods.
           | 
            | In the extreme case, I could publish the factors of RSA-1024
           | without publishing my factorization method. "I prayed to God
           | for the answer and He gave them to me." You can verify that
           | result without the secret sauce.
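            | 
            | A toy version of that asymmetry, with a small made-up
            | semiprime standing in for RSA-1024: checking the claim takes
            | one multiplication, however the factors were found.
            | 
            |     n = 11413        # the "published" modulus
            |     p, q = 101, 113  # factors claimed via the secret sauce (or prayer)
            |     assert p * q == n  # anyone can verify this without the method
            |     # a full check would also confirm p and q are prime, which is
            |     # cheap even at RSA sizes with a probabilistic primality test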
           | 
           | I mean, people use all sorts of methods to predict a protein
           | structure, including manual tweaking guided by intuition and
           | insight gained during a reverie or day-dream (a la Kekule)
           | which is clearly not reproducible. Yet that final model may
           | be publishable, because it may provide new insight and
           | testable predictions.
        
             | msla wrote:
             | My point is that we can, apparently, improve the baseline
             | expectations in the parts of science where this kind of
             | reproducibility is possible. That isn't all science,
             | granted, but it is some science. It isn't a panacea,
             | granted, but it could guard against some forms of
             | misconduct or honest error some of the time. The self-
             | correcting part of science only works when there's
             | something for it to work on, so open data and runnable code
             | ought to improve that self-correction mechanism.
        
               | eesmith wrote:
               | Understood.
               | 
                | But my point is that this linked-to essay appears not
                | only to exclude some areas of good science, but to
                | suggest that any topic which cannot be replicated before
                | publication is only worthy of publication in the Journal
                | of Irreproducible Results.
               | 
               | I gave examples to highlight why I disagree with author's
               | opinion.
               | 
               | Please do not interpret this to mean I do not think
               | improvement is possible.
        
         | kergonath wrote:
         | > Who seriously thinks this shouldn't have been published until
         | someone else had been able to replicate the result?
         | 
         | Nobody, obviously. You cannot reproduce a result that hasn't
         | been published, so no new phenomenon is replicated the moment
         | it is first published. The problem is not the publication of
         | new discoveries, it's the lack of incentives to confirm them
         | once they've been published.
         | 
         | In your example, new observations of giant squids are still
         | massively valuable even if not that novel anymore. So new
         | observations should be encouraged (as I am sure they are).
         | 
         | > Or replicate the results of the flyby of Pluto, or flying a
         | helicopter on Mars?
         | 
         | Well, we should launch another probe anyway. And I am fairly
          | confident we'll have many instances of aircraft in Mars'
          | atmosphere and more data than we'll know what to do with. We
         | can also simulate the hell out of it. We'll point spectrometers
         | and a whole bunch of instruments towards Pluto. These are not
         | really good examples of unreproducible observations.
         | 
         | Besides, in such cases robustness can be improved by different
         | teams performing their own analyses separately, even if the
         | data comes from the same experimental setup. It's not all black
         | or white. Observations are on a spectrum, some of them being
         | much more reliable than others and replication is one aspect of
         | it.
         | 
         | > How would peer replication be relevant?
         | 
         | How would you know which aspects of the observed phenomena come
         | from particularities of this specific lab? You need more than
         | one instance. You need some kind of statistical and factor
         | analyses. Replication in this instance would not mean setting
         | actual labs on fire on purpose.
         | 
         | It's exactly like studying car crashes: nobody is going to kill
         | people on purpose, but it is still important to study them so
         | we regularly have new papers on the subject based on events
         | that happened anyway, each one confirming or disproving
         | previous observations.
        
           | eesmith wrote:
           | > Nobody, obviously. You cannot reproduce a result that
           | hasn't been published, .. The problem is not the publication
           | of new discoveries, it's the lack of incentives to confirm
           | them once they've been published.
           | 
           | Your comment concerns post-publication peer-replication, yes?
           | 
           | If so, it's a different topic. The linked-to essay
           | specifically proposes:
           | 
           | ""Instead of sending out a manuscript to anonymous referees
           | to read and review, preprints should be sent to other labs to
           | actually replicate the findings. Once the key findings are
           | replicated, the manuscript would be accepted and published.""
           | 
           | That's _pre-publication_ peer-replication, and my comment was
           | only meant to be interpreted in that light.
        
             | kergonath wrote:
             | > That's pre-publication peer-replication, and my comment
             | was only meant to be interpreted in that light.
             | 
              | Sorry, I might have gotten mixed up between threads.
             | 
             | Yeah, pre-publication replication is nice (I do it when I
             | can and am suspicious of some simulation results), but is
             | not practical at scale. Besides, the role of peer review is
              | not to ensure results are right; that is just not
              | sustainable for referees.
        
       | hinkley wrote:
       | Is there space in the world for a few publications that only
       | publish replicated work? Seems like that would be a reasonable
       | compromise. Yes you were published, but were you published in
       | Really Real Magazine? Get back to us when you have and we'll
       | discuss.
        
       | hospadar wrote:
       | I assume that the goal here is to reduce the number of not-
       | actually-valid results that get published. Not-actually-valid
       | results happen for lots of reasons (whoops did experiment wrong,
       | mystery impurity, cherry picked data, not enough subjects,
       | straight-up lie, full verification expensive and time consuming
       | but this looks promising) but often there's a common set of
       | incentives: you must publish to get tenure/keep your job, you
       | often need to publish in journals with high impact factor [1].
       | 
       | High impact journals [6] tend to prefer exciting, novel, and
       | positive results (we tried new thing and it worked so well!) vs
       | negative results (we mixed up a bunch of crystals and absolutely
       | none of them are room-temp superconductors! we're sure of it!).
       | 
       | The result is that cherry picking data pays, leaning into
       | confirmation bias pays, publishing replication studies and
       | rigorous but negative results is not a good use of your academic
       | inertia.
       | 
       | I think that creating a new category of rigor (i.e. journals that
       | only publish independently replicated results) is not a bad idea,
       | but: who's gonna pay for that? If the incentive is you get your
       | name on the paper, doesn't that incentivize coming up with a
       | positive result? How do you incentivize negative replications?
       | What if there is only one gigantic machine anywhere that can find
       | those results (LHC, icecube, etc, a very expensive spaceship)?
       | 
       | There might be easier and cheaper pathways to reducing bad papers
       | - incentivizing the publishing of negative results and
       | replication studies separately, paying reviewers for their time,
       | or coming up with new metrics for researchers that prioritize
       | different kinds of activity (currently "how much you're cited"
       | and "number of papers*journal impact" metrics are common; maybe a
       | "how many of your results got replicated" score would be cool to
       | roll into "do you get tenure"? See [3] for more details). PLoS
       | ONE, for example, will publish rigorous work regardless of
       | perceived novelty.
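       | 
       | One toy sketch of what such a metric could look like (the
       | weighting below is made up, just to show it's computable from a
       | citation database plus replication reports):
       | 
       |     def replication_score(papers):
       |         """papers: list of dicts with replication outcome counts."""
       |         score = 0.0
       |         for p in papers:
       |             confirmed = p.get("replications", 0)
       |             failed = p.get("failed_replications", 0)
       |             # reward independent confirmations, penalize published
       |             # failures to replicate (weights are arbitrary here)
       |             score += confirmed - 2 * failed
       |         return score / max(len(papers), 1)
       | 
       |     # hypothetical researcher record:
       |     print(replication_score([
       |         {"replications": 2, "failed_replications": 0},
       |         {"replications": 0, "failed_replications": 1},
       |     ]))  # -> 0.0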
       | 
       | I really like OP's other article about a hypothetical "Journal of
       | One Try" (JOOT) [2] to enable publishing of not-very-rigorous-
       | but-maybe-useful-to-somebody results. If you go back and read OLD
       | OLD editions of Philosophical Transactions (which goes back to
       | the 1600's!! great time, highly recommend [4], in many ways the
       | archetype for all academic journals), there are a ton of wacky
       | submissions that are just little observations, small experiments,
       | and I think something like that (JOOT let's say) tuned up for the
       | modern era would, if nothing else, make science more fun. Here's
       | a great one about reports of "Shining Beef" (literally beef that
       | is glowing I guess?) enjoy [5]
       | 
       | [1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668985/ [2]
       | https://web.archive.org/web/20220924222624/https://blog.ever...
       | [3] https://www.altmetric.com/ [4]
       | https://www.jstor.org/journal/philtran1665167 [5]
       | https://www.jstor.org/stable/101710 [6]
       | https://en.wikipedia.org/wiki/Impact_factor, see also
       | https://clarivate.com/
        
       | throwawaymaths wrote:
       | How about we create a Nobel prize for replication? One impressive
       | replication or refutation from the last decade (that holds up)
       | gets the prize, split up to three ways among the most important
       | authors.
        
       | 37326 wrote:
       | [flagged]
        
       | elashri wrote:
       | Great, but who is going to fund the peer replication? The
       | economics of research today doesn't even provide compensation for
       | time spent on peer review.
        
         | nine_k wrote:
         | Maybe the numerous complaints about the crisis of science are
         | somehow related to the fact that scientific work is severely
         | underpaid.
         | 
         | The pay difference between research and industry in many areas
         | is not even funny.
        
       | matthewdgreen wrote:
       | The purpose of science publications is to share new results with
       | other scientists, so others can build on or verify the
       | correctness of the work. There has always been an element of
       | "receiving credit" to this, but the communication aspect is what
       | actually matters _from the perspective of maximizing scientific
       | progress._
       | 
       | In the distant past, publication was an informal process that
       | mostly involved mailing around letters, or for a major result,
       | self-publishing a book. Eventually publishers began to devise
       | formal journals for this purpose, and some of those journals
       | began to receive more submissions than it was feasible to publish
       | or verify just by reputation. Some of the more popular journals
       | hit upon the idea of applying basic editorial standards to reject
       | badly-written papers and obvious spam. Since the journal editors
       | weren't experts in all fields of science, they asked for
       | volunteers to help with this process. That's what peer review is.
       | 
       | Eventually bureaucrats (inside and largely outside of the
       | scientific community) demanded a technique for measuring the
       | productivity of a scientist, so they could allocate budgets or
       | promotions. They hit on the idea of using publications in a few
       | prestigious journals as a metric, which turned a useful process
       | (sharing results with other scientists) into [from an outsider
       | perspective] a process of receiving "academic points", where the
       | publication of a result appears to be the end-goal and not just
       | an intermediate point in the validation of a result.
       | 
       | Still other outsiders, who misunderstand the entire process, are
       | upset that intermediate results are sometimes incorrect. This
       | confuses them, and they're angry that the process sometimes
       | assigns "points" to people who they perceive as undeserving. So
       | instead of simply accepting that _sharing results widely to
       | maximize the chance of verification_ is the whole point of the
       | publication process, or coming up with a better set of promotion
       | metrics, they want to gum up the essential sharing process to
       | make it much less efficient and reduce the fan-out degree and
       | rate of publication. This whole mess seems like it could be
       | handled a lot more intelligently.
        
         | nine_k wrote:
         | For sharing results widely, there's arxiv. The problem is that
         | the fanout is now overwhelming.
         | 
         | The public perception of a publication in a prestigious journal
          | as the established truth does not help either.
        
           | isaacremuant wrote:
           | > The public perception of a publication in a prestigious
           | journal as the established truth does not help, too.
           | 
           | it's not so much the public perception but what
           | govs/media/tech and other institutions have pushed down so
           | that the public doesn't question whatever resulting policy
           | they're trying to put forth.
           | 
           | "Trust the science" means "Thou shalt not question us, simply
           | obey".
           | 
           | Anyone with eyes who has worked in institutions knows that
            | bureaucracy, careerism, and corruption are intrinsic to them.
        
         | casualscience wrote:
         | Most of this is very legit, but this
         | 
         | > Still other outsiders, who misunderstand the entire process,
         | are upset that intermediate results are sometimes incorrect.
         | This confuses them, and they're angry that the process
         | sometimes assigns "points" to people who they perceive as
         | undeserving. So instead of simply accepting that sharing
         | results widely to maximize the chance of verification is the
         | whole point of the publication process, or coming up with a
         | better set of promotion metrics, they want to gum up the
         | essential sharing process to make it much less efficient and
         | reduce the fan-out degree and rate of publication.
         | 
         | Does not represent my experience in the academy at all. There
         | is a ton of gamesmanship in publishing. That is ultimately the
         | yardstick academics are measured against, whether we like it or
         | not. No one misunderstands that IMO, the issue is that it's a
         | poor incentive. I think creating a new class of publication,
         | one that requires replication, could be workable in some fields
         | (e.g. optics/photonics), but probably is totally impossible in
         | others (e.g. experimental particle physics).
         | 
         | For purely intellectual fields like mathematics, theoretical
         | physics, philosophy, you probably don't need this at all. Then
         | there are 'in the middle fields' like machine learning which in
         | theory would be easy to replicate, but also would be
         | prohibitively expensive for, e.g. baseline training of LLMs.
        
           | Maxion wrote:
           | And on the extreme end you have the multi-decade longitudinal
           | studies in epidemiology / biomedicine that would be more-or-
           | less impossible to replicate.
        
           | [deleted]
        
         | sebastos wrote:
         | Very well put. This is the clearest way of looking at it in my
         | view.
         | 
         | I'll pile on to say that you also have the variable of how the
         | non-scientist public gleans information from the academics.
         | Academia used to be a more insular cadre of people seeking
         | knowledge for its own sake, so this was less relevant. What's
         | new here is that our society has fixated on the idea that
         | matters of state and administration should be significantly
         | guided by the results and opinions of academia. Our enthusiasm
         | for science-guided policy is a triple whammy, because 1.
         | Knowing that the results of your study have the potential to
         | affect policy creates incentives that may change how the
         | underlying science is performed 2. Knowing that results of
         | academia have outside influence may change WHICH science is
         | performed, and draw in less-than-impartial actors to perform it
         | 3. The outsized potential impact invites the uninformed public
         | to peer into the world of academia and draw half-baked
         | conclusions from results that are still preliminary or
         | unreplicated. Relatively narrow or specious studies can gain a
         | lot of undue traction if their conclusions appear, to the
         | untrained eye, to provide a good bat to hit your opponent with.
        
           | Maxion wrote:
           | A significant problem we face today is the way research,
           | especially in academia, gets spotlighted in the media. They
           | often hyper-focus on single studies, which can give a skewed
           | representation of scientific progress.
           | 
           | The reality is that science isn't about isolated findings;
           | it's a cumulative effort. One paper might suggest a
           | conclusion, but it's the collective weight of multiple
           | studies that provides a more rounded understanding. Media's
           | tendency to cherry-pick results often distorts this nuanced
           | process.
           | 
           | It's also worth noting the trend of prioritizing certain
           | studies, like large RCTs or systematic reviews, while
           | overlooking smaller ones, especially pilot studies. Pilot
           | studies are foundational--they often act as the preliminary
           | research needed before larger studies can even be considered
           | or funded. By sidelining or dismissing these smaller,
           | exploratory studies, we risk undermining the very foundation
           | that bigger, more definitive research efforts are built on.
           | If we consistently ignore or undervalue pilot studies, the
           | bigger and often more impactful studies may never even see
           | the light of day.
        
         | dmbche wrote:
         | Your analysis seems to portray all scientists as pure hearted.
         | May I remind you of the latest Stanford scandal where the
         | president of Stanford was found to have manipulated data?
         | 
          | Today, publications do not serve the same purpose as they did
          | before the internet. It is trivial today to write a convincing
          | paper without research and get it published
          | (https://www.theatlantic.com/ideas/archive/2018/10/new-sokal-
          | hoax/572212/).
        
           | matthewdgreen wrote:
           | No subset of humanity is "pure hearted." Fraud and malice
           | will exist in everything people do. Fortunately these
           | fraudulent incidents seem relatively rare, when one compares
           | the number of reported incidents to the number of
           | publications and scientists. But this doesn't change
           | anything. The benefit of scientific publication is _to make
           | it easier to detect and verify incorrect results_ , which is
           | exactly what happened in this case.
           | 
           | I understand that it's frustrating it didn't happen
           | instantly. And I also understand that it's deeply frustrating
           | that some undeserving person accumulated status points with
           | non-scientists based on fraud, and that let them take a high-
           | status position outside of their field. (I think maybe you
           | should assign some blame to the Stanford Trustees for this,
           | but that's up to you.) None of this means we'd be better off
           | making publication more difficult: it means the metrics are
           | bad.
           | 
            | PS: When TFA raises something like "the replication crisis"
           | and then entangles it with accusations of deliberate fraud
           | (high profile but exceedingly rare) it's like trying to have
           | a serious conversation about automobile accidents, but
           | spending half the conversation on a handful of rare incidents
           | of intentional vehicular homicide. You're not going to get
           | useful solutions out of this conversation, because it's
           | (perhaps deliberately) misunderstanding the impact and causes
           | of the problem.
        
             | mike_hearn wrote:
             | Fraud isn't exceedingly rare :( It only seems that way
             | because academia doesn't pay anyone to find it, reacts to
             | volunteer reports by ignoring it, and the media generally
             | isn't interested.
             | 
             | Fraud is so frequent and easy to find that there are
             | volunteers who in their spare time manage to routinely
             | uncover not just individual instances of fraud but entire
             | companies whose sole purpose is to generate and sell fake
             | papers on an industrial scale.
             | 
             | https://www.nature.com/articles/d41586-023-01780-w
             | 
             | Fraud is so easy and common that there are a steady stream
             | of journals which publish entire editions consisting of
             | nothing but AI generated articles!
             | 
             | https://www.nature.com/articles/d41586-021-03035-y
             | 
             | Despite being written as a joke over a decade ago, you can
             | page through an endless stream of papers that were
             | generated by SciGen - a Perl script - and yet they are
             | getting published:
             | 
             | https://pubpeer.com/search?q=scigen
             | 
             | The problem is so prevalent that some people created the
             | Problematic Paper Screener, a tool that automatically
             | locates articles that contain text indicative of auto-
             | generation.
             | 
             | https://dbrech.irit.fr/pls/apex/f?p=9999:1::::::
             | 
             | This is all pre-ChatGPT, and is just the researchers who
             | can't be bothered writing a paper at all. The more serious
              | problem is all the human-written fraudulent papers with bad
              | data and bad methodologies that are never detected, or only
              | detected by randos with blogs or Twitter accounts that you
              | never hear about.
        
               | dmbche wrote:
                | Thank you - just discovered SciGen, these links are
                | incredible
        
             | dmbche wrote:
             | For your analogy on car accidents - a notable difference
             | between both is that in the case of car accidents, we are
             | able to get numbers on when, how and why they happen and
             | then make conclusions from that.
             | 
             | In this case, we are not even aware of most events of
             | fraud/"bad papers"/manipulation - the "crisis" is that we
             | are losing faith in the science we are doing - results that
             | were cornerstones of entire fields are found to be
              | nonreproducible, making all the work built on top of them
              | pointless (psychology, cancer research, economics, etc. -
              | I'm being very broad).
             | 
             | At this point, we don't know how deep the rot goes. We are
             | at the point of recognizing that it's real, and looking for
             | solutions. For car accidents, we're past that - we're just
             | arguing about what are the best solutions. For the
             | replication crisis, we're trying to find a way forward.
             | 
             | Like that scene in The Thing, where they test the blood?
             | We're at the point where we don't know who to trust.
             | 
             | Ps: what's a tfa?
        
               | [deleted]
        
       | 6510 wrote:
       | Seems like a great way for "inferior" journals to gain
       | reputation. Counting citations seems a pretty silly formula/hack.
       | How often you say something doesn't affect how true it is.
        
       | SubiculumCode wrote:
       | Scientist publishes paper based on ABCD data.
       | 
       | Replicator: Do you know how much data I'll need to collect?
       | 11,000 participants followed across multiple timepoints of MRI
       | scanning. Show me the money.
        
         | petesergeant wrote:
         | Definitely something that needs large charitable investment,
         | but charities like that do exist, eg Wellcome Trust
        
           | SubiculumCode wrote:
           | Like 290+ million, just to get started.
        
       | jhart99 wrote:
       | Replication in many fields comes with substantial costs. We are
       | unlikely to see this strategy employed on many/most papers. I
       | agree with other commenters that materials and methodology should
       | be provided in sufficient detail so that others could replicate
       | if desired.
        
       | leedrake5 wrote:
       | Peer Review is the right solution to the wrong problem:
       | https://open.substack.com/pub/experimentalhistory/p/science-...
       | 
       | On replication, it is a worthwhile goal but the career incentives
       | need to be there. I think replicating studies should be a part of
       | the curriculum in most programs - a step toward getting a PhD in
       | lieu of one of the papers.
        
         | vinnyvichy wrote:
         | Fear of the frontier.. that's why instead of people getting
         | excited to look for new rtsp superconductor candidates, we get
         | a lot of talk downplaying the only known one. Strong link vs
         | weak link reminds me of how some cultures frown on stimulants
         | while other cultures frown on relaxants.
        
       | nomilk wrote:
       | https://web.archive.org/web/20230130143126/https://blog.ever...
        
         | the_arun wrote:
         | Thank you. Currently the original article is throttled.
         | 
          | Seems like the article is not about software code.
        
       | fodkodrasz wrote:
       | How would you peer-replicate observation of a rare, or unique
       | event, for example in astronomy?
        
         | lordnacho wrote:
         | Either get your own telescope and gather your own data, or if
         | only one telescope captured a fleeting event, take that data
         | and see if the analysis turns out the same.
        
       | GuB-42 wrote:
       | Peer review is not the end. When replication is particularly
       | complex or expensive, peer review may just be a way to see if the
       | study is worth replicating.
        
       | hgsgm wrote:
       | The problem is equating publication with truth.
       | 
       | Publication is a _starting point_, not a _conclusion_.
       | 
       | Publication is submitting your code. It still needs to be tested,
       | rolled out, evaluated, and time-tested.
        
       | miga wrote:
       | Peer review does not serve to assure replication, but to assure
       | readability and comprehensibility of the paper.
       | 
       | Given that some experiments cost billions to conduct, it is
       | impossible to implement "Peer Replication" for all papers.
       | 
       | What could be done is to add metadata about papers that were
       | replicated.
        
         | kergonath wrote:
         | Barriers to publication should be lower for replication
         | studies, I think that's the main problem.
         | 
         | If someone wants to spend some time replicating something
         | that's only been described in a paper or two, that is valuable
         | work for the community and should be encouraged. If the person
         | is a PhD student using that as an opportunity to hone their
         | skills, it's even better. It's not glamorous, it's not
         | something entirely new, but it is _useful_ and _important_. And
          | this work needs to go to normal journals, otherwise there will
          | just be journals dedicated to replication and their impact
         | factor will be terrible and nobody will care.
        
           | s1artibartfast wrote:
            | There are basically no barriers to publication. There are a
           | number of normal journals that publish everything submitted
           | if it appears to be honest research.
        
             | kergonath wrote:
             | Not nice journals, though. At least not in my experience
             | but that's probably very field-dependent. It's not uncommon
             | to get a summary rejection letter for lack of novelty and
             | that is one aspect they stress when they ask us to review
             | articles.
        
               | s1artibartfast wrote:
               | But novelty IS what makes those journals nice and
               | prestigious in the first place. It is the basis of their
               | reputation.
               | 
                | It's basically a catch-22. We want replication in
                | prestigious journals, but any journal with replications
               | becomes less novel and prestigious.
               | 
               | It all comes down to what people value about journals. If
               | people valued replication more than novelty, replication
               | journals would be the prestigious ones.
               | 
               | It all comes back to the fact that doing novel science is
               | considered more prestigious than replication.
               | Institutions can play all kinds of games to try to make
               | it harder for readers to tell novelty apart from
               | replication, but people will just find new ways to signal
               | and determine the difference.
               | 
                | Let's say we pass a law that prestigious journals must
                | publish 50% replications. The prestige from publishing
                | in that journal will just shift to publishing in that
                | journal with something like "first demonstration" in the
                | title, or publishing in that journal plus having a high
                | citation or impact value.
               | 
                | It is really difficult to come up with a system- or
                | institution-level solution when novelty is still what
                | individuals value.
                | 
                | As long as companies and universities value innovation,
                | they will figure out ways to determine which scientists
                | are innovative, and value them more.
        
         | strangattractor wrote:
         | Maybe add people as special authors/contributors to the
         | original work.
         | 
          | There always seems to be a contingent of people who think that
          | anything less than a 100% solution is inadequate, so nothing is
          | done. Peer review has proven itself inadequate and people hang
          | on to it tooth and nail. Some disciplines should require
          | replication on everything - I won't name Psychology or the
          | Social Sciences in general, but the failure-to-replicate rate
          | for some is unacceptable.
        
         | ebiester wrote:
          | Let's not make perfect the enemy of good. We may never be
          | able to replicate every field, but we could start in many
          | fields today. It means changing our values to make replication
          | a valid path to tenure and promotion, and a required element
          | of Ph.D. studies.
        
         | julienreszka wrote:
         | >Experiments that cost billions to conduct
         | 
         | If you can't replicate them it's like they didn't happen
         | anyways
        
           | thfuran wrote:
           | So no experiments have happened because I don't have a lab,
           | and CERN is just an elaborate ruse?
        
           | kergonath wrote:
           | It's a bit more subtle than that. Not all papers are equal
           | and I'd trust an article from a large team where error and
           | uncertainty analysis has been done properly (think the Higgs
           | boson paper) over a handful of dodgy experiments that are
           | barely documented properly.
           | 
           | But yeah, in the grand scheme of things if it hasn't been
           | replicated, then it hasn't been proven, but some works are
           | credible on their own.
        
           | tnecniv wrote:
           | Ah yes, if I can't run the LHC at home, none of the work
           | there happened
        
         | mathisfun123 wrote:
         | >Peer review does not serve to assure replication, but assure
         | readability and comprehensibility of the paper.
         | 
         | I have had a paper rejected twice in a row over the last year.
         | Both times the comments include something like "paper was very
          | well-written; well-written enough that an undergrad could read
         | it".
         | 
         | Peer review ensures the gates are kept.
        
         | NalNezumi wrote:
          | Isn't readability and comprehensibility the job of the
          | editor/journal to check? (After all, they're actually paid.)
          | Maybe not for conferences, but peer review is more for checking
          | whether the methodology, scope, claims, direction, conclusions,
          | and relevance are sound and trustworthy.
          | 
          | At least that's my understanding.
        
           | hedora wrote:
           | In CS, the editor / journal don't do those things. Instead,
           | the reviewers do. (Sometimes reviewers "shepherd" papers to
           | help fix readability after acceptance).
           | 
           | Also, most work goes to conferences; journals typically
            | publish longer versions of conference papers.
        
           | kergonath wrote:
           | The editor is often not the right person to decide based on
            | technical details. Most often, articles they receive are
           | outside their field of expertise and they don't really have a
           | way of deciding if a section is comprehensible or not. It's
           | very difficult for an outsider to know what bit of jargon is
           | redundant and what bit is actually important to make sense of
           | the results. So this bit of readability check falls to the
           | referees.
           | 
           | In theory editors (or rather copyeditors, the editors
           | themselves have to handle too many papers to do this sort of
           | thing) should help with things like style, grammar, and
           | spelling. In practice, quality varies but it is often subpar.
        
           | kkylin wrote:
            | Highly dependent on journal / field. In mine (mathematics)
            | most associate editors work for free, same as reviewers. The
            | reviewers do all the things you say, and in addition try to
            | ensure readability & novelty. Most journals do have
            | professional copy editing, but that's separate from the
            | content review.
            | 
            | I don't know how refereed conference proceedings work (we
            | don't really use these). The only journals I know of that
            | have professional editors (i.e., editors who are not active
            | researchers themselves) are Nature and affiliated journals,
            | but someone more knowledgeable should correct me here.
        
       ___________________________________________________________________
       (page generated 2023-08-06 23:00 UTC)