[HN Gopher] More Fakery
       ___________________________________________________________________
        
       More Fakery
        
       Author : rossdavidh
       Score  : 116 points
       Date   : 2022-04-11 13:07 UTC (9 hours ago)
        
 (HTM) web link (www.science.org)
 (TXT) w3m dump (www.science.org)
        
       | qchris wrote:
       | My favorite article on this topic is "Escaping science's paradox"
        | by Stuart Buck[1]. I'm particularly interested in the idea (at
       | least within the United States) of "red teaming" science. This
       | would involve having an independent agency funding attempts to
       | replicate (and to find/publish flaws in) NSF- or NIH-funded
       | projects, and publishing those. Ideally, the history of
       | replication for authors' papers could then be part of the
       | criteria for receiving funding for more novel research in the
       | future.
       | 
        | Obviously, there are a few fields where this might not work (you
       | can't just create a second Large Hadron Collider for validation),
       | but in areas from sociology to organic chemistry to environmental
       | science, I think there's a lot of promise in that method for
       | helping to re-align incentives around producing solid, replicable
       | research.
       | 
       | [1] https://www.worksinprogress.co/issue/escaping-sciences-
       | parad...
        
         | a-dub wrote:
         | it's the same as everything. there should be more and easier
         | money for the less rewarding task of verification/replication.
         | some people actually enjoy this sort of work just as much as
         | some people enjoy being on the bleeding edge... but there are
         | probably less of them.
         | 
         | where it would get complicated is also the same as everything.
         | when the verification effort neither supports nor refutes the
         | original one. many would argue that it means it wasn't done
         | right, but lots of things aren't done right in life.
         | 
         | then there can be the triple replication revolution! so it
         | goes...
        
           | Beldin wrote:
           | Funding replication is a great idea, but cannot solve this.
            | It would require roughly as much funding as now goes into
            | science merely to replicate the results being produced now.
            | That still leaves a rather hefty backlog. Moreover, the pace
            | at which scientific output doubles is increasing; off the
            | top of my head, the doubling time would be below a decade
            | nowadays. Even if 90% of those publications would not need
            | replication (new algorithms that demonstrably work), merely
            | keeping pace would basically require 1 in 10
            | institutions to devote themselves fully to replication studies.
           | Even then we'd need more capacity to look at previous results
           | - that 10% is fully needed to investigate new results.
           | 
           | Note that this is optimistic: I'd expect the percentage of
           | publications where a reproducibility study makes sense to be
           | above 50%.
        
           | qchris wrote:
           | This isn't intended as snarky, but I don't understand what
           | "it's the same as everything" is supposed to mean. What is
           | "it"? What is "everything"? Why are they the same?
           | 
           | I'd also argue that your reduction of this problem sort of
           | misses the point. One of the big problems with the way that
           | studies are done is not that replication efforts aren't
           | conclusive (it's very difficult to prove something doesn't
           | exist), it's that a) non-replicable studies are generally
           | considered as valuable as replicable ones, and b) as a
           | result, it's extremely difficult to replicate many studies to
           | begin with, because there's no incentive to take the time to
           | make it possible. Even if the end result of a replication
           | paper is "we couldn't produce the same results", the people
           | working on it can say "this author's experiments were
           | exceedingly difficult to even try to reproduce," or
           | conversely "we didn't get the same results, but their data
           | collection methods and analysis code were well-documented and
           | accessible." That has a lot of value!
           | 
           | If you tried doing triple replication for every paper, I
            | agree that might not be the best use of resources. But
           | the current state of affairs is so bad that a well-organized
           | drive to create single-attempt replication on a fraction of
           | publicly-funded projects has the potential to be a
           | significant driver of change.
        
             | a-dub wrote:
             | "the same as everything" is an observation that often times
             | verification/correctness/accuracy efforts are tossed aside
             | in favor of new development and this is a truism across
             | many fields. in science you see this as funding being
             | committed to shiny new nature and science cover stories,
             | with replication being left as an afterthought. in software
             | you see this as heavy commitments to new features that
             | drive revenue, with security/compliance/architecture and qa
             | remaining underfunded and less respected. (until, of
             | course, the problems that result from underfunding them
             | make themselves apparent).
        
         | 0x0203 wrote:
         | One thing I'd like to see is a requirement that for all
         | government funded research, a certain percentage of that
          | funding, say 30%, must go toward replicating other publicly
          | funded research that fewer than two independent, non-
          | affiliated labs have replicated. Any original research couldn't be
         | published until at least two independent and non-affiliated
         | labs replicate based on the submitted paper and report on the
         | results that can then be included with the original research.
         | I'd like to see this across all of academia, but I imagine
         | there are enough challenges with enforcing this in a productive
         | manner already that doing it across all research becomes both
         | impractical and difficult to prevent abuse. But at least with
         | public funds, it would be nice to put in some checks to reduce
          | the amount of fraudulent or sloppy research that taxpayers pay
         | for.
        
           | the_snooze wrote:
           | I should point out that the notion of "replication" can often
           | be way more difficult and nuanced than people expect. For
           | one, what is the scope of the replication? Would it be simply
           | to re-run the analysis on the data and make sure the math
           | checks out? Or would it be to re-collect the data according
           | to the methods described by the original researchers?
           | 
           | The former is pretty easy, but only catches errors in the
           | analysis phase (i.e., the data itself could be flawed). The
           | latter is very comprehensive, but you essentially have to
           | double up the effort on re-doing the study---which may not
           | always be possible if you're studying a moving target (e.g.,
           | how the original SARS-CoV-2 variant spread through the
           | initial set of hosts).
        
             | OrderlyTiamat wrote:
             | > re-run the analysis on the data and make sure the math
             | checks out?
             | 
              | That isn't a replication in any meaningful sense. But a
              | replication can certainly take many forms. An exact
              | replication is one; another is a conceptual replication,
              | studying the same effect with a different design; or you
              | can combine either with a new analysis that pools the data
              | from the original study and the new one, with (possibly)
              | improved statistics, as in the sketch below.
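              | 
              | A minimal sketch of that last option, assuming a simple
              | inverse-variance (fixed-effect) pooling of the two effect
              | estimates; all numbers below are hypothetical:
              | 
              |     # Pool an original study and its replication using
              |     # inverse-variance (fixed-effect) weights.
              |     import math
              | 
              |     def pool_fixed_effect(ests, ses):
              |         weights = [1.0 / se ** 2 for se in ses]
              |         num = sum(w * e for w, e in zip(weights, ests))
              |         pooled = num / sum(weights)
              |         return pooled, math.sqrt(1.0 / sum(weights))
              | 
              |     # Hypothetical effect sizes and standard errors.
              |     est, se = pool_fixed_effect([0.48, 0.21], [0.15, 0.10])
              |     print(f"pooled effect = {est:.3f} +/- {se:.3f}")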
        
             | epgui wrote:
              | Here's an even easier set of requirements to simplify the
             | first case:
             | 
             | - Require all research to publish their source code.
             | 
             | - Require all research to publish their raw data minus
             | "PII".
             | 
             | * Note: I use "PII" here with the intention of it taking
             | the most liberal meaning possible, where privacy trumps
             | transparency absolutely and where de-anonymization is
             | impossible. This would rule out a lot of data, and
             | personally I think we could take a more balanced approach,
             | but even this minimalist approach would be a vast
             | improvement on the current situation.
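              | 
              | As a minimal sketch of the second requirement (the column
              | names are hypothetical, and real de-identification needs
              | far more care than this):
              | 
              |     # Drop direct identifiers before publishing raw data.
              |     import pandas as pd
              | 
              |     PII_COLUMNS = ["name", "email", "date_of_birth"]
              | 
              |     df = pd.read_csv("raw_measurements.csv")
              |     keep = [c for c in df.columns if c not in PII_COLUMNS]
              |     df[keep].to_csv("public_measurements.csv", index=False)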
        
               | bjelkeman-again wrote:
                | When I learned at university that not all published
                | research, especially government-funded research, already
                | does this, I was dumbfounded.
        
               | epgui wrote:
               | "Not all" is a big understatement... I would estimate
               | that less than 0.00001% of published research does this.
               | Every time I talk about this to someone (colleagues in
               | adjacent fields, PIs...), they seem to give zero pucks.
               | It's really mind-boggling.
        
           | mike_hearn wrote:
           | Be aware that despite how much focus replicability gets, it's
           | only one of many things that goes wrong with research papers.
           | Even if you somehow waved a magic wand and fixed
           | replicability perfectly tomorrow, entire academic fields
           | would still be worthless and misleading.
           | 
           | How can replicable research go wrong? Here's just a fraction
           | of the things I've seen reading papers:
           | 
           | 1. Logic errors. So many logic errors. Replicating something
           | that doesn't make sense leaves you with two things that don't
           | make sense: a waste of time and money.
           | 
           | 2. Tiny effect sizes. Often an effect will "replicate" but
           | with a smaller effect than the one claimed; is this a
           | successful replication or not?
           | 
           | 3. Intellectual fraud. Often this works by taking a normal
           | English term and then at the start of your paper giving it an
           | incorrect definition. Again this will replicate just fine but
           | the result is still misinformation.
           | 
           | 4. Incoherent concepts. What _exactly_ does R0 mean in
           | epidemiology and _precisely_ how is it determined? You can
            | replicate the calculations that are used but you won't be
           | calculating what you think you are.
           | 
           | 5. A lot of research isn't experimental, it's purely
           | observational. You can't go back and re-observe the things
           | being studied, only re-analyze the data they originally
           | collected. Does this count?
           | 
           | 6. Incredibly obvious findings. Wealthy parents have more
           | successful children, etc. It'll replicate all right but so
           | what? Why are taxpayers being made to fund this stuff?
           | 
           | 7. Fraudulent practices that are nonetheless normalized
           | within a field. The article complains about scientists
           | Photoshopping western blots (a type of artifact produced in
           | biology experiments). That's because editing your data in
           | ways that make it fit your theory is universally understood
           | to be fraud ... except in climatology, where scientists have
           | developed a habit of constantly rewriting the databases that
           | contain historical temperature records. And by "historical"
           | we mean "last year" here, not 1000 years ago. These edits
           | always make global warming more pronounced, and sometimes
           | actually create warming trends where previously there were
           | none (e.g. [1]). Needless to say climatologists don't
           | consider this fraud. It means if you're trying to replicate a
           | claim from climatology, even an apparently factual claim
           | about a certain fixed year, you may run into the problem that
           | it was "true" at the time it was made and may even have been
           | replicated, but is now "false" because the data has been
           | edited since.
           | 
           | Epidemiology has a somewhat similar problem - they don't
           | consider deterministic models to be important, i.e. it may be
           | impossible to get the same numbers out of a model as a paper
           | presents, even if you give it identical inputs, due to race
           | conditions/memory corruption bugs in the code. They do _not_
            | consider this a problem and will claim it doesn't matter
           | because the model uses a PRNG somewhere, or that they
           | "replicated" the model outputs because they got numbers only
           | 25% different.
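            | 
            | For contrast, a minimal sketch (not any real epidemic
            | model) of a stochastic simulation that is still bit-for-bit
            | reproducible, because its PRNG is explicitly seeded and
            | nothing runs nondeterministically in parallel:
            | 
            |     import random
            | 
            |     def run_sir(seed, n=1000, beta=0.3, gamma=0.1, days=100):
            |         rng = random.Random(seed)  # explicit, seeded PRNG
            |         s, i, r = n - 1, 1, 0
            |         for _ in range(days):
            |             p = beta * i / n
            |             new_i = sum(rng.random() < p for _ in range(s))
            |             new_r = sum(rng.random() < gamma for _ in range(i))
            |             s, i, r = s - new_i, i + new_i - new_r, r + new_r
            |         return s, i, r
            | 
            |     # Identical inputs must give identical outputs.
            |     assert run_sir(42) == run_sir(42)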
           | 
           | What does it even mean to say a claim does or does not
           | replicate, in fields like these?
           | 
           | All this takes place in an environment of near total
           | institutional indifference. Paper replicates? Great. Nobody
           | cares, because they all assumed it would. Paper doesn't
           | replicate, or has methodological errors making replication
           | pointless? Nobody cares about that either.
           | 
           | Your proposal suggests blocking publication until replication
           | is done by independent labs. That won't work, because even if
           | you found some way to actually enforce that (not all grants
           | come from the government!), you'll just end up with lots of
           | papers that can be replicated but are still nonsensical for
           | other reasons.
           | 
           | [1] https://nature.com/articles/nature.2015.17700
        
         | Beldin wrote:
         | One problem is that the amount of scientific output is
         | increasing at an increasing rate.
         | 
          | This means that the vast, vast majority of works will never be
         | considered for replication - even with a dedicated replication
         | institute. So for most applicants, the amount of replicated
         | results will be 0.
        
         | bee_rider wrote:
         | Being on the science red team could also be really cool and
          | fun. Since the goal is to learn which types of errors and lies
          | reliably get through, put new scientists on a team with some
          | old greybeard and pass along that hard-earned "how to screw
          | up cleverly" experience.
        
           | JacobThreeThree wrote:
           | >Being on the science red team could also be really cool and
           | fun.
           | 
           | I think it depends on what you're investigating, and how much
           | is at stake. I doubt it would be much fun to be put on a
           | corporate hit list.
           | 
           | >The court was told that James Fries, professor of medicine
           | at Stanford University, wrote to the then Merck head Ray
           | Gilmartin in October 2000 to complain about the treatment of
           | some of his researchers who had criticised the drug.
           | 
           | >"Even worse were allegations of Merck damage control by
           | intimidation," he wrote, ... "This has happened to at least
           | eight (clinical) investigators ... I suppose I was mildly
           | threatened myself but I never have spoken or written on these
           | issues."
           | 
           | https://www.cbsnews.com/news/merck-created-hit-list-to-
           | destr...
        
           | mike_hearn wrote:
           | Talk to people who have actually done it. Not one will tell
           | you it's cool or fun. Here's how science red teaming actually
           | goes:
           | 
           | 1. You download a paper and read it. It's got major, obvious
           | problems that look suspiciously like they might be
           | deliberate.
           | 
           | 2. You report the problems to the authors. They never reply.
           | 
           | 3. You report the problems to the journals. They never reply.
           | 
           | 4. You report the problems to the university where those
           | people work. They never reply.
           | 
           | 5. Months have passed, you're tired of this and besides by
           | now the same team has published 3 more papers all of which
           | are also flawed. So you start hunting around for people who
           | _will_ reply, and eventually you find some people who run
           | websites where bad science is discussed. They do reply and
           | even publish an article you wrote about what is going wrong
            | in science, but it's the wrong sort of site so nobody who
           | can do anything about the problem is reading it.
           | 
           | 6. In addition if you red-teamed the wrong field, you get
           | booted off Twitter for "spreading misinformation" and the
           | press describe you as a right wing science denier. Nobody has
           | ever asked you what your politics are and you're not denying
           | science, you're denying pseudo-science in an effort to make
           | actual science better, but none of that matters.
           | 
           | 7. You realize that this is a pointless endeavour. The people
           | you hoped would welcome your "red teaming" are actually
           | committed to defending the institutions regardless of merit,
           | and the people who actually do welcome it are all
           | ideologically persona non grata in the academic world - even
           | inviting them to give a talk risks your cancellation. The
           | End.
           | 
           | An essay that explores this problem from the perspective of
           | psychology reform can be found here:
           | 
           | https://www.psychologytoday.com/us/blog/how-do-you-
           | know/2021...
        
         | mherdeg wrote:
         | I took a science journalism class in college where our
         | instructor had us read a paper and then write the news story
         | that explained what was interesting about the result.
         | 
         | "You all got it wrong," he said, "the news is not that Amy
         | Wagers could not make things work with mouse stem cells the way
         | this prior paper said this one time. The news is that Wagers-
         | ize is becoming a verb which means 'to disprove an amazing
          | result after attempting to replicate it'. The lab has Wagers-
         | ed another pluripotent stem cell result. The news is about how
         | often this happens and what it means for this kind of science."
         | 
         | This class was in 2006 and this later profile in 2008 seemed to
         | bear things out the way he said:
         | https://news.harvard.edu/gazette/story/2008/07/amy-wagers-fo...
        
       | j7ake wrote:
        | As long as there are quantitative criteria on which jobs and
        | promotions depend, there will be people gaming the system.
        | 
        | One solution is to couple these quantitative criteria with
        | independent committees that assess people beyond the metrics,
        | but that requires a lot of human effort and is not scalable.
        | 
        | Assessing people in ways that don't scale seems to be the way
        | to avoid this gaming trap in academia.
        
         | cycomanic wrote:
         | I'd argue that it's not just that the metrics don't scale but
         | the problem is that we are trying to find quantitative metrics
         | for something that can't be easily quantified. The worst
         | outcome is not even the forgeries and fakes as in this article,
         | but more that even the vast majority of ethical academics are
         | being pushed into a direction that is detrimental to longterm
         | scientific progress, in particular short term outcomes instead
         | of longterm progress.
        
           | lutorm wrote:
           | _even the vast majority of ethical academics are being pushed
           | into a direction that is detrimental to longterm scientific
           | progress, in particular short term outcomes instead of
           | longterm progress_
           | 
           | I agree. The egregious fraud is just the high-sigma wing.
           | It's a symptom, but the real problem is how it affects the
           | majority.
        
         | rossdavidh wrote:
         | Interesting point; it is much like the problems of trying to
         | assess programmer productivity.
        
           | _tom_ wrote:
           | I was thinking it's much like Google trying to deal with SEO.
           | Most people optimize for high google ranking, not quality
           | content.
           | 
           | Google periodically changes the evaluation, in theory to
           | reward good content and penalize bad, but people still try to
           | game the system, rather than improving content.
        
         | _tom_ wrote:
          | And non-quantitative evaluations are prone to favoritism and
         | prejudice. AKA people gaming the system.
         | 
         | I doubt there's an easy answer.
         | 
         | Trying to better align the short term objectives with longer
         | term ones could help, but that just makes it harder to game,
         | doesn't eliminate it.
        
           | j7ake wrote:
           | It's why one needs both. You need both undeniable
           | productivity by quantitative metrics, as well as glowing
           | reviews from independent panels that are not influenced by
           | favoritism (almost like an audit).
        
       | epolanski wrote:
       | Data fabrication is sadly the norm nowadays.
       | 
       | I was a chemistry researcher working on renewables, and during my
        | master's thesis 9 months were spent trying to validate fake
        | results (from a publication by a scientist who, moreover, had
        | worked in our group).
        
       | some_random wrote:
       | It's crazy to me that academic fraud isn't a more pressing
       | concern to society in general and academia in particular. The
       | scientific process as currently implemented is broken across
        | every single discipline. Even subjects like CS, which should in
        | theory be trivially reproducible, are rarely so. The replication
        | crisis is still going on, but only nerds like us care.
        
         | derbOac wrote:
          | There are many causes of the lack of concern, but I think the
          | heart of the problem, at least in the US, is that science has
          | become politicized such that attempts at reform are
          | mischaracterized for political gain. There's also a bit of
         | ignorance in the general public, but that's only part of it.
         | 
         | For example, if some on the right suggest some difficult but
         | needed reforms, it tends to be spun as an attack on science. Or
         | complaints that trivial projects are being overfunded get
         | misinterpreted by the right and they try to make an example of
         | the wrong studies for the wrong reasons.
         | 
         | The pandemic was a good example of this in my mind, in that I
         | think there were some serious systematic problems in academics
         | and healthcare that were laid brutally bare, and many people
         | suffered or died as a result. But then the whole thing got
         | misidentified and sucked into the political vortex and all you
         | end up with are hearings about how to rehabilitate the CDC, as
         | if that is the problem and not a symptom of even bigger
         | problems.
         | 
         | I still think there are ways for things to change, but the most
         | likely of them involve unnecessary suffering and chaos.
        
           | N1H1L wrote:
           | I can give a different perspective. It is not because of
           | politicization IMO - at least not in the hard sciences. The
           | problem comes from way up, from Congress because the
            | immediate impact of science is not obvious. Especially for
            | basic sciences, the impact takes decades to be really felt.
           | 
           | But then how do you do promotions? How do you judge output?
            | Worse still, how does the US Congress justify spending
            | taxpayer dollars? Rather than acknowledging that any
            | short-term measurement of the quality of science is a
            | fool's errand, we
           | have doubled down on meaningless metrics like impact factors
           | and h-indices. And this is what we have as a result.
        
         | ArnoVW wrote:
         | Aside from reproducibility issues in ML, what sort of issue did
         | you have in mind in CS?
         | 
          | Most CS work is 90% maths, so I don't see how you can have
          | reproducibility issues.
        
           | the_snooze wrote:
           | Take, for example, network measurement research:
           | https://conferences.sigcomm.org/imc/2021/accepted/
           | 
           | One of those papers is about counting the scale and scope of
           | online political advertising during the 2020 election. How
           | does one reproduce that study? The 2020 election is long
           | past, and that data isn't archived anywhere other than what
           | the researchers have already collected. This is a pretty
            | simple empirical data collection task, but you can't just
           | re-measure that today because that study is about a moving
           | target.
        
           | tlb wrote:
           | I did my dissertation on this problem 25 years ago. It hasn't
           | gone away.
           | 
           | In general, performance comparisons are hard to reproduce.
           | For instance, when benchmarking network protocols, often a
            | tiny change in configuration makes a big change in the results.
           | You might change the size of a buffer from 150 packets to 151
           | packets and see performance double (or halve.)
           | 
           | Instead of making measurements with some arbitrary choices
           | for parameters, you can take lots of measurements with
           | parameters randomly varied to show a distribution of
           | measurements. It's hard work to track down all the possible
           | parameters and decide on a reasonable range for them, so it's
           | rarely done. I found many 10x variances in network protocol
            | performance (like how fairly competing TCP streams can
            | share bandwidth).
           | 
           | The big idea was to show that by randomizing some decisions
           | in the protocol (like discarding packets with some
           | probability as the buffer gets full) you can make the
            | performance less sensitive to small changes, i.e., more
            | reproducible. Less sensitivity is especially good when you
           | care about the worst-case performance rather than average. It
           | can also make tuning a protocol much easier, since you aren't
           | constantly being fooled by unstable performance.
           | 
           | Performance sensitivity analysis is hard work, so most papers
           | are just like "we ran our new thing 3 times and got similar
           | numbers so there you go."
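            | 
            | A minimal sketch of the parameter-randomization idea, with
            | a stand-in measurement function and made-up parameter
            | ranges; the point is to report a distribution rather than
            | one hand-picked configuration:
            | 
            |     import random, statistics
            | 
            |     def run_benchmark(buffer_pkts, rtt_ms):
            |         # Stand-in for a real throughput measurement.
            |         base = 100.0 * buffer_pkts / (buffer_pkts + rtt_ms)
            |         return base + random.gauss(0, 2)
            | 
            |     random.seed(0)
            |     results = [run_benchmark(random.randint(50, 300),
            |                              random.uniform(10, 200))
            |                for _ in range(200)]
            | 
            |     q = statistics.quantiles(results, n=20)
            |     print(f"median={statistics.median(results):.1f} "
            |           f"p5={q[0]:.1f} p95={q[-1]:.1f}")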
        
         | thechao wrote:
         | If you're any good at your chosen specialty you get a "feel"
         | for the bullshit. I know this doesn't help the public. My
         | experience is in medical research, crystallography, and
         | computer science. Here's an example for detecting "bullshit" in
         | cardiology: call up the MD PI from the published paper and ask
         | to review anonymized charts from patients targeted with the
         | procedure. Are there any? Then, the research is probably good;
         | are there none? It's probably because it'd kill the patient.
         | Similarly, in Programming Language Theory: we'd just ask which
         | popular compilers added the pass. Is it on in -O3 in LLVM?
         | Serious fucking result; is it in some dodgy branch in GHC? Not
         | useful.
        
         | rossdavidh wrote:
          | I think there are two problems impeding our ability to focus
         | better on this:
         | 
         | 1) for many people, the idea that science has widespread fraud
         | is just hard to accept; in this respect it is similar to the
         | difficulties that many religious communities have in accepting
         | that their clergy could have a corruption problem
         | 
         | 2) the solutions require thinking about problems like
         | p-hacking, incentives, selection effects, and other non-trivial
         | concepts that are tough for the average person to wrap their
         | heads around.
        
           | derbOac wrote:
           | I've often thought religious corruption is a good analogy, in
           | that many of the societal dynamics are very similar. As I'm
           | writing this the parallels are interesting to think about
           | relative to US politics.
        
             | throwawayboise wrote:
              | It is a good analogy. For most lay people, science is a
             | religion. They lack the expertise to understand the theory,
             | but they unquestioningly accept the explanations and
             | interpretations of the so-called experts.
             | 
             | Most people don't understand astronomy and physics well
             | enough to prove to themselves that the earth orbits the sun
             | and not vice-versa. Yet they believe it does, with
             | certainty, because they have been taught that it is true.
        
           | SubiculumCode wrote:
            | Also: I have not seen evidence of widespread fraud. Evidence
            | of fraud, yes. Evidence of widespread fraud, no.
        
             | rossdavidh wrote:
             | Agreed it's an important point that fraud is only a
             | fraction of the problem.
        
             | derbOac wrote:
             | That's a fair point, although fraud per se is only a small
             | part of all the problems. There's other forms of corruption
             | than fraud, and a lot of it falls into this zone of
             | plausible deniability rather than outright fraud. Also, I
             | think the problems tend to find most weight with higher
             | concentration of power, so what matters isn't as much "how
             | widespread is corruption?" but rather "how is corruption
             | distributed among power structures in academics and what is
             | rewarded?"
        
             | bhk wrote:
             | I have. According to [1], "1 in 4 cancer research papers
             | contains faked data". As the article argues, the standards
             | are perhaps unreasonably strict, but even by more favorable
             | criteria, 1/8 of the papers contained faked data.
             | Interestingly, [2] using the same approach, found fraud in
             | 12.4% of papers in the International Journal of Oncology.
             | More broadly, [2] found fraud in about 4% of the papers
             | studied (782 of 20,621). I'd say that's pretty widespread,
             | but you further have to consider that these papers focused
             | narrowly on a very specific type of fraud that is easy to
             | detect (image duplication), so we would expect the true
             | number of fraudulent papers to be much higher.
             | 
             | [1] https://arstechnica.com/science/2015/06/study-
             | claims-1-in-4-...
             | 
             | [2] https://www.biorxiv.org/content/biorxiv/early/2016/04/2
             | 0/049...
        
             | mistermann wrote:
              | Don't forget though: events precede evidence, and evidence
             | doesn't always follow events.
             | 
             | Also: perception is ~effectively reality.
        
       | javajosh wrote:
       | Could it be there's just too much science being done for much of
       | it to be any use? And that this oversupply causes these schemes,
       | as a side-effect? If so, selling authorship is merely a symptom
       | of the worthlessness of most modern science.
       | 
       | For much of human history, science was something you did in your
       | spare time - or, if you were exceptional, you might have a
       | patron. Then nation states discovered the value of technology and
       | science, and wanted more, and so have created science factories.
       | But, perhaps unsurprisingly, the rate of science production
       | cannot really be improved in this way, and yet the economics of
        | science demand that it does. This disconnect between reality and
       | expectation is the root of this problem, and many others.
        
         | rossdavidh wrote:
         | Oof. Good point. I feel like there's a similar pattern to
         | having too much VC money chasing too few actually good ideas to
         | invest in.
        
           | pphysch wrote:
           | Or a government printing money to hire private contractors,
            | completely disregarding its ability to do anything on its own.
           | 
           | To some extent, this is the curse of being the creator of the
           | global reserve currency. The US government can, in theory,
           | print as much money as it wants and pay off whoever it wants
           | to do whatever it wants. This also extends to the academic
           | and financial (VC) sectors, because a lot of that liquidity
           | comes directly from the Government/Fed.
           | 
           | Unfortunately this leads to a culture of corruption (who gets
           | the grants/contracts/funding?) and widespread fraud. This
            | causes the ROI of money printing to go down, so the money
            | printer accelerates and we get inflation too.
        
         | SubiculumCode wrote:
         | This is in fact incredibly wrong. At least in my field, there
         | is so much more data than there are qualified experts to
          | analyze it. For one, academia pays so much less than the
          | private sector that postdocs are leaving.
        
         | seiferteric wrote:
          | Something I was wondering: if faking results is so common,
          | then surely the things being researched must never be
          | used in any application, right? If they were, it would quickly
          | be found that they do not actually work...
        
           | HarryHirsch wrote:
           | This is exactly how it works in practice. Anyone who works at
           | the bench learns quickly to spot the frauds and fakes and
           | avoids them. That's the "replication" everyone talks about,
           | no special agency to waste funds on boring stuff needed.
        
           | gwd wrote:
           | > If they were, it would quickly be found that it does not
           | actually work...
           | 
           | Unfortunately some of the effect sizes are so small that it's
           | hard to tell what's working or not. The results of papers on
           | body building, for instance, are definitely put into practice
           | by some people. If the claim of the paper is that eating
           | pumpkin [EDIT] decreases muscle recovery time by 5%, how is
           | an individual who starts eating pumpkin supposed to notice
           | that he's not getting any particular benefit from following
           | its advice? Particularly if he's also following random bits
           | of advice from a dozen other papers, half of which are valid
           | and half of which are not?
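            | 
            | A minimal sketch of why (all numbers made up): with
            | ordinary day-to-day variation, a 5% effect on recovery time
            | is roughly the same size as the noise, so one person's
            | n-of-1 experiment tells them almost nothing:
            | 
            |     import random, statistics
            | 
            |     random.seed(1)
            |     baseline, effect, noise_sd = 48.0, 0.95, 8.0  # hours
            | 
            |     before = [random.gauss(baseline, noise_sd)
            |               for _ in range(20)]
            |     after = [random.gauss(baseline * effect, noise_sd)
            |              for _ in range(20)]
            | 
            |     # True difference is 2.4 h; day-to-day sd is 8 h.
            |     print(statistics.mean(before) - statistics.mean(after))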
        
           | btrettel wrote:
           | One problem I've observed is that people applying things
           | often cargo-cult "proven" things from the scientific
           | literature that aren't actually proven. It's easier to say
           | that you're following "best practices" than it is to check
           | that what you're doing works, unfortunately.
        
         | twofornone wrote:
         | Maybe it's a deeper problem related to western liberal notions
         | that anyone can do anything if they just "set their mind to
         | it". We have a glut of "professionals" across industries and
         | institutions who don't really have any business being there,
         | but the machine requires that they appear to be useful, and so
         | mechanisms emerge to satisfy this constraint. A consequence in
          | science is a long list of poor-quality junk publications, and
          | few people are willing to acknowledge the nakedness of the
          | emperor, not only for fear of losing their positions but
          | because doing so may betray their own redundancy.
        
       | photochemsyn wrote:
       | My own rather short academic career involved doing lab work with
       | three different PI-led groups. One PI was actually excellent, and
       | I really had no idea how good I had it. I caught the other two
       | engaging in deliberately fraudulent practices. For example, data
       | they'd collect from experiments would be thrown out selectively
       | so that they could publish better curve-fits. Another trick was
       | fabricating data with highly obscure methods that other groups
       | would be unlikely to replicate. They'd also apply pressure to
       | graduate students to falsify data in order to get results that
       | agreed with their previously published work.
       | 
       | The main difference between the excellent PI and the two
       | fraudsters was that the former insisted on everyone in her lab
       | keeping highly accurate and detailed daily lab notebooks, while
       | the other two had incredibly poor lab notebook discipline (and
       | often didn't even keep records!). She actually caught one of her
       | grad students fudging data via this method, before it went to
       | publication. Another requirement was that samples had to be
       | blindly randomized before we analyzed them, so that nobody could
       | manipulate the analytical process to get their desired result.
       | 
       | If you're thinking about going into academia, that's the kind of
       | thing to look out for when visiting prospective PIs. Shoddy
       | record keeping is a huge red flag. Inability to replicate
       | results, and in particular no desire to replicate results, is
       | another warning sign. And yes, a fair number of PIs have made
       | careers out of publishing fraudulent results and never get
       | caught, and they infest the academic system.
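        | 
        | That blinded-randomization step is simple enough to script; a
        | minimal sketch with hypothetical sample IDs, where the analyst
        | only ever sees the coded labels and someone else holds the key
        | until the analysis is done:
        | 
        |     import csv, random
        | 
        |     samples = [f"S{n:03d}" for n in range(1, 25)]
        |     random.shuffle(samples)
        |     key = {f"BLIND-{i:03d}": s
        |            for i, s in enumerate(samples, start=1)}
        | 
        |     with open("blinding_key.csv", "w", newline="") as f:
        |         w = csv.writer(f)
        |         w.writerow(["blind_label", "real_sample"])
        |         w.writerows(key.items())
        | 
        |     print(sorted(key))  # only these labels go to the analyst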
        
         | georgecmu wrote:
         | I would say that this applies even more so outside of academia.
         | At early stages of development, a research group's or company's
         | product is by necessity a report or a presentation rather than
         | a physical plant's or process's real, quantifiable performance.
         | No malicious intent is required; it's just all too easy to fool
         | yourself or cherry-pick data to support desired conclusions
         | when the recordkeeping is poor.
         | 
         | In my hard-tech experiment-heavy start-up there's no way we
         | could have made any actual technical progress without setting
         | up a solid data preservation and analysis framework first. For
         | every experimental run, all the original sensor data are
         | collected and immediately uploaded along with any photos,
         | videos, and operator comments to a uniquely-tagged confluence
         | page. Results and data from any further data or product
         | analysis are linked to this original page.
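          | 
          | A minimal sketch of this kind of per-run capture (generic,
          | not the Confluence setup described above): tag every run
          | with a unique ID, checksum the raw files, and write a
          | manifest next to them so later analysis can always be traced
          | back:
          | 
          |     import hashlib, json, pathlib, uuid
          |     from datetime import datetime, timezone
          | 
          |     def record_run(raw_dir, notes=""):
          |         raw_dir = pathlib.Path(raw_dir)
          |         files = {}
          |         for p in raw_dir.iterdir():
          |             if p.is_file() and p.name != "manifest.json":
          |                 digest = hashlib.sha256(p.read_bytes())
          |                 files[p.name] = digest.hexdigest()
          |         manifest = {
          |             "run_id": uuid.uuid4().hex[:12],
          |             "timestamp": datetime.now(timezone.utc).isoformat(),
          |             "operator_notes": notes,
          |             "files": files,
          |         }
          |         (raw_dir / "manifest.json").write_text(
          |             json.dumps(manifest, indent=2))
          |         return manifest["run_id"]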
         | 
         | As an anecdotal example, we recently caught swapped dataset
         | labels in results from analysis performed on our physical
         | samples by a third-party lab. We were able to do this easily
         | just because we could refer back to every other piece of
         | information regarding these samples, including the conditions
         | in which they were generated months prior to this analysis. As
          | soon as all the data were on display at once, the discrepancies
         | were obvious.
        
         | ketanmaheshwari wrote:
         | [PLUG] Some of what you mention are "negative results" that are
          | quite prevalent and a necessary part of any research. However,
         | the expected mold at publishing venues is such that they are
         | not considered worthwhile.
         | 
         | My colleagues and I are trying to address this by creating a
         | platform to discuss and publish such "bad" or "negative"
         | results. More info here:
         | 
         | https://error-workshop.org/
        
       | EamonnMR wrote:
       | Does the new In The Pipeline blog have an RSS feed? I haven't
       | been able to find it.
        
       | Enginerrrd wrote:
       | I always thought it would be a good idea to start a journal that
       | has a lab submit their methods and intent of study for peer
       | review and approval / denial PRIOR to performing the work. Then,
       | if approved, and as long as they adhere to the approved methods,
       | they get published regardless of outcome. That would really
       | encourage the publishing of negative results and eliminate a lot
       | of the incentive to fudge the numbers on the data. It would
       | probably overly reward pre-existing clout, but frankly that's a
       | problem ANYWAY.
        
         | Guybrush_T wrote:
         | This is done with clinical trials (or at least it's
         | recommended). Many researchers register their study at
         | https://clinicaltrials.gov/ before data collection starts. I'm
         | not sure if something similar exists for lab based research.
        
       | francislavoie wrote:
       | Reminds me of Bobby Broccoli's video series on Jan Hendrik Schon
       | who almost got the Nobel Prize in Physics fraudulently. Extremely
       | good watch:
       | 
       | https://www.youtube.com/playlist?list=PLAB-wWbHL7Vsfl4PoQpNs...
        
       | slowhand09 wrote:
       | Worked on a NASA program once, about measuring Earth Science
       | data. We built a database application to gather suggested
       | requirements from members of the earth science community. One
       | such member from our own team helped develop specs for our
       | system. After we built it, she wanted to measure its utility and
       | usability. She watched as users navigated and entered data into
       | the system. She also asked myself and members of my team who
       | developed the developed the software to use it and be measured. I
       | and one other developer (2 of 3 members) explained why we
       | implemented each feature as we were utilizing the system. The
       | "scientist" measuring us all promptly published as a conclusion
       | in her paper "The usability of the system was better for
       | inexperienced users than it was for experienced users. The
       | experienced users took nearly 50% longer to navigate and enter
       | similar requirements". She basically made up an "interesting"
       | conclusion by omitting characterization of our testing session,
       | where we explained how we implemented her requirements.
        
       ___________________________________________________________________
       (page generated 2022-04-11 23:00 UTC)