[HN Gopher] Why Can't I Reproduce Their Results?
       ___________________________________________________________________
        
       Why Can't I Reproduce Their Results?
        
       Author : sonabinu
       Score  : 40 points
       Date   : 2020-07-05 15:00 UTC (1 day ago)
        
 (HTM) web link (theorangeduck.com)
 (TXT) w3m dump (theorangeduck.com)
        
       | wittyreference wrote:
       | There's certainly a kernel of truth to this.
       | 
       | I worked in a cancer lab as an undergrad. New PI had just come
       | over to the school. He had us working on a protocol he'd
       | developed in his last lab to culture a specific type of glandular
       | tissue in a specific way.
       | 
       | I and two other students spent months trying to recreate his
       | results, at his behest. When we couldn't get it to work, he'd
       | shrug and say some shit like "keep plugging away at the
       | variables," or "you didn't pipette it right." I don't even know
       | what the fuck the second one means.
       | 
       | Then an experienced grad student joined the lab. He spent, I
       | don't know, a week at it? and was successfully reproducing the
       | culture.
       | 
       | I still don't know how he did it. I just know that he wasn't
       | carrying a magic wand, and the PI certainly wasn't perpetrating a
       | fraud _against himself_. It was just, I guess, experience and
       | skill.
        
         | shultays wrote:
          | Reminds me of this hacker koan:
         | 
         | A novice was trying to fix a broken Lisp machine by turning the
         | power off and on.
         | 
         | Knight, seeing what the student was doing, spoke sternly: "You
         | cannot fix a machine by just power-cycling it with no
         | understanding of what is going wrong."
         | 
         | Knight turned the machine off and on.
         | 
         | The machine worked.
        
         | bsder wrote:
         | > It was just, I guess, experience and skill.
         | 
          | Lab technique is definitely a thing. Mostly it's about knowing
          | what _not_ to pay attention to so that you can give extra
          | attention to the things that matter. It's also about
          | recognizing the intermediate signals that say "Yep, still
          | good." (i.e. Are the bubbles the right size? Is the foam the
          | right color? Does it smell right/wrong? Did a layer of glop
          | form at the correct step?)
         | 
          | However, I _rarely_ see anyone taking good enough notes to
          | reproduce an experiment. I had a Physics Lab taught by a
          | professor who would randomly take someone's notebook on an
          | assignment and reproduce the experiment using your notes--and
          | fail you on that assignment if he couldn't.
          | 
          | Very few students ever passed that. You really have to be
          | excessively meticulous. However, when you are debugging a
          | _real_ experiment without a known correct answer, if you
          | aren't that meticulous you will never find your own bugs.
        
           | scott_s wrote:
           | I'm reminded of baking. The recipes for everything are all
           | known, but there's still an enormous amount of skill and
           | tacit knowledge required.
        
             | asdff wrote:
             | Finding an old protocol that has been marked up by a former
             | lab member is like finding the Half Blood Prince's potion
             | book.
        
       | ipunchghosts wrote:
        | Let me make a statement and let you judge it. (It's below,
        | labeled "STATEMENT".)
       | 
        | BACKGROUND I have been working for 15 years in industry doing
        | hardcore ML. I have the fortunate drive and background that I
        | was able to get my master's degree from an R1 school while
        | working full time. No watered-down online degree, no
        | certificate. I would drive twice a week to class for 4 years
        | and did a full thesis which was published. Since then, I have
        | published 6 papers, all peer reviewed. I even did a sabbatical
        | with another research lab, to which I was invited.
       | 
        | After 15 years, I decided to go back and get my PhD, all while
        | continuing to work full time. My thought was that it would be
        | easy to get a PhD with all the technical experience and math
        | chops I've developed over the last 15 years. I essentially
        | have been doing math 5 days a week for 15 years. Here's what
        | happened...
       | 
        | Coursework was a breeze. I barely put any time into it and can
        | easily get a B+. This is really helpful because I am working
        | 40-50 hours a week at my full time job and managing my family.
        | I passed my candidacy exam on the first try with little issue
        | (this is rare for my department).
       | 
        | The biggest hangup I have about the PhD process is what my
        | advisor wants me to do when writing papers. He is the youngest
        | full professor in the department and is from a well known and
        | well respected graduate research university. But the way he
        | has me slant my papers is absurd. Results which I feel are
        | very important for readers to assess whether they should use
        | the method, he has me remove because they are "too subtle." He
        | is constantly beating on me to think about "the casual
        | reviewer."
       | 
        | Students in the lab produce papers which are very brittle and
        | overfit to the test data. His lab uses the same dataset paper
        | after paper. My advisor was so proud of a method his top
        | student produced that he offered the code for my workplace to
        | use. It didn't work as well as a much simpler method we used.
        | Eventually we gave the student our data so the method could
        | get the fairest possible shake. The student never got it to
        | work nearly as well as in his published paper, despite telling
        | my company and me over and over that "it will work". The
        | student is now at Amazon Lab126.
       | 
        | STATEMENT: Academia is driven by peer review, but the peers
        | are other academics, and so the system of innovation is dead;
        | academics have very little understanding of what actually
        | works in practice. Great example: it's no surprise that Google
        | has such a hard time using ML on MRI datasets. The groups
        | working on this are made up of PhDs from my grad lab!
       | 
        | TL;DR - worked for 15 years, went back for a PhD, here's what
        | I hear: "think of the casual reviewer"; "fiddle with your net
        | so that your results are better than X"; "you have to sell
        | your results so that it's clear your method has merit"; "can
        | you get me results that are 1% better? use the tricks from
        | blog Y"; "as long as your results are 1% better, you are fine"
        
       | 7532yahoogmail wrote:
       | Wow. Hell of a good read. And smart points too.
        
       | ishcheklein wrote:
        | Hey, DVC maintainer here. For those interested in this topic,
        | I like this piece about the same problem -
        | https://petewarden.com/2018/03/19/the-machine-learning-repro...
        | (industry focused) - and an excellent talk from Patrick Ball -
        | https://www.youtube.com/watch?v=ZSunU9GQdcI&t=1s - on how they
        | structure data projects.
        
         | Pick-A-Hill2019 wrote:
          | I agree and disagree (btw, discussed 2 days ago at
          | https://news.ycombinator.com/item?id=23732531).
         | 
          | Yes I agree, nine times out of ten you've made a typo or
          | connected the red wire where the black wire should be.
         | 
          | But I disagree with the overall sentiment of the article,
          | which fails to highlight that - when all else has failed and
          | you have double checked things on your end - sometimes there
          | IS a typo, an incorrect equation, a simple error IN THE
          | ORIGINAL research. Sure, blame yourself as a first instinct
          | (which is a good thing to do), BUT - there is indeed a
          | replication crisis currently in all fields of study and
          | research. Forgive the link to a totally non-tech field, but
          | it illustrates my counter-point as well as any other.
         | 
         | "The Role of Replication (The Replication Crisis in
         | Psychology)"[1]
         | 
         | The openness of psychological research - sharing our
         | methodology and data via publication - is a key to the
         | effectiveness of the scientific method. This allows other
         | psychologists to know exactly how data were gathered, and it
         | allows them to potentially use the same methods to test new
         | hypotheses.
         | 
         | In an ideal world, this openness also allows other researchers
         | to check whether a study was valid by replication -
         | essentially, using the same methods to see if they yield the
         | same results. The ability to replicate allows us to hold
         | researchers accountable for their work.
         | 
         | [1] https://courses.lumenlearning.com/ivytech-
         | psychology1/chapte...
        
       | smitty1e wrote:
       | > To describe the implementation in a way which is less precise,
       | but simpler, shorter, and easier for the reader to understand.
       | 
        | I'm waiting for the textbook that offers formulae with code
        | and a bit of regression data.
        
       | m0zg wrote:
        | To be fair, in my field (deep learning), papers often do not
        | contain enough information to reproduce the results. To take a
        | recent example, Google's EfficientDet paper did not contain
        | enough detail to implement BiFPN, so nobody could replicate
        | their results until the official implementation was released.
        | And even then, to the best of my knowledge, nobody has been
        | able to train the models to the same accuracy in PyTorch - the
        | results matching Google's merely port the TensorFlow weights.
       | 
        | Much of the recent "efficient" DL work is like that. Efficient
        | models are notoriously difficult to train, and all manner of
        | secret sauce is simply not mentioned; without it you won't get
        | the same result. At higher levels of precision, a single
        | percentage point of a metric can mean a 10% increase in error
        | rate (e.g. dropping from 91% to 90% accuracy takes the error
        | rate from 9% to 10%), so this is not negligible.
       | 
       | To the authors' credit though, a lot of this work does get
       | released in full source code form, so even if you can't achieve
       | the same result on your own hardware, you can at least test the
       | results using the provided weights, and see that they _are_ in
       | fact achievable.
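
A quick back-of-the-envelope check of the error-rate arithmetic above. The numbers are hypothetical, not taken from any specific paper; the point is only that a small absolute gap in a metric is a large relative gap in errors:

```python
# Relative increase in error rate caused by a small absolute drop
# in a reported metric. Hypothetical numbers: a paper reports 91%
# accuracy, your reproduction reaches 90%.

def relative_error_increase(reported_acc: float, reproduced_acc: float) -> float:
    """Return the relative increase in error rate as a fraction."""
    reported_err = 1.0 - reported_acc      # paper: 9% errors
    reproduced_err = 1.0 - reproduced_acc  # yours: 10% errors
    return (reproduced_err - reported_err) / reported_err

# A 1-point absolute gap (91% -> 90%) is an ~11% relative jump in errors.
increase = relative_error_increase(0.91, 0.90)
print(f"{increase:.1%}")
```

So a reproduction that comes in "only one point short" is making roughly a tenth more mistakes than the published model, which is exactly why the missing training details matter.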
        
       ___________________________________________________________________
       (page generated 2020-07-06 23:00 UTC)