[HN Gopher] Why Can't I Reproduce Their Results?
___________________________________________________________________

Why Can't I Reproduce Their Results?

Author : sonabinu
Score  : 40 points
Date   : 2020-07-05 15:00 UTC (1 day ago)

(HTM) web link (theorangeduck.com)
(TXT) w3m dump (theorangeduck.com)

wittyreference wrote:
| There's certainly a kernel of truth to this.
|
| I worked in a cancer lab as an undergrad. A new PI had just come
| over to the school. He had us working on a protocol he'd
| developed in his last lab to culture a specific type of glandular
| tissue in a specific way.
|
| I and two other students spent months trying to recreate his
| results, at his behest. When we couldn't get it to work, he'd
| shrug and say some shit like "keep plugging away at the
| variables," or "you didn't pipette it right." I don't even know
| what the fuck the second one means.
|
| Then an experienced grad student joined the lab. He spent, I
| don't know, a week at it? And was successfully reproducing the
| culture.
|
| I still don't know how he did it. I just know that he wasn't
| carrying a magic wand, and the PI certainly wasn't perpetrating a
| fraud _against himself_. It was just, I guess, experience and
| skill.

shultays wrote:
| Reminds me of this hacker koan:
|
| A novice was trying to fix a broken Lisp machine by turning the
| power off and on.
|
| Knight, seeing what the student was doing, spoke sternly: "You
| cannot fix a machine by just power-cycling it with no
| understanding of what is going wrong."
|
| Knight turned the machine off and on.
|
| The machine worked.

bsder wrote:
| > It was just, I guess, experience and skill.
|
| Lab technique is definitely a thing. Mostly it's about knowing
| what _not_ to pay attention to so that you can give extra
| attention to the things that matter. It's also about recognizing
| the intermediate signals that say "Yep, still good." (i.e., are
| the bubbles the right size? Is the foam the right color?
| Does it smell right/wrong? Did a layer of glop form at the
| correct step?)
|
| However, I _rarely_ see anyone taking good enough notes to
| reproduce an experiment. I had a physics lab taught by a
| professor who would randomly take a student's notebook from an
| assignment and reproduce the experiment using those notes--and
| fail you on that assignment if he couldn't.
|
| Very few students ever passed that. You really have to be
| excessively meticulous. However, when you are debugging a _real_
| experiment without a known correct answer, if you aren't that
| meticulous you will never find your own bugs.

scott_s wrote:
| I'm reminded of baking. The recipes for everything are all
| known, but there's still an enormous amount of skill and tacit
| knowledge required.

asdff wrote:
| Finding an old protocol that has been marked up by a former lab
| member is like finding the Half-Blood Prince's potion book.

ipunchghosts wrote:
| Let me make a statement and let you judge it (it's below, marked
| "STATEMENT").
|
| BACKGROUND: I have been working for 15 years in industry doing
| hardcore ML. I had the drive and background to get my master's
| degree from an R1 school while working full time. No watered-down
| online degree, no certificate. I drove twice a week to class for
| 4 years and did a full thesis, which was published. Since then, I
| have published 6 papers, all peer reviewed. I even did a
| sabbatical with another research lab, to which I was invited.
|
| After 15 years, I decided to go back and get my PhD, all while
| continuing to work full time. My thought was that it would be
| easy to get a PhD with all the technical experience and math
| chops I've developed over the last 15 years. I have essentially
| been doing math 5 days a week for 15 years. Here's what
| happened...
|
| Coursework was a breeze. I barely put any time into it and can
| easily get a B+.
| This is really helpful because I am working 40-50 hours a week
| at my full-time job and managing my family. I passed my candidacy
| exam on the first try with little issue (this is rare for my
| department).
|
| The biggest hangup I have about the PhD process is what my
| advisor wants me to do when writing papers. He is the youngest
| full professor in the department and is from a well-known and
| well-respected graduate research university. But the way he has
| me slant my papers is absurd. Results which I feel are essential
| for readers to assess whether they should use the method, he has
| me remove because the results are "too subtle." He is constantly
| beating on me to think about "the casual reviewer."
|
| Students in the lab produce papers which are very brittle and
| overfit to the test data. His lab uses the same dataset paper
| after paper. My advisor was so proud of a method his top student
| produced that he offered the code for my workplace to use. It
| didn't work as well as a much simpler method we used. Eventually
| we gave the student our data so his method could get the fairest
| possible shake. The student never got the method to work nearly
| as well as in his published paper, despite telling my company and
| me over and over that "it will work". The student is now at
| Amazon Lab126.
|
| STATEMENT: Academia is peer-review driven, but the peers are
| other academics, and so the system of innovation is dead;
| academics have very little understanding of what actually works
| in practice. Great example: it's no surprise that Google has such
| a hard time using ML on MRI datasets. The groups working on this
| are made up of PhDs from my grad lab!
|
| TL;DR - worked for 15 years, went back for a PhD, here's what I
| hear: "think of the casual reviewer", "fiddle with your net so
| that your results are better than X", "you have to sell your
| results so that it's clear your method has merit", "can you get
| me results that are 1% better? use the tricks from blog Y", "as
| long as your results are 1% better, you are fine".

7532yahoogmail wrote:
| Wow. Hell of a good read. And smart points too.

ishcheklein wrote:
| Hey, DVC maintainer here. For those who are interested in this
| topic, I like this (industry-focused) piece on the same problem -
| https://petewarden.com/2018/03/19/the-machine-learning-repro... -
| and an excellent talk from Patrick Ball on how they structure
| data projects - https://www.youtube.com/watch?v=ZSunU9GQdcI&t=1s

Pick-A-Hill2019 wrote:
| I agree and disagree (btw, previously discussed 2 days ago at
| https://news.ycombinator.com/item?id=23732531).
|
| Yes, I agree: nine times out of ten you've made a typo, or
| connected the red wire where the black wire should be.
|
| But I disagree with the overall sentiment of the article, which
| fails to highlight that - when all else has failed and you have
| double-checked things on your end - sometimes there IS a typo, an
| incorrect equation, a simple error IN THE ORIGINAL research.
| Sure, blame yourself as a first instinct (which is a good thing
| to do), BUT there is indeed a replication crisis currently in all
| fields of study and research. Forgive the link to a totally
| non-tech field, but it illustrates my counter-point as well as
| any other.
|
| "The Role of Replication (The Replication Crisis in
| Psychology)"[1]
|
| The openness of psychological research - sharing our methodology
| and data via publication - is a key to the effectiveness of the
| scientific method. This allows other psychologists to know
| exactly how data were gathered, and it allows them to potentially
| use the same methods to test new hypotheses.
|
| In an ideal world, this openness also allows other researchers
| to check whether a study was valid by replication - essentially,
| using the same methods to see if they yield the same results. The
| ability to replicate allows us to hold researchers accountable
| for their work.
|
| [1] https://courses.lumenlearning.com/ivytech-
| psychology1/chapte...

smitty1e wrote:
| > To describe the implementation in a way which is less precise,
| but simpler, shorter, and easier for the reader to understand.
|
| I'm waiting for the textbook that offers formulae with code and
| a bit of regression data.

m0zg wrote:
| To be fair, in my field (deep learning), the papers often do not
| contain enough information to reproduce the results. To take a
| recent example, Google's EfficientDet paper did not contain
| enough detail to implement BiFPN, so nobody could replicate their
| results until the official implementation was released. And even
| then, to the best of my knowledge, nobody has been able to train
| the models to the same accuracy in PyTorch - the results matching
| Google's merely port the TensorFlow weights.
|
| Much of the recent "efficient" DL work is like that. Efficient
| models are notoriously difficult to train, and all manner of
| secret sauce is simply not mentioned; without it you won't get
| the same result. At higher levels of precision, a single
| percentage point of a metric can mean a 10% increase in error
| rate, so this is not negligible.
|
| To the authors' credit though, a lot of this work does get
| released in full source-code form, so even if you can't achieve
| the same result on your own hardware, you can at least test the
| results using the provided weights, and see that they _are_ in
| fact achievable.
___________________________________________________________________
(page generated 2020-07-06 23:00 UTC)
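Editor's note: m0zg's observation that one percentage point of a metric can mean a 10% jump in error rate is easy to verify with a little arithmetic. The sketch below illustrates it; the 90% and 99% accuracy baselines are assumed example values, not figures from the thread.

```python
# Near the top of a benchmark, losing a single percentage point of
# accuracy is a large *relative* increase in error rate, because the
# error rate itself is small.

def relative_error_increase(acc_a: float, acc_b: float) -> float:
    """Relative increase in error rate when accuracy drops from acc_a to acc_b."""
    err_a = 1.0 - acc_a  # error rate at the higher accuracy
    err_b = 1.0 - acc_b  # error rate at the lower accuracy
    return (err_b - err_a) / err_a

# Dropping from 90% to 89% accuracy: error goes from 10% to 11%,
# a 10% relative increase - the ratio m0zg mentions.
print(round(relative_error_increase(0.90, 0.89), 4))  # 0.1

# The effect sharpens at higher accuracy: 99% -> 98% doubles the error.
print(round(relative_error_increase(0.99, 0.98), 4))  # 1.0
```

The same metric gap therefore matters more the closer a model is to the state of the art, which is why "within a point of the paper" is not the same as a successful replication.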