In recent years a lot of attention has been drawn to the prevalence of irreproducible results in science. That published research findings tend to weaken or vanish over time is a pressing problem, in particular in some areas of the life sciences, psychology and neuroscience. On the face of it, the issue is that scientists work with samples that are too small and frequently cherry-pick their data. Besides unintentionally poor statistics, the blame has primarily been put on the publish-or-perish culture of modern academia. While I blame that culture for many ills, I think here the finger is pointed at the wrong target.
Scientists aren’t interested in publishing findings that they suspect to be spurious. That they do it anyway is because a) funding agencies don’t hand out sufficient money for decent studies with large samples, b) funding agencies don’t like reproduction studies because, eh, it’s been done before, and c) journals don’t like to publish negative findings. The latter in particular leads scientists to actively search for effects, which creates a clear bias. It also skews meta-studies against null results.
That’s bad, of course.
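To put a rough number on that skew, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the true effect is exactly zero, each study takes 20 measurements with unit variance, and only studies with a "significant" positive result get published. The function names and parameters are made up for this example.

```python
import random
import statistics

def run_study(rng, n=20, true_effect=0.0):
    """Return the sample mean of n unit-variance measurements (hypothetical study)."""
    return statistics.fmean([rng.gauss(true_effect, 1.0) for _ in range(n)])

def published_vs_all(num_studies=5000, n=20, seed=2):
    """Compare the average effect over all studies with the average over 'published' ones."""
    rng = random.Random(seed)
    # Sample mean needed for a two-sided p < 0.05, assuming known unit variance.
    threshold = 1.96 / n ** 0.5
    all_means = [run_study(rng, n) for _ in range(num_studies)]
    # Assumed publication filter: only significant positive results make it into print.
    published = [m for m in all_means if m > threshold]
    return statistics.fmean(all_means), statistics.fmean(published), len(published)

mean_all, mean_published, num_published = published_vs_all()
print("average effect, all studies:      ", round(mean_all, 3))
print("average effect, published studies:", round(mean_published, 3))
print("studies published:", num_published, "of 5000")
```

Averaged over all studies the effect comes out close to zero, as it should. Averaged over the published ones only, it sits well above the significance threshold, although nothing is there. A meta-study that only sees the published subset inherits that bias.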
I will not pretend that physics is immune to this problem, though in physics the issue is, forgive my language, significantly less severe.
A case in point though is the application of many different analysis methods to the same data set. Collaborations have their procedures sorted out to avoid this pitfall, but once the data is public it can be analyzed by everybody with their own methods, and sooner or later somebody will find something just by chance. That’s why, every once in a while, we hear of a supposedly interesting peculiarity in the cosmic microwave background, you know, evidence for a bubble collision, parallel universes, a cyclic universe, a lopsided universe, an alien message, and so on. One cannot even blame them for not accounting for other researchers who are trying creative analysis methods on the same data, because those are unknown unknowns. And theoretical papers can be irreproducible in the sense of just being wrong, but the vast majority of these just get ignored (and if not, the error is often of interest in itself).
So even while the fish at my doorstep isn’t the most rotten one, I think irreproducible results are highly problematic, and I welcome measures that have been taken, e.g. by Nature magazine, to improve the situation.
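How soon "sooner or later" arrives when many analyses are run on the same data can be estimated with a small sketch. It assumes, for simplicity, that the analyses are statistically independent and that each one uses a 5% significance threshold. Real analyses of the same data set are correlated, so the numbers are only indicative. The function names are invented for this example.

```python
import random

def chance_of_false_alarm(num_analyses, alpha=0.05):
    """Probability that at least one of num_analyses independent tests
    gives p < alpha even though there is no real effect."""
    return 1.0 - (1.0 - alpha) ** num_analyses

def simulate(num_analyses, alpha=0.05, trials=20000, seed=1):
    """Monte Carlo check of the formula above."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Under the null hypothesis a p-value is uniformly distributed on [0, 1].
        if any(rng.random() < alpha for _ in range(num_analyses)):
            hits += 1
    return hits / trials

for n in (1, 10, 50):
    print(n, "analyses:", round(chance_of_false_alarm(n), 3), "(formula),",
          round(simulate(n), 3), "(simulated)")
```

Under these assumptions, ten independent looks at the data already give about a 40% chance of at least one "significant" peculiarity, and fifty looks give more than 90%.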
And then there’s Jared Horvath, over at SciAm blogs, who thinks irreproducibility is okay, because it’s been done before. He lists some famous historical examples where scientists have cherry-picked their data because they had a hunch that their hypothesis is correct even if the data didn’t support it. Jared concludes:
“There is a larger lesson to be gleaned from this brief history. If replication were the gold standard of scientific progress, we would still be banging our heads against our benches trying to arrive at the precise values that Galileo reported.”
You might forgive Jared, who is a PhD candidate in cognitive neuroscience, for cherry picking his historical data, because he’s been trained in today’s publish-and-perish culture. Unfortunately, he’s not the only one who believes that something is okay because a few people in the past succeeded with it. Michael Brooks has written a whole book about it. In “Free Radicals: The Secret Anarchy of Science”, you can read for example
“It is the intuitive understanding, the gut feeling about what the answer should be, that marks the greatest scientists. Whether they fudge their data or not is actually immaterial.”
Possibly the book gets better after this, but I haven’t progressed beyond this page because every time I see that paragraph I want to cry.
The “gut feeling about what the answer should be” does mark great scientists, yes. It also marks pseudoscientists and crackpots, just that you don’t find these in the history books. The argument that fudging data is okay because great scientists did it and time proved them right is like browsing biographies and concluding that in the past everybody was famous.
I’m not a historian and I cannot set that record straight, but I can tell you that the conclusion that irreproducibility is a necessary ingredient to scientific progress is unwarranted.
But I have one piece of data to make my case, a transcript of a talk given by Irving Langmuir in the 1950s, published in Physics Today in 1989. It carries the brilliant title “Pathological Science” and describes Langmuir’s first-hand encounters with scientists who had a gut feeling about what the answer should be. I really recommend you read the whole thing (pdf here), but just for the flavor here’s an excerpt:
“Mitogenic rays.
About 1923 there was a whole series of papers by Gurwitsch and others. There were hundreds of them published on mitogenic rays. There are still a few of them being published [in 1953]. I don’t know how many of you have ever heard of mitogenic rays. They are given off by growing plants, living things, and were proved, according to Gurwitsch, to be something that would go through glass but not through quartz. They seemed to be some sort of ultraviolet light… If you looked over these photographic plates that showed this ultraviolet light you found that the amount of light was not so much bigger than the natural particles of the photographic plate, so that people could have different opinions as to whether or not it showed this effect. The result was that less than half of the people who tried to repeat these experiments got any confirmation of it…”
Langmuir relates several stories of this type, all about scientists who discarded some of their data or read output to their favor. None of these scientists has left a mark in the history books. They have however done one thing. They’ve wasted their and other scientists’ time by not properly accounting for their methods.
There were hundreds of papers published on a spurious result – in 1953. Since then the scientific community has grown considerably, technology has become much more sophisticated (not to mention expensive), and scientists have become increasingly specialized. For most research findings, there are very few scientists who are able to conduct a reproduction study, even leaving aside the problems with funding and publishing. In 2013, scientists have to rely on their colleagues much more than was the case 60 years ago, and certainly more than in the days of Millikan and Galileo. The harm caused by cherry-picked data and non-reported ‘post-selection’ (a euphemism for cherry-picking), in terms of wasted time, has increased with the community. Heck, there were dozens of researchers who wasted time (and thus their employers’ money...) on ‘superluminal neutrinos’ even though everybody knew these results to be irreproducible (in the sense that they hadn’t been found by any previous measurements).
Worse, this fallacious argument signals a basic misunderstanding about how science works.
The argument is based on the premise that if a scientific finding is correct, it doesn’t matter where it came from or how it was found. That is then taken to justify ignoring any scientific method (and frequently attributed to Feyerabend). It is correct in that in the end it doesn’t matter exactly how a truth about nature was revealed. But we do not speak of a scientific method to say that there is only one way to make progress. The scientific method is used to increase the chances of progress. It’s the difference between letting the proverbial monkey hammer away and hiring a professional science writer for your magazine’s blog. Yes, the monkey can produce a decent blogpost, and if that is so then that is so. But chances are eternal inflation will end before you get to see a good result. That’s why scientists have quality control and publishing ethics, why we have peer review and letters of recommendation, why we speak about statistical significance and double-blind studies and reproducible results: Not because in the absence of methods nothing good can happen, but because these methods have proven useful to prevent us from fooling ourselves and thereby make success considerably more likely.
Having said that, expert intuition can be extremely useful and there is nothing wrong with voicing a “gut feeling” as long as it is marked as such. It is unfortunate indeed that the present academic system does not give much space for scientists to express their intuition, or maybe they are shying away from it. But that’s a different story and shall be told another time.
So the answer to the question posed in the title is a clear no. The question is not whether science has progressed despite the dishonest methods that have been employed in the past, but how much better it would have progressed if that had not been so.
I stole that awesome gif from over here. I don't know its original source.