In 2005, John Ioannidis, a professor of medicine at Stanford University, published a paper, “Why most published research findings are false,” mathematically showing that a huge number of published papers must be incorrect. He also examined a set of well-regarded medical research findings and found that, of the 34 that had been retested, 41% had been contradicted or found to be significantly exaggerated.
Since then, researchers in several scientific areas have consistently struggled to reproduce major results of prominent studies. By some estimates, anywhere from 51% to 89% of published papers are based on studies and experiments whose results cannot be reproduced.
Researchers have recreated prominent studies from several scientific fields and come up with wildly different results. And psychology has become something of a poster child for the “reproducibility crisis” since Brian Nosek, a psychology professor at the University of Virginia, coordinated a Reproducibility Initiative project to repeat 100 psychological experiments and could successfully replicate only 40% of them.
Now, an attempt to replicate another key psychological concept (ego depletion: the idea that willpower is finite and can be worn down with overuse) has come up short. Martin Hagger, psychology professor at Curtin University in Australia, led researchers from 24 labs in trying to recreate a key effect, but found nothing. Their findings are due to be published in Perspectives on Psychological Science in the coming weeks.
Why are they getting it wrong?
No one is accusing the psychologists behind the initial experiments of intentionally manipulating their results. But some of them may have been tripped up by aspects of academic science that inadvertently encourage bias.
For example, there’s massive academic pressure to publish in journals, and these journals tend to publish exciting studies that show strong results.
“Journals favor novelty, originality, and verification of hypotheses over robustness, stringency of method, reproducibility, and falsifiability,” Hagger tells Quartz. “Therefore researchers have been driven to finding significant effects, finding things that are novel, testing them on relatively small samples.”
This has created a publication bias, where studies that show strong, positive results get published, while similar studies that find no significant effect languish unpublished in researchers’ file drawers (the so-called file-drawer problem).
Meanwhile, in cases where researchers have access to large amounts of data, there’s a dangerous tendency to hunt for significant correlations. Test enough variables and some will clear the bar for statistical significance by chance alone. Researchers can thus convince themselves that they’ve spotted a meaningful connection, when in fact such connections are mere coincidences.
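The multiple-comparisons pitfall can be seen in a short simulation (a minimal sketch, not drawn from any of the studies discussed here): correlate a purely random “outcome” against 200 equally random “predictors,” and a handful will look statistically significant anyway.

```python
import random

random.seed(0)
n = 30    # sample size per variable
k = 200   # number of unrelated predictors tested

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# A random outcome with no real relationship to anything.
outcome = [random.gauss(0, 1) for _ in range(n)]

false_positives = 0
for _ in range(k):
    predictor = [random.gauss(0, 1) for _ in range(n)]
    # 0.361 is roughly the two-tailed 5% critical value of |r| for n = 30,
    # so each test has about a 5% chance of a spurious "discovery."
    if abs(pearson_r(outcome, predictor)) > 0.361:
        false_positives += 1

# Expected count under the null is about 0.05 * k = 10 spurious hits.
print(false_positives)
```

A researcher who reports only the handful of “significant” correlations, without mentioning the roughly 200 tests behind them, produces exactly the kind of unreproducible finding the article describes.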
A sign of strength
The idea that papers are publishing false results might sound alarming, but the recent crisis doesn’t mean that the scientific method itself is broken. In fact, science’s focus on its own errors is a sign that researchers are on exactly the right path.
Ivan Oransky, co-founder of Retraction Watch, a blog that tracks retractions printed in journals, tells Quartz that ultimately, the alarm will lead to increased rigor.
“There’s going to be some short-term and maybe mid-term pain as all of this shakes out, but that’s how you move forward,” he says. “It’s like therapy—if you never get angry in therapy, you’re probably not pushing hard enough. If you never find mistakes, or failures to reproduce in your field, you’re probably not asking the right questions.”
For psychologists, who have seen so many results crumble in such a short space of time, the replication crisis could be disheartening. But it also presents a chance to be at the forefront of developing new policies.
Ioannidis tells Quartz that he views the most recent psychology reproducibility failures as a positive. “It shows how much effort and attention has gone towards improving the accuracy of the knowledge produced,” he says. “Psychology is a discipline that has always been very strong methodologically and was at the forefront at describing various biases and better methods. Now they are again taking the lead in improving their replication record.”
For example, there’s already widespread discussion within psychology about pre-registering trials (which would prevent researchers from shifting their methods so as to capture more eye-catching results), making data and scientific methods more open, making sample sizes larger and more representative, and promoting collaboration.
Dorothy Bishop, a professor of developmental neuropsychology at Oxford University, tells Quartz that several funding bodies and journals seem to be receptive to these ideas and that, once one or two adopt such policies, she expects them to spread rapidly.
Doing science on science
Each scientific field must adopt its own methods of ensuring accuracy. But ultimately, this self-reflection is a key part of the scientific process.
As Bishop notes, “Science has proved itself to be an incredibly powerful method.” And yet there’s always room for further advancement.
“There’s never an end point,” says Bishop. “We’re always groping towards the next thing. Sometimes science does disappear down the wrong path for a bit before it corrects itself.”
For Nosek, who led the re-testing of 100 psychology papers, the current focus on reproducibility is simply part of the scientific process.
“Science isn’t about truth and falsity, it’s about reducing uncertainty,” he says. “Really this whole project is science on science: Researchers doing what science is supposed to do, which is be skeptical of our own process, procedure, methods, and look for ways to improve.”