In the popular imagination, the work of scientists is marked by a series of “Eureka!” moments. But most science is not about glorious flashes of insight. Instead, its humble aim is to be less wrong about the theories that govern our understanding of the universe.
The best way to judge new theories, then, is the old adage of “trust, but verify.” In recent decades, however, this simple advice has been ignored thanks to perverse incentives and human biases.
The amount of funding available for scientific research hasn’t kept up with the growing number of scientists in training. In the scramble for a bite of the funding pie, many scientists have been led astray. How many? Witness the ten-fold rise in the number of retractions issued in the scientific literature, nearly half of which may be the result of fraud.
Scientists are starting to do something about it. Among the sciences, psychology has taken the biggest hit to its reputation because of a string of high-profile controversies. Brian Nosek, a psychologist at the University of Virginia, wanted to remedy that. In 2011, he and 270 other scientists launched the Reproducibility Project to “verify” whether the results of 100 arbitrarily chosen psychology studies withstood the scrutiny of replication.
Now, four years later, they have published their results in Science. The findings are worse than they anticipated: just over a third of the replicated studies produced as strong a result as the original research.
That’s a damning result. And yet, as science writer Ed Yong explains in The Atlantic, “failed replications don’t discredit the original studies, any more than successful ones enshrine them as truth.”
Scientists need to balance research that pushes the boundaries of science against less eye-catching studies that simply confirm what we already know. Because there is no agreed-upon baseline for how often replications should succeed, Jason Mitchell of Harvard University says, “we can’t interpret whether 36% [success at replication] is good, bad, or right on the money.”
You might assume that every scientific study is replicable, but it’s not that simple. Universal truths are much harder to establish in psychology than in mathematics. As you climb the complexity pyramid—from mathematics to physics to chemistry to biology to psychology—the number of subjective choices a scientist must make grows quickly, and a minor tweak in any of them can produce a significantly different outcome.
The upshot of Nosek’s grand experiment is not that psychology studies are unreliable, but that we are starting to learn how to make scientific research more rigorous. More broadly, there is a pressing need to take the lessons psychologists learned from the Reproducibility Project and apply them to other scientific fields. This is how science can start fixing itself.