The number of published science papers that have been retracted due to misconduct or fraud has ballooned in the last decade. But as with any academic matter, the root cause is being endlessly debated. What if there were an easy way to spot scientific fraud before it gained wide distribution?
If even the best poker players have “tells” when they are bluffing, then surely there must be a way to catch scientists who fake their data. Jeff Hancock, a professor of communications at Stanford University, reckons that corner-cutting researchers will try to obscure the offending sections of their papers with particularly hard-to-understand language.
Along with graduate student David Markowitz, Hancock set about analyzing papers known to be fraudulent to reveal these “tells.” They developed an “obfuscation index,” scoring the language of a study based on its use of jargon, abstract phrases, positive-emotion terms, and the like.
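To make the idea of a language-based score concrete, here is a minimal sketch in Python. It is not the researchers' method: the word lists, the function names `category_rates` and `composite_score`, and the weights are all hypothetical, chosen only to illustrate how counts of jargon, abstract phrases, and positive-emotion terms could be turned into a single number.

```python
import re
from typing import Dict

# Hypothetical word lists for illustration only; these are NOT the
# lexicons used in the study.
CATEGORIES: Dict[str, set] = {
    "jargon": {"phosphorylation", "heterodimer", "upregulated"},
    "abstract": {"concept", "framework", "notion"},
    "positive": {"excellent", "robust", "promising"},
}


def category_rates(text: str) -> Dict[str, float]:
    """Return, for each category, the number of matching words per 100 words."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return {name: 0.0 for name in CATEGORIES}
    return {
        name: 100.0 * sum(word in vocab for word in words) / len(words)
        for name, vocab in CATEGORIES.items()
    }


def composite_score(rates: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-category rates into one score using caller-supplied weights.

    The weights (including their signs) are an assumption left to the caller;
    the article does not say how the real index combines its components.
    """
    return sum(weight * rates[name] for name, weight in weights.items())


if __name__ == "__main__":
    sample = "The heterodimer was upregulated in a promising framework"
    rates = category_rates(sample)
    # Equal weights are a placeholder, not a claim about the real index.
    print(rates, composite_score(rates, {"jargon": 1.0, "abstract": 1.0, "positive": 1.0}))
```

Keeping the weights as an explicit argument leaves the open question, how much each kind of language should count and in which direction, visible rather than buried in the code.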
After filtering the data, Hancock and Markowitz were left with 250 retracted papers that had been published in life-sciences journals between 1973 and 2013. They compared these with unretracted papers on the same topics, in the same journals, published in the same years. The results were published in the Journal of Language and Social Psychology.
“Fraudulent papers had about 60 more jargon-like words per paper compared to unretracted papers,” Markowitz said. In fact, fraudulent papers scored higher on the obfuscation index even when compared with papers that were retracted for other reasons, such as plagiarism or inadvertent mistakes. The authors hope to hone the index and turn it into a software tool.
“Science fraud is of increasing concern in academia, and automatic tools for identifying fraud might be useful,” Hancock said. “But there is a very high error rate that would need to be improved.”