In the scientific literature, spin refers to reporting practices that fail to faithfully reflect the nature and range of findings, increasing the likelihood that readers will interpret study results in a more favorable light than is warranted. Spin is widespread across a broad range of scientific disciplines, from nutritional epidemiology to psychology to microbiology. How widespread? Hard to tell, but the evidence so far is rather damning. For instance, a 2015 review of 128 study abstracts found examples of spin in 84% of them.
Spin misleads readers in order to impress or persuade them, often with the aid of buzzwords like “innovative”, “promising”, “unique”, “robust”, and “novel”. But spin is not just self-promoting hype; it is a form of dishonest scholarship that undermines the scientific enterprise. Here are some examples, followed by a brief explanation of terms:
Brief Explanation of Terms
Research objectives are what you intend to accomplish in a study, such as to measure something, determine the relationship between two variables, or test a proposed prediction or explanation (i.e., a hypothesis). Sometimes, during the course of the research, the original objective no longer appears feasible, or the data fail to support the initial hypothesis. An unethical scientist may then change the objective or hypothesis mid-study to one “more likely to succeed” without reporting the switch.
Statistical significance is often expressed as a p-value: the probability of obtaining results at least as extreme as those observed, assuming there is no real effect (the null hypothesis). A small p-value means the results would be unlikely under chance alone; by convention, results with a p-value below some threshold (commonly 0.05) are labeled statistically significant. Note that statistical significance in itself says nothing about the size of an effect. The relationship between two variables can be both statistically significant and trivial.
Effect size is an essential component when evaluating the strength of a statistical claim. The effect size is a number measuring the magnitude of an effect in a quantitative study. In reporting and interpreting studies, both the effect size and the p-value (statistical significance) should be reported.
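To make the distinction concrete, here is a minimal sketch in Python (hypothetical simulated data; numpy and scipy assumed available) of a comparison that is statistically significant yet practically trivial: with a large enough sample, even a 0.02-standard-deviation difference yields a tiny p-value, while the effect size (Cohen's d) shows how negligible the effect really is.

```python
# A sketch of a "significant but trivial" result, using simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)

# Two groups of 100,000 each; the true difference in means is tiny (0.02 SD).
control = rng.normal(loc=0.00, scale=1.0, size=100_000)
treated = rng.normal(loc=0.02, scale=1.0, size=100_000)

t_stat, p_value = stats.ttest_ind(treated, control)

# Cohen's d: difference in means divided by the pooled standard deviation.
pooled_sd = np.sqrt((control.var(ddof=1) + treated.var(ddof=1)) / 2)
cohens_d = (treated.mean() - control.mean()) / pooled_sd

print(f"p = {p_value:.4g}")           # typically far below 0.05
print(f"Cohen's d = {cohens_d:.3f}")  # ~0.02, a negligible effect
```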
P-hacking is a way of misrepresenting true effect sizes in published reports: running multiple statistical analyses and then reporting only those that produce statistically significant results, ignoring or downplaying the rest. Note also that the more statistical tests one throws at the data, the more likely it is that some will come back statistically significant purely by chance.
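As a rough illustration, here is a sketch (simulated data, numpy and scipy assumed) of twenty t-tests run on pure noise. No real effect exists in any of them, yet at the conventional 0.05 threshold the chance that at least one test comes back “significant” is about 64% (1 - 0.95^20).

```python
# Twenty tests on pure noise: any "significant" result is a false positive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
n_tests, alpha = 20, 0.05

false_positives = 0
for i in range(n_tests):
    # Both groups come from the same distribution: no real effect exists.
    a = rng.normal(size=30)
    b = rng.normal(size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1
        print(f"test {i}: p = {p:.3f}  <- 'significant' by chance alone")

# With 20 independent tests at alpha = 0.05, the probability of at least
# one false positive is 1 - 0.95**20, roughly 64%.
print(f"{false_positives} of {n_tests} null tests crossed the threshold")
```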
Post hoc analysis consists of statistical analyses that were done after the data were seen, in contrast to analyses that were pre-specified by the researchers when they designed the study. Post hoc analysis is another way researchers dredge for significant findings by running multiple statistical tests on the data. It is fine to do post hoc analyses as long as they are identified as such, but some researchers neglect to do so, giving the impression that they had always intended to run these tests (allowing them to garner undeserved reputation points if the post hoc tests yield significant results).
To beautify methods is to report them as if they met the highest standards when in fact they did not; for example, describing a study as a randomized controlled trial (RCT) when randomization procedures were not applied consistently.
The lack of a statistically significant difference between the results of a new treatment and an existing treatment (e.g., the cure rate for a medical condition) does not demonstrate equivalence of the two treatments: the study may simply have lacked the statistical power to detect a real difference, and the treatments may still differ in other important respects, such as ease of use, safety, or cost. Researchers who misrepresent treatments as equivalent may encourage some clinicians to start prescribing the less-established treatment before it has received full scrutiny.
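Here is a minimal sketch (hypothetical cure-rate numbers, numpy and scipy assumed) of why “not significantly different” is not the same as “equivalent”: in a small trial, the confidence interval around the difference is often wide enough to include both near-equivalence and a clinically important gap.

```python
# A small two-arm comparison where p > 0.05 yet equivalence is NOT shown.
import numpy as np
from scipy import stats

# Hypothetical outcomes: 18/30 cured on the existing treatment, 13/30 on the new one.
cured_existing, n_existing = 18, 30
cured_new, n_new = 13, 30

p1, p2 = cured_existing / n_existing, cured_new / n_new
diff = p1 - p2  # ~0.17 in favor of the existing treatment

# Two-proportion z-test (normal approximation).
pooled = (cured_existing + cured_new) / (n_existing + n_new)
se_pooled = np.sqrt(pooled * (1 - pooled) * (1 / n_existing + 1 / n_new))
z = diff / se_pooled
p_value = 2 * stats.norm.sf(abs(z))

# 95% confidence interval for the difference in cure rates.
se = np.sqrt(p1 * (1 - p1) / n_existing + p2 * (1 - p2) / n_new)
ci = (diff - 1.96 * se, diff + 1.96 * se)

print(f"p = {p_value:.2f}")  # ~0.20: "no significant difference"
print(f"95% CI for the difference: ({ci[0]:.2f}, {ci[1]:.2f})")
# The interval runs from roughly -0.08 to +0.42, so the data are compatible
# with anything from near-equivalence to a 40-point deficit in cure rate
# for the new treatment.
```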
The limitations of a study are defined as “any characteristics, traits, actions, or influences that could impact the research process, and therefore its findings… ranging from internal aspects, such as flaws in design and methodology, to external influences that a researcher was unable to control” (R. Kinloch, 2020). All studies have limitations that impact their validity and the generalizability of their conclusions, which is why researchers should avoid extrapolating study conclusions to larger populations, different settings, and other species. All research articles should include a serious discussion of limitations.
I also think most research articles should include alternative explanations for study results. If they don’t, or they do but the tone is dismissive, or the authors simply overlook some obvious alternative explanations, well…that’s spin for you.
—
References
Boutron, I., & Ravaud, P. (2018). Misrepresentation and distortion of research in biomedical literature. Proceedings of the National Academy of Sciences, 115(11), 2613–2619. https://www.pnas.org/content/115/11/2613.short
Chiu, K., Grundy, Q., & Bero, L. (2017). ‘Spin’ in published biomedical literature: A methodological systematic review. PLoS Biology, 15(9), e2002173. https://doi.org/10.1371/journal.pbio.2002173
Kinloch, R. (2020, January 30). Limitations Section. IvyPanda.
Lazarus, C., Haneef, R., Ravaud, P., et al. (2015). Classification and prevalence of spin in abstracts of non-randomized studies evaluating an intervention. BMC Medical Research Methodology, 15, 85. https://doi.org/10.1186/s12874-015-0079-x
Ritchie, S. (2020). Science Fictions: Exposing Fraud, Bias, Negligence and Hype in Science. Metropolitan Books.
Sullivan, G. M., & Feinn, R. (2012). Using Effect Size-or Why the P Value Is Not Enough. Journal of Graduate Medical Education, 4(3), 279–282. https://doi.org/10.4300/JGME-D-12-00156.1