101. Detecting anomalous referencing patterns in PubMed papers suggestive of author-centric reference list manipulation
- Author
-
Jonathan D. Wren and Constantin Georgescu
- Subjects
General Social Sciences ,Library and Information Sciences ,Computer Science Applications - Abstract
Although citations are used as a quantifiable, objective metric of academic influence, references could be added to a paper solely to inflate the perceived influence of a body of research. This reference list manipulation (RLM) could take place during the peer-review process, or prior to it. Surveys have estimated how many people may have been affected by coercive RLM at one time or another, but it is not known how many authors engage in RLM, nor to what degree. By examining a subset of active, highly published authors (n = 20,803) in PubMed, we find the frequency of non-self-citations (NSC) to one author coming from a single paper approximates Zipf’s law. Author-centric deviations from it are approximately normally distributed, permitting deviations to be quantified statistically. Framed as an anomaly detection problem, statistical confidence increases when an author is an outlier by multiple metrics. Anomalies are not proof of RLM, but authors engaged in RLM will almost unavoidably create anomalies. We find the NSC Gini Index correlates highly with anomalous patterns across multiple “red flags”, each suggestive of RLM. Between 81 (0.4%, FDR
- Published
- 2022