
Semi-automated fact-checking of nucleotide sequence reagents in biomedical research publications: The Seek & Blastn tool

Authors :
Thierry Gautier
Cyril Labbé
Bertrand Favier
Jennifer A. Byrne
Natalie Grima
Systèmes d’Information - inGénierie et Modélisation Adaptables (SIGMA)
Laboratoire d'Informatique de Grenoble (LIG)
Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
The University of Sydney
Institute for Advanced Biosciences / Institut pour l'Avancée des Biosciences (Grenoble) (IAB)
Centre Hospitalier Universitaire [Grenoble] (CHU)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Etablissement français du sang - Auvergne-Rhône-Alpes (EFS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
Source :
PLoS ONE, Public Library of Science, 2019, 14 (3), e0213266. ⟨10.1371/journal.pone.0213266⟩
Publication Year :
2019
Publisher :
Public Library of Science, 2019.

Abstract

Nucleotide sequence reagents are verifiable experimental reagents in biomedical publications, because their sequence identities can be independently verified and compared with associated text descriptors. We have previously reported that incorrectly identified nucleotide sequence reagents are characteristic of highly similar human gene knockdown studies, some of which have been retracted from the literature on account of possible research fraud. Because of the throughput limitations of manual verification of nucleotide sequences, we developed a semi-automated fact-checking tool, Seek & Blastn, to verify the targeting or non-targeting status of published nucleotide sequence reagents. From previously described and unknown corpora of 48 and 155 publications, respectively, Seek & Blastn correctly extracted 304/342 (88.9%) and 1066/1522 (70.0%) nucleotide sequences and a predicted targeting/non-targeting status. Seek & Blastn correctly predicted the targeting/non-targeting status of 293/304 (96.4%) and 988/1066 (92.7%) of the correctly extracted nucleotide sequences. In the two literature corpora, 38/39 (97.4%) and 31/79 (39.2%) of Seek & Blastn predictions of incorrect nucleotide sequence reagent use, respectively, were correct. Combined Seek & Blastn and manual analyses identified a list of 91 misidentified nucleotide sequence reagents, which could be built upon through future studies. In summary, incorrect nucleotide sequence reagents represent an under-recognized source of error within the biomedical literature, and fact-checking tools such as Seek & Blastn may help to identify papers and manuscripts affected by these errors.
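
The proportions reported in the abstract can be recomputed directly from the stated counts. The following is a minimal sketch for that arithmetic only; the names `REPORTED` and `pct` and the corpus labels are assumptions introduced for illustration and are not part of Seek & Blastn.

```python
# Recompute the percentages quoted in the abstract from the raw counts.
# Labels and helper names are hypothetical; only the counts come from the abstract.

REPORTED = {
    # measure: {corpus: (numerator, denominator, percentage stated in abstract)}
    "sequences correctly extracted": {
        "described corpus (48 papers)": (304, 342, 88.9),
        "unknown corpus (155 papers)": (1066, 1522, 70.0),
    },
    "status correctly predicted": {
        "described corpus (48 papers)": (293, 304, 96.4),
        "unknown corpus (155 papers)": (988, 1066, 92.7),
    },
    "correct predictions of incorrect reagent use": {
        "described corpus (48 papers)": (38, 39, 97.4),
        "unknown corpus (155 papers)": (31, 79, 39.2),
    },
}


def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


if __name__ == "__main__":
    for measure, corpora in REPORTED.items():
        for corpus, (num, den, stated) in corpora.items():
            computed = pct(num, den)
            flag = "matches" if computed == stated else "differs from"
            print(f"{measure} | {corpus}: {num}/{den} = {computed}% ({flag} stated {stated}%)")
```

Note that the denominators of the second measure (304 and 1066) are the numerators of the first: prediction accuracy is reported only over the sequences that were correctly extracted.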

Details

Language :
English
ISSN :
1932-6203
Volume :
14
Issue :
3
Database :
OpenAIRE
Journal :
PLoS ONE
Accession number :
edsair.doi.dedup.....838965b5106ec1b69547c35392f1ddcd