Semi-automated fact-checking of nucleotide sequence reagents in biomedical research publications: The Seek & Blastn tool
- Source :
- PLoS ONE, Vol 14, Iss 3, e0213266 (2019), Public Library of Science. doi:10.1371/journal.pone.0213266
- Publication Year :
- 2019
- Publisher :
- Public Library of Science, 2019.
Abstract
- Nucleotide sequence reagents are verifiable experimental reagents in biomedical publications, because their sequence identities can be independently verified and compared with associated text descriptors. We have previously reported that incorrectly identified nucleotide sequence reagents are characteristic of highly similar human gene knockdown studies, some of which have been retracted from the literature on account of possible research fraud. Because of the throughput limitations of manual verification of nucleotide sequences, we developed a semi-automated fact-checking tool, Seek & Blastn, to verify the targeting or non-targeting status of published nucleotide sequence reagents. From previously described and unknown corpora of 48 and 155 publications, respectively, Seek & Blastn correctly extracted 304/342 (88.9%) and 1066/1522 (70.0%) nucleotide sequences and a predicted targeting/non-targeting status. Seek & Blastn correctly predicted the targeting/non-targeting status of 293/304 (96.4%) and 988/1066 (92.7%) of the correctly extracted nucleotide sequences. A total of 38/39 (97.4%) and 31/79 (39.2%) Seek & Blastn predictions of incorrect nucleotide sequence reagent use were correct in the two literature corpora, respectively. Combined Seek & Blastn and manual analyses identified a list of 91 misidentified nucleotide sequence reagents, which could be built upon through future studies. In summary, incorrect nucleotide sequence reagents represent an under-recognized source of error within the biomedical literature, and fact-checking tools such as Seek & Blastn may help to identify papers and manuscripts affected by these errors.
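The proportions quoted in the abstract can be recomputed from the reported counts. The following is a minimal sketch for checking that arithmetic only; the dictionary labels and the `percent` helper are illustrative names, not part of Seek & Blastn.

```python
# Counts quoted in the abstract, as (numerator, denominator) pairs.
# First pair: previously described corpus (48 publications).
# Second pair: unknown corpus (155 publications).
# The labels below are illustrative, not the tool's own terminology.
reported = {
    "sequences correctly extracted": [(304, 342), (1066, 1522)],
    "targeting status correctly predicted": [(293, 304), (988, 1066)],
    "correct predictions of incorrect reagent use": [(38, 39), (31, 79)],
}


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


for label, pairs in reported.items():
    described, unknown = (percent(n, d) for n, d in pairs)
    print(f"{label}: {described}% (48 publications), {unknown}% (155 publications)")
```

Each recomputed value matches the percentage given in the abstract to one decimal place.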
- Subjects :
- Biomedical Research
Computer science
Text Mining
[SDV]Life Sciences [q-bio]
Fact checking
Artificial Gene Amplification and Extension
Polymerase Chain Reaction
0302 clinical medicine
law
Nucleotide
0303 health sciences
Gene knockdown
Hypertext
Multidisciplinary
Database and informatics methods
Publications
Nucleic acid sequence
Sequence analysis
3. Good health
[INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
030220 oncology & carcinogenesis
Medicine
Information Technology
Algorithms
Research Article
Computer and Information Sciences
Bioinformatics
Science
Nucleotide sequencing
Sequence Databases
Computational biology
Research and Analysis Methods
03 medical and health sciences
Humans
[INFO]Computer Science [cs]
Molecular Biology Techniques
Sequencing Techniques
Molecular Biology
DNA sequence analysis
030304 developmental biology
Sequence (medicine)
Comparative Sequence Analysis
Biology and Life Sciences
Sequence Analysis, DNA
Biological Databases
chemistry
Indicators and Reagents
Sequence Alignment
Details
- Language :
- English
- ISSN :
- 19326203
- Volume :
- 14
- Issue :
- 3
- Database :
- OpenAIRE
- Journal :
- PLoS ONE
- Accession number :
- edsair.doi.dedup.....838965b5106ec1b69547c35392f1ddcd