seqQscorer: automated quality control of next-generation sequencing data using machine learning

Authors :
Steffen Albrecht
Maximilian Sprang
Miguel A. Andrade-Navarro
Jean-Fred Fontaine
Source :
Genome Biology, Vol 22, Iss 1, Pp 1-20 (2021)
Publication Year :
2021
Publisher :
BMC, 2021.

Abstract

Controlling quality of next-generation sequencing (NGS) data files is a necessary but complex task. To address this problem, we statistically characterize common NGS quality features and develop a novel quality control procedure involving tree-based and deep learning classification algorithms. Predictive models, validated on internal and external functional genomics datasets, are to some extent generalizable to data from unseen species. The derived statistical guidelines and predictive models represent a valuable resource for users of NGS data to better understand quality issues and perform automatic quality control. Our guidelines and software are available at https://github.com/salbrec/seqQscorer.

Details

Language :
English
ISSN :
1474-760X
Volume :
22
Issue :
1
Database :
Directory of Open Access Journals
Journal :
Genome Biology
Publication Type :
Academic Journal
Accession number :
edsdoj.513b47928e74db3b970f9add8caa872
Document Type :
article
Full Text :
https://doi.org/10.1186/s13059-021-02294-2