Exploring the limit of using a deep neural network on pileup data for germline variant calling
- Source :
- Nature Machine Intelligence. 2:220-227
- Publication Year :
- 2020
- Publisher :
- Springer Science and Business Media LLC, 2020.
- Abstract :
- Single-molecule sequencing technologies have emerged in recent years and revolutionized structural variant calling, complex genome assembly and epigenetic mark detection. However, the lack of a highly accurate small variant caller has limited these technologies from being more widely used. Here, we present Clair, the successor to Clairvoyante, a program for fast and accurate germline small variant calling, using single-molecule sequencing data. For Oxford Nanopore Technology data, Clair achieves better precision, recall and speed than several competing programs, including Clairvoyante, Longshot and Medaka. Through studying the missed variants and benchmarking intentionally overfitted models, we found that Clair may be approaching the limit of possible accuracy for germline small variant calling using pileup data and deep neural networks. Clair requires only a conventional central processing unit (CPU) for variant calling and is an open-source project available at https://github.com/HKU-BAL/Clair.
- A lack of accurate and efficient variant calling methods has held back single-molecule sequencing technologies from clinical applications. The authors present a deep-learning method for fast and accurate germline small variant calling, using single-molecule sequencing data.
- Subjects :
- 0301 basic medicine
Artificial neural network
Computer Networks and Communications
Computer science
Sequencing data
Sequence assembly
Machine learning
Germline
Human-Computer Interaction
03 medical and health sciences
030104 developmental biology
0302 clinical medicine
Artificial Intelligence
Deep neural networks
Computer Vision and Pattern Recognition
Nanopore sequencing
Artificial intelligence
Limit (mathematics)
Central processing unit
030217 neurology & neurosurgery
Software
Details
- ISSN :
- 2522-5839
- Volume :
- 2
- Database :
- OpenAIRE
- Journal :
- Nature Machine Intelligence
- Accession number :
- edsair.doi...........b727c7fbdfbac66b9c6c4a919b4a4c64