Combinatorial Detection Algorithm for Copy Number Variations Using High-throughput Sequencing Reads.

Authors :
Yang, Hai
Zhu, Daming
Source :
International Journal of Pattern Recognition & Artificial Intelligence. Dec2019, Vol. 33 Issue 14, pN.PAG-N.PAG. 18p.
Publication Year :
2019

Abstract

Copy number variation (CNV) is a prevalent form of genetic structural variation that results in an abnormal number of copies of large genomic regions, such as gains or losses of DNA segments larger than 1 kb. CNVs occur not only in the human genome but also in plant genomes. Research has shown that CNVs are associated with many complex diseases. In this paper, guanine-cytosine (GC) bias, mappability, and their effects on read depth signals in sequencing data are discussed first. Subsequently, a new correction method for GC bias and an improved combinatorial CNV detection algorithm for high-throughput sequencing reads based on a hidden Markov model (CNV-HMM) are proposed. The corrected read depth signals have lower correlation with GC content, read mappability, and the width of the analysis window. Then we create a hidden Markov model which maps the reads onto the reference genome and records the unmapped reads. The unmapped reads are counted and normalized. The CNV-HMM detects abnormal read count signals and obtains candidate CNVs using the expectation maximization (EM) algorithm. Finally, we filter the candidate CNVs using split reads to improve the performance of our algorithm. The experimental results indicate that the CNV-HMM algorithm has higher accuracy and sensitivity for CNV detection than most current detection algorithms. [ABSTRACT FROM AUTHOR]
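
The abstract does not describe the authors' new GC correction, so the sketch below illustrates only the general idea of removing GC dependence from read depth. It uses the widely used median-ratio normalization, which is not the method proposed in the paper. The function name `gc_correct`, the per-window inputs, and the toy numbers are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def gc_correct(read_depth, gc_percent):
    """Median-ratio GC correction (illustrative, not the paper's method).

    read_depth: raw read counts, one per analysis window.
    gc_percent: integer GC percentage (0-100) of each window.
    Each window is rescaled by global_median / median of the windows
    that share its GC percentage.
    """
    if len(read_depth) != len(gc_percent):
        raise ValueError("read_depth and gc_percent must have equal length")

    global_median = median(read_depth)

    # Group raw depths by GC percentage.
    by_gc = defaultdict(list)
    for rd, gc in zip(read_depth, gc_percent):
        by_gc[gc].append(rd)
    gc_median = {gc: median(values) for gc, values in by_gc.items()}

    corrected = []
    for rd, gc in zip(read_depth, gc_percent):
        m = gc_median[gc]
        # Leave the window unscaled if its GC bin has a zero median.
        corrected.append(rd * global_median / m if m > 0 else float(rd))
    return corrected


# Toy data: windows at 60% GC are sequenced at about half the depth.
depth = [100, 110, 50, 55, 100, 50]
gc = [40, 40, 60, 60, 40, 60]
corrected = gc_correct(depth, gc)
```

After correction, windows in the two GC bins sit on the same scale, so a remaining deviation in depth is more likely to reflect a copy number change than GC content. A real pipeline would typically also handle mappability and sparsely populated GC bins, which this sketch ignores.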

Details

Language :
English
ISSN :
02180014
Volume :
33
Issue :
14
Database :
Academic Search Index
Journal :
International Journal of Pattern Recognition & Artificial Intelligence
Publication Type :
Academic Journal
Accession number :
141105860
Full Text :
https://doi.org/10.1142/S0218001419500228