Combinatorial Detection Algorithm for Copy Number Variations Using High-throughput Sequencing Reads.
- Source :
- International Journal of Pattern Recognition & Artificial Intelligence. Dec 2019, Vol. 33 Issue 14, pN.PAG-N.PAG. 18p.
- Publication Year :
- 2019
Abstract
- Copy number variation (CNV) is a prevalent kind of genetic structural variation that leads to an abnormal number of copies of large genomic regions, such as a gain or loss of DNA segments larger than 1 kb. CNV exists not only in the human genome but also in plant genomes. Recent research has shown that CNV is associated with many complex diseases. In this paper, guanine-cytosine (GC) bias, mappability and their effect on read depth signals in sequencing data are discussed first. Subsequently, a new correction method for GC bias and an improved combinatorial detection algorithm for CNV using high-throughput sequencing reads based on a hidden Markov model (CNV-HMM) are proposed. The corrected read depth signals have lower correlation with GC content, mappability of reads and the width of the analysis window. Then we create a hidden Markov model which maps the reads onto the reference genome and records the unmapped reads. The unmapped reads are counted and normalized. The CNV-HMM detects abnormal read count signals and obtains the candidate CNVs using the expectation maximization (EM) algorithm. Finally, we filter the candidate CNVs using split reads to improve the performance of our algorithm. The experimental results indicate that the CNV-HMM algorithm has higher accuracy and sensitivity for CNV detection than most current detection algorithms. [ABSTRACT FROM AUTHOR]
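The abstract does not spell out the paper's new GC correction, so the sketch below illustrates only the general idea of removing GC bias from read depth, using the widely used median-ratio scaling per GC stratum instead. It is not the authors' method. The function name `gc_correct`, the `bin_width` parameter and the sample numbers are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def gc_correct(read_counts, gc_fractions, bin_width=0.05):
    """Median-ratio GC correction (a generic technique, assumed here).

    Each window's read count is scaled by
    (median over all windows) / (median over windows with similar GC).
    """
    if len(read_counts) != len(gc_fractions):
        raise ValueError("read_counts and gc_fractions must have equal length")

    # Group window indices into GC strata of width `bin_width`.
    strata = defaultdict(list)
    for i, gc in enumerate(gc_fractions):
        strata[int(gc / bin_width)].append(i)

    global_median = median(read_counts)
    corrected = [0.0] * len(read_counts)
    for indices in strata.values():
        stratum_median = median(read_counts[i] for i in indices)
        for i in indices:
            # A stratum with zero median depth carries no usable signal;
            # its windows are left at 0.0 rather than divided by zero.
            if stratum_median > 0:
                corrected[i] = read_counts[i] * global_median / stratum_median
    return corrected


# Hypothetical windows: high-GC windows show inflated depth, and the
# third low-GC window has half the depth of its neighbours.
counts = [100, 100, 50, 200, 200, 200]
gc = [0.31, 0.32, 0.33, 0.51, 0.52, 0.53]
corrected = gc_correct(counts, gc)
```

After correction, windows that differed only in GC content share the same depth, while the half-depth window remains at half the baseline. That residual is the kind of abnormal read count signal a downstream detector, such as the HMM described in the abstract, would examine.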
Details
- Language :
- English
- ISSN :
- 02180014
- Volume :
- 33
- Issue :
- 14
- Database :
- Academic Search Index
- Journal :
- International Journal of Pattern Recognition & Artificial Intelligence
- Publication Type :
- Academic Journal
- Accession number :
- 141105860
- Full Text :
- https://doi.org/10.1142/S0218001419500228