1. GenHap: Evolutionary Computation For Haplotype Assembly
- Author
Tangherloni, A, Spolaor, S, Rundo, L, Nobile, M, Cazzaniga, P, Mauri, G, Liò, P, Besozzi, D, and Merelli, I
- Subjects
Haplotype assembly, Genetic algorithms, Combinatorial optimization, Weighted Minimum Error Correction problem
- Abstract
The reconstruction of the two distinct copies of each chromosome, called haplotypes, is an essential step in characterizing the genome of an individual. Here we address a successful formulation of haplotype assembly, the weighted Minimum Error Correction (wMEC) problem, which consists of computing the two haplotypes that partition the sequencing reads into two disjoint subsets with the least number of corrections to the Single Nucleotide Polymorphism (SNP) values. To solve this problem we propose GenHap, a computational method based on Genetic Algorithms, which are able to obtain optimal solutions thanks to a global search process. To evaluate the effectiveness of GenHap, we test it on a synthetic (yet realistic) dataset based on the PacBio RS II sequencing technology. We compare the performance of GenHap against HapCol, an efficient state-of-the-art algorithm for haplotype assembly. We show that GenHap always obtains high-accuracy solutions (in terms of haplotype error rate) and is up to 20× faster than HapCol on this dataset.
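The wMEC objective described in the abstract can be illustrated with a minimal sketch. It is not GenHap's implementation: the abstract does not specify the encoding, the weights, or the genetic operators, so the read matrix layout, the per-entry weights, and the names `wmec_cost`, `reads`, `weights`, and `partition` are all assumptions made for illustration. The sketch only scores one candidate bipartition of the reads, which is the kind of quantity a search method would try to minimize.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical representation: each read is a row over SNP positions,
# holding 0 or 1 where the read covers the SNP and None where it does not.
Allele = Optional[int]


def wmec_cost(
    reads: Sequence[Sequence[Allele]],
    weights: Sequence[Sequence[float]],
    partition: Sequence[int],
) -> Tuple[float, List[List[Allele]]]:
    """Score a bipartition of reads under a weighted MEC objective.

    For each side of the partition and each SNP, the consensus allele is
    the one with the larger total weight; entries that disagree with it
    count as corrections, and their weights are added to the cost.
    """
    n_snps = len(reads[0])
    haplotypes: List[List[Allele]] = [[None] * n_snps, [None] * n_snps]
    total = 0.0
    for side in (0, 1):
        for j in range(n_snps):
            weight_of = [0.0, 0.0]  # accumulated weight for allele 0 and 1
            for read, read_w, p in zip(reads, weights, partition):
                if p == side and read[j] is not None:
                    weight_of[read[j]] += read_w[j]
            if weight_of[0] + weight_of[1] == 0.0:
                continue  # no read on this side covers SNP j
            allele = 0 if weight_of[0] >= weight_of[1] else 1
            haplotypes[side][j] = allele
            total += weight_of[1 - allele]  # weight of corrected entries
    return total, haplotypes


# Toy data (invented for illustration, not taken from the paper).
reads = [
    [0, 0, 1, None],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],  # position 1 disagrees with the read above
]
weights = [[1.0] * 4 for _ in reads]
weights[3][1] = 0.5  # the disagreeing call is assumed to be low confidence

good_cost, good_haps = wmec_cost(reads, weights, [0, 0, 1, 1])
bad_cost, _ = wmec_cost(reads, weights, [0, 1, 0, 1])
print(good_cost, good_haps)
print(bad_cost)
```

In this toy case the partition that groups the first two reads against the last two needs a single low-weight correction, while the interleaved partition needs many, so a minimizer of this cost would prefer the former.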
- Published
- 2018