1. Identifying Y-chromosome haplogroups in arbitrarily large samples of sequenced or genotyped men
- Author
- Poznik GD
- Subjects
Genetics, ComputingMethodologies_PATTERNRECOGNITION, Mutation (genetic algorithm), Genotype, Single-nucleotide polymorphism, Computational biology, 1000 Genomes Project, Biology, Y chromosome, Missing data, Genotyping, Haplogroup
- Abstract
We have developed an algorithm to rapidly and accurately identify the Y-chromosome haplogroup of each male in a sample of one to millions. The algorithm, implemented in the yHaplo software package, does not rely on any particular genotyping modality or platform. Full sequences yield the most granular haplogroup classifications, but genotyping arrays can yield reliable calls, provided a reasonable number of phylogenetically informative variants has been assayed. The algorithm is robust to missing data, genotype errors, mutation recurrence, and other complications. We have tested the software on full sequences from phase 3 of the 1000 Genomes Project and on subsets thereof constructed by downsampling to SNPs present on each of four genotyping arrays. We have also run the software on array data from more than 600,000 males.
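To illustrate the general idea of calling a haplogroup by walking a phylogenetic tree while tolerating missing data and occasional genotype errors, here is a minimal toy sketch. It is not the yHaplo implementation: the tree, the node names (`X`, `X1`, `Y1`), the SNP names (`x1`, `y1a`), the `"D"`/`"A"` encoding for derived/ancestral calls, and the simple majority rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """One branch of a toy Y-chromosome tree (hypothetical names throughout)."""
    name: str
    snps: List[str] = field(default_factory=list)      # SNPs defining this branch
    children: List["Node"] = field(default_factory=list)


def branch_counts(node: Node, calls: Dict[str, str]) -> Tuple[int, int]:
    """Count derived ("D") and ancestral ("A") calls on a branch; missing SNPs are skipped."""
    derived = sum(1 for s in node.snps if calls.get(s) == "D")
    ancestral = sum(1 for s in node.snps if calls.get(s) == "A")
    return derived, ancestral


def call_haplogroup(node: Node, calls: Dict[str, str]) -> str:
    """Return the deepest node whose branch is supported by a majority of derived calls."""
    for child in node.children:
        derived, ancestral = branch_counts(child, calls)
        if derived > ancestral:
            # Majority of observed SNPs are derived: descend, tolerating a few errors.
            return call_haplogroup(child, calls)
        if derived == ancestral:
            # No data (or a tie) on this branch: look below it for evidence.
            deeper = call_haplogroup(child, calls)
            if deeper != child.name:
                return deeper
        # Otherwise the evidence is against this branch; try the next sibling.
    return node.name


# Toy tree: root -> X -> (X1, X2) and root -> Y -> Y1.
tree = Node("root", [], [
    Node("X", ["x1", "x2", "x3"], [
        Node("X1", ["x1a"]),
        Node("X2", ["x2a", "x2b"]),
    ]),
    Node("Y", ["y1"], [Node("Y1", ["y1a"])]),
])

# One conflicting call on branch X (x3) and several SNPs not assayed at all.
sample = {"x1": "D", "x2": "D", "x3": "A", "x2a": "D", "y1": "A"}
print(call_haplogroup(tree, sample))  # prints "X2"
```

The majority rule lets a single ancestral call on an otherwise derived branch be outvoted, and the look-ahead through branches with no observed SNPs mimics how a sparse genotyping array can still support a call deeper in the tree.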
- Published
- 2016