1. NGSremix: A software tool for estimating pairwise relatedness between admixed individuals from next-generation sequencing data
- Author
- Zilong Li, Genís Garcia-Erill, Kristian Hanghøj, Anne Krogh Nøhr, Ida Moltke, and Anders Albrechtsen
- Subjects
AcademicSubjects/SCI01140 ,relatedness ,Genotype ,Research areas ,AcademicSubjects/SCI00010 ,Computer science ,Maximum likelihood ,Software tool ,low-depth NGS data ,maximum likelihood estimation ,Biology ,Software and Data Resources ,QH426-470 ,AcademicSubjects/SCI01180 ,computer.software_genre ,Polymorphism, Single Nucleotide ,genotype likelihoods ,DNA sequencing ,03 medical and health sciences ,Software ,0302 clinical medicine ,Genetics ,Humans ,Molecular Biology ,Genetics (clinical) ,Probability ,030304 developmental biology ,0303 health sciences ,business.industry ,High-Throughput Nucleotide Sequencing ,Genetics, Population ,AcademicSubjects/SCI00960 ,admixture ,Pairwise comparison ,Data mining ,Estimation methods ,business ,computer ,030217 neurology & neurosurgery
- Abstract
Estimation of relatedness between pairs of individuals is important in many areas of genetic research. When estimating relatedness, it is important to account for admixture if it is present. However, the methods that can account for admixture all take genotype data as input, which is a problem for low-depth next-generation sequencing (NGS) data, from which genotypes are called with high uncertainty. Here we present a software tool, NGSremix, for maximum likelihood estimation of relatedness between pairs of admixed individuals from low-depth NGS data, which takes the uncertainty of the genotypes into account via genotype likelihoods. Using both simulated and real NGS data for admixed individuals with an average depth of 4x or below, we show that our method works well and clearly outperforms the commonly used state-of-the-art relatedness estimation methods PLINK, KING, relateAdmix, and ngsRelate, all of which perform quite poorly on such data. Hence, NGSremix is a useful new tool for estimating relatedness in admixed populations from low-depth NGS data. NGSremix is implemented in C/C++ as multi-threaded software and is freely available on GitHub: https://github.com/KHanghoj/NGSremix.
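To illustrate why called genotypes are unreliable at low depth, and what a genotype likelihood carries instead, here is a minimal sketch. It assumes a simple binomial sequencing-error model at a diallelic site; this is not the model used by NGSremix, and the function names and the 1% error rate are hypothetical choices made for illustration only.

```python
from math import comb


def genotype_likelihoods(n_ref, n_alt, error_rate=0.01):
    """Return P(reads | G) for G = 0, 1, 2 copies of the alternate allele.

    Assumes independent reads and a symmetric per-read error rate
    (an illustrative model, not the one implemented in NGSremix).
    """
    n = n_ref + n_alt
    likelihoods = []
    for g in (0, 1, 2):
        # Probability that a single read shows the alternate allele given genotype g.
        p_alt = (g / 2) * (1 - error_rate) + (1 - g / 2) * error_rate
        likelihoods.append(comb(n, n_alt) * p_alt**n_alt * (1 - p_alt) ** n_ref)
    return likelihoods


def normalize(values):
    """Scale the likelihoods so they sum to 1 for easier comparison."""
    total = sum(values)
    return [v / total for v in values]


# Two reference reads and no alternate reads: a heterozygote is still plausible.
low = normalize(genotype_likelihoods(n_ref=2, n_alt=0))
# Twenty reference reads: the heterozygote is effectively ruled out.
high = normalize(genotype_likelihoods(n_ref=20, n_alt=0))

print("depth 2: ", low)
print("depth 20:", high)
```

At depth 2 the heterozygous genotype keeps a substantial share of the normalized likelihood, so a hard genotype call would discard real uncertainty. A method that works on the three likelihoods directly can carry that uncertainty into the relatedness estimate.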
- Published
- 2021