1. A hidden Markov model to estimate homozygous-by-descent probabilities associated with nested layers of ancestors
- Author
Tom Druet, Mathieu Gautier
Affiliations: Université de Liège, Unit of Animal Genomics, GIGA-R & Faculty of Veterinary Medicine; Centre de Biologie pour la Gestion des Populations (UMR CBGP), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-Institut de Recherche pour le Développement (IRD [France-Sud])-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE)-Institut Agro Montpellier, Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Université de Montpellier (UM)
Funding: This work was supported by the Fonds de la Recherche Scientifique–FNRS under grants J.0134.16 and J.0154.18. Tom Druet is Senior Research Associate from the F.R.S.–FNRS. We used the supercomputing facilities of the 'Consortium d’Equipements en Calcul Intensif en Fédération Wallonie-Bruxelles' (CECI), funded by the F.R.S.–FNRS.
- Subjects
Hidden Markov model; Genotype; [SDV]Life Sciences [q-bio]; Homozygote; ROH; Autozygosity; Estimator; Computational biology; Biology; Polymorphism, Single Nucleotide; Genome; Homozygous-by-descent; Pedigree; Simulated data; Humans; Inbreeding; Ecology, Evolution, Behavior and Systematics; Probability
- Abstract
Inbreeding results from the mating of related individuals and has negative consequences because it brings together deleterious variants in one individual. Genomic estimates of inbreeding coefficients are preferred to pedigree-based estimators because they measure realized inbreeding levels and are more robust to pedigree errors. Several methods that identify homozygous-by-descent (HBD) segments with hidden Markov models (HMM) have recently been developed and are particularly valuable when the information is degraded or heterogeneous (e.g., low-fold sequencing, low marker density, heterogeneous genotype quality or variable marker spacing). We previously developed a multiple HBD class HMM in which HBD segments are classified into different groups based on their length (e.g., recent versus old HBD segments), but we recently observed that for high inbreeding levels with many HBD segments, the estimated contributions might be biased towards more recent classes (i.e., associated with large HBD segments), although the overall estimated level of inbreeding remained unbiased. We herein propose an updated multiple HBD class model in which the HBD classification is modeled in successive nested levels. Each level has a distinct rate, which specifies the expected length of HBD segments and is directly related to the number of generations to the ancestors. The non-HBD classes are now modeled as a mixture of HBD segments from later generations and shorter non-HBD segments (i.e., both with higher rates). The updated model had better statistical properties and performed better on simulated data than our previous version. We also show that the parameters of the model are easier to interpret and that the model is more robust to the choice of the number of classes. Overall, the new model results in an improved partitioning of inbreeding in different HBD classes and should be preferred in applications relying on the length of estimated HBD segments.
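The link between a class rate, the expected HBD segment length and the age of the ancestors can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, beyond what the abstract states, that segment lengths within a class are exponentially distributed with rate R per Morgan (so the mean length is 1/R Morgans) and that R is roughly twice the number of generations to the common ancestor. The function names are hypothetical.

```python
import math


def expected_hbd_length_cm(rate):
    """Mean segment length in centimorgans, assuming exponentially
    distributed lengths with the given rate per Morgan."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return 100.0 / rate  # 1/rate Morgans, converted to cM


def approx_generations(rate):
    """Rough number of generations to the common ancestor,
    under the assumed approximation rate = 2 * generations."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return rate / 2.0


def prob_segment_continues(rate, distance_morgans):
    """Probability that a segment extends over a given genetic distance
    without ending, under the assumed exponential length distribution."""
    if rate <= 0 or distance_morgans < 0:
        raise ValueError("rate must be positive and distance non-negative")
    return math.exp(-rate * distance_morgans)


if __name__ == "__main__":
    # Illustrative rates only; they are not taken from the paper.
    for rate in (4.0, 16.0, 64.0, 256.0):
        print(rate, expected_hbd_length_cm(rate), approx_generations(rate))
```

Under these assumptions, higher rates correspond to shorter segments and more distant ancestors, which is why classes with large HBD segments are described as recent.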
- Published
- 2021