1. A spectral theory for Wright’s inbreeding coefficients and related quantities
- Author
- Clement Gain, Olivier François, Université Grenoble Alpes INP (Grenoble INP), Translational Innovation in Medicine and Complexity / Recherche Translationnelle et Innovation en Médecine et Complexité - UMR 5525 (TIMC), VetAgro Sup - Institut national d'enseignement supérieur et de recherche en alimentation, santé animale, sciences agronomiques et de l'environnement (VAS)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP), Université Grenoble Alpes (UGA), Biologie Computationnelle et Modélisation (TIMC-BCM), Université Grenoble Alpes (UGA)-VetAgro Sup - Institut national d'enseignement supérieur et de recherche en alimentation, santé animale, sciences agronomiques et de l'environnement (VAS)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP), CNRS and Université Grenoble Alpes, 38041 Grenoble, France, and ANR-19-P3IA-0003, MIAI, MIAI @ Grenoble Alpes (2019)
- Subjects
0301 basic medicine ,0106 biological sciences ,Cancer Research ,Heredity ,Population genetics ,QH426-470 ,01 natural sciences ,Consanguinity ,Matrix (mathematics) ,Mathematical and Statistical Techniques ,Statistics ,Quantitative Biology::Populations and Evolution ,Inbreeding ,Genetics (clinical) ,Statistic ,Mathematics ,Principal Component Analysis ,0303 health sciences ,education.field_of_study ,Genome ,Approximation Methods ,Simulation and Modeling ,Genomics ,Quantitative Biology::Genomics ,Population model ,Physical Sciences ,Principal component analysis ,Research Article ,Genotype ,Population ,Biology ,Research and Analysis Methods ,010603 evolutionary biology ,03 medical and health sciences ,Genetic variation ,Genetics ,Animals ,Humans ,Statistical Methods ,education ,Molecular Biology ,Ecology, Evolution, Behavior and Systematics ,Eigenvalues and eigenvectors ,030304 developmental biology ,Evolutionary Biology ,[SDV.GEN.GPO]Life Sciences [q-bio]/Genetics/Populations and Evolution [q-bio.PE] ,Population Biology ,Models, Genetic ,Biology and Life Sciences ,Genetic Variation ,Eigenvalues ,Models, Theoretical ,Algebra ,Genetics, Population ,030104 developmental biology ,Linear Algebra ,Genetic Loci ,Multivariate Analysis ,Genetic Polymorphism ,Population Genetics
- Abstract
Wright’s inbreeding coefficient, FST, is a fundamental measure in population genetics. Assuming a predefined population subdivision, this statistic is classically used to evaluate population structure at a given genomic locus. With large numbers of loci, however, unsupervised approaches such as principal component analysis (PCA) have become prominent in recent analyses of population structure. In this study, we describe the relationships between Wright’s inbreeding coefficients and PCA for a model of K discrete populations. Our theory provides an equivalent definition of FST based on the decomposition of the genotype matrix into between- and within-population matrices. The average value of Wright’s FST over all loci included in the genotype matrix can be obtained from the PCA of the between-population matrix. Provided that a separation condition is fulfilled and the data set is reasonably large, this value of FST accurately approximates the proportion of genetic variation explained by the first (K − 1) principal components. The new definition of FST is useful for computing inbreeding coefficients from surrogate genotypes, for example, those obtained after correction of experimental artifacts or after removing adaptive genetic variation associated with environmental variables. The relationships between inbreeding coefficients and the spectrum of the genotype matrix not only allow interpretations of PCA results in terms of population genetic concepts but also extend those concepts to population genetic analyses accounting for temporal, geographical and environmental contexts.
Author summary
Principal component analysis (PCA) is the most frequently used approach to describe population genetic structure from large population genomic data sets. In this study, we show that PCA not only estimates ancestries of sampled individuals, but also computes the average value of Wright’s inbreeding coefficient over the loci included in the genotype matrix.
Our result shows that inbreeding coefficients and PCA eigenvalues provide equivalent descriptions of population structure. As a consequence, PCA extends the definition of those coefficients beyond the framework of allelic frequencies. We give examples of how FST can be computed from ancient DNA samples for which genotypes are corrected for coverage, and in an ecological genomic example where a proportion of genetic variation is explained by environmental variables.
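The decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a simplified variance-ratio stand-in for FST (between-population sum of squares divided by the total sum of squares of the centered genotype matrix), which may differ from the paper's exact estimator. The simulated allele frequencies, the sample sizes, and the helper names `between_within`, `variance_ratio` and `pca_proportion` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def between_within(Y, labels):
    """Center a genotype matrix and split it into between- and within-population parts."""
    Yc = Y - Y.mean(axis=0)                # center each locus (column)
    B = np.zeros_like(Yc)
    for k in np.unique(labels):
        idx = labels == k
        B[idx] = Yc[idx].mean(axis=0)      # every individual gets its population mean
    return Yc, B, Yc - B                   # centered, between, within

def variance_ratio(Yc, B):
    """Share of the total sum of squares carried by the between-population matrix."""
    return (B ** 2).sum() / (Yc ** 2).sum()

def pca_proportion(Yc, n_components):
    """Proportion of variance explained by the first n_components principal components."""
    s = np.linalg.svd(Yc, compute_uv=False)
    eigenvalues = s ** 2
    return eigenvalues[:n_components].sum() / eigenvalues.sum()

# Hypothetical simulation: K discrete populations whose allele frequencies
# deviate from a shared ancestral frequency (an assumption, not the paper's model).
rng = np.random.default_rng(0)
K, n_per_pop, n_loci = 3, 50, 2000
labels = np.repeat(np.arange(K), n_per_pop)
p_anc = rng.uniform(0.2, 0.8, n_loci)
p_pop = np.clip(p_anc + rng.normal(0.0, 0.15, (K, n_loci)), 0.05, 0.95)
Y = rng.binomial(2, p_pop[labels]).astype(float)   # diploid genotypes coded 0/1/2

Yc, B, W = between_within(Y, labels)
f_between = variance_ratio(Yc, B)
f_pca = pca_proportion(Yc, K - 1)
print(f"between-population share of variance: {f_between:.4f}")
print(f"variance explained by first K-1 PCs:  {f_pca:.4f}")
```

In this sketch the between-population matrix has rank at most K − 1, and the between and within parts are orthogonal, so their sums of squares add up to the total. The first K − 1 principal components of the full matrix always explain at least as much variance as the between-population matrix. How closely the two quantities agree depends on the sample size, the number of loci, and the strength of the population structure.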
- Published
- 2021