1. The geometry of admixture in population genetics: the blessing of dimensionality.
- Author
- Oteo, José-Angel and Oteo-García, Gonzalo
- Subjects
- *STATISTICAL models, *BIOLOGICAL models, *GENOMICS, *PHYLOGENY, *DATA analysis, *GENETIC markers, *GENETIC variation, *HUMAN reproductive technology, *GENES, *CONCEPTUAL structures, *STATISTICS
- Abstract
- We present a geometry-based interpretation of the f-statistics framework, commonly used in population genetics to estimate phylogenetic relationships from genomic data. The focus is on the determination of the mixing coefficients in population admixture events subject to post-admixture drift. The interpretation takes advantage of the high dimension of the dataset and analyzes the problem as a dimensional reduction issue. We show that it is possible to think of the f-statistics technique as an implicit transformation of the genomic data from a phase space into a subspace where the mapped data structure is more similar to the ancestral admixture configuration. The 2-way mixing coefficient is, as a matter of fact, carried out implicitly in this subspace. In addition, we propose the admixture test to be evaluated in the subspace because the comparison with the conventional one provides an important assessment of the admixture model. The overarching geometric framework provides slightly more general formulas than the f-formalism by using a different rationale as a starting point. Explicitly addressed are 2- and 3-way admixtures. The mixture proportions are provided by suitable linear fits, in 2 or 3 dimensions, that can be easily visualized. The difficulties encountered with introgression and gene flow are also addressed. The developments and findings are illustrated with numerical simulations and real-world cases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
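
The abstract states that 2-way mixture proportions can be obtained from a linear fit in a high-dimensional space of genomic data. The following is a minimal illustrative sketch of that general idea, not the authors' formulas: it assumes (hypothetically) that allele-frequency vectors for both source populations are directly available, models post-admixture drift as independent Gaussian noise, and recovers the mixing coefficient by a least-squares projection of the admixed population onto the line joining the two sources. The function name `estimate_mixing_coefficient` and all simulation parameters are assumptions made for this example.

```python
import numpy as np


def estimate_mixing_coefficient(source_a, source_b, target):
    """Least-squares estimate of alpha in target ~ alpha*A + (1 - alpha)*B.

    Projects (target - B) onto the direction (A - B). This is a toy
    stand-in for the linear fit mentioned in the abstract, not the
    paper's own estimator.
    """
    direction = source_a - source_b
    return float(np.dot(target - source_b, direction) / np.dot(direction, direction))


# Toy simulation (all values are assumptions chosen for illustration).
rng = np.random.default_rng(0)
n_snps = 100_000  # many markers: the high-dimensional regime
freq_a = rng.uniform(0.05, 0.95, n_snps)  # allele frequencies, source A
freq_b = rng.uniform(0.05, 0.95, n_snps)  # allele frequencies, source B

true_alpha = 0.3
# Post-admixture drift modeled crudely as independent Gaussian noise per marker.
drift = rng.normal(0.0, 0.05, n_snps)
freq_c = np.clip(true_alpha * freq_a + (1 - true_alpha) * freq_b + drift, 0.0, 1.0)

alpha_hat = estimate_mixing_coefficient(freq_a, freq_b, freq_c)
print(f"true alpha = {true_alpha}, estimated alpha = {alpha_hat:.4f}")
```

In this toy setting the drift vector is nearly orthogonal to the direction joining the sources when the number of markers is large, so the projection is barely perturbed by it; this is one simple way to see why high dimensionality can help rather than hurt.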