1. Detecting inversions with PCA in the presence of population structure
- Author
- Scott J. Emrich, Ronald J. Nowling, and Krystal R. Manke
- Subjects
0106 biological sciences; 0301 basic medicine; Anopheles gambiae; Science; Population; Population structure; Datasets as Topic; Single-nucleotide polymorphism; Computational biology; 010603 evolutionary biology; 01 natural sciences; Polymorphism, Single Nucleotide; 03 medical and health sciences; Anopheles; Animals; education; Malaria vector; Chromosomal inversion; education.field_of_study; Principal Component Analysis; Multidisciplinary; biology; fungi; Computational Biology; Inversion (meteorology); Reproductive isolation; biology.organism_classification; Chromosomes, Insect; 030104 developmental biology; Drosophila melanogaster; Evolutionary biology; Principal component analysis; Chromosome Inversion; Medicine; Software; Research Article
- Abstract
Chromosomal inversions are associated with reproductive isolation and adaptation in insects such as Drosophila melanogaster and the malaria vectors Anopheles gambiae and Anopheles coluzzii. While methods based on read alignment have been useful in humans for detecting inversions, these methods are less successful in insects due to long repeated sequences at the breakpoints. Alternatively, inversions can be detected using principal component analysis (PCA) of single nucleotide polymorphisms (SNPs). We apply PCA-based inversion detection to a simulated data set and to real data from multiple insect species. These data sets vary in complexity, from a single inversion in samples drawn from a single population to multiple overlapping inversions in closely related species sampled from multiple geographic locations. We show empirically that proper analysis of these data can be challenging when multiple inversions or populations are present, and that our alternative framework is more robust in these more difficult scenarios.
- Published
- 2019