1. Whole genome variation in 27 Mexican indigenous populations, demographic and biomedical insights
- Author
- Judith Ballesteros-Villascán, Xavier Soberón, Israel Aguilar-Ordoñez, Enrique Hernández-Lemus, Hugo Tovar, Cristóbal Fresno, Lorena Orozco, Humberto García-Ortiz, Alejandro Garciarrubio, Juan Carlos Fernández-López, Fernando Pérez-Villatoro, Francisco Barajas-Olmos, Ram González-Buenfil, and Enrique Morett
- Subjects
Male; Mexican People; Heredity; Genome-wide association study; Genome; Geographical locations; Population genomics; Mathematical and Statistical Techniques; Ethnicities; Phylogeny; Genetics; Principal Component Analysis; Multidisciplinary; Library Science; Statistics; Genomics; Population groupings; Genetic Mapping; Physical Sciences; Medicine; Female; Angiotensin-Converting Enzyme 2; Databases, Nucleic Acid; Research Article; Computer and Information Sciences; Science; Biology; Research and Analysis Methods; Polymorphism, Single Nucleotide; Human Genomics; Genome-Wide Association Studies; Humans; Statistical Methods; Allele frequency; Mexico; American Indian or Alaska Native; Whole genome sequencing; Whole Genome Sequencing; Genome, Human; SARS-CoV-2; Haplotype; COVID-19; Biology and Life Sciences; Computational Biology; Latin American people; Human Genetics; Genome Analysis; Genetic divergence; Haplotypes; Multivariate Analysis; North America; Catalogs; People and places; Mathematics
- Abstract
Native American whole-genome diversity has been little studied to date, which impairs both the effective implementation of personalized medicine and a detailed description of the demographic history of these populations. Here we report high-coverage whole-genome sequencing of 76 unrelated individuals from 27 indigenous groups across Mexico, with more than 97% average Native American ancestry. On average, each individual carries 3.26 million single nucleotide variants and short indels, which together comprise a catalog of 9,737,152 variants, 44,118 of them novel. We report 497 common single nucleotide variants (allele frequency > 5%) mapped to drug responses and 316,577 in enhancer or promoter elements. Some of these enhancer variants lie in PPARG, a nuclear receptor involved in health problems that are highly prevalent in the Mexican population, such as obesity, diabetes, and insulin resistance. By detecting signals of positive selection, we identify 24 enriched key pathways under selection, most of them related to immune mechanisms. No missense variants in ACE2, the receptor responsible for the entry of the SARS-CoV-2 virus, were found in any individual. Population genomics and phylogenetic analyses demonstrated stratification along a Northern-Central-Southern axis, with major substructure in the Central region. The Seri, a northern group with the most genetic divergence in our study, showed a distinctive genomic context, with the most novel variants and the most population-specific genotypes. Genome-wide analysis showed that haplotype blocks are on average longer in Native Mexicans than in other world populations. With this dataset we describe previously undetected population-level variation in Native Mexicans, helping to reduce the gap in genomic data representation of such groups.
- Published
- 2021