Prediction of Eye, Hair and Skin Color in Admixed Populations of Latin America

Authors :
Tábita Hünemeier
Giovanni Poletti
Maria Cátira Bortolini
Paola Everardo-Martínez
William Arias
Valeria Villegas
Sagnik Palmal
Victor Acuña-Alonzo
Hugo Villamil-Ramírez
Javier Mendoza-Revilla
Rodrigo Barquera Lozano
Pierre Faux
David J. Balding
Carla Gallo
Caio Cesar Silva de Cerqueira
Virginia Ramallo
Juan Camilo Chacón-Duque
Rolando González-José
Samuel Canizales-Quinteros
Francisco Rothhammer
Claudia Jaramillo
Anood Sohail
Andres Ruiz-Linares
Malena Hurtado
Kaustubh Adhikari
Jorge Gómez-Valdés
Vanessa Granja
Macarena Fuentes-Guajardo
Lavinia Schuler-Faccini
Gabriel Bedoya
Publication Year :
2020
Publisher :
Cold Spring Harbor Laboratory, 2020.

Abstract

We report an evaluation of prediction accuracy for eye, hair and skin pigmentation based on genomic and phenotypic data for over 6,500 admixed Latin Americans (the CANDELA dataset). We examined the impact on prediction accuracy of three main factors: (i) the method of prediction, including classical statistical methods and machine learning approaches; (ii) the inclusion of non-genetic predictors, continental genetic ancestry and pigmentation SNPs in the prediction models; and (iii) the choice between two sets of pigmentation SNPs: the commonly used HIrisPlex-S set (developed in Europeans) and novel SNP sets defined here from genome-wide association results in the CANDELA sample. We find that Random Forest or regression are globally the best-performing methods. Although continental genetic ancestry has substantial power for prediction of pigmentation in Latin Americans, the inclusion of pigmentation SNPs increases prediction accuracy considerably, particularly for skin color. For hair and eye color, HIrisPlex-S performs similarly to the CANDELA-specific prediction SNP sets. However, for skin pigmentation, HIrisPlex-S performs markedly worse than the SNP set defined here, including in predictions for an independent Native American dataset. These results reflect the relatively high variation in hair and eye color among Europeans, for whom HIrisPlex-S was developed, whereas their variation in skin pigmentation is comparatively lower. Furthermore, we show that the dataset used to train prediction models strongly affects the portability of these models across Europeans and Native Americans.
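
The comparison in factor (ii), ancestry alone versus ancestry plus pigmentation SNPs, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the data are simulated, every effect size and allele frequency is an invented assumption, a single hypothetical SNP stands in for a SNP set, and plain least-squares regression is used rather than Random Forest. The function names (`fit_ols`, `simulate`, `compare_models`) are likewise hypothetical.

```python
import random


def fit_ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [x - f * p for x, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def r_squared(rows, y, coef):
    """Proportion of variance in y explained by the fitted model."""
    pred = [sum(c * x for c, x in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1 - ss_res / ss_tot


def simulate(n, rng):
    """Simulated individuals: (ancestry, genotype, phenotype). All values invented."""
    data = []
    for _ in range(n):
        anc = rng.random()              # hypothetical ancestry proportion in [0, 1]
        freq = 0.1 + 0.8 * anc          # assumed: allele frequency tracks ancestry
        geno = sum(rng.random() < freq for _ in range(2))  # 0, 1 or 2 allele copies
        pheno = 3.0 * anc + 2.0 * geno + rng.gauss(0, 1)   # assumed effect sizes
        data.append((anc, geno, pheno))
    return data


def compare_models(seed=0, n_train=1500, n_test=500):
    """Held-out R^2 for an ancestry-only model and an ancestry + SNP model."""
    rng = random.Random(seed)
    train, test = simulate(n_train, rng), simulate(n_test, rng)
    results = {}
    for name, cols in {"ancestry": (0,), "ancestry+snp": (0, 1)}.items():
        # Prepend a constant 1.0 so the model includes an intercept.
        x_train = [[1.0] + [d[c] for c in cols] for d in train]
        x_test = [[1.0] + [d[c] for c in cols] for d in test]
        coef = fit_ols(x_train, [d[2] for d in train])
        results[name] = r_squared(x_test, [d[2] for d in test], coef)
    return results


if __name__ == "__main__":
    for name, r2 in compare_models().items():
        print(f"{name}: held-out R^2 = {r2:.3f}")
```

Because the simulated genotype is correlated with ancestry, the ancestry-only model already explains part of the phenotype, and adding the genotype explains more. This mirrors the qualitative pattern described in the abstract but says nothing about the magnitudes observed in the CANDELA data.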

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........6eb854ecc4c5d02815b32c3acabf4519
Full Text :
https://doi.org/10.1101/2020.12.09.415901