Genome-wide association study of a semicontinuous trait: Illustration of the impact of the modeling strategy through the study of Neutrophil Extracellular Traps levels

Authors :
Gaëlle Munsch
Carole Proust
Sylvie Labrouche-Colomer
Dylan Aïssi
Anne Boland
Pierre-Emmanuel Morange
Anne Roche
Luc de Chaisemartin
Annie Harroche
Robert Olaso
Jean-François Deleuze
Chloé James
Joseph Emmerich
David M Smadja
Hélène Jacqmin-Gadda
David-Alexandre Trégouët
Publication Year :
2022
Publisher :
Cold Spring Harbor Laboratory, 2022.

Abstract

Semicontinuous data, characterized by an excess of zeros followed by a non-negative and right-skewed distribution, are frequently observed in biomedical research. Different statistical models have been proposed to investigate the association of covariates with such outcomes. Motivated by the search for genetic factors associated with Neutrophil Extracellular Traps (NETs), a semicontinuous biomarker involved in thrombosis, we investigated the impact of the model selected for a semicontinuous trait in the context of a genome-wide association study (GWAS). We compared three models that jointly model zero and positive values while allowing the estimation of a single association parameter of covariates with the global mean: Tobit, Negative Binomial and Compound Poisson-Gamma. We assessed the fit of these models to a sample of 657 participants of the FARIVE study in whom NETs plasma levels were measured. For each of the three models, we performed a GWAS on NETs in FARIVE participants and compared the results. A simulation study was also conducted to evaluate the control of the type I error. The Compound Poisson-Gamma and Negative Binomial models fitted the NETs data observed in FARIVE better than the Tobit model. However, the Negative Binomial model suffered from an inflation of its type I error, attributable to extreme positive values of NETs and to low-frequency variants. Conversely, the Compound Poisson-Gamma model was robust to both phenomena. Using the latter model, a GWAS identified a genome-wide significant locus on chr21q21.3. The lead variant was rs57502213, a deletion of two nucleotides located ~40 kb upstream of the non-coding RNA (miR155HG) hosting miR-155, which was recently highlighted as having a role in NETs formation. This work indicates that the modeling strategy for a semicontinuous outcome is crucial in the framework of GWAS. The choice of the model should take into account the nature of the process generating zero values and the presence of extreme values. Our work also suggests that the Compound Poisson-Gamma model, while still rarely used, can be a robust modeling strategy for GWAS analysis of a semicontinuous trait.
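
As an illustration of how a Compound Poisson-Gamma outcome produces both exact zeros and right-skewed positive values from a single mean parameter, the following minimal sketch simulates such a trait. It is not the authors' code and does not use FARIVE data. It assumes the standard Tweedie parameterization (mean mu, dispersion phi, power 1 < p < 2) and a log link between the mean and an additive genotype. All function names and parameter values are hypothetical and chosen for illustration only.

```python
import math
import random


def tweedie_to_cpg(mu, phi, p):
    """Map Tweedie parameters (mean mu, dispersion phi, power 1 < p < 2)
    to the Poisson rate, gamma shape and gamma scale of the compound sum."""
    lam = mu ** (2 - p) / (phi * (2 - p))
    shape = (2 - p) / (p - 1)
    scale = phi * (p - 1) * mu ** (p - 1)
    return lam, shape, scale


def sample_poisson(rng, lam):
    """Draw a Poisson count with Knuth's multiplication method
    (adequate for the small rates used here)."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def sample_cpg(rng, mu, phi, p):
    """Draw one value: a sum of N gamma variables with N ~ Poisson.
    N = 0 yields an exact zero, which creates the point mass at zero."""
    lam, shape, scale = tweedie_to_cpg(mu, phi, p)
    n_events = sample_poisson(rng, lam)
    return sum(rng.gammavariate(shape, scale) for _ in range(n_events))


def simulate_trait(n_subjects, beta0, beta1, maf, phi=1.0, p=1.5, seed=0):
    """Simulate (genotype, trait) pairs under an assumed log link:
    mu = exp(beta0 + beta1 * genotype), genotype in {0, 1, 2}."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        genotype = sum(rng.random() < maf for _ in range(2))
        mu = math.exp(beta0 + beta1 * genotype)
        data.append((genotype, sample_cpg(rng, mu, phi, p)))
    return data


# Illustrative run: hypothetical effect size and allele frequency.
sample = simulate_trait(n_subjects=2000, beta0=0.0, beta1=0.3, maf=0.2)
zero_fraction = sum(y == 0 for _, y in sample) / len(sample)
print(f"fraction of exact zeros: {zero_fraction:.3f}")
```

Under this parameterization the probability of a zero is exp(-lambda) and the mean is lambda × shape × scale = mu, so a single regression coefficient on log(mu) shifts both the zero mass and the positive part, which is the "single association parameter with the global mean" property described in the abstract.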

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........976ed4014b14bb5c3b597194e8db5d68