1. Host genetics and COVID-19 severity: increasing the accuracy of latest severity scores by Boolean quantum features
- Author
- Gabriele Martelloni, Alessio Turchi, Chiara Fallerini, Andrea Degl’Innocenti, Margherita Baldassarri, Simona Olmi, Simone Furini, Alessandra Renieri, GEN-COVID Multicenter study, Francesca Mari, Sergio Daga, Ilaria Meloni, Mirella Bruttini, Susanna Croci, Mirjam Lista, Debora Maffeo, Elena Pasquinelli, Giulia Brunelli, Kristina Zguro, Viola Bianca Serio, Enrica Antolini, Simona Letizia Basso, Samantha Minetto, Giulia Rollo, Martina Rozza, Angela Rina, Rossella Tita, Maria Antonietta Mencarelli, Caterina Lo Rizzo, Anna Maria Pinto, Francesca Ariani, Francesca Montagnani, Mario Tumbarello, Ilaria Rancan, Massimiliano Fabbiani, Elena Bargagli, Laura Bergantini, Miriana d’Alessandro, Paolo Cameli, David Bennett, Federico Anedda, Simona Marcantonio, Sabino Scolletta, Federico Franchi, Maria Antonietta Mazzei, Susanna Guerrini, Edoardo Conticini, Luca Cantarini, Bruno Frediani, Danilo Tacconi, Chiara Spertilli Raffaelli, Arianna Emiliozzi, Marco Feri, Alice Donati, Raffaele Scala, Luca Guidelli, Genni Spargi, Marta Corridi, Cesira Nencioni, Leonardo Croci, Gian Piero Caldarelli, Davide Romani, Paolo Piacentini, Maria Bandini, Elena Desanctis, Silvia Cappelli, Anna Canaccini, Agnese Verzuri, Valentina Anemoli, Manola Pisani, Agostino Ognibene, Maria Lorubbio, Alessandro Pancrazzi, Massimo Vaghi, Antonella D’Arminio Monforte, Federica Gaia Miraglia, Mario U. Mondelli, Stefania Mantovani, Raffaele Bruno, Marco Vecchia, Marcello Maffezzoni, Enrico Martinelli, Massimo Girardis, Stefano Busani, Sophie Venturelli, Andrea Cossarizza, Andrea Antinori, Alessandra Vergori, Stefano Rusconi, Matteo Siano, Arianna Gabrieli, Agostino Riva, Daniela Francisci, Elisabetta Schiaroli, Carlo Pallotto, Saverio Giuseppe Parisi, Monica Basso, Sandro Panese, Stefano Baratti, Pier Giorgio Scotton, Francesca Andretta, Mario Giobbia, Renzo Scaggiante, Francesca Gatti, Francesco Castelli, Eugenia Quiros-Roldan, Melania Degli Antoni, Isabella Zanella, Matteo della Monica, Carmelo Piscopo, Mario Capasso, Roberta Russo, Immacolata Andolfo, Achille Iolascon, Giuseppe Fiorentino, Massimo Carella, Marco Castori, Giuseppe Merla, Gabriella Maria Squeo, Filippo Aucella, Pamela Raggi, Rita Perna, Matteo Bassetti, Antonio Di Biagio, Maurizio Sanguinetti, Luca Masucci, Alessandra Guarnaccia, Serafina Valente, Alex Di Florio, Marco Mandalà, Alessia Giorli, Lorenzo Salerni, Patrizia Zucchi, Pierpaolo Parravicini, Elisabetta Menatti, Tullio Trotta, Ferdinando Giannattasio, Gabriella Coiro, Fabio Lena, Gianluca Lacerenza, Cristina Mussini, Luisa Tavecchia, Lia Crotti, Gianfranco Parati, Roberto Menè, Maurizio Sanarico, Marco Gori, Francesco Raimondi, Alessandra Stella, Filippo Biscarini, Tiziana Bachetti, Maria Teresa La Rovere, Maurizio Bussotti, Serena Ludovisi, Katia Capitani, Simona Dei, Sabrina Ravaglia, Annarita Giliberti, Giulia Gori, Rosangela Artuso, Elena Andreucci, Angelica Pagliazzi, Erika Fiorentini, Antonio Perrella, Francesco Bianchi, Paola Bergomi, Emanuele Catena, Riccardo Colombo, Sauro Luchi, Giovanna Morelli, Paola Petrocelli, Sarah Iacopini, Sara Modica, Silvia Baroni, Giulia Micheli, Marco Falcone, Donato Urso, Giusy Tiseo, Tommaso Matucci, Davide Grassi, Claudio Ferri, Franco Marinangeli, Francesco Brancati, Antonella Vincenti, Valentina Borgo, Stefania Lombardi, Mirco Lenzi, Massimo Antonio Di Pietro, Francesca Vichi, Benedetta Romanin, Letizia Attala, Cecilia Costa, Andrea Gabbuti, Alessio Bellucci, Marta Colaneri, Patrizia Casprini, Cristoforo Pomara, Massimiliano Esposito, Roberto Leoncini, Michele Cirianni, Lucrezia Galasso, Marco Antonio Bellini, Chiara Gabbi, and Nicola Picchiotti
- Subjects
COVID-19, host genetics, integrated polygenic score, genetic algorithm, logistic regression, genetic science modeling, Genetics, QH426-470
- Abstract
The impact of common and rare variants in COVID-19 host genetics has been widely studied. In particular, in Fallerini et al. (Human Genetics, 2022, 141, 147–173), common and rare variants were used to define an interpretable machine learning model for predicting COVID-19 severity. First, variants were converted into sets of Boolean features, depending on the absence or presence of variants in each gene. An ensemble of LASSO logistic regression models was then used to identify the Boolean features most informative about the genetic bases of severity. The Boolean features selected by these logistic models were combined into an Integrated PolyGenic Score (IPGS), which offers a very simple description of the contribution of host genetics to COVID-19 severity. IPGS leads to an accuracy of 55%–60% on different cohorts and, after a logistic regression with both IPGS and age as inputs, to an accuracy of 75%. The goal of this paper is to improve on those results by using not only the most informative Boolean features but also information on the host organs involved in the disease. In this study, we generalize the IPGS by adding a statistical weight for each organ, through the transformation of Boolean features into “Boolean quantum features,” inspired by quantum mechanics. The organ coefficients were set with a genetic algorithm implemented in PyGAD, after which we defined two new integrated polygenic scores (IPGSph1 and IPGSph2). By applying a logistic regression with IPGS, IPGSph2 (or, interchangeably, IPGSph1), and age as inputs, we reached an accuracy of 84%–86%, improving on the results of Fallerini et al. (Human Genetics, 2022, 141, 147–173) by about 10 percentage points.
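The abstract does not give the formula for IPGS or for the "Boolean quantum features," so the following is only a minimal illustrative sketch of the two ideas it names: turning a patient's variants into per-gene Boolean features, and weighting each feature by a coefficient for the organ it is assigned to. The gene names, organ labels, weights, and the signed-sum scoring rule are all hypothetical; in the study the organ coefficients are fitted with a genetic algorithm (PyGAD), whereas here they are simply fixed by hand.

```python
from typing import Dict, Iterable, List, Set


def boolean_features(variant_genes: Iterable[str], genes: List[str]) -> Dict[str, int]:
    """Return 1 for each gene in which the patient carries at least one variant, else 0."""
    carried = set(variant_genes)
    return {gene: int(gene in carried) for gene in genes}


def organ_weighted_score(
    features: Dict[str, int],
    severity_genes: Set[str],
    protective_genes: Set[str],
    gene_organ: Dict[str, str],
    organ_weight: Dict[str, float],
) -> float:
    """Hypothetical score: add the organ weight for each severity-associated
    feature that is present, subtract it for each protective feature present.
    Genes with no organ weight default to 1.0."""
    score = 0.0
    for gene, present in features.items():
        if not present:
            continue
        weight = organ_weight.get(gene_organ.get(gene, ""), 1.0)
        if gene in severity_genes:
            score += weight
        elif gene in protective_genes:
            score -= weight
    return score


# All names and numbers below are placeholders, not data from the study.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
patient_variants = ["GENE_A", "GENE_C", "GENE_A"]  # repeated hits collapse to one feature
features = boolean_features(patient_variants, genes)

severity_genes = {"GENE_A", "GENE_B"}
protective_genes = {"GENE_C", "GENE_D"}
gene_organ = {"GENE_A": "lung", "GENE_B": "heart", "GENE_C": "heart", "GENE_D": "lung"}

# Hand-picked weights standing in for coefficients a genetic algorithm would fit.
organ_weight = {"lung": 2.0, "heart": 0.5}

weighted = organ_weighted_score(features, severity_genes, protective_genes, gene_organ, organ_weight)
unweighted = organ_weighted_score(features, severity_genes, protective_genes, gene_organ, {})
```

With every organ weight left at the default of 1.0, the score reduces to a plain count of severity features minus protective features, which is the sense in which an organ-weighted score generalizes an unweighted one. A score of this kind, together with age, would then serve as an input to a logistic regression.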
- Published
- 2024