
Phenotype harmonization and analysis for The Populations Underrepresented in Mental illness Association Studies (the PUMAS Project).

Authors :
Ramirez-Diaz AM
Diaz-Zuluaga AM
Stroud RE 2nd
Vreeker A
Bitta M
Ivankovic F
Wootton O
Whiteman CA
Mountcastle H
Jha SC
Georgakopoulos P
Kaur I
Mena L
Asaaf S
de Souza Rodrigues AL
Ziebold C
Newton CRJC
Stein DJ
Akena D
Valencia-Echeverry J
Kyebuzibwa J
Palacio-Ortiz JD
McMahon J
Ongeri L
Chibnik LB
Quarantini LC
Atwoli L
Santoro ML
Baker M
Diniz MJA
Castaño-Ramirez M
Alemayehu M
Holanda N
Ayola-Serrano NC
Lorencetti PG
Mwema RM
James R
Albuquerque S
Sharma S
Chapman SB
Belangero SI
Teferra S
Gichuru S
Service SK
Kariuki SM
Freitas TH
Zingela Z
Gadelha A
Bearden CE
Ophoff RA
Neale BM
Martin AR
Koenen KC
Pato CN
Lopez-Jaramillo C
Reus V
Freimer N
Pato MT
Gelaye B
Loohuis LO
Source :
MedRxiv : the preprint server for health sciences [medRxiv] 2024 Oct 04. Date of Electronic Publication: 2024 Oct 04.
Publication Year :
2024

Abstract

Background: The Populations Underrepresented in Mental illness Association Studies (PUMAS) project is attempting to remediate the historical underrepresentation of African and Latin American populations in psychiatric genetics through large-scale genetic association studies of individuals diagnosed with a serious mental illness [SMI, including schizophrenia (SCZ), schizoaffective disorder (SZA), bipolar disorder (BP), and severe major depressive disorder (MDD)] and matched controls. Given growing evidence indicating substantial symptomatic and genetic overlap between these diagnoses, we sought to enable transdiagnostic genetic analyses of PUMAS data by conducting phenotype alignment and harmonization for 89,320 participants (48,165 cases and 41,155 controls) from four cohorts, each of which used different ascertainment and assessment methods: PAISA n=9,105; PUMAS-LATAM n=14,638; NGAP n=42,953; and GPC n=22,624. As we describe here, these efforts have yielded harmonized datasets enabling us to analyze PUMAS genetic variation data at three levels: SMI overall, diagnoses, and individual symptoms.

Methods: In aligning item-level phenotypes obtained from 14 different clinical instruments, we incorporated content, branching nature, and time frame for each phenotype; standardized diagnoses; and selected 19 core SMI item-level phenotypes for analyses. The harmonization was evaluated in PUMAS cases using multiple correspondence analysis (MCA), co-occurrence analyses, and item-level endorsement.

Outcomes: We mapped >6,895 item-level phenotypes in the aggregated PUMAS data, in which SCZ (44.97%) and severe BP (BP-I, 31.53%) were the most common diagnoses. Twelve of the 19 core item-level phenotypes occurred at frequencies of >10% across all diagnoses, indicating their potential utility for transdiagnostic genetic analyses. MCA of the 14 phenotypes that were present for all cohorts revealed consistency across cohorts, and placed MDD and SCZ into separate clusters, while other diagnoses showed no significant phenotypic clustering.

Interpretation: Our alignment strategy effectively aggregated extensive phenotypic data obtained using diverse assessment tools. The MCA yielded dimensional scores which we will use for genetic analyses along with the item-level phenotypes. After successful harmonization, residual phenotypic heterogeneity between cohorts reflects differences in branching structure of diagnostic instruments, recruitment strategies, and symptom interpretation (due to cultural variation).
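
Illustrative note: the item-level endorsement and co-occurrence checks named in the Methods can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the authors' code. The toy records, the item names, and the helper functions `endorsement_by_diagnosis`, `cooccurrence`, and `transdiagnostic_items` are all hypothetical; only the >10% threshold comes from the abstract. MCA is not shown.

```python
from itertools import combinations

# Hypothetical toy records (NOT PUMAS data): (diagnosis, {item: 0 or 1}).
records = [
    ("SCZ",  {"hallucinations": 1, "delusions": 1, "depressed_mood": 0}),
    ("SCZ",  {"hallucinations": 1, "delusions": 0, "depressed_mood": 1}),
    ("BP-I", {"hallucinations": 0, "delusions": 1, "depressed_mood": 1}),
    ("BP-I", {"hallucinations": 0, "delusions": 0, "depressed_mood": 1}),
    ("MDD",  {"hallucinations": 0, "delusions": 0, "depressed_mood": 1}),
    ("MDD",  {"hallucinations": 0, "delusions": 0, "depressed_mood": 1}),
]

def endorsement_by_diagnosis(records):
    """Fraction of participants endorsing each item, per diagnosis."""
    counts, totals = {}, {}
    for dx, items in records:
        totals[dx] = totals.get(dx, 0) + 1
        per_item = counts.setdefault(dx, {})
        for item, value in items.items():
            per_item[item] = per_item.get(item, 0) + value
    return {
        dx: {item: c / totals[dx] for item, c in per_item.items()}
        for dx, per_item in counts.items()
    }

def cooccurrence(records):
    """Count of participants endorsing both items of each item pair."""
    pairs = {}
    for _, items in records:
        endorsed = sorted(k for k, v in items.items() if v == 1)
        for a, b in combinations(endorsed, 2):
            pairs[(a, b)] = pairs.get((a, b), 0) + 1
    return pairs

def transdiagnostic_items(freqs, threshold=0.10):
    """Items endorsed above the threshold in every diagnosis."""
    shared = set.intersection(*(set(f) for f in freqs.values()))
    return sorted(
        item for item in shared
        if all(f[item] > threshold for f in freqs.values())
    )

freqs = endorsement_by_diagnosis(records)
print(freqs["SCZ"])                  # per-item endorsement among SCZ cases
print(cooccurrence(records))         # pairwise co-occurrence counts
print(transdiagnostic_items(freqs))  # items above 10% in all diagnoses
```

In this toy example only `depressed_mood` exceeds the threshold in every diagnosis, which mirrors the kind of screen the abstract describes for the 19 core item-level phenotypes.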

Details

Language :
English
Database :
MEDLINE
Journal :
MedRxiv : the preprint server for health sciences
Publication Type :
Academic Journal
Accession number :
39502669
Full Text :
https://doi.org/10.1101/2024.10.02.24314732