Machine learning for effectively avoiding overfitting is a crucial strategy for the genetic prediction of polygenic psychiatric phenotypes.

Authors :
Takahashi Y
Ueki M
Tamiya G
Ogishima S
Kinoshita K
Hozawa A
Minegishi N
Nagami F
Fukumoto K
Otsuka K
Tanno K
Sakata K
Shimizu A
Sasaki M
Sobue K
Kure S
Yamamoto M
Tomita H
Source :
Translational psychiatry [Transl Psychiatry] 2020 Aug 17; Vol. 10 (1), pp. 294. Date of Electronic Publication: 2020 Aug 17.
Publication Year :
2020

Abstract

The accuracy of previous genetic studies in predicting polygenic psychiatric phenotypes has been limited mainly due to the limited power in distinguishing truly susceptible variants from null variants and the resulting overfitting. A novel prediction algorithm, Smooth-Threshold Multivariate Genetic Prediction (STMGP), was applied to improve the genome-based prediction of psychiatric phenotypes by decreasing overfitting through selecting variants and building a penalized regression model. Prediction models were trained using a cohort of 3685 subjects in Miyagi prefecture and validated with an independently recruited cohort of 3048 subjects in Iwate prefecture in Japan. Genotyping was performed using HumanOmniExpressExome BeadChip Arrays. We used the target phenotype of depressive symptoms and simulated phenotypes with varying complexity and various effect-size distributions of risk alleles. The prediction accuracy and the degree of overfitting of STMGP were compared with those of state-of-the-art models (polygenic risk scores, genomic best linear-unbiased prediction, summary-data-based best linear-unbiased prediction, BayesR, and ridge regression). In the prediction of depressive symptoms, compared with the other models, STMGP showed the highest prediction accuracy with the lowest degree of overfitting, although there was no significant difference in prediction accuracy. Simulation studies suggested that STMGP has a better prediction accuracy for moderately polygenic phenotypes. Our investigations suggest the potential usefulness of STMGP for predicting polygenic psychiatric conditions while avoiding overfitting.
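The general idea in the abstract (select variants, fit a penalized regression, then compare accuracy in the training cohort with accuracy in an independent cohort) can be illustrated with a minimal sketch. This is not the STMGP algorithm: it assumes a hard top-k marginal screen in place of smooth thresholding, plain ridge regression as the penalized model, and simulated genotypes rather than the Miyagi and Iwate cohorts. All function names, sample sizes, and effect sizes below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulation settings (not taken from the study).
n_train, n_val, n_variants, n_causal = 300, 300, 1000, 20
maf = rng.uniform(0.05, 0.5, size=n_variants)   # minor allele frequencies
beta = np.zeros(n_variants)
beta[rng.choice(n_variants, n_causal, replace=False)] = 0.3

def simulate_cohort(n):
    """Genotypes coded 0/1/2 and a phenotype with additive effects plus noise."""
    G = rng.binomial(2, maf, size=(n, n_variants)).astype(float)
    y = G @ beta + rng.normal(0.0, 1.0, size=n)
    return G, y

def standardize(G_ref, G_other):
    """Scale both cohorts with the training cohort's mean and SD."""
    mu, sd = G_ref.mean(axis=0), G_ref.std(axis=0)
    sd[sd == 0] = 1.0                            # guard monomorphic variants
    return (G_ref - mu) / sd, (G_other - mu) / sd

def marginal_screen(X, y, keep):
    """Indices of the `keep` variants with the largest marginal association."""
    score = np.abs(X.T @ (y - y.mean()))
    return np.argsort(score)[::-1][:keep]

def ridge_fit(X, y, lam):
    """Ridge coefficients and intercept for centered, scaled predictors."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ (y - y.mean())), y.mean()

def r_squared(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

G_tr, y_tr = simulate_cohort(n_train)
G_va, y_va = simulate_cohort(n_val)
X_tr, X_va = standardize(G_tr, G_va)

def evaluate(keep, lam):
    """Training and validation R^2 for a screened ridge model."""
    idx = marginal_screen(X_tr, y_tr, keep)
    coef, intercept = ridge_fit(X_tr[:, idx], y_tr, lam)
    r2_tr = r_squared(y_tr, X_tr[:, idx] @ coef + intercept)
    r2_va = r_squared(y_va, X_va[:, idx] @ coef + intercept)
    return r2_tr, r2_va

for keep in (20, 100, n_variants):
    r2_tr, r2_va = evaluate(keep, lam=1.0)
    # The train-minus-validation gap is one simple proxy for overfitting.
    print(f"keep={keep:5d}  train R2={r2_tr:.3f}  "
          f"validation R2={r2_va:.3f}  gap={r2_tr - r2_va:.3f}")
```

In this sketch the gap between training and validation R^2 serves as a rough stand-in for the "degree of overfitting" compared in the abstract; the study's own definition may differ.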

Details

Language :
English
ISSN :
2158-3188
Volume :
10
Issue :
1
Database :
MEDLINE
Journal :
Translational psychiatry
Publication Type :
Academic Journal
Accession number :
32826857
Full Text :
https://doi.org/10.1038/s41398-020-00957-5