Gene pathogenicity prediction of Mendelian diseases via the random forest algorithm.

Authors :
He, Sijie
Chen, Weiwei
Liu, Hankui
Li, Shengting
Lei, Dongzhu
Dang, Xiao
Chen, Yulan
Zhang, Xiuqing
Zhang, Jianguo
Source :
Human Genetics. Jun 2019, Vol. 138, Issue 6, p. 673-679. 7p.
Publication Year :
2019

Abstract

The study of Mendelian diseases and the identification of their causative genes are of great significance in the field of genetics. How to evaluate the pathogenicity of genes and how many Mendelian disease genes exist in total are both important open questions. However, very few studies have addressed these issues to date, so we attempt to answer them in this study. We calculated a gene pathogenicity prediction (GPP) score with a machine learning approach (the random forest algorithm) to evaluate the pathogenicity of genes. When we applied the GPP score to the testing gene set, we obtained an accuracy of 80%, a recall of 93%, and an area under the curve of 0.87. Our results estimated that a total of 10,384 protein-coding genes were Mendelian disease genes. Furthermore, we found that the GPP score was positively correlated with the severity of disease. Our results indicate that the GPP score may provide a robust and reliable guideline for predicting the pathogenicity of protein-coding genes. To our knowledge, this is the first attempt to estimate the total number of Mendelian disease genes. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0340-6717
Volume :
138
Issue :
6
Database :
Academic Search Index
Journal :
Human Genetics
Publication Type :
Academic Journal
Accession number :
136841219
Full Text :
https://doi.org/10.1007/s00439-019-02021-9