Identifying oncogenes as features for clinical cancer prognosis by Bayesian nonparametric variable selection algorithm

Authors :
Runyu Jing
Huijun Wang
Liqiu Huang
Menglong Li
Keqin Liu
Yongning Yang
Zhining Wen
Source :
Chemometrics and Intelligent Laboratory Systems. 146:464-471
Publication Year :
2015
Publisher :
Elsevier BV, 2015.

Abstract

In clinical research, DNA microarrays are widely used to identify oncogenes, which are differentially expressed between two clinical states and are considered predictors of cancer prognosis. Because clinical samples are heterogeneous, the differentially expressed genes (DEGs) discovered by current statistical methods or machine learning algorithms include a number of genes unrelated to the phenotypic differences between the compared samples, which in turn compromises the reliability of predictive models for cancer prognosis. In this study, we propose a Bayesian nonparametric variable selection algorithm, a stochastic, hierarchical search method, to separate the cancer-related genes from the DEG lists. The importance of each gene in the DEG lists can be inferred from the posterior distribution of the predicted clinical endpoints, which can be simulated by the Markov chain Monte Carlo (MCMC) algorithm. The cancer-related genes were identified according to their importance and used to construct models for predicting three clinical endpoints: the estrogen receptor status (ER status) of breast cancer patients, the preoperative treatment response of breast cancer, and the overall survival milestone outcome of acute myeloid leukemia (OS of AML). The prediction accuracies for preoperative treatment response, ER status and OS of AML were 86%, 89% and 58%, respectively, and the corresponding Matthews correlation coefficients were 0.42, 0.77 and 0.33, all higher than those reported in previous studies. Furthermore, most of the genes identified by our method have been reported as oncogenes in the literature. Our results demonstrate that the proposed Bayesian nonparametric variable selection algorithm can efficiently identify oncogenes for cancer prognosis and enhance the performance of predictive models.
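
The abstract reports both accuracy and the Matthews correlation coefficient (MCC) for each endpoint. As a reading aid, the sketch below shows how the two metrics are computed from a binary confusion matrix and why they can diverge on imbalanced classes. The function names and the confusion-matrix counts are hypothetical illustrations, not data or code from the study.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (NOT from the paper): 15 positives, 85 negatives.
tp, tn, fp, fn = 10, 80, 5, 5

acc = accuracy(tp, tn, fp, fn)           # 90 of 100 samples correct
mcc = matthews_corrcoef(tp, tn, fp, fn)  # 775 / 1275, about 0.61

print(f"accuracy = {acc:.2f}, MCC = {mcc:.2f}")
```

With these made-up counts the accuracy is high because the majority class dominates, while the MCC is noticeably lower because a third of the minority class is misclassified. This is one plausible reason an endpoint can show a high accuracy alongside a moderate MCC.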

Details

ISSN :
01697439
Volume :
146
Database :
OpenAIRE
Journal :
Chemometrics and Intelligent Laboratory Systems
Accession number :
edsair.doi...........51943d7dbd8b60847b8e38e671732650
Full Text :
https://doi.org/10.1016/j.chemolab.2015.07.004