Integration of Survival and Binary Data for Variable Selection and Prediction: A Bayesian Approach
- Source :
- J R Stat Soc Ser C Appl Stat
- Publication Year :
- 2020
Abstract
- Summary: We consider the problem where the data consist of a survival time and a binary outcome measurement for each individual, as well as corresponding predictors. The goal is to select the common set of predictors that affect both responses, and not just one of them. In addition, we develop a survival prediction model based on data integration. The paper is motivated by the Cancer Genomic Atlas databank, which is currently the largest genomics and transcriptomics database. The data contain cancer survival information along with cancer stages for each patient. Furthermore, they contain reverse phase protein array measurements for each individual, which are the predictors associated with these responses. The biological motivation is to identify the major actionable proteins associated with both survival outcomes and cancer stages. We develop a Bayesian hierarchical model to jointly model the survival time and the classification of the cancer stages. Moreover, to deal with the high dimensionality of the reverse phase protein array measurements, we use a shrinkage prior to identify significant proteins. Simulations and Cancer Genomic Atlas data analysis show that the joint integrated modelling approach improves survival prediction.
- Subjects :
- Statistics and Probability
0303 health sciences
Computer science
Bayesian probability
Feature selection
Accelerated failure time model
Machine learning
01 natural sciences
Article
Set (abstract data type)
010104 statistics & probability
03 medical and health sciences
Probit model
Binary data
Bayesian hierarchical modeling
Artificial intelligence
0101 mathematics
Statistics, Probability and Uncertainty
030304 developmental biology
Data integration
Details
- ISSN :
- 00359254
- Volume :
- 68
- Issue :
- 5
- Database :
- OpenAIRE
- Journal :
- Journal of the Royal Statistical Society. Series C, Applied statistics
- Accession number :
- edsair.doi.dedup.....3ee3ba45e85060ed4915a557d3cab3f6