Conditional Hierarchical Bayesian Tucker Decomposition for Genetic Data Analysis

Authors :
Sandler, Adam
Klabjan, Diego
Luo, Yuan
Publication Year :
2019

Abstract

We develop methods for reducing the dimensionality of large data sets, which are common in biomedical applications. Genetic data on patients often contain more features than observations, which makes direct supervised learning difficult. One method of reducing the feature space is to use latent Dirichlet allocation to group genetic variants in an unsupervised manner. Latent Dirichlet allocation describes a patient as a mixture of topics corresponding to genetic variants. This can be generalized as a Bayesian tensor decomposition to account for multiple feature variables. Our most significant contributions concern hierarchical topic modeling. We design distinct methods of incorporating hierarchical topic modeling, based on nested Chinese restaurant processes and the Pachinko Allocation Machine, into Bayesian tensor decomposition. We apply these models to examine patients with one of four common types of cancer (breast, lung, prostate, and colorectal) and siblings with and without autism spectrum disorder. We link the genes with their biological pathways and combine this information into a tensor of patients, counts of their genetic variants, and the genes' membership in pathways. We find that our trained models outperform baseline models, with respect to coherence, by up to 40%.

Comment: 38 pages, 8 figures, 5 tables
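As an illustration only, the three-way tensor mentioned in the abstract (patients, variant counts per gene, and pathway membership) might be assembled along the following lines. This is a minimal sketch under stated assumptions, not the authors' construction: the function name `build_tensor`, the toy patients, genes, pathways, and counts are all hypothetical, and the rule "copy a gene's variant count into every pathway the gene belongs to" is one plausible reading of the abstract.

```python
def build_tensor(variant_counts, gene_pathways, patients, genes, pathways):
    """Return nested lists tensor[p][g][w].

    Assumed rule (not taken from the paper): entry (p, g, w) holds patient p's
    variant count in gene g when gene g belongs to pathway w, and 0 otherwise.
    """
    tensor = [[[0] * len(pathways) for _ in genes] for _ in patients]
    for p, patient in enumerate(patients):
        for g, gene in enumerate(genes):
            count = variant_counts.get((patient, gene), 0)
            member_of = gene_pathways.get(gene, set())
            for w, pathway in enumerate(pathways):
                if pathway in member_of:
                    tensor[p][g][w] = count
    return tensor


# Hypothetical toy data, invented purely for illustration.
patients = ["patient_a", "patient_b"]
genes = ["GENE1", "GENE2", "GENE3"]
pathways = ["pathway_x", "pathway_y"]

# Number of variants observed per (patient, gene) pair.
variant_counts = {
    ("patient_a", "GENE1"): 2,
    ("patient_a", "GENE3"): 1,
    ("patient_b", "GENE2"): 4,
}

# Pathways each gene participates in.
gene_pathways = {
    "GENE1": {"pathway_x"},
    "GENE2": {"pathway_x", "pathway_y"},
    "GENE3": {"pathway_y"},
}

tensor = build_tensor(variant_counts, gene_pathways, patients, genes, pathways)
```

A tensor of this shape (patients × genes × pathways) is the kind of input a Bayesian tensor decomposition would factorize; the decomposition itself is not sketched here.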

Details

Database :
OAIster
Publication Type :
Electronic Resource
Accession number :
edsoai.on1228379180
Document Type :
Electronic Resource