
Non-negative Tensor Factorization with missing data for the modeling of gene expressions in the Human Brain

Authors :
Søren Føns Vind Nielsen
Morten Mørup
Mboup, Mamadou
Adali, Tülay
Moreau, Éric
Larsen, Jan
Source :
Nielsen, S F V & Mørup, M 2014, Non-negative Tensor Factorization with missing data for the modeling of gene expressions in the Human Brain. in M Mboup, T L Adali, É Moreau & J Larsen (eds), Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2014). IEEE, 2014 IEEE International Workshop on Machine Learning for Signal Processing, Reims, France, 21/09/2014. https://doi.org/10.1109/MLSP.2014.6958919, MLSP
Publication Year :
2014
Publisher :
IEEE, 2014.

Abstract

Non-negative Tensor Factorization (NTF) has become a prominent tool for analyzing high-dimensional, multi-way structured data. In this paper, we analyze gene expression across brain regions in multiple subjects using data from the Allen Human Brain Atlas [1], with more than 40% of the data missing in our problem. Our analysis is based on the non-negativity-constrained Canonical Polyadic (CP) decomposition, where we handle the missing data by marginalization and consider three prominent alternating least squares procedures: multiplicative updates, column-wise updating, and row-wise updating of the component matrices. We examine three gene expression prediction scenarios: data missing at random, whole genes missing, and whole areas missing within a subject. We find that the column-wise updating approach, also known as HALS, is the most efficient when fitting the model. We further observe that the non-negativity-constrained CP model predicts gene expression better than the subject average when data are missing at random. When whole genes or whole areas are missing, it is in general better to predict by subject averages. However, we find that when whole genes are missing from all subjects, the model-based predictions are useful. When analyzing the structure of the components derived for one of the best-predicting model orders, we find that the identified components in general constitute localized regions of the brain. Non-negative tensor factorization based on marginalization thus forms a promising framework for imputing missing values and characterizing gene expression in the human brain. However, care must be taken, in particular when predicting gene expression levels for a whole brain region that is missing, as our analysis indicates that this requires a substantial number of subjects with data for that region for the model predictions to be reliable.
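As an illustration of the approach described in the abstract, the following is a minimal sketch of a non-negative CP decomposition fitted only to observed entries, using multiplicative updates (one of the three procedures named). The abstract does not give the update equations, so this uses the standard weighted multiplicative update and should not be read as the authors' implementation. The function names `masked_ntf_mu` and `rel_error`, the parameters, and the synthetic data are all assumptions introduced for the example.

```python
import numpy as np


def masked_ntf_mu(X, mask, rank, n_iter=200, eps=1e-9, seed=0):
    """Non-negative CP factors of a 3-way tensor, fitted on observed cells only.

    X    : array of shape (I, J, K); values where mask is False are ignored
    mask : boolean array of the same shape, True where X is observed
    Hypothetical helper for illustration, not the paper's code.
    """
    rng = np.random.default_rng(seed)
    Xm = np.where(mask, X, 0.0)      # zero out missing cells (also neutralizes NaNs)
    W = mask.astype(float)           # 0/1 weights: missing cells drop out of the loss
    A, B, C = (rng.random((n, rank)) + 0.1 for n in X.shape)
    for _ in range(n_iter):
        # Each factor is rescaled by (observed data term) / (masked model term),
        # which keeps all entries non-negative.
        R = W * np.einsum("ir,jr,kr->ijk", A, B, C)
        A *= np.einsum("ijk,jr,kr->ir", Xm, B, C) / (np.einsum("ijk,jr,kr->ir", R, B, C) + eps)
        R = W * np.einsum("ir,jr,kr->ijk", A, B, C)
        B *= np.einsum("ijk,ir,kr->jr", Xm, A, C) / (np.einsum("ijk,ir,kr->jr", R, A, C) + eps)
        R = W * np.einsum("ir,jr,kr->ijk", A, B, C)
        C *= np.einsum("ijk,ir,jr->kr", Xm, A, B) / (np.einsum("ijk,ir,jr->kr", R, A, B) + eps)
    return A, B, C


def rel_error(X, A, B, C, cells):
    """Relative reconstruction error restricted to the cells selected by `cells`."""
    Xhat = np.einsum("ir,jr,kr->ijk", A, B, C)
    return np.linalg.norm((X - Xhat)[cells]) / np.linalg.norm(X[cells])


# Synthetic stand-in for a (genes x areas x subjects) tensor; not real atlas data.
rng = np.random.default_rng(1)
true_factors = [rng.random((n, 3)) for n in (20, 15, 6)]
X = np.einsum("ir,jr,kr->ijk", *true_factors)
mask = rng.random(X.shape) > 0.4     # roughly 40% of cells missing at random

A, B, C = masked_ntf_mu(X, mask, rank=3)
print("error on observed cells:", rel_error(X, A, B, C, mask))
print("error on held-out cells:", rel_error(X, A, B, C, ~mask))
```

The held-out error corresponds to the "missing at random" scenario only. The whole-gene and whole-area scenarios would need masks that remove entire slices or fibers, and the column-wise (HALS) and row-wise procedures would replace the three update lines while keeping the same masking idea.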

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....5055e2085c4b0d6a6f60f81838ef48dc