How should we measure proportionality on relative gene expression data?
- Source :
- Recercat. Dipòsit de la Recerca de Catalunya, Theory in Biosciences
- Publisher :
- Springer Nature
Abstract
- Correlation is ubiquitous in gene expression analysis, although its validity as an objective criterion is often questionable. If no normalization reflecting the original mRNA counts in the cells is available, correlation between genes becomes spurious. Yet the need for normalization can be bypassed using a relative analysis approach called log-ratio analysis. This approach can be used to identify proportional gene pairs, i.e., a subset of pairs whose correlation can be inferred correctly from unnormalized data because their log-ratio variance vanishes. To interpret the size of non-zero log-ratio variances, Lovell et al. recently proposed scaling them by the variance of one member of the gene pair. Here we derive analytically how spurious proportionality is introduced when such a scaling is used. We base our analysis on a symmetric proportionality coefficient (briefly mentioned in Lovell et al.) that has a number of advantages over their statistic. We show in detail how the choice of reference needed for the scaling determines which gene pairs are identified as proportional. We demonstrate that using an unchanged gene as a reference has huge advantages in terms of sensitivity. We also explore the link between proportionality and partial correlation and derive expressions for a partial proportionality coefficient. A brief data-analysis section puts the discussed concepts into practice. The authors were supported by CRG internal funds provided by the Catalan Government. I. E. was also funded by the Spanish Ministerio de Economía y Competitividad under Grant BFU2011-28575.
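The idea of a vanishing log-ratio variance described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' code: the function names, the simulated data, and in particular the form of the symmetric coefficient used here (one minus the log-ratio variance divided by the sum of the two reference-scaled log variances) are assumptions made for illustration, since the record itself gives no formula.

```python
import numpy as np

def log_ratio_variance(x, y):
    """Sample variance of log(x/y); zero when x and y are exactly proportional."""
    return np.var(np.log(x) - np.log(y), ddof=1)

def symmetric_proportionality(x, y, ref):
    """Assumed form of a symmetric proportionality coefficient.

    Both genes are first expressed as log-ratios to a reference gene, then
    the log-ratio variance of the pair is scaled by the sum of the two
    individual variances. A value of 1 means perfect proportionality.
    """
    lx = np.log(x) - np.log(ref)
    ly = np.log(y) - np.log(ref)
    return 1.0 - np.var(lx - ly, ddof=1) / (np.var(lx, ddof=1) + np.var(ly, ddof=1))

# Hypothetical absolute abundances for 200 samples (illustrative only).
rng = np.random.default_rng(0)
n = 200
gene_a = rng.lognormal(mean=5.0, sigma=1.0, size=n)
gene_b = 3.0 * gene_a                                   # exactly proportional to gene_a
gene_c = rng.lognormal(mean=5.0, sigma=1.0, size=n)     # independent of gene_a
ref = rng.lognormal(mean=5.0, sigma=0.05, size=n)       # nearly unchanged reference gene

# Unnormalized (relative) data: every sample is rescaled by an unknown factor.
depth = rng.uniform(0.1, 10.0, size=n)
a_rel, b_rel, c_rel, ref_rel = (g * depth for g in (gene_a, gene_b, gene_c, ref))

lrv_ab = log_ratio_variance(a_rel, b_rel)
lrv_ac_abs = log_ratio_variance(gene_a, gene_c)
lrv_ac_rel = log_ratio_variance(a_rel, c_rel)
rho_ab = symmetric_proportionality(a_rel, b_rel, ref_rel)
rho_ac = symmetric_proportionality(a_rel, c_rel, ref_rel)

print("log-ratio variance (a, b):", lrv_ab)
print("log-ratio variance (a, c), absolute vs relative:", lrv_ac_abs, lrv_ac_rel)
print("coefficient (a, b):", rho_ab)
print("coefficient (a, c):", rho_ac)
```

Because the per-sample scaling factor cancels in every ratio, the log-ratio variances are the same for the absolute and the rescaled data, which is why no normalization is needed for this part of the analysis. The choice of reference only enters through the scaling in the coefficient.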
- Subjects :
- 0301 basic medicine
Normalization (statistics)
Statistics and Probability
Spurious correlation
Data normalization
Genes, Fungal
Compositional data
01 natural sciences
Models, Biological
Database normalization
010104 statistics & probability
03 medical and health sciences
Schizosaccharomyces
Applied mathematics
Gene Regulatory Networks
RNA, Messenger
0101 mathematics
Least-Squares Analysis
Spurious relationship
Scaling
Log-ratio analysis
Statistic
Partial correlation
Ecology, Evolution, Behavior and Systematics
Mathematics
Medicine(all)
Original Paper
Stochastic Processes
Models, Statistical
Stochastic process
Sequence Analysis, RNA
Gene Expression Profiling
Applied Mathematics
Gene networks
Co-expression
030104 developmental biology
Gene Expression Regulation
Details
- Language :
- English
- ISSN :
- 1431-7613
- Volume :
- 135
- Issue :
- 1-2
- Database :
- OpenAIRE
- Journal :
- Theory in Biosciences
- Accession number :
- edsair.doi.dedup.....ad143ee7c5e1ca265c0de829e01d9e28
- Full Text :
- https://doi.org/10.1007/s12064-015-0220-8