51. A hypothesis-driven approach to assessing significance of differences in RNA expression levels among specific groups of genes
- Author
- Carolyn J. Lawrence-Dill, Peng Liu, and Mingze He
- Subjects
0301 basic medicine, 0106 biological sciences, Statistical methods, Plant Science, Computational biology, Biology, Biochemistry, 01 natural sciences, Normal distribution, 03 medical and health sciences, Transcription (biology), lcsh:Botany, Gene expression, Genetics, Statistical analysis, Cluster analysis, Gene, 030304 developmental biology, Parametric statistics, 0303 health sciences, Cell Biology, Genomics, Expression (mathematics), lcsh:QK1-989, 030104 developmental biology, Differentially expressed genes, Rna expression, Expression data, Parametric methods, RNA-seq, Function (biology), Developmental Biology, 010606 plant biology & botany
- Abstract
Genome-wide molecular gene expression studies generally compare expression values for each gene across multiple conditions followed by cluster and gene set enrichment analysis to determine whether differentially expressed genes are enriched in specific biochemical pathways, cellular components, biological processes, and/or molecular functions, etc. This approach to analyzing differences in gene expression enables discovery of gene function, but is not useful to determine whether pre-defined groups of genes share or diverge in their expression patterns in response to treatments nor to assess the correctness of pre-defined gene set groupings. Here we present a simple method that changes the dimension of comparison by treating genes as variable traits to directly assess significance of differences in expression levels among pre-defined gene groups. Because expression distributions are typically skewed (thus unfit for direct assessment using Gaussian statistical methods) our method involves transforming expression data to approximate a normal distribution followed by dividing the genes into groups, then applying Gaussian parametric methods to assess significance of observed differences. This method enables the assessment of differences in gene expression distributions within and across samples, enabling hypothesis-based comparison among groups of genes. We demonstrate this method by assessing the significance of specific gene groups’ differential response to heat stress conditions in maize.
Abbreviations: GO, gene ontology; HSP, heat shock protein; KEGG, Kyoto Encyclopedia of Genes and Genomes; HSF TF, heat shock factor transcription factor; HSBP, heat shock binding protein; RNA, ribonucleic acid; TE, transposable element; TF, transcription factor; TPM, transcripts per kilobase millions
- Published
- 2017
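The workflow outlined in the abstract (transform skewed expression values toward normality, divide genes into pre-defined groups, then apply a Gaussian parametric test) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the log2(TPM + 1) transform, the choice of Welch's t-test, the large-sample normal approximation for the p-value, the simulated TPM values, and the names `log_transform`, `welch_t`, and `approx_two_sided_p`.

```python
import math
import random
from statistics import NormalDist, mean, variance


def log_transform(tpm_values, pseudocount=1.0):
    """Return log2(TPM + pseudocount); an assumed transform for skewed data."""
    return [math.log2(v + pseudocount) for v in tpm_values]


def welch_t(group_a, group_b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    na, nb = len(group_a), len(group_b)
    va = variance(group_a) / na  # squared standard error of mean, group A
    vb = variance(group_b) / nb  # squared standard error of mean, group B
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def approx_two_sided_p(t):
    """Two-sided p-value from a standard normal (reasonable only for large df)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


# Simulated, hypothetical TPM values for two pre-defined gene groups.
rng = random.Random(0)
hsp_tpm = [rng.lognormvariate(5.0, 1.0) for _ in range(200)]
background_tpm = [rng.lognormvariate(3.0, 1.0) for _ in range(200)]

t_stat, df = welch_t(log_transform(hsp_tpm), log_transform(background_tpm))
p_value = approx_two_sided_p(t_stat)
print(f"t = {t_stat:.2f}, df = {df:.1f}, approximate p = {p_value:.3g}")
```

With real data, each list would hold the expression values of the genes in one hypothesis-defined group (for example, HSP genes versus all other genes) within a single sample. Whether the transformed values are close enough to normal should be checked before relying on a parametric test.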