Gene set analysis: limitations in popular existing methods and proposed improvements.

Authors :
Mishra, Pashupati
Törönen, Petri
Leino, Yrjö
Holm, Liisa
Source :
Bioinformatics. Oct2014, Vol. 30 Issue 19, p2747-2756. 10p.
Publication Year :
2014

Abstract

Motivation: Gene set analysis is the analysis of a set of genes that collectively contribute to a biological process. Most popular gene set analysis methods are based on empirical P-values, which require a large number of permutations. Despite the numerous gene set analysis methods developed in the past decade, the most popular methods still suffer from serious limitations.

Results: We present a gene set analysis method (mGSZ) based on the Gene Set Z-scoring function (GSZ) and asymptotic P-values. Asymptotic P-value calculation requires fewer permutations, and thus speeds up the gene set analysis process. We compare the GSZ scoring function with seven popular gene set scoring functions and show that GSZ stands out as the best scoring function. In addition, we show improved performance of the GSA method when the max-mean statistic is replaced by the GSZ scoring function. We demonstrate the importance of both gene and sample permutations by showing the consequences of omitting one or the other. A comparison of asymptotic and empirical methods of P-value estimation demonstrates a clear advantage of asymptotic P-values over empirical P-values. We show that mGSZ outperforms the state-of-the-art methods based on two different evaluations. We compared mGSZ results with permutation and rotation tests and show that rotation does not improve our asymptotic P-values. We also propose well-known asymptotic distribution models for three of the compared methods.

Availability and implementation: mGSZ is available as an R package from cran.r-project.org.

Contact: pashupati.mishra@helsinki.fi

Supplementary information: Available at http://ekhidna.biocenter.helsinki.fi/downloads/pashupati/mGSZ.html

[ABSTRACT FROM AUTHOR]
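
The contrast the abstract draws between empirical and asymptotic P-values can be illustrated with a minimal sketch. It is not the mGSZ implementation: the function names, the simulated null scores, and the normal model for the null distribution are all assumptions chosen for brevity, and the distribution actually fitted by mGSZ may differ. The sketch only shows why a fitted distribution can resolve small P-values from few permutations, whereas an empirical P-value cannot fall below 1/(n+1) for n permutations.

```python
import random
import statistics
from statistics import NormalDist


def empirical_p(observed, null_scores):
    """Empirical P-value: fraction of permutation scores >= observed.

    Uses the common add-one correction, so the result is never
    smaller than 1 / (len(null_scores) + 1).
    """
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


def asymptotic_p(observed, null_scores):
    """Asymptotic P-value from a distribution fitted to the null scores.

    ASSUMPTION: a normal model is fitted here purely for illustration;
    it is not claimed to be the model used by mGSZ.
    """
    mu = statistics.fmean(null_scores)
    sigma = statistics.stdev(null_scores)
    return 1.0 - NormalDist(mu, sigma).cdf(observed)


# Hypothetical data: 200 permutation scores and one strongly
# enriched gene set score (both simulated, not from the paper).
rng = random.Random(0)
null_scores = [rng.gauss(0.0, 1.0) for _ in range(200)]
observed = 6.0

p_emp = empirical_p(observed, null_scores)
p_asym = asymptotic_p(observed, null_scores)
print(f"empirical P  = {p_emp:.3g}")
print(f"asymptotic P = {p_asym:.3g}")
```

With only 200 permutations the empirical estimate is stuck at its floor of 1/201, while the fitted model still yields a much smaller tail probability. The quality of the asymptotic estimate depends entirely on how well the assumed distribution matches the true null.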

Details

Language :
English
ISSN :
13674803
Volume :
30
Issue :
19
Database :
Academic Search Index
Journal :
Bioinformatics
Publication Type :
Academic Journal
Accession number :
98635594
Full Text :
https://doi.org/10.1093/bioinformatics/btu374