DADA: Degree-Aware Algorithms for Network-Based Disease Gene Prioritization
- Source :
- BioData Mining, Vol 4, Iss 1, p 19 (2011)
- Publication Year :
- 2010
Abstract
- Background: High-throughput molecular interaction data have been used effectively to prioritize candidate genes that are linked to a disease, based on the observation that the products of genes associated with similar diseases are likely to interact heavily with each other in a network of protein-protein interactions (PPIs). An important challenge for these applications, however, is the incomplete and noisy nature of PPI data. Information-flow-based methods alleviate these problems to a certain extent by considering indirect interactions and the multiplicity of paths.
- Results: We demonstrate that existing methods are likely to favor highly connected genes, making prioritization sensitive to the skewed degree distribution of PPI networks, as well as to ascertainment bias in available interaction and disease association data. Motivated by this observation, we propose several statistical adjustment methods to account for the degree distribution of known disease and candidate genes, using a PPI network with associated confidence scores for interactions. We show that the proposed methods can detect loosely connected disease genes that are missed by existing approaches; however, this improvement might come at the price of more false negatives for highly connected genes. Consequently, we develop a suite called DADA, which includes different uniform prioritization methods that effectively integrate existing approaches with the proposed statistical adjustment strategies. Comprehensive experimental results on the Online Mendelian Inheritance in Man (OMIM) database show that DADA outperforms existing methods in prioritizing candidate disease genes.
- Conclusions: These results demonstrate the importance of employing accurate statistical models and associated adjustment methods in network-based disease gene prioritization, as well as in other network-based functional inference applications. DADA is implemented in Matlab and is freely available at http://compbio.case.edu/dada/.
- Subjects :
- Candidate gene
Computer science
Inference
Disease
lcsh:Analysis
computer.software_genre
lcsh:Computer applications to medicine. Medical informatics
Biochemistry
03 medical and health sciences
0302 clinical medicine
OMIM : Online Mendelian Inheritance in Man
Genetics
Information flow (information theory)
Molecular Biology
030304 developmental biology
Sampling bias
0303 health sciences
Research
lcsh:QA299.6-433
Statistical model
Degree distribution
Computer Science Applications
Computational Mathematics
Computational Theory and Mathematics
lcsh:R858-859.7
Data mining
computer
030217 neurology & neurosurgery
Details
- ISSN :
- 1756-0381
- Volume :
- 4
- Database :
- OpenAIRE
- Journal :
- BioData Mining
- Accession number :
- edsair.doi.dedup.....f638f7acca49d0f3aeeaa5a030116d22
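
Illustration (not part of the record): the degree bias described in the abstract can be sketched on a toy network. The Python below is a minimal sketch under stated assumptions, not DADA's Matlab implementation. The graph, the node names, the restart probability of 0.3, and the adjustment used here (dividing each score by a baseline obtained with a uniform restart over all nodes) are all hypothetical choices made for illustration. The adjustment is a simple stand-in for the idea of correcting for connectivity and should not be read as one of the adjustment methods proposed in the paper.

```python
def random_walk_with_restart(adj, restart_nodes, restart=0.3, iters=200):
    """Iterate a random walk with restart on a weighted, undirected graph.

    adj maps each node to a dict {neighbor: confidence weight}.
    restart_nodes is the set of nodes the walker jumps back to.
    Every node is assumed to have at least one neighbor.
    """
    nodes = sorted(adj)
    strength = {u: sum(adj[u].values()) for u in nodes}
    jump = {u: (1.0 / len(restart_nodes) if u in restart_nodes else 0.0)
            for u in nodes}
    score = dict(jump)
    for _ in range(iters):
        nxt = {u: restart * jump[u] for u in nodes}
        for u in nodes:
            for v, w in adj[u].items():
                # Pass a share of u's score to v, proportional to edge weight.
                nxt[v] += (1.0 - restart) * score[u] * w / strength[u]
        score = nxt
    return score


def build_graph(edges):
    """Build a symmetric adjacency dict from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


# Hypothetical toy network: seed S is a known disease gene, L is a loosely
# connected candidate, H is a hub candidate with four extra neighbors.
edges = [("S", "L", 1.0), ("S", "H", 1.0),
         ("H", "a", 1.0), ("H", "b", 1.0), ("H", "c", 1.0), ("H", "d", 1.0)]
adj = build_graph(edges)

# Raw information-flow score: restart only at the seed gene.
raw = random_walk_with_restart(adj, {"S"})

# Baseline: restart uniformly at every node, so the score reflects
# connectivity alone rather than proximity to the seed.
baseline = random_walk_with_restart(adj, set(adj))

# Illustrative adjustment (an assumption): ratio of raw score to baseline.
adjusted = {u: raw[u] / baseline[u] for u in adj}

for gene in ("L", "H"):
    print(gene, round(raw[gene], 4), round(adjusted[gene], 4))
```

In this toy setting the raw score ranks the hub H above the loosely connected candidate L, while the ratio to the connectivity baseline reverses that order. This mirrors the trade-off the abstract notes: an adjustment that helps loosely connected genes can penalize highly connected ones.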