Handling very large numbers of association rules in the analysis of microarray data

Authors :
Gediminas Adomavicius
Alexander Tuzhilin
Source :
KDD
Publication Year :
2002
Publisher :
ACM, 2002.

Abstract

The analysis of microarray data has become one of the important topics in bioinformatics over the past several years, and different data mining techniques have been proposed for the analysis of such data. In this paper, we propose to use association rule discovery methods for determining associations among expression levels of different genes. One of the main problems related to the discovery of these associations is scalability. Microarrays usually contain very large numbers of genes, sometimes measured in the tens of thousands. Therefore, analysis of such data can generate a very large number of associations, often measured in the millions. The paper addresses this problem by presenting a method that enables biologists to evaluate these very large numbers of discovered association rules during the post-analysis stage of the data mining process. This is achieved by providing several rule evaluation operators, including rule grouping, filtering, browsing, and data inspection operators, that allow biologists to validate multiple individual gene regulation patterns at a time. By iteratively applying these operators, biologists can explore a significant part of all the initially generated rules in an acceptable period of time and thus answer the biological questions that are of particular interest to them. To validate our method, we tested our system on microarray data pertaining to studies of environmental hazards and their influence on gene expression processes. As a result, we managed to answer several questions that were of interest to the biologists who had collected this data.
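
As a rough illustration of what filtering and grouping operators over discovered rules might look like, here is a minimal Python sketch. It is not the authors' system: the abstract does not specify how the operators are defined, so the `Rule` structure, the `gene=level` item encoding, the function names `filter_rules` and `group_rules`, and the example rules and thresholds are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical association rule: antecedent items imply a consequent item."""
    antecedent: frozenset  # items encoded as "gene=level", e.g. {"G1=up"}
    consequent: str        # a single "gene=level" item
    support: float
    confidence: float


def gene_of(item):
    """Return the gene name from a "gene=level" item."""
    return item.split("=")[0]


def filter_rules(rules, min_support=0.0, min_confidence=0.0, must_mention=None):
    """Keep rules that meet the thresholds and, optionally, mention a given gene."""
    kept = []
    for r in rules:
        if r.support < min_support or r.confidence < min_confidence:
            continue
        if must_mention is not None:
            genes = {gene_of(i) for i in r.antecedent} | {gene_of(r.consequent)}
            if must_mention not in genes:
                continue
        kept.append(r)
    return kept


def group_rules(rules):
    """Group rules by the genes involved, ignoring the expression levels."""
    groups = defaultdict(list)
    for r in rules:
        key = (frozenset(gene_of(i) for i in r.antecedent), gene_of(r.consequent))
        groups[key].append(r)
    return dict(groups)


# Made-up rules, for illustration only.
rules = [
    Rule(frozenset({"G1=up"}), "G2=down", 0.30, 0.90),
    Rule(frozenset({"G1=down"}), "G2=up", 0.25, 0.85),
    Rule(frozenset({"G3=up"}), "G4=up", 0.05, 0.60),
]

strong = filter_rules(rules, min_support=0.1, min_confidence=0.8)
groups = group_rules(strong)
```

Grouping on gene names alone lets a whole family of rules about the same gene pair be accepted or rejected in one step, which is one plausible reading of validating multiple patterns at a time.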

Details

Database :
OpenAIRE
Journal :
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Accession number :
edsair.doi...........914b78efe74e548f4a074998bb11bb9d
Full Text :
https://doi.org/10.1145/775047.775104