1. Integrating constraint programming and itemset mining
- Author
Siegfried Nijssen; Tias Guns; Balcazar, JL; Bonchi, F; Gionis, A; Sebag, M; Data Analytics Laboratory; Business technology and Operations; Electromobility research centre
- Subjects
Computer science, Computation, Scalability, Constraint programming, Graph (abstract data type), Itemset mining, Data mining, Graph, Theoretical Computer Science, Computer Science (all)
- Abstract
Over the years, many pattern mining tasks and algorithms have been proposed. Traditionally, the focus of these studies was on the efficiency of the computation and the scalability towards very large databases. However, little research has been done on a general framework that encompasses several of these problems. In earlier work we showed how constraint programming (CP) can offer such a general framework; unfortunately, we also found that out-of-the-box CP solvers lack the efficiency and scalability achieved by specialized itemset mining systems, which could discourage their use. Here we study the question of whether a framework can be built that inherits the generality of CP systems and the efficiency of specialized algorithms. We propose a CP-based framework for pattern mining that avoids the redundant representations and propagations found in existing CP systems. We show experimentally that an implementation of this framework performs comparably to specialized itemset mining systems; furthermore, under certain conditions it lists itemsets with polynomial delay, which demonstrates that it is also a promising approach for analyzing pattern mining tasks from more theoretical perspectives. This is illustrated on a graph mining problem.
Part of: European Conference on Machine Learning and Knowledge Discovery in Databases, Barcelona, 20-24 Sep 2010; Lecture Notes in Computer Science, vol. 6322, part 2, pages 467-482. Acceptance rate: 19.0%. Status: published.
- Published
- 2010