Mining Top-K multidimensional gradients
- Source :
- Scopus-Elsevier, Data Warehousing and Knowledge Discovery ISBN: 9783540745525, DaWaK, CIÊNCIAVITAE
Abstract
- Several business applications, such as market basket analysis, clickstream analysis, fraud detection and churn migration analysis, demand gradient data analysis. By employing gradient data analysis one is able to identify trends and outliers and to answer "what-if" questions over large databases. Gradient queries were first introduced by Imielinski et al. [1] as the cubegrade problem. The main idea is to detect interesting changes in a multidimensional space (MDS): changes in a set of measures (aggregates) are associated with changes in sector characteristics (dimensions). An MDS contains a huge number of cells, which poses a great challenge for mining gradient cells in a useful time. Dong et al. [2] have proposed gradient constraints to reduce the computational costs involved in such queries. Even when such constraints are used on large databases, the number of interesting cases to evaluate is still large. In this work, we are interested in exploring the best cases (Top-K cells) of interesting multidimensional gradients. There are several studies on Top-K queries, but preference queries with multidimensional selection were introduced quite recently by Dong et al. [9]. Furthermore, traditional Top-K methods work well in the presence of convex functions, whereas gradients are non-convex. We have revisited iceberg cubing for complex measures, since it is the basis for mining gradient cells. We also propose a gradient-based cubing strategy to evaluate interesting gradient regions in the MDS. Thus, the main challenge is to find maximum gradient regions (MGRs) that maximize the task of mining Top-K gradient cells. Our performance study indicates that our strategy is effective at finding the most interesting gradients in multidimensional space. Supported by a Ph.D. Scholarship from FCT-Foundation of Science and Technology, Ministry of Science of Portugal.
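To make the notion of a Top-K gradient query concrete, the following is a naive, brute-force sketch and not the MGR-based strategy the abstract describes. It assumes a toy fact table (`ROWS`, hypothetical data), uses the average as the measure, and compares only cells that differ in exactly one dimension; the function names `base_cells` and `top_k_gradients` are illustrative and do not come from the paper.

```python
from collections import defaultdict
from heapq import nlargest
from itertools import combinations

# Hypothetical toy fact table: (region, product, sales).
ROWS = [
    ("north", "tea", 10.0),
    ("north", "tea", 14.0),
    ("north", "coffee", 30.0),
    ("south", "tea", 6.0),
    ("south", "coffee", 8.0),
]


def base_cells(rows):
    """Aggregate rows into base cells, using the average as the measure."""
    sums = defaultdict(lambda: [0.0, 0])
    for *dims, measure in rows:
        acc = sums[tuple(dims)]
        acc[0] += measure
        acc[1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


def top_k_gradients(cells, k):
    """Return the k cell pairs with the largest measure ratio.

    Assumption for this sketch: only pairs that differ in exactly one
    dimension are compared, and every pair is enumerated (no pruning).
    Each result is (ratio, lower_cell, higher_cell).
    """
    candidates = []
    for (a, avg_a), (b, avg_b) in combinations(cells.items(), 2):
        if sum(x != y for x, y in zip(a, b)) != 1:
            continue
        # Orient the pair so the ratio is always >= 1.
        low, high = sorted([(avg_a, a), (avg_b, b)])
        if low[0] > 0:
            candidates.append((high[0] / low[0], low[1], high[1]))
    return nlargest(k, candidates)


if __name__ == "__main__":
    for ratio, low, high in top_k_gradients(base_cells(ROWS), 2):
        print(f"{low} -> {high}: ratio {ratio:.2f}")
```

Enumerating every pair is quadratic in the number of cells, which is exactly the cost that gradient constraints and region-based pruning are meant to avoid on large cubes.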
- Subjects :
- Science & Technology
Selection (relational algebra)
Basis (linear algebra)
Computer science
Multidimensional space
Ciências Naturais::Ciências da Computação e da Informação
02 engineering and technology
Data Cube
Set (abstract data type)
Probe Cell
Spreading Factor
Cuboid Cell
020204 information systems
Outlier
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Data mining
Convex function
Clickstream
Details
- ISBN :
- 978-3-540-74552-5
- ISBNs :
- 9783540745525
- Database :
- OpenAIRE
- Journal :
- Scopus-Elsevier, Data Warehousing and Knowledge Discovery ISBN: 9783540745525, DaWaK, CIÊNCIAVITAE
- Accession number :
- edsair.doi.dedup.....11813e135b4e9697a3148771ba640d5b