Optimal Subgroup Discovery in Purely Numerical Data
- Source :
- Advances in Knowledge Discovery and Data Mining, 24th Pacific-Asia Conference, PAKDD 2020, Singapore (online), May 11–14, 2020, Proceedings, PAKDD (2), pp. 112-124, ISBN: 9783030474355, ⟨10.1007/978-3-030-47436-2_9⟩
- Publication Year :
- 2020
- Publisher :
- HAL CCSD, 2020.
Abstract
- Subgroup discovery in labeled data is the task of discovering patterns in the description space of objects to find subsets of objects whose labels show an interesting distribution, for example the disproportionate representation of a label value. Discovering interesting subgroups in purely numerical data (attributes and target label) has received little attention so far. Existing methods rely on discretization, which leads to a loss of information and suboptimal results. This is the case for the reference algorithm SD-Map*. We consider here the discovery of optimal subgroups according to an interestingness measure in purely numerical data. We leverage the concept of closed interval patterns and advanced enumeration and pruning techniques. The performance of our algorithm is studied empirically and its added value w.r.t. SD-Map* is illustrated.
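As a rough illustration of the notions named in the abstract, the sketch below builds an interval pattern over a toy numerical dataset, computes its extent, closes it to the tightest bounding box, and scores it. The data, the function names, and the quality measure (extent size to the power `a`, times the shift in mean target) are assumptions chosen for illustration; the record does not state which measure or enumeration strategy the paper uses.

```python
from statistics import mean

# Toy dataset (made up for illustration): each row is (attributes, target).
DATA = [
    ((1.0, 5.0), 10.0),
    ((2.0, 6.0), 12.0),
    ((3.0, 1.0), 2.0),
    ((4.0, 2.0), 3.0),
    ((2.5, 5.5), 11.0),
]

def extent(pattern, data):
    """Rows whose every attribute falls inside the pattern's intervals."""
    return [
        (x, y) for x, y in data
        if all(lo <= v <= hi for v, (lo, hi) in zip(x, pattern))
    ]

def closure(rows):
    """Tightest bounding box of the rows, i.e. the closed interval pattern."""
    columns = zip(*(x for x, _ in rows))
    return [(min(c), max(c)) for c in columns]

def quality(rows, data, a=0.5):
    """Assumed measure: |extent|^a * (mean target in extent - overall mean)."""
    if not rows:
        return 0.0
    overall = mean(y for _, y in data)
    return len(rows) ** a * (mean(y for _, y in rows) - overall)

# A hand-picked interval pattern: one [low, high] interval per attribute.
pattern = [(0.0, 2.8), (4.0, 7.0)]
rows = extent(pattern, DATA)
closed = closure(rows)

print("extent size:", len(rows))
print("closed pattern:", closed)
print("quality:", quality(rows, DATA))
```

The closed pattern covers exactly the same rows as the original one, so a search can restrict itself to closed patterns without missing any distinct subgroup; no discretization of the attributes is needed.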
- Subjects :
- Theoretical computer science
Discretization
Computer science
02 engineering and technology
Pattern Mining
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Numerical Data
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
Enumeration
Leverage (statistics)
Labeled data
020201 artificial intelligence & image processing
Subgroup Discovery
[INFO]Computer Science [cs]
Details
- Language :
- English
- ISBN :
- 978-3-030-47435-5
- Database :
- OpenAIRE
- Journal :
- Advances in Knowledge Discovery and Data Mining, 24th Pacific-Asia Conference, PAKDD 2020, Singapore (online), May 11–14, 2020, Proceedings, PAKDD (2), pp. 112-124, ISBN: 9783030474355, ⟨10.1007/978-3-030-47436-2_9⟩
- Accession number :
- edsair.doi.dedup.....23389d690a6f14dd9029316dfad56ad4
- Full Text :
- https://doi.org/10.1007/978-3-030-47436-2_9