1. Active Trace Clustering for Improved Process Discovery
- Author
- Jan Vanthienen, Bart Baesens, Seppe vanden Broucke, and Jochen De Weerdt
- Subjects
Process modeling, Computer science, business.industry, Conceptual clustering, Process mining, Machine learning, computer.software_genre, Computer Science Applications, Business process discovery, Computational Theory and Mathematics, Knowledge extraction, Consensus clustering, Information system, Artificial intelligence, Data mining, business, Cluster analysis, computer, K-optimal pattern discovery, Information Systems, TRACE (psycholinguistics)
- Abstract
Process discovery is the learning task that entails the construction of process models from event logs of information systems. Typically, these event logs are large data sets that record process executions by registering which activity took place at a certain moment in time. By far the most arduous challenge for process discovery algorithms is accurate and comprehensible knowledge discovery from highly flexible environments. Event logs from such flexible systems often contain a large variety of process executions, which makes the application of process mining most interesting. However, simply applying existing process discovery techniques will often yield highly incomprehensible process models because of their inaccuracy and complexity. To resolve this problem, trace clustering is a very interesting approach, since it allows an existing event log to be split up so as to facilitate the knowledge discovery process. In this paper, we propose a novel trace clustering technique that differs significantly from previous approaches. Above all, it starts from the observation that currently available techniques suffer from a large divergence between the clustering bias and the evaluation bias. By employing an active-learning-inspired approach, this bias divergence is resolved. In an assessment using four complex, real-life event logs, it is shown that our technique significantly outperforms currently available trace clustering techniques.
- Published
- 2013