Online correlation for unlabeled process events: A flexible CEP-based approach.

Authors :
Helal, Iman M.A.
Awad, Ahmed
Source :
Information Systems. Sep2022, Vol. 108, pN.PAG-N.PAG. 1p.
Publication Year :
2022

Abstract

Process mining is a sub-field of data mining that focuses on analyzing timestamped and partially ordered data. This type of data is commonly called event logs. To apply process mining techniques, each event must have at least three attributes: case ID, task ID/name, and timestamp. Thus, any missing information needs to be supplied first. Traditionally, events collected from different sources are manually correlated. While this might be acceptable in an offline setting, it is infeasible in an online setting. Recently, several use cases have emerged that call for applying process mining in an online setting. In such scenarios, e.g., IoT applications, a stream of high-speed and high-volume events flows continuously, with stringent latency requirements for gaining insights about the ongoing process. Thus, event correlation must be automated and occur as the data is being received. We introduce an approach that correlates unlabeled events received on a stream. Given a set of start activities, our approach correlates unlabeled events to a case identifier. Our approach is probabilistic: a single uncorrelated event can be assigned to zero or more case identifiers with different probabilities. Moreover, our approach is flexible: the user can supply domain knowledge in the form of constraints that reduce the correlation space. This knowledge can be supplied while the application is running. We realize our approach using complex event processing (CEP) technologies. We implemented a prototype on top of Esper, a state-of-the-art industrial CEP engine. We compare our approach to baseline approaches. The experimental evaluation shows that our approach outperforms the baseline approaches in throughput and latency. It also shows that, on real-life logs, the accuracy of our approach can compete with the baseline approaches.

• Online event correlation for low-level process events that leverages CEP technology.
• A flexible approach using CEP rules that can be added, removed, or modified at runtime.
• Realization on top of the Esper engine (a state-of-the-art industrial CEP engine).
• Evaluation on real-life and synthetic logs and comparison with baseline approaches.
• Achieving better throughput, accuracy, and latency compared to the state of the art.

[ABSTRACT FROM AUTHOR]
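
The correlation idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' Esper-based implementation: the class names (`Event`, `Case`, `Correlator`), the uniform probability split across candidate cases, and the time-gap constraint are all hypothetical assumptions chosen only to show how start activities open cases, how one unlabeled event can map to zero or more case identifiers, and how a constraint added at runtime can shrink the correlation space.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class Event:
    activity: str
    timestamp: float  # no case ID: the event is unlabeled


@dataclass
class Case:
    case_id: int
    events: List[Event] = field(default_factory=list)


# A constraint decides whether an event may belong to a given case.
Constraint = Callable[[Case, Event], bool]


class Correlator:
    """Toy correlator (hypothetical; not the paper's algorithm)."""

    def __init__(self, start_activities: Iterable[str],
                 constraints: Optional[List[Constraint]] = None) -> None:
        self.start_activities = set(start_activities)
        self.constraints: List[Constraint] = list(constraints or [])
        self.cases: List[Case] = []

    def add_constraint(self, constraint: Constraint) -> None:
        # Domain knowledge can be supplied while events keep arriving.
        self.constraints.append(constraint)

    def process(self, event: Event) -> Dict[int, float]:
        """Return {case_id: probability} for one incoming event."""
        if event.activity in self.start_activities:
            # A start activity always opens a new case.
            case = Case(len(self.cases) + 1, [event])
            self.cases.append(case)
            return {case.case_id: 1.0}
        candidates = [c for c in self.cases
                      if all(check(c, event) for check in self.constraints)]
        if not candidates:
            return {}  # the event correlates to zero cases
        # Assumption: probability mass is split uniformly across candidates.
        share = 1.0 / len(candidates)
        for c in candidates:
            c.events.append(event)  # recorded in every candidate case
        return {c.case_id: share for c in candidates}


corr = Correlator(start_activities={"A"})
r1 = corr.process(Event("A", 0.0))    # opens case 1
r2 = corr.process(Event("A", 1.0))    # opens case 2
r3 = corr.process(Event("B", 2.0))    # ambiguous: both cases are candidates
# Hypothetical runtime constraint: at most 5 time units since the case's last event.
corr.add_constraint(lambda case, ev: ev.timestamp - case.events[-1].timestamp <= 5.0)
r4 = corr.process(Event("C", 100.0))  # no case satisfies the constraint
print(r1, r2, r3, r4)
```

In this sketch the ambiguous event `B` is shared equally between both open cases, while `C` arrives too late for any case once the constraint is in place, so it is assigned to none.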

Details

Language :
English
ISSN :
03064379
Volume :
108
Database :
Academic Search Index
Journal :
Information Systems
Publication Type :
Academic Journal
Accession number :
156852890
Full Text :
https://doi.org/10.1016/j.is.2022.102031