A Proposal: High-Throughput Robust Architecture for Log Analysis and Data Stream Mining
- Source :
- Advances in Intelligent Systems and Computing ISBN: 9789811004179
- Publication Year :
- 2016
- Publisher :
- Springer Singapore, 2016.
Abstract
- Various data mining approaches are now available that help in handling large static data sets despite limited computational resources. However, these approaches fall short when mining high-speed, endless streams: their learning procedures, though simple, require the entire training process to be repeated for each newly arriving instance. The main challenges in dealing with continuous data streams are that they are many times larger than the available memory, they arrive in real time, and each new instance should be inspected at most once while predictions must still be made. Another issue with continuous real-time data is that concepts change over time, which is often called concept drift. This paper addresses these problems by proposing a real-time, scalable, and robust architecture. It is a general-purpose architecture, based on online machine learning, that efficiently logs and mines stream data in a fault-tolerant manner. It consists of two frameworks: (1) an event aggregation framework, which reliably collects events and messages from multiple sources and ships them to a destination for processing; and (2) a real-time computation framework, which processes streams online to extract information patterns. It guarantees reliable processing of billions of messages per second. Furthermore, it facilitates the evaluation of stream learning algorithms and offers change detection strategies to detect concept drift.
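
The abstract names three constraints on stream learners (bounded memory, a single pass over each instance, and concept drift) without showing how they interact. The sketch below is one possible illustration and is not taken from the paper: it runs a test-then-train (prequential) loop in which each instance is used first for prediction and then for learning, and it feeds the 0/1 prediction errors to a Page-Hinkley change detector. The learner (`NearestMeanLearner`), the synthetic stream (`toy_stream`), and all parameter values are assumptions chosen for illustration; the paper's own algorithms and frameworks may differ.

```python
class PageHinkley:
    """Page-Hinkley test: flags a sustained increase in the mean of a signal."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated drift magnitude (assumed value)
        self.threshold = threshold  # alarm threshold (assumed value)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.min_cum = 0.0

    def update(self, x):
        """Feed one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


class NearestMeanLearner:
    """Hypothetical 1-D online classifier: one running mean per class."""

    def __init__(self):
        self.count = {}
        self.mean = {}

    def predict(self, x):
        if not self.mean:
            return 0  # arbitrary default before any training
        return min(self.mean, key=lambda c: abs(x - self.mean[c]))

    def learn(self, x, y):
        self.count[y] = self.count.get(y, 0) + 1
        m = self.mean.get(y, 0.0)
        self.mean[y] = m + (x - m) / self.count[y]


def toy_stream(n=400, flip_at=200):
    """Synthetic stream whose feature/label relation inverts at flip_at."""
    for i in range(n):
        y = i % 2
        x = float(y) if i < flip_at else float(1 - y)
        yield x, y


def prequential(stream, make_learner, make_detector):
    """Test-then-train: each instance is seen once, predicted, then learned."""
    learner, detector = make_learner(), make_detector()
    correct = total = 0
    drift_points = []
    for i, (x, y) in enumerate(stream):
        error = int(learner.predict(x) != y)   # predict first
        correct += 1 - error
        total += 1
        if detector.update(error):             # drift: discard the old model
            drift_points.append(i)
            learner, detector = make_learner(), make_detector()
        learner.learn(x, y)                    # then train on the same instance
    return correct / total, drift_points


if __name__ == "__main__":
    accuracy, drifts = prequential(toy_stream(), NearestMeanLearner, PageHinkley)
    print("accuracy:", accuracy, "drift detected at:", drifts)
```

In this toy setting the memory footprint is constant regardless of stream length, and resetting the model on an alarm lets it recover a few instances after the concept flips instead of slowly averaging the old concept away. Resetting is only one of several possible responses to drift; it is used here because it is the simplest to follow.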
- Subjects :
- Concept drift
Process (engineering)
Data stream mining
Event (computing)
Computer science
Online machine learning
02 engineering and technology
computer.software_genre
020204 information systems
Scalability
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Data mining
Throughput (business)
computer
Change detection
Details
- ISBN :
- 978-981-10-0417-9
- ISBNs :
- 9789811004179
- Database :
- OpenAIRE
- Journal :
- Advances in Intelligent Systems and Computing ISBN: 9789811004179
- Accession number :
- edsair.doi...........432155546737b9409b8f8f5de600298d