Eventually consistent cardinality estimation with applications in biodata mining
- Source :
- SAC
- Publication Year :
- 2016
- Publisher :
- ACM, 2016.
Abstract
- Large-set cardinality estimators and other streaming-oriented operations are a cornerstone of big data processing. Combined with in-memory storage systems, cardinality estimators provide a fast framework for keeping valuable application data easily queryable and maintainable. Applications are numerous. A common use case is maintaining a number of counters that monitor application statistics for real-time dashboards. Another is large-set size estimation for the internal operations of big data systems, such as counting. This paper addresses the issue of scaling the computation of a cardinality estimator in the presence of node failures in a distributed setting. Moreover, eventual consistency is proved for the proposed estimation technique, which is adequate for most distributed applications. To the best of the authors' knowledge, this functionality is not currently provided by commonly used commercial and open source systems. Additionally, the proposed approach is generic enough to be applied to other algorithms, which can help build a basic framework for more complex operations in the big data field. We demonstrate this with graph metric calculation applications in large-scale biodata mining.
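As an illustration of why a cardinality estimator can tolerate replication and node failures, the following is a minimal sketch of a mergeable estimator in Python. It is not the technique proposed in the paper, whose details the abstract does not give; it assumes a HyperLogLog-style sketch, and the names `HyperLogLog`, `add`, `merge`, `copy`, and `estimate` are hypothetical. Because merging takes the register-wise maximum, merges are commutative, associative, and idempotent, so replicas that eventually exchange state converge to the same estimate regardless of message order or duplication.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog-style sketch (an assumption, not the paper's method)."""

    def __init__(self, p=12):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Derive a 64-bit hash from SHA-1 so results are reproducible across runs.
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Register-wise maximum: commutative, associative, and idempotent.
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def copy(self):
        clone = HyperLogLog(self.p)
        clone.registers = list(self.registers)
        return clone

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)       # bias constant, valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


# Two replicas observe overlapping parts of the same stream.
replica_a, replica_b = HyperLogLog(), HyperLogLog()
for i in range(20000):
    replica_a.add(i)
for i in range(10000, 30000):
    replica_b.add(i)

# Merging in either order yields identical state.
merged_ab = replica_a.copy()
merged_ab.merge(replica_b)
merged_ba = replica_b.copy()
merged_ba.merge(replica_a)
print(merged_ab.registers == merged_ba.registers)
print(round(merged_ab.estimate()))
```

With 4096 registers the expected relative error is on the order of a few percent, so the merged estimate should land close to the 30,000 distinct items seen across both replicas.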
- Subjects :
- 0301 basic medicine
Biodata
Computer science
business.industry
Computation
Big data
Eventual consistency
02 engineering and technology
computer.software_genre
CAP theorem
Graph
03 medical and health sciences
030104 developmental biology
Cardinality
0202 electrical engineering, electronic engineering, information engineering
Graph (abstract data type)
020201 artificial intelligence & image processing
Data mining
business
computer
Scaling
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 31st Annual ACM Symposium on Applied Computing
- Accession number :
- edsair.doi...........3b1ffa002d0ee6e0a86a07872b3312db