Eventually consistent cardinality estimation with applications in biodata mining

Authors :
Stavros Kontopoulos
Georgios Drakopoulos
Christos Makris
Source :
SAC
Publication Year :
2016
Publisher :
ACM, 2016.

Abstract

Large-set cardinality estimators and other streaming-oriented operations are a cornerstone of big data processing. Combined with in-memory storage systems, cardinality estimators provide a fast framework for keeping valuable application data easily queryable and maintainable. This has many applications. For instance, a common use case is maintaining a number of counters that monitor application statistics for real-time dashboards. Another is estimating the size of large sets for internal operations, such as counting, in big data systems. This paper addresses the issue of scaling the computation of a cardinality estimator in a distributed setting in the presence of node failures. Moreover, eventual consistency is proved for the proposed estimation technique, which is adequate for most distributed applications. To the best of the authors' knowledge, this functionality is not currently provided by commonly used commercial and open source systems. Additionally, the proposed approach is generic enough to be applied to other algorithms, which can help build a basic framework for more complex operations in the big data field. We demonstrate this with graph metric calculation applications in the large-scale biodata mining field.
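
The abstract does not say which estimator the paper uses or how replicas are reconciled. As an illustration only, the sketch below assumes a HyperLogLog-style estimator whose replicas are merged by taking the register-wise maximum. That merge is commutative, associative, and idempotent, which is the kind of property that lets replicas converge to the same estimate regardless of delivery order or duplicated messages. The class name, the parameter p, and the use of SHA-1 as the hash are assumptions made for this example, not details taken from the paper.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch (illustrative; not the paper's implementation)."""

    def __init__(self, p=10):
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # number of registers (1024 for p=10)
        self.registers = [0] * self.m

    def add(self, item):
        # Stable 64-bit hash, so every replica maps an item to the same register.
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits select the register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Register-wise max: commutative, associative, and idempotent.
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return self

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)      # bias constant, valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


# Two hypothetical replicas observe overlapping parts of the same stream.
replica_a, replica_b = HyperLogLog(), HyperLogLog()
for i in range(0, 6000):
    replica_a.add(i)
for i in range(4000, 10000):
    replica_b.add(i)

# Merging in either order, or re-delivering a replica, gives identical registers.
merged_ab = HyperLogLog().merge(replica_a).merge(replica_b)
merged_ba = HyperLogLog().merge(replica_b).merge(replica_a).merge(replica_a)
print(merged_ab.registers == merged_ba.registers)
print(round(merged_ab.estimate()))
```

With 1024 registers the standard error of such a sketch is roughly 3%, so the merged estimate should land near the 10,000 distinct items seen across both replicas. Handling node failures, as the paper sets out to do, would need additional machinery that this example does not attempt to show.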

Details

Database :
OpenAIRE
Journal :
Proceedings of the 31st Annual ACM Symposium on Applied Computing
Accession number :
edsair.doi...........3b1ffa002d0ee6e0a86a07872b3312db