Identifying frequent items in distributed data sets.

Authors :
Sacha, Jan
Montresor, Alberto
Source :
Computing. Apr 2013, Vol. 95 Issue 4, p289-307. 19p.
Publication Year :
2013

Abstract

Many practical problems in computer science require the knowledge of the most frequently occurring items in a data set. Current state-of-the-art algorithms for frequent items discovery are either fully centralized or rely on node hierarchies which are inflexible and prone to failures in massively distributed systems. In this paper we describe a family of gossip-based algorithms that efficiently approximate the most frequent items in large-scale distributed datasets. We show, both analytically and using real-world datasets, that our algorithms are fast, highly scalable, and resilient to node failures. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0010-485X
Volume :
95
Issue :
4
Database :
Academic Search Index
Journal :
Computing
Publication Type :
Academic Journal
Accession number :
86449804
Full Text :
https://doi.org/10.1007/s00607-012-0220-1