Parallel computation of probabilistic skyline queries using MapReduce
- Source :
- The Journal of Supercomputing. 77:418-444
- Publication Year :
- 2020
- Publisher :
- Springer Science and Business Media LLC, 2020.
Abstract
- In recent years, numerous applications have been continuously generating large amounts of uncertain data. Advanced analysis queries such as skyline operators are essential for extracting interesting objects from vast uncertain datasets. Recently, the MapReduce system has been widely used in big data analysis. However, because the probabilistic skyline query is not decomposable, implementing it in the MapReduce framework is not straightforward. This paper proposes an effective parallel method called parallel computation of probabilistic skyline query (PCPS) that can compute the probabilistic skyline set in one MapReduce computation pass. The proposed method takes into account the critical sections and detects data with a high probability of existence through a proposed smart sampling algorithm. PCPS implements a new approach to the fair allocation of input data. The experimental results indicate that the proposed approach not only reduces the processing time of probabilistic skyline queries but also achieves fair precision across varying dimensionality.
- Subjects :
- Skyline
Measure (data warehouse)
Uncertain data
Computer science
business.industry
Computation
Big data
InformationSystems_DATABASEMANAGEMENT
Sampling (statistics)
Parallel computing
Theoretical Computer Science
Set (abstract data type)
Hardware and Architecture
business
Software
Information Systems
Curse of dimensionality
Details
- ISSN :
- 1573-0484 and 0920-8542
- Volume :
- 77
- Database :
- OpenAIRE
- Journal :
- The Journal of Supercomputing
- Accession number :
- edsair.doi...........c9b013af70f45667e80839851370e3d5
- Full Text :
- https://doi.org/10.1007/s11227-020-03279-x