Prefetched wald adaptive boost classification based Czekanowski similarity MapReduce for user query processing with bigdata
- Source :
- Distributed and Parallel Databases. 39:855-872
- Publication Year :
- 2021
- Publisher :
- Springer Science and Business Media LLC, 2021.
Abstract
- The large volumes of data generated in recent years, together with the rise of big data analytics on social media, necessitate accurate user query processing with minimum time complexity. Several research works have addressed the accuracy and time complexity of query processing. In this work, the Wald Adaptive Prefetched Boosting Classification based Czekanowski Similarity MapReduce (WAPBC–CSMR) technique is introduced. The WAPBC–CSMR technique uses a big dataset to process a large number of user queries. First, a technique called Wald Adaptive Prefetched Boosting is employed to classify the big dataset into different classes. To reduce the time involved in classification, a classifier called Gaussian distributive Rocchio is used, which performs the classification in minimum time. A Likelihood Ratio Test is then applied to combine the weak learner results into a strong classification result. The classified and refined data are stored in the prefetcher cache. When the prefetch manager receives multi-dimensional user queries, each query is split into multiple keywords and fed into the map phase, where the mapping function uses the Czekanowski Similarity Index to identify repeated jobs with maximum query processing accuracy. The relevant data are then retrieved from the prefetcher cache, and repeated user query tasks are removed in the reduce phase via a statistical function, which minimizes processing time. WAPBC–CSMR is evaluated on a big dataset using query processing accuracy, error rate and processing time as metrics, for varying numbers of user queries. The results show that the WAPBC–CSMR technique improves query processing accuracy and reduces both processing time and error rate compared with conventional methods.
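The map and reduce steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the Czekanowski index is computed over keyword counts as 2 * (shared keywords) / (total keywords), and the function names `czekanowski_similarity`, `map_phase`, `reduce_phase` and the `threshold` value are hypothetical choices made for illustration only.

```python
from collections import Counter


def czekanowski_similarity(query_a: str, query_b: str) -> float:
    """Czekanowski index between the keyword multisets of two queries."""
    a = Counter(query_a.lower().split())
    b = Counter(query_b.lower().split())
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 1.0  # assumption: two empty queries count as identical
    shared = sum((a & b).values())  # sum of per-keyword minimum counts
    return 2.0 * shared / total


def map_phase(queries, threshold=0.8):
    """Emit (representative_index, query_index) pairs.

    A query is mapped to the first earlier representative whose similarity
    reaches the (assumed) threshold; otherwise it becomes a representative.
    """
    representatives = []
    pairs = []
    for i, query in enumerate(queries):
        for rep in representatives:
            if czekanowski_similarity(queries[rep], query) >= threshold:
                pairs.append((rep, i))
                break
        else:
            representatives.append(i)
            pairs.append((i, i))
    return pairs


def reduce_phase(pairs):
    """Group query indices by representative so repeats are handled once."""
    groups = {}
    for rep, idx in pairs:
        groups.setdefault(rep, []).append(idx)
    return groups
```

In this sketch, each key of the dictionary returned by `reduce_phase` stands for one job that would actually be executed against the cache, and the values list the queries that share its result.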
- Subjects :
- Instruction prefetch
Information Systems and Management
Boosting (machine learning)
business.industry
Computer science
Big data
Word error rate
02 engineering and technology
Data structure
computer.software_genre
Hardware and Architecture
020204 information systems
Classifier (linguistics)
0202 electrical engineering, electronic engineering, information engineering
Cache
Data mining
business
Time complexity
computer
Software
Information Systems
Details
- ISSN :
- 1573-7578 and 0926-8782
- Volume :
- 39
- Database :
- OpenAIRE
- Journal :
- Distributed and Parallel Databases
- Accession number :
- edsair.doi...........4bd090fec09a5c9fd7764d051690a95b
- Full Text :
- https://doi.org/10.1007/s10619-020-07319-6