Scalable data analytics using crowdsourced repositories and streams.

Authors :
Veloso, Bruno
Leal, Fátima
González-Vélez, Horacio
Malheiro, Benedita
Burguillo, Juan Carlos
Source :
Journal of Parallel & Distributed Computing. Dec2018, Vol. 122, p1-10. 10p.
Publication Year :
2018

Abstract

The scalable analysis of crowdsourced data repositories and streams has quickly become a critical experimental asset in multiple fields. It enables the systematic aggregation of otherwise dispersed data sources and their efficient processing using significant amounts of computational resources. However, the considerable amount of crowdsourced social data and the numerous criteria to observe can limit analytical off-line and on-line processing due to the intrinsic computational complexity. This paper demonstrates the efficient parallelisation of profiling and recommendation algorithms using tourism crowdsourced data repositories and streams. Using the Yelp data set for restaurants, we have explored two different profiling approaches, entity-based and feature-based, using ratings, comments, and location. Concerning recommendation, we use a collaborative recommendation filter employing singular value decomposition with stochastic gradient descent (SVD-SGD). To accurately compute the final recommendations, we have applied post-recommendation filters based on venue suitability, value for money, and sentiment. Additionally, we have built a social graph for enrichment. Our master–worker implementation shows super-linear scalability for 10, 20, 30, 40, 50, and 60 concurrent instances.

Highlights
• Parallel data stream algorithm.
• Scalable recommendation engine.
• Profiling and recommendation algorithms using scalable data analytics.
• Post-recommendation filtering to refine the final recommendations.

[ABSTRACT FROM AUTHOR]
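The abstract names SVD-SGD as the collaborative filter but gives no detail, so the following is only a minimal single-process sketch of the general technique (matrix factorisation of ratings trained by stochastic gradient descent). It is not the authors' implementation and does not reproduce their master–worker parallelisation. The function names, the hyperparameter values, and the toy ratings are all assumptions made for illustration.

```python
import math
import random


def predict(mu, P, Q, u, i):
    """Predicted rating: global mean plus the user-item factor dot product."""
    return mu + sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def train_svd_sgd(ratings, n_users, n_items, n_factors=2,
                  lr=0.05, reg=0.02, epochs=200, seed=0):
    """Learn latent factors from (user, item, rating) triples with SGD.

    Hyperparameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    # Small random initial factors for every user and every item.
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    samples = list(ratings)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        rng.shuffle(samples)
        for u, i, r in samples:
            err = r - predict(mu, P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularisation.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return mu, P, Q


def rmse(ratings, mu, P, Q):
    """Root-mean-square error of the model on the given ratings."""
    se = sum((r - predict(mu, P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))


# Hypothetical toy data: 4 users x 4 venues, (user, venue, rating) triples.
TOY_RATINGS = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 4), (2, 0, 1),
    (3, 2, 4), (3, 3, 5), (3, 1, 1),
]

if __name__ == "__main__":
    mu, P, Q = train_svd_sgd(TOY_RATINGS, n_users=4, n_items=4)
    print("training RMSE:", rmse(TOY_RATINGS, mu, P, Q))
    print("predicted rating, user 0 / venue 3:", predict(mu, P, Q, 0, 3))
```

In a setting like the one the abstract describes, predictions of this kind would then pass through the post-recommendation filters (venue suitability, value for money, sentiment), which this sketch does not attempt to model.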

Details

Language :
English
ISSN :
07437315
Volume :
122
Database :
Academic Search Index
Journal :
Journal of Parallel & Distributed Computing
Publication Type :
Academic Journal
Accession number :
132289958
Full Text :
https://doi.org/10.1016/j.jpdc.2018.06.013