Scalable real-time OLAP on cloud architectures

Authors :
R. Zhou
Andrew Rau-Chaplin
Frank Dehne
Q. Kong
Hamidreza Zaboli
Source :
Journal of Parallel and Distributed Computing. :31-41
Publication Year :
2015
Publisher :
Elsevier BV, 2015.

Abstract

In contrast to queries for on-line transaction processing (OLTP) systems, which typically access only a small portion of a database, OLAP queries may need to aggregate large portions of a database, which often leads to performance issues. In this paper we introduce CR-OLAP, a scalable Cloud-based Real-time OLAP system built on a new distributed index structure for OLAP, the distributed PDCR tree. CR-OLAP utilizes a scalable cloud infrastructure consisting of multiple commodity servers (processors). That is, with increasing database size, CR-OLAP dynamically increases the number of processors to maintain performance. Our distributed PDCR tree data structure supports multiple dimension hierarchies and efficient query processing on the elaborate dimension hierarchies that are so central to OLAP systems. It is particularly efficient for complex OLAP queries that need to aggregate large portions of the data warehouse, such as "report the total sales in all stores located in California and New York during the months February-May of all years". We evaluated CR-OLAP on the Amazon EC2 cloud, using the TPC-DS benchmark data set. The tests demonstrate that CR-OLAP scales well with an increasing number of processors, even for complex queries. For example, for an Amazon EC2 cloud instance with 16 processors, a data warehouse with 160 million tuples, and a TPC-DS OLAP query stream where each query aggregates between 60% and 95% of the database, CR-OLAP achieved a query latency below 0.3 s, which can be considered a real-time response.

Collaboration with IBM on alleviating performance bottlenecks for OLAP queries.
OLAP queries may aggregate large portions of the database, creating bottlenecks.
We study the use of parallel computing on scalable clouds to accelerate queries.
Our system, CR-OLAP, is based on a new scalable distributed index structure.
CR-OLAP uses dynamic cloud elasticity to improve performance.
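
To make the example query in the abstract concrete, the following is a minimal sketch of what "total sales in all stores located in California and New York during the months February-May of all years" computes. It is a plain linear scan over a handful of made-up tuples, not the distributed PDCR tree or any part of CR-OLAP; the `Sale` record, the `total_sales` function, and the sample data are all hypothetical names and values introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    """One fact tuple (hypothetical schema, not taken from TPC-DS)."""
    state: str     # one level of an assumed Store dimension hierarchy
    year: int      # one level of an assumed Date dimension hierarchy
    month: int     # 1..12, the level below year
    amount: float  # the measure being aggregated


def total_sales(sales, states, first_month, last_month):
    """Sum amounts for the given states and month range, across all years."""
    return sum(
        s.amount
        for s in sales
        if s.state in states and first_month <= s.month <= last_month
    )


# Made-up sample data, for illustration only.
SALES = [
    Sale("California", 2013, 2, 100.0),
    Sale("California", 2014, 6, 40.0),   # outside February-May
    Sale("New York", 2014, 5, 250.0),
    Sale("Texas", 2014, 3, 75.0),        # outside the selected states
]

result = total_sales(SALES, {"California", "New York"}, 2, 5)
print(result)
```

The query constrains two dimensions at different hierarchy levels (state, month) while leaving the year unconstrained, so a scan like this touches every tuple. That cost is the kind of bottleneck the abstract describes for queries that aggregate large portions of the data warehouse.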

Details

ISSN :
07437315
Database :
OpenAIRE
Journal :
Journal of Parallel and Distributed Computing
Accession number :
edsair.doi...........ff618306c14875ab2ff83db968d6cd3a
Full Text :
https://doi.org/10.1016/j.jpdc.2014.08.006