Scalable real-time OLAP on cloud architectures
- Source :
- Journal of Parallel and Distributed Computing, pp. 31-41
- Publication Year :
- 2015
- Publisher :
- Elsevier BV, 2015.
Abstract
- In contrast to queries for on-line transaction processing (OLTP) systems, which typically access only a small portion of a database, OLAP queries may need to aggregate large portions of a database, which often leads to performance issues. In this paper we introduce CR-OLAP, a scalable cloud-based real-time OLAP system built on a new distributed index structure for OLAP, the distributed PDCR tree. CR-OLAP utilizes a scalable cloud infrastructure consisting of multiple commodity servers (processors): as the database grows, CR-OLAP dynamically increases the number of processors to maintain performance. Our distributed PDCR tree data structure supports multiple dimension hierarchies and efficient query processing on the elaborate dimension hierarchies that are so central to OLAP systems. It is particularly efficient for complex OLAP queries that need to aggregate large portions of the data warehouse, such as "report the total sales in all stores located in California and New York during the months February-May of all years". We evaluated CR-OLAP on the Amazon EC2 cloud, using the TPC-DS benchmark data set. The tests demonstrate that CR-OLAP scales well with an increasing number of processors, even for complex queries. For example, for an Amazon EC2 cloud instance with 16 processors, a data warehouse with 160 million tuples, and a TPC-DS OLAP query stream where each query aggregates between 60% and 95% of the database, CR-OLAP achieved a query latency below 0.3 s, which can be considered a real-time response. Collaboration with IBM on alleviating performance bottlenecks for OLAP queries. OLAP queries may aggregate large portions of the database, creating bottlenecks. We study the use of parallel computing on scalable clouds to accelerate queries. Our system, CR-OLAP, is based on a new scalable distributed index structure. CR-OLAP uses dynamic cloud elasticity to improve performance.
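The example query quoted in the abstract can be made concrete with a minimal sketch. It only illustrates what the query asks for (a sum restricted by values in two dimension hierarchies, store location and date); it is a brute-force scan over a toy in-memory table, not the distributed PDCR tree or any part of CR-OLAP. The `Sale` record, its fields, the `total_sales` helper, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    """One hypothetical fact-table row (all fields are illustrative assumptions)."""
    state: str     # store hierarchy, state level
    city: str      # store hierarchy, city level
    year: int      # date hierarchy, year level
    month: int     # date hierarchy, month level (1-12)
    amount: float  # the measure being aggregated


def total_sales(facts, states, months):
    """Sum `amount` over rows whose state is in `states` and whose month
    is in `months`, across all years. This is a plain linear scan."""
    return sum(f.amount for f in facts
               if f.state in states and f.month in months)


# Toy data, made up for this sketch.
FACTS = [
    Sale("California", "San Diego", 2013, 2, 100.0),
    Sale("California", "Fresno", 2014, 6, 50.0),   # excluded: June
    Sale("New York", "Albany", 2013, 5, 25.0),
    Sale("Texas", "Austin", 2014, 3, 75.0),        # excluded: Texas
    Sale("New York", "Buffalo", 2014, 4, 10.0),
]

# "Total sales in all stores located in California and New York
# during the months February-May of all years."
result = total_sales(FACTS, {"California", "New York"}, range(2, 6))
print(result)
```

A scan like this touches every row, which is the cost the abstract attributes to queries that aggregate large portions of the database; an index over the dimension hierarchies is meant to avoid it.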
- Subjects :
- Computer Networks and Communications
Computer science
InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL
Cloud computing
02 engineering and technology
computer.software_genre
Theoretical Computer Science
Artificial Intelligence
020204 information systems
Server
0202 electrical engineering, electronic engineering, information engineering
Database
business.industry
Transaction processing
Online analytical processing
InformationSystems_DATABASEMANAGEMENT
Data warehouse
Tree (data structure)
Hardware and Architecture
Scalability
Online transaction processing
020201 artificial intelligence & image processing
Tuple
business
computer
Software
Details
- ISSN :
- 07437315
- Database :
- OpenAIRE
- Journal :
- Journal of Parallel and Distributed Computing
- Accession number :
- edsair.doi...........ff618306c14875ab2ff83db968d6cd3a
- Full Text :
- https://doi.org/10.1016/j.jpdc.2014.08.006