A Study of SQL-on-Hadoop Systems
- Source :
- Big Data Benchmarks, Performance Optimization, and Emerging Hardware (BPOE@ASPLOS/VLDB), ISBN: 9783319130200
- Publication Year :
- 2014
- Publisher :
- Springer International Publishing, 2014.
Abstract
- Hadoop is now the de facto standard for storing and processing big data, not only unstructured data but also some structured data. As a result, providing SQL analysis functionality over the big data residing in HDFS is becoming increasingly important. Hive is a pioneering system that supports SQL-like analysis of the data in HDFS. However, the performance of Hive is not satisfactory for many applications. This has led to the rapid emergence of dozens of SQL-on-Hadoop systems that try to support interactive SQL query processing over the data stored in HDFS. This paper first gives a brief technical review of recent efforts on SQL-on-Hadoop systems. We then test and compare the performance of five representative SQL-on-Hadoop systems, using queries selected or derived from the TPC-DS benchmark. Based on the results, we show that such systems can benefit further from applying many parallel query processing techniques that have been widely studied in traditional MPP analytical databases.
Details
- ISBN :
- 978-3-319-13020-0
- Database :
- OpenAIRE
- Accession number :
- edsair.doi...........004e36be6334873d5a6af4eac1203763
- Full Text :
- https://doi.org/10.1007/978-3-319-13021-7_12