Data-induced predicates for sideways information passing in query optimizers

Authors :
Srikanth Kandula
Surajit Chaudhuri
Laurel Orr
Source :
The VLDB Journal. 31:1263-1290
Publication Year :
2021
Publisher :
Springer Science and Business Media LLC, 2021.

Abstract

Using data statistics, we convert predicates on a table into data-induced predicates (diPs) that apply on the joining tables. Doing so substantially speeds up multi-relation queries because the benefits of predicate pushdown can now apply beyond just the tables that have predicates. We use diPs to skip data exclusively during query optimization; i.e., diPs lead to better plans and have no overhead during query execution. We study how to apply diPs for complex query expressions and how the usefulness of diPs varies with the data statistics used to construct diPs and the data distributions. Our results show that building diPs using zone-maps, which are already maintained in today’s clusters, leads to sizable data skipping gains. Using a new (slightly larger) statistic, 50% of the queries in the TPC-H, TPC-DS and JoinOrder benchmarks can skip at least 33% of the query input. Consequently, the median query in a production big-data cluster finishes roughly 2× faster.
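
To make the idea in the abstract concrete, the following is a minimal sketch of how a predicate on one table might be turned into a range predicate on a joining table using per-partition zone-maps (min/max per column). It is an illustration under stated assumptions, not the authors' implementation: the names `Partition`, `derive_dip`, and `surviving`, the column names, and all data values are hypothetical, and the sketch assumes a single equi-join column with integer keys.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # inclusive [lo, hi]


@dataclass
class Partition:
    name: str
    zone_map: Dict[str, Range]  # column -> (min, max) over the partition's rows


def overlaps(a: Range, b: Range) -> bool:
    """True if two inclusive ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


def derive_dip(partitions: List[Partition], pred_col: str,
               pred_range: Range, join_col: str) -> List[Range]:
    """Hypothetical helper: collect join-column ranges of the partitions
    that may satisfy the predicate, then merge overlapping ranges."""
    ranges = sorted(p.zone_map[join_col] for p in partitions
                    if overlaps(p.zone_map[pred_col], pred_range))
    merged: List[Range] = []
    for lo, hi in ranges:
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def surviving(partitions: List[Partition], join_col: str,
              dip: List[Range]) -> List[str]:
    """Names of partitions whose join-column zone-map overlaps the diP."""
    return [p.name for p in partitions
            if any(overlaps(p.zone_map[join_col], r) for r in dip)]


# Made-up dimension table with a predicate d_year = 2000.
dim = [
    Partition("d0", {"d_year": (1998, 1999), "d_date_sk": (1, 730)}),
    Partition("d1", {"d_year": (2000, 2001), "d_date_sk": (731, 1461)}),
    Partition("d2", {"d_year": (2002, 2003), "d_date_sk": (1462, 2191)}),
]
# Made-up fact table that has no predicate of its own.
fact = [
    Partition("f0", {"sold_date_sk": (1, 700)}),
    Partition("f1", {"sold_date_sk": (650, 900)}),
    Partition("f2", {"sold_date_sk": (1500, 2100)}),
    Partition("f3", {"sold_date_sk": (1400, 1600)}),
]

dip = derive_dip(dim, "d_year", (2000, 2000), "d_date_sk")
kept = surviving(fact, "sold_date_sk", dip)
print(dip)   # [(731, 1461)]
print(kept)  # ['f1', 'f3']
```

In this toy setup, two of the four fact-table partitions can be skipped before execution even though the query places no predicate on the fact table. The derived ranges are conservative: a kept partition may still contain no matching rows, but a skipped partition cannot contain any.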

Details

ISSN :
0949-877X and 1066-8888
Volume :
31
Database :
OpenAIRE
Journal :
The VLDB Journal
Accession number :
edsair.doi...........e4bd6ef4f3667fd0f645f6a10ef81890