Auto-Join
- Source :
- Proceedings of the VLDB Endowment; June 2017, Vol. 10, Issue 10, pp. 1034-1045 (12 pp.)
- Publication Year :
- 2017
Abstract
- Traditional equi-join relies solely on string equality comparisons to perform joins. However, in scenarios such as ad-hoc data analysis in spreadsheets, users increasingly need to join tables whose join-columns are from the same semantic domain but use different textual representations, for which transformations are needed before equi-join can be performed. We developed Auto-Join, a system that can automatically search over a rich space of operators to compose a transformation program, whose execution makes input tables equi-join-able. We developed an optimal sampling strategy that allows Auto-Join to scale to large datasets efficiently, while ensuring joins succeed with high probability. Our evaluation using real test cases collected from both public web tables and proprietary enterprise tables shows that the proposed system performs the desired transformation joins efficiently and with high quality.
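To make the idea of a transformation join concrete, the following is a minimal sketch and not the Auto-Join algorithm itself. It assumes a tiny, hypothetical operator space (split on a separator, reorder the pieces, rejoin) and searches it by brute force for a program that maps example source keys to their target keys; the record does not describe the paper's actual operators or its sampling strategy. All names here (`split_program`, `find_program`, `transformation_join`) are illustrative.

```python
from itertools import permutations


def split_program(sep, order, joiner):
    """Build a transform: split on `sep`, reorder pieces by `order`, rejoin with `joiner`."""
    def transform(value):
        parts = [p.strip() for p in value.split(sep)]
        if len(parts) <= max(order):
            return None  # the program does not apply to this value
        return joiner.join(parts[i] for i in order)
    return transform


def find_program(examples, seps=(",", " "), joiners=(" ", ", ", "")):
    """Brute-force search over the hypothetical operator space.

    Returns the first program that maps every source key to its target key,
    together with its (sep, order, joiner) description, or (None, None).
    """
    for sep in seps:
        for n in (1, 2, 3):
            for order in permutations(range(n)):
                for joiner in joiners:
                    program = split_program(sep, order, joiner)
                    if all(program(src) == dst for src, dst in examples):
                        return program, (sep, order, joiner)
    return None, None


def transformation_join(left, right, left_key, right_key, program):
    """Equi-join `left` and `right` after transforming the left join key."""
    index = {row[right_key]: row for row in right}
    joined = []
    for row in left:
        key = program(row[left_key])
        if key in index:
            joined.append({**row, **index[key]})
    return joined


# Illustrative data: same semantic domain, different textual representations.
left = [
    {"name": "Obama, Barack", "term_start": 2009},
    {"name": "Bush, George", "term_start": 2001},
]
right = [
    {"full_name": "Barack Obama", "party": "D"},
    {"full_name": "George Bush", "party": "R"},
]

program, description = find_program([("Obama, Barack", "Barack Obama")])
rows = transformation_join(left, right, "name", "full_name", program)
```

In this sketch a single example pair is enough to pin down the program, which then makes every row of the two tables equi-joinable. A real system would need a far richer operator space and a way to pick example pairs automatically from large tables, which is where the abstract's sampling strategy comes in.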
Details
- Language :
- English
- ISSN :
- 2150-8097
- Volume :
- 10
- Issue :
- 10
- Database :
- Supplemental Index
- Journal :
- Proceedings of the VLDB Endowment
- Publication Type :
- Periodical
- Accession number :
- ejs51532367
- Full Text :
- https://doi.org/10.14778/3115404.3115409