1. Proteus: A Self-Designing Range Filter
- Author
-
Knorr, Eric R., Lemaire, Baptiste, Lim, Andrew, Luo, Siqiang, Zhang, Huanchen, Idreos, Stratos, and Mitzenmacher, Michael
- Subjects
Computer Science - Databases ,Computer Science - Data Structures and Algorithms ,Computer Science - Machine Learning ,F.2.m ,H.3.3 - Abstract
We introduce Proteus, a novel self-designing approximate range filter, which configures itself based on sampled data in order to optimize its false positive rate (FPR) for a given space requirement. Proteus unifies the probabilistic and deterministic design spaces of state-of-the-art range filters to achieve robust performance across a larger variety of use cases. At the core of Proteus lies our Contextual Prefix FPR (CPFPR) model - a formal framework for the FPR of prefix-based filters across their design spaces. We empirically demonstrate the accuracy of our model and Proteus' ability to optimize over both synthetic workloads and real-world datasets. We further evaluate Proteus in RocksDB and show that it is able to improve end-to-end performance by as much as 5.3x over more brittle state-of-the-art methods such as SuRF and Rosetta. Our experiments also indicate that the cost of modeling is not significant compared to the end-to-end performance gains and that Proteus is robust to workload shifts., Comment: 14 pages, 9 figures, originally published in the Proceedings of the 2022 International Conference on Management of Data (SIGMOD'22), ISBN: 9781450392495
- Published
- 2022
- Full Text
- View/download PDF