1. Logical Schema Design that Quantifies Update Inefficiency and Join Efficiency
- Author
-
Sebastian Link and Ziheng Wei
- Subjects
Theoretical computer science ,Computer science ,Materialized view ,Parameterized complexity ,020207 software engineering ,02 engineering and technology ,Data redundancy ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,Cardinality (SQL statements) ,Special case ,Tuple ,Functional dependency ,Integer (computer science) - Abstract
The goal of classical normalization is to maintain data consistency under updates, with a minimum level of effort. Given functional dependencies (FDs) alone, this goal is only achievable in the special case an FD-preserving Boyce-Codd Normal Form (BCNF) decomposition exists. As we show, in all other cases the level of effort can be neither controlled nor quantified. In response, we establish the l-Bounded Cardinality Normal Form, parameterized by a positive integer l. For every l, the normal form condition requires from every instance that every value combination over the left-hand side of every non-trivial FD does not occur in more than l tuples. BCNF is captured when l=1. We demonstrate that schemata in this normal form characterize the instances that are i) free from level l data redundancy and update inefficiency, and ii) permit level l join efficiency. We establish algorithms that compute schemata in l-Bounded Cardinality Normal Form for the smallest level l attainable across all FD-preserving decompositions. Additional algorithms i) attain even smaller levels of effort based on the loss of some FDs, and ii) decompose schemata based on prioritized FDs that cause high levels of effort. Our framework informs de-normalization already during logical design. In particular, level l quantifies both the incremental maintenance and join support of materialized views. Experiments with synthetic and real-world data illustrate which properties the schemata have that result from our algorithms, and how these properties predict the performance of update and query operations on instances over the schemata, without and with materialized views.
- Published
- 2021