3 results
Search Results
2. Logical Schema Design that Quantifies Update Inefficiency and Join Efficiency.
- Author
- LINK, SEBASTIAN and ZIHENG WEI
- Subjects
- ALGORITHMS, SCHEMAS (Psychology), DECOMPOSITION method, REDUNDANCY (Linguistics), NORMAL forms (Mathematics)
- Abstract
The goal of classical normalization is to maintain data consistency under updates, with a minimum level of effort. Given functional dependencies (FDs) alone, this goal is only achievable in the special case an FD-preserving Boyce-Codd Normal Form (BCNF) decomposition exists. As we show, in all other cases the level of effort can be neither controlled nor quantified. In response, we establish the l-Bounded Cardinality Normal Form, parameterized by a positive integer l. For every l, the normal form condition requires from every instance that every value combination over the left-hand side of every non-trivial FD does not occur in more than l tuples. BCNF is captured when l = 1. We demonstrate that schemata in this normal form characterize the instances that are i) free from level l data redundancy and update inefficiency, and ii) permit level l join efficiency. We establish algorithms that compute schemata in l-Bounded Cardinality Normal Form for the smallest level l attainable across all FD-preserving decompositions. Additional algorithms i) attain even smaller levels of effort based on the loss of some FDs, and ii) decompose schemata based on prioritized FDs that cause high levels of effort. Our framework informs de-normalization already during logical design. In particular, level l quantifies both the incremental maintenance and join support of materialized views. Experiments with synthetic and real-world data illustrate which properties the schemata have that result from our algorithms, and how these properties predict the performance of update and query operations on instances over the schemata, without and with materialized views. [ABSTRACT FROM AUTHOR]
- Published
- 2021
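The instance-level condition stated in the abstract above (no value combination over the left-hand side of a non-trivial FD occurs in more than l tuples) can be illustrated with a small check. This is only a sketch under stated assumptions, not the authors' algorithm: the function names, the dictionary-based row format, and the sample employee data are all hypothetical, and rows are assumed to be distinct tuples.

```python
from collections import Counter

def max_lhs_cardinality(rows, fds):
    """Largest number of rows that share one value combination over the
    left-hand side of any non-trivial FD (hypothetical helper)."""
    worst = 0
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial FD: the condition does not apply
        # Count how often each LHS value combination occurs in the instance.
        counts = Counter(tuple(row[attr] for attr in lhs) for row in rows)
        if counts:
            worst = max(worst, max(counts.values()))
    return worst

def satisfies_bound(rows, fds, l):
    """True if the instance respects the level-l cardinality bound."""
    return max_lhs_cardinality(rows, fds) <= l

# Hypothetical sample instance: each department has one manager.
rows = [
    {"emp": "ann", "dept": "sales", "mgr": "bob"},
    {"emp": "cat", "dept": "sales", "mgr": "bob"},
    {"emp": "dan", "dept": "it", "mgr": "eve"},
]
fds = [(("dept",), ("mgr",))]  # FD: dept -> mgr

print(max_lhs_cardinality(rows, fds))
print(satisfies_bound(rows, fds, 1), satisfies_bound(rows, fds, 2))
```

In this sample the value "sales" occurs in two rows, so the instance meets the bound for l = 2 but not for l = 1, the level at which the abstract says BCNF is captured.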
3. Algorithms for the Discovery of Embedded Functional Dependencies.
- Author
- ZIHENG WEI, HARTMANN, SVEN, and LINK, SEBASTIAN
- Subjects
- DISCOVERY (Law), MISSING data (Statistics), FUNCTIONAL dependencies, DATA integration, ALGORITHMS
- Abstract
Embedded functional dependencies (eFDs) were recently introduced to tailor relational schema design to data completeness requirements of applications. They also facilitate data cleaning and data integration. A problem that is essential to unlocking these applications is the discovery of all eFDs that hold on a given data set. We show that the discovery problem of eFDs is NP-complete, W[2]-complete in the output, and has a minimum solution space that is larger than the maximum solution space for functional dependencies. Despite these computational challenges, we use novel data structures and search strategies to develop row-efficient, column-efficient, and hybrid algorithms that can efficiently solve the discovery problem for eFDs on large real-world benchmark data sets. Our experiments also demonstrate that the algorithms scale well in terms of their design targets, and that ranking the eFDs by the number of redundant data values they cause can provide useful guidance in identifying meaningful eFDs for applications. Finally, we demonstrate the benefits of introducing completeness requirements and ranking by the number of redundant data values for approximate and genuine functional dependencies. [ABSTRACT FROM AUTHOR]
- Published
- 2020