Dominique Laurent, Nicolas Spyratos
Affiliations: Equipes Traitement de l'Information et Systèmes (ETIS - UMR 8051), CY Cergy Paris Université (CY), Ecole Nationale Supérieure de l'Electronique et de ses Applications (ENSEA), Centre National de la Recherche Scientifique (CNRS); Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Université Paris-Saclay, CentraleSupélec, Centre National de la Recherche Scientifique (CNRS), Institut National de Recherche en Informatique et en Automatique (Inria)
In this paper we address the problem of handling inconsistencies in tables with missing values (also called nulls) and functional dependencies. Although the traditional view is that table instances must respect all functional dependencies imposed on them, it is nevertheless relevant to develop theories about how to handle instances that violate some dependencies. Regarding missing values, we make no assumptions on their existence: a missing value exists only if it is inferred from the functional dependencies of the table. We propose a formal framework in which each tuple of a table is associated with one of four truth values: true, false, inconsistent or unknown. We show that this framework can be used to study important problems such as consistent query answering, table merging, and data quality measures, to mention just a few. In this paper, however, we focus mainly on consistent query answering, a problem that has received considerable attention over the past decades. The main contributions of the paper are the following: (a) we introduce a new approach to handling inconsistencies in a table with nulls and functional dependencies, (b) we give algorithms for computing all true, inconsistent and false tuples, (c) we investigate the relationship between our approach and four-valued logic in the context of data merging, and (d) we give a novel solution to the consistent query answering problem and compare it with the approach based on table repairs.

In the present version, the following changes have been made with respect to the previous version: (1) the proofs of Lemmas 1, 2 and 3 and of Proposition 2 have been rewritten; (2) a new definition of consistent answer is given and compared with existing approaches based on repairs.
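
As a minimal illustration of the setting, and not one of the algorithms of the paper, the following Python sketch shows what it means for a table with nulls to violate a functional dependency. The table contents, the attribute names (`Emp`, `Dept`, `Addr`), the function name `fd_violations` and the choice to represent nulls by `None` are all assumptions made for this example. The sketch only flags groups of tuples that agree on the left-hand side of a dependency and disagree on the right-hand side; it does not assign the four truth values of the framework.

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Return the lhs values that are associated with more than one rhs value.

    rows: list of dicts mapping attribute names to values (None = null).
    lhs:  list of attribute names on the left-hand side of the dependency.
    rhs:  single attribute name on the right-hand side.
    Rows with a null in any lhs attribute or in rhs are skipped, since a
    null carries no value that could contradict another tuple.
    """
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row.get(a) for a in lhs)
        val = row.get(rhs)
        if None in key or val is None:
            continue
        seen[key].add(val)
    # Keep only the lhs values mapped to several distinct rhs values.
    return {k: v for k, v in seen.items() if len(v) > 1}


# Hypothetical table: employee e1 appears with two different departments.
rows = [
    {"Emp": "e1", "Dept": "d1", "Addr": None},
    {"Emp": "e1", "Dept": "d2", "Addr": "a1"},
    {"Emp": "e2", "Dept": "d1", "Addr": None},
]

violations = fd_violations(rows, ["Emp"], "Dept")
```

Here `violations` records that `Emp -> Dept` is violated for `e1`, while the same call for `Dept -> Addr` finds no violation because the nulls in `Addr` are skipped.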