Back to Search Start Over

Learning From Query-Answers.

Authors :
Meneghetti, Niccolò
Kennedy, Oliver
Gatterbauer, Wolfgang
Source :
ACM Transactions on Database Systems. Dec2018, Vol. 43 Issue 4, p1-41. 41p.
Publication Year :
2018

Abstract

Tuple-independent and disjoint-independent probabilistic databases (TI- and DI-PDBs) represent uncertain data in a factorized form as a product of independent random variables that represent either tuples (TI-PDBs) or sets of tuples (DI-PDBs). When the user submits a query, the database derives the marginal probabilities of each output-tuple, exploiting the underlying assumptions of statistical independence. While query processing in TI- and DI-PDBs has been studied extensively, limited research has been dedicated to the problems of updating or deriving the parameters from observations of query results. Addressing this problem is the main focus of this article. We first introduce Beta Probabilistic Databases (B-PDBs), a generalization of TI-PDBs designed to support both (i) belief updating and (ii) parameter learning in a principled and scalable way. The key idea of B-PDBs is to treat each parameter as a latent, Beta-distributed random variable. We show how this simple expedient enables both belief updating and parameter learning in a principled way, without imposing any burden on regular query processing. Building on B-PDBs, we then introduce Dirichlet Probabilistic Databases (D-PDBs), a generalization of DI-PDBs with similar properties. We provide the following key contributions for both B- and D-PDBs: (i) We study the complexity of performing Bayesian belief updates and devise efficient algorithms for certain tractable classes of queries; (ii) we propose a soft-EM algorithm for computing maximum-likelihood estimates of the parameters; (iii) we present an algorithm for efficiently computing conditional probabilities, allowing us to efficiently implement B- and D-PDBs via a standard relational engine; and (iv) we support our conclusions with extensive experimental results. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
03625915
Volume :
43
Issue :
4
Database :
Academic Search Index
Journal :
ACM Transactions on Database Systems
Publication Type :
Academic Journal
Accession number :
134657559
Full Text :
https://doi.org/10.1145/3277503