1. Answering Approximate Queries Over XML Data
- Author
- D. L. Yan and Jian Liu
- Subjects
Information retrieval; Computer science; Applied Mathematics; Efficient XML Interchange; XML validation; 02 engineering and technology; computer.file_format; computer.software_genre; Query optimization; Query language; Spatial query; Query expansion; XML database; Computational Theory and Mathematics; Artificial Intelligence; Control and Systems Engineering; Web query classification; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; computer
- Abstract
With the increasing popularity of XML for data representation, there is considerable interest in searching XML data. Because XML data is structurally heterogeneous and its textual content is diverse, it is daunting for users to formulate exact queries and find accurate answers. Approximate matching has therefore been introduced to ease the difficulty of answering users’ queries: the structure and content of a given query are first relaxed, and answers that match the relaxed queries are then sought. Ranking and returning the most relevant results of a query has become the most popular paradigm in XML query processing. However, existing proposals do not adequately take structure into account and therefore cannot elegantly combine structure with content when answering relaxed queries. To address this problem, we first propose a sophisticated framework of query relaxations for supporting approximate queries over XML data. Answers under this framework are not required to strictly satisfy the given query formulation; instead, they can be founded on properties inferable from the original query. We then develop a novel top-k retrieval approach that generates the most promising answers in an order correlated with the ranking measure. We complement the work with a comprehensive set of experiments that show the effectiveness of the proposed approach in terms of precision and recall.
- Published
- 2016
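
The abstract describes relaxing a query's structure and content and then ranking the answers that match the relaxed forms. The sketch below is only a toy illustration of that general idea, not the authors' framework or their top-k algorithm: the sample document, the three relaxation steps, the penalty weights (1.0, 0.7, 0.4), and the function names `relaxed_queries` and `top_k` are all assumptions made for this example. It also evaluates every relaxation exhaustively and ranks afterwards, whereas the paper's approach is said to generate answers in an order correlated with the ranking measure.

```python
import heapq
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this illustration.
DOC = """
<library>
  <book><title>XML query processing</title></book>
  <book><info><title>Approximate XML search</title></info></book>
  <article><title>XML ranking</title></article>
  <book><title>Databases</title></book>
</library>
"""


def relaxed_queries(parent, child):
    """Yield (xpath, structural_score) pairs, strictest first.

    The weights are arbitrary assumptions, not values from the paper.
    """
    yield f".//{parent}/{child}", 1.0    # original parent/child edge
    yield f".//{parent}//{child}", 0.7   # edge relaxed to ancestor/descendant
    yield f".//{child}", 0.4             # parent constraint dropped


def top_k(root, parent, child, keywords, k):
    """Return the k best (score, text) answers over all relaxed queries."""
    best = {}  # id(element) -> (score, text); first hit is the strictest match
    for xpath, structural in relaxed_queries(parent, child):
        for el in root.findall(xpath):
            if id(el) in best:
                continue  # already matched by a stricter query
            text = (el.text or "").strip()
            words = set(text.lower().split())
            # Content relaxation: partial keyword matches earn partial credit.
            content = sum(kw.lower() in words for kw in keywords) / len(keywords)
            score = structural * content
            if score > 0:
                best[id(el)] = (score, text)
    return heapq.nlargest(k, best.values(), key=lambda pair: pair[0])


if __name__ == "__main__":
    root = ET.fromstring(DOC)
    for score, text in top_k(root, "book", "title", ["xml", "search"], 2):
        print(f"{score:.2f}  {text}")
```

In this toy setup, a title that matches both keywords under a relaxed structure can outrank a title that matches the exact structure but only one keyword, which is the kind of structure-and-content trade-off the abstract refers to.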