4,571 results on '"XML validation"'
Search Results
2. RELATIONAL STORAGE FOR XML RULES
- Author
- Abd El-Aziz A.A
- Subjects
Document Structure Description; XML Encryption; Information retrieval; XML Security, XML Rules, Relational Database, XPath queries, SQL; Database; Computer science; Efficient XML Interchange; XML Signature; InformationSystems_DATABASEMANAGEMENT; XML validation; computer.file_format; computer.software_genre; XML database; XML Schema Editor; Streaming XML; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; computer
- Abstract
Little research has addressed XML security over relational databases, even though XML has become the de facto standard for data representation and exchange on the internet and many XML documents are stored in RDBMSs. In [14], the author proposed an access control model for schema-based storage of XML documents in relational storage, translating XML access control rules into relational access control rules. However, the proposed algorithms had performance drawbacks. In this paper, we use the same access control model as [14] and try to overcome its drawbacks by proposing an efficient technique for storing the XML access control rules in a relational storage of an XML DTD. The mapping of the XML DTD to a relational schema is proposed in [7]. We also propose an algorithm that translates XPath queries to SQL queries based on the mapping algorithm in [7].
- Published
- 2021
3. Extensible Binary Meta Language
- Author
- Steve Lhomme, Moritz Bunkus, and Dave Rice
- Subjects
Document Structure Description; Database; Programming language; computer.internet_protocol; Computer science; Efficient XML Interchange; XML validation; Well-formed document; Document type definition; computer.file_format; computer.software_genre; XML Schema Editor; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; XML schema; computer; XML; computer.programming_language
- Abstract
This document defines the Extensible Binary Meta Language (EBML) format as a generalized file format for any type of data in a hierarchical form. EBML is designed as a binary equivalent to XML and uses a storage-efficient approach to build nested Elements with identifiers, lengths, and values. Similar to how an XML Schema defines the structure and semantics of an XML Document, this document defines how EBML Schemas are created to convey the semantics of an EBML Document.
- Published
- 2020
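As an illustration of the "identifiers, lengths, and values" idea in the abstract above, here is a minimal sketch of nested identifier/length/value elements. It is not the EBML wire format: the fixed 1-byte identifier, the 4-byte length, the `MASTER_IDS` set, and the function names are all assumptions made for this example, whereas real EBML encodes identifiers and sizes as variable-length integers and takes the element types from an EBML Schema.

```python
import struct
from typing import List, Set, Tuple, Union

# Hypothetical element model: (identifier, raw bytes) for a leaf,
# or (identifier, list of child elements) for a container.
Element = Tuple[int, Union[bytes, List["Element"]]]

# Assumption: a schema tells the decoder which identifiers are containers.
MASTER_IDS: Set[int] = {0x01}

HEADER = ">BI"  # 1-byte identifier, 4-byte big-endian length (simplification)
HEADER_SIZE = struct.calcsize(HEADER)


def encode(element: Element) -> bytes:
    """Serialize one element as identifier + payload length + payload."""
    ident, value = element
    if isinstance(value, bytes):
        payload = value
    else:
        payload = b"".join(encode(child) for child in value)
    return struct.pack(HEADER, ident, len(payload)) + payload


def decode(data: bytes, master_ids: Set[int] = MASTER_IDS) -> List[Element]:
    """Parse a byte string into a list of (possibly nested) elements."""
    elements: List[Element] = []
    pos = 0
    while pos < len(data):
        ident, length = struct.unpack_from(HEADER, data, pos)
        pos += HEADER_SIZE
        payload = data[pos:pos + length]
        pos += length
        if ident in master_ids:
            # Containers hold further elements, so recurse into the payload.
            elements.append((ident, decode(payload, master_ids)))
        else:
            elements.append((ident, payload))
    return elements


doc: Element = (0x01, [(0x02, b"hello"), (0x01, [(0x03, b"\x2a")])])
encoded = encode(doc)
decoded = decode(encoded)
```

Because each element carries its own length, a reader can skip an element it does not understand without parsing its contents, which is the main practical benefit of this kind of layout.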
4. Consistencies of fuzzy spatiotemporal data in XML documents
- Author
- Zongmin Ma, Li Yan, Yoshiharu Ishikawa, and Luyi Bai
- Subjects
Spatiotemporal database; Logic; Computer science; computer.internet_protocol; 02 engineering and technology; computer.software_genre; Fuzzy logic; Consistency (database systems); Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; XML schema; Nonlinear Sciences::Pattern Formation and Solitons; Computer Science::Databases; computer.programming_language; Structure (mathematical logic); Information retrieval; 05 social sciences; 050301 education; XML validation; Data model; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; ComputingMethodologies_GENERAL; Data mining; 0503 education; computer; XML
- Abstract
Research on XML-based spatiotemporal data has received increasing attention because XML offers advantages such as extensibility and flexibility. Although XML has been employed to model and handle spatiotemporal data, relatively little work has investigated the consistency of spatiotemporal data, especially fuzzy spatiotemporal data, in XML documents. In this paper, we first propose a fuzzy spatiotemporal data model and then present the structure of fuzzy spatiotemporal data in XML documents. After studying consistency conditions for fuzzy spatiotemporal data in XML documents, we demonstrate how update, insert, and delete operations affect the consistency of that data. Furthermore, we propose algorithms for fixing the resulting inconsistencies. After investigating several characteristics of the three primitive change operations on the fuzzy spatiotemporal data model, we evaluate the time needed to fix inconsistencies.
- Published
- 2018
5. An approach of top-k keyword querying for fuzzy XML
- Author
- Ting Li, Li Yan, and Zongmin Ma
- Subjects
Document Structure Description; computer.internet_protocol; Computer science; Efficient XML Interchange; 02 engineering and technology; computer.software_genre; Semantics; Fuzzy logic; Theoretical Computer Science; Set (abstract data type); 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; XML schema; computer.programming_language; Numerical Analysis; Information retrieval; XML validation; computer.file_format; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; Data mining; computer; Software; XML
- Abstract
Keyword search on XML documents has received wide attention, and many search semantics and algorithms have been proposed for XML keyword queries. However, existing approaches fall short in supporting keyword queries over fuzzy XML documents. To overcome this limitation, we discuss how to obtain and evaluate the top-k smallest lowest common ancestor (SLCA) results of keyword queries on fuzzy XML documents. We define fuzzy SLCA semantics on fuzzy XML documents and then propose a novel encoding scheme to denote the different types of nodes in such documents. We then propose two efficient algorithms that find the k SLCA results with the highest possibilities for a given keyword query on a fuzzy XML document. The first algorithm obtains the top-k SLCA results and their possibilities using a stack technique. The second obtains the top-k SLCA results based on a set of SLCA properties. Finally, we compare and evaluate the performance of the two algorithms.
- Published
- 2017
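The SLCA notion used in the abstract above can be illustrated on ordinary (non-fuzzy) XML. The following is a brute-force sketch only: it ignores the membership degrees and possibilities that the fuzzy variant adds, and it is not the stack-based or property-based algorithm the paper proposes. The sample document, the function names, and the Dewey-style labels (tuples of child indexes) are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, List, Set, Tuple

Label = Tuple[int, ...]  # path of child indexes from the root; () is the root


def subtree_words(root: ET.Element) -> Dict[Label, Set[str]]:
    """Map each node's label to the lowercase words found in its subtree.

    Only element text is considered; tail text and attributes are ignored.
    """
    result: Dict[Label, Set[str]] = {}

    def visit(node: ET.Element, label: Label) -> Set[str]:
        words = set((node.text or "").lower().split())
        for index, child in enumerate(node):
            words |= visit(child, label + (index,))
        result[label] = words
        return words

    visit(root, ())
    return result


def slca(root: ET.Element, keywords: Iterable[str]) -> List[Label]:
    """Return labels of nodes that contain every keyword in their subtree
    while no proper descendant also contains every keyword."""
    wanted = {k.lower() for k in keywords}
    covering = {lab for lab, words in subtree_words(root).items()
                if wanted <= words}
    return sorted(
        lab for lab in covering
        if not any(other != lab and other[:len(lab)] == lab
                   for other in covering)
    )


DOC = """
<library>
  <book><title>XML basics</title><author>Li</author></book>
  <book><title>Fuzzy sets</title><author>Ma</author></book>
  <article><title>Fuzzy XML</title><author>Li</author></article>
</library>
"""

root = ET.fromstring(DOC)
print(slca(root, ["xml", "li"]))       # [(0,), (2,)]
print(slca(root, ["basics", "sets"]))  # [()]
```

In the first query the root also covers both keywords, but it is dropped because the first `book` and the `article` already cover them lower in the tree; in the second query only the root covers both words, so it is the sole SLCA.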
6. Structural XML Query Processing
- Author
- Michal Krátký, Martin Svoboda, Tomáš Skopal, Sherif Sakr, Irena Holubová, Martin Nečaský, and Radim Baca
- Subjects
Document Structure Description; XML Encryption; Information retrieval; General Computer Science; Database; Computer science; Efficient XML Interchange; XML Signature; XML validation; 02 engineering and technology; computer.file_format; computer.software_genre; Theoretical Computer Science; XML database; XML Schema Editor; 020204 information systems; Streaming XML; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; computer
- Abstract
Since the boom in new proposals on techniques for efficient querying of XML data is now over and the research world has shifted its attention toward new types of data formats, we believe that it is crucial to review what has been done in the area to help users choose an appropriate strategy and scientists exploit the contributions in new areas of data processing. The aim of this work is to provide a comprehensive study of the state-of-the-art of approaches for the structural querying of XML data. In particular, we start with a description of labeling schemas to capture the structure of the data and the respective storage strategies. Then we deal with the key part of every XML query processing: a twig query join, XML query algebras, optimizations of query plans, and selectivity estimation of XML queries. To the best of our knowledge, this is the first work that provides such a detailed description of XML query processing techniques that are related to structural aspects and that contains information about their theoretical and practical features as well as about their mutual compatibility and general usability.
- Published
- 2017
7. An approach to build XML-based domain specific languages solutions for client-side web applications
- Author
- Francisco Jurado, Enrique Chavarriaga, and Fernando Díez
- Subjects
Computer Networks and Communications; Computer science; computer.internet_protocol; Programming language; Efficient XML Interchange; XML Signature; 020207 software engineering; XML validation; 02 engineering and technology; computer.file_format; computer.software_genre; XML framework; XML database; 020204 information systems; Streaming XML; 0202 electrical engineering, electronic engineering, information engineering; XML schema; computer; Software; XML; computer.programming_language
- Abstract
Domain-Specific Languages (DSLs) allow for the building of applications that ease the labour of both software engineers and domain experts thanks to the level of abstraction they provide. In cases where the domain is restricted to Client-Side Web Applications (CSWA), XML-based languages, frameworks and widgets are commonly combined in order to provide fast, robust and flexible solutions. This article presents an approach designed to create XML-based DSL solutions for CSWA that includes an evaluation engine, a programming model and a lightweight development environment. The approach is able to evaluate multiple XML-based DSL programs simultaneously to provide solutions to those Domain Specific Problems for CSWAs. To better demonstrate the capabilities and potential of this novel approach, we will employ a couple of case studies, namely Anisha and FeedPsi.
- Published
- 2017
8. BonXai
- Author
- Frank Neven, Thomas Schwentick, Wim Martens, and Matthias Niewerth
- Subjects
Document Structure Description; XML Encryption; Schematron; Computer science; computer.internet_protocol; Efficient XML Interchange; XML; BonXai; XML Schema; schema languages; XML Signature; 02 engineering and technology; computer.software_genre; Logical schema; XML Schema Editor; 020204 information systems; Schema (psychology); Streaming XML; 0202 electrical engineering, electronic engineering, information engineering; RELAX NG; XML schema; computer.programming_language; Programming language; cXML; XML validation; computer.file_format; XML framework; XML database; XML Schema (W3C); Document Schema Definition Languages; Star schema; Document Definition Markup Language; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; Data mining; computer; Information Systems
- Abstract
While the migration from DTD to XML Schema was driven by a need for increased expressivity and flexibility, the latter was also significantly more complex to use and understand. Whereas DTDs are characterized by their simplicity, XML Schema Documents are notoriously difficult. In this article, we introduce the XML specification language BonXai, which incorporates many features of XML Schema but is arguably almost as easy to use as DTDs. In brief, the latter is achieved by sacrificing the explicit use of types in favor of simple patterns expressing contexts for elements. The goal of BonXai is not to replace XML Schema but rather to provide a simpler alternative for users who want to go beyond the expressiveness and features of DTD but do not need the explicit use of types. Furthermore, XML Schema processing tools can be used as a back-end for BonXai, since BonXai can be automatically converted into XML Schema. A particularly strong point of BonXai is its solid foundation rooted in a decade of theoretical work around pattern-based schemas. We present a formal model for a core fragment of BonXai and the translation algorithms to and from a core fragment of XML Schema. We prove that BonXai and XML Schema can be converted back-and-forth on the level of tree languages and we formally study the size trade-offs between the two languages. We acknowledge the financial support of grant number MA 4938/2-1 from the Deutsche Forschungsgemeinschaft (Emmy Noether Nachwuchsgruppe). We further acknowledge the financial support of the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under the FET-Open grant agreement FOX, number FP7-ICT-233599.
- Published
- 2017
9. Efficient keyword search in fuzzy XML
- Author
- Jian Liu and Xuefeng Zhang
- Subjects
Document Structure Description; Information retrieval; Logic; Computer science; computer.internet_protocol; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Efficient XML Interchange; XML validation; 02 engineering and technology; computer.file_format; computer.software_genre; XML framework; Keyword density; XML database; Artificial Intelligence; 020204 information systems; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; XML schema; Data mining; computer; XML; computer.programming_language
- Abstract
Evaluation of keyword queries over XML documents is one of the most fundamental tasks in XML data management. Previous methods have focused on the processing of deterministic XML data. However, uncertain data are inherent in practical applications, and how to support efficient keyword search over fuzzy XML data remains largely an open problem. In this paper, we tackle the problem of efficiently producing SLCA (smallest lowest common ancestor) results for keyword queries in fuzzy XML documents. We propose an efficient approach that can find all SLCA results for a given keyword query over fuzzy XML data. In particular, we introduce an effective method to transform a simple keyword query into a segmented keyword query that captures the original query requirements and conforms to the underlying fuzzy XML data. The proposed approach helps eliminate irrelevant SLCA results and speeds up query processing. The final experiments show the effectiveness and efficiency of our proposed approach in generating SLCA results.
- Published
- 2017
10. Performance Evaluation of Native XML Database and XML Enabled Database
- Author
- S. Balamurugan and A. Ayyasamy
- Subjects
Document Structure Description; Information retrieval; Database; Computer science; Efficient XML Interchange; XML Signature; XML validation; XML Base; computer.file_format; computer.software_genre; XML database; XML Schema Editor; Streaming XML; computer
- Published
- 2017
11. A methodology for measuring structure similarity of fuzzy XML documents
- Author
- Zhen Zhao and Zongmin Ma
- Subjects
Document Structure Description; computer.internet_protocol; Computer science; Well-formed document; 02 engineering and technology; Similarity measure; computer.software_genre; Theoretical Computer Science; Simple API for XML; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; XML schema; computer.programming_language; Numerical Analysis; Information retrieval; XML validation; Computer Science Applications; XML framework; Computational Mathematics; Computational Theory and Mathematics; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; Data mining; computer; Software; XML
- Abstract
Document matching has become a crucial task for data integration. A considerable number of algorithms for comparing XML documents have been proposed in the literature. Yet existing approaches fall short in their ability to identify structural similarities between fuzzy XML documents. To fill this gap, we provide an integrated comparison approach for measuring the structural similarity of fuzzy XML documents. First, we propose a new fuzzy XML document tree model to represent a fuzzy XML document. Second, we offer an element/attribute feature similarity measure to identify matching nodes. Third, we present an effective algorithm based on the tree edit distance to detect structural similarities between fuzzy XML document trees represented with the proposed model. Finally, the experimental results demonstrate that our approach can efficiently measure the structural similarity of fuzzy XML documents.
- Published
- 2017
12. Tree pattern matching in heterogeneous fuzzy XML databases
- Author
- Xiao Zhang, Jian Liu, and Lei Zhang
- Subjects
Document Structure Description; Information Systems and Management; computer.internet_protocol; Computer science; Efficient XML Interchange; XML Signature; Well-formed document; 02 engineering and technology; Document management system; computer.software_genre; Management Information Systems; Simple API for XML; Knowledge extraction; Artificial Intelligence; XML Schema Editor; 020204 information systems; Streaming XML; 0202 electrical engineering, electronic engineering, information engineering; Binary XML; XML schema; computer.programming_language; Information retrieval; Database; XML validation; computer.file_format; XML framework; XML database; XML Schema (W3C); ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; Data mining; computer; Software; XML
- Abstract
Dealing with heterogeneous data underlying fuzzy XML databases is challenging for any task of document management and knowledge discovery, since the structural heterogeneity and uncertainty of the large number of XML data sources make it difficult to effectively answer the structured query, especially the tree-pattern query. To address this issue, we propose a novel framework for managing fuzzy XML queries in a heterogeneous environment in this paper. In particular, we devise a holistic algorithm for matching tree-patterns over heterogeneous fuzzy XML data. Our approach adopts a compact stack technique and generates the matches by one scan on the relevant data associated with the tree-pattern, which eliminates re-scanning unnecessary portions of XML documents and redundant intermediate results. Finally, a comprehensive experimental evaluation conducted on real and synthetic data sets is carried out to show the significance of our approach as a solution for querying heterogeneous data in fuzzy XML documents.
- Published
- 2017
13. Object-stack: An object-oriented approach for top-k keyword querying over fuzzy XML
- Author
- Ting Li and Zongmin Ma
- Subjects
Document Structure Description; Information retrieval; Computer Networks and Communications; Computer science; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; XML validation; 02 engineering and technology; Query optimization; Query language; Theoretical Computer Science; Query expansion; Simple API for XML; Keyword density; Web query classification; 020204 information systems; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Software; Information Systems
- Abstract
Keyword search is the most popular technique for searching information in XML (eXtensible Markup Language) documents. It enables users to easily access XML data without learning a structured query language or studying complex data schemas. Existing traditional keyword query methods are mainly based on LCA (lowest common ancestor) semantics, in which the returned results match all keywords at the granularity of elements. In many practical applications, information is often uncertain and vague. As a result, how to identify useful information in fuzzy data is becoming an important research topic. In this paper, we focus on the issue of keyword querying on fuzzy XML data at the granularity of objects. By introducing the concept of an "object tree", we propose query semantics for keyword queries at the object level. We find the minimum whole matching result object trees, which contain all keywords, and the partial matching result object trees, which contain some of the keywords, and return the root nodes of these result object trees as query results. To identify the top-K answers with the highest scores effectively and accurately, we propose a scoring mechanism that considers tf*idf document relevance, users' preferences, and the possibilities of results. We propose a stack-based algorithm named object-stack to obtain the top-K answers with the highest scores. Experimental results show that the object-stack algorithm significantly outperforms traditional XML keyword query algorithms and that it can return high-quality query results with high search efficiency on fuzzy XML documents.
- Published
- 2017
14. A query refinement framework for xml keyword search
- Author
- Yi Yu, Jian Shen, Zhangjie Fu, and Zhifeng Bao
- Subjects
Computer Networks and Communications; computer.internet_protocol; Computer science; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Efficient XML Interchange; XML Signature; 02 engineering and technology; Query optimization; computer.software_genre; Ranking (information retrieval); Query expansion; Search engine; Web query classification; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Information retrieval; Web search query; XML validation; computer.file_format; XML framework; Keyword density; Ranking; Hardware and Architecture; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; Data mining; computer; Software; XML
- Abstract
Existing work on XML keyword search focuses on how to find relevant and meaningful data fragments for a query, assuming each keyword is intended as part of it. However, in XML keyword search, user queries often contain irrelevant or mismatched terms, typos, etc., which may easily lead to empty or meaningless results. In this paper, we introduce the problem of content-aware XML keyword query refinement, where the search engine should judiciously decide whether a user query Q needs to be refined during the processing of Q, and find a list of promising refined query candidates that are guaranteed to have meaningful matching results over the XML data, without any user interaction or a second try. To achieve this goal, we build a novel content-aware XML keyword query refinement framework consisting of two core parts: (1) we build a query ranking model to evaluate the quality of a refined query RQ, which captures the morphological/semantic similarity between Q and RQ and the dependency of the keywords of RQ over the XML data; (2) we integrate the exploration of RQ candidates and the generation of their matching results as a single problem, which is solved optimally within a one-time scan of the related keyword inverted lists. Finally, an extensive empirical study verifies the efficiency and effectiveness of our framework.
- Published
- 2017
15. Fixing inconsistencies of fuzzy spatiotemporal XML data
- Author
- Shaohui Cheng, Zhuo Lin, Zhulei Shao, and Luyi Bai
- Subjects
0209 industrial biotechnology; Knowledge representation and reasoning; computer.internet_protocol; Computer science; 02 engineering and technology; computer.software_genre; Fuzzy logic; Data modeling; Consistency (database systems); 020901 industrial engineering & automation; Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; XML schema; Computer Science::Databases; computer.programming_language; Information retrieval; business.industry; XML validation; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; The Internet; ComputingMethodologies_GENERAL; Data mining; business; computer; XML
- Abstract
Fuzzy spatiotemporal data models have been used to support spatial and temporal knowledge representation and reasoning in the presence of fuzziness. Meanwhile, XML is expected to become the next-generation standard language for exchanging data over the Internet, so representing fuzzy spatiotemporal data in XML is likely to become a trend. However, fuzzy spatiotemporal XML documents may contain inconsistencies that violate predefined spatial and temporal constraints, which causes data inconsistency problems. Although consistency problems in XML documents have been widely studied, those studies take only general data into account, and the consistency of fuzzy spatiotemporal data remains an open issue. In this paper, we put forward solutions to the problems of inconsistencies in fuzzy spatiotemporal XML documents. We also analyze inconsistent states, which are named discontinuity overlap or cycle of the temporal labels of some incoming edges. Then, we put forward corresponding approaches to checking and fixing fuzzy spatiotemporal XML documents according to the inconsistent states. Finally, the experimental results show that our proposed algorithms can fix inconsistencies of fuzzy spatiotemporal XML documents significantly.
- Published
- 2017
16. A new structure and access mechanism for secure and efficient XML data broadcast in mobile wireless networks
- Author
- Meghdad Mirabi and Babak Safabahar
- Subjects
Document Structure Description; XML Encryption; Computer science; computer.internet_protocol; SOAP; Efficient XML Interchange; XML Signature; 02 engineering and technology; computer.software_genre; Simple API for XML; XML Schema Editor; 020204 information systems; Streaming XML; 0202 electrical engineering, electronic engineering, information engineering; XML namespace; XML schema; Binary XML; computer.programming_language; Database; business.industry; Search engine indexing; XML validation; computer.file_format; XML framework; XML database; Hardware and Architecture; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; business; computer; Software; XML; Information Systems; Computer network
- Abstract
A new structure for streaming XML data is proposed that guarantees the confidentiality of the XML data over the wireless stream. An access mechanism is proposed to efficiently process XML queries over the encrypted XML stream. Recently, the use of XML for data broadcasting in mobile wireless networks has gained much attention. One of the most essential requirements for such networks is data confidentiality. In order to secure XML data broadcast in mobile wireless networks, mobile clients should obey a set of access authorizations specified on the original XML document. In such environments, mobile clients can only access authorized parts of the encrypted XML stream based on their access authorizations. Several indexing methods have been proposed to provide selective access to XML data over the XML stream. However, these indexing methods cannot be used for encrypted XML data. In this paper, we define a new structure for the XML stream that supports the confidentiality of XML data over the wireless broadcast channel. We also define an access mechanism for our proposed structure to efficiently process XML queries over the encrypted XML stream. The experimental results demonstrate that using our proposed structure and access mechanism for XML data broadcast efficiently disseminates XML data in mobile wireless networks.
- Published
- 2017
17. New Path Based Index Structure for Processing CAS Queries over XML Database
- Author
- Dhanalekshmi Gopinathan and Krishna Asawa
- Subjects
Document Structure Description; index; XML Encryption; General Computer Science; Computer science; computer.internet_protocol; Efficient XML Interchange; XML Signature; Joins; 02 engineering and technology; Database storage structures; computer.software_genre; CAS query; XML; query processing; WWW; XPath; database storage; lcsh:QA75.5-76.95; Twig; Schema (psychology); Streaming XML; 0202 electrical engineering, electronic engineering, information engineering; computer.programming_language; Information retrieval; 05 social sciences; Search engine indexing; 050301 education; XML validation; computer.file_format; JSON; XML database; 020201 artificial intelligence & image processing; Data mining; lcsh:Electronic computers. Computer science; 0503 education; computer
- Abstract
Querying nested data has become one of the most challenging issues in retrieving desired information from the Web. Today, diverse applications generate a tremendous amount of data in different formats. The data and information exchanged on the Web are commonly expressed in nested representations such as XML and JSON. Unlike data in a traditional database system, they do not have a rigid schema. In general, nested data is managed by storing the data and its structure separately, which significantly reduces the performance of data retrieval. Efficiently processing queries that locate the exact positions of elements has become a major challenge. Different indexing structures have been proposed in the literature to improve the performance of query processing on nested structures. Most past research on nested structures concentrates on the structure alone. This paper proposes a new index structure that combines the siblings of terminal nodes into one path, which efficiently processes twig queries with fewer lookups and joins. The proposed approach is compared with some of the existing approaches. The results show that queries are processed with better performance than with the existing approaches.
- Published
- 2017
18. Development of custom notation for XML-based language: A model-driven approach
- Author
- Sergej Chodarev and Jaroslav Porubän
- Subjects
Document Structure Description; General Computer Science; Programming language; Computer science; computer.internet_protocol; Efficient XML Interchange; XML validation; computer.file_format; computer.software_genre; 01 natural sciences; 010305 fluids & plasmas; XML Schema Editor; Regular Language description for XML; 0103 physical sciences; Streaming XML; XML schema; 010306 general physics; computer; XML; computer.programming_language
- Abstract
In spite of its popularity, XML provides a poor user experience, and many domain-specific languages can be improved by introducing a custom, more human-friendly notation. This paper presents an approach for the design and development of a custom notation for an existing XML-based language, together with a translator between the new notation and XML. The approach supports iterative design of the language's concrete syntax, allowing its modification based on user feedback. The translator is developed using a model-driven approach. It is based on an explicit representation of the language's abstract syntax (metamodel) that can be augmented with mappings to both XML and the custom notation. We provide recommendations for applying the approach and demonstrate them in a case study of a language for the definition of graphs.
- Published
- 2017
19. Indexing techniques for processing generalized XML documents
- Author
- Ghassan Z. Qadah
- Subjects
Document Structure Description; XML Encryption; Information retrieval; Computer science; Programming language; Efficient XML Interchange; XML validation; 02 engineering and technology; computer.file_format; computer.software_genre; XQuery; XML database; Hardware and Architecture; XML Schema Editor; 020204 information systems; Streaming XML; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Law; computer; Software; computer.programming_language
- Abstract
The Extensible Markup Language (XML) data model has recently gained huge popularity because of its ability to represent a wide variety of structured (relational) and semi-structured (document) data. Several query languages have been proposed for the XML model, the most widely used of which is XQuery. An important component of an XQuery is its XPath expression, which retrieves a set of XML documents to be manipulated by the associated XQuery. An XPath expression can be of several types, among which are containment queries. Traditional research on processing containment queries has concentrated on data retrieval from independent XML documents; not much research has been directed towards interlinked XML documents. This paper reviews this area of research and shows the adequacy and correctness of one of the reviewed algorithms when applied to independent XML documents. However, the direct application of this algorithm to process queries against interlinked XML documents is shown to generate incorrect results. To remedy this situation, two new algorithms and the associated indexing structures are developed and shown to perform correctly in processing both independent and interlinked XML documents. In addition, one of the new algorithms is shown to minimize the storage requirement of the intermediate lists generated throughout its execution, thereby further improving the algorithm's space and time performance.
- Published
- 2017
20. Survey on Keyword Search over XML Documents
- Author
-
Tok Wang Ling and Thuy Ngoc Le
- Subjects
Document Structure Description ,XML Encryption ,computer.internet_protocol ,Computer science ,Efficient XML Interchange ,XML Signature ,Well-formed document ,02 engineering and technology ,Document type definition ,computer.software_genre ,Simple API for XML ,XML Schema Editor ,020204 information systems ,Streaming XML ,0202 electrical engineering, electronic engineering, information engineering ,XML schema ,Binary XML ,Information exchange ,computer.programming_language ,Information retrieval ,XML validation ,computer.file_format ,XML framework ,XML database ,XML Schema (W3C) ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,020201 artificial intelligence & image processing ,computer ,Software ,XML ,Information Systems ,XML Catalog - Abstract
Since XML has become a standard for information exchange over the Internet, more and more data are represented as XML. XML keyword search has attracted a lot of interest because it provides a simple and user-friendly interface for querying XML documents. This paper provides a survey of keyword search over XML documents. We mainly focus on defining semantics for XML keyword search and on the corresponding algorithms for finding answers based on these semantics. We classify existing work on XML keyword search into three main types: tree-based approaches, graph-based approaches and semantics-based approaches. For each type of approach, we further classify works into sub-classes, and in particular we summarize, compare and point out the relationships among sub-classes. In addition, for each type of approach, we point out the common problems they suffer
- Published
- 2016
21. XML Representation of Web Document used by Search Engine
- Author
-
Mukesh Rawat and Payal Kansal
- Subjects
Document Structure Description ,Information retrieval ,Computer science ,Efficient XML Interchange ,General Engineering ,XML validation ,Well-formed document ,XML Base ,computer.file_format ,Document type definition ,World Wide Web ,Simple API for XML ,XML schema ,computer ,computer.programming_language - Published
- 2016
22. Relevant XML Documents - Approach Based on Vectors and Weight Calculation of Terms
- Author
-
Abdeslem Dennai, Mohammed Yacine Dennai, and Sidi Mohammed Benslimane
- Subjects
Document Structure Description ,Information retrieval ,Computer science ,computer.internet_protocol ,XML validation ,computer ,XML - Published
- 2016
23. Query XML Streaming Data with List
- Author
-
Liao Husheng and He Zhixue
- Subjects
XML Encryption ,Information retrieval ,General Computer Science ,Database ,Computer science ,Efficient XML Interchange ,XML Signature ,XML validation ,computer.file_format ,computer.software_genre ,Query optimization ,XML database ,XML Schema Editor ,Streaming XML ,computer - Published
- 2016
24. Two Zero-Watermark methods for XML documents
- Author
-
Quan Wen, Peng Li, and Yufei Wang
- Subjects
021110 strategic, defence & security studies ,XML Encryption ,Theoretical computer science ,Computer science ,Data_MISCELLANEOUS ,Efficient XML Interchange ,0211 other engineering and technologies ,XML Signature ,XML validation ,02 engineering and technology ,computer.file_format ,computer.software_genre ,XML database ,XML Schema Editor ,020204 information systems ,Streaming XML ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,0202 electrical engineering, electronic engineering, information engineering ,XML schema ,computer ,Information Systems ,computer.programming_language - Abstract
As XML files are less redundant and readily reorganized, it is difficult to design an XML watermarking scheme that achieves a trade-off between robustness and invisibility. However, this trade-off can be achieved by the Zero-Watermark method. In this paper, two Zero-Watermark methods are designed and tested for XML documents. One is an XSLT-related method, which embeds extra code in the XSLT file to serve as a kind of copyright function. The other uses the functional dependencies of the XML file as the feature for the Zero-Watermark. Experimental results show that both methods have good real-time performance, and that the Zero-Watermark algorithm based on functional dependency can resist selection attacks, alteration attacks, reorganization attacks and compression attacks.
- Published
- 2016
25. Efficient Identification of Structural Relationships for XML Queries using Secure Labeling Schemes
- Author
-
S. Sankari and S. Bose
- Subjects
Document Structure Description ,XML Encryption ,XML tree ,Information retrieval ,Computer science ,Efficient XML Interchange ,XML validation ,02 engineering and technology ,computer.file_format ,computer.software_genre ,Simple API for XML ,XML Schema Editor ,020204 information systems ,Streaming XML ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Decision Sciences (miscellaneous) ,Data mining ,computer ,Information Systems - Abstract
XML has emerged as a de facto standard for data representation and information exchange over the World Wide Web. Using the Document Object Model (DOM), an XML document can be viewed as an XML DOM tree. The nodes of an XML tree are labeled according to a labeling scheme so that every node is uniquely identified. This paper proposes a method to efficiently identify two structural relationships between XML nodes, namely document order (DO) and the sibling relationship, using two secure labeling schemes: enhanced Dewey coding (EDC) and secure Dewey coding (SDC). These structural relationships influence the performance of XML queries, so they need to be identified efficiently. This paper implements the method to identify DO and the sibling relationship using EDC and SDC labels for various real-time XML documents. Experimental results show that identifying DO and the sibling relationship using SDC labels performs better than using EDC labels when processing XML queries.
- Published
- 2016
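A minimal sketch of the idea in the abstract of entry 25, that document order and the sibling relationship can be decided from node labels alone. It assumes plain Dewey labels (the path of child positions from the root), not the EDC or SDC schemes the paper proposes, and the function names `precedes` and `are_siblings` are hypothetical.

```python
from typing import Tuple

# Plain Dewey label: the 1-based child positions on the path from the root.
# This is an assumption for illustration, not the paper's EDC/SDC encoding.
Label = Tuple[int, ...]


def precedes(a: Label, b: Label) -> bool:
    """Return True if node a comes before node b in document order."""
    # Tuples compare lexicographically and a proper prefix (an ancestor)
    # sorts first, which matches a preorder traversal of the tree.
    return a < b


def are_siblings(a: Label, b: Label) -> bool:
    """Return True if a and b are distinct nodes sharing the same parent."""
    # Same depth, not the root, and identical labels up to the last component.
    return a != b and len(a) == len(b) and len(a) > 1 and a[:-1] == b[:-1]


# Hypothetical labels for a small tree.
first_child = (1, 1)
grandchild = (1, 1, 2)
second_child = (1, 2)
```

With these labels, `precedes(grandchild, second_child)` holds because the whole subtree of the first child is visited before the second child, and `are_siblings(first_child, second_child)` holds because the two labels differ only in their last component.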
26. Development of Human-friendly Notation for XML-based Languages
- Author
-
Sergej Chodarev
- Subjects
Document Structure Description ,XML Encryption ,lcsh:T58.5-58.64 ,Computer science ,Programming language ,lcsh:Information technology ,Efficient XML Interchange ,XML Signature ,020207 software engineering ,XML validation ,02 engineering and technology ,computer.file_format ,computer.software_genre ,lcsh:QA75.5-76.95 ,XML Schema Editor ,Streaming XML ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,XML schema ,lcsh:Electronic computers. Computer science ,computer ,computer.programming_language - Abstract
XML is a popular choice for the development of domain-specific languages. In spite of its popularity, XML is a poor user interface, and many languages can be improved by introducing a custom notation. This paper presents an approach for developing a custom human-friendly notation for an existing XML-based language, together with a translator between the new notation and XML. This approach is based on an explicit representation of the language's abstract syntax that can be decorated with mappings to both XML and the custom notation. The approach supports iterative design and development of the language's concrete syntax, allowing it to be modified based on user feedback. The development process is demonstrated on a case study of a language for defining graphical user interface layouts.
- Published
- 2016
27. Development of a Repository Based on the Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) with the Metadata Encoding and Transmission Standard (METS) and MPEG-21 Digital Item Declaration Language (DIDL)
- Author
-
Syarifuddin Syarifuddin and Taufiq Iqbal
- Subjects
World Wide Web ,Metadata ,Upload ,Metadata Encoding and Transmission Standard ,computer.internet_protocol ,Computer science ,Validator ,InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL ,Declaration ,XML validation ,Protocol for Metadata Harvesting ,computer ,MPEG-21 - Abstract
The purpose of this research is to build a repository model that features the Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) together with the Metadata Encoding and Transmission Standard (METS) and the MPEG-21 Digital Item Declaration Language (DIDL). The research uses a qualitative method, and the application is developed using Fourth Generation Techniques (4GT). The OAI-PMH module for METS and MPEG-21 DIDL was developed and applied to the repository application that was built. Testing the OAI-PMH URL with the OVAL validator tool found no problems in validating and verifying data for the Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, and XML Validation commands. Crawling tests on the metadata in the web repository show that Google Scholar crawls metadata with an average success rate of 90%; the remaining 10% are errors that occur because some documents lack complete metadata, such as the bibliography and uploaded documents.
- Published
- 2020
28. Static Analysis Method of Android-specific Problems through Java and Xml Mutual Analysis
- Author
-
Jiyong Jung and Jongmoon Baik
- Subjects
Java ,Programming language ,Computer science ,computer.internet_protocol ,strictfp ,XML validation ,computer.software_genre ,XML framework ,Static import ,Streaming XML ,Java annotation ,computer ,XML ,computer.programming_language - Published
- 2016
29. Study of XML Indexing Structure Based on XISS
- Author
-
Hong Jie Tang
- Subjects
Document Structure Description ,XML Encryption ,Information retrieval ,Computer science ,Search engine indexing ,Efficient XML Interchange ,XML validation ,General Medicine ,computer.file_format ,computer.software_genre ,XML framework ,XML database ,Schema (psychology) ,Streaming XML ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Xml indexing ,XML schema ,Data mining ,computer ,computer.programming_language - Abstract
The study is based on XISS (XML Indexing and Storage System), which uses Dietz’s numbering scheme to determine the ancestor-descendant relationship. Based on the results of this research, the paper proposes an improved method of node encoding, implements its indexing structure, and discusses its query path. Finally, the paper analyzes the properties of the improved method.
- Published
- 2016
30. ANALYSIS AND IMPLEMENTATION OF APPLICATION SCHEMAS FOR THE INSPIRE BUILDINGS THEME
- Author
-
Michal Med and Petr Souček
- Subjects
Document Structure Description ,XSD schema ,Computer science ,Efficient XML Interchange ,cXML ,General Engineering ,InformationSystems_DATABASEMANAGEMENT ,XML validation ,computer.file_format ,GML format ,Geography Markup Language ,World Wide Web ,XML Schema Editor ,lcsh:TA1-2040 ,Schema (psychology) ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,XML schema ,web service ,INSPIRE ,Buildings ,lcsh:Engineering (General). Civil engineering (General) ,computer ,computer.programming_language - Abstract
Implementing the INSPIRE directive involves transforming various data themes into the structure and content given by Data Specifications published by the Joint Research Center (JRC) of the European Commission. The data is to be published in the GML format, which is a standard of the Open Geospatial Consortium. The validity of the data structure is ensured by validation against XML schemas. These schemas are usually also provided by JRC, though not necessarily for all application schemas. Six application schemas are defined for the currently implemented Buildings theme, but XML schemas are available for only three of them. All application schemas have been analyzed, and it has been found that the most suitable data model corresponds most closely to the BuildingsExtended2D application schema. No XML schema has been provided by JRC in the current version. The BuildingsExtendedBase abstract XML schema was also needed when using the previous schemas. There is now a need to create these missing XML schemas.
- Published
- 2016
31. Comprehensive Study on Keyword Search on Semi Structured Data
- Author
-
C N Sowmyarani and P. Dayananda
- Subjects
Document Structure Description ,Information retrieval ,Computer science ,Efficient XML Interchange ,XML validation ,Well-formed document ,02 engineering and technology ,computer.file_format ,computer.software_genre ,XML framework ,XML database ,XML Schema Editor ,020204 information systems ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,0202 electrical engineering, electronic engineering, information engineering ,XML schema ,computer ,computer.programming_language - Abstract
Keyword search is a user-friendly approach that enables inexperienced users to easily retrieve information from XML data with no specific knowledge of complex structured query language. Since an XML document can have a large size and contain a lot of information, an XML keyword search result should be a fragment of an XML document dynamically constructed at query time, which is achievable due to the structuredness of XML. Processing keyword searches on XML has several challenges, e.g., what are the elements in the XML document that are relevant to the query? How to generate the results efficiently and rank the results meaningfully? How to present the results to the user in a way such that the user can quickly find the desired information? In this survey, the authors review the papers in the literature that attempted to address these problems. The authors divide the existing approaches into several classes based on the problem they tackled, and perform a comprehensive analysis of these works.
- Published
- 2016
32. Organization of energy resource monitoring on the basis of XML protocol
- Author
-
V. I. Ukhov and I. O. Kovtsova
- Subjects
Database ,business.industry ,computer.internet_protocol ,SOAP ,Computer science ,cXML ,Efficient XML Interchange ,XML validation ,computer.file_format ,computer.software_genre ,Control and Systems Engineering ,XML Protocol ,Binary XML ,Electrical and Electronic Engineering ,business ,computer ,Service Interface for Real Time Information ,XML ,Computer network - Abstract
The architecture of an integrated energy monitoring system is discussed. Merits and drawbacks of various data interchange formats are described. Basic types of the information transmitted in the system are pointed out. Key principles of XML message formation are shown, as well as various ways in which the subject area influences the structure of the XML protocol.
- Published
- 2016
33. An Overview on XML Semantic Disambiguation from Unstructured Text to Semi-Structured Data: Background, Applications, and Ongoing Challenges
- Author
-
Joe Tekli
- Subjects
Document Structure Description ,Computer science ,computer.internet_protocol ,Efficient XML Interchange ,02 engineering and technology ,Document management system ,computer.software_genre ,Semantic network ,Social Semantic Web ,XML Schema Editor ,020204 information systems ,Semantic computing ,0202 electrical engineering, electronic engineering, information engineering ,XML schema ,Semantic Web Stack ,Semantic Web ,computer.programming_language ,Information retrieval ,Ontology learning ,business.industry ,Semantic Web Rule Language ,Search engine indexing ,Semantic search ,XML validation ,computer.file_format ,Semantic interoperability ,SemEval ,Computer Science Applications ,XML framework ,Semantic grid ,Computational Theory and Mathematics ,Categorization ,020201 artificial intelligence & image processing ,Semi-structured data ,business ,computer ,XML ,Information Systems ,Data integration - Abstract
Over the last two decades, XML has gained momentum as the standard for web information management and complex data representation. Also, collaboratively built semi-structured information resources, such as Wikipedia, have become prevalent on the Web and can be inherently encoded in XML. Yet most methods for processing XML and semi-structured information handle mainly the syntactic properties of the data, while ignoring the semantics involved. To devise more intelligent applications, one needs to augment syntactic features with machine-readable semantic meaning. This can be achieved through the computational identification of the meaning of data in context, also known as automated semantic analysis and disambiguation, which is nowadays one of the main challenges at the core of the Semantic Web. This survey paper provides a concise and comprehensive review of the methods related to XML-based semi-structured semantic analysis and disambiguation. It is made of four logical parts. First, we briefly cover traditional word sense disambiguation methods for processing flat textual data. Second, we describe and categorize disambiguation techniques developed and extended to handle semi-structured and XML data. Third, we describe current and potential application scenarios that can benefit from XML semantic analysis, including: data clustering and semantic-aware indexing, data integration and selective dissemination, semantic-aware and temporal querying, web and mobile services matching and composition, blog and social semantic network analysis, and ontology learning. Fourth, we describe and discuss ongoing challenges and future directions, including: the quantification of semantic ambiguity, expanding XML disambiguation context, combining structure and content, using collaborative/social information sources, integrating explicit and implicit semantic analysis, emphasizing user involvement, and reducing computational complexity.
- Published
- 2016
34. A study on Clustering Algorithms for XML Data Clustering
- Author
-
B. S. E. Zoraida and S. Saranya
- Subjects
Information retrieval ,Fuzzy clustering ,computer.internet_protocol ,Computer science ,Efficient XML Interchange ,Conceptual clustering ,XML validation ,computer.file_format ,computer.software_genre ,ComputingMethodologies_PATTERNRECOGNITION ,Simple API for XML ,XML database ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Data mining ,Cluster analysis ,computer ,XML - Abstract
Mining meaningful information from large-scale web documents is increasingly important for satisfying user demand. XML and RDF documents support semantic information retrieval by making it possible to interpret and extract meaningful information for a user query. XML documents have lightweight code and a logical structure, which facilitate the easy exchange of data values and structure information as knowledge. Many mining techniques and algorithms are used to enhance the performance of XML information retrieval. Classification (supervised learning) and clustering (unsupervised learning) are the preprocessing techniques used to group similar data objects based on similarity criteria. This paper presents a study of three clustering algorithms (k-means, EM, Tree Clustering) and their similarity measures on XML datasets. The three clustering algorithms are compared and tested on the same XML datasets to find the best one for clustering XML documents.
- Published
- 2016
35. LCA-based algorithms for efficiently processing multiple keyword queries over XML streams
- Author
-
Altigran Soares da Silva, Alberto H. F. Laender, Evandrino G. Barros, and Mirella M. Moro
- Subjects
Information Systems and Management ,Information retrieval ,Computer science ,computer.internet_protocol ,Response time ,XML validation ,02 engineering and technology ,STREAMS ,computer.software_genre ,Schema (genetic algorithms) ,020204 information systems ,Scalability ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Overall performance ,Data mining ,Algorithm ,computer ,Lowest common ancestor ,XML - Abstract
In a stream environment, unlike in traditional databases, data arrive continuously, unindexed and potentially unbounded, and queries must be evaluated to produce results on the fly. In this article, we propose two new algorithms (called SLCAStream and ELCAStream) for processing multiple keyword queries over XML streams. Both algorithms process keyword-based queries that require minimal or no schema knowledge to be formulated, follow the lowest common ancestor (LCA) semantics, and provide optimized methods to improve the overall performance. Moreover, SLCAStream, which implements the smallest LCA (SLCA) semantics, outperforms the state of the art, with up to a 49% reduction in response time and 36% in memory usage. In turn, ELCAStream is the first to explore the exclusive LCA (ELCA) semantics over XML streams. A comprehensive set of experiments evaluates several aspects of the performance and scalability of both algorithms and shows that they are effective alternatives for search services over XML streams.
- Published
- 2016
36. Selectivity estimation of extended XML query tree patterns based on prime number labeling and synopsis modeling
- Author
-
El-Sayed M. El-Alfy, Salahadin Mohammed, and Ahmad F. Barradah
- Subjects
Computer science ,computer.internet_protocol ,Efficient XML Interchange ,XML validation ,02 engineering and technology ,computer.file_format ,computer.software_genre ,Query optimization ,XML framework ,Simple API for XML ,XML database ,Hardware and Architecture ,020204 information systems ,Modeling and Simulation ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,XML schema ,Data mining ,computer ,Software ,XML ,computer.programming_language - Abstract
With the new era of big data and the proliferation of XML documents for representing and exchanging data over the web, selectivity estimation of XML query patterns has become a crucial component of database optimizers. It helps the optimizer choose the best possible plan for query evaluation. Existing selectivity estimators for XML queries can only support basic Query Tree Patterns (QTPs) with the logical AND operator. In this paper, we propose a novel approach, called XQuest, for selectivity estimation that supports extended QTPs that may contain logical operators or wildcards. This approach is based on a modified implementation of prime number labeling to construct a structural summary model of the XML data. Subsequently, a simulator of an XML query evaluator runs on the resulting model from the previous stage and aggregates the estimate for each target QTP. We conducted several experiments to study the performance of the proposed approach on three XML benchmark datasets in terms of synopsis generation time, storage requirements, and estimation accuracy. The results show that the proposed approach produces more accurate estimates with low memory and time requirements. For example, when compared to a Sampling algorithm with the same allocated memory budget, the error rate of the proposed approach never reached 5%, whereas it reached 98.5% for the Sampling algorithm.
- Published
- 2016
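A minimal sketch of basic prime number labeling, the general technique that the abstract of entry 36 builds on. It is not the modified implementation or the XQuest synopsis model from the paper. The assumption here is the textbook scheme: each node receives its own prime, a node's label is that prime multiplied by its parent's label, and ancestry reduces to a divisibility test. All names (`label_tree`, `is_ancestor`, the sample tree) are hypothetical.

```python
from itertools import count
from typing import Dict, Iterator, List, Optional


def primes() -> Iterator[int]:
    """Yield primes in increasing order using simple trial division."""
    found: List[int] = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def label_tree(parent_of: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Label each node with (its own fresh prime) * (its parent's label).

    parent_of must list every parent before its children; the root maps to None.
    """
    gen = primes()
    labels: Dict[str, int] = {}
    for node, parent in parent_of.items():
        own_prime = next(gen)
        labels[node] = own_prime * (labels[parent] if parent is not None else 1)
    return labels


def is_ancestor(labels: Dict[str, int], a: str, d: str) -> bool:
    """a is a proper ancestor of d exactly when a's label divides d's label."""
    return a != d and labels[d] % labels[a] == 0


# Hypothetical document structure, listed in document order.
tree = {"library": None, "book": "library", "title": "book",
        "author": "book", "journal": "library"}
labels = label_tree(tree)
```

Because every node contributes a distinct prime, a label's prime factors are exactly the primes of the nodes on the path from the root, so the divisibility test cannot confuse nodes in different branches. The cost is that labels grow quickly with depth and document size.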
37. A Study on Processing XML Documents
- Author
-
Tae Gwon Kim
- Subjects
Document Structure Description ,Information retrieval ,computer.internet_protocol ,Computer science ,Programming language ,InformationSystems_DATABASEMANAGEMENT ,XML validation ,computer.software_genre ,Path expression ,XQuery ,XML database ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Processing Instruction ,computer ,XML ,computer.programming_language ,XPath - Abstract
XML can express structured or semi-structured data as effectively as relational databases. XQuery is a query language for retrieving information from such XML documents. In this paper, an XQuery composer is designed and implemented; it provides an API for XQuery processors, through which a suitable processor is registered. The composer immediately shows the query results produced by the processor. As the composer contains a parser for XQuery, it can compose XQuery effectively using a variety of dialog boxes designed for the XQuery grammar. Each dialog box is associated with a clause region, which is a region of the parse tree on which the algebra operates. The composer makes it easy to compose path expressions for an XML document, as it graphically shows the element tree derived from the DTD. Path expressions are composed automatically by marking elements in the structural hierarchy and by partially specifying the predicate of an element.
- Published
- 2016
38. Answering Approximate Queries Over XML Data
- Author
-
D. L. Yan and Jian Liu
- Subjects
Information retrieval ,Computer science ,Applied Mathematics ,Efficient XML Interchange ,XML validation ,02 engineering and technology ,computer.file_format ,computer.software_genre ,Query optimization ,Query language ,Spatial query ,Query expansion ,XML database ,Computational Theory and Mathematics ,Artificial Intelligence ,Control and Systems Engineering ,Web query classification ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,computer - Abstract
With the increasing popularity of XML for data representation, there is a lot of interest in searching XML data. Due to the structural heterogeneity of XML and the diversity of its textual content, it is daunting for users to formulate exact queries and find accurate answers. Therefore, approximate matching is introduced to deal with the difficulty of answering users’ queries; this matching can be addressed by first relaxing the structure and content of a given query and then looking for answers that match the relaxed queries. Ranking and returning the most relevant results of a query has become the most popular paradigm in XML query processing. However, existing proposals do not adequately take structure into account, and they therefore lack the strength to elegantly combine structure with content when answering the relaxed queries. To address this problem, we first propose a sophisticated framework of query relaxations for supporting approximate queries over XML data. The answers underlying this framework are not compelled to strictly satisfy the given query formulation; instead, they can be founded on properties inferable from the original query. We then develop a novel top-k retrieval approach that can smartly generate the most promising answers in an order correlated with the ranking measure. We complement the work with a comprehensive set of experiments to show the effectiveness of our proposed approach in terms of precision and recall metrics.
- Published
- 2016
39. S2CX: From relational data via SQL/XML to (Un-)Compressed XML
- Author
-
Rita Hartel, Stefan Böttcher, and Dennis Wolters
- Subjects
Document Structure Description ,SQL ,XML Encryption ,computer.internet_protocol ,Computer science ,Relational database ,Efficient XML Interchange ,XML Signature ,XML Base ,02 engineering and technology ,computer.software_genre ,SQL/XML ,Oracle ,Simple API for XML ,Relational database management system ,XML Schema Editor ,020204 information systems ,Streaming XML ,0202 electrical engineering, electronic engineering, information engineering ,XML namespace ,RELAX NG ,XML schema ,SGML ,computer.programming_language ,Information retrieval ,cXML ,InformationSystems_DATABASEMANAGEMENT ,XML validation ,computer.file_format ,XML framework ,XML database ,Hardware and Architecture ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,020201 artificial intelligence & image processing ,computer ,Software ,XML ,Information Systems ,XML Catalog - Abstract
The gap between storing data in relational databases and transferring data in the form of XML has been closed, e.g., by SQL/XML queries that generate XML data out of relational data sources. However, only a few relational database systems support the evaluation of SQL/XML queries. And even in those systems supporting SQL/XML, the evaluation of such queries is quite slow compared to the evaluation of SQL queries. In this paper, we present S2CX, an approach that allows SQL/XML queries to be evaluated efficiently on any relational database system, whether or not it supports SQL/XML. As the result of an SQL/XML query, S2CX supports different output formats, ranging from plain XML to different compressed XML representations, including a succinct encoding of XML data, schema-aware compressed XML and grammar-compressed XML. In many cases, S2CX produces compressed XML as the result of an SQL/XML query even faster than Oracle 11g and DB2 evaluate SQL/XML queries into non-compressed XML. Furthermore, our approach to query evaluation scales better, i.e., the larger the dataset, the faster our approach is compared to SQL/XML query evaluation in Oracle 11g and in DB2.
- Published
- 2016
40. Uncertain XML documents classification using Extreme Learning Machine
- Author
-
Xiangguo Zhao, Xin Bi, Guoren Wang, Zhen Zhang, and Hongbo Yang
- Subjects
Information retrieval ,Uncertain data ,Computer science ,computer.internet_protocol ,Cognitive Neuroscience ,Efficient XML Interchange ,XML validation ,02 engineering and technology ,computer.file_format ,Computer Science Applications ,Artificial Intelligence ,XML Schema Editor ,020204 information systems ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Binary XML ,XML schema ,computer ,XML ,Extreme learning machine ,computer.programming_language - Abstract
Driven by emerging network data exchange and storage, XML document classification has become increasingly important. Most existing representation models and conventional learning algorithms are defined on certain XML documents. However, in many real-world applications, XML datasets contain inherent uncertainty, which brings greater challenges to the classification problem. In this paper, we propose a novel solution for classifying uncertain XML documents, including an uncertain XML document representation and two uncertain learning algorithms based on Extreme Learning Machine. Experimental results show that our approaches exhibit prominent performance on the uncertain XML document classification problem.
- Published
- 2016
41. Computer aided anonymization and redaction of judicial documents
- Author
-
Branko Milosavljević, Gordana Milosavljević, Zora Konjović, Goran Sladić, and Stevan Gostojić
- Subjects
050502 law ,General Computer Science ,Database ,business.industry ,Computer science ,computer.internet_protocol ,Common law ,05 social sciences ,ComputingMilieux_LEGALASPECTSOFCOMPUTING ,Access control ,XML validation ,Redaction ,computer.software_genre ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Legal certainty ,Information system ,Role-based access control ,0509 other social sciences ,050904 information & library sciences ,business ,computer ,XML ,0505 law - Abstract
Public access to case law is a prerequisite for legal certainty and the rule of law. Nevertheless, according to the law, only authorized persons can access judgments in their non-anonymized and unredacted form. This paper proposes a computer-aided method for the anonymization and redaction of judgments, with the aim of improving the efficiency of the overall process. The anonymization and redaction procedure is based on an access control mechanism for XML documents. AKOMA NTOSO is chosen as the XML format in order to facilitate integration with other (legal) information systems, but the proposed method can be easily adapted to different document types and different XML formats. The method is verified by a prototype implementation, which is validated by employees in a court of law.
- Published
- 2016
42. The Study of XML Functional Dependency and Multi-Valued Dependency and Inference Rules Set
- Author
-
Cao Lijun, Zhongping Zhang, and Liu Xiyin
- Subjects
Dependency (UML) ,Theoretical computer science ,General Computer Science ,Computer science ,computer.internet_protocol ,Multivalued dependency ,XML validation ,Join dependency ,computer.software_genre ,Dependency theory (database theory) ,Dependency graph ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Data mining ,Functional dependency ,computer ,Computer Science::Databases ,XML - Abstract
This paper gives formal definitions of XML functional dependency and XML multi-valued dependency, and further defines the trivial XML functional dependency and the trivial XML multi-valued dependency. It defines logical implication, base closures and minimal dependence, and gives a set of inference rules whose effectiveness and completeness are proved for the case in which functional dependencies and multi-valued dependencies exist simultaneously.
- Published
- 2015
43. Hardware-Based High Performance XML Parsing Technique Using an FPGA
- Author
-
Kyu-Hee Lee and Byeong-seok Seo
- Subjects
business.industry ,Programming language ,Computer science ,Efficient XML Interchange ,XML Signature ,XML validation ,computer.file_format ,computer.software_genre ,XML framework ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,XML database ,Simple API for XML ,Streaming XML ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,XML schema ,business ,computer ,Computer hardware ,computer.programming_language - Abstract
Structured XML has been widely used to present services on various Web services. XML is also used for digital documents and digital signatures and for the representation of multimedia files in email systems. An XML document must first be parsed before its elements can be accessed. Parsing is the most compute-intensive task in the use of XML documents. Most previous work has focused on hardware-based XML parsers in order to improve parsing performance, while little work has studied parsing techniques. We present a high-performance parsing technique which can be used in all XML parsers, and design a hardware-based XML parser using an FPGA. The proposed parsing technique uses element analyzers instead of a state machine and performs multibyte-based element matching. As a result, our parsing technique can reduce the number of clock cycles per byte (CPB) and does not require any preprocessing, such as loading XML data into memory. Compared to other parsers, our parser achieves a 1.33~1.82 times improvement in system performance. Therefore, the proposed parsing technique can process XML documents in real time and is suitable for application to all XML parsers.
- Published
- 2015
44. Finding target and constraint concepts for XML query construction
- Author
-
Keng Hoon Gan and K. K. Phang
- Subjects
Document Structure Description ,Query expansion ,Information retrieval ,Web search query ,Computer Networks and Communications ,Web query classification ,Computer science ,XML validation ,Sargable ,Query language ,Query optimization ,Information Systems - Abstract
Purpose: This paper focuses on the automatic selection of two important structural concepts required in an XML query, namely, target and constraint concepts, when given a keyword query. Due to the diversity of concepts used in XML resources, it is not easy to select a correct concept when constructing an XML query. Design/methodology/approach: In this paper, a Context-based Term Weighting model is proposed that performs term weighting based on parts of documents. Each part represents a specific context, thus better capturing the relationship between concepts and terms. For query-time analysis, a Query Context Graph and two algorithms, namely, Select Target and Constraint (QC) and Select Target and Constraint (QCAS), are proposed to find the concepts for constructing an XML query. Findings: Evaluations were performed using structured documents from the conference domain. For constraint concept selection, the approach CTX+TW achieved better results than its baseline, NCTX, when the search term has ambiguous meanings, by using context-based scoring for the concepts. CTX+TW is also stable across various scoring models such as BM25, TFIEF and LM. For target concept selection, CTX+TW outperforms the standard baseline, SLCA, and also records higher coverage than FCA when structural keywords are used in the query. Originality/value: The idea behind this approach is to capture the concepts required for term interpretation based on parts of the collection rather than the entire collection. This allows better selection of concepts, especially when a structured XML document contains many different types of information.
- Published
- 2015
45. XML Data Security
- Author
-
Basavaraj G. Shirur
- Subjects
XML Encryption ,Computer science ,Efficient XML Interchange ,XML Signature ,XML validation ,computer.file_format ,Computer security ,computer.software_genre ,XML database ,XML Protocol ,XML Schema Editor ,Automotive Engineering ,Streaming XML ,computer - Published
- 2016
46. TM-Builder: An Ontology Builder based on XML Topic Maps
- Author
-
Pedro Rangel Henriques, José Carlos Ramalho, and Giovani Rubert Librelotto
- Subjects
Document Structure Description ,Computer science ,SemanticWeb ,Efficient XML Interchange ,XML Base ,02 engineering and technology ,lcsh:QA75.5-76.95 ,XML Schema Editor ,TopicMaps ,0202 electrical engineering, electronic engineering, information engineering ,XML schema ,computer.programming_language ,060201 languages & linguistics ,Information retrieval ,Topic Maps ,Ontology ,cXML ,XML validation ,06 humanities and the arts ,General Medicine ,computer.file_format ,XML ,XSL ,0602 languages and literature ,OntologyExtraction ,020201 artificial intelligence & image processing ,lcsh:Electronic computers. Computer science ,computer - Abstract
Every day a huge number of new information resources are linked to the web. As a result, the web is growing very fast, making search tasks more and more difficult and their results worse. To solve the problem, several initiatives were undertaken and a new area of research and development emerged: the Semantic Web. When we refer to the semantic web we are thinking about a network of concepts. Each concept has a group of related resources and can be related to other concepts; we can then use this concept network to navigate among web resources or simply among information resources. One of these initiatives became an ISO standard: Topic Maps, ISO 13250. The aim of this paper is to introduce a Topic Map (TM) Builder, that is, a processor that extracts topics and relations from instances of a family of XML documents. A TM-Builder is strongly dependent on the structure of the resources. So, to extract a topic map for different collections of information resources (sets of documents with different structures) we have to implement several TM-Builders, one for each collection. This is not very easy! To overcome this inconvenience we have created an XML abstraction layer for TM-Builders that enables us to specify the topic map we want to build from a concrete family of resources, in order to automatically generate the intended extractor. To describe that process, i.e. the extraction of knowledge from XML documents to produce a TM, we present a language to specify topic maps for a class of XML documents, which we call XSTM (XML Specification for Topic Maps). We also discuss an XSL processor, the XSTM-P, that automatically generates the extractor from its formal specification written in XSTM.
- Published
- 2018
47. Evaluating Queries and Updates on Big XML Documents
- Author
-
Nicole Bidoit, Dario Colazzo, Noor Malla, and Carlo Sartiani; Données et Connaissances Massives et Hétérogènes (LRI) (LaHDAK - LRI), Laboratoire de Recherche en Informatique (LRI), Université Paris-Sud - Paris 11 (UP11)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)
- Subjects
XML Encryption ,Computer Networks and Communications ,Computer science ,Efficient XML Interchange ,02 engineering and technology ,computer.software_genre ,Theoretical Computer Science ,Simple API for XML ,XML Schema Editor ,020204 information systems ,Streaming XML ,0202 electrical engineering, electronic engineering, information engineering ,Cloud computing ,XML schema ,ACM: H.: Information Systems/H.2: DATABASE MANAGEMENT ,computer.programming_language ,[INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB] ,Database ,XML validation ,computer.file_format ,XML ,XML database ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Map/Reduce ,020201 artificial intelligence & image processing ,computer ,Software ,Information Systems - Abstract
In this paper we present Andromeda, a system for processing queries and updates on large XML documents. The system is based on the idea of statically and dynamically partitioning the input document, so as to distribute the computing load among the machines of a MapReduce cluster.
- Published
- 2018
48. Machine learning techniques for XML (co-)clustering by structure-constrained phrases
- Author
-
Gianni Costa and Riccardo Ortale
- Subjects
computer.internet_protocol ,Computer science ,Efficient XML Interchange ,XML validation ,02 engineering and technology ,computer.file_format ,Library and Information Sciences ,computer.software_genre ,Biclustering ,XML ,Semi-structured data analysis ,XML (co-)clustering by structure and nested text ,Structure-constrained phrases ,Contextualized n-grams ,Simple API for XML ,020204 information systems ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,020201 artificial intelligence & image processing ,XML schema ,Data mining ,Cluster analysis ,computer ,Information Systems ,computer.programming_language - Abstract
A new method is proposed for clustering XML documents by structure-constrained phrases. It is implemented by three machine-learning approaches previously unexplored in the XML domain, namely non-negative matrix (tri-)factorization, co-clustering and automatic transactional clustering. A novel class of XML features approximately captures structure-constrained phrases as n-grams contextualized by root-to-leaf paths. Experiments over real-world benchmark XML corpora show that the effectiveness of the three approaches improves with contextualized n-grams of suitable length. This confirms the validity of the devised method from multiple clustering perspectives. Two of the approaches outperform several state-of-the-art competitors in effectiveness. The scalability of the three approaches is also investigated.
- Published
- 2018
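The feature named in the abstract of entry 48, n-grams contextualized by root-to-leaf paths, can be illustrated with a minimal sketch. This is only one plausible reading of that phrase, not the authors' implementation: the function name `contextualized_ngrams`, the whitespace tokenization, and the use of tag names joined by `/` as the path are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def contextualized_ngrams(xml_text, n=2):
    """Pair each word n-gram in a leaf element with that leaf's root-to-leaf path.

    Hypothetical helper: tokenization is plain whitespace splitting and the
    path is the sequence of tag names from the root down to the leaf.
    """
    root = ET.fromstring(xml_text)
    features = []

    def walk(elem, path):
        path = path + [elem.tag]
        children = list(elem)
        if not children:
            # Leaf element: emit every n-gram of its text, tagged with the path.
            tokens = (elem.text or "").split()
            for i in range(len(tokens) - n + 1):
                features.append(("/".join(path), tuple(tokens[i:i + n])))
        for child in children:
            walk(child, path)

    walk(root, [])
    return features


doc = "<paper><title>XML clustering methods</title><body><sec>short</sec></body></paper>"
for path, gram in contextualized_ngrams(doc, n=2):
    print(path, gram)
```

The same phrase occurring under different paths yields different features, which is how such a representation can combine structure and content in a single feature space.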
49. An Approach to the Validation of XML Documents Based on the Model Driven Architecture and the Object Constraint Language
- Author
-
Ruslan L. Sivakov, Denis A. Nikiforov, and Dmitriy V. Korj
- Subjects
SQL ,computer.internet_protocol ,Computer science ,Programming language ,XML validation ,0102 computer and information sciences ,02 engineering and technology ,XSLT ,computer.software_genre ,01 natural sciences ,JSON ,010201 computation theory & mathematics ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,XML schema ,computer ,XML ,Object Constraint Language ,computer.programming_language ,XPath - Abstract
It is possible to develop data processing applications using a variety of different data representation formats (EDI, CSV, XML, JSON), domain-specific languages, and general-purpose programming languages (XSLT, SQL, Java, C#). On the one hand, such variety allows one to choose the most suitable data format or language for the specific requirements at hand; on the other, contemporary information systems, or complexes of integrated information systems, have come to resemble the Tower of Babel and are cumbersome to build and maintain. A possible solution to this issue is to develop platform-independent specifications that are used to generate the source code for each required platform.
- Published
- 2018
50. Declarative XML Schema Validation with SWI-Prolog
- Author
-
Falco Nogatz and Jona Kalkus
- Subjects
Unification ,Computer science ,Programming language ,Backtracking ,computer.internet_protocol ,XML validation ,computer.software_genre ,Prolog ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,Validator ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,XML schema ,computer ,Protocol (object-oriented programming) ,XML ,computer.programming_language - Abstract
XML Schema is a well-established mechanism to define the structure and constrain the content of an XML document. While this approach taken by itself is declarative, currently available tools for XML validation are not. In this paper we introduce an implementation of an XSD validator in SWI-Prolog, made publicly available as the package library(xsd). Our approach is based on flattening the XSD and XML documents into Prolog facts. The top-down validation makes great use of Prolog's backtracking and unification capabilities. To ensure compliance with the XSD standard and to support test-driven development, we have created a test framework based on the Test Anything Protocol and SWI-Prolog's quasi-quotations.
- Published
- 2018
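The idea mentioned in the abstract of entry 50, flattening an XML document into facts, can be sketched roughly as follows. This is an illustration in Python, not the representation used by library(xsd): the fact shapes `("node", id, tag, parent_id)`, `("attribute", id, name, value)` and `("text", id, text)`, and the function name `flatten`, are hypothetical.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Flatten an XML tree into a list of tuples resembling Prolog facts.

    Hypothetical fact shapes, chosen for illustration only:
      ("node", id, tag, parent_id)
      ("attribute", id, name, value)
      ("text", id, text)
    """
    root = ET.fromstring(xml_text)
    facts = []
    counter = 0

    def visit(elem, parent_id):
        nonlocal counter
        node_id = counter
        counter += 1
        facts.append(("node", node_id, elem.tag, parent_id))
        # Sort attributes so the output order is deterministic.
        for name, value in sorted(elem.attrib.items()):
            facts.append(("attribute", node_id, name, value))
        text = (elem.text or "").strip()
        if text:
            facts.append(("text", node_id, text))
        for child in elem:
            visit(child, node_id)

    visit(root, None)
    return facts


for fact in flatten('<a x="1"><b>hi</b></a>'):
    print(fact)
```

Once the tree is a flat set of facts keyed by node identifiers, a validator can look up children, attributes and text by identifier instead of traversing a nested structure.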