1. Open Information Extraction for Knowledge Representation: Triple Extraction and Information Retrieval From Unstructured Text
- Author
- Sarhan, Ingy
- Abstract
The field of Natural Language Processing (NLP) focuses on developing computational techniques to analyze and extract information from human language. With the exponential growth of unstructured textual data, NLP-based techniques have become essential for extracting valuable insights from this data. However, existing information extraction systems are limited in their ability to extract valuable information without predefined relations or ontologies and to store the extracted knowledge effectively. This Ph.D. thesis aims to enhance open information extraction methods to represent unstructured textual data efficiently and effectively. The first part of the research focuses on Open Information Extraction (OIE) systems and their challenges. Existing OIE methods, including pattern-based, machine learning-based, and neural approaches, are analyzed to understand their limitations. Chapter 3 proposes a Bidirectional Gated Recurrent Unit (Bi-GRU) OIE model that uses contextualized word embeddings to extract relevant triples from unstructured text. Experimental results demonstrate the effectiveness of this model in generating high-quality relation triples. Chapter 4 addresses the lack of labeled data, a common problem in NLP tasks. The research extends the OIE model from Chapter 3 by using learned features to generate relation triples and explores the transferability of these features across different OIE domains and to the related task of Relation Extraction (RE). The results show performance comparable to traditional training, indicating the potential of OIE to achieve NLP performance without labeled data. In Chapter 5, the focus shifts to enhancing pre-trained language models for taxonomy classification. Pre-trained language models often struggle with unseen patterns during inference, and the limited size of annotated data poses a challenge. A two-stage fine-tuning procedure incorporating data augmentation techniques is proposed to improve classification performance. (An illustrative sketch of Bi-GRU-based triple extraction appears after this record.)
- Published
- 2023
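
As a companion to the abstract above, the following is a minimal sketch of how a Bi-GRU tagger over contextualized word embeddings can label tokens for (subject, relation, object) triple extraction. The tag set, dimensions, and toy inputs are illustrative assumptions only and do not reproduce the thesis's actual architecture or training setup.

```python
# Minimal sketch: a Bi-GRU tagger that assigns each token a BIO-style tag
# (ARG1 / REL / ARG2 / O), from which (subject, relation, object) triples
# can be read off. Embedding size, hidden size, and tag set are assumptions
# for illustration, not the thesis's configuration.
import torch
import torch.nn as nn

TAGS = ["O", "B-ARG1", "I-ARG1", "B-REL", "I-REL", "B-ARG2", "I-ARG2"]

class BiGRUTagger(nn.Module):
    def __init__(self, embed_dim=768, hidden_dim=256, num_tags=len(TAGS)):
        super().__init__()
        # Bidirectional GRU over pre-computed contextualized embeddings
        # (e.g. produced by a pre-trained language model).
        self.gru = nn.GRU(embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, embeddings):
        # embeddings: (batch, seq_len, embed_dim)
        hidden, _ = self.gru(embeddings)
        return self.classifier(hidden)  # per-token tag logits

# Toy usage: one "sentence" of six tokens with random stand-in embeddings.
model = BiGRUTagger()
tokens = ["Rome", "is", "the", "capital", "of", "Italy"]
embeddings = torch.randn(1, len(tokens), 768)
logits = model(embeddings)
pred_tags = [TAGS[i] for i in logits.argmax(-1)[0].tolist()]
print(list(zip(tokens, pred_tags)))
```

In a complete OIE system the predicted BIO spans would be decoded into relation triples and the tagger would be trained on labeled or bootstrapped OIE data; here random tensors merely stand in for a pre-trained language model's contextualized output.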