808 results for "dependency parsing"
Search Results
2. DPMN: Multi-Task Learning Network for Problem of Overlapping Relation Extraction.
- Author
LI Yajie, TANG Guogen, and LI Ping
- Subjects
LEARNING strategies, PIPELINE failures, NATURAL language processing
- Abstract
As one of the basic components of natural language processing, relation extraction aims to extract relation facts from a given unstructured text. In practical applications, there is a lack of entity information at the sentence level, and there are often scenarios where a single sentence contains multiple overlapping relation triplets. Relation triplets can generate multiple cross overlaps, making relation extraction tasks more challenging. Early research uses a pipeline method, which not only ignores the relevance of entity recognition and relationship prediction but is also vulnerable to propagation of uncertainty. This paper proposes a multi-task learning network (DPMN) based on dependency parsing, which can identify entity spans more accurately through dependency parsing, enrich relation semantics, and use multi-task learning strategies to enhance the interaction between various subtasks. Compared with the baseline model, DPMN has better performance in relation triplet extraction, which alleviates the problem of overlapping relations to some extent. [ABSTRACT FROM AUTHOR]
- Published
- 2024
3. Multi-source domain adaptation for dependency parsing via domain-aware feature generation.
- Author
Li, Ying, Zhang, Zhenguo, Xian, Yantuan, Yu, Zhengtao, Gao, Shengxiang, Mao, Cunli, and Huang, Yuxin
- Abstract
With deep representation learning advances, supervised dependency parsing has achieved a notable enhancement. However, when the training data is drawn from various predefined out-domains, the parsing performance drops sharply due to the domain distribution shift. The key to addressing this problem is to model the associations and differences between multiple source and target domains. In this work, we propose an innovative domain-aware adversarial and parameter generation network for multi-source cross-domain dependency parsing where a domain-aware parameter generation network is used for identifying domain-specific features and an adversarial network is used for learning domain-invariant ones. Experiments on the benchmark datasets reveal that our model outperforms strong BERT-enhanced baselines by 2 points in the average labeled attachment score (LAS). Detailed analysis of various domain representation strategies shows that our proposed distributed domain embedding can accurately capture domain relevance, which motivates the domain-aware parameter generation network to emphasize useful domain-specific representations and disregard unnecessary or even harmful ones. Additionally, extensive comparison experiments show deeper insights on the contributions of the two components. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. Hybrid Detection Method for Multi-Intent Recognition in Air–Ground Communication Text.
- Author
Pan, Weijun, Wang, Zixuan, Wang, Zhuang, Wang, Yidi, and Huang, Yuanjing
- Subjects
LANGUAGE models, AIR traffic control, ARTIFICIAL intelligence, TEXT recognition, DEEP learning, KNOWLEDGE representation (Information theory)
- Abstract
In recent years, the civil aviation industry has actively promoted the automation and intelligence of control processes with the increasing use of various artificial intelligence technologies. Air–ground communication, as the primary means of interaction between controllers and pilots, typically involves one or more intents. Recognizing multiple intents within air–ground communication texts is a critical step in automating and advancing the control process intelligently. Therefore, this study proposes a hybrid detection method for multi-intent recognition in air–ground communication text. This method improves recognition accuracy by using different models for single-intent texts and multi-intent texts. First, the air–ground communication text is divided into two categories using multi-intent detection technology: single-intent text and multi-intent text. Next, for single-intent text, the Enhanced Representation through Knowledge Integration (ERNIE) 3.0 model is used for recognition; while the A Lite Bidirectional Encoder Representations from Transformers (ALBERT)_Sequence-to-Sequence_Attention (ASA) model is proposed for identifying multi-intent texts. Finally, combining the recognition results from the two models yields the final result. Experimental results demonstrate that using the ASA model for multi-intent text recognition achieved an accuracy rate of 97.84%, which is 0.34% higher than the baseline ALBERT model and 0.15% to 0.87% higher than other improved models based on ALBERT and ERNIE 3.0. The single-intent recognition model achieved an accuracy of 96.23% when recognizing single-intent texts, which is at least 2.18% higher than the multi-intent recognition model. The results indicate that employing different models for various types of texts can substantially enhance recognition accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
5. Exploring Multi-Level User Perceived Quality through Dependency Syntax Analysis and Hierarchical Clustering.
- Author
Gui, Shumeng, Xu, Zhaoguang, and Dang, Yanzhong
- Subjects
ELECTRIC vehicles, SOCIAL media, HIERARCHICAL clustering (Cluster analysis), ENERGY industries, NOUN phrases (Grammar)
- Abstract
The vast and unstructured data on social media platforms offer insights into user-perceived quality, presenting a novel avenue for new energy vehicle companies to analyze product quality. In response, this study introduces a methodology that integrates dependency syntactic parsing with hierarchical clustering to derive multi-level insights on user-perceived quality. Initially, we employ dependency syntactic parsing and part-of-speech tagging to identify compound noun phrases within comments. These phrases serve as a specialized out-of-vocabulary library pertinent to the new energy vehicle sector, from which a selection of words is chosen as potential evaluation metrics. Subsequently, we utilize Word2Vec to develop word vectors from automotive forum corpora. Leveraging these word vectors and the identified evaluation metrics, a hierarchical clustering algorithm is then applied to establish a comprehensive three-level user-perceived quality indicator system. Experimental validation conducted on forum comment data from BYD's new energy vehicles confirms the reliability and effectiveness of the proposed methodology. The user-perceived quality extraction method delineated in this research aids automotive firms in pinpointing areas of user interest, thereby substantially enhancing user loyalty and satisfaction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
6. De-Noising Tail Entity Selection in Automatic Question Generation with Fine-Tuned T5 Model
- Author
Tharaniya Sairaj, R., Balasundaram, S. R., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Nanda, Satyasai Jagannath, editor, Yadav, Rajendra Prasad, editor, Gandomi, Amir H., editor, and Saraswat, Mukesh, editor
- Published
- 2024
7. Automatic dependency parsing of Estonian: what linguistic features to include?
- Author
Laur, Sven, Orasmaa, Siim, Eiche, Sandra, and Särg, Dage
- Published
- 2024
8. Morphological competence in neural natural language processing
- Author
Czarnowska, Paula, Copestake, Ann, and Cotterell, Ryan
- Subjects
case-marking languages, dependency parsing, Estonian, Finnish, linguistically-oriented analysis of neural networks, morphological competence, morphologically rich languages, natural language processing, Polish, Russian
- Abstract
In case-marking languages (CMLs), such as Polish or Finnish, a substantial portion of grammatical information is expressed at the word-level. The word-forms provide information about their inherent properties, like tense or mood, but also encode information about relations between the words. This is in contrast to morphologically impoverished languages, like English, which undergo little inflection. The linguistic factors associated with CMLs make them a challenge for data-driven, neural natural language processing (NLP). To successfully process a CML, neural NLP models must be morphologically competent; i.e., they have to capture both the meaning and function of different components of a word form and recognise the importance of morphological signals within a language. Despite the importance of morphological competence for language processing, the neural NLP models have never been directly tested for that linguistic ability. This gap in the literature is the more important given that most neural NLP models are developed with English language in mind and later applied, without any adaptations, to other languages. It remains unclear whether the architectures and optimization techniques developed on English are able to extract all the essential information from the word-forms of CMLs and whether they can interpret this information at the clausal-level to solve NLP tasks. In this thesis I investigate whether state-of-the-art neural models for CMLs utilise morphosyntactic information when solving a task for which this information is key: dependency parsing. To answer this question I propose a new evaluation paradigm which involves evaluating the models on various counterfactual versions of dependency corpora. 
Through evaluation of Polish, Russian, Finnish and Estonian dependency parsers, I reveal that the models often fail to recognise morphology as the primary indicator of syntax; instead of generalising based on the case and agreement markings, they learn to over-rely on word order and lexical semantics. Following this finding, I experiment with two methods of increasing the models' reliance on morphology: one based on the alteration of the training data and another involving an enhanced training objective. Finally, through creating synthetic CMLs by manipulating selected typological properties of Polish, I investigate whether the models have a 'preference' for the means of encoding case information and reveal that syncretism and high fusion are amongst the properties that drive the models away from relying on morphology as a signal to subject/objecthood.
- Published
- 2023
9. A multi-feature fusion approach based on domain adaptive pretraining for aspect-based sentiment analysis.
- Author
Ma, Yinglong, He, Ming, Pang, Yunhe, Wang, Libiao, and Liu, Huili
- Subjects
SENTIMENT analysis, LANGUAGE models, PARSING (Computer grammar), DEEP learning, MACHINE learning, SEMANTICS
- Abstract
Aspect-based sentiment analysis aims to recognize the sentiment polarities for opinion words with the aid of some machine learning or deep learning-based sentiment classification models. Dependency parsing has been considered as an efficient tool for identifying the opinion words in the sentiment text. However, many dependency-based methods might be susceptible to the dependency tree, which inevitably introduces noisy information due to that the rich relation information between words is neglected. In this paper, we propose a multi-feature fusion approach based on domain adaptive pretraining to reduce dependency-based noisy information for aspect-based sentiment classification (ASC). First, we utilize multi-task learning (MTL) for domain adaptive pretraining, which combines biaffine attention model (BAM) and mask language model (MLM) by jointly considering the structure, the relation semantics of edges, and the linguistic feature in the sentiment text. Second, to fully consider these different features affected with each other, a double graph fusion model is proposed, which takes as input the pretrained dependency graph into a message passing neural network (MPNN) initialized with the optimal parameters of the pretrained BAM for training. Lastly, extensive experiments were made against the state-of-the-art competitors over four benchmark datasets, and the results illustrate that our approach outperforms these competitors over most of the four datasets and achieves an accuracy of up to 92.69% and a macro-averaged F1 value of up to 85.79%. The MTL-based domain adaptive pretraining is efficient to achieve high-quality dependency parsing contributing to improving the performance of ASC, while maintaining lower computational cost. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Enhancing text classification with attention matrices based on BERT.
- Author
Yu, Zhiyi, Li, Hong, and Feng, Jialin
- Subjects
LANGUAGE models, CLASSIFICATION, LEARNING strategies, NATURAL language processing
- Abstract
Summary: Text classification is a critical task in the field of natural language processing. While pre‐trained language models like BERT have made significant strides in improving performance in this area, the distinctive dependency information that is present in text has not been fully exploited. Besides, BERT mostly captures phrase‐level information in lower layers, which becomes progressively weaker with the increasing depth of layers. To address these limitations, our work focuses on enhancing text classification through the incorporation of Attention Matrices, particularly in the fine‐tuning process of pre‐trained models like BERT. Our approach, named AM‐BERT, leverages learned dependency relationships as external knowledge to enhance the pre‐trained model by generating attention matrices. In addition, we introduce a new learning strategy that enables the model to retain learned phrase‐level structure information. Extensive experiments and detailed analysis on multiple benchmark datasets demonstrate the effectiveness of our approach in text classification tasks. Furthermore, we show that AM‐BERT achieves stable performance improvements also in named entity recognition tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. CR-M-SpanBERT: Multiple embedding-based DNN coreference resolution using self-attention SpanBERT
- Author
Joon-young Jung
- Subjects
coreference resolution, dependency parsing, multiple embedding, Telecommunication, TK5101-6720, Electronics, TK7800-8360
- Abstract
This study introduces CR-M-SpanBERT, a coreference resolution (CR) model that utilizes multiple embedding-based span bidirectional encoder representations from transformers, for antecedent recognition in natural language (NL) text. Information extraction studies aimed to extract knowledge from NL text autonomously and cost-effectively. However, the extracted information may not represent knowledge accurately owing to the presence of ambiguous entities. Therefore, we propose a CR model that identifies mentions referring to the same entity in NL text. In the case of CR, it is necessary to understand both the syntax and semantics of the NL text simultaneously. Therefore, multiple embeddings are generated for CR, which can include syntactic and semantic information for each word. We evaluate the effectiveness of CR-M-SpanBERT by comparing it to a model that uses SpanBERT as the language model in CR studies. The results demonstrate that our proposed deep neural network model achieves high-recognition accuracy for extracting antecedents from NL text. Additionally, it requires fewer epochs to achieve an average F1 accuracy greater than 75% compared with the conventional SpanBERT approach.
- Published
- 2024
12. Integrating graph embedding and neural models for improving transition-based dependency parsing.
- Author
Le-Hong, Phuong and Cambria, Erik
- Subjects
NATURAL language processing, PARSING (Computer grammar), RECURRENT neural networks, ARTIFICIAL neural networks, LANGUAGE research, NATURAL languages
- Abstract
This paper introduces an effective method for improving dependency parsing which is based on a graph embedding model. The model helps extract local and global connectivity patterns between tokens. This method allows neural network models to perform better on dependency parsing benchmarks. We propose to incorporate node embeddings trained by a graph embedding algorithm into a bidirectional recurrent neural network scheme. The new model outperforms a baseline reference using a state-of-the-art method on three dependency treebanks for both low-resource and high-resource natural languages, namely Indonesian, Vietnamese and English. We also show that the popular pretraining technique of BERT would not pick up on the same kind of signal as graph embeddings. The new parser together with all trained models is made available under an open-source license, facilitating community engagement and advancement of natural language processing research for two low-resource languages with around 300 million users worldwide in total. [ABSTRACT FROM AUTHOR]
- Published
- 2024
13. Using pre-trained models and graph convolution networks to find the causal relations among events in the Chinese financial text data.
- Author
Hu, Kai, Li, Qing, Xie, Jie, Pu, Yingyan, and Guo, Ya
- Subjects
CHINESE language, PARSING (Computer grammar), SOFTWARE as a service, SPACE exploration, RESEARCH personnel, CONTRACTING out, RANDOM graphs
- Abstract
Nowadays, information explosion happens in every field. In the stock market of China, automatically understanding the market dynamics is extremely important. However, the information and datasets in Chinese are overwhelming for researchers in the field. How to extract useful information and understand the underlying logic in the Chinese corpus is a research hotspot. Causal relation identification is one of the most central tasks. Many works have made important progress in finding the causal relations in open-domain text; however, there is still space for further exploration in the specific domain of the financial field. In this paper, we propose to use the graph convolution network (GCN) to help represent the dependency relations among the entities in the logical networks provided by the Chinese dependency parsing tool, Language Technology Platform (LTP). The motivation for using the GCN method to help represent the dependency relations is that the causal relations are highly correlated with language structures. Besides, we also choose to use the domain-specific pre-trained model FinBERT because this pre-trained model is specific to the financial field. Results show that the GCN-based method and the pre-trained FinBERT model in our proposed model play a key role in outperforming the baseline traditional sequence labeling method and the state-of-the-art method, raising F1 from 0.4573 and 0.5506, respectively, to 0.6254. Our approach also won third place in the Eastern District of the software service outsourcing competition in China in 2021. We believe the proposed methods can contribute as at least an alternative option in future relation extraction tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
14. CR‐M‐SpanBERT: Multiple embedding‐based DNN coreference resolution using self‐attention SpanBERT.
- Author
Jung, Joon‐young
- Subjects
LANGUAGE models, ARTIFICIAL neural networks, DATA mining
- Abstract
This study introduces CR‐M‐SpanBERT, a coreference resolution (CR) model that utilizes multiple embedding‐based span bidirectional encoder representations from transformers, for antecedent recognition in natural language (NL) text. Information extraction studies aimed to extract knowledge from NL text autonomously and cost‐effectively. However, the extracted information may not represent knowledge accurately owing to the presence of ambiguous entities. Therefore, we propose a CR model that identifies mentions referring to the same entity in NL text. In the case of CR, it is necessary to understand both the syntax and semantics of the NL text simultaneously. Therefore, multiple embeddings are generated for CR, which can include syntactic and semantic information for each word. We evaluate the effectiveness of CR‐M‐SpanBERT by comparing it to a model that uses SpanBERT as the language model in CR studies. The results demonstrate that our proposed deep neural network model achieves high‐recognition accuracy for extracting antecedents from NL text. Additionally, it requires fewer epochs to achieve an average F1 accuracy greater than 75% compared with the conventional SpanBERT approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
15. Automatic question generation using extended dependency parsing.
- Author
Sewunetie, Walelign Tewabe and Kovacs, Laszlo
- Subjects
SPEECH perception, MACHINE learning
- Abstract
The importance of automatic question generation (AQG) systems in education is recognized for automating tasks and providing adaptive assessments. Recent research focuses on improving quality with advanced neural networks and machine learning techniques. However, selecting the appropriate target sentences and concepts remains challenging in AQG systems. To address this problem, the authors created a novel system that combined sentence structure analysis, dependency parsing approach, and named entity recognition techniques to select the relevant target words from the given sentence. The main goal of this paper is to develop an AQG system using syntactic and semantic sentence structure analysis. Evaluation using manual and automatic metrics shows good performance on simple and short sentences, with an overall score of 3.67 out of 5.0. As the field of AQG continues to evolve rapidly, future research should focus on developing more advanced models that can generate a wider range of questions, especially for complex sentence structures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Comparative relation mining of customer reviews based on a hybrid CSR method.
- Author
Gao, Song, Wang, Hongwei, Zhu, Yuanjun, Liu, Jiaqi, and Tang, Ou
- Abstract
Online reviews contain comparative opinions that reveal the competitive relationships of related products, help identify the competitiveness of products in the marketplace, and influence consumers' purchasing choices. The Class Sequence Rule (CSR) method, which is previously commonly used to identify the comparative relations of reviews, suffers from low recognition efficiency and inaccurate generation of rules. In this paper, we improve on the CSR method by proposing a hybrid CSR method, which utilises dependency relations and the part-of-speech to identify frequent sequence patterns in customer reviews, which can reduce manual intervention and reinforce sequence rules in the relation mining process. Such a method outperforms CSR and other CSR-based models with an F-value of 84.67%. In different experiments, we find that the method is characterised by less time-consuming and efficient in generating sequence patterns, as the dependency direction helps to reduce the sequence length. In addition, this method also performs well in implicit relation mining for extracting comparative information that lacks obvious rules. In this study, the optimal CSR method is applied to automatically capture the deeper features of comparative relations, thus improving the process of recognising explicit and implicit comparative relations. [ABSTRACT FROM AUTHOR]
- Published
- 2023
17. A dependency-based hybrid deep learning framework for target-dependent sentiment classification.
- Author
Liu, Jingyi and Li, Sheng
- Subjects
DEEP learning, LINGUISTIC models, SENTIMENT analysis, PARSING (Computer grammar), CLASSIFICATION, ELECTRONIC data processing
- Abstract
One of the main challenges in target-dependent sentiment classification (TDSC) is dealing with sentences that contain multiple targets with varying polarities. Traditional sentiment analysis has shown the effectiveness of language characteristics. Therefore, we propose a method to extract target semantic-related tokens from sentences in order to simplify the sentiment classification task. To achieve this, we establish six grammatical principles that utilize grammatical knowledge to filter the relevant descriptions of targets. Since a target is typically a noun and acts as a subject, we summarize the six rules to extract the contexts contained in the objects and subordinate clauses. We use dependency parsing to analyze the grammatical relations between the target and its context. We design a data pre-processing method called Text Filtering (TF) to automate this procedure. After executing the TF algorithm, we pass the target-related words to a simple classifier to predict their sentiment polarities. Rather than feeding these features directly to a network and letting it learn features on its own, our approach employs dependency relations to extract context linked to the target. This provides the network with meaningful and representative features, resulting in superior results. We conduct ablation studies to investigate the effectiveness of the proposed TF algorithm. In the restaurant hard dataset, our approach improves accuracy by 13.76% and macro-F1 by 14.65% compared to a CNN-based method where TF is not implemented.
• Incorporate linguistic knowledge to the model by a two-step framework for effective handling of multiple targets.
• Target-related contexts were extracted by the summarized rules which distinguish content and function words.
• A hierarchical and recursive data process method was designed specifically for TDSC named Text filtering (TF).
[ABSTRACT FROM AUTHOR]
- Published
- 2023
18. A hitchhiker's guide to efficient non-projective dependency parsing
- Author
Zmigrod, Ran, Griffin, Timothy, and Cotterell, Ryan
- Subjects
Algorithms, Dependency Parsing, PhD Thesis
- Abstract
Dependency parsing has played a pivotal role in NLP over the last few decades, and non-projective, graph-based dependency parsers have become a dominant approach for this task in the 21st century. While much research has focused on advancing log-linear parameterisations and neural network development, there has been a paucity of research addressing the fundamental algorithms that allow us to use graph-based dependency parsers. This thesis examines algorithms used in four stages of non-projective, graph-based dependency parsers: inference, sampling, decoding, and significance testing. The thesis will guide the reader through a series of novel extensions to existing algorithms, as well as the original development of new algorithms concerning these four processes in the dependency parsing galaxy. We provide a framework for efficiently evaluating a family of expectations that can be used for inference, such as risk, the Shannon entropy, the KL divergence, inter alia, by capitalising on the connection between gradients and expectations. We further introduce two fast methods for sampling trees from graph theory, and extend them to satisfy a common constraint in dependency parsing schemes that only allows one edge to emanate from the root. Additionally, we contribute the first sampling-without-replacement algorithm for graph-based dependency parsers. We also analyse, simplify, and extend algorithms for decoding the one-best and K-best trees and discuss existing algorithms and new modifications to enable root constrained decoding. Finally, we propose a novel algorithm to perform an exact paired permutation significance test when comparing the performance of two parsers. Not only is this the first exact algorithm to perform this test, but it is also faster than current approximation strategies. This thesis offers insights into efficient algorithms in different aspects of dependency parsing.
We shed light on core algorithms from the graph theory literature and develop these algorithms so that they execute more efficiently and satisfy necessary constraints in dependency parsing. This hitchhiker's guide to dependency parsing is accompanied by publicly available code that can be easily integrated into modern parsers, enabling more efficient parsers in future research.
- Published
- 2022
19. CoreNLP dependency parsing and pattern identification for enhanced opinion mining in aspect-based sentiment analysis
- Author
Makera Moayad Aziz, Azuraliza Abu Bakar, and Mohd Ridzwan Yaakub
- Subjects
Aspect-based sentiment analysis, Sentiment analysis, Dependency parsing, Data processing, Deep learning, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Aspect-Based Sentiment Analysis (ABSA) aims to identify the sentiment expressed towards a specific feature or aspect of a given text. Although certain ABSA techniques employ syntactic information to capture the connection between the opinion target and the sentiment word, they often do not incorporate data processing techniques such as dependency parsing, which can be beneficial in accurately capturing the sentiment expressed towards the opinion target. This paper proposes a method for ABSA, the Semantic-Syntactic Dependency Parsing (SSDP) method, that employs both syntactic and semantic information and incorporates dependency parsing with Core Natural Language Processing (CoreNLP), a natural language processing library, to process the input text and identify patterns (according to the CoreNLP relations and part-of-speech (POS) tagging) and thereby extract the critical relations that accurately reflect the sentiment conveyed regarding the opinion target. The results show that the proposed pattern captured approximately 75% of the data, and the rest were classified via Long Short-Term Memory (LSTM) based on semantic information. We illustrated the efficacy of SSDP through experiments on the SemEval14, SemEval15, and SemEval16 datasets, which include two datasets (laptops and restaurants) carefully categorized by a human annotator into categories of positive, negative, or neutral attitudes. The experimental results reveal that SSDP is superior to the other state-of-the-art ABSA approaches that use syntax information but do not utilize data processing techniques. Additionally, we highlight the limitations of ABSA methods that do not incorporate syntax information and the potential improvements that can be made through the use of data processing.
- Published
- 2024
20. Examining research topics with a dependency-based noun phrase extraction method: a case in accounting
- Author
Lei, Lei, Deng, Yaochen, and Liu, Dilin
- Published
- 2023
21. Hybrid Detection Method for Multi-Intent Recognition in Air–Ground Communication Text
- Author
Weijun Pan, Zixuan Wang, Zhuang Wang, Yidi Wang, and Yuanjing Huang
- Subjects
deep learning, multi-intent recognition, multi-intent detection, dependency parsing, air traffic control, Motor vehicles. Aeronautics. Astronautics, TL1-4050
- Abstract
In recent years, the civil aviation industry has actively promoted the automation and intelligence of control processes with the increasing use of various artificial intelligence technologies. Air–ground communication, as the primary means of interaction between controllers and pilots, typically involves one or more intents. Recognizing multiple intents within air–ground communication texts is a critical step in automating and advancing the control process intelligently. Therefore, this study proposes a hybrid detection method for multi-intent recognition in air–ground communication text. This method improves recognition accuracy by using different models for single-intent texts and multi-intent texts. First, the air–ground communication text is divided into two categories using multi-intent detection technology: single-intent text and multi-intent text. Next, for single-intent text, the Enhanced Representation through Knowledge Integration (ERNIE) 3.0 model is used for recognition; while the A Lite Bidirectional Encoder Representations from Transformers (ALBERT)_Sequence-to-Sequence_Attention (ASA) model is proposed for identifying multi-intent texts. Finally, combining the recognition results from the two models yields the final result. Experimental results demonstrate that using the ASA model for multi-intent text recognition achieved an accuracy rate of 97.84%, which is 0.34% higher than the baseline ALBERT model and 0.15% to 0.87% higher than other improved models based on ALBERT and ERNIE 3.0. The single-intent recognition model achieved an accuracy of 96.23% when recognizing single-intent texts, which is at least 2.18% higher than the multi-intent recognition model. The results indicate that employing different models for various types of texts can substantially enhance recognition accuracy.
- Published
- 2024
22. Advancing Hungarian Text Processing with HuSpaCy: Efficient and Accurate NLP Pipelines
- Author
-
Orosz, György, Szabó, Gergő, Berkecz, Péter, Szántó, Zsolt, Farkas, Richárd, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Ekštein, Kamil, editor, Pártl, František, editor, and Konopík, Miloslav, editor
- Published
- 2023
23. Sentiment Component Extraction from Dependency Parse for Hindi
- Author
-
Panicker, Remya, Bhavsar, Ramchandra, Pawar, B. V., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Suma, V., editor, Lorenz, Pascal, editor, and Baig, Zubair, editor
- Published
- 2023
24. Enhancing Medication Event Classification with Syntax Parsing and Adversarial Learning
- Author
-
Szántó, Zsolt, Bánáti, Balázs, Zombori, Tamás, Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Goedicke, Michael, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Stettner, Lukasz, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Rettberg, Achim, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Maglogiannis, Ilias, editor, Iliadis, Lazaros, editor, MacIntyre, John, editor, and Dominguez, Manuel, editor
- Published
- 2023
25. Combining Graph-Based Dependency Features with Convolutional Neural Network for Answer Triggering
- Author
-
Gupta, Deepak, Kohail, Sarah, Bhattacharyya, Pushpak, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, and Gelbukh, Alexander, editor
- Published
- 2023
26. DAT-MT Accelerated Graph Fusion Dependency Parsing Model for Small Samples in Professional Fields.
- Author
-
Li, Rui, Shu, Shili, Wang, Shunli, Liu, Yang, Li, Yanhao, and Peng, Mingjun
- Subjects
*DEEP learning , *PARSING (Computer grammar) , *INFORMATION technology , *TIME complexity , *KNOWLEDGE graphs , *INFORMATION overload , *FEATURE extraction - Abstract
The rapid development of information technology has made the amount of information in massive texts far exceed human intuitive cognition, and dependency parsing can effectively deal with this information overload. Against the background of domain specialization, the migration and application of syntactic treebanks and the speed of syntactic analysis models become the key to efficient syntactic analysis. To realize domain migration of syntactic treebanks and improve the speed of text parsing, this paper proposes a novel approach: the Double-Array Trie and Multi-threading (DAT-MT) accelerated graph fusion dependency parsing model. It effectively combines the specialized syntactic features from a small-scale professional field corpus with the generalized syntactic features from a large-scale news corpus, which improves the accuracy of syntactic relation recognition. To address the high space and time complexity brought by the graph fusion model, the DAT-MT method is proposed. It realizes the rapid mapping of massive Chinese character features to the model's prior parameters and the parallel processing of calculation, thereby improving the parsing speed. The experimental results show that the unlabeled attachment score (UAS) and the labeled attachment score (LAS) of the model are improved by 13.34% and 14.82% compared with the model trained only on the professional field corpus, and by 3.14% and 3.40% compared with the model trained only on the news corpus; both indicators are better than those of the deep-learning-based DDParser and LTP 4 methods. Additionally, the method in this paper achieves a speedup of about 3.7 times compared to the method with a red-black tree index and a single thread. Efficient and accurate syntactic analysis methods will benefit the real-time processing of massive texts in professional fields, such as multi-dimensional semantic correlation, professional feature extraction, and domain knowledge graph construction. [ABSTRACT FROM AUTHOR]
- Published
- 2023
27. Using Syntax and Shallow Semantic Analysis for Vietnamese Question Generation.
- Author
-
Tran, Phuoc, Nguyen, Duy Khanh, Tran, Tram, and Vo, Bay
- Subjects
VIETNAMESE language ,SYNTAX (Grammar) ,NATURAL language processing ,SEMANTICS ,PARTS of speech - Abstract
This paper presents a method of using syntax and shallow semantic analysis for Vietnamese question generation (QG). Specifically, our proposed technique concentrates on investigating both the syntactic and shallow semantic structure of each sentence. The main goal of our method is to generate questions from a single sentence. These generated questions are known as factoid questions, which require short, fact-based answers. In general, syntax-based analysis is one of the most popular approaches within the QG field, but it requires linguistic expert knowledge as well as a deep understanding of syntax rules in the Vietnamese language. It is thus considered a high-cost and inefficient solution because producing qualified syntax rules requires significant human effort. To deal with this problem, we collected the syntax rules in Vietnamese from a Vietnamese language textbook. Moreover, we also used different natural language processing (NLP) techniques to analyze Vietnamese shallow syntax and semantics for the QG task. These techniques include sentence segmentation, word segmentation, part-of-speech tagging, chunking, dependency parsing, and named entity recognition. We used human evaluation to assess the credibility of our model: we manually generated questions from the corpus and then compared them with the automatically generated questions. The empirical evidence demonstrates that our proposed technique performs well, in that the generated questions are very similar to those created by humans. [ABSTRACT FROM AUTHOR]
- Published
- 2023
28. An intelligent extension of the training set for the Persian n-gram language model: an enrichment algorithm.
- Author
-
Motavallian, Rezvan and Komeily, Masoud
- Subjects
*LANGUAGE models , *PERSIAN language , *TRAINING , *ALGORITHMS , *GRAMMAR , *CORPORA , *SENTENCE fragments , *HEURISTIC , *HEURISTIC algorithms - Abstract
In this article, we introduce an automatic mechanism that intelligently extends the training set to improve the n-gram language model of Persian. Given the free word order of Persian, our enrichment algorithm diversifies n-gram combinations in the baseline training data through dependency reordering, adding permissible sentences and filtering out ungrammatical sentences using a hybrid empirical (heuristic) and linguistic approach. Experiments performed on the baseline training set (taken from a standard Persian corpus) and the resulting enriched training set indicate a declining trend in average relative perplexity (between 34% and 73%) for informal/spoken vs. formal/written Persian test data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
29. Data-driven dependency parsing of Vedic Sanskrit.
- Author
-
Hellwig, Oliver, Nehrdich, Sebastian, and Sellmer, Sven
- Subjects
*CORPORA , *TREES , *LANGUAGE & languages - Abstract
This paper describes the first data-driven parser for Vedic Sanskrit, an ancient Indo-Aryan language in which a corpus of important religious and philosophical texts has been composed. We report and critically discuss experiments with the input feature representations, paying special attention to the performance of contextualized word embeddings and to the influence of morpho-syntactic representations on the parsing quality. In addition, we provide an in-depth discussion of the parsing errors that covers structural traits of the predicted trees as well as linguistic and extra-textual influence factors. In its optimal configuration, the proposed model achieves 87.61 unlabeled and 81.84 labeled attachment score on a held-out set of test sentences, demonstrating good performance for an under-resourced language. [ABSTRACT FROM AUTHOR]
- Published
- 2023
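The unlabeled and labeled attachment scores quoted in the abstract above are standard parser metrics: UAS counts tokens whose predicted head is correct, and LAS additionally requires the correct relation label. The following is a minimal sketch of that computation, assuming each token is given as a (head index, label) pair; the function name and the toy sentence are illustrative and do not come from the paper.

```python
from typing import List, Tuple

Arc = Tuple[int, str]  # (head index, dependency label) for one token; head 0 = root


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    # UAS: only the head has to match.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: head and label both have to match.
    full_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return 100.0 * head_hits / n, 100.0 * full_hits / n


# Hypothetical 4-token sentence used only for illustration.
gold = [(2, "nsubj"), (0, "root"), (4, "det"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "det"), (2, "iobj")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f}")
```

In this toy case three of four heads are correct but only two tokens also carry the right label, which is why LAS can never exceed UAS. Published evaluations may differ in details such as punctuation handling, which this sketch ignores.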
30. On understanding character-level models for representing morphology
- Author
-
Vania, Clara, Lopez, Adam, and Goldwater, Sharon
- Subjects
006.3 ,natural language processing ,morphology ,morphemes ,dependency parsing ,character-level models ,NLP - Abstract
Morphology is the study of how words are composed of smaller units of meaning (morphemes). It allows humans to create, memorize, and understand words in their language. To process and understand human languages, we expect our computational models to also learn morphology. Recent advances in neural network models provide us with models that compose word representations from smaller units like word segments, character n-grams, or characters. These so-called subword unit models do not explicitly model morphology yet they achieve impressive performance across many multilingual NLP tasks, especially on languages with complex morphological processes. This thesis aims to shed light on the following questions: (1) What do subword unit models learn about morphology? (2) Do we still need prior knowledge about morphology? (3) How do subword unit models interact with morphological typology? First, we systematically compare various subword unit models and study their performance across language typologies. We show that models based on characters are particularly effective because they learn orthographic regularities which are consistent with morphology. To understand which aspects of morphology are not captured by these models, we compare them with an oracle with access to explicit morphological analysis. We show that in the case of dependency parsing, character-level models are still poor in representing words with ambiguous analyses. We then demonstrate how explicit modeling of morphology is helpful in such cases. Finally, we study how character-level models perform in low resource, cross-lingual NLP scenarios, whether they can facilitate cross-linguistic transfer of morphology across related languages. 
While we show that cross-lingual character-level models can improve low-resource NLP performance, our analysis suggests that it is mostly because of the structural similarities between languages and we do not yet find any strong evidence of cross-linguistic transfer of morphology. This thesis presents a careful, in-depth study and analyses of character-level models and their relation to morphology, providing insights and future research directions on building morphologically-aware computational NLP models.
- Published
- 2020
31. Synwmd: Syntax-aware word Mover's distance for sentence similarity evaluation.
- Author
-
Wei, Chengwei, Wang, Bin, and Jay Kuo, C.-C.
- Subjects
*GRAPH connectivity , *WEIGHTED graphs , *STORAGE & moving industry , *TASK performance , *VOCABULARY , *MAXIMA & minima - Abstract
• Syntax-aware Word Mover's Distance (SynWMD), a method for sentence similarity evaluation, is proposed.
• Word importance is inferred from the graph built from the syntactic parse trees.
• The local syntactic parsing structure of words is considered in computing the distance between words.
• Experiments on semantic textual similarity tasks and sentence classification tasks have shown the effectiveness of SynWMD.
Word Mover's Distance (WMD) computes the distance between words and models text similarity with the moving cost between words in two text sequences. Yet, it does not offer good performance in sentence similarity evaluation since it does not incorporate word importance and fails to take inherent contextual and structural information in a sentence into account. An improved WMD method using the syntactic parse tree, called Syntax-aware Word Mover's Distance (SynWMD), is proposed to address these two shortcomings in this work. First, a weighted graph is built upon the word co-occurrence statistics extracted from the syntactic parse trees of sentences. The importance of each word is inferred from graph connectivities. Second, the local syntactic parsing structure of words is considered in computing the distance between words. To demonstrate the effectiveness of the proposed SynWMD, we conduct experiments on 6 semantic textual similarity (STS) datasets and 4 sentence classification datasets. Experimental results show that SynWMD achieves state-of-the-art performance on STS tasks. It also outperforms other WMD-based methods on sentence classification tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
32. Emotion Detection from the Text of the Qur'an Using Advance Roberta Deep Learning Net.
- Author
-
Karami, Mostafa, Talebpour, Alireza, Tajabadi, Farzaneh, and Hajimohammadi, Zeinab
- Subjects
NATURAL language processing ,DEEP learning ,LANGUAGE models ,MIRACLES ,JOY ,PARTS of speech ,EMOTION recognition ,EMOTIONS - Abstract
As data and context continue to expand, a vast amount of textual content, including books, blogs, and papers, is produced and distributed electronically. Analyzing such large amounts of content manually is a time-consuming task. Automatic detection of feelings and emotions in these texts is crucial, as it helps to identify the emotions conveyed by the author, understand the author's writing style, and determine the target audience for these texts. The Qur'an, regarded as the word of God and a divine miracle, serves as a comprehensive guide and a reflection of human life. Detecting emotions and feelings within the content of the Qur'an contributes to a deeper understanding of God's commandments. Recent advancements, particularly the application of transformer-based language models in natural language processing, have yielded state-of-the-art results that are challenging to surpass. In this paper, we propose a method to enhance the accuracy and generality of these models by incorporating syntactic features such as Parts Of Speech (POS) and Dependency Parsing tags. Our approach aims to elevate the performance of emotion detection models, making them more robust and applicable across diverse contexts. For model training and evaluation, we utilized the ISEAR dataset, a well-established and extensive dataset in this field. The results indicate that our proposed model achieves superior performance compared to existing models, achieving an accuracy of 77% on this dataset. Finally, we applied the newly proposed model to recognize the feelings and emotions conveyed in the Itani English translation of the Qur'an. The results revealed that joy has the most significant contribution to the emotional content of the Holy Qur'an. [ABSTRACT FROM AUTHOR]
- Published
- 2023
33. Text Classification Based on Graph Neural Networks and Dependency Parsing
- Author
-
YANG Xu-hua, JIN Xin, TAO Jin, MAO Jian-fei
- Subjects
text classification ,graph neural network ,dependency parsing ,graph classification ,Computer software ,QA76.75-76.765 ,Technology (General) ,T1-995 - Abstract
Text classification is a basic and important task in natural language processing. It is widely used in language processing scenarios such as news classification, topic tagging and sentiment analysis. The current text classification models generally do not consider the co-occurrence relationship of text words and the syntactic characteristics of the text itself, thus limiting the effect of text classification. Therefore, a text classification model based on graph convolutional neural network (Mix-GCN) is proposed. Firstly, based on the co-occurrence relationship and syntactic dependency between text words, the text data is constructed into a text co-occurrence graph and a syntactic dependency graph. Then the GCN model is used to perform representation learning on the text graph and syntactic dependency graph, and the embedding vector of the word is obtained. Then the embedding vector of the text is obtained by graph pooling method and adaptive fusion method, and the text classification is completed by the graph classification method. Mix-GCN model simultaneously considers the relationship between adjacent words in the text and the syntactic dependencies existing between text words, which improves the performance of text classification. On 6 benchmark datasets, compared to 8 well-known text classification methods, experimental results show that Mix-GCN has a good text classification effect.
- Published
- 2022
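The abstract above describes turning a text into two graphs, one from word co-occurrence and one from syntactic dependencies, before applying a GCN. The sketch below illustrates only that graph-construction step under loudly stated assumptions: co-occurrence is approximated by a sliding window over token positions, edges are unweighted and undirected, and self-loops are added as is common for GCN inputs. The function names, the window size, and the toy parse are hypothetical; the abstract does not specify how Mix-GCN weights or builds its edges.

```python
from typing import List


def cooccurrence_adjacency(tokens: List[str], window: int = 2) -> List[List[int]]:
    """Symmetric adjacency linking tokens at most `window` positions apart."""
    n = len(tokens)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1  # self-loop
        for j in range(i + 1, min(n, i + window + 1)):
            adj[i][j] = adj[j][i] = 1
    return adj


def dependency_adjacency(heads: List[int]) -> List[List[int]]:
    """Symmetric adjacency from head indices (1-based; 0 marks the root)."""
    n = len(heads)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1  # self-loop
    for dep, head in enumerate(heads):
        if head > 0:
            adj[dep][head - 1] = adj[head - 1][dep] = 1
    return adj


# Hypothetical sentence with a hand-written parse, for illustration only.
tokens = ["graphs", "improve", "text", "classification"]
heads = [2, 0, 4, 2]  # "improve" is the root; "text" modifies "classification"
cooc = cooccurrence_adjacency(tokens, window=1)
dep = dependency_adjacency(heads)
```

The two matrices differ where syntax and adjacency disagree: here "improve" and "text" are neighbours in the sentence but not directly linked in the parse, while "improve" and "classification" are linked in the parse but not adjacent.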
34. Sculpting Enhanced Dependencies for Belarusian
- Author
-
Shishkina, Yana, Lyashevskaya, Olga, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Burnaev, Evgeny, editor, Ignatov, Dmitry I., editor, Ivanov, Sergei, editor, Khachay, Michael, editor, Koltsova, Olessia, editor, Kutuzov, Andrei, editor, Kuznetsov, Sergei O., editor, Loukachevitch, Natalia, editor, Napoli, Amedeo, editor, Panchenko, Alexander, editor, Pardalos, Panos M., editor, Saramäki, Jari, editor, Savchenko, Andrey V., editor, Tsymbalov, Evgenii, editor, and Tutubalina, Elena, editor
- Published
- 2022
35. Reinforcement of BERT with Dependency-Parsing Based Attention Mask
- Author
-
Mechouma, Toufik, Biskri, Ismail, Meunier, Jean Guy, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Bădică, Costin, editor, Treur, Jan, editor, Benslimane, Djamal, editor, Hnatkowska, Bogumiła, editor, and Krótkiewicz, Marek, editor
- Published
- 2022
36. Construction Research and Applications of Industry Chain Knowledge Graphs
- Author
-
Zhang, Boyao, Wang, Zijian, Zhang, Haikuo, Zhao, Yonghua, Sun, Jingqi, Wang, Jing, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Memmi, Gerard, editor, Yang, Baijian, editor, Kong, Linghe, editor, Zhang, Tianwei, editor, and Qiu, Meikang, editor
- Published
- 2022
37. Adapting Cross-Lingual Model to Improve Vietnamese Dependency Parsing
- Author
-
Do, Duc, Dinh, Dien, Luong, An-Vinh, Do, Thao, Xhafa, Fatos, Series Editor, Dang, Ngoc Hoang Thanh, editor, Zhang, Yu-Dong, editor, Tavares, João Manuel R. S., editor, and Chen, Bo-Hao, editor
- Published
- 2022
38. A Comprehensive Analysis of Subword Contextual Embeddings for Languages with Rich Morphology
- Author
-
Akdemir, Arda, Shibuya, Tetsuo, Güngör, Tunga, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Wani, M. Arif, editor, Raj, Bhiksha, editor, Luo, Feng, editor, and Dou, Dejing, editor
- Published
- 2022
39. Word Sense Disambiguation Based on Graph and Knowledge Base
- Author
-
Meng, Fanqing, Chlamtac, Imrich, Series Editor, Mu, Shenglin, editor, Yujie, Li, editor, and Lu, Huimin, editor
- Published
- 2022
40. Automatic syntactic analysis of learner English
- Author
-
Huang, Yan and Korhonen, Anna
- Subjects
428.0071 ,subcategorization identification ,learner English ,dependency parsing ,subcategorization frame ,SCF ,syntactic analysis ,computational linguistics ,natural language processing ,second language acquisition ,corpus linguistics - Abstract
Automatic syntactic analysis is essential for extracting useful information from large-scale learner data for linguistic research and natural language processing (NLP). Currently, researchers use standard POS taggers and parsers developed on native language to analyze learner language. Investigation of how such systems perform on learner data is needed to develop strategies for minimizing the cross-domain effects. Furthermore, POS taggers and parsers are developed for generic NLP purposes and may not be useful for identifying specific syntactic constructs such as subcategorization frames (SCFs). SCFs have attracted much research attention as they provide unique insight into the interplay between lexical and structural information. An automatic SCF identification system adapted for learner language is needed to facilitate research on L2 SCFs. In this thesis, we first provide a comprehensive evaluation of standard POS taggers and parsers on learner and native English. We show that the common practice of constructing a gold standard by manually correcting the output of a system can introduce bias to the evaluation, and we suggest a method to control for the bias. We also quantitatively evaluate the impact of fine-grained learner errors on POS tagging and parsing, identifying the most influential learner errors. Furthermore, we show that the performance of probabilistic POS taggers and parsers on native English can predict their performance on learner English. Secondly, we develop an SCF identification system for learner English. We train a machine learning model on both native and learner English data. The system can label individual verb occurrences in learner data for a set of 49 distinct SCFs. Our evaluation shows that the system reaches an F1 score of 84%. We then demonstrate that the level of accuracy is adequate for linguistic research. 
We design the first multidimensional SCF diversity metrics and investigate how SCF diversity changes with L2 proficiency on a large learner corpus. Our results show that as L2 proficiency develops, learners tend to use more diverse SCF types with greater taxonomic distance; more advanced learners also use different SCF types more evenly and locate the verb tokens of the same SCF type further away from each other. Furthermore, we demonstrate that the proposed SCF diversity metrics contribute a unique perspective to the prediction of L2 proficiency beyond existing syntactic complexity metrics.
- Published
- 2019
41. Fine-Tuning BERT-Based Pre-Trained Models for Arabic Dependency Parsing.
- Author
-
Al-Ghamdi, Sharefah, Al-Khalifa, Hend, and Al-Salman, Abdulmalik
- Subjects
NATURAL language processing ,LANGUAGE models - Abstract
With the advent of pre-trained language models, many natural language processing tasks in various languages have achieved great success. Although some research has been conducted on fine-tuning BERT-based models for syntactic parsing, and several Arabic pre-trained models have been developed, no attention has been paid to Arabic dependency parsing. In this study, we attempt to fill this gap and compare nine Arabic models, fine-tuning strategies, and encoding methods for dependency parsing. We evaluated three treebanks to highlight the best options and methods for fine-tuning Arabic BERT-based models to capture syntactic dependencies in the data. Our exploratory results show that the AraBERTv2 model provides the best scores for all treebanks and confirm that fine-tuning to the higher layers of pre-trained models is required. However, adding additional neural network layers to those models drops the accuracy. Additionally, we found that the treebanks have differences in the encoding techniques that give the highest scores. The analysis of the errors obtained by the test examples highlights four issues that have an important effect on the results: parse tree post-processing, contextualized embeddings, erroneous tokenization, and erroneous annotation. This study reveals a direction for future research to achieve enhanced Arabic BERT-based syntactic parsing. [ABSTRACT FROM AUTHOR]
- Published
- 2023
42. Event Temporal Relation Identification Incorporating Dependency Syntax Information.
- Author
-
李良毅, 张亚飞, 郭军军, 高盛祥, and 余正涛
- Subjects
LONG-term memory ,NATURAL languages ,VOCABULARY ,IDENTIFICATION - Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
43. Event Detection Using a Self-Constructed Dependency and Graph Convolution Network.
- Author
-
He, Li, Meng, Qingxin, Zhang, Qing, Duan, Jianyong, and Wang, Hao
- Subjects
NATURAL language processing ,REPRESENTATIONS of graphs ,UNDIRECTED graphs ,DIRECTED graphs ,SPANNING trees ,PROBLEM solving - Abstract
The extant event detection models, which rely on dependency parsing, have exhibited commendable efficacy. However, for some long sentences with more words, the results of dependency parsing are more complex, because each word corresponds to a directed edge with a dependency parsing label. These edges do not all provide guidance for the event detection model, and the accuracy of dependency parsing tools decreases with the increase in sentence length, resulting in error propagation. To solve these problems, we developed an event detection model that uses a self-constructed dependency and graph convolution network. First, we statistically analyzed the ACE2005 corpus to prune the dependency parsing tree, and combined the named entity features in the sentence to generate an undirected graph. Second, we implemented an enhanced graph convolution network using the multi-head attention mechanism to understand the representation of nodes in the graph. Finally, a gating mechanism combined the semantic and structural dependency information of the sentence, enabling us to accomplish the event detection task. A series of experiments conducted on the ACE2005 corpus demonstrates that the proposed method enhances the performance of the event detection model. [ABSTRACT FROM AUTHOR]
- Published
- 2023
44. Shuffling Softly, Sighing Deeply: A Digital Inquiry into Representations of Older Men and Women in Literature for Different Ages.
- Author
-
Geybels, Lindsey
- Subjects
*OLDER men , *OLDER women , *OLDER people , *AGEISM , *YOUNG adult fiction , *CHILDREN'S literature , *YOUNG adults , *STEREOTYPES - Abstract
When gender is brought into concerns about older people, the emphasis often lies on stereotypes connected to older women, and few comparative studies have been conducted pertaining to the representation of the intersection between older age and gender in fiction. This article argues that not only children's literature, traditionally considered to be a carrier of ideology, plays a large part in the target readership's age socialization, but so do young adult and adult fiction. In a large corpus of 41 Dutch books written for different ages, the representation of older men and women is studied through the verbs, grammatical possessions and adjectives associated with the relevant fictional characters, which were extracted from the texts through the computational method of dependency parsing. Older adult characters featured most frequently in fiction for adults, where, more so than in the books for younger readers, they are depicted as being prone to illness, experiencing the effects of a deteriorating body and having a limited social network. In the books for children, little to no association between older adulthood and mortality was found in the data. Ageist stereotypes pertaining to both genders were found throughout the corpus. In terms of characterization, male older adults are associated more with physicality, including matters of illness and mobility, while character traits and emotions show up in a more varied manner in connection to female older characters. [ABSTRACT FROM AUTHOR]
- Published
- 2023
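The study above extracts the verbs, grammatical possessions and adjectives attached to fictional characters by means of dependency parsing. A minimal sketch of that kind of extraction follows, assuming sentences are already parsed into tokens with Universal Dependencies style labels (`nsubj`, `amod`, `nmod:poss`); the `Token` type, the function name and the English toy sentence are hypothetical and do not reproduce the article's pipeline, which worked on Dutch texts.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Token(NamedTuple):
    idx: int     # 1-based position in the sentence
    form: str    # surface form
    upos: str    # coarse part-of-speech tag
    head: int    # index of the syntactic head, 0 for the root
    deprel: str  # dependency relation to the head


def character_associations(
    sentence: List[Token], characters: Set[str]
) -> Dict[str, Dict[str, Set[str]]]:
    """Collect verbs, adjectives and possessions linked to character mentions."""
    by_idx = {t.idx: t for t in sentence}
    found = defaultdict(lambda: defaultdict(set))
    for tok in sentence:
        head = by_idx.get(tok.head)
        if head is None:  # the root token has no head
            continue
        if tok.form in characters and tok.deprel == "nsubj" and head.upos == "VERB":
            found[tok.form]["verbs"].add(head.form)         # character is the subject
        elif head.form in characters and tok.deprel == "amod":
            found[head.form]["adjectives"].add(tok.form)    # adjective modifies character
        elif tok.form in characters and tok.deprel == "nmod:poss":
            found[tok.form]["possessions"].add(head.form)   # character possesses the head
    return {name: dict(groups) for name, groups in found.items()}


# Hand-annotated toy sentence: "The old grandfather shuffled softly"
sentence = [
    Token(1, "The", "DET", 3, "det"),
    Token(2, "old", "ADJ", 3, "amod"),
    Token(3, "grandfather", "NOUN", 4, "nsubj"),
    Token(4, "shuffled", "VERB", 0, "root"),
    Token(5, "softly", "ADV", 4, "advmod"),
]
result = character_associations(sentence, {"grandfather"})
```

Matching characters by surface form is a simplification; a real study would need coreference or name lists to link pronouns and variants to the same character.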
45. Dependency parsing with bottom-up Hierarchical Pointer Networks.
- Author
-
Fernández-González, Daniel and Gómez-Rodríguez, Carlos
- Subjects
*PARSING (Computer grammar) , *NATURAL language processing , *CHINESE language , *ENGLISH language , *COMPUTATIONAL linguistics - Abstract
Dependency parsing is a crucial step towards deep language understanding and, therefore, widely demanded by numerous Natural Language Processing applications. In particular, left-to-right and top-down transition-based algorithms that rely on Pointer Networks are among the most accurate approaches for performing dependency parsing. Additionally, it has been observed for the top-down algorithm that Pointer Networks' sequential decoding can be improved by implementing a hierarchical variant, more adequate to model dependency structures. Considering all this, we develop a bottom-up oriented Hierarchical Pointer Network for the left-to-right parser and propose two novel transition-based alternatives: an approach that parses a sentence in right-to-left order and a variant that does so from the outside in. We empirically test the proposed neural architecture with the different algorithms on a wide variety of languages, outperforming the original approach in practically all of them and setting new state-of-the-art results on the English and Chinese Penn Treebanks for non-contextualized and BERT-based embeddings.
• The left-to-right transition-based algorithm is a state-of-the-art dependency parser.
• This is based on Pointer Networks that implement a sequential decoding.
• We develop a hierarchical decoding for the left-to-right transition-based parser.
• This approach reduces error-propagation by providing long-range dependency information.
• The resulting neural architecture achieves a state-of-the-art performance on Penn Treebanks.
[ABSTRACT FROM AUTHOR]
- Published
- 2023
46. Improving the performance of graph based dependency parsing by guiding bi-affine layer with augmented global and local features
- Author
-
Mücahit Altıntaş and A. Cüneyd Tantuğ
- Subjects
Artificial intelligence ,Natural language processing ,Dependency parsing ,Human sentence interpretation ,Sentence representation ,Super token features ,Cybernetics ,Q300-390 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
The growing interaction between humans and machines raises the need for more sophisticated tools for natural language understanding. Dependency parsing is crucial for capturing the semantics of a sentence. Although graph-based dependency parsing approaches outperform transition-based methods because, unlike their counterparts, they are not exposed to error propagation, their feature space is comparatively limited. Thus, the main issue with graph-based parsing is how to expand the set of features to improve performance. In this research, we propose to expand the feature space of graph-based parsers. To benefit from the global meaning of the entire sentence content, we employ the sentence representation as an additional token feature. Also, to highlight local word collaborations that build sub-tree structures, we use convolutional neural network layers over token embeddings. We achieve state-of-the-art results for Turkish, English, Hungarian, and Korean, with the following unlabeled and labeled attachment scores, respectively, on the test sets: 82.64% and 76.35% on Turkish IMST, 93.36% and 91.34% on English EWT, 90.85% and 87.39% on Hungarian Szeged, and 92.44% and 89.58% on Korean GSD treebanks. Our experimental findings show that augmented global and local features improve the performance of graph-based dependency parsers.
- Published
- 2023
- Full Text
- View/download PDF
47. Telugu Dependency Treebank
- Author
-
B. V. Seshu Kumari, M. Susmitha, S. Sudeshna, and P. Bala Kesava Reddy
- Subjects
Natural language processing ,Telugu language ,Dependency parsing ,Telugu tree bank ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
This work briefly discusses the Telugu language and its treebanks. We first give a short overview of the Telugu language. We then describe the Paninian grammatical model used for Telugu dependency representation. Following that, we explain Telugu treebanks and the various formats used to express them, including how the language is represented in the Telugu Dependency Treebank. Natural languages are often morphologically rich, and they form sentences in a variety of ways. Researchers have been investigating approaches to annotating text with linguistic knowledge since the advent of machine translation in the 1960s. Previous studies on Indian languages were done at the chunk level. The present shallow parser morphologically parses the input text to the chunk level. Researchers are considering working at the phrase level in the future. They broke the phrases down into smaller parts. The relationship between chunk heads is essential to proceed to sentence-level parsing. This leads to dependency parsing.
- Published
- 2023
48. Im2Graph: A Weakly Supervised Approach for Generating Holistic Scene Graphs from Regional Dependencies.
- Author
-
Ghosh, Swarnendu, Gonçalves, Teresa, and Das, Nibaran
- Subjects
ARTIFICIAL intelligence ,IMAGE representation ,ISOMORPHISM (Mathematics) ,KNOWLEDGE graphs ,PARSING (Computer grammar) ,DEEP learning ,ALGORITHMS - Abstract
Conceptual representations of images involving descriptions of entities and their relations are often represented using scene graphs. Such scene graphs can express relational concepts by using sets of triplets 〈subject-predicate-object〉. Instead of building dedicated models for scene graph generation, our model tends to extract the latent relational information implicitly encoded in image captioning models. We explored dependency parsing to build grammatically sound parse trees from captions. We used detection algorithms for the region propositions to generate dense region-based concept graphs. These were optimally combined using the approximate sub-graph isomorphism to create holistic concept graphs for images. The major advantages of this approach are threefold. Firstly, the proposed graph generation module is completely rule-based and, hence, adheres to the principles of explainable artificial intelligence. Secondly, graph generation can be used as plug-and-play along with any region proposition and caption generation framework. Finally, our results showed that we could generate rich concept graphs without explicit graph-based supervision. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
49. Contextualized word senses: from attention to compositionality.
- Author
-
Gamallo, Pablo
- Abstract
The neural architectures of language models are becoming increasingly complex, especially that of Transformers, based on the attention mechanism. Although their application to numerous natural language processing tasks has proven to be very fruitful, they continue to be models with little or no interpretability and explainability. One of the tasks for which they are best suited is the encoding of the contextual sense of words using contextualized embeddings. In this paper we propose a transparent, interpretable, and linguistically motivated strategy for encoding the contextual sense of words by modeling semantic compositionality. Particular attention is given to dependency relations and semantic notions such as selection preferences and paradigmatic classes. A partial implementation of the proposed model is carried out and compared with Transformer-based architectures for a given semantic task, namely the similarity calculation of word senses in context. The results obtained show that it is possible to be competitive with linguistically motivated models instead of using the black boxes underlying complex neural architectures. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
50. Hierarchical semantic communication system based on dependency parsing.
- Author
-
姜胜腾, 罗 鹏, 刘月玲, 张亦弛, 曹 阔, 熊 俊, 赵海涛, and 魏急波
- Abstract
Copyright of Journal of Signal Processing is the property of Journal of Signal Processing and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF