
Distant supervision relation extraction based on type attention and GCN.

Authors :
张欢
李卫疆
Source :
Computer Engineering & Science / Jisuanji Gongcheng yu Kexue. Feb2024, Vol. 46 Issue 2, p316-324. 9p.
Publication Year :
2024

Abstract

Distant supervision relation extraction automatically aligns natural-language text with a knowledge base to generate labeled training data, removing the need for manual sample labeling. Most current distant supervision research pays little attention to long-tail data, so most of the sentence bags it produces contain too few sentences and cannot truly and comprehensively represent the data. This paper proposes a distant supervision relation extraction model (PG+PTATT) based on a position-type attention mechanism and a graph convolutional network (GCN). Based on the similarity between sentence bags, the GCN aggregates the implicit high-level features of similar sentence bags to optimize each bag and obtain richer, more comprehensive bag features. In addition, an attention mechanism, Position-Type Attention (PTATT), is constructed to address the wrong-label problem in distant supervision relation extraction: it models the position relationships and type relationships between entity words and non-entity words to reduce the impact of noisy words. The proposed model is evaluated on the New York Times (NYT) dataset, and the experimental results show that it effectively addresses these problems in distant supervision relation extraction and improves the accuracy of relation extraction. [ABSTRACT FROM AUTHOR]
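
The bag-level aggregation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the similarity graph is built or how the GCN is parameterized. The sketch assumes that each sentence bag is already encoded as a fixed-length vector, that two bags are linked when their cosine similarity reaches a hypothetical `threshold`, and that a single symmetric-normalized propagation step is applied, with no learned weights or nonlinearity. All function names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_adjacency(bags, threshold=0.8):
    """Link bags whose cosine similarity reaches the threshold.

    Self-loops are always added. The threshold value is an assumption,
    not a setting taken from the paper.
    """
    n = len(bags)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or cosine(bags[i], bags[j]) >= threshold:
                adj[i][j] = 1.0
    return adj


def gcn_aggregate(bags, adj):
    """One propagation step D^(-1/2) A D^(-1/2) X over the bag graph.

    Learned weights and the nonlinearity of a full GCN layer are
    omitted to keep the sketch short.
    """
    n, dim = len(bags), len(bags[0])
    deg = [sum(row) for row in adj]
    out = []
    for i in range(n):
        row = [0.0] * dim
        for j in range(n):
            if adj[i][j]:
                w = adj[i][j] / math.sqrt(deg[i] * deg[j])
                for k in range(dim):
                    row[k] += w * bags[j][k]
        out.append(row)
    return out


# Three toy bag vectors: the first two are similar, the third is not.
bags = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
adj = similarity_adjacency(bags)
enriched = gcn_aggregate(bags, adj)
```

In this toy input, the first two bags are linked and each receives an average of both feature vectors, while the third bag has no similar neighbor and keeps its own features. This mirrors the idea that a bag with few sentences can borrow information from similar bags.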

Details

Language :
Chinese
ISSN :
1007-130X
Volume :
46
Issue :
2
Database :
Academic Search Index
Journal :
Computer Engineering & Science / Jisuanji Gongcheng yu Kexue
Publication Type :
Academic Journal
Accession number :
175786587
Full Text :
https://doi.org/10.3969/j.issn.1007-130X.2024.02.014