Bridge inspection named entity recognition via BERT and lexicon augmented machine reading comprehension neural model

Authors :
Di Wang
Shixin Jiang
Ren Li
Jianxi Yang
Tianjin Mo
Dong Li
Source :
Advanced Engineering Informatics. 50:101416
Publication Year :
2021
Publisher :
Elsevier BV, 2021.

Abstract

As an important data source in the field of bridge management, bridge inspection reports contain large-scale fine-grained data, including information on bridge members and structural defects. However, because automatic information extraction has received little research attention in this field, valuable bridge inspection information has not been fully utilized. In particular, most existing named entity recognition (NER) solutions are not suitable for Chinese bridge inspection entities, which involve domain-specific vocabularies and have obvious nesting characteristics. To address this problem, this paper proposes a novel lexicon-augmented, machine reading comprehension (MRC)-based neural NER model for identifying flat and nested entities in Chinese bridge inspection text. The proposed model takes the bridge inspection text and predefined question queries as input to enhance contextual feature representation and to integrate prior knowledge. Character-level features encoded by the pre-trained BERT model are combined with bigram embeddings and weighted lexicon features into a context representation. A bidirectional long short-term memory (BiLSTM) network then extracts sequence features before the spans of named entities are predicted. The proposed model is evaluated on a Chinese bridge inspection named entity corpus. The experimental results show that it outperforms other mainstream NER models on this corpus. The proposed model not only provides a basis for automatic bridge inspection information extraction but also supports downstream tasks such as knowledge graph construction and question answering systems.
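The query-plus-text framing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the entity type names, the query wording, the English toy sentence, and the hand-written start/end labels are all assumptions made for illustration, and the BERT, bigram, lexicon, and BiLSTM layers are replaced by fixed labels. The sketch only shows how one query per entity type lets overlapping (nested) spans be recovered in separate passes.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical entity types and queries; the abstract does not give the real ones.
QUERIES: Dict[str, str] = {
    "member": "Find the bridge members mentioned in the text.",
    "defect": "Find the structural defects mentioned in the text.",
}


def build_input(query: str, text: str) -> List[str]:
    """Character-level, BERT-style input: [CLS] query [SEP] text [SEP]."""
    return ["[CLS]", *query, "[SEP]", *text, "[SEP]"]


def flags(length: int, positions: Set[int]) -> List[int]:
    """Turn a set of character positions into a 0/1 label sequence."""
    return [1 if i in positions else 0 for i in range(length)]


def decode_spans(
    start_flags: List[int], end_flags: List[int], max_len: int = 32
) -> List[Tuple[int, int]]:
    """Pair each predicted start with the nearest predicted end at or after it."""
    spans = []
    for i, is_start in enumerate(start_flags):
        if not is_start:
            continue
        for j in range(i, min(i + max_len, len(end_flags))):
            if end_flags[j]:
                spans.append((i, j))
                break
    return spans


def extract(
    text: str, predictions: Dict[str, Tuple[List[int], List[int]]]
) -> Dict[str, List[str]]:
    """Decode one pass per entity type; passes are independent, so spans may overlap."""
    return {
        etype: [text[s : e + 1] for s, e in decode_spans(starts, ends)]
        for etype, (starts, ends) in predictions.items()
    }


# Toy example with hand-written labels standing in for model output.
text = "pier cap crack"
n = len(text)
toy_predictions = {
    "member": (flags(n, {0}), flags(n, {7})),   # "pier cap"
    "defect": (flags(n, {0}), flags(n, {13})),  # "pier cap crack" (contains the member span)
}
entities = extract(text, toy_predictions)
```

Because each entity type is decoded from its own start and end labels, the "member" span can sit inside the "defect" span without conflict, which a single flat tag sequence could not represent.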

Details

ISSN :
1474-0346
Volume :
50
Database :
OpenAIRE
Journal :
Advanced Engineering Informatics
Accession number :
edsair.doi...........1f0ae12637a896f65bffeeb7ed728371