Neural Architecture Comparison for Bibliographic Reference Segmentation: An Empirical Study.

Authors :
Cuéllar Hidalgo, Rodrigo
Pinto Elías, Raúl
Torres-Moreno, Juan-Manuel
Vergara Villegas, Osslan Osiris
Reyes Salgado, Gerardo
Magadán Salazar, Andrea
Source :
Data (2306-5729); May2024, Vol. 9 Issue 5, p71, 24p
Publication Year :
2024

Abstract

In the realm of digital libraries, efficiently managing and accessing scientific publications necessitates automated bibliographic reference segmentation. This study addresses the challenge of accurately segmenting bibliographic references, a task complicated by the varied formats and styles of references. Focusing on the empirical evaluation of Conditional Random Fields (CRF), Bidirectional Long Short-Term Memory with CRF (BiLSTM + CRF), and Transformer Encoder with CRF (Transformer + CRF) architectures, this research employs Byte Pair Encoding and Character Embeddings for vector representation. The models underwent training on the extensive Giant corpus and subsequent evaluation on the Cora corpus to ensure a balanced and rigorous comparison, maintaining uniformity across embedding layers, normalization techniques, and Dropout strategies. Results indicate that the BiLSTM + CRF architecture outperforms its counterparts by adeptly handling the syntactic structures prevalent in bibliographic data, achieving an F1-Score of 0.96. This outcome highlights the necessity of aligning model architecture with the specific syntactic demands of bibliographic reference segmentation tasks. Consequently, the study establishes the BiLSTM + CRF model as a superior approach within the current state of the art, offering a robust solution for the challenges faced in digital library management and scholarly communication. [ABSTRACT FROM AUTHOR]
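
Illustrative note (not part of the author abstract): all three architectures compared above end in a CRF layer, which picks the best label sequence for a whole reference rather than labeling each token on its own. The sketch below shows the standard Viterbi decoding step that a linear-chain CRF uses at prediction time. It is a minimal example under stated assumptions, not the authors' implementation: the label set (AUTHOR, YEAR, TITLE), the function name viterbi_decode, and all scores are hypothetical, and in a real model the emission scores would come from the BiLSTM or Transformer encoder.

```python
from typing import Dict, List, Tuple

def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain CRF.

    emissions[t][y]      : score of label y at token t (hypothetical values here)
    transitions[(p, y)]  : score of moving from label p to label y (default 0.0)
    """
    if not emissions:
        return []

    # best[y] = score of the best path that ends in label y at the current token
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []

    for t in range(1, len(emissions)):
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in labels:
            # Choose the previous label that maximizes path score into `cur`
            prev = max(labels, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace back from the best final label
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Toy example: the middle token slightly prefers YEAR on its own,
# but the AUTHOR -> AUTHOR transition makes a full AUTHOR span score higher.
labels = ["AUTHOR", "YEAR", "TITLE"]
emissions = [
    {"AUTHOR": 2.0, "YEAR": 0.0, "TITLE": 0.0},
    {"AUTHOR": 0.4, "YEAR": 0.5, "TITLE": 0.0},
    {"AUTHOR": 2.0, "YEAR": 0.0, "TITLE": 0.0},
]
transitions = {("AUTHOR", "AUTHOR"): 1.0}

decoded = viterbi_decode(emissions, transitions, labels)
greedy = [max(labels, key=lambda y: e[y]) for e in emissions]
print(decoded)  # ['AUTHOR', 'AUTHOR', 'AUTHOR']
print(greedy)   # ['AUTHOR', 'YEAR', 'AUTHOR']
```

The contrast between the two printed sequences is the point of the sketch: per-token (greedy) labeling breaks the author span, while sequence-level decoding keeps it contiguous, which is one plausible reason a CRF output layer suits reference segmentation.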

Details

Language :
English
ISSN :
23065729
Volume :
9
Issue :
5
Database :
Complementary Index
Journal :
Data (2306-5729)
Publication Type :
Academic Journal
Accession number :
177498917
Full Text :
https://doi.org/10.3390/data9050071