Predicting semantic category of answers for question answering systems using transformers: a transfer learning approach.
- Author
- C M, Suneera; Prakash, Jay; Alaparthi, Varun Sai
- Subjects
- NATURAL language processing; TRANSFORMER models; LANGUAGE models; DEEP learning; KNOWLEDGE base; QUESTION answering systems
- Abstract
A question-answering (QA) system is a key application in natural language processing (NLP) that provides relevant answers to user queries written in natural language. In factoid QA over knowledge bases, predicting the semantic category of the answer, such as location, person, or numerical value, helps to reduce the search space and is an essential step in formal query construction for answer retrieval. However, capturing the semantics of sequence data such as questions is challenging. Deep learning methods based on recurrent neural networks have been applied to this task, but they are inefficient at handling long-term dependencies. Recently, pre-trained language models built on transformers have proven effective: their attention-based encoders can generate context-dependent embeddings for words and sentences. However, training an efficient transformer model for semantic category prediction requires a large dataset and substantial computational resources. Therefore, this work employs a transfer learning approach, efficiently adapting pre-trained transformer models to predict the semantic category of the answer from the input question. Embeddings from the encoder of the text-to-text transfer transformer (T5) are leveraged to obtain an efficient question representation and to train the classification model, named QcT5. Along with QcT5, an extensive experimental study of other recent transformer models (BERT, RoBERTa, DeBERTa, and XLNet) is conducted, and their performance is analyzed under various fine-tuning settings. Experimental results indicate that the QcT5 model significantly improves on the selected state-of-the-art methods, achieving F1-scores of 98.7% and 89.9% on the TREC-6 and TREC-50 datasets, respectively.
- Published
- 2024
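As a concrete illustration of the approach the abstract describes, the following is a minimal sketch of the QcT5 idea: embed a question with the encoder of a pre-trained T5 model, then train a small classification head on the pooled embedding to predict the answer's semantic category. The checkpoint name (t5-base), mean pooling, and single linear head are assumptions chosen for illustration, not the authors' exact configuration.

```python
# Minimal sketch of the QcT5 idea from the abstract: a T5 encoder produces
# context-dependent token embeddings for a question, which are pooled into a
# single vector and fed to a classification head that predicts the answer's
# semantic category (e.g., the TREC-6 coarse classes).
import torch
import torch.nn as nn
from transformers import T5Tokenizer, T5EncoderModel


class QcT5Sketch(nn.Module):
    def __init__(self, num_classes: int = 6, model_name: str = "t5-base"):
        super().__init__()
        self.encoder = T5EncoderModel.from_pretrained(model_name)
        hidden_size = self.encoder.config.d_model
        # Assumed head layout: one linear layer over the pooled embedding.
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        hidden = out.last_hidden_state            # (batch, seq_len, d_model)
        # Mean-pool over non-padding tokens to get one question embedding.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        return self.classifier(pooled)            # class logits


tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = QcT5Sketch(num_classes=6)                 # TREC-6 coarse categories
batch = tokenizer(["Where is the Eiffel Tower located?"],
                  return_tensors="pt", padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"])
print(logits.argmax(dim=-1))                      # predicted category index
```

Fine-tuning the other models compared in the study (BERT, RoBERTa, DeBERTa, XLNet) would follow the same pattern, for instance via Hugging Face's AutoModelForSequenceClassification, with encoder weights updated or frozen depending on the fine-tuning setting; the details of the paper's settings are not given in the abstract.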