Hierarchical Text Classification As Sub-Hierarchy Sequence Generation

Authors :
Im, SangHun
Kim, Gibaeg
Oh, Heung-Seon
Jo, Seongung
Kim, Donghwan
Source :
Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 12933-12941 (2023)
Publication Year :
2021

Abstract

Hierarchical text classification (HTC) is essential for various real applications. However, HTC models are challenging to develop because they often require processing a large volume of documents and labels with hierarchical taxonomy. Recent HTC models based on deep learning have attempted to incorporate hierarchy information into a model structure. Consequently, these models are challenging to implement when the model parameters increase for a large-scale hierarchy because the model structure depends on the hierarchy size. To solve this problem, we formulate HTC as a sub-hierarchy sequence generation to incorporate hierarchy information into a target label sequence instead of the model structure. Subsequently, we propose the Hierarchy DECoder (HiDEC), which decodes a text sequence into a sub-hierarchy sequence using recursive hierarchy decoding, classifying all parents at the same level into children at once. In addition, HiDEC is trained to use hierarchical path information from a root to each leaf in a sub-hierarchy composed of the labels of a target document via an attention mechanism and hierarchy-aware masking. HiDEC achieved state-of-the-art performance with significantly fewer model parameters than existing models on benchmark datasets, such as RCV1-v2, NYT, and EURLEX57K.

Comment: 9 pages, 5 figures, Published at AAAI23
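
The sub-hierarchy idea described in the abstract can be illustrated with a toy sketch. The Python below is not the authors' implementation: the taxonomy, the `ROOT` marker, the function names, and the bracketed depth-first serialization are all assumptions made for illustration, and the record does not specify the token format HiDEC actually uses. It only shows how the labels of one document, together with their ancestors, form a small tree that can be written out as a target sequence.

```python
from collections import defaultdict

# Hypothetical toy taxonomy (child -> parent); not taken from any benchmark dataset.
PARENT = {
    "Sports": "ROOT",
    "Politics": "ROOT",
    "Football": "Sports",
    "Tennis": "Sports",
    "Elections": "Politics",
}


def sub_hierarchy(target_labels, parent=PARENT):
    """Map each parent to its sorted children, keeping only the
    target labels of one document and their ancestors."""
    children = defaultdict(set)
    for label in target_labels:
        node = label
        # Walk upward from the label to the root, recording each edge.
        while node != "ROOT":
            children[parent[node]].add(node)
            node = parent[node]
    return {p: sorted(c) for p, c in children.items()}


def to_sequence(tree, node="ROOT"):
    """Serialize the sub-hierarchy depth-first; brackets delimit each
    child's subtree (an assumed format, chosen for readability)."""
    tokens = [node]
    for child in tree.get(node, []):
        tokens += ["("] + to_sequence(tree, child) + [")"]
    return tokens


# A document labeled with two leaf categories.
tree = sub_hierarchy({"Football", "Elections"})
sequence = to_sequence(tree)
print(" ".join(sequence))
```

In this sketch the hierarchy information lives entirely in the target sequence, so labels outside the document's sub-hierarchy (here `Tennis`) never appear in it, which loosely mirrors the abstract's point that the model structure need not grow with the full hierarchy.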

Details

Database :
arXiv
Journal :
Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 12933-12941 (2023)
Publication Type :
Report
Accession number :
edsarx.2111.11104
Document Type :
Working Paper
Full Text :
https://doi.org/10.1609/aaai.v37i11.26520