Towards Robust Prediction on Tail Labels
- Source :
- KDD
- Publication Year :
- 2021
- Publisher :
- ACM, 2021.
Abstract
- Extreme multi-label learning (XML) aims to annotate objects with relevant labels from an extremely large label set. Many previous methods treat labels uniformly, so the learned model tends to perform better on head labels while performance on tail labels deteriorates severely. However, in many real-world applications it is desirable to predict more tail labels. To alleviate this problem, we present theoretical and experimental evidence for the inferior performance of representative XML methods on tail labels. Our finding is that the norm of the label classifier weights typically follows a long-tailed distribution similar to that of the label frequency, which results in the over-suppression of tail labels. Based on this finding, we present two new modules: (1) ReRank re-ranks the predicted scores, which significantly improves performance on tail labels by eliminating the effect of label priors; (2) Taug augments tail labels via a decoupled learning scheme, which can yield a more balanced classification boundary. We conduct experiments on commonly used XML benchmarks with hundreds of thousands of labels, showing that the proposed methods improve the performance of many state-of-the-art XML models by a considerable margin (a 6% gain in PSP@1 on average). Anonymous source code is available at https://github.com/ReRANK-XML/rerank-XML.
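The abstract does not give the ReRank formula, so the following is only a minimal sketch of the general idea of re-ranking scores to offset label priors. It assumes (hypothetically) that the prior of each label is estimated as its training frequency divided by the number of training instances, and that scores are divided by that prior raised to a tunable power `tau`. The function names, the `tau` parameter, and the toy numbers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rerank_scores(scores, label_freq, num_train, tau=1.0, eps=1e-12):
    """Divide each label's score by its estimated prior raised to tau.

    Assumption: prior = label frequency / number of training instances.
    tau = 0 leaves the scores unchanged; larger tau favors rarer labels.
    """
    prior = np.asarray(label_freq, dtype=float) / float(num_train)
    return np.asarray(scores, dtype=float) / np.power(prior + eps, tau)

def top_k(scores, k):
    """Return the indices of the k highest-scoring labels."""
    return np.argsort(-np.asarray(scores, dtype=float), axis=-1)[..., :k]

# Toy data (made up): one head label, one mid-frequency label, one tail label.
scores = np.array([0.60, 0.55, 0.20])   # model scores for a single instance
label_freq = np.array([900, 50, 5])     # training frequency of each label
num_train = 1000

reranked = rerank_scores(scores, label_freq, num_train, tau=1.0)
top_before = top_k(scores, 1)
top_after = top_k(reranked, 1)
```

On this toy input the head label ranks first before re-ranking and the tail label ranks first afterwards. In practice `tau` would be tuned on held-out data to trade off head-label and tail-label accuracy.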
- Subjects :
- Scheme (programming language); Source code; Computer science; Boundary (topology); Pattern recognition; Base (topology); Set (abstract data type); Margin (machine learning); Classifier (linguistics); Artificial intelligence; XML
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
- Accession number :
- edsair.doi...........c08128f2c2ded6d530d0d54d47530556
- Full Text :
- https://doi.org/10.1145/3447548.3467223