1. Unsupervised Continual Learning Using Cross-Level Discrimination and Evidential Pseudo Out-of-Distribution Detection Along With Gradient Projected Hard Attention
- Author
Ankit Malviya and Chandresh Kumar Maurya
- Subjects
Continual learning, unsupervised learning, catastrophic forgetting, parameter isolation, rehearsal free, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Catastrophic forgetting is a prominent challenge in machine learning, where models forget previously learned knowledge when exposed to new information. Supervised Continual Learning (SCL) addresses this by adapting to changing data distributions using labeled data. However, SCL faces practical limitations in real-world settings, where labeled data is scarce. Conversely, Unsupervised Continual Learning (UCL) aims to mitigate forgetting without the need for manual annotations. Still, many previous UCL methods rely on replay-based strategies, which may not be viable in contexts where storing training data is impractical. Replay methods may also suffer from overfitting and representation drift when the buffer size is limited. To overcome the limitations of replay strategies, we propose a parameter-isolation approach combined with gradient projection. Specifically, we use task-specific hard attention and gradient projection to constrain updates to parameters crucial for previous tasks. Unlike architecture-based methods, our approach avoids network expansion and learns tasks sequentially within a fixed, predefined network architecture. Contrastive learning-based unsupervised methods are effective in preserving representation continuity; however, the contrastive loss can degrade when the diversity of negative samples decreases. To address this, we learn unsupervised representations contrastively through both direct instance grouping and cross-level discrimination between actual and pseudo groups. We employ evidential deep learning (EDL) on rotation augmentation-based pseudo groups to effectively identify out-of-distribution (OOD) instances and learn distinct class representations. Through comprehensive experiments, we demonstrate that the proposed model achieves an overall-average accuracy of 77.82% for task incremental learning (TIL) and 67.39% for class incremental learning (CIL) across various benchmark datasets, encompassing both short and long sequences of tasks. Notably, our model achieves almost zero forgetting, outperforming state-of-the-art (SOTA) baseline accuracies of 73.33% and 60.8% for TIL and CIL, respectively. Additionally, our model demonstrates a significant improvement over the SOTA SCL baseline, achieving a 2.82% and 3.68% increase in average TIL and CIL accuracy, respectively, while substantially reducing forgetting from approximately 12.66% and 19.56% to nearly zero.
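The abstract names two parameter-protection mechanisms: task-specific hard attention (in the spirit of HAT) and gradient projection. The sketch below is a minimal PyTorch illustration of how the two can be combined, under assumptions about the abstract's high-level description; the names `HardAttentionLayer`, `constrain_gradients`, `prev_mask_max`, `feature_basis`, and the scaling constant `s` are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn


class HardAttentionLayer(nn.Module):
    """Linear layer gated by a learnable, near-binary, task-specific mask."""

    def __init__(self, in_dim, out_dim, n_tasks):
        super().__init__()
        self.fc = nn.Linear(in_dim, out_dim)
        # One mask embedding per task; sigmoid(s * e_t) approaches {0, 1} as s grows.
        self.task_embed = nn.Embedding(n_tasks, out_dim)

    def mask(self, task_id, s=400.0):
        return torch.sigmoid(s * self.task_embed.weight[task_id])

    def forward(self, x, task_id, s=400.0):
        return self.fc(x) * self.mask(task_id, s)


def constrain_gradients(layer, prev_mask_max, feature_basis=None):
    """Restrict weight updates to protect parameters used by earlier tasks.

    prev_mask_max: (out_dim,) element-wise max of all previous tasks' masks.
    feature_basis: optional (in_dim, k) orthonormal basis of past-task input
        activations; gradients are projected onto its orthogonal complement.
    """
    w_grad = layer.fc.weight.grad            # shape: (out_dim, in_dim)
    if w_grad is None:
        return
    # Hard-attention constraint: units fully reserved by old tasks get ~zero update.
    w_grad.mul_((1.0 - prev_mask_max).unsqueeze(1))
    # Gradient projection: remove the gradient component lying in span(feature_basis).
    if feature_basis is not None:
        w_grad.sub_(w_grad @ feature_basis @ feature_basis.T)
```

In a training loop, `constrain_gradients` would sit between `loss.backward()` and `optimizer.step()`; once a task finishes, its mask is folded into `prev_mask_max` by an element-wise maximum, so later tasks cannot overwrite the units it claimed.

The abstract also applies evidential deep learning to rotation-based pseudo groups for OOD detection. The following is a minimal sketch of a standard EDL objective (expected cross-entropy under a Dirichlet) with a vacuity-based OOD score, assuming K pseudo-group labels such as the four rotations 0/90/180/270 degrees; `edl_loss_and_uncertainty` is a hypothetical helper, not the paper's code.

```python
import torch
import torch.nn.functional as F


def edl_loss_and_uncertainty(logits, targets):
    """logits: (batch, K) network outputs; targets: (batch,) pseudo-group labels."""
    evidence = F.softplus(logits)             # non-negative evidence per class
    alpha = evidence + 1.0                    # Dirichlet concentration parameters
    strength = alpha.sum(dim=1, keepdim=True)
    # Expected cross-entropy under Dirichlet(alpha): E[-log p_y] = psi(S) - psi(alpha_y)
    y = F.one_hot(targets, num_classes=logits.size(1)).float()
    loss = (y * (torch.digamma(strength) - torch.digamma(alpha))).sum(dim=1).mean()
    # Vacuity-based OOD score: high when total evidence is low.
    uncertainty = logits.size(1) / strength.squeeze(1)
    return loss, uncertainty
```

Instances whose `uncertainty` is high receive little total evidence across all pseudo groups, which is what makes this score usable as a pseudo-OOD detector in the sense the abstract describes.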
- Published
2024