Rethinking Resource Competition in Multi-Task Learning: From Shared Parameters to Shared Representation
- Source :
- IEEE Access, Vol 12, Pp 128717-128728 (2024)
- Publication Year :
- 2024
- Publisher :
- IEEE, 2024.
Abstract
- The core idea of Multi-Task Learning (MTL) is to develop neural networks with a shared feature extraction backbone and multiple prediction heads that infer different tasks simultaneously. Parameters in the backbone contribute to all tasks, while those in the prediction heads contribute to only one or a few tasks. Challenges arise when multiple tasks compete for resources. Existing methods focus on resource competition in the shared parameters and propose task conflicts, task dominance, and gradient stability as explanatory factors. However, the fundamental nature of MTL is still understudied. In this paper, instead of following the existing methodological research directions, we carry out a large-scale empirical study and provide deeper insight into MTL. In particular, instead of focusing on resource competition in the shared parameters of the backbone, we shift our attention to resource competition in the backbone output, which is the embedded representation shared by all prediction heads. We show that the existing explanatory factors display a weak causal relationship with model performance. We propose a novel measurement, which we term Feature Disentanglement, and show that understanding MTL problems from the perspective of how the shared representation is leveraged by different prediction heads is more faithful and reliable than understanding them from the perspective of how supervision signals from different tasks interfere in the shared parameters. Additionally, it has been common practice to replace gradients w.r.t. the shared parameters with gradients w.r.t. the shared representation to reduce computation. We conduct a comprehensive study and show that, unless a theoretical analysis can be developed, there is no general guarantee that this fast approximation technique will work in practice.
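The distinction the abstract draws between gradients w.r.t. the shared parameters and gradients w.r.t. the shared representation can be illustrated with a toy model. The sketch below is not the paper's experimental setup: it assumes a linear backbone `z = W x`, two linear heads, a summed squared-error loss, and cosine similarity between task gradients as a stand-in for "task conflict". All function names, variable names, and numbers are hypothetical and chosen only for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def task_gradients(W, v, xs, targets):
    """Gradients of one task's loss, sum_i 0.5 * (v . (W x_i) - t_i)^2.

    Returns (grad_W, grad_Z), both flattened:
      grad_W: gradient w.r.t. the shared backbone parameters W
      grad_Z: gradient w.r.t. the shared representation, i.e. the
              per-sample dL/dz_i concatenated over the batch
    """
    grad_W = [[0.0] * len(xs[0]) for _ in W]
    grad_Z = []
    for x, t in zip(xs, targets):
        z = [dot(row, x) for row in W]      # shared representation of x
        err = dot(v, z) - t                 # prediction error of this head
        g_z = [err * vk for vk in v]        # dL/dz for this sample
        grad_Z.extend(g_z)
        for i, gi in enumerate(g_z):        # chain rule: dL/dW += outer(g_z, x)
            for j, xj in enumerate(x):
                grad_W[i][j] += gi * xj
    return [g for row in grad_W for g in row], grad_Z

# Hypothetical toy data: identity backbone, two samples, two task heads.
W = [[1.0, 0.0], [0.0, 1.0]]
xs = [[1.0, 0.0], [1.0, 1.0]]
gW_a, gZ_a = task_gradients(W, [1.0, 0.0], xs, [0.0, 0.0])  # task A
gW_b, gZ_b = task_gradients(W, [1.0, 1.0], xs, [2.0, 1.0])  # task B

cos_w = cosine(gW_a, gW_b)  # task agreement measured in parameter space
cos_z = cosine(gZ_a, gZ_b)  # task agreement measured in representation space
print(round(cos_w, 3), round(cos_z, 3))  # prints 0.316 0.0
```

In this toy case the two tasks look orthogonal when measured on the shared representation but mildly aligned when measured on the shared parameters, because the parameter gradient mixes per-sample representation gradients through the inputs. It is only a small instance of how the two views can disagree, consistent with the abstract's caution about the fast approximation, and says nothing about how often or how strongly they disagree in real networks.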
Details
- Language :
- English
- ISSN :
- 21693536
- Volume :
- 12
- Database :
- Directory of Open Access Journals
- Journal :
- IEEE Access
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.188ec2ab0f5e4e9da784367f66a1edc4
- Document Type :
- article
- Full Text :
- https://doi.org/10.1109/ACCESS.2024.3429281