Rethinking Resource Competition in Multi-Task Learning: From Shared Parameters to Shared Representation

Authors :
Dayou Mao
Yuhao Chen
Yifan Wu
Maximilian Gilles
Alexander Wong
Source :
IEEE Access, Vol 12, Pp 128717-128728 (2024)
Publication Year :
2024
Publisher :
IEEE, 2024.

Abstract

The core idea of Multi-Task Learning (MTL) is to develop neural networks with a shared feature extraction backbone and multiple prediction heads, each inferring a different task simultaneously. Parameters in the backbone contribute to all tasks, while those in the prediction heads contribute to only one or fewer tasks. Challenges arise when multiple tasks compete for resources. Existing methods focus on resource competition in the shared parameters and have proposed task conflicts, task dominance, and gradient stability as explanatory factors. However, the fundamental nature of MTL is still understudied. In this paper, instead of following existing methodological research directions, we carry out a large-scale empirical study and provide deeper insight into MTL. In particular, instead of focusing on resource competition in the shared parameters of the backbone, we shift our attention to resource competition in the backbone output, which is the embedded representation shared by all prediction heads. We show that the existing explanatory factors display a weak causal relationship with model performance. We propose a novel measurement, which we term Feature Disentanglement, and show that understanding MTL problems from the perspective of how the shared representation is leveraged by different prediction heads is more faithful and reliable than understanding them from the perspective of how supervision signals from different tasks interfere in the shared parameters. Additionally, a commonly employed technique replaces gradients w.r.t. the shared parameters with gradients w.r.t. the shared representation to reduce computation. We conduct a comprehensive study and show that, unless a theoretical analysis can be developed, there is no general guarantee that this fast approximation technique would work in practice.
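
To make the two kinds of gradient named in the abstract concrete, the following is a minimal toy sketch and not the authors' code or experimental setup. It assumes a linear backbone z = W x, two linear prediction heads, and a squared-error loss; every name and number in it (W, head_1, head_2, the inputs and targets) is hypothetical and chosen only for illustration. It computes, for each task, the gradient w.r.t. the shared parameters W and the gradient w.r.t. the shared representation of a two-sample batch, then compares the cosine similarity between the two tasks in each space.

```python
import math

def matvec(W, x):
    # Linear backbone: z = W x
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def task_gradients(W, head, xs, targets):
    """Return flattened gradients of one task's summed loss 0.5 * (pred - t)^2
    w.r.t. (a) the shared parameters W and (b) the shared representations."""
    grad_W = [[0.0] * len(W[0]) for _ in W]
    grad_Z = []
    for x, t in zip(xs, targets):
        z = matvec(W, x)                  # shared representation of this sample
        err = dot(head, z) - t            # dLoss/dpred for squared error
        g_z = [err * h for h in head]     # gradient w.r.t. z
        grad_Z.extend(g_z)                # per-sample gradients are concatenated
        for i, gz in enumerate(g_z):
            for j, xj in enumerate(x):
                grad_W[i][j] += gz * xj   # chain rule: dLoss/dW = g_z x^T, summed over the batch
    return [v for row in grad_W for v in row], grad_Z

# Hypothetical toy setup (assumed values, not taken from the paper).
W = [[1.0, 0.0], [0.0, 1.0]]              # shared backbone parameters
head_1 = [1.0, 0.0]                       # prediction head of task 1
head_2 = [1.0, 1.0]                       # prediction head of task 2
xs = [[1.0, 0.0], [1.0, 1.0]]             # batch of two correlated inputs

gW1, gZ1 = task_gradients(W, head_1, xs, targets=[0.0, 2.0])
gW2, gZ2 = task_gradients(W, head_2, xs, targets=[0.0, 1.5])

cos_W = cosine(gW1, gW2)  # task agreement measured in shared-parameter space
cos_Z = cosine(gZ1, gZ2)  # task agreement measured in shared-representation space
print(f"cosine in parameter space:      {cos_W:+.4f}")
print(f"cosine in representation space: {cos_Z:+.4f}")
```

In this hand-picked example the two cosines have opposite signs: the tasks appear to conflict in parameter space but to agree in representation space. The parameter-space gradient sums per-sample contributions weighted by the inputs, so the two views need not coincide once the batch contains correlated inputs. This only illustrates that the substitution is an approximation; it says nothing about how often or how strongly the two disagree in real networks.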

Details

Language :
English
ISSN :
2169-3536
Volume :
12
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.188ec2ab0f5e4e9da784367f66a1edc4
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2024.3429281