Multitask learning, biased competition, and inter-task interference
- Publication Year :
- 2021
- Publisher :
- University of Manchester, 2021.
Abstract
- With recent advances in machine learning research, it is now possible for many simple tasks to be accurately modelled at human-level performance. Modern machine learning is therefore increasingly focused on more complex tasks. Complex datasets in machine learning are often presented as a monolithic whole, and are in turn treated as such. However, as tasks become more complex, we begin to see interference between the subtasks which comprise them. This phenomenon is known as inter-task interference. If it is not addressed, a machine learning model will attempt to find a single approximation which performs equally well on many tasks, thereby lowering performance in comparison to the union of the tasks solved individually. To address this issue, this thesis argues that practitioners of machine learning should account for the tasks and subtasks within complex datasets by approaching them in a multi-task learning manner. Visual Question Answering exemplifies this complexity and, we hypothesise, constitutes a collection of shared tasks in a domain, rather than a single task, as it has previously been treated. We argue for a new view of neural network training as a collection of subnetworks that can be generated and exploited as needed for a given task. This thesis presents three papers: the first analyses the multi-task nature of the Visual Question Answering task; the second proposes a new method, inspired by Biased Competition, by which this structure can be exploited; and the third demonstrates that this methodology applies to other, less complex datasets with similar benefits. Through empirical evidence, we demonstrate that Visual Question Answering can be viewed as a multi-task learning problem, resulting in both improved performance and better generalisation. We then present the second of the three papers, in which we make the primary contributions of this thesis.
These contributions include an argument for learning complex tasks as a collection of component subtasks, and the proposal of the Biased Competition model, which takes advantage of this collection of subtasks. In addition, we present a new toy dataset for developing and analysing Visual Question Answering systems quickly while preserving the complexity of the original task in the reasoning component. We demonstrate that the Biased Competition model is capable of learning to separate tasks in an end-to-end manner without requiring additional task-type labels, and in doing so transfers knowledge about component subtasks between tasks. This both significantly reduces catastrophic interference between tasks and, in some cases, improves performance as more tasks are added to the domain. Finally, we present a third paper in which we demonstrate that the Biased Competition model is applicable to simpler datasets, showing that it acts to counteract overfitting and improves the performance of both a Deep Convolutional Neural Network model and a Deep Long Short-Term Memory model by learning to separate tasks.
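The inter-task interference described in the abstract can be illustrated with a toy example (not from the thesis itself): two conflicting subtasks pooled into one dataset force a single shared linear model toward a compromise that serves neither, while per-task fits are exact. All data and function names here are illustrative.

```python
# Illustrative sketch: inter-task interference in a shared model y = w * x.
# Task A demands y = +x, task B demands y = -x; the pooled fit averages to 0.

xs = [1.0, 2.0, 3.0]
task_a = [(x, x) for x in xs]    # subtask A: y = +x
task_b = [(x, -x) for x in xs]   # subtask B: y = -x

def fit_w(data):
    """Closed-form least-squares slope for y = w * x."""
    num = sum(x * y for x, y in data)
    den = sum(x * x for x, _ in data)
    return num / den

def mse(w, data):
    """Mean squared error of the model y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

w_shared = fit_w(task_a + task_b)           # single approximation -> w = 0.0
w_a, w_b = fit_w(task_a), fit_w(task_b)     # tasks solved individually -> +1.0, -1.0

# The shared compromise underperforms the union of the individual solutions.
assert w_shared == 0.0
assert (w_a, w_b) == (1.0, -1.0)
assert mse(w_shared, task_a + task_b) > mse(w_a, task_a) + mse(w_b, task_b)
```

Here the pooled model's error is strictly positive on every example, while each per-task model is exact, mirroring the abstract's claim that a single approximation lowers performance relative to the tasks solved individually.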
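The "collection of subnetworks" view can be sketched in miniature with task-conditioned gating: a per-task gate zeroes out shared hidden units the task does not use, so each task effectively selects its own subnetwork. This is only a hand-rolled illustration of the general idea; the gate values, unit counts, and names below are hypothetical and do not reproduce the thesis's Biased Competition architecture.

```python
# Hypothetical sketch of task-conditioned gating in the spirit of biased
# competition: each task biases the competition among shared units by
# gating a different subset, yielding a task-specific subnetwork.

shared_units = [1.0, 2.0, 3.0, 4.0]   # activations of a shared hidden layer

gates = {
    "task_a": [1, 1, 0, 0],   # task A wins the competition for units 0-1
    "task_b": [0, 0, 1, 1],   # task B wins the competition for units 2-3
}

def subnetwork(units, task):
    """Apply the task's gate elementwise, zeroing units the task loses."""
    return [u * g for u, g in zip(units, gates[task])]

out_a = subnetwork(shared_units, "task_a")   # [1.0, 2.0, 0.0, 0.0]
out_b = subnetwork(shared_units, "task_b")   # [0.0, 0.0, 3.0, 4.0]
```

In a trained system such gates would be learned end-to-end rather than fixed by hand; the point of the sketch is only that disjoint gates stop the two tasks from overwriting each other's units, which is the mechanism by which catastrophic interference is reduced.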
Details
- Language :
- English
- Database :
- British Library EThOS
- Publication Type :
- Dissertation/Thesis
- Accession number :
- edsble.836098
- Document Type :
- Electronic Thesis or Dissertation