Abstract

With the success of deep learning in a wide variety of areas, many deep multi-task learning (MTL) models have been proposed, claiming performance improvements obtained by sharing learned structure across several related tasks. However, the dynamics of multi-task learning in deep neural networks are still not well understood at either the theoretical or the experimental level. In particular, the usefulness of different task pairs is not known a priori. In practice, this means that properly combining the losses of different tasks becomes a critical issue in multi-task learning, as different methods may yield different results. In this paper, we benchmark different multi-task learning approaches that use a shared-trunk architecture with task-specific branches across three MTL datasets. On the first dataset, Multi-MNIST (Modified National Institute of Standards and Technology database), we thoroughly test several weighting strategies, including simply adding the task-specific cost functions together, dynamic weight average (DWA), and uncertainty weighting, each with varying amounts of training data per task. We find that multi-task learning typically does not improve performance for a user-defined combination of tasks. Further experiments on diverse tasks, network architectures, and datasets suggest that multi-task learning requires careful selection of both task pairs and weighting strategies to equal or exceed the performance of single-task learning.
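The abstract refers to a shared-trunk architecture with task-specific branches. Below is a minimal PyTorch sketch of such a model for a two-digit Multi-MNIST setup; the layer sizes and the `SharedTrunkMTL` name are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch (not the authors' exact model) of a shared-trunk network with
# two task-specific classification heads, as commonly used for Multi-MNIST,
# where each image contains two digits and each head predicts one of them.
import torch
import torch.nn as nn

class SharedTrunkMTL(nn.Module):
    def __init__(self, num_classes_per_task=10):
        super().__init__()
        # Shared trunk: parameters reused by every task.
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Flatten(),
        )
        feat_dim = 64 * 7 * 7  # 28x28 input after two 2x2 poolings
        # Task-specific branches: one classifier head per task.
        self.head_left = nn.Linear(feat_dim, num_classes_per_task)
        self.head_right = nn.Linear(feat_dim, num_classes_per_task)

    def forward(self, x):
        feats = self.trunk(x)
        return self.head_left(feats), self.head_right(feats)
```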

Highlights

  • The goal of multi-task learning (MTL) is to learn multiple different yet related tasks simultaneously [1]

  • Our contributions are as follows: a) To the best of our knowledge, this is the first meta-analysis that extensively compares different weighting approaches for combining multiple loss functions in the context of both heterogeneous and homogeneous MTL. b) The key observation is that MTL approaches which enforce shared data representations can be more effective when training samples are scarce. c) We find that many results obtained with an arbitrarily chosen set of tasks, which we refer to as user-defined tasks, may not achieve performance gains over single-task learning (STL), which calls for more rigorous theoretical analysis from the deep learning community

  • Experiments: We developed a basic experiment on the aforementioned Multi-MNIST dataset to demonstrate the effect of MTL (a sketch of one training step is shown after this list)
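
The experiment highlight above uses a two-head Multi-MNIST setup; a minimal training-step sketch under the simplest strategy of adding the task-specific losses together might look like the following. The `train_step` helper and its argument names are hypothetical.

```python
# Hypothetical training step: equal weighting simply sums the per-task losses.
import torch
import torch.nn.functional as F

def train_step(model, optimizer, images, labels_left, labels_right):
    optimizer.zero_grad()
    logits_left, logits_right = model(images)          # shared trunk, two heads
    loss_left = F.cross_entropy(logits_left, labels_left)
    loss_right = F.cross_entropy(logits_right, labels_right)
    total_loss = loss_left + loss_right                # uniform combination
    total_loss.backward()
    optimizer.step()
    return loss_left.item(), loss_right.item()
```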


Summary

Introduction

The goal of multi-task learning (MTL) is to learn multiple different yet related tasks simultaneously [1]. The weighting approaches we evaluated include a uniform combination of losses from different tasks, dynamic weight average (DWA) [10], and uncertainty weighting methods [11] [12], each with varying amounts of training data per task. We conduct experiments comparing STL and MTL with two classification heads on the Multi-MNIST dataset.
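For the two adaptive strategies named above, a rough sketch follows. DWA weights each task by the ratio of its losses over the last two epochs, passed through a softmax at a temperature (T = 2 is a common choice, assumed here); uncertainty weighting learns one log-variance per task. Both are commonly used formulations rather than the paper's exact implementation.

```python
# Sketches of the two adaptive weighting schemes compared in the paper; the
# hyperparameters (e.g. the DWA temperature) are assumptions, not reported settings.
import math
import torch
import torch.nn as nn

def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Dynamic Weight Average: up-weight tasks whose loss is shrinking slowly."""
    num_tasks = len(prev_losses)
    ratios = [l1 / l2 for l1, l2 in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    denom = sum(exps)
    return [num_tasks * e / denom for e in exps]

class UncertaintyWeighting(nn.Module):
    """Homoscedastic-uncertainty weighting: learn one log-variance per task."""
    def __init__(self, num_tasks=2):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for i, loss in enumerate(task_losses):
            # Down-weight high-variance tasks; the additive log-variance term
            # penalizes inflating the variance arbitrarily.
            total = total + torch.exp(-self.log_vars[i]) * loss + self.log_vars[i]
        return total
```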
