Abstract
In this paper we propose a novel Multi-Task Learning (MTL) framework, Split ‘n’ Merge Net. We draw inspiration from the multi-head attention formulation of Transformers and propose a novel, simple, and interpretable pathway to process information captured and exploited by multiple tasks. In particular, we propose a novel splitting network design which, empowered with multi-head attention, generates dynamic masks to filter task-specific information and task-agnostic shared factors from the input. To drive this generation, and to avoid oversharing of information between the tasks, we propose a novel formulation of the mutual information loss which encourages the generated split embeddings to be as distinct as possible. A unique merging network is also introduced to fuse the task-specific and shared information and generate an augmented embedding for the individual downstream tasks in the MTL pipeline. We evaluate the proposed Split ‘n’ Merge Net on two distinct MTL tasks, achieving state-of-the-art results on both. Our primary, ablation, and interpretation evaluations indicate the robustness and flexibility of the proposed approach and demonstrate its applicability to numerous, diverse real-world MTL applications.
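Since only the abstract is available here, the following is a minimal PyTorch sketch of how the described split-and-merge pipeline could be organised. The mask parameterisation (sigmoid over multi-head attention outputs), the learned per-split queries, and the squared-cosine-similarity penalty used as a surrogate for the paper's mutual information loss are all illustrative assumptions, not the authors' actual formulation.

```python
# Hypothetical sketch of the Split 'n' Merge idea described in the abstract.
# Layer sizes, the mask parameterisation, and the distinctness loss below
# are assumed stand-ins, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SplitNet(nn.Module):
    """Generates dynamic masks (one per task, plus one shared) via
    multi-head attention and applies them to the input embedding."""

    def __init__(self, dim: int, num_tasks: int, num_heads: int = 4):
        super().__init__()
        self.num_splits = num_tasks + 1  # task-specific splits + shared split
        # Learned queries, one per split; keys/values come from the input.
        self.queries = nn.Parameter(torch.randn(self.num_splits, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.to_mask = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        # x: (batch, dim) embedding from a shared backbone.
        b = x.size(0)
        q = self.queries.unsqueeze(0).expand(b, -1, -1)  # (b, splits, dim)
        kv = x.unsqueeze(1)                              # (b, 1, dim)
        attended, _ = self.attn(q, kv, kv)               # (b, splits, dim)
        masks = torch.sigmoid(self.to_mask(attended))    # dynamic masks in (0, 1)
        # Each split is the input filtered by its own mask.
        return [masks[:, i] * x for i in range(self.num_splits)]


def distinctness_loss(splits: list[torch.Tensor]) -> torch.Tensor:
    """Assumed surrogate for the mutual information loss: penalise
    pairwise cosine similarity so the split embeddings stay distinct."""
    loss = splits[0].new_zeros(())
    for i in range(len(splits)):
        for j in range(i + 1, len(splits)):
            sim = F.cosine_similarity(splits[i], splits[j], dim=-1)
            loss = loss + sim.pow(2).mean()
    return loss


class MergeNet(nn.Module):
    """Fuses a task-specific split with the shared split into an
    augmented embedding for that task's downstream head."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, task_split: torch.Tensor, shared: torch.Tensor) -> torch.Tensor:
        return self.proj(torch.cat([task_split, shared], dim=-1))


if __name__ == "__main__":
    dim, num_tasks = 64, 2
    split_net, merge_net = SplitNet(dim, num_tasks), MergeNet(dim)
    x = torch.randn(8, dim)
    *task_splits, shared = split_net(x)
    fused = [merge_net(s, shared) for s in task_splits]  # one embedding per task
    print(distinctness_loss(task_splits + [shared]).item(), fused[0].shape)
```

In this reading, the distinctness penalty discourages the splits from encoding redundant factors, while the merge step re-injects the shared factors into each task's augmented embedding before its downstream head.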