Abstract
Multi-task learning (MTL) is a learning paradigm that jointly optimizes multiple tasks with a single neural network for better performance and generalization. In practice, MTL rests on the assumption that a common dataset with ground-truth labels for every downstream task is available. However, collecting such a common annotated dataset is laborious for complex computer vision tasks such as saliency estimation, which requires eye-fixation points as ground-truth data. To this end, we propose a novel MTL framework that operates without a common annotated dataset for the joint estimation of two important computer vision tasks: object detection and saliency estimation. Unlike many state-of-the-art methods that rely on a common annotated dataset for training, we draw annotations from different datasets to jointly train the different tasks; we call this setting cross-domain MTL. We adapt the MUTAN [3] framework to fuse features from different datasets and learn domain-invariant features that capture the relatedness of the tasks. We demonstrate the improved performance and generalizability of our MTL architecture, and show that the proposed network reduces the memory footprint by 13% through parameter sharing between the related tasks.
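The MUTAN fusion mentioned above is, at its core, a bilinear interaction between two feature vectors factorized through a Tucker decomposition. The following NumPy sketch illustrates that fusion step only; all dimensions, weight names, and the `tanh` nonlinearity are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not taken from the paper)
dx, dy = 64, 48           # input feature sizes from the two domains/datasets
tx, ty, tz = 16, 16, 32   # reduced Tucker factor dimensions
do = 24                   # fused output size

# Randomly initialized projection weights and Tucker core tensor
Wx = rng.standard_normal((dx, tx))
Wy = rng.standard_normal((dy, ty))
core = rng.standard_normal((tx, ty, tz))
Wo = rng.standard_normal((tz, do))

def mutan_fusion(x, y):
    """Tucker-factorized bilinear fusion of two feature vectors."""
    xp = np.tanh(x @ Wx)                        # project domain-1 features
    yp = np.tanh(y @ Wy)                        # project domain-2 features
    z = np.einsum('i,ijk,j->k', xp, core, yp)   # bilinear interaction via core
    return z @ Wo                               # map to the fused representation

fused = mutan_fusion(rng.standard_normal(dx), rng.standard_normal(dy))
print(fused.shape)  # (24,)
```

Factorizing the full bilinear tensor this way is what keeps the number of fusion parameters tractable, which is consistent with the parameter-sharing motivation stated in the abstract.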