Abstract
Multimedia data, including image, video, text, audio, and 3D models, are emerging rapidly on the Internet. Jointly correlating data of various media types is a challenging task. Drawing on the considerable learning ability of deep networks, existing works mainly construct multi-pathway networks to learn cross-media correlation, where each pathway handles one media type. However, as the number of media types increases, existing methods suffer from high repetition and complexity, leading to overfitting and poor generalization ability, which adversely affects correlation learning. To address these issues, we propose a cross-media deep compression and regularization (CDCR) approach for quintuple-media joint correlation learning: 1) cross-media partial weight-sharing networks are proposed, where a part of the parameters is shared among multiple pathways, to exploit common characteristics across different media types for capturing intrinsic cross-media correlation; 2) we propose media-adaptive network pruning to drop connections between weakly correlated neurons, which can emphasize media-specific characteristics adaptively; and 3) cross-media network regularization is proposed to utilize relationships among quintuple-media data, which can guarantee generalization ability and enhance intra-media and inter-media correlation. The experiments verify the effectiveness of our approach, which outperforms the state-of-the-art methods on two very challenging datasets, including the large-scale dataset PKU XMediaNet with more than 100 000 quintuple-media instances.
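As a rough illustration of the first two ideas in the abstract, the sketch below builds five toy pathways that each keep a private input layer but reuse one shared layer, then prunes a private layer. It is not the CDCR implementation: the layer sizes, the function names, and the use of plain weight-magnitude pruning as a stand-in for the media-adaptive criterion are all assumptions made for this example.

```python
import random

# Hypothetical input feature sizes for the five media types (assumed values).
MEDIA_DIMS = {"image": 8, "video": 8, "text": 6, "audio": 4, "3d": 5}
HIDDEN, COMMON = 6, 3  # assumed hidden and common-space sizes


def make_matrix(rows, cols, rng):
    """Random weight matrix stored as a list of rows."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def relu(vec):
    return [max(0.0, v) for v in vec]


def build_network(seed=0):
    """One private layer per media type plus a single shared layer."""
    rng = random.Random(seed)
    private = {m: make_matrix(HIDDEN, d, rng) for m, d in MEDIA_DIMS.items()}
    shared = make_matrix(COMMON, HIDDEN, rng)  # reused by every pathway
    return private, shared


def forward(private, shared, media, x):
    """Private layer first, then the shared layer maps into a common space."""
    return matvec(shared, relu(matvec(private[media], x)))


def prune_smallest(matrix, fraction):
    """Zero the given fraction of weights with the smallest magnitude.

    Stand-in criterion only; the paper prunes by neuron correlation.
    """
    magnitudes = sorted(abs(w) for row in matrix for w in row)
    k = int(len(magnitudes) * fraction)
    if k == 0:
        return [row[:] for row in matrix]
    threshold = magnitudes[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in matrix]


private, shared = build_network()
text_embedding = forward(private, shared, "text", [1.0] * MEDIA_DIMS["text"])
# A different pruning fraction per media type would make the pruning media-specific.
private["audio"] = prune_smallest(private["audio"], 0.5)
```

Because the shared layer is a single object, its parameters are stored once rather than five times, which is the source of the compression in this toy setting; the private layers remain free to be pruned at a different rate for each media type.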
More From: IEEE Transactions on Circuits and Systems for Video Technology