Cross-view image synthesis aims to synthesize a ground-view image covering the same geographic region as a given single aerial-view image (or vice versa). Existing approaches typically tackle this challenging task by relaxing the single-image constraint and using a ground-truth semantic map as an additional input to aid synthesis. However, such ground-truth maps are rarely available in practice. In this paper, we investigate how to generate a detail-enriched and structurally accurate ground-level image from only a single aerial-level input image, with no prior knowledge available other than the input image itself. Towards this goal, we propose a novel Progressive Parallel Generative Adversarial Network (PPGAN) that starts by generating low-resolution outputs and progressively produces ground images at higher resolutions as the network propagates forward. In this manner, our PPGAN decomposes the task into several manageable sub-tasks, which helps to generate detail-enriched and structurally accurate ground images. During progressive generation, the PPGAN adopts a parallel generation paradigm that enables the generator to produce multi-resolution images in parallel, thereby avoiding excessive training time. Furthermore, for effective information propagation across multi-resolution images, a feature fusion module (FFM) is devised to mitigate the domain gap between cross-level image features, balancing the synthesis of detail and structural information. Additionally, the proposed Channel-Space Attention Selection Module (CSASM) learns the mapping between cross-view images in a larger-scale feature space to enhance the quality of the output image. Quantitative and qualitative experiments demonstrate that our method, which requires only a single input image without the aid of additional inputs, is capable of synthesizing detail-enriched and structurally accurate ground images and outperforms existing state-of-the-art methods on two widely used benchmarks.
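To make the progressive parallel generation idea concrete, the following PyTorch sketch shows one way a generator can emit multi-resolution outputs in a single forward pass, with a simple fusion step propagating coarse features to finer stages. This is a minimal illustration under assumed design choices, not the authors' implementation: the class names (`FeatureFusion`, `ParallelProgressiveGenerator`), layer choices, and resolutions are hypothetical, the fusion is a placeholder for the actual FFM, and the CSASM is omitted entirely.

```python
# Minimal sketch of parallel multi-resolution generation (assumed design,
# not the actual PPGAN architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureFusion(nn.Module):
    """Illustrative stand-in for the paper's FFM: upsamples coarse-level
    features and merges them with the current level's features."""

    def __init__(self, channels: int):
        super().__init__()
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, coarse: torch.Tensor, fine: torch.Tensor) -> torch.Tensor:
        coarse_up = F.interpolate(coarse, scale_factor=2, mode="nearest")
        return F.relu(self.merge(torch.cat([coarse_up, fine], dim=1)))


class ParallelProgressiveGenerator(nn.Module):
    """Emits ground-view predictions at several resolutions in one forward
    pass, so all scales are trained together rather than stage by stage."""

    def __init__(self, channels: int = 64, num_stages: int = 3):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        self.stages = nn.ModuleList(
            [nn.Conv2d(channels, channels, kernel_size=3, padding=1)
             for _ in range(num_stages)]
        )
        self.fusions = nn.ModuleList(
            [FeatureFusion(channels) for _ in range(num_stages - 1)]
        )
        # One RGB head per stage: every resolution is produced in parallel.
        self.to_rgb = nn.ModuleList(
            [nn.Conv2d(channels, 3, kernel_size=1) for _ in range(num_stages)]
        )

    def forward(self, aerial: torch.Tensor) -> list:
        outputs, feat = [], None
        for i, stage in enumerate(self.stages):
            # Coarser stages see a downsampled copy of the aerial input.
            scale = 2 ** (len(self.stages) - 1 - i)
            x = F.avg_pool2d(aerial, scale) if scale > 1 else aerial
            cur = F.relu(stage(F.relu(self.stem(x))))
            # Fuse the previous (coarser) stage's features into this stage.
            feat = cur if feat is None else self.fusions[i - 1](feat, cur)
            outputs.append(torch.tanh(self.to_rgb[i](feat)))
        return outputs  # low-to-high-resolution ground-view predictions


gen = ParallelProgressiveGenerator()
preds = gen(torch.randn(1, 3, 64, 64))
assert [p.shape[-1] for p in preds] == [16, 32, 64]
```

Under this sketch, each entry of `preds` could be supervised against a correspondingly downsampled ground-truth image, which is one way to realize the decomposition into manageable sub-tasks described above.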