Abstract

For many computer vision applications, learned models typically perform well on the training datasets but suffer significant performance degradation when deployed in new environments, where there are usually style differences between the training images and the testing images. For high-level vision tasks, an effective domain generalizable model is expected to learn feature representations that are both generalizable and discriminative. In this paper, we design a novel Style Normalization and Restitution (SNR) module to simultaneously ensure high generalization and discrimination capability of the networks. In particular, SNR filters out style variations (e.g., illumination, color contrast) by performing Instance Normalization (IN) to obtain style-normalized features, which reduces the discrepancy among different samples/domains. However, such a process is task-ignorant and inevitably removes some task-relevant discriminative information, which may hurt performance. To remedy this, we propose to distill task-relevant discriminative features from the residual (i.e., the difference between the original feature and the style-normalized feature) and add them back to the network to ensure high discrimination. Moreover, for better disentanglement, we enforce a dual restitution loss constraint to encourage better separation of task-relevant and task-irrelevant features. We validate the effectiveness of SNR on different vision tasks, including classification, semantic segmentation, and object detection. Experiments demonstrate that SNR improves network performance for both domain generalization (DG) and unsupervised domain adaptation (UDA).
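To make the normalization-then-restitution step concrete, below is a minimal PyTorch-style sketch of one plausible SNR block. It assumes an SE-style channel-attention gate to split the residual into task-relevant and task-irrelevant parts; the gate design, the bottleneck ratio `reduction`, and the module/parameter names are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SNRBlock(nn.Module):
    """Minimal sketch of a Style Normalization and Restitution block.

    The channel-attention gate and its bottleneck ratio are assumptions
    made for illustration, not the paper's exact design.
    """
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Instance Normalization removes instance-specific style statistics.
        self.instance_norm = nn.InstanceNorm2d(channels, affine=True)
        # Assumed SE-style channel gate that scores how task-relevant
        # each channel of the residual is.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor):
        f_norm = self.instance_norm(x)      # style-normalized features
        residual = x - f_norm               # information discarded by IN
        a = self.gate(residual)             # per-channel attention in [0, 1]
        r_plus = residual * a               # task-relevant part of the residual
        r_minus = residual * (1.0 - a)      # task-irrelevant (style) part
        out = f_norm + r_plus               # restitution: add relevant info back
        # r_plus and r_minus would feed the dual restitution loss in training.
        return out, r_plus, r_minus
```

As a usage sketch, `out, r_plus, r_minus = SNRBlock(256)(torch.randn(2, 256, 32, 32))` inserts the block after a convolutional stage; during training, the dual restitution loss would push `r_plus` to be discriminative and `r_minus` to be uninformative for the task.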
