Abstract

Although deep neural networks have achieved superior performance on many classical tasks, their performance deteriorates in real-world applications due to unpredictable distribution shifts. Domain generalization (DG) aims to improve a predictive model's generalization to unseen domains by training on multiple available source domains; these domains share the same categories but typically follow different distributions. In this paper, we establish a new theoretical framework for domain generalization from the perspective of the information bottleneck (IB) principle, which links representation learning in DG to domain-invariant representation learning and maximizing feature entropy (MFE). Based on this framework, we provide a feasible solution that combines class-wise instance discrimination with inter-dimension decorrelation and intra-dimension uniformity to learn the desired representation for domain generalization, achieving excellent performance on multiple datasets without requiring domain labels. Extensive experiments show that the proposed regularization rule (MFE) consistently improves invariance-based DG methods. Moreover, treating adversarial robustness as an extreme case of domain generalization, we show that MFE is also promising for improving adversarial robustness.
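The abstract gives no equations, so the following is only a minimal sketch of what an MFE-style regularizer could look like, assuming the inter-dimension decorrelation term penalizes off-diagonal entries of the feature correlation matrix (as in Barlow Twins-style objectives) and the intra-dimension uniformity term applies a Gaussian-potential uniformity loss within each feature dimension. The function name mfe_regularizer and all implementation details below are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch of an MFE-style regularizer; the losses in the paper
# may differ. It only illustrates the two terms named in the abstract.
import torch

def mfe_regularizer(z: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """z: (batch, dim) feature matrix produced by the encoder."""
    n, d = z.shape
    # Standardize each dimension across the batch.
    z = (z - z.mean(dim=0)) / (z.std(dim=0) + eps)

    # Inter-dimension decorrelation: penalize off-diagonal entries of the
    # feature correlation matrix so dimensions carry non-redundant information.
    corr = (z.T @ z) / n                              # (dim, dim)
    off_diag = corr - torch.diag(torch.diag(corr))
    decorrelation_loss = (off_diag ** 2).sum() / d

    # Intra-dimension uniformity: spread samples apart within each dimension
    # via a Gaussian-potential term (per-dimension pairwise distances),
    # pushing each marginal toward higher entropy.
    pdist = torch.cdist(z.T.unsqueeze(-1), z.T.unsqueeze(-1))  # (dim, n, n)
    uniformity_loss = torch.log(torch.exp(-2.0 * pdist ** 2).mean())

    return decorrelation_loss + uniformity_loss
```

In training, such a term would be added to the class-wise instance discrimination objective with a weighting hyperparameter, e.g. loss = id_loss + lambda_mfe * mfe_regularizer(features), where lambda_mfe is likewise an assumed name.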
