Unpaired image-to-image translation learns a mapping between two domains without paired training data. One approach is patchwise contrastive learning, a one-sided translation method that maximizes mutual information between corresponding input and output patches while treating noncorresponding patches as negatives. Previous approaches select noncorresponding patches at random, so semantically similar patches can be incorrectly labeled as negatives. Inspired by negative learning, we propose a novel patchwise negative learning loss to address this issue. Unlike prior methods, we do not naively minimize mutual information between the query patch and all noncorresponding patches. Instead, we choose a single noncorresponding patch and maximize its dissimilarity with the query patch. Selecting only one noncorresponding patch reduces the chance of choosing a false negative that shares high mutual information with the query. By further maximizing dissimilarity with that single negative, we discourage the model from fitting to noisy negative patches. We demonstrate the capabilities of our model in comparison with other prominent image translation methods on the selfie2anime, horse2zebra, and cat2dog datasets.
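The single-negative idea described above can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss, so everything here is an assumption made for illustration: the function names `pick_negative` and `negative_loss`, the use of cosine similarity, the sigmoid mapping, the temperature `tau`, and the `-log(1 - p)` penalty borrowed from the general negative learning setup. It should not be read as the authors' implementation.

```python
import math
import random


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def pick_negative(num_patches, query_index, rng):
    """Pick ONE noncorresponding patch index (hypothetical helper).

    Uniform sampling is an assumption; the abstract only says that a
    single noncorresponding patch is chosen.
    """
    candidates = [i for i in range(num_patches) if i != query_index]
    return rng.choice(candidates)


def negative_loss(query, negative, tau=0.07):
    """Assumed negative-learning-style penalty for a single negative.

    The similarity is squashed to a pseudo-probability p in (0, 1), and
    -log(1 - p) is small when the two patches are dissimilar and large
    when they are similar, so minimizing it pushes the pair apart.
    """
    p = 1.0 / (1.0 + math.exp(-cosine(query, negative) / tau))
    return -math.log(1.0 - p + 1e-12)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy 2-D patch features; index 0 is the patch corresponding to the query.
    patches = [[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0]]
    query = [1.0, 0.0]
    neg_index = pick_negative(len(patches), query_index=0, rng=rng)
    print(neg_index, negative_loss(query, patches[neg_index]))
```

In this toy setup, a negative that points away from the query incurs a much smaller penalty than one that nearly coincides with it. Because only one negative enters the loss per query, a semantically similar patch is drawn as a false negative less often than when all noncorresponding patches are used at once.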