Abstract

Recently, out-of-distribution (OOD) detection has received considerable attention, because confident labels assigned to OOD examples represent a vulnerability similar to adversarial input perturbation. We are interested in models that combine the benefits of being robust to adversarial input and being able to detect OOD examples. Furthermore, we require that both in-distribution classification and OOD detection be robust to adversarial input perturbation. Several related studies combine design choices in an ad hoc manner to achieve similar goals. Various functions over the logit or softmax layer can be used to define training objectives, OOD detection methods, and adversarial attacks. Here, we present a design space that covers such design choices, as well as a principled way of evaluating the networks. This includes a strong attack scenario where both in-distribution and OOD examples are adversarially perturbed to mislead OOD detection. We draw several interesting conclusions based on our empirical analysis of this design space. Most importantly, we argue that the key factor is not the OOD training or detection method in itself, but rather the application of matching detection and training methods.
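To make the notion of "functions over the logit or softmax layer" concrete, the sketch below shows two widely used scoring functions (maximum softmax probability and maximum logit) and how thresholding such a score yields an OOD detector. This is a minimal illustration under our own assumptions; the function names and threshold are illustrative and do not come from the paper.

```python
# Illustrative sketch (not the authors' implementation) of scoring functions
# defined over the logit / softmax layer. Such functions can serve as OOD
# detection scores, as training objectives, or as targets for detection-aware
# adversarial attacks.
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def max_softmax_probability(logits: np.ndarray) -> np.ndarray:
    """MSP score: high for confident (likely in-distribution) inputs."""
    return softmax(logits).max(axis=-1)

def max_logit(logits: np.ndarray) -> np.ndarray:
    """Max-logit score: an alternative detector that skips normalization."""
    return logits.max(axis=-1)

def detect_ood(logits: np.ndarray, score_fn, threshold: float) -> np.ndarray:
    """Flag inputs whose score falls below the chosen threshold as OOD."""
    return score_fn(logits) < threshold

# Toy usage: one confident prediction and one near-uniform prediction.
logits = np.array([[6.0, 0.5, -1.0], [0.2, 0.1, 0.3]])
print(detect_ood(logits, max_softmax_probability, threshold=0.9))  # [False  True]
```

In this framing, the paper's "matching detection and training methods" corresponds to using the same (or compatible) scoring function both when training the network on OOD data and when thresholding the score at detection time.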
