Abstract

Deep learning (DL) applications, an emerging form of software, are gaining increasing popularity for their intelligent and adaptive services. However, their reliability depends heavily on the prediction accuracy of their internally integrated DL models. In practice, DL models often suffer from erroneous predictions on abnormal inputs (e.g., adversarial samples and out-of-distribution (OOD) samples), which can easily lead to unexpected behaviors or even catastrophic consequences (e.g., system crashes). One promising way to guard application reliability is to reveal such abnormal inputs before they are fed to the integrated DL models. Remedial actions (e.g., discarding or fixing these inputs) can then be taken to protect applications from behaving abnormally. Existing work addresses this problem either by distance-based comparison among samples or by generating a large number of model mutants for comparative analysis. However, the former focuses on samples alone and overlooks the DL models themselves, while the latter must analyze massive mutants, incurring non-negligible overhead for applications. In this article, we propose a novel approach, NetChopper, which conducts a core analysis on the target DL model and partitions it into two parts: the model core, which associates closely with the training knowledge (expected to be important and thus stable), and the remaining part (expected to be immaterial and thus changeable). Based on this partitioning, NetChopper preserves (freezes) the model core and mutates only the remaining part, producing a small number of model mutants. NetChopper then distinguishes abnormal inputs from normal ones by exploiting these model-relevant, lightweight mutants alone. We experimentally evaluated NetChopper on widely used DL subjects (e.g., MNIST+LeNet4 and CIFAR10+VGG16) with typical abnormal inputs (e.g., adversarial and OOD samples). The results show that NetChopper achieves promising AUROC scores in revealing the degree of abnormality of inputs, generally and stably outperforming, or performing comparably to, state-of-the-art techniques (e.g., mMutant, Surprise, and Mahalanobis), and confirm its high effectiveness and efficiency, with only marginal online overhead.
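
For intuition, the following is a minimal, hypothetical Python/PyTorch sketch of the partition-freeze-mutate-and-compare idea the abstract describes. The core/remainder split (here, a simple parameter-name prefix list), the noise scale, and the disagreement score are illustrative assumptions only, not NetChopper's actual core-analysis algorithm.

```python
# Minimal sketch (assumptions only) of the partition-freeze-mutate idea:
# freeze an assumed "model core", lightly perturb the remaining weights to
# obtain a few mutants, then score inputs by mutant/original disagreement.
import copy
import torch
import torch.nn as nn

def make_mutants(model: nn.Module, core_prefixes, n_mutants=5, sigma=0.05):
    """Create lightweight mutants. Parameters whose names start with any
    prefix in `core_prefixes` (the assumed model core) are preserved;
    every other parameter receives a small Gaussian perturbation."""
    mutants = []
    for _ in range(n_mutants):
        m = copy.deepcopy(model)
        with torch.no_grad():
            for name, p in m.named_parameters():
                if not any(name.startswith(pre) for pre in core_prefixes):
                    # Scale the noise to the parameter's own magnitude.
                    p.add_(sigma * p.abs().mean() * torch.randn_like(p))
        m.eval()
        mutants.append(m)
    return mutants

def abnormality_score(x, model, mutants):
    """Score a batch of inputs by the fraction of mutants whose predicted
    label flips away from the original model's label; abnormal inputs are
    expected to flip more often than normal ones."""
    model.eval()
    with torch.no_grad():
        ref = model(x).argmax(dim=1)
        flips = sum((m(x).argmax(dim=1) != ref).float() for m in mutants)
    return flips / len(mutants)  # per-input score in [0, 1]

# Hypothetical usage with a trained LeNet-style classifier `net`:
#   mutants = make_mutants(net, core_prefixes=("conv1.", "conv2."))
#   scores = abnormality_score(batch, net, mutants)  # evaluate via AUROC
```

Because only a handful of mutants are produced and the core is never re-analyzed online, the per-input cost is a few extra forward passes, which is consistent with the marginal online overhead the abstract reports.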
