Abstract
Anomaly detection is an important technique for making a data-intensive society safer and more harmonious. Existing weakly-supervised anomaly detection methods ignore the problem of subpopulation shift, in which a novel abnormal subpopulation is not observed during training but emerges in the test set. This setting is challenging and valuable, but has not been studied in the previous literature. To address it, we first construct an unbiased risk estimator to analyze the new problem. Based on this estimator, we propose a novel and effective method called SPSAD (SubPopulations Shift Anomaly Detection), which combines four components used in tandem. First, a preliminary screening module uses an intersection statistic to excavate primary instances of the novel abnormal subpopulation from unlabeled data. Second, by updating a similarity statistic based on these primary instances, SPSAD extracts reliable anomalies and reliable normal examples from the unlabeled data. Third, SPSAD filters the instances acquired in the preliminary screening phase to obtain clean instances of the novel abnormal subpopulation. Finally, SPSAD constructs a robust risk estimator on the basis of the excavated examples, enabling weakly-supervised anomaly detection under subpopulation shift. In addition, we carry out rigorous theoretical analyses and prove the feasibility of the new problem, which provides a theoretical guarantee for the proposed algorithm. A range of empirical studies show that our algorithm significantly outperforms the state of the art.