Abstract
Most data available for building an anomaly detector are collected under normal operation. When anomalous data are available, even in limited quantity, they can be used to improve anomaly detection performance; how best to use them, however, remains a significant challenge. In this paper, we report a new anomaly detection method called Skewness & Constant False Alarm Rate Anomaly Detection (SCFAR-AD). The method uses limited anomalous information to find a refined decision boundary between normal-class and anomalous-class data in three steps. First, a skewness-based process generates skewed data from the normal dataset. Second, an improved one-class Support Vector Machines (1-SVMs) method finds an initial decision boundary surrounding the normal-class data, and the resulting 1-SVMs classifier is applied to all the generated skewed data to find true outliers, i.e., the skewed samples that it classifies as anomalies. These true outliers, together with the limited anomalous data, form an expanded anomalous dataset. Third, the initial decision boundary is refined by a 2-class SVMs (2-SVMs) method that separates the expanded anomalous data from the normal data. In addition, with a constant false alarm rate concept, i.e., with the false alarm rate (the ratio of normal data classified as anomalous) held at a controlled level, the decision boundary can be pushed away from the anomalies, resulting in an increased detection rate of anomalies. The proposed method has been verified and validated on two benchmark datasets (German credit data and E. coli protein data) from the UC Irvine data repository.
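The constant false alarm rate idea can be illustrated with a minimal sketch. It is not the SCFAR-AD procedure itself, whose details the abstract does not give. It only shows one common way to fix the false alarm rate: pick the decision threshold from the scores of the normal data so that at most a target fraction of them is flagged. The function names, the synthetic Gaussian scores, and the convention that a lower score means "more anomalous" (as with a signed SVM decision value) are all assumptions made for illustration.

```python
import math
import random


def cfar_threshold(normal_scores, target_far):
    """Pick a threshold so that at most `target_far` of the normal scores
    fall strictly below it (assumption: lower score = more anomalous)."""
    if not 0.0 <= target_far < 1.0:
        raise ValueError("target_far must be in [0, 1)")
    ordered = sorted(normal_scores)
    # Number of normal samples we allow to be flagged as anomalous.
    k = math.floor(target_far * len(ordered))
    # At most k scores are strictly smaller than ordered[k].
    return ordered[k]


def fraction_flagged(scores, threshold):
    """Fraction of samples whose score lies strictly below the threshold."""
    return sum(1 for s in scores if s < threshold) / len(scores)


# Synthetic decision scores (hypothetical data, not from the paper).
random.seed(0)
normal_scores = [random.gauss(1.0, 0.5) for _ in range(200)]
anomaly_scores = [random.gauss(-0.5, 0.5) for _ in range(20)]

threshold = cfar_threshold(normal_scores, target_far=0.05)
observed_far = fraction_flagged(normal_scores, threshold)     # false alarm rate
detection_rate = fraction_flagged(anomaly_scores, threshold)  # anomalies caught

print(f"threshold={threshold:.3f}")
print(f"false alarm rate={observed_far:.3f}, detection rate={detection_rate:.3f}")
```

Raising `target_far` moves the threshold toward the normal data, which trades more false alarms for a higher detection rate. Holding it constant makes results comparable across datasets.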