Outlier detection (OD) has been widely studied and applied in fields such as medical diagnosis, network intrusion detection, fraud detection and military surveillance. This paper presents an accumulated relative density (ARD) OD method that identifies outliers whose local density is relatively low or relatively high. Many earlier density-based OD methods, such as local outlier factor (LOF) and Local Correlation Integral (LOCI), have been applied to detect outliers that have low relative density in the data set. In those methods, the relative local density (RLD) of each point is measured, and the RLDs are then compared statistically to label anomalies. In the proposed ARD method, a big circle centered at every data point is formed first. This big circle covers the data points that lie within its radius. Then, for each point enclosed by this big circle, a small circle centered at that point is defined. The ratio of the number of data points covered by that point's small circle to the average number of data points across all small circles is defined as the RLD. After the RLDs of all data points are calculated, a point whose RLD deviates greatly from the mean of all RLDs is labeled as an outlier; otherwise it is labeled as an inlier. The ARD method was evaluated on a real-world traffic data set that was originally represented as spatial-temporal (ST) traffic flow signals. The ST signals were processed by principal component analysis (PCA) to reduce them to two-dimensional (2D) data points. The method achieved an average detection success rate (DSR) of 95%.
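The RLD computation and the labeling rule described above can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation. It assumes Euclidean distance on 2D points, counts each point inside its own circles, and takes "all small circles" to mean the small circles of the points covered by a given big circle. The names `relative_local_density`, `label_outliers`, `big_radius`, `small_radius` and `n_sigma` are hypothetical, and the accumulation step implied by the method's name is not specified in the abstract, so it is omitted.

```python
import numpy as np

def relative_local_density(points, big_radius, small_radius):
    """RLD per point, under the assumptions stated in the lead-in."""
    pts = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    # Points covered by each point's small circle (the point itself included).
    small_counts = (dists <= small_radius).sum(axis=1)
    # in_big[i, j] is True when point j lies inside the big circle of point i.
    in_big = dists <= big_radius
    # Average small-circle count over the points covered by each big circle.
    mean_counts = (in_big * small_counts[None, :]).sum(axis=1) / in_big.sum(axis=1)
    return small_counts / mean_counts

def label_outliers(rld, n_sigma=2.0):
    """Flag points whose RLD is far from the mean RLD, in either direction.

    The n_sigma threshold is an assumption; the abstract only says the RLD
    "deviates greatly" from the mean.
    """
    rld = np.asarray(rld, dtype=float)
    return np.abs(rld - rld.mean()) > n_sigma * rld.std()

if __name__ == "__main__":
    # Toy data: a 5x5 grid plus one point sitting just outside it.
    grid = [[x, y] for x in range(5) for y in range(5)]
    pts = np.array(grid + [[6.0, 2.0]])
    rld = relative_local_density(pts, big_radius=3.1, small_radius=1.5)
    print(rld.round(3))
    print(label_outliers(rld))
```

Because the deviation is taken as an absolute value, both unusually sparse and unusually dense points can be flagged, which matches the stated goal of detecting low or high local density. The pairwise distance matrix needs memory quadratic in the number of points, so a spatial index would be a better fit for large data sets.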