Abstract

The local outlier factor (LOF) algorithm is one of the representative density-based outlier detection algorithms. However, it has high time complexity, which makes it unsuitable for large and high-dimensional data sets. This paper therefore proposes a new outlier detection algorithm. It clusters the data set with the K-means algorithm to determine the data centers of the data space, builds a preliminary model of the data set by setting a threshold on the distance from each data object to its data center, and refines the screening process using the neighbor distribution of the data objects. Although using a clustering algorithm to screen for abnormal data adds computational cost, the data centers do not need to be recomputed once they have been identified, so the advantage of the algorithm becomes more pronounced as the amount of data grows. Tests show that the algorithm effectively improves the detection accuracy of anomaly factors, reduces the computational complexity, and completes local outlier detection.
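The pipeline described in the abstract (K-means centers, a distance-threshold screen, then LOF on what remains) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`kmeans`, `screen_candidates`, `lof`), the toy data, the threshold of 3.0, and the neighborhood size of 3 are all assumptions chosen for illustration, and the neighbor-distribution refinement mentioned in the abstract is not reproduced because the abstract does not specify it. The LOF here is the textbook definition with ties in the k-distance neighborhood broken by index.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations starting from the given initial centers."""
    centers = list(centers)
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            clusters[nearest].append(p)
        updated = []
        for old, members in zip(centers, clusters):
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep the old center if a cluster is empty
        centers = updated
    return centers

def screen_candidates(points, centers, threshold):
    """Indices of points farther than `threshold` from their nearest center."""
    return [i for i, p in enumerate(points)
            if min(dist(p, c) for c in centers) > threshold]

def knn(points, i, k):
    """The k nearest neighbors of point i as (distance, index) pairs."""
    pairs = sorted((dist(points[i], q), j) for j, q in enumerate(points) if j != i)
    return pairs[:k]

def lrd(points, i, k):
    """Local reachability density of point i."""
    neigh = knn(points, i, k)
    # reach-dist(i, j) = max(k-distance(j), d(i, j))
    reach = [max(knn(points, j, k)[-1][0], d) for d, j in neigh]
    return len(neigh) / sum(reach)

def lof(points, i, k):
    """Local outlier factor of point i, measured against the full data set."""
    neigh = knn(points, i, k)
    return sum(lrd(points, j, k) for _, j in neigh) / (len(neigh) * lrd(points, i, k))

# Toy data (assumed): two tight 3x3 grids plus one isolated point at index 18.
grid_a = [(0.5 * x, 0.5 * y) for x in range(3) for y in range(3)]
grid_b = [(10 + 0.5 * x, 10 + 0.5 * y) for x in range(3) for y in range(3)]
points = grid_a + grid_b + [(5.0, 7.0)]

# The centers are computed once; later screening passes can reuse them.
centers = kmeans(points, [points[0], points[9]])

# Only points that fail the distance screen go through the costly LOF step.
candidates = screen_candidates(points, centers, threshold=3.0)
scores = {i: lof(points, i, 3) for i in candidates}
print(candidates, scores)
```

The point of the screen is that LOF, which needs neighbor searches for every scored point, runs only on the few objects far from every center rather than on the whole data set. In this sketch the initial centers are passed in explicitly so the run is deterministic; a real use would need an initialization strategy and a principled way to choose the threshold.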

