Abstract

Outlier detection is an important research topic in data mining. Existing outlier detection methods do not adequately handle heterogeneous data, such as data containing ordinal attributes. Fuzzy β covering is a tool for representing information granularity that can better fit the distribution of objects in real data and characterize the differences between objects. This paper studies outlier detection for heterogeneous data via fuzzy β covering. First, some new properties of fuzzy β covering are proposed. Then, a fuzzy similarity relation between any two objects in heterogeneous data is defined, and fuzzy information granules are constructed from the fuzzy β covering to obtain a new fuzzy approximation accuracy. Next, an outlier factor based on fuzzy β covering is established by integrating the outlier degree and the weight function of the fuzzy information granules, and the corresponding algorithm, called FBCOD, is designed. Finally, FBCOD is compared with eight existing outlier detection algorithms on 12 UCI data sets. The experimental results in terms of AUC values, statistical tests, and the F1 measure show that FBCOD is more effective and flexible than some existing algorithms.
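
For readers unfamiliar with the underlying notion, the following is a minimal sketch of how a fuzzy β covering is commonly defined in the literature: a family of fuzzy sets over a universe such that every object belongs to their union with membership at least β, with the fuzzy β neighborhood of an object taken as the intersection of the fuzzy sets containing it to degree at least β. This is an assumption about the standard definition, not the construction used by FBCOD itself; the function names and the toy membership values are hypothetical.

```python
def is_fuzzy_beta_covering(covering, beta):
    """Check whether a family of fuzzy sets is a fuzzy beta covering.

    covering: list of fuzzy sets, each a list of memberships in [0, 1],
              one entry per object in the universe (hypothetical layout).
    """
    n_objects = len(covering[0])
    # The union of fuzzy sets is the pointwise maximum of memberships.
    return all(
        max(fuzzy_set[i] for fuzzy_set in covering) >= beta
        for i in range(n_objects)
    )


def fuzzy_beta_neighborhood(covering, beta, x):
    """Return the membership vector of the fuzzy beta neighborhood of object x."""
    # Keep only the fuzzy sets that contain x to degree at least beta.
    selected = [fuzzy_set for fuzzy_set in covering if fuzzy_set[x] >= beta]
    if not selected:
        raise ValueError("object is not covered at level beta")
    n_objects = len(covering[0])
    # The intersection of fuzzy sets is the pointwise minimum of memberships.
    return [min(fuzzy_set[i] for fuzzy_set in selected) for i in range(n_objects)]


# Toy universe of three objects and two fuzzy sets (illustrative values only).
covering = [
    [0.9, 0.4, 0.7],
    [0.3, 0.8, 0.6],
]

print(is_fuzzy_beta_covering(covering, 0.6))
print(fuzzy_beta_neighborhood(covering, 0.6, 2))
```

Lowering β makes the covering condition easier to satisfy but admits more fuzzy sets into each neighborhood, which generally shrinks the neighborhood memberships; this is the kind of tunable granularity the abstract refers to.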
