Abstract

Anomaly detection has been studied extensively over the past decades; however, various challenges remain due to the complex structure of real-world datasets. First, only a few methods in the literature provide insight into datasets that have both categorical and continuous attributes, and even fewer are sensitive to the dependencies between the two types of attributes. Second, real-world datasets tend to be more complex in structure, and their categorical attributes are usually hierarchically correlated, which has been largely ignored by existing outlier detection approaches. Following this line of reasoning, we propose a distributed outlier detection method for mixed-attribute datasets, especially those with hierarchical categorical attributes. The proposed method accounts for the dependencies between categorical and continuous attributes rather than treating them as two separate parts. In addition, it captures the hierarchical structure among categorical attributes. Experimental results on a real-world dataset and a simulation study show its superior performance in terms of both detection accuracy and time efficiency.

