Abstract

An outlier in a dataset is a point or a class of points that is considerably dissimilar to, or inconsistent with, the remainder of the data. Detection of outliers is important for many applications and has long attracted attention in the data mining research community. In this paper, a new method for detecting outliers based on Rough Sets Theory is proposed. The main concept of using Rough Sets for outlier detection is to discover the Non-Reduct of the information system (IS). A Non-Reduct is a set of attributes from the IS that may contain outliers. It is computed by defining an Indiscernibility matrix modulo D (iDMM D) and an Indiscernibility function modulo D (iDFM D). A measurement called RSetOF (Rough Set Outlier Factor Value) is defined to identify and detect outlier objects. Extensive experiments were conducted in which ten benchmark datasets were tested with the proposed method. To evaluate its effectiveness, the proposed method, RSetAlg, is compared to the Frequent Pattern (FindFPOF) method. The experimental results indicate that the proposed approach compares favourably with the FindFPOF method as an outlier detection method. The proposed method is thus a novel and competitive approach to outlier detection.
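As a rough illustration of the kind of structure involved, the sketch below builds the textbook decision-relative discernibility matrix from rough set theory: for every pair of objects with different decision values, it records the condition attributes on which the two objects differ. This is an assumption made for illustration only; the abstract does not define iDMM D, iDFM D, or RSetOF, so the paper's own definitions may differ, and no outlier factor is computed here. The function name `discernibility_matrix_modulo_d` and the toy table are hypothetical.

```python
from itertools import combinations


def discernibility_matrix_modulo_d(rows, condition_attrs, decision_attr):
    """Textbook discernibility matrix modulo the decision attribute.

    Maps each object pair (i, j) with i < j to the set of condition
    attributes on which the two objects differ. Pairs that share the same
    decision value are skipped, since they need not be told apart.
    NOTE: this is the standard rough-set construction, assumed here for
    illustration; it is not taken from the paper.
    """
    matrix = {}
    for i, j in combinations(range(len(rows)), 2):
        if rows[i][decision_attr] == rows[j][decision_attr]:
            continue
        matrix[(i, j)] = frozenset(
            a for a in condition_attrs if rows[i][a] != rows[j][a]
        )
    return matrix


# Hypothetical toy information system: condition attributes a, b, c
# and decision attribute d.
table = [
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "no"},
    {"a": 0, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "no"},
]

matrix = discernibility_matrix_modulo_d(table, ["a", "b", "c"], "d")
for pair, attrs in sorted(matrix.items()):
    print(pair, sorted(attrs))
```

An empty entry, such as the one for objects 0 and 3 in this toy table, marks a pair that agrees on every condition attribute yet carries different decisions. How such entries feed into a Non-Reduct or an outlier score is specific to the paper and is not reproduced here.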
