Abstract

Outlier detection is an important task in data mining which aims at detecting patterns that are unusual in a dataset. Though several techniques are proved to be useful in solving some outlier detection problems, there are certain issues yet to be resolved. Most of the existing methods compute distance of points in full dimensional space to detect outliers. But in high dimensional space, the concept of proximity may not be qualitatively meaningful due to the curse of dimensionality and incurs high computational cost. Moreover, the existing methods focus on discovering outliers but do not provide the interpretability of different subspaces that cause the abnormality. Frequent pattern mining based approaches resolve the aforementioned issues. Recently, infrequent pattern mining has attracted the attention of data mining research community which aims at discovering rare associations and researches in this area motivated to propose a new method to detect outliers in data streams. Infrequent patterns are more interesting than frequent patterns in some domains such as fraudulent credit transactions, anomaly detection, etc. In such applications, mining infrequent patterns facilitates detecting outliers. Minimal infrequent patterns are generators of family of infrequent patterns. In this paper, a novel method is presented to detect outliers by mining minimal infrequent patterns from data streams. Three measures namely Transaction Weighting Factor (TWF), Minimal Infrequent Deviation Factor (MIPDF) and Minimal Infrequent Pattern based Outlier Factor (MIFPOF) are defined. An algorithm called Minimal Infrequent Pattern based Outlier Detection (MIFPOD) method is proposed for detecting outliers in data streams based on mined minimal infrequent patterns. The effectiveness of the proposed method is demonstrated on synthetic dataset obtained from vital dataset collected from body sensors and a publicly available real dataset. The experimental results have shown that the proposed method outperforms the existing methods in detecting outliers.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.