Abstract

Outlier detection is essential in data-driven science; it aims to identify itemsets that differ significantly from the rest of the data. Because of limitations in equipment precision and network transmission, uncertain data is increasingly common in daily life. However, traditional outlier detection methods are not applicable to uncertain data streams, the large volume of data makes outlier detection costly in both memory and time, and the multiple scans of the data stream required by Apriori-like methods are unrealistic. In this paper, a matrix structure is constructed to store the information of the uncertain data stream, and the subsequent mining process is conducted on this matrix, so the whole data stream needs to be scanned only once. Then, the “upper cap” concept is used in the FIM-UDS method to mine frequent itemsets more efficiently in support of outlier detection. Moreover, two outlier factors and an outlier detection method called FIM-UDSOD are designed to detect potential outliers. Finally, two public datasets are used to verify the efficiency of the FIM-UDS method and one synthetic dataset is used to evaluate the FIM-UDSOD method; the experimental results show that the proposed FIM-UDSOD method is more effective in outlier detection.
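
To make the single-scan idea concrete, the following is a minimal sketch, not the paper's FIM-UDS implementation. It assumes each transaction maps items to existential probabilities, that the matrix holds one row per transaction and one column per item, and that frequency is measured by expected support (the sum over rows of the product of the item probabilities). The pruning step uses a simple generic upper bound (the smallest single-item expected support) as a stand-in; the abstract does not define the paper's "upper cap", and the outlier factors are not modeled here. All function names are hypothetical.

```python
from itertools import combinations


def build_matrix(stream, items):
    """Scan the stream once; store each item's probability per transaction."""
    index = {item: j for j, item in enumerate(items)}
    matrix = []
    for transaction in stream:  # the only pass over the stream
        row = [0.0] * len(items)
        for item, prob in transaction.items():
            row[index[item]] = prob
        matrix.append(row)
    return matrix, index


def expected_support(matrix, index, itemset):
    """Sum over rows of the product of the itemset's probabilities."""
    total = 0.0
    for row in matrix:
        p = 1.0
        for item in itemset:
            p *= row[index[item]]
        total += p
    return total


def frequent_itemsets(matrix, index, min_esup, max_size=2):
    """Mine itemsets from the matrix only; the stream is never rescanned."""
    singles = {i: expected_support(matrix, index, (i,)) for i in index}
    frequent = {(i,): s for i, s in singles.items() if s >= min_esup}
    for size in range(2, max_size + 1):
        for itemset in combinations(sorted(index), size):
            # Assumed stand-in bound: probabilities are at most 1, so an
            # itemset's expected support cannot exceed that of any member.
            if min(singles[i] for i in itemset) < min_esup:
                continue
            support = expected_support(matrix, index, itemset)
            if support >= min_esup:
                frequent[itemset] = support
    return frequent


# Hypothetical toy stream of three uncertain transactions.
stream = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.4},
    {"a": 1.0, "b": 0.5, "c": 0.2},
]
matrix, index = build_matrix(stream, ["a", "b", "c"])
result = frequent_itemsets(matrix, index, min_esup=1.0)
```

In this sketch every candidate is evaluated against the in-memory matrix, so the cost of repeated stream scans in Apriori-like methods is replaced by repeated passes over the stored rows.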
