Abstract

Outlier detection is critical in many business applications, as it recognizes unusual behaviours to prevent losses and optimize revenue. For example, illegitimate online transactions can be detected from their patterns using outlier detection. The performance of existing outlier detection methods is limited by the pattern/behaviour of the dataset; these methods may not perform well without prior knowledge of the dataset. This paper proposes a multi-level outlier detection algorithm (MCOD) that uses multi-level unsupervised learning to cluster the data and discover outliers. The proposed detection method is tested on datasets from different fields with different sizes and dimensions. Experimental analysis shows that the proposed MCOD algorithm improves the outlier detection rate compared to traditional anomaly detection methods. Enterprises and organizations can adopt the proposed MCOD algorithm to ensure sustainable and efficient detection of frauds/outliers, to increase profitability and/or enhance business outcomes.

Highlights

  • Illegal actions in business usually lead to significant financial loss, especially for organizations that handle a large amount of data or metrics

  • The machine learning methods they used were k-nearest neighbour (KNN), random forest, and support vector machines (SVM), while the deep learning methods were convolutional neural networks (CNN), restricted Boltzmann machine (RBM), and deep belief networks (DBN)

  • The MCOD algorithm relies on the fact that the clustering method in the first layer can provide knowledge about the dataset, in the form of initial seeds, to the second layer to achieve better detection of outliers in the dataset

Summary

Introduction

Illegal actions in business usually lead to significant financial loss, especially for organizations that handle a large amount of data or metrics. The distance-based outlier detection method assumes inliers are close to each other, and the density-based detection method assumes inliers have more neighbouring data points than outliers. Those assumptions may not be suitable for general datasets with various types, sparsity, configurations, or prior labelling. The proposed MCOD algorithm uses self-organizing maps (SOM) [9] as the base level of clustering, because of their efficiency in handling several types of classification problems while providing a useful, interactive, and intelligible summary of the data. The rest of this paper is organized as follows: Section 2 introduces the background of current outlier detection methods; Section 3 presents the proposed model of multi-level clustering-based outlier detection; Section 4 describes the experimental analysis; Section 5 outlines the conclusions and future work.
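
The two assumptions named in the introduction can be made concrete with a small sketch. This is not the MCOD algorithm; it only illustrates, on toy data, what a distance-based score and a density-based score measure. The function names, the parameters `k` and `radius`, and the data are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance between every pair of rows in X."""
    diffs = X[:, None, :] - X[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2))

def knn_distance_scores(X, k=3):
    """Distance-based view: distance to the k-th nearest neighbour.
    A larger value means the point is far from the others."""
    dists = np.sort(pairwise_distances(X), axis=1)
    # Column 0 is each point's zero distance to itself, so column k
    # holds the distance to the k-th nearest other point.
    return dists[:, k]

def neighbour_counts(X, radius=1.0):
    """Density-based view: number of other points within `radius`.
    A smaller value means the point sits in a sparse region."""
    dists = pairwise_distances(X)
    return (dists <= radius).sum(axis=1) - 1  # exclude the point itself

# Toy data (assumed for illustration): a 5x5 grid of inliers in the
# unit square, plus one isolated point appended as the last row.
inliers = np.array([[i * 0.25, j * 0.25] for i in range(5) for j in range(5)])
outlier = np.array([[5.0, 5.0]])
X = np.vstack([inliers, outlier])

dist_scores = knn_distance_scores(X, k=3)
counts = neighbour_counts(X, radius=1.0)

print("most distant point:", int(np.argmax(dist_scores)))
print("sparsest point:", int(np.argmin(counts)))
```

On this toy data both scores single out the last row, the isolated point. Both depend on hand-picked values of `k` and `radius`, which is one way to see why such methods can struggle without prior knowledge of the dataset.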

Related Work and Background
Distance-Based Outlier Detection
Distribution-Based Outlier Detection
Density-Based Outlier Detection
Deviation-Based Outlier Detection
Angle-Based Outlier Detection
Deep Learning-Based Outlier Detection
Clustering-Based Outlier Detection
Multi-Level Clustering-Based Outlier Detection
SOM Clustering
Datasets
Biomed
Credit Card Datasets
Adopted Outliers Detection Algorithms
Evaluation Criteria
Individual Clustering Results
Experiment 1
Experiment 2
Experiment 3
Conclusions and Future Directions