Abstract

Outlier detection is one of the major problems studied in depth within the data mining domain. It is used to detect observations that are dissimilar from the rest of the data under consideration. Detecting outliers helps to recognize system faults, allowing administrators to take preventive measures before a fault escalates. In this paper, we present a comprehensive survey of outlier detection. We anticipate that this survey will support a better understanding of the various directions in which experimental work can be done on this topic.

Highlights

  • A major problem in data mining is outlier detection, which can be used to find patterns in data that differ from the rest of the data [1]

  • An outlier is defined as follows: “An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism.” Outlier detection has been widely used in major application areas such as military surveillance for enemy activities, intrusion detection in cyber security, fraud detection for credit cards, insurance or health care, and fault detection in safety

  • Another form of data handled by outlier detection techniques in this domain is time series data, such as electrocardiograms (ECG) and electroencephalograms (EEG)


Summary

INTRODUCTION

A major problem in data mining is outlier detection, which can be used to find patterns in data that differ from the rest of the data [1]. In several cases, unauthorized card use may show different patterns, such as purchases from geographically unusual locations. Such patterns can be used to detect outliers in credit card transaction data. Medical diagnosis: in many medical applications the data is collected from a variety of devices, such as MRI scans, PET scans or ECG time series. Unusual patterns in such data typically reflect disease conditions. An important problem in the property-casualty insurance industry is claims fraud, e.g. automobile insurance fraud. The data in this domain for fraud detection comes from the documents submitted by the claimants. Another form of data handled by outlier detection techniques in the medical domain is time series data, such as electrocardiograms (ECG) and electroencephalograms (EEG). Each data instance is defined using two sets of attributes: contextual attributes and behavioral attributes.
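The split between contextual and behavioral attributes can be illustrated with a minimal sketch. It assumes a simple per-context z-score test, which is one possible way to score contextual outliers and is not taken from the surveyed methods. The function name `contextual_outliers`, the threshold value, and the month/temperature readings are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def contextual_outliers(records, threshold=2.0):
    """Flag records whose behavioral value is unusual within its own context.

    records: list of (context, value) pairs, where `context` plays the role
    of the contextual attribute and `value` the behavioral attribute.
    Assumption: a z-score above `threshold` within a context marks an outlier.
    """
    # Group behavioral values by their contextual attribute.
    groups = defaultdict(list)
    for context, value in records:
        groups[context].append(value)

    flagged = []
    for context, value in records:
        values = groups[context]
        if len(values) < 3:
            continue  # too few points to estimate the spread
        mu, sigma = mean(values), stdev(values)
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            flagged.append((context, value))
    return flagged


# Hypothetical readings: (month, temperature in degrees Celsius).
readings = [
    ("jan", 1.0), ("jan", 2.0), ("jan", 0.5), ("jan", 1.5),
    ("jan", 1.2), ("jan", 0.8), ("jan", 1.8), ("jan", 25.0),
    ("jul", 24.0), ("jul", 26.0), ("jul", 25.0), ("jul", 27.0), ("jul", 25.5),
]

flagged = contextual_outliers(readings)
print(flagged)
```

In this sketch a reading of 25.0 is unremarkable among the July values but stands out among the January values, so only the January record is flagged. The same behavioral value is therefore normal or anomalous depending on its context.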

Collective Outliers
Statistical Detection Methods
Parametric Methods
Density Based Outlier Detection
Distance Based Outlier Detection
Advantages and Disadvantages of Distance-based Methods
Methods
Outlier Detection Enhanced DBSCAN
References