Abstract
Outlier detection is an important data mining task for consistency checks, fraud detection, etc. Binary decision making on whether or not an object is an outlier is not appropriate in many applications and moreover hard to parametrize. Thus, recently, methods for outlier ranking have been proposed. Determining the degree of deviation, they do not require setting a decision boundary between outliers and the remaining data. High dimensional and heterogeneous (continuous and categorical attributes) data, however, pose a problem for most outlier ranking algorithms. In this work, we propose our OutRank approach for ranking outliers in heterogeneous high dimensional data. We introduce a consistent model for different attribute types. Our novel scoring functions transform the analyzed structure of the data to a meaningful ranking. Promising results in preliminary experiments show the potential for successful outlier ranking in high dimensional data.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.