Abstract

Fast Mining of Distance-Based Outliers in High-Dimensional Datasets

Amol Ghoting, Srinivasan Parthasarathy, and Matthew Eric Otey

Proceedings of the 2006 SIAM International Conference on Data Mining (SDM), pp. 609-613
Chapter DOI: https://doi.org/10.1137/1.9781611972764.70

Defining outliers by their distance to neighboring data points has been shown to be an effective non-parametric approach to outlier detection. Existing algorithms for mining distance-based outliers do not scale to large, high-dimensional data sets. In this paper, we present RBRP, a fast algorithm for mining distance-based outliers, particularly targeted at high-dimensional data sets. RBRP scales log-linearly as a function of the number of data points and linearly as a function of the number of dimensions. Our empirical evaluation demonstrates that we outperform the state-of-the-art, often by an order of magnitude.

Published: 2006
ISBN: 978-0-89871-611-5
eISBN: 978-1-61197-276-4
https://doi.org/10.1137/1.9781611972764
Book Series Name: Proceedings
Book Code: PR124
Book Pages: xii + 646
Key words: Outlier detection, high-dimensional data sets, approximate k-nearest neighbors, clustering
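To make the notion of a distance-based outlier concrete, here is a minimal sketch of a brute-force baseline. It is not RBRP, whose details the abstract does not give. It assumes one common formulation: each point is scored by the Euclidean distance to its k-th nearest neighbor, and the n highest-scoring points are reported as outliers. The function names `knn_distance_scores` and `top_outliers` and the parameters `k` and `n` are hypothetical, chosen for illustration only. This baseline compares every pair of points, so its cost grows quadratically with the number of points, which is the kind of scaling problem the abstract refers to.

```python
import heapq
import math


def knn_distance_scores(points, k):
    """Score each point by the Euclidean distance to its k-th nearest neighbor.

    Brute force: every point is compared against every other point.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to all other points (the point itself is excluded).
        dists = [math.dist(p, q) for j, q in enumerate(points) if j != i]
        # The largest of the k smallest distances is the k-th nearest neighbor distance.
        scores.append(heapq.nsmallest(k, dists)[-1])
    return scores


def top_outliers(points, k, n):
    """Return the indices of the n points with the largest k-NN distance."""
    scores = knn_distance_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Four tightly grouped points and one point far away from them.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(top_outliers(data, k=2, n=1))  # prints [4]
```

In this toy data set the isolated point at index 4 has by far the largest distance to its second nearest neighbor, so it is the single reported outlier.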

