Abstract
Combining different sources of information is a problem that arises in several situations, for instance when data are analysed using different similarity measures. Often, each source of information is given as a similarity, distance, or kernel matrix. In this paper, we propose a new class of methods that, for anomaly detection purposes, produce a single Mercer kernel (acting as a similarity measure) from a set of local entropy kernels while avoiding the task of model selection. This kernel is used to build an embedding of the data in a variety that allows the use of a (modified) one-class Support Vector Machine to detect outliers. We study several information combination schemes and their limiting behaviour as the data sample size increases, within an Information Geometry context. In particular, we study the variety of the given positive definite kernel matrices in order to obtain the desired kernel combination as an element of that variety. The proposed methodology has been evaluated on several real and artificial problems.
Highlights
Usual Data Mining tasks, such as classification, regression and anomaly detection, are heavily dependent on the geometry of the underlying data space
We explore linear combinations and Karcher means to test the intuition that, for positive definite matrices, a mean more natural than the arithmetic mean will produce better practical results
We have explored how to combine different sources of information for anomaly detection within the framework of Entropy measures
Summary
Usual Data Mining tasks, such as classification, regression and anomaly detection, are heavily dependent on the geometry of the underlying data space. Support Vector Machines (SVM) provide control over the geometry of the data space through the use of a Mercer kernel function [1,2]. The choice of the appropriate kernel, including its parameters, is a particular case of the model selection problem. A typical way to proceed is by means of cross-validation procedures [5]. These parameter calibration strategies, although intuitive and simple from an applied point of view, have some important drawbacks. An appealing alternative to model selection when working with SVM is to combine or merge different kernel functions into a single kernel [6,7]. The paper is organized as follows: Section 2 describes the functional data analysis methods used to produce the data representations from kernels, as well as the minimum entropy method used in this paper for anomaly detection.
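To make the idea of merging kernels concrete, the following is a minimal sketch, not the authors' method, that contrasts the arithmetic mean of two positive definite kernel matrices with their geometric mean. For exactly two matrices, the Karcher mean under the affine-invariant metric has the closed form A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}. The sketch assumes Gaussian (RBF) kernels with two arbitrary bandwidths as stand-ins for the local entropy kernels of the paper, and adds a small ridge term so that both matrices are strictly positive definite. All function names and parameter values are hypothetical.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix; gamma is an illustrative bandwidth."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def spd_power(A, p):
    """Real power of a symmetric positive definite matrix via eigh."""
    w, V = np.linalg.eigh(A)
    return (V * w**p) @ V.T


def arithmetic_mean(A, B):
    """Plain linear combination with equal weights."""
    return 0.5 * (A + B)


def geometric_mean(A, B):
    """Closed-form Karcher mean of two SPD matrices (affine-invariant metric)."""
    A_half = spd_power(A, 0.5)
    A_inv_half = spd_power(A, -0.5)
    M = A_inv_half @ B @ A_inv_half
    M = 0.5 * (M + M.T)  # remove round-off asymmetry before eigh
    G = A_half @ spd_power(M, 0.5) @ A_half
    return 0.5 * (G + G.T)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(8, 2))          # toy data, assumed for illustration
    ridge = 1e-3 * np.eye(len(X))        # assumption: keeps matrices strictly SPD
    K1 = rbf_kernel(X, 0.5) + ridge
    K2 = rbf_kernel(X, 5.0) + ridge
    print("arithmetic mean, smallest eigenvalue:",
          np.linalg.eigvalsh(arithmetic_mean(K1, K2)).min())
    print("geometric mean, smallest eigenvalue:",
          np.linalg.eigvalsh(geometric_mean(K1, K2)).min())
```

Both combinations are again symmetric positive definite, so either could in principle be passed to a kernel method that accepts a precomputed kernel matrix. For more than two matrices the Karcher mean has no closed form in general and is usually computed iteratively, which this sketch does not cover.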