Linear combination of densities and its direct estimation framework with applications

Min Xu, Fu-Lai Chung, Guanjin Wang, Shitong Wang

doi:10.1007/s00521-015-1947-3

Abstract

This paper describes several typical learning tasks, including data condensation, binary classification, identification of the independence between random variables, and conditional density estimation, from the unified perspective of a linear combination of densities. Accordingly, it proposes a direct estimation framework, based on a linear combination of Gaussian components (i.e., Gaussian basis functions) under the integrated squared error criterion, to solve these tasks. The proposed framework has three advantages. First, unlike most existing state-of-the-art methods, which estimate the density of each component in the linear combination separately and then combine the estimates linearly, it estimates the linear combination of densities directly as a whole, with approximation accuracy at least comparable to, or even better than, that of existing density estimation methods. Second, its time complexity is O(l^3), where l is the number of Gaussian components. These components are generally viewed as the Gaussian distributions of the clusters in a dataset, so l is generally much smaller than the size of the dataset, which makes the framework well suited to large datasets. Third, the framework can be used to develop alternative approaches to classification, data condensation, identification of the independence between random variables, conditional density estimation, and similarity identification between multiple source domains and a target domain. Preliminary experimental results on several typical applications indicate the power of the proposed direct estimation framework.
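
To make the idea of direct estimation concrete, the following is a minimal sketch and not the authors' implementation. It assumes isotropic Gaussian basis functions with one shared width `sigma`, fixed centers (for example, cluster centers), and a small ridge term `lam` for numerical stability; the function names are hypothetical. Under these assumptions, minimizing the integrated squared error between sum_i theta_i N(x; c_i, sigma^2 I) and the target combination sum_k alpha_k p_k(x) reduces to one l-by-l linear solve, which is where an O(l^3) cost would arise.

```python
import numpy as np


def gaussian_pdf(x, centers, sigma2):
    """Isotropic Gaussian density N(x; c, sigma2 * I) for every (x, c) pair."""
    d = x.shape[1]
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    norm = (2.0 * np.pi * sigma2) ** (-d / 2.0)
    return norm * np.exp(-sq_dist / (2.0 * sigma2))


def fit_linear_combination(samples, alphas, centers, sigma, lam=1e-3):
    """Fit theta so that sum_i theta_i N(x; c_i, sigma^2 I) approximates
    sum_k alphas[k] * p_k(x), where samples[k] holds draws from p_k."""
    sigma2 = sigma ** 2
    # H_ij is the integral of the product of two basis functions, which
    # has the closed form N(c_i; c_j, 2 * sigma^2 * I).
    H = gaussian_pdf(centers, centers, 2.0 * sigma2)
    # h_i = sum_k alpha_k * E_{p_k}[N(x; c_i, sigma^2 I)], with each
    # expectation replaced by a sample mean.
    h = sum(a * gaussian_pdf(x, centers, sigma2).mean(axis=0)
            for a, x in zip(alphas, samples))
    # Minimizer of theta^T H theta - 2 theta^T h + lam * ||theta||^2
    # (the ridge term is an assumption of this sketch).
    return np.linalg.solve(H + lam * np.eye(len(centers)), h)


def evaluate(x, theta, centers, sigma):
    """Evaluate the fitted linear combination at the rows of x."""
    return gaussian_pdf(x, centers, sigma ** 2) @ theta
```

For binary classification, one possible use is to take `alphas = [1.0, -1.0]` so that the fitted function approximates the density difference p_1(x) - p_2(x), and to label a point by the sign of `evaluate` at that point. No per-class density is estimated separately; only the combined vector `h` enters the solve.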
