Ensemble Kernel Mean Matching

Yun-Qian Miao, Mohamed S Kamel, Ahmed K Farahat

doi:10.1109/icdm.2015.127

Abstract

Kernel Mean Matching (KMM) is an elegant algorithm that estimates density ratios between training and test data by minimizing their maximum mean discrepancy in a kernel space. The applicability of KMM to large-scale problems is, however, hindered by the quadratic complexity of computing and storing the kernel matrices over training and test data. To address this problem, this paper proposes a novel ensemble algorithm for KMM, which divides the test samples into smaller partitions, estimates a density ratio for each partition, and then fuses these local estimates with a weighted sum. Our theoretical analysis shows that the ensemble KMM has a lower error bound than the centralized KMM, which uses all the test data at once to estimate the density ratio. Because it is well suited to distributed implementation, the proposed algorithm is also favorable in terms of time and space complexity. Experiments on benchmark datasets confirm the superiority of the proposed algorithm in terms of estimation accuracy and running time.
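
The partition, estimate, and fuse procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not specify: a Gaussian kernel with bandwidth `sigma`, an interleaved split of the test set, fusion weights proportional to partition size, and projected gradient descent in place of a quadratic programming solver (with the usual normalization constraint on the weights omitted). The function names `rbf`, `kmm_weights`, and `ensemble_kmm_weights` are hypothetical.

```python
import math
import random


def rbf(x, y, sigma=1.0):
    """Gaussian kernel between two points given as sequences of floats."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def kmm_weights(train, test, sigma=1.0, B=10.0, steps=200):
    """Approximate KMM weights for the training points.

    Minimizes 0.5 * b^T K b - kappa^T b subject to 0 <= b_i <= B by
    projected gradient descent. Assumption: the sum constraint of the
    standard KMM formulation is dropped to keep the sketch short.
    """
    n_tr, n_te = len(train), len(test)
    K = [[rbf(xi, xj, sigma) for xj in train] for xi in train]
    kappa = [n_tr / n_te * sum(rbf(xi, xj, sigma) for xj in test)
             for xi in train]
    # K is positive semidefinite with unit diagonal, so its largest
    # eigenvalue is at most trace(K) = n_tr; 1 / n_tr is a safe step size.
    lr = 1.0 / n_tr
    beta = [1.0] * n_tr
    for _ in range(steps):
        grad = [sum(K[i][j] * beta[j] for j in range(n_tr)) - kappa[i]
                for i in range(n_tr)]
        # Gradient step followed by projection onto the box [0, B].
        beta = [min(B, max(0.0, b - lr * g)) for b, g in zip(beta, grad)]
    return beta


def ensemble_kmm_weights(train, test, n_parts=4, **kw):
    """Fuse per-partition KMM estimates with a weighted sum.

    Assumptions: interleaved partitioning and weights proportional to
    partition size. The training kernel matrix is recomputed for each
    partition for simplicity; it could be cached.
    """
    parts = [test[i::n_parts] for i in range(n_parts)]
    parts = [p for p in parts if p]  # drop empty partitions
    fused = [0.0] * len(train)
    for p in parts:
        w = len(p) / len(test)
        beta = kmm_weights(train, p, **kw)
        fused = [f + w * b for f, b in zip(fused, beta)]
    return fused


# Usage on synthetic 1-D data with a shifted test distribution.
random.seed(0)
train = [[random.gauss(0.0, 1.0)] for _ in range(30)]
test = [[random.gauss(1.0, 1.0)] for _ in range(40)]
beta = ensemble_kmm_weights(train, test, n_parts=4)
```

Each call to `kmm_weights` only builds a cross-kernel between the training points and one partition of the test set, so the per-partition estimates could in principle be computed on separate workers before fusion.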

Full Text