Abstract

Let Z be the union of a training set X and a testing set Y. Assume that a kernel method produces a dimensionality reduction (DR) mapping P that maps the high-dimensional data X to its low-dimensional representation P(X). The out-of-sample extension problem for dimensionality reduction is to find the dimensionality reduction of Y using an extension of P instead of re-training on the whole set Z. In this paper, utilizing the framework of reproducing kernel Hilbert space theory, we introduce a least-square approach to extensions of the popular DR mappings called Diffusion maps (Dmaps). We establish a theoretical analysis for the out-of-sample DR Dmaps, which also provides a uniform treatment of many popular out-of-sample algorithms based on kernel methods. We also illustrate the validity of the developed out-of-sample DR algorithms in several examples.

Highlights

  • In many scientific and technological areas, we need to analyze and process high-dimensional data, such as speech signals, images and videos, text documents, stock trade records, and others

  • Re-training Dmaps on the whole data set is often impractical if the cardinality of X becomes very large, or if the new data set Z comes as a time-stream

  • The main purpose of this paper is to give a mathematical analysis of the out-of-sample dimensionality reduction (DR) extension of Diffusion maps (Dmaps)

Summary

INTRODUCTION

In many scientific and technological areas, we need to analyze and process high-dimensional data, such as speech signals, images and videos, text documents, stock trade records, and others. To reduce the dimensions of such data sets, people employ non-linear DR methods [6,7,8,9,10,11,12], among which the method of Diffusion Maps (Dmaps), introduced by Coifman and his research group [13, 14], has proved attractive and effective. Dmaps employs the diffusion kernel to define the similarity on a given data set X ⊂ ℝ^D. Re-training Dmaps on the whole data set is often impractical if the cardinality of X becomes very large, or if the new data set Z comes as a time-stream. The main purpose of this paper is to give a mathematical analysis of the out-of-sample DR extension of Dmaps. We give several examples for the extension.
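
As a rough illustration of the setting described above, the following Python sketch builds a Dmaps embedding of a training set X from a Gaussian diffusion kernel and then extends it to new points Y with a standard Nyström-type formula. This is not the least-square extension developed in the paper; it is a generic baseline under stated assumptions. The Gaussian kernel with bandwidth eps, the random-walk normalization, and all function names (gaussian_kernel, fit_dmaps, extend_dmaps) are choices made for this example only.

```python
import numpy as np

def gaussian_kernel(A, B, eps):
    """Pairwise Gaussian similarities exp(-||a - b||^2 / eps) (assumed kernel)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / eps)

def fit_dmaps(X, eps, m):
    """Return the m leading non-trivial eigenvalues and right eigenvectors
    of the Markov matrix D^{-1} K built on the training set X."""
    K = gaussian_kernel(X, X, eps)
    d = K.sum(axis=1)                       # degrees of the kernel graph
    S = K / np.sqrt(np.outer(d, d))         # symmetric conjugate D^{-1/2} K D^{-1/2}
    lam, V = np.linalg.eigh(S)              # eigenvalues in ascending order
    idx = np.argsort(lam)[::-1][1:m + 1]    # skip the trivial eigenvalue 1
    psi = V[:, idx] / np.sqrt(d)[:, None]   # right eigenvectors of D^{-1} K
    return lam[idx], psi

def extend_dmaps(Y, X, eps, psi):
    """Nystrom-type extension: coordinates of Y are kernel-weighted averages
    of the training eigenvectors, which equal lam_k * psi_k on training points."""
    K = gaussian_kernel(Y, X, eps)
    P = K / K.sum(axis=1, keepdims=True)    # row-stochastic transition weights
    return P @ psi

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))                # training set (synthetic)
Y = rng.normal(size=(5, 3))                 # out-of-sample points (synthetic)
lam, psi = fit_dmaps(X, eps=2.0, m=2)
train_embedding = psi * lam                 # Dmaps coordinates of X
new_embedding = extend_dmaps(Y, X, 2.0, psi)
```

Applied to the training points themselves, extend_dmaps reproduces train_embedding up to rounding, since each embedding coordinate is an eigenvector entry scaled by its eigenvalue. The cost for a new point is one row of kernel evaluations against X, with no new eigendecomposition.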

PRELIMINARY
LEAST-SQUARE OUT-OF-SAMPLE DR EXTENSIONS FOR DMAPS
Dmaps With the Graph-Laplacian Kernel
Dmaps With the Laplace-Beltrami Kernel
Algorithms for Out-of-Sample DR
ILLUSTRATIVE EXAMPLES
Out-of-Sample Extension by Graph-Laplacian Mapping
Findings
Out-of-Sample Extension by Laplace-Beltrami Mapping
