Abstract

To avoid the curse of dimensionality, frequently encountered in big data analysis, linear and nonlinear dimension reduction techniques have been developed extensively in recent years. These techniques (sometimes referred to as manifold learning) assume that the scattered input data lies on a lower-dimensional manifold; the high-dimensionality problem can thus be overcome by learning the lower-dimensional behavior. In real-life applications, however, data is often very noisy. In this work, we propose a method to approximate $$\mathcal{M}$$, a d-dimensional $$C^{m+1}$$ smooth submanifold of $$\mathbb{R}^n$$ ($$d \ll n$$), based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located “near” the lower-dimensional manifold and suggest a nonlinear moving least-squares projection onto an approximating d-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (i.e., $$\mathcal{O}(h^{m+1})$$, where h is the fill distance and m is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold, and the approximation algorithm is linear in the large dimension n. Furthermore, the approximating manifold can serve as a framework for performing operations directly on the high-dimensional data in a computationally efficient manner. In this way, the preparatory step of dimension reduction, which distorts the data, can be avoided altogether.
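To make the idea of a moving least-squares projection concrete, the following is a minimal sketch under stated assumptions, not the algorithm of this work. The abstract does not specify how the local coordinate system is found, so a weighted PCA is swapped in for that step; the function names `mls_project_sketch` and `monomials`, the Gaussian weight, and the bandwidth parameter `h` are all hypothetical choices made for illustration.

```python
import numpy as np
from itertools import combinations_with_replacement


def monomials(x, m):
    """All monomials of total degree <= m in the columns of x (shape N x d)."""
    cols = [np.ones(len(x))]
    for deg in range(1, m + 1):
        for idx in combinations_with_replacement(range(x.shape[1]), deg):
            cols.append(np.prod(x[:, list(idx)], axis=1))
    return np.stack(cols, axis=1)


def mls_project_sketch(points, r, d=1, m=2, h=0.5):
    """Move the query r (in R^n) toward a d-dimensional manifold estimated
    from noisy samples `points` (shape N x n).

    Assumptions (illustrative only): Gaussian weights of width h, and a
    weighted PCA in place of a dedicated local coordinate search.
    """
    # Weights decay with distance from the query point.
    w = np.exp(-np.sum((points - r) ** 2, axis=1) / h**2)
    sw = np.sqrt(w)

    # Step 1 (assumed): local origin q and d-dimensional basis U from weighted PCA.
    q = (w[:, None] * points).sum(axis=0) / w.sum()
    centered = points - q
    _, _, vt = np.linalg.svd(sw[:, None] * centered, full_matrices=False)
    U = vt[:d].T  # n x d, top-d principal directions

    # Step 2: weighted least-squares fit of a degree-m polynomial map
    # from local coordinates (R^d) to the centered samples (R^n).
    x = centered @ U
    A = monomials(x, m)
    coeffs = np.linalg.lstsq(sw[:, None] * A, sw[:, None] * centered, rcond=None)[0]

    # Evaluate the fitted polynomial at the local coordinate of the query.
    x0 = ((r - q) @ U)[None, :]
    return q + (monomials(x0, m) @ coeffs)[0]
```

Because the weights depend on the query point, the fit is recomputed for every point that is projected, which is what makes the least-squares approximation "moving". In this sketch the degree `m` plays the role of the local polynomial degree mentioned above, while `h` is only a stand-in for a bandwidth tied to the sampling density.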
