Abstract

Data mining to discover patterns and aid decisions is key to utilizing massive data for process automation and optimization. An especially challenging data mining problem is kriging, i.e., prediction of multiple, related variables from latent patterns in the data. We present a manifold-based machine learning approach to discover patterns in massive, correlated, high-dimensional data. Dimensionality reduction using a manifold is a type of non-linear principal component analysis (PCA). The manifold captures the underlying structure of the inputs and corresponding outputs by projecting the data onto a set of basis functions defined by the manifold. These bases ensure that any future adjustments affect the model with respect to the natural geometry of the data. We chose the manifold learning technique for its robustness against unbalanced data. Our contribution, described in this paper, enables interactive and incremental learning, i.e., incremental adjustment of the manifold (and its predictions) based on new observations and on user corrections to the predicted values, without rerunning the analysis on the full data set. Our experiments demonstrate that prediction performance remains equivalent to that of Multi-kernel Gaussian Processes on standard data sets despite these practically useful enhancements.
