Abstract

We discuss an approach called spontaneous data learning (SDL) that opens a novel explanatory paradigm connecting parametrics with nonparametrics. The statistical performance of SDL is explored from an information-geometric viewpoint, so that SDL offers a new perspective beyond the usual discussion of robustness or misspecification of a parametric model. If the true distribution lies exactly in the parametric model, the theory of statistical estimation is well established, and any minimum divergence estimator satisfies parametric consistency. We focus on the collapse of this parametric theory as the setting is perturbed toward a nonparametric one, where the true distribution may range from unimodal to multimodal; various estimators within the class of minimum divergence estimators are targeted and investigated. In this context we explore a selection of estimators rather than model selection. Specifically, we choose the power divergence class under a normal mean model, where the true distribution is, for example, a mixture of K distributions. We then observe that the local minima of the empirical loss function for the power divergence properly suggest the K means, provided the components of the mixture are mutually separated and the order of the power is appropriately selected. The resulting method for cluster analysis is shown to detect the number K of clusters spontaneously. Further, we observe that the normalized empirical loss function converges to the true density function as the power parameter goes to infinity. As a result, the power parameter bridges parametric and nonparametric consistency.
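As a rough illustration of the mechanism described above, the following Python sketch (not taken from the paper; the values of sigma, the power beta, the grid, and the three-component mixture are illustrative assumptions) evaluates a density-power-type empirical loss for a normal mean model over a grid of candidate means and reports its local minima. For a location family with fixed scale, minimizing such a loss amounts to maximizing the average of exp(-beta (x_i - mu)^2 / (2 sigma^2)), whose local maxima sit near the component means when the mixture is well separated and the power is chosen suitably.

```python
# Minimal sketch, assuming a normal mean model with fixed sigma and a
# density-power-type empirical loss; not the authors' implementation.
import numpy as np

def power_loss(mu_grid, x, sigma=1.0, beta=0.5):
    """Empirical power-divergence-type loss over a grid of candidate means."""
    # Squared residuals for every (candidate mean, observation) pair.
    r2 = (mu_grid[:, None] - x[None, :]) ** 2
    # Negative average of the beta-powered normal kernel (constants dropped).
    return -np.mean(np.exp(-beta * r2 / (2.0 * sigma ** 2)), axis=1)

def local_minima(mu_grid, loss):
    """Grid points that are strict local minima of the loss."""
    inner = (loss[1:-1] < loss[:-2]) & (loss[1:-1] < loss[2:])
    return mu_grid[1:-1][inner]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Data from a well-separated three-component normal mixture (K = 3).
    x = np.concatenate([rng.normal(m, 1.0, 200) for m in (-6.0, 0.0, 6.0)])
    mu_grid = np.linspace(-10.0, 10.0, 2001)
    loss = power_loss(mu_grid, x, sigma=1.0, beta=0.5)
    print("suggested cluster means:", local_minima(mu_grid, loss))
```

On such separated data the printed local minima fall near -6, 0 and 6, so the number of minima itself indicates K; with a poorly chosen power (too small or too large) the minima merge or fragment, which is the estimator-selection issue the abstract raises.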
