Abstract
A soft-constrained neural network for density estimation (SC-NN-4pdf) has recently been introduced to tackle the issues arising from the application of neural networks to density estimation problems (in particular, the satisfaction of the second Kolmogorov axiom). Although the SC-NN-4pdf has been shown to outperform parametric and non-parametric approaches (from both the machine learning and the statistics areas) over a variety of univariate and multivariate density estimation tasks, no clear rationale behind its performance has been put forward so far, nor has there been any analysis of the fundamental theoretical properties of the SC-NN-4pdf. This paper narrows these gaps, delivering a formal statement of the class of density functions that can be modeled to any degree of precision by SC-NN-4pdfs, as well as a proof of asymptotic convergence in probability of the SC-NN-4pdf training algorithm, under mild conditions, for a popular class of neural architectures. These properties lay the groundwork for understanding the strong estimation capabilities that SC-NN-4pdfs have so far exhibited only empirically.
Highlights
Density estimation has long been a fundamental open issue in statistics and pattern classification. Implicitly or explicitly, it is at the core of statistical pattern recognition and unsupervised learning [1]. Applications embrace data compression and model selection [2], coding [3], and bioinformatics [4]. Density estimation has been applied to the modeling of sequences [5,6] and structured data [7,8] as well. The task of estimating conditional probability distributions is fundamental to the broad area of probabilistic graphical models [9,10].
Density estimation is at the core of many practical applications rooted in pattern recognition, unsupervised learning, statistical analysis, and coding.
Artificial neural network (ANN)-based pdf estimation algorithms struggle to break through, due to the difficulties posed by the very nature of the unsupervised estimation task and by the requirement of satisfying Kolmogorov's axioms of probability (stated formally below).
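For concreteness, the two requirements at stake, non-negativity and the unit-integral (second Kolmogorov) axiom, can be written for a network realizing a function over the sample space; the symbols φ and w below are our notation for the network function and its weights, not the paper's:

```latex
\varphi(\mathbf{x}; \mathbf{w}) \ge 0 \;\; \forall \mathbf{x} \in \mathbb{R}^{d},
\qquad
\int_{\mathbb{R}^{d}} \varphi(\mathbf{x}; \mathbf{w}) \, d\mathbf{x} = 1 .
```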
Summary
Density estimation has long been a fundamental open issue in statistics and pattern classification. Implicitly or explicitly, it is at the core of statistical pattern recognition and unsupervised learning [1]. Applications embrace data compression and model selection [2], coding [3], and bioinformatics [4]. Density estimation has been applied to the modeling of sequences [5,6] and structured data [7,8] as well, and the task of estimating conditional probability distributions is fundamental to the broad area of probabilistic graphical models [9,10]. Parametric techniques require a strong assumption on the form of the probability density function (pdf) at hand, while non-parametric approaches are memory-based (i.e., prone to overfitting), overly complex in time and space, and unreliable over small data samples. Despite the ease of training ANNs for Bayes posterior probability estimation aimed at pattern classification [12,13], learning density functions raises problems entailed by the intrinsically unsupervised nature of the task and, above all, by the requirement of satisfying (at least numerically) the axioms of probability [14]. In particular, the integral of the function realized by the ANN must equal one.
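The excerpt above does not spell out how the soft constraint enters training, but the idea can be illustrated with a minimal sketch: fit a non-negative network output to the sample by maximum likelihood, while penalizing any deviation of its numerically computed integral from one. Everything below (the DensityNet architecture, the integration grid, and the penalty weight lam) is a hypothetical illustration under these assumptions, not the authors' actual SC-NN-4pdf algorithm:

```python
import torch
import torch.nn as nn

class DensityNet(nn.Module):
    """Small MLP whose non-negative output is read as a pdf value (hypothetical)."""
    def __init__(self, dim=1, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1), nn.Softplus(),  # non-negativity (first axiom)
        )

    def forward(self, x):
        return self.net(x)

def soft_constrained_loss(model, sample, grid, lam=10.0):
    # Unsupervised fit: maximize the log-likelihood of the observed sample ...
    nll = -torch.log(model(sample) + 1e-8).mean()
    # ... while softly penalizing deviation of the numerical integral of the
    # network output (trapezoidal rule over a fixed grid) from 1 (second axiom).
    p = model(grid).squeeze(-1)
    integral = torch.trapezoid(p, grid.squeeze(-1))
    return nll + lam * (integral - 1.0) ** 2

# Usage: fit a pdf over [-5, 5] to a 1-D sample from an unknown density.
sample = torch.randn(1000, 1)                        # observed data
grid = torch.linspace(-5.0, 5.0, 512).unsqueeze(-1)  # integration grid
model = DensityNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    opt.zero_grad()
    loss = soft_constrained_loss(model, sample, grid)
    loss.backward()
    opt.step()
```

The penalty weight lam trades likelihood fit against how closely the unit-integral constraint is met; in the soft-constraint spirit, the axiom is satisfied numerically rather than by construction.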