Abstract
Nonparametric estimation of a probability density function describing multivariate data has typically been addressed by kernel density estimation (KDE). A density estimator recently developed by Farmer and Jacobs offers an alternative, high-throughput, automated approach to univariate nonparametric density estimation based on maximum entropy and order statistics, and improves accuracy over univariate KDE. This article extends the single-variable case to multiple variables. The univariate estimator is applied recursively to calculate a product array of one-dimensional conditional probabilities. Combined with interpolation methods, this yields a complete joint probability density estimate for multiple variables. A numerical study on synthetic data drawn from known distributions, with sample sizes ranging from 100 to 10^6 and two to six variables, demonstrates good accuracy and speed. Both speed and accuracy are compared to KDE. The multivariate density estimate developed here tends to perform better as the number of samples and/or variables increases. As an example application, measurements over five filters of photometric data from the Sloan Digital Sky Survey Data Release 17 are analyzed. The multivariate estimate forms the basis for a binary classifier that distinguishes quasars from galaxies and stars with up to 94% accuracy.
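To make the idea of a product of one-dimensional conditional probabilities concrete, the following is a minimal two-variable sketch of the factorization p(x, y) = p(x) p(y | x). It is not the method of the article: the univariate step uses a plain Gaussian KDE with Silverman's rule as a stand-in for the Farmer and Jacobs maximum-entropy estimator, and the function names, the quantile slicing of x, and the linear interpolation between slices are all assumptions made for illustration.

```python
import numpy as np


def univariate_density(samples, grid):
    """Stand-in 1D estimator: Gaussian KDE with Silverman's rule.

    Assumption: this replaces the maximum-entropy univariate estimator
    named in the abstract, which is not reproduced here.
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    spread = samples.std(ddof=1) if n > 1 else 1.0
    bandwidth = 1.06 * max(spread, 1e-12) * n ** (-0.2)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel_sum = np.exp(-0.5 * z ** 2).sum(axis=1)
    return kernel_sum / (n * bandwidth * np.sqrt(2.0 * np.pi))


def joint_density_2d(x, y, x_grid, y_grid, n_slices=8):
    """Estimate p(x, y) = p(x) * p(y | x) on a rectangular grid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Marginal density of the first variable.
    p_x = univariate_density(x, x_grid)

    # Split the samples into equal-count slices of x (hypothetical choice).
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_slices + 1))
    centers = 0.5 * (edges[:-1] + edges[1:])

    # One 1D conditional estimate of y per slice of x.
    cond = np.empty((n_slices, y_grid.size))
    for k in range(n_slices):
        if k == n_slices - 1:
            mask = (x >= edges[k]) & (x <= edges[k + 1])
        else:
            mask = (x >= edges[k]) & (x < edges[k + 1])
        cond[k] = univariate_density(y[mask], y_grid)

    # Interpolate the conditionals linearly between slice centers.
    p_y_given_x = np.empty((x_grid.size, y_grid.size))
    for j in range(y_grid.size):
        p_y_given_x[:, j] = np.interp(x_grid, centers, cond[:, j])

    return p_x[:, None] * p_y_given_x


# Usage on synthetic correlated Gaussian data.
rng = np.random.default_rng(0)
x = rng.normal(size=4000)
y = 0.8 * x + 0.6 * rng.normal(size=4000)
x_grid = np.linspace(-5.0, 5.0, 201)
y_grid = np.linspace(-5.0, 5.0, 201)
density = joint_density_2d(x, y, x_grid, y_grid)
```

Extending the same pattern to more variables would condition each new variable on all earlier ones, which is where a recursive scheme and a faster univariate estimator become important.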