A Balanced Approach to Adaptive Probability Density Estimation.

Julio A Kovacs, Cailee Helmick, Willy Wriggers

doi:10.3389/fmolb.2017.00025

Open Access

https://doi.org/10.3389/fmolb.2017.00025

Journal: Frontiers in Molecular Biosciences
Publication Date: Apr 25, 2017
Citations: 3
License type: CC BY

Affiliation: Old Dominion University

Abstract

Our development of a Fast (Mutual) Information Matching (FIM) of molecular dynamics time series data led us to the general problem of how to accurately estimate the probability density function of a random variable, especially in cases of very uneven samples. Here, we propose a novel Balanced Adaptive Density Estimation (BADE) method that effectively optimizes the amount of smoothing at each point. To do this, BADE relies on an efficient nearest-neighbor search, which results in good scaling for large data sizes. Our tests on simulated data show that BADE exhibits equal or better accuracy than existing methods, and visual tests on univariate and bivariate experimental data show that the results are also aesthetically pleasing. This is due in part to the use of a visual criterion for setting the smoothing level of the density estimate. Our results suggest that BADE offers an attractive new take on the fundamental density estimation problem in statistics. We have applied it to molecular dynamics simulations of membrane pore formation. We also expect BADE to be generally useful for low-dimensional applications in other statistical domains such as bioinformatics, signal processing, and econometrics.

Highlights

One of the most popular non-parametric density estimation methods is kernel density estimation (KDE), whereby the density is estimated by means of a sum of kernel functions centered at the sample points (Silverman, 1986; Wand and Jones, 1995): $f(x) = \frac{1}{M} \sum_{j=1}^{M} K_H(x - x_j)$ (1), where $K_H(x) = \det(H)^{-1/2} K(H^{-1/2} \cdot x)$, $K : \mathbb{R}^d \to \mathbb{R}$ being the d-variate kernel and $M$ the data size.
Motivated by the various limitations of previous methods, here we propose a novel approach, which we call “Balanced Adaptive Density Estimation (BADE)”, that offers several desirable features: it scales well to large data sizes, it is not restricted to diagonal bandwidth matrices, and it is free of data-dependent parameters.
To evaluate the accuracy of BADE, we computed statistics of the integrated squared error (ISE) for simulated samples taken from known distributions (Figures 4, 5 for the univariate case; Figures 9, 10 for the bivariate case).
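
To make Eq. (1) and the ISE criterion concrete, here is a minimal Python sketch of a fixed-bandwidth KDE with a full bandwidth matrix H, followed by a numerical ISE against a known density. It is an illustration only and not the BADE method: the Gaussian kernel, the bandwidth value 0.3, the sample size, and the function names (`gaussian_kernel`, `kde_fixed`, `trapezoid`, `integrated_squared_error`) are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard d-variate Gaussian kernel K, evaluated at each row of u (shape (n, d))."""
    d = u.shape[1]
    return np.exp(-0.5 * np.sum(u * u, axis=1)) / (2.0 * np.pi) ** (d / 2.0)

def kde_fixed(x, samples, H):
    """Evaluate Eq. (1) at query points x (n, d) for samples (M, d) and bandwidth matrix H (d, d)."""
    # H^{-1/2} via the eigendecomposition of the symmetric positive-definite H.
    w, V = np.linalg.eigh(H)
    H_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    norm = np.linalg.det(H) ** -0.5
    out = np.zeros(x.shape[0])
    for xj in samples:
        # K_H(x - x_j) = det(H)^{-1/2} K(H^{-1/2} (x - x_j))
        out += norm * gaussian_kernel((x - xj) @ H_inv_sqrt)
    return out / samples.shape[0]

def trapezoid(y, grid):
    """Trapezoidal rule on a 1-D grid (written out to avoid NumPy version differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(grid)))

def integrated_squared_error(f_est, f_true, grid):
    """Numerical ISE: integral of (f_est - f_true)^2 over the grid."""
    return trapezoid((f_est - f_true) ** 2, grid)

# Univariate example: samples from a standard normal, whose density is known.
rng = np.random.default_rng(0)
samples = rng.standard_normal((500, 1))
grid = np.linspace(-5.0, 5.0, 1001)
f_true = np.exp(-0.5 * grid ** 2) / np.sqrt(2.0 * np.pi)

H = np.array([[0.3 ** 2]])  # assumed bandwidth; in 1-D, H is the squared bandwidth
f_est = kde_fixed(grid[:, None], samples, H)
ise = integrated_squared_error(f_est, f_true, grid)
print("ISE:", ise)
```

Repeating the last block over many independent samples and summarizing the resulting ISE values gives the kind of statistics described above. The loop over samples costs O(nM) kernel evaluations, which is fine for a sketch but not for large data sizes.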

Summary

Introduction

We adopted a fixed-bandwidth KDE approach in our recent application to Fast (Mutual) Information Matching (FIM) of molecular dynamics time series data (Kovacs and Wriggers, 2016). For protein simulations, we recommended using an rms-fluctuation-based activity, which gives a more even histogram (Kovacs and Wriggers, 2016). This is not an option for the membrane simulations in the accompanying paper (Wriggers et al., 2017), so we require a variable-bandwidth approach that can handle the graph-based activity functions used in that application.
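
To illustrate what a variable-bandwidth estimator does with uneven samples, the following Python sketch implements a generic sample-point adaptive KDE in one dimension, where each sample's bandwidth is the distance to its k-th nearest neighbor. This is a textbook-style construction shown for orientation only; it is not the BADE algorithm, whose balancing criterion is not described in this summary. The parameter `k`, the brute-force neighbor search, and the function names are assumptions made for the example.

```python
import numpy as np

def knn_bandwidths(samples, k):
    """Distance from each 1-D sample to its k-th nearest neighbor (brute force, O(M^2))."""
    dists = np.abs(samples[:, None] - samples[None, :])
    # Index 0 of each sorted row is the sample itself (distance 0),
    # so index k is the k-th nearest neighbor. Assumes no (k+1)-fold ties.
    return np.partition(dists, k, axis=1)[:, k]

def kde_variable(x, samples, k=20):
    """Sample-point adaptive KDE: one Gaussian per sample, with its own bandwidth."""
    h = knn_bandwidths(samples, k)
    u = (x[:, None] - samples[None, :]) / h[None, :]
    kernels = np.exp(-0.5 * u * u) / (np.sqrt(2.0 * np.pi) * h[None, :])
    return kernels.mean(axis=1)

# An uneven sample: a narrow, dense cluster plus a broad, sparse one.
rng = np.random.default_rng(1)
samples = np.concatenate([rng.normal(0.0, 0.2, 300), rng.normal(5.0, 2.0, 300)])

grid = np.linspace(-10.0, 20.0, 6001)
f = kde_variable(grid, samples, k=20)
h = knn_bandwidths(samples, 20)
print("mean bandwidth, dense cluster: ", h[:300].mean())
print("mean bandwidth, sparse cluster:", h[300:].mean())
```

Dense regions receive small bandwidths and sparse regions receive large ones, which a single fixed bandwidth cannot do. A tree-based neighbor search would replace the brute-force distance matrix for large data sizes.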
