Optimizing the Estimation of a Histogram-Bin Width—Application to the Multivariate Mixture-Model Estimation

Branislav Panić, Jernej Klemenc, Marko Nagode

doi:10.3390/math8071090

Abstract

A maximum-likelihood estimation of a multivariate mixture model’s parameters is a difficult problem. One approach is to combine the REBMIX and EM algorithms. However, the REBMIX algorithm requires the use of histogram estimation, which is the most rudimentary approach to an empirical density estimation and has many drawbacks. Nevertheless, because of its simplicity, it is still one of the most commonly used techniques. The main problem is to estimate the optimum histogram-bin width, which is usually set by the number of non-overlapping, regularly spaced bins. For univariate problems it is usually denoted by an integer value; i.e., the number of bins. However, for multivariate problems, in order to obtain a histogram estimation, a regular grid must be formed. Thus, to obtain the optimum histogram estimation, an integer-optimization problem must be solved. The aim is therefore the estimation of optimum histogram binning, alone and in application to the mixture model parameter estimation with the REBMIX&EM strategy. As an estimator, the Knuth rule was used. For the optimization algorithm, a derivative based on the coordinate-descent optimization was composed. These proposals yielded promising results. The optimization algorithm was efficient and the results were accurate. When applied to the multivariate, Gaussian-mixture-model parameter estimation, the results were competitive. All the improvements were implemented in the rebmix R package.
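The abstract names two ingredients: the Knuth rule as the estimator and a coordinate-descent-style search over the integer numbers of bins. A minimal Python sketch of how these could fit together is given below. It is not the rebmix implementation: the function names, the v_max cap and the full line search along each coordinate are assumptions made for illustration, and the score is Knuth's log-posterior up to an additive constant.

```python
import math
from collections import Counter


def knuth_log_posterior(data, v):
    """Knuth's log-posterior (up to a constant) for a regular grid.

    data: list of d-dimensional points; v[i]: number of bins along dimension i.
    """
    n = len(data)
    d = len(v)
    m = math.prod(v)  # total number of grid cells
    lows = [min(row[i] for row in data) for i in range(d)]
    highs = [max(row[i] for row in data) for i in range(d)]
    counts = Counter()
    for row in data:
        cell = []
        for i in range(d):
            width = (highs[i] - lows[i]) / v[i] or 1.0  # guard against zero range
            cell.append(min(int((row[i] - lows[i]) / width), v[i] - 1))
        counts[tuple(cell)] += 1
    occupied = sum(math.lgamma(c + 0.5) for c in counts.values())
    empty = (m - len(counts)) * math.lgamma(0.5)  # cells with zero count
    return (n * math.log(m) + math.lgamma(m / 2) - m * math.lgamma(0.5)
            - math.lgamma(n + m / 2) + occupied + empty)


def coordinate_search(data, v_max=30):
    """Hypothetical coordinate-wise integer search maximizing the Knuth score."""
    d = len(data[0])
    v = [1] * d
    best = knuth_log_posterior(data, v)
    improved = True
    while improved:
        improved = False
        for i in range(d):  # optimize one dimension at a time
            for candidate in range(1, v_max + 1):
                trial = v[:i] + [candidate] + v[i + 1:]
                score = knuth_log_posterior(data, trial)
                if score > best:
                    best, v, improved = score, trial, True
    return v, best
```

Calling coordinate_search(points) on a list of equal-length tuples returns the selected numbers of bins per dimension and the corresponding score. The search only moves along one coordinate at a time, so it can stop at a local optimum of the integer problem.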

Highlights

Let the random variable $y \in \mathbb{R}^d$ follow the multivariate finite-mixture model with the probability density function $f(y \mid c, w, \Theta) = \sum_{l=1}^{c} w_l f_l(y \mid \Theta_l)$ (1). The main task is to estimate the c, w, Θ parameters
As the number of components c can be seen as a complexity parameter, the value of the likelihood function increases as this value increases
As a first useful insight, we counted, for each dimension d, the number of datasets on which each algorithm recovered the true binning v from which the dataset was simulated

Summary

Introduction

The main task is to estimate the parameters c, w and Θ, which are, respectively, the number of components in the mixture model, the component weights and the component parameters. The maximum-likelihood estimation of these parameters is a difficult task and mostly requires a combination of multiple procedures [1]. A direct analytical treatment of the likelihood function for mixture models is difficult and inconvenient; instead, the complete-data likelihood function can be used and the maximum-likelihood parameter estimates can be obtained using the expectation-maximization (EM) algorithm [2]. As the number of components c can be seen as a complexity parameter, the value of the likelihood function increases as this value increases.
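To make the role of the EM algorithm concrete, the following Python sketch evaluates a mixture density of the form in Equation (1) and performs EM updates for w and Θ with the number of components c held fixed. It assumes univariate Gaussian components for brevity; the function names and the small variance floor are illustrative assumptions, and the sketch addresses neither the choice of c nor the REBMIX initialization.

```python
import math


def normal_pdf(y, mean, var):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(y, weights, means, variances):
    """f(y | c, w, Theta) = sum_l w_l f_l(y | Theta_l) with Gaussian components."""
    return sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, variances))


def log_likelihood(data, weights, means, variances):
    return sum(math.log(mixture_pdf(y, weights, means, variances)) for y in data)


def em_step(data, weights, means, variances):
    """One EM iteration for a fixed number of components c = len(weights)."""
    c = len(weights)
    # E-step: posterior probability that each observation belongs to component l.
    resp = []
    for y in data:
        joint = [weights[l] * normal_pdf(y, means[l], variances[l]) for l in range(c)]
        total = sum(joint)
        resp.append([p / total for p in joint])
    # M-step: responsibility-weighted updates of weights, means and variances.
    n = len(data)
    new_w, new_m, new_v = [], [], []
    for l in range(c):
        r_sum = sum(r[l] for r in resp)
        mean = sum(r[l] * y for r, y in zip(resp, data)) / r_sum
        var = sum(r[l] * (y - mean) ** 2 for r, y in zip(resp, data)) / r_sum
        new_w.append(r_sum / n)
        new_m.append(mean)
        new_v.append(max(var, 1e-6))  # assumed floor to avoid a degenerate component
    return new_w, new_m, new_v
```

Repeating em_step until log_likelihood stops improving gives a local maximum-likelihood estimate for the chosen c. Because the likelihood generally grows with c, comparing fits across different c requires a separate model-selection criterion.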

Journal: Mathematics	Publication Date: Jul 3, 2020
Citations: 10	License type: CC BY 4.0

Similar Papers

Practical Issues in Estimation Over Multiaccess Fading Channels With TBMA Wireless Sensor Networks
Ping Gao ... Cihan Tepedelenlioglu
IEEE Transactions on Signal Processing | VOL. 56
01 Mar 2008

Parameter estimation of Gaussian mixture models (GMM) with expectation maximization (EM) algorithm
Wardatul Jannah ... Dewi R S Saputro
01 Jan 2021

Model Based Clustering using Finite Mixture Models of Lognormal Distribution
...
13 Feb 2019

Theory and Practice of Expectation Maximization (EM) Algorithm
Chandan K Reddy ... Bala Rajaratnam
01 Jan 2009
