Abstract

A commonly used tool for estimating the parameters of a mixture model is the Expectation–Maximization (EM) algorithm, an iterative procedure that serves as a maximum-likelihood estimator. The EM algorithm has well-documented drawbacks, such as the need for good initial values and the possibility of being trapped in local optima. Nevertheless, because of its appealing properties, EM plays an important role in estimating the parameters of mixture models. To overcome these initialization problems, in this paper we propose the Rough-Enhanced-Bayes mixture estimation (REBMIX) algorithm as a more effective initialization for EM. Three different strategies are derived for dealing with the unknown number of components in the mixture model. These strategies are thoroughly tested on artificial datasets, density-estimation datasets, and image-segmentation problems, and compared with state-of-the-art initialization methods for EM. Our proposal shows promising results in terms of clustering and density-estimation performance as well as computational efficiency. All the improvements are implemented in the rebmix R package.

Highlights

  • The Expectation–Maximization (EM) algorithm was introduced in [1]

  • We propose the Rough-Enhanced-Bayes mixture estimation (REBMIX) algorithm as an initialization technique for the EM algorithm

  • The proposed strategies were thoroughly tested on artificially created datasets (Section 5.1), density-estimation tasks (Section 5.2), and image-segmentation tasks (Section 5.3)


Introduction

The Expectation–Maximization (EM) algorithm was introduced in [1]. Today, after a little more than 40 years, it is still one of the most popular algorithms for statistical pattern recognition. Its variants range from DA-EM in [2] to modifications of the EM algorithm for different purposes: image matching [3]; parameter estimation [4,5]; malaria diagnosis [6]; mixture simplification [7]; and audio-visual scene analysis [8]. EM's popularity has risen due to its use in estimating mixture-model (MM) parameters [9,10]. The parameters of an MM can serve multiple purposes; they are of interest for density-estimation tasks [7,11] and clustering tasks [6,12,13]. In the context of MM parameter estimation, the EM algorithm can be seen as a clustering algorithm that maximizes the missing-data log-likelihood as its objective function. EM has many appealing properties.

Mathematics 2020, 8, 373; doi:10.3390/math8030373 www.mdpi.com/journal/mathematics
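The iterative procedure described above can be made concrete with a small sketch. The following is a minimal, illustrative EM for a univariate Gaussian mixture written in Python with NumPy; it is an assumption-laden sketch, not the paper's REBMIX method. The quantile-based starting values stand in for the simple initialization heuristics whose weaknesses motivate the paper.

```python
import numpy as np

def em_gmm_1d(x, k, n_iter=100):
    """Illustrative EM for a univariate k-component Gaussian mixture.
    This is a generic sketch, not the REBMIX initialization from the paper."""
    n = len(x)
    # Simple quantile-based initialization; the quality of such starting
    # values is exactly the issue that better initializers aim to address.
    w = np.full(k, 1.0 / k)                          # mixing weights
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)    # component means
    var = np.full(k, np.var(x))                      # component variances
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from responsibility-weighted data
        nk = r.sum(axis=0)
        w = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

# Synthetic data: two well-separated components
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 500), rng.normal(3, 1, 500)])
w, mu, var = em_gmm_1d(x, k=2)
```

Each iteration provably does not decrease the observed-data log-likelihood, which is one of EM's appealing properties; on data like the above, with poorly separated components or unlucky starting values, the same loop can instead converge to an inferior local optimum.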
