Abstract

In this paper, we propose a Dirichlet process (DP) mixture model of Gamma distributions, which is an extension of the finite Gamma mixture model to the infinite case. In particular, we propose a novel online nonparametric Bayesian analysis method based on the infinite Gamma mixture model where the determination of the number of clusters is bypassed via an infinite number of mixture components. The proposed model is learned via an online extended variational Bayesian inference approach in a flexible way where the priors of model’s parameters are selected appropriately and the posteriors are approximated effectively in a closed form. The online setting has the advantage to allow data instances to be treated in a sequential manner, which is more attractive than batch learning especially when dealing with massive and streaming data. We demonstrated the performance and merits of the proposed statistical framework with a challenging real-world application namely oil spill detection in synthetic aperture radar (SAR) images.

Highlights

  • The use of statistical machine learning has proliferated in many fields, especially to solve a broad range of problems ranging from signal processing, speech recognition, to geosciences and remote sensing where strong models are needed to apply statistical methodology

  • It should be noted that one of the challenges is the lack of already common data sets for oil spill detection and this problem has been addressed by many relevant research communities such as [57,58]

  • The first data set is the synthetic aperture radar (SAR) images containing oil spills collected via the European Space Agency (ESA) database [40] which is composed of 1112 images with 5 different classes: Land, Look-alike, oil-spill, ships, and sea surface

Read more

Summary

Introduction

The use of statistical machine learning has proliferated in many fields, especially to solve a broad range of problems ranging from signal processing, speech recognition, to geosciences and remote sensing where strong models are needed to apply statistical methodology. In the case of geosciences and remote sensing, for instance, statistical machine learning techniques have been deployed successfully in a variety of problems and applications in many parts of the earth system and beyond [1]. In general, a formal approach to unsupervised learning and allow, in particular, to select the optimal number of clusters for a given dataset. This fact has been largely detailed in the literature (see, for example, [4,5]).

Objectives
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call