Approximations of conditional probability density functions in Lebesgue spaces via mixture of experts models

Hien Duy Nguyen, Trungtin Nguyen, Faicel Chamroukhi, Geoffrey John McLachlan

doi:10.1186/s40488-021-00125-0

Open Access

Abstract

Mixture of experts (MoE) models are widely applied to conditional probability density estimation problems. We demonstrate the richness of the class of MoE models by proving denseness results in Lebesgue spaces when the input and output variables are both compactly supported. We further prove an almost uniform convergence result when the input is univariate. Auxiliary lemmas are proved regarding the richness of the soft-max gating function class and its relationship to the class of Gaussian gating functions.

Highlights

Mixture of experts (MoE) models are a widely applicable class of conditional probability density approximations that have been considered as solution methods across the spectrum of statistical and machine learning problems (Yuksel et al. 2012; Masoudnia and Ebrahimpour 2014; Nguyen and Chamroukhi 2018).
We say that $m$ is a $K$-component MoE model with gates arising from the class $\mathcal{G}^K$ and experts arising from $\mathcal{E}$, where $\mathcal{E}$ is a class of probability density functions (PDFs) with support $\mathbb{Y}$.
We address the problem of approximating $f$, with respect to the $\mathcal{L}_p$ norm, using MoE models in the soft-max and Gaussian gated classes.
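
For orientation, the $K$-component MoE form and the two gating families named in the highlights can be written out as follows. This is a sketch of the standard textbook forms, assumed here because the page text does not spell them out; the parameter symbols $a_k$, $b_k$, $\pi_k$, $\nu_k$, $\Sigma_k$ and the Gaussian density $\phi$ are notation introduced for illustration.

```latex
% K-component MoE conditional density: gates depend on x, experts on y
m(y \mid x) = \sum_{k=1}^{K} \mathrm{Gate}_k(x)\, \mathrm{Expert}_k(y),
\qquad \mathrm{Gate}_k(x) \ge 0, \quad \sum_{k=1}^{K} \mathrm{Gate}_k(x) = 1 .

% Soft-max gates (assumed standard form)
\mathrm{Gate}_k(x) = \frac{\exp\!\left(a_k + b_k^{\top} x\right)}
                          {\sum_{l=1}^{K} \exp\!\left(a_l + b_l^{\top} x\right)},
\qquad a_k \in \mathbb{R}, \; b_k \in \mathbb{R}^{d} .

% Gaussian gates (assumed standard form), with mixing weights \pi_k > 0 summing to one
\mathrm{Gate}_k(x) = \frac{\pi_k\, \phi\!\left(x; \nu_k, \Sigma_k\right)}
                          {\sum_{l=1}^{K} \pi_l\, \phi\!\left(x; \nu_l, \Sigma_l\right)} .
```

In both families the gates form a partition of unity over the input space, which is what makes $m(\cdot \mid x)$ a convex combination of expert densities at every $x$.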

Summary

Introduction

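Mixture of experts (MoE) models are a widely applicable class of conditional probability density approximations that have been considered as solution methods across the spectrum of statistical and machine learning problems (Yuksel et al. 2012; Masoudnia and Ebrahimpour 2014; Nguyen and Chamroukhi 2018). Suppose that the target conditional PDF $f$ is in the class $\mathcal{F}_p = \mathcal{F} \cap \mathcal{L}_p$. We address the problem of approximating $f$, with respect to the $\mathcal{L}_p$ norm, using MoE models in the soft-max and Gaussian gated classes. The soft-max gated class is

\[
\mathcal{M}_S^{\psi} = \left\{ m_K^{\psi} : \mathbb{Z} \to [0, \infty) \;\middle|\; m_K^{\psi}(y \mid x) = \sum_{k=1}^{K} \mathrm{Gate}_k(x)\, g_{\psi}(y; \mu_k, \sigma_k),\; g_{\psi} \in \mathcal{E}_{\psi} \cap \mathcal{L}_{\infty},\; \mathrm{Gate} \in \mathcal{G}_S^{K},\; \mu_k \in \mathbb{Y},\; \sigma_k \in (0, \infty),\; k \in [K],\; K \in \mathbb{N} \right\},
\]

and the Gaussian gated class $\mathcal{M}_G^{\psi}$ is defined in the same way, with the gates drawn from the Gaussian gating class instead of $\mathcal{G}_S^{K}$. Related to our results are contributions regarding the approximation capabilities of the conditional expectation function of the classes $\mathcal{M}_S^{\psi}$ and $\mathcal{M}_G^{\psi}$ (Wang and Mendel 1992; Zeevi et al. 1998; Jiang and Tanner 1999a; Krzyzak and Schafer 2005; Mendes and Jiang 2012; Nguyen et al. 2016; Nguyen et al. 2019) and the approximation capabilities of subclasses of $\mathcal{M}_S^{\psi}$ and $\mathcal{M}_G^{\psi}$ with respect to the Kullback–Leibler divergence (Jiang and Tanner 1999b; Norets 2010; Norets and Pelenis 2014).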
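
To make the soft-max gated class concrete, the following is a minimal numerical sketch of one member of that class with a univariate input, together with a grid-based estimate of an $\mathcal{L}_p$ distance. Several things here are assumptions for illustration and are not taken from the paper: the expert is chosen to be a Gaussian location-scale density, the gate parameters `a` and `b` follow the usual soft-max form, the function names are hypothetical, and the parameter values are arbitrary. Note that with Gaussian experts the density is not renormalized on the compact output set; the sketch only evaluates the mixture formula.

```python
import numpy as np


def softmax_gates(x, a, b):
    """Soft-max gates for univariate inputs x; returns an (n, K) array whose rows sum to 1."""
    logits = a[None, :] + np.outer(x, b)
    logits = logits - logits.max(axis=1, keepdims=True)  # stabilize the exponentials
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)


def gaussian_expert(y, mu, sigma):
    """Gaussian location-scale density g(y; mu, sigma) (an assumed choice of expert)."""
    z = (y - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def moe_density(y, x, a, b, mu, sigma):
    """Evaluate m_K(y | x) = sum_k Gate_k(x) g(y; mu_k, sigma_k) for paired 1-D arrays y, x."""
    gates = softmax_gates(x, a, b)                                   # (n, K)
    experts = gaussian_expert(y[:, None], mu[None, :], sigma[None, :])  # (n, K)
    return (gates * experts).sum(axis=1)


def lp_distance(f, m, grid_x, grid_y, p=2):
    """Riemann-sum estimate of the L_p distance between f and m on a rectangular grid."""
    X, Y = np.meshgrid(grid_x, grid_y, indexing="ij")
    diff = np.abs(f(Y.ravel(), X.ravel()) - m(Y.ravel(), X.ravel())) ** p
    cell = (grid_x[1] - grid_x[0]) * (grid_y[1] - grid_y[0])
    return (diff.sum() * cell) ** (1.0 / p)


# Hypothetical two-component parameters, chosen only for illustration.
a = np.array([0.0, 1.0])
b = np.array([2.0, -2.0])
mu = np.array([0.25, 0.75])
sigma = np.array([0.1, 0.2])


def model(y, x):
    return moe_density(y, x, a, b, mu, sigma)
```

A distance to any candidate target, supplied as a callable `f(y, x)` on the same grid, can then be estimated with `lp_distance(f, model, grid_x, grid_y, p)`. The grid estimate is only a rough check and says nothing about the denseness results themselves.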

Main results

Denote the class of bounded functions on Z by

Technical lemmas

Proofs of main results

Proof of Theorem 2

Proof of Lemma 1

Proof of Lemma 3

Summary and conclusions

Journal: Journal of Statistical Distributions and Applications (2021) 8:13
Publication Date: Aug 6, 2021
Citations: 8
License type: open-access
