Abstract

Mixture of experts (MoE) models are widely applied to conditional probability density estimation problems. We demonstrate the richness of the class of MoE models by proving denseness results in Lebesgue spaces when the input and output variables are both compactly supported. We further prove an almost uniform convergence result when the input is univariate. Auxiliary lemmas are proved regarding the richness of the soft-max gating function class and its relationship to the class of Gaussian gating functions.

Highlights

  • Mixture of experts (MoE) models are a widely applicable class of conditional probability density approximations that have been considered as solution methods across the spectrum of statistics and machine learning (Yuksel et al. 2012; Masoudnia and Ebrahimpour 2014; Nguyen and Chamroukhi 2018).

  • We say that m is a K-component MoE model with gates arising from the class GK and experts arising from E, where E is a class of probability density functions (PDFs) with support Y.

  • We address the problem of approximating f, with respect to the Lp norm, using MoE models in the soft-max and Gaussian gated classes.


Summary

Introduction

Mixture of experts (MoE) models are a widely applicable class of conditional probability density approximations that have been considered as solution methods across the spectrum of statistics and machine learning (Yuksel et al. 2012; Masoudnia and Ebrahimpour 2014; Nguyen and Chamroukhi 2018).

Suppose that the target conditional PDF $f$ is in the class $F_p = F \cap L_p$. We address the problem of approximating $f$, with respect to the $L_p$ norm, using MoE models in the soft-max and Gaussian gated classes, $M^\psi_S$ and $M^\psi_G$. The soft-max gated class is

$$
M^\psi_S = \Big\{ m^\psi_K : Z \to [0, \infty) \;\Big|\; m^\psi_K(y \mid x) = \sum_{k=1}^{K} \mathrm{Gate}_k(x)\, g_\psi(y; \mu_k, \sigma_k),\; g_\psi \in E_\psi \cap L_\infty,\; \mathrm{Gate} \in G^K_S,\; \mu_k \in Y,\; \sigma_k \in (0, \infty),\; k \in [K],\; K \in \mathbb{N} \Big\}.
$$

Related to our results are contributions regarding the approximation capabilities of the conditional expectation function of the classes $M^\psi_S$ and $M^\psi_G$ (Wang and Mendel 1992; Zeevi et al. 1998; Jiang and Tanner 1999a; Krzyzak and Schafer 2005; Mendes and Jiang 2012; Nguyen et al. 2016; Nguyen et al. 2019) and the approximation capabilities of subclasses of $M^\psi_S$ and $M^\psi_G$ with respect to the Kullback–Leibler divergence (Jiang and Tanner 1999b; Norets 2010; Norets and Pelenis 2014).
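
To make the form of a soft-max gated MoE density concrete, the following is a minimal Python sketch of a K-component model with a univariate input and output. Several details are assumptions made for illustration and are not taken from the text above: the gates use the common parameterization Gate_k(x) = exp(a_k + b_k x) / Σ_l exp(a_l + b_l x), the experts are location-scale densities built from a standard normal ψ, and the compact-support restriction on the output is ignored. The function names (`softmax_gates`, `gaussian_expert`, `moe_density`) are hypothetical.

```python
import math


def softmax_gates(x, a, b):
    """Return the K soft-max gate values at scalar input x.

    a and b are assumed per-component intercepts and slopes
    (an illustrative parameterization, not taken from the source).
    """
    scores = [ak + bk * x for ak, bk in zip(a, b)]
    top = max(scores)  # subtract the maximum score for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_expert(y, mu, sigma):
    """Location-scale expert g(y; mu, sigma) with a standard normal psi (assumed)."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def moe_density(y, x, a, b, mus, sigmas):
    """Evaluate the conditional density sum_k Gate_k(x) * g(y; mu_k, sigma_k)."""
    gates = softmax_gates(x, a, b)
    return sum(
        g * gaussian_expert(y, mu, sigma)
        for g, mu, sigma in zip(gates, mus, sigmas)
    )


# Hypothetical parameters for a three-component model.
a = [0.0, 1.0, -1.0]
b = [1.0, -2.0, 0.5]
mus = [-1.0, 0.0, 2.0]
sigmas = [0.5, 1.0, 0.3]

value = moe_density(0.25, 0.3, a, b, mus, sigmas)
```

Because the gates are positive and sum to one at every x, the mixture is itself a conditional PDF in y whenever each expert is a PDF, which is what makes the class a natural candidate for approximating a target conditional density.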

Main results
Technical lemmas
Proofs of main results
Proof of Theorem 2
Proof of Lemma 1
Proof of Lemma 3
Summary and conclusions