Abstract

The maximum mean discrepancy (MMD) has found numerous applications in statistics and machine learning, among which is its use as a penalty in the Wasserstein auto-encoder (WAE). In this paper, we derive closed-form expressions for estimating the Gaussian kernel-based MMD between a given distribution and the standard multivariate normal distribution. These expressions reveal a connection to the Baringhaus–Henze–Epps–Pulley (BHEP) statistic of the Henze–Zirkler test and provide further insight into the MMD. We introduce a standardized version of the MMD as a penalty for the WAE training objective, allowing for better interpretability of MMD values and better comparability across different hyperparameter settings. Next, we propose code normalization, the use of batch normalization at the code layer, which has the benefits of making kernel-width selection easier, reducing the training effort, and preventing outliers in the aggregate code distribution. Our experiments on synthetic and real data show that the analytic formulation improves over the commonly used stochastic approximation of the MMD, and demonstrate that code normalization provides significant benefits when training WAEs.
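
For intuition, the sketch below illustrates the kind of closed-form estimator the abstract refers to: with a Gaussian kernel, the expectations against the standard normal target integrate analytically, so the cross term and the normal-normal term of the (biased, V-statistic) squared-MMD estimator need no sampling from the prior. The kernel parameterization k(x, y) = exp(-||x - y||^2 / (2 sigma2)) and the function name are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def mmd2_to_standard_normal(X, sigma2):
    """Closed-form (biased, V-statistic) estimate of the squared
    Gaussian-kernel MMD between the empirical distribution of the
    rows of X and the standard normal N(0, I_d).

    Assumed kernel: k(x, y) = exp(-||x - y||^2 / (2 * sigma2)).
    """
    n, d = X.shape

    # Term 1: average kernel value over all sample pairs (diagonal included).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    term_xx = np.exp(-sq_dists / (2.0 * sigma2)).mean()

    # Term 2: E_{Y ~ N(0, I)} k(x, Y) is available in closed form, because
    # the Gaussian kernel integrates analytically against a Gaussian density:
    #   E k(x, Y) = (sigma2 / (sigma2 + 1))^{d/2} exp(-||x||^2 / (2 (sigma2 + 1))).
    c1 = (sigma2 / (sigma2 + 1.0)) ** (d / 2.0)
    term_xy = (c1 * np.exp(-np.sum(X ** 2, axis=1)
                           / (2.0 * (sigma2 + 1.0)))).mean()

    # Term 3: E k(Y, Y') for independent Y, Y' ~ N(0, I); since
    # Y - Y' ~ N(0, 2I), this term is (sigma2 / (sigma2 + 2))^{d/2}.
    term_yy = (sigma2 / (sigma2 + 2.0)) ** (d / 2.0)

    return term_xx - 2.0 * term_xy + term_yy
```

As a sanity check, for codes actually drawn from the target, e.g. `X = np.random.randn(500, 8)`, the returned value should be close to zero, whereas a shifted or heavy-tailed code distribution yields a clearly positive value. In a WAE this quantity would be evaluated on the batch of encoder outputs and added, suitably weighted, to the reconstruction loss.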
