Abstract

In this paper, we address the problem of computing the maximal correlation functions of jointly distributed high-dimensional random variables. In such settings, the operations in the alternating conditional expectations (ACE) algorithm can only be implemented in an approximate manner, which we model as a variational ACE algorithm with noise. We study the computational behavior of this algorithm and characterize the optimal tradeoff among the learning rate, computation accuracy, and convergence rate. In addition, we establish a connection between the variational ACE algorithm and the residual learning architecture. Our results offer an interpretation of how a multi-layer residual structure benefits function learning.
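For context, the classical ACE iteration alternates between the two conditional expectation steps, renormalizing after each. The following is a minimal sketch for a discrete joint pmf; the function name `ace` and the normalization details are illustrative assumptions, not the paper's variational algorithm.

```python
import numpy as np

def ace(P, iters=200, rng=None):
    """Sketch of the alternating conditional expectations iteration
    for maximal correlation, on a discrete joint pmf P[x, y].
    Names and details are illustrative, not taken from the paper."""
    rng = np.random.default_rng(0) if rng is None else rng
    px = P.sum(axis=1)                   # marginal pmf of X
    py = P.sum(axis=0)                   # marginal pmf of Y
    f = rng.standard_normal(P.shape[0])  # random initial f(X)
    for _ in range(iters):
        # g(y) <- E[f(X) | Y = y], then center and scale to unit variance
        g = (P.T @ f) / py
        g -= py @ g
        g /= np.sqrt(py @ g**2)
        # f(x) <- E[g(Y) | X = x], same normalization
        f = (P @ g) / px
        f -= px @ f
        f /= np.sqrt(px @ f**2)
    corr = f @ (P @ g)                   # E[f(X) g(Y)], the correlation attained
    return f, g, corr
```

On a doubly symmetric binary joint pmf such as `P = [[0.4, 0.1], [0.1, 0.4]]`, the iteration recovers the maximal correlation 0.6, matching the second singular value of the normalized joint-distribution matrix. The variational algorithm studied in the paper replaces the exact conditional expectations above with approximate (noisy) versions, which is what connects it to residual learning.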
