Bayesian Ying-Yang system, best harmony learning, and five action circling

Lei Xu

doi:10.1007/s11460-010-0108-9

Abstract

First proposed in 1995 and systematically developed over the past decade, Bayesian Ying-Yang learning is a statistical approach for an intelligent system featuring two pathways, built on two complementary Bayesian representations of the joint distribution of the external observation X and its inner representation R, which can be understood from the perspective of the ancient Ying-Yang philosophy. We have q(X,R) = q(X|R)q(R) as Ying, which is primary, with its structure designed according to the tasks of the system, and p(X,R) = p(R|X)p(X) as Yang, which is secondary, with p(X) given by samples of X and the structure of p(R|X) designed from Ying according to a Ying-Yang variety preservation principle, i.e., p(R|X) is designed as a functional with q(X|R) and q(R) as its arguments. We call this pair the Bayesian Ying-Yang (BYY) system. A Ying-Yang best harmony principle is proposed for learning all the unknowns in the system, with the help of an implementation featuring a five action circling under the name of the A5 paradigm. Interestingly, it coincides with the famous ancient WuXing theory, which provides a general guide for keeping the A5 circling well balanced towards a Ying-Yang best harmony. This BYY learning provides not only a general framework that accommodates typical learning approaches from a unified perspective but also a new road that leads to improved model selection criteria, Ying-Yang alternative learning with automatic model selection, and coordinated implementation of Ying based model selection and Yang based learning regularization. This paper introduces BYY learning with a twofold purpose.
On the one hand, we introduce the fundamentals of BYY learning, including system design principles of least redundancy versus variety preservation, global learning principles of Ying-Yang harmony versus Ying-Yang matching, and local updating mechanisms of rival penalized competitive learning (RPCL) versus maximum a posteriori (MAP) competitive learning, as well as learning regularization by data smoothing and induced bias cancelation (IBC) priori. We also introduce basic implementation techniques, including apex approximation, primal gradient flow, Ying-Yang alternation, and the Sheng-Ke-Cheng-Hui law. On the other hand, we provide a tutorial on learning algorithms for a number of typical learning tasks, including Gaussian mixture, factor analysis (FA) with independent Gaussian, binary, and non-Gaussian factors, local FA, temporal FA (TFA), hidden Markov model (HMM), hierarchical BYY, three layer networks, mixture of experts, radial basis functions (RBFs), and subspace based functions (SBFs). This tutorial introduces BYY learning algorithms in comparison with typical algorithms, particularly against the benchmark of the expectation maximization (EM) algorithm for maximum likelihood learning. These algorithms are summarized in a unified Ying-Yang alternation procedure, with the major parts sharing the same expression and the differences characterized simply by a few options in some subroutines. Additionally, a new insight is provided on the ancient Chinese philosophy of Yin-Yang and WuXing from the perspective of information science and intelligent systems.
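
For orientation, the two representations named above, and the two global learning principles contrasted with them, can be written out as follows. The factorizations are taken directly from the abstract. The harmony functional H(p‖q) and the Kullback-Leibler form used for Ying-Yang matching are not stated in this abstract; they are given here as they commonly appear in the BYY literature, in their simplest form without regularization or normalization terms, and should be read as an assumption-laden sketch rather than the paper's exact definitions.

```latex
% Ying (primary) and Yang (secondary) representations of the joint distribution
\begin{align}
  \text{Ying:}\quad & q(X,R) = q(X \mid R)\, q(R), \\
  \text{Yang:}\quad & p(X,R) = p(R \mid X)\, p(X).
\end{align}

% Ying-Yang best harmony (assumed simplest form): maximize
\begin{equation}
  H(p \,\|\, q) = \int p(R \mid X)\, p(X)\,
      \ln\!\big[\, q(X \mid R)\, q(R) \,\big]\; dX\, dR .
\end{equation}

% Ying-Yang matching (assumed form): minimize the Kullback-Leibler divergence
\begin{equation}
  \mathrm{KL}(p \,\|\, q) = \int p(R \mid X)\, p(X)\,
      \ln \frac{p(R \mid X)\, p(X)}{q(X \mid R)\, q(R)}\; dX\, dR .
\end{equation}

% The two differ by the entropy of the Yang representation:
%   H(p || q) = -KL(p || q) + \int p(R|X) p(X) \ln[ p(R|X) p(X) ] dX dR .
```

Under these assumed forms, maximizing H(p‖q) differs from minimizing KL(p‖q) only by the negative entropy of the Yang machine. This is one way to see why harmony learning favors a compact inner representation in addition to a good fit.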
