Abstract

An important problem in machine learning is that, when more than two labels are used, it is very difficult to construct and optimize a group of learning functions that remain useful when the prior distribution of instances changes. To resolve this problem, semantic information G theory, Logical Bayesian Inference (LBI), and a group of Channel Matching (CM) algorithms are combined to form a systematic solution. A semantic channel in G theory consists of a group of truth functions or membership functions. Compared with the likelihood functions, Bayesian posteriors, and logistic functions typically used in popular methods, membership functions are more convenient to use and provide learning functions that do not suffer from the above problem. In LBI, every label is learned independently. For multilabel learning, we can directly obtain a group of optimized membership functions from a sufficiently large labeled sample, without preparing different samples for different labels. Furthermore, a group of Channel Matching (CM) algorithms is developed for machine learning. For the Maximum Mutual Information (MMI) classification of three classes with Gaussian distributions in a two-dimensional feature space, only 2–3 iterations are required for the mutual information between the three classes and three labels to surpass 99% of the MMI for most initial partitions. For mixture models, the Expectation-Maximization (EM) algorithm is improved to form the CM-EM algorithm, which can outperform the EM algorithm when the mixture ratios are imbalanced or when local convergence occurs. The CM iteration algorithm needs to be combined with neural networks for MMI classification in high-dimensional feature spaces. LBI requires further investigation for the unification of statistics and logic.
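
One way to see why a membership (truth) function survives a change of the instance prior is that it enters prediction only through a Bayes-like renormalization: the function itself stays fixed, and whatever prior P(x) is current is plugged in at prediction time. The sketch below is a minimal illustration of this idea; the discrete instance space, the membership values, and the priors are made-up assumptions, not data or formulas quoted from the paper.

```python
import numpy as np

# Toy discrete instance space x in {0, 1, 2, 3} (e.g., four age groups).
# A membership (truth) function T(y_j | x) for one label y_j, learned once.
# Illustrative values only; not taken from the paper.
T_yj_given_x = np.array([0.1, 0.4, 0.9, 1.0])

def semantic_prediction(prior_x, truth_fn):
    """Bayes-like prediction: combine a fixed truth/membership function
    with the *current* prior over instances and renormalize."""
    unnormalized = prior_x * truth_fn
    return unnormalized / unnormalized.sum()

# Original prior over instances and a shifted prior.
prior_old = np.array([0.4, 0.3, 0.2, 0.1])
prior_new = np.array([0.1, 0.2, 0.3, 0.4])

# The same membership function serves both priors; only P(x) is swapped.
print(semantic_prediction(prior_old, T_yj_given_x))
print(semantic_prediction(prior_new, T_yj_given_x))
```

A likelihood function P(x|y_j) estimated under prior_old would have to be re-estimated after the shift; here only the prior argument changes.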

Highlights

  • Machine learning is based on learning functions and classifiers

  • Only 2–3 iterations were required for the mutual information to surpass 99% of the Maximum Mutual Information (MMI); see the sketch after these highlights

  • The following three examples show that the Channel Matching (CM)-EM algorithm can outperform both the EM
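
The mutual information in the second highlight is the ordinary Shannon mutual information between the true classes and the assigned labels. The sketch below shows how that quantity can be computed from a joint (confusion) distribution; the 3×3 counts are made-up, and the iterative repartitioning that drives it toward the MMI is not shown.

```python
import numpy as np

def mutual_information(joint):
    """Shannon mutual information I(X;Y) in bits from a joint distribution
    P(class, label); zero entries are skipped to avoid log(0)."""
    joint = joint / joint.sum()
    p_class = joint.sum(axis=1, keepdims=True)
    p_label = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (p_class @ p_label)[mask])))

# Made-up joint counts of (true class, assigned label) for a 3-class problem.
counts = np.array([[90,  5,  5],
                   [ 8, 85,  7],
                   [ 4,  6, 90]])
print(mutual_information(counts))  # bits the labels carry about the classes
```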

Introduction

Machine learning is based on learning functions and classifiers. In 1922, Fisher [1] proposed Likelihood Inference (LI), which uses likelihood functions as learning functions and the Maximum Likelihood (ML) criterion to optimize the learning functions and classifiers (see the abbreviations in this paper). When the prior distribution P(x) (where x is an instance) is changed, the optimized likelihood function becomes invalid. Because LI cannot make use of prior knowledge, Bayesians proposed Bayesian Inference (BI) during the 1950s [2,3], which uses Bayesian posteriors as learning functions. In many cases, however, we only have prior knowledge of instances rather than of labels or model parameters, and BI is still unsatisfactory in such cases.
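
To make the LI/BI contrast concrete, the following sketch uses a one-parameter Bernoulli example (a textbook setting, not the paper's method): the Maximum Likelihood estimate depends only on the data, whereas the Bayesian posterior also uses a prior over the parameter, which is precisely the kind of prior knowledge the last sentence says we often lack; the counts and the Beta prior are assumed for illustration.

```python
from scipy.stats import beta

# Toy data: 7 successes out of 10 Bernoulli trials (made-up numbers).
n, k = 10, 7

# Likelihood Inference: the Maximum Likelihood estimate of the success
# probability ignores any prior knowledge about the parameter.
theta_ml = k / n                      # 0.7

# Bayesian Inference: a Beta(a, b) prior on the parameter gives a
# Beta(a + k, b + n - k) posterior, which is used as the learning function.
a, b = 2.0, 2.0                             # assumed prior pseudo-counts
posterior = beta(a + k, b + n - k)
theta_map = (a + k - 1) / (a + b + n - 2)   # posterior mode, 8/12 ≈ 0.667

print(theta_ml, theta_map, posterior.mean())
```

When only a prior over instances is available, neither the ML estimate nor this parameter posterior can exploit it; that is the gap the paper addresses with membership functions and LBI.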
