Abstract

Parameter learning is an important aspect of learning in Bayesian networks. Although the maximum likelihood algorithm is often effective, it suffers from overfitting when data are insufficient. To address this, prior distributions are often imposed on the model parameters. When training a Bayesian network, the parameters are optimized to fit the data; however, imposing prior distributions can reduce how well the parameters fit the data. A trade-off is therefore needed between fitting and overfitting. In this study, a new algorithm, named MiniMax Fitness (MMF), is developed to address this problem. The method comprises three main steps. First, a maximum a posteriori estimate that combines the data and the prior distribution is derived. Then, the hyper-parameters of the prior distribution are optimized to minimize the fitness between the posterior estimate and the data. Finally, the order of the posterior estimates is checked and, if necessary, adjusted to match the order of the statistical counts from the data. In addition, we introduce an improved constrained maximum entropy method, named Prior Free Constrained Maximum Entropy (PF-CME), to facilitate parameter learning when domain knowledge is available. Experiments show that the proposed methods outperform most existing parameter learning methods.
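The three steps described above can be sketched in code. The following is a hypothetical illustration, not the paper's actual MMF algorithm: it assumes a symmetric Dirichlet prior over a single multinomial parameter, searches the hyper-parameter over a bounded grid (the paper's minimization objective and its constraints differ in detail), and the function names (`map_estimate`, `minimax_fitness`) are invented for this sketch.

```python
import numpy as np

def map_estimate(counts, alpha):
    """Step 1: MAP estimate of a multinomial parameter under a
    symmetric Dirichlet(alpha) prior (hypothetical sketch)."""
    num = np.clip(counts + alpha - 1.0, 1e-12, None)
    return num / num.sum()

def log_likelihood(theta, counts):
    """Log-likelihood of the data counts under the estimate theta,
    used here as the 'fitness' between estimate and data."""
    return float(np.sum(counts * np.log(theta)))

def minimax_fitness(counts, alphas=np.linspace(1.0, 10.0, 91)):
    """Step 2: choose the hyper-parameter that MINIMIZES the fitness
    of the MAP estimate to the data, over a bounded grid.
    Step 3: re-order the estimate so that its rank order matches
    the rank order of the statistical counts."""
    best_alpha = min(
        alphas,
        key=lambda a: log_likelihood(map_estimate(counts, a), counts),
    )
    theta = map_estimate(counts, best_alpha)
    # Step 3: assign the sorted probabilities back in the order of the counts
    # (a no-op for a symmetric prior, but it demonstrates the order check).
    order = np.argsort(counts)
    adjusted = np.empty_like(theta)
    adjusted[order] = np.sort(theta)
    return adjusted
```

For example, with `counts = np.array([8.0, 1.0, 1.0])` the returned estimate keeps the most frequent state most probable while being noticeably flatter than the maximum likelihood estimate `[0.8, 0.1, 0.1]`, which is the intended anti-overfitting effect.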

Highlights

  • A Bayesian network (BN) [1] is a joint probability distribution model representing a set of stochastic variables

  • To make full use of sample data and domain knowledge, we present an improved constrained maximum entropy method, called the Prior Free Constrained Maximum Entropy (PF-CME) method

  • We describe the parameter learning problem in Bayesian networks and propose a method to address overfitting when the available data are insufficient


Summary

Introduction

A Bayesian network (BN) [1] is a joint probability distribution model representing a set of stochastic variables. A BN consists of a directed acyclic graph, which represents the dependence relationships between the variables, and a numerical section, which specifies the conditional probability distribution of each variable. Imposing certain prior distributions decreases the likelihood of the parameters and reduces the fitness between parameters and data. Apart from imposing quantitative priors, qualitative domain knowledge can also improve parameter estimation. To utilize both data and domain knowledge, we further introduce an improved constrained maximum entropy method. Compared with the traditional constrained maximum entropy method, ours does not require domain experts to specify a prior strength, which is hard to provide and has a considerable effect on the parameter estimation. The remainder of the paper is organized as follows: in Section 2, work related to parameter learning and applications of the Minimax algorithm is introduced.
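As a rough illustration of estimating a distribution under qualitative constraints without a prior-strength parameter, the sketch below maximizes entropy subject to linear inequality constraints, using a quadratic penalty and gradient ascent on softmax parameters. Everything here is an assumption made for illustration: the paper's PF-CME additionally incorporates the sample data, and the penalty scheme, learning rate, and function names are invented for this sketch.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def constrained_max_entropy(n, constraints, steps=4000, lr=0.05, penalty=20.0):
    """Maximize the entropy of an n-state distribution theta subject to
    linear constraints a @ theta >= 0, by penalized gradient ascent on
    unconstrained softmax parameters with a numerical gradient.
    Hypothetical sketch, not the paper's PF-CME algorithm."""
    def objective(z):
        t = softmax(z)
        entropy = -np.sum(t * np.log(t + 1e-12))
        violation = sum(max(0.0, -float(a @ t)) for a in constraints)
        return entropy - penalty * violation ** 2
    z = np.zeros(n)
    eps = 1e-6
    for _ in range(steps):
        grad = np.zeros(n)
        for k in range(n):  # central-difference gradient (n is small)
            dz = np.zeros(n)
            dz[k] = eps
            grad[k] = (objective(z + dz) - objective(z - dz)) / (2 * eps)
        z += lr * grad
    return softmax(z)
```

For example, `constrained_max_entropy(3, [np.array([1.0, -2.0, 0.0])])` encodes the qualitative constraint "state 0 is at least twice as probable as state 1" and returns the most uniform distribution that approximately satisfies it; no prior-strength value is needed, only the constraint itself.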

Related Works
Preliminaries
Parameter Learning
Inequality Relationship
MiniMax Fitness Method
Prior Free Constrained Maximum Entropy Method
Experiments
Experiment Setting
Parameter Learning under Different Sample Sizes
Parameter Learning under Different Constraint Sizes
Conclusions