Abstract

Bayesian networks are probabilistic models that represent complex distributions in a modular way and have become very popular in many fields. There are many methods to build Bayesian networks from a random sample of independent and identically distributed observations. However, many observational studies are designed using some form of clustered sampling that introduces correlations between observations within the same cluster and ignoring this correlation typically inflates the rate of false positive associations. We describe a novel parameterization of Bayesian networks that uses random effects to model the correlation within sample units and can be used for structure and parameter learning from correlated data without inflating the Type I error rate. We compare different learning metrics using simulations and illustrate the method in two real examples: an analysis of genetic and non-genetic factors associated with human longevity from a family-based study, and an example of risk factors for complications of sickle cell anemia from a longitudinal study with repeated measures.

Highlights

  • Learning Bayesian Networks from Independent and Identically Distributed Observations

  • A BN is a vector of random variables Y = (Y1, ..., Yv) whose joint probability distribution factorizes according to the local and global Markov properties of an associated directed acyclic graph (DAG) [13,14,15]

  • There are well established approaches to structure learning of BNs [6,7,13] that use either exact Bayesian criteria based on the marginal likelihood p(D|M) = ∫ p(D|θ, M) p(θ|M) dθ, or asymptotic criteria such as AIC = −2 log p(D|θ̂) + 2p, or BIC = −2 log p(D|θ̂) + log(n) p, where D denotes the sample of size n, M denotes the BN structure, θ is a vector of p model parameters, p(D|θ, M) and p(θ|M) denote the likelihood function and the prior distribution of the parameters, and θ̂ is the maximum likelihood estimate of θ
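The asymptotic criteria above are straightforward to compute once a model has been fit. The following sketch evaluates AIC and BIC for two hypothetical candidate structures; the log-likelihood values and parameter counts are made up for illustration, and smaller scores favor a model:

```python
import math

def aic(loglik, p):
    """Akaike information criterion: -2 * log-likelihood + 2 * (number of parameters)."""
    return -2.0 * loglik + 2 * p

def bic(loglik, p, n):
    """Bayesian information criterion: -2 * log-likelihood + log(n) * (number of parameters)."""
    return -2.0 * loglik + math.log(n) * p

# Hypothetical comparison on a sample of size n = 100:
# structure A attains log-likelihood -120.0 with 4 parameters,
# structure B attains -118.5 but needs 7 parameters.
n = 100
score_a = (aic(-120.0, 4), bic(-120.0, 4, n))
score_b = (aic(-118.5, 7), bic(-118.5, 7, n))
print(score_a, score_b)  # both criteria penalize B's extra parameters
```

Note how BIC's log(n) penalty grows with the sample size, so it favors sparser structures than AIC as n increases.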


Summary

Learning Bayesian Networks from Correlated Data. Received: 02 October 2015; Accepted: 08 April 2016; Published: 05 May 2016.

We describe a novel parameterization of Bayesian networks that uses random effects to model the correlation within sample units and can be used for structure and parameter learning from correlated data without inflating the Type I error rate. It is well known that ignoring the correlation between observations can impact the false positive rates of regression methods [10], and the same problem is likely to persist when using BNs. As an example, Fig. 1 illustrates the effect of ignoring the correlation between observations when learning the network structure using three common model selection metrics. We extend mixed-effects regression models to BNs and present the results of simulation studies that describe the inflation of the Type I error rate due to ignoring correlated data, and compare different model selection metrics that can be used for learning mixed-effects BNs. We illustrate our proposed approach in two real data examples.
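The key mechanism behind the mixed-effects parameterization is that a shared random intercept induces correlation among observations from the same sample unit. The simulation below is an illustrative sketch (not the paper's model): it generates clustered data y_ij = b_i + e_ij with a per-cluster random effect b_i, and checks that the empirical within-cluster correlation matches the theoretical intraclass correlation σ_b² / (σ_b² + σ_e²). All parameter values are assumptions chosen for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
n_clusters, per_cluster = 500, 4
sigma_b, sigma_e = 1.0, 1.0  # random-effect and residual SDs (illustrative values)

# y_ij = b_i + e_ij : the shared random intercept b_i correlates observations within cluster i
b = rng.normal(0.0, sigma_b, size=(n_clusters, 1))
y = b + rng.normal(0.0, sigma_e, size=(n_clusters, per_cluster))

# Theoretical intraclass correlation: sigma_b^2 / (sigma_b^2 + sigma_e^2) = 0.5 here
pairs = [(y[:, i], y[:, j]) for i in range(per_cluster) for j in range(i + 1, per_cluster)]
icc_hat = float(np.mean([np.corrcoef(a, c)[0, 1] for a, c in pairs]))
print(round(icc_hat, 2))  # close to the theoretical value 0.5
```

Treating the 2000 generated observations as i.i.d. would understate the effective sample size, which is what inflates false positive rates in naive analyses.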

Background
Simulation Studies
Discussion and Conclusions
Findings
Additional Information

