Learning genetic epistasis using Bayesian network scoring criteria.

Xia Jiang, M Michael Barmada, Richard E Neapolitan, Shyam Visweswaran

doi:10.1186/1471-2105-12-89

Abstract

Background

Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. Recently, machine-learning and data mining methods have been developed for learning epistatic relationships from data. A well-known combinatorial method that has been successfully applied for detecting epistasis is Multifactor Dimensionality Reduction (MDR). Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. Each of these data sets was generated from a model that associates two SNPs with a disease and includes 18 unrelated SNPs. For each data set, BNMBL and MDR were used to score all 2-SNP models, and BNMBL learned significantly more correct models. In real data sets, we ordinarily do not know the number of SNPs that influence phenotype. BNMBL may not perform as well if we also scored models containing more than two SNPs. Furthermore, a number of other BN scoring criteria have been developed. They may detect epistatic interactions even better than BNMBL.

Although BNs are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best or even well when we try learning the correct model without knowledge of the number of SNPs in that model.

Results

We evaluated the performance of 22 BN scoring criteria using 28,000 simulated data sets and a real Alzheimer's GWAS data set. Our results were surprising in that the Bayesian scoring criterion with large values of a hyperparameter called α performed best. This score performed better than other BN scoring criteria and MDR at recall using simulated data sets, at detecting the hardest-to-detect models using simulated data sets, and at substantiating previous results using the real Alzheimer's data set.

Conclusions

We conclude that representing epistatic interactions using BN models and scoring them using a BN scoring criterion holds promise for identifying epistatic genetic variants in data. In particular, the Bayesian scoring criterion with large values of a hyperparameter α appears more promising than a number of alternatives.
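To make the idea of scoring candidate SNP models concrete, the following is a minimal sketch, not the authors' implementation. It assumes the Bayesian scoring criterion takes the BDeu (Bayesian Dirichlet equivalent uniform) form, which the abstract does not spell out, and it assumes SNP nodes are parentless, so only the disease-node term differs between candidate models. The function names, the three-genotype coding (0, 1, 2), and the toy data are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations, product
from math import lgamma


def bdeu_log_score(genotypes, phenotype, snps, alpha=1.0, n_geno=3, n_pheno=2):
    """Log BDeu score of the disease node with the given SNPs as its parents.

    Assumption: SNP nodes have no parents, so their own score terms are the
    same for every candidate model and are left out here.
    """
    q = n_geno ** len(snps)        # number of joint genotype configurations
    a_j = alpha / q                # prior mass per genotype configuration
    a_jk = alpha / (q * n_pheno)   # prior mass per (configuration, phenotype) cell

    cell = Counter()    # N_jk: samples per configuration and phenotype
    config = Counter()  # N_j: samples per configuration
    for row, y in zip(genotypes, phenotype):
        j = tuple(row[i] for i in snps)
        cell[(j, y)] += 1
        config[j] += 1

    # Unobserved configurations and empty cells contribute exactly 0.
    score = sum(lgamma(a_j) - lgamma(a_j + n) for n in config.values())
    score += sum(lgamma(a_jk + n) - lgamma(a_jk) for n in cell.values())
    return score


def rank_models(genotypes, phenotype, sizes=(1, 2), alpha=1.0):
    """Score every SNP subset of the requested sizes, best model first."""
    n_snps = len(genotypes[0])
    scored = [
        (bdeu_log_score(genotypes, phenotype, snps, alpha), snps)
        for k in sizes
        for snps in combinations(range(n_snps), k)
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy data: SNPs 0 and 1 jointly determine the phenotype,
# SNP 2 is unrelated noise.
toy_genotypes = [g for g in product(range(3), repeat=3) for _ in range(10)]
toy_phenotype = [(g[0] + g[1]) % 2 for g in toy_genotypes]

for score, snps in rank_models(toy_genotypes, toy_phenotype, sizes=(1, 2, 3), alpha=9.0):
    print(snps, round(score, 2))
```

Because models of different sizes are ranked on the same scale, a sketch like this does not need to be told how many SNPs are in the true model, which is the setting the abstract is concerned with. The value of alpha used above is arbitrary and is not taken from the paper.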

Highlights

Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases
Although Bayesian networks (BNs) are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best, or even well, when we try to learn the correct model without knowing the number of single nucleotide polymorphisms (SNPs) in that model
We provide the results of experiments that evaluate the various BN scores and Multifactor Dimensionality Reduction (MDR) using simulated data and a real genome-wide association study (GWAS) data set

Summary

Introduction

Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. In such interactions, the individual loci exhibit weak marginal effects; sometimes they may exhibit none. Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. A number of other BN scoring criteria have been developed, and they may detect epistatic interactions even better than BNMBL. Although BNs are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best, or even well, when we try to learn the correct model without knowing the number of SNPs in that model.

Journal: BMC Bioinformatics
Publication Date: Mar 31, 2011
Citations: 91
License type: CC BY 2.0
