Abstract

Background: Single nucleotide polymorphisms (SNPs) in genes from distinct pathways are associated with breast cancer risk. Identifying possible SNP-SNP interactions in genome-wide case–control studies is an important task when investigating genetic factors that influence common complex traits, and the effects of these interactions need to be characterized. Observing the complex interplay between SNPs in high-dimensional combinations remains computationally and methodologically challenging. An improved branch and bound algorithm with feature selection (IBBFS) is introduced to identify SNP combinations with a maximal difference in allele frequencies between the case and control groups in breast cancer, i.e., the high/low risk combinations of SNPs.

Results: Real data from 220 breast cancer cases and 334 controls were used to test IBBFS and identify significant SNP combinations. The odds ratio (OR) was used as a quantitative measure of the cancer risk associated with multiple-SNP combinations, in order to identify the complex biological relationships underlying the progression of breast cancer, i.e., the most likely SNP combinations. In the low risk groups, the estimated OR of the best SNP combinations with genotypes is significantly smaller than 1 (between 0.165 and 0.657) for specific combinations of the tested SNPs. In the high risk groups, the estimated OR of the predicted SNP combinations with genotypes is significantly greater than 1 (between 2.384 and 6.167) for specific combinations of the tested SNPs.

Conclusions: This study proposes an effective high-speed method for analyzing SNP-SNP interactions in breast cancer association studies. A number of important SNPs are found to be significant for the high/low risk groups. They can thus be considered potential predictors of breast cancer association.
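To illustrate how an odds ratio separates low risk from high risk combinations, here is a minimal Python sketch. It assumes a simple 2x2 table (carriers versus non-carriers of one SNP genotype combination, in cases versus controls) and a Woolf log-based 95% confidence interval. The function name `odds_ratio` and the carrier counts are hypothetical and are not taken from the study; only the group sizes (220 cases, 334 controls) come from the abstract.

```python
import math


def odds_ratio(case_with, case_without, control_with, control_without, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using the Woolf log method.

    All four cell counts must be positive; z=1.96 gives a 95% interval.
    """
    or_value = (case_with * control_without) / (case_without * control_with)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(
        1 / case_with + 1 / case_without + 1 / control_with + 1 / control_without
    )
    lower = math.exp(math.log(or_value) - z * se)
    upper = math.exp(math.log(or_value) + z * se)
    return or_value, lower, upper


# Group sizes from the abstract; carrier counts are hypothetical.
n_cases, n_controls = 220, 334
case_with, control_with = 40, 30

or_value, lower, upper = odds_ratio(
    case_with, n_cases - case_with, control_with, n_controls - control_with
)
print(f"OR = {or_value:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval lying entirely above 1 would suggest a high risk combination, and one lying entirely below 1 a low risk combination; an interval that contains 1 gives no evidence either way at this level.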

Highlights

  • Single nucleotide polymorphisms (SNPs) in genes from distinct pathways are associated with breast cancer risk

  • SNPruler is a statistical method for identifying single nucleotide polymorphism (SNP) combinations; it uses the Chi-square test to design the bound in the original Branch and Bound algorithm

  • This study proposes a method based on statistical epistasis and an improved branch and bound algorithm combined with feature selection (IBBFS) to explore combinations of SNP-SNP interactions in a breast cancer association study
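As a rough illustration of the Chi-square test mentioned in the highlights, the sketch below computes the Pearson Chi-square statistic for a 2x2 case/control table of one SNP combination. It does not reproduce the bound used by SNPruler or IBBFS; the function name `chi_square_2x2` and the counts are hypothetical, and 3.841 is the standard critical value for one degree of freedom at the 0.05 level.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square statistic (no continuity correction) for the table

                  with combination   without combination
        cases            a                   b
        controls         c                   d
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the table carries no association information.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical counts: 40 of 220 cases and 30 of 334 controls carry the combination.
chi2 = chi_square_2x2(40, 180, 30, 304)
CRITICAL_0_05_DF1 = 3.841  # Chi-square critical value, 1 degree of freedom, alpha = 0.05
print(f"chi2 = {chi2:.3f}, significant at 0.05: {chi2 > CRITICAL_0_05_DF1}")
```

A larger statistic means a larger difference in combination frequency between cases and controls, which is the kind of quantity a branch and bound search can use to discard unpromising combinations early.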


Summary

Results

For all other combinations of SNPs, the p-value was greater than 0.05. These experimental results indicate that the proposed IBBFS method can handle combinations of multiple SNPs and determine the best combination of two to seven SNPs in both the low and high risk categories. The experiments show that IBBFS has great potential for identifying complex biological relationships among cancer processes during the development of breast cancer.

