Abstract
Background: We propose a novel Markov Blanket-based repeated-fishing strategy (MBRFS) in an attempt to increase the power of the existing Markov Blanket method (DASSO-MB) while maintaining its advantages in omic data analysis.
Results: Both simulations and real data analyses were conducted to assess its performance against other methods, including the χ2 test with Bonferroni and Benjamini-Hochberg (B-H) adjustment, the least absolute shrinkage and selection operator (LASSO), and DASSO-MB. A series of simulation studies showed that the true discovery rate (TDR) of the proposed MBRFS was always close to zero under the null hypothesis (odds ratio = 1 for each SNP), with excellent stability, in all three scenarios: independent phenotype-related SNPs without linkage disequilibrium (LD) around them, correlated phenotype-related SNPs without LD around them, and phenotype-related SNPs with strong LD around them. As expected, under different odds ratios and minor allele frequencies (MAFs), MBRFS always performed best in capturing the true phenotype-related biomarkers, with a higher Matthews correlation coefficient (MCC), in all three scenarios. More importantly, because MBRFS uses the repeated-fishing strategy, it still captures more phenotype-related SNPs with minor effects when phenotype-related SNPs are non-significant under the χ2 test after Bonferroni correction. Analyses of real omics data, including GWAS, DNA methylation, gene expression and metabolite data, indicated that the proposed MBRFS always detected relatively reasonable biomarkers.
Conclusions: Our proposed MBRFS can accurately capture the true phenotype-related biomarkers with a reduced false negative rate when the phenotype-related biomarkers are independent or correlated, as well as when phenotype-related biomarkers are associated with non-phenotype-related ones.
Electronic supplementary material: The online version of this article (doi:10.1186/s12863-016-0358-5) contains supplementary material, which is available to authorized users.
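As a reading aid for the evaluation metrics named in the abstract, the sketch below computes the Matthews correlation coefficient (MCC) and the false discovery rate (FDR) from the counts of a marker-selection result. It assumes the standard textbook definitions; the abstract does not state the paper's exact formulas, and the function names and example counts are hypothetical.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Standard definition (assumed, not quoted from the paper):
    (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def fdr(tp, fp):
    """False discovery rate: share of selected markers that are false."""
    selected = tp + fp
    if selected == 0:
        return 0.0
    return fp / selected


# Hypothetical example: 100 SNPs, 10 truly phenotype-related.
# A method selects 10 SNPs, of which 8 are true and 2 are false.
example_mcc = mcc(tp=8, fp=2, tn=88, fn=2)
example_fdr = fdr(tp=8, fp=2)
print(round(example_mcc, 4), example_fdr)
```

MCC uses all four cells of the confusion matrix, so it stays informative when true markers are a small fraction of all markers, which is the usual situation in omic data.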
Highlights
We propose a novel Markov Blanket-based repeated-fishing strategy (MBRFS) in an attempt to increase the power of the existing Markov Blanket method (DASSO-MB) while maintaining its advantages in omic data analysis
MBRFS had the highest overall true discovery rate (TDR), with an acceptable false discovery rate (FDR), in scheme 2 and in scheme 3
In scheme 4, MBRFS and the least absolute shrinkage and selection operator (LASSO) had relatively higher overall TDR; LASSO's overall TDR appeared slightly higher than that of MBRFS, but LASSO also had the highest FDR
Summary
We propose a novel Markov Blanket-based repeated-fishing strategy (MBRFS) in an attempt to increase the power of the existing Markov Blanket method (DASSO-MB) while maintaining its advantages in omic data analysis. High-throughput omic platforms, such as SNP arrays, expression arrays and mass spectrometry, are commonly used in large-scale, population-level systems biology and systems epidemiology studies. These omic techniques have made it feasible to accumulate a wealth of genetic, transcriptomic, proteomic and metabolomic data for studying health and disease in breadth and depth at the human population level.
Li et al. BMC Genetics (2016) 17:51
[…] customarily regarded as the universal criterion to claim the significance of each marker [2]. These arbitrary correction methods inevitably increase the number of false negatives: for instance, phenotype-related biomarkers whose χ2 test p value is larger than the Bonferroni cutoff will never be identified as positive. The most commonly used Bonferroni adjustment is also less powerful when markers are highly correlated (e.g., linkage disequilibrium, LD, between SNPs), a situation that is ubiquitous in big omics data analysis.
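To illustrate how a Bonferroni cutoff can miss markers with modest p values, here is a minimal sketch comparing Bonferroni with the Benjamini-Hochberg (B-H) step-up procedure on a small set of p values. The p values are invented for illustration and the function names are hypothetical; this is not the MBRFS or DASSO-MB procedure.

```python
def bonferroni_reject(pvals, alpha=0.05):
    """Reject H0 for each test whose p value is at most alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg_reject(pvals, alpha=0.05):
    """B-H step-up: find the largest rank k with p_(k) <= (k / m) * alpha,
    then reject the k smallest p values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * alpha:
            k_max = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


# Invented p values for 10 markers (illustration only).
pvals = [0.0001, 0.004, 0.012, 0.019, 0.03, 0.2, 0.5, 0.7, 0.8, 0.9]

n_bonf = sum(bonferroni_reject(pvals))
n_bh = sum(benjamini_hochberg_reject(pvals))
print("Bonferroni rejections:", n_bonf)
print("B-H rejections:", n_bh)
```

With these values the Bonferroni cutoff is 0.05 / 10 = 0.005, so markers with p = 0.012 and p = 0.019 are not declared significant, although B-H retains them. Markers missed in this way are the kind of false negative the text describes.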