Abstract

One of the biggest challenges in analyzing high-throughput omics data in biological studies is extracting information relevant to specific biological mechanisms of interest while limiting the number of false-positive findings. Because so many candidate targets and mechanisms are tested, chance alone causes computational approaches to yield many false positives that cannot easily be distinguished from relevant biological findings without costly, and often infeasible, biological experiments. Here we introduce and apply an integrative bioinformatics approach, Biologically Anchored Knowledge Expansion (BAKE), which uses sequential statistical analysis and literature mining to identify highly relevant network genes and effectively remove false-positive findings. Applying BAKE to genomic expression data collected from mouse (Mus musculus) adipocytes during insulin resistance progression, we uncovered the transcription factor Krueppel-like Factor 4 (KLF4) as a regulator of early insulin signaling. We experimentally confirmed that KLF4 controls the expression of two key insulin signaling molecules, Insulin Receptor Substrate 2 (IRS2) and Tuberous Sclerosis Complex 2 (TSC2).

Highlights

  • High-throughput profiling techniques are widely used to decipher biological and human disease mechanisms with genomics, transcriptomics, proteomics, epigenomics, metabolomics, and other omics approaches [1,2,3,4]

  • Various computational network mining and modeling approaches have been developed in recent years to reconstruct complex biological networks from high-throughput molecular data in genomics, proteomics, metabolomics, and other omics-based studies

  • In silico network mining techniques often identify numerous false network interactions that can only be conclusively detected by performing extensive biological experiments


Summary

Introduction

High-throughput profiling techniques are widely used to decipher biological and human disease mechanisms with genomics, transcriptomics, proteomics, epigenomics, metabolomics, and other omics approaches [1,2,3,4]. Other widely used network inference approaches include regression-based algorithms for known sets of transcription factors and target genes [3, 19], shrinkage techniques [20], and Network Component Analysis [21]. Several widely used gene network inference methods are available as user-friendly modules in the GP-DREAM software [24], such as the Correlation approach, which deduces high-confidence transcription factor-target gene pairs [22]. While these computational network reconstruction techniques have proven useful in many studies, their limitations are well recognized. Subjective expert knowledge determines a small number of candidate networks for further investigation.
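To make the idea of correlation-based inference concrete, the following is a minimal sketch of how transcription factor-target gene pairs might be ranked by expression correlation. It is not the GP-DREAM module or the BAKE method. The function names, the 0.8 threshold, and the toy expression values are all assumptions chosen for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def rank_tf_target_pairs(expression, tfs, threshold=0.8):
    """Return (tf, target, r) tuples with |r| >= threshold, strongest first.

    `threshold` is a hypothetical cutoff, not a value from the paper.
    """
    pairs = []
    for tf in tfs:
        for gene, profile in expression.items():
            if gene == tf:
                continue  # skip self-pairs
            r = pearson(expression[tf], profile)
            if abs(r) >= threshold:
                pairs.append((tf, gene, r))
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


# Toy, made-up expression values across five samples (not data from the study).
expression = {
    "TF_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_1": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with TF_A
    "gene_2": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as TF_A rises
    "gene_3": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly related to TF_A
}
pairs = rank_tf_target_pairs(expression, tfs=["TF_A"])
```

With thousands of genes, a fixed cutoff like this one will let some pairs through by chance alone, which is the false-positive problem described in the abstract.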
