A Machine Learning Approach to Predict Gene Regulatory Networks in Seed Development in Arabidopsis.

Ying Ni, Haitham Elmarakeby, Song Li, Eva Collakova, Ruth Grene, Lenwood S. Heath, Delasa Aghamirzaie

doi:10.3389/fpls.2016.01936

Open Access

Journal: Frontiers in Plant Science
Publication Date: Dec 23, 2016
Citations: 41
License type: CC BY

Affiliation: Virginia Tech

Abstract

Gene regulatory networks (GRNs) provide a representation of relationships between regulators and their target genes. Several methods for GRN inference, both unsupervised and supervised, have been developed to date. Because regulatory relationships consistently reprogram in diverse tissues or under different conditions, GRNs inferred without specific biological contexts are of limited applicability. In this report, a machine learning approach is presented to predict GRNs specific to developing Arabidopsis thaliana embryos. We developed the Beacon GRN inference tool to predict GRNs occurring during seed development in Arabidopsis based on a support vector machine (SVM) model. We developed both global and local inference models and compared their performance, demonstrating that local models are generally superior for our application. Using both the expression levels of the genes expressed in developing embryos and prior known regulatory relationships, GRNs were predicted for specific embryonic developmental stages. The targets that are strongly positively correlated with their regulators are mostly expressed at the beginning of seed development. Potential direct targets were identified based on a match between the promoter regions of these inferred targets and the cis elements recognized by specific regulators. Our analysis also provides evidence for previously unknown inhibitory effects of three positive regulators of gene expression. The Beacon GRN inference tool provides a valuable model system for context-specific GRN inference and is freely available at https://github.com/BeaconProjectAtVirginiaTech/beacon_network_inference.git.
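The promoter-matching step described in the abstract can be illustrated with a minimal sketch. It is not the Beacon implementation: the gene names, promoter sequences, and motif string below are hypothetical placeholders, and exact string matching on both strands is assumed as the simplest possible definition of a "match" (real cis-element scans often use degenerate consensus sequences or position weight matrices).

```python
# Illustrative sketch only: sequences, gene names, and the motif are
# hypothetical placeholders, not data or code from the Beacon tool.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(promoter, motif):
    """Return sorted (0-based position, strand) hits, overlaps included."""
    promoter = promoter.upper()
    motif = motif.upper()
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # a palindromic motif would otherwise be counted twice
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)
    return sorted(hits)


def candidate_direct_targets(predicted_targets, promoters, motif):
    """Keep predicted targets whose promoter has at least one motif hit."""
    return [
        gene
        for gene in predicted_targets
        if gene in promoters and find_motif(promoters[gene], motif)
    ]


# Hypothetical promoter fragments keyed by hypothetical gene names.
promoters = {
    "gene_a": "TTACATGCATGG",
    "gene_b": "AAAATTTTGGGG",
    "gene_c": "GGTGCATGAA",
}
predicted = ["gene_a", "gene_b", "gene_c", "gene_d"]  # gene_d has no promoter entry
direct = candidate_direct_targets(predicted, promoters, "CATGCA")
print(direct)
```

In this toy input, `gene_b` is dropped because its promoter contains the motif on neither strand, and `gene_d` is dropped because no promoter sequence is available for it.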

Highlights

Elucidating the topology of gene regulatory networks (GRNs) is fundamental to understanding how transcription factors (TFs) regulate gene expression and the complexity of interdependencies among genes.
We have developed the Beacon GRN inference tool, a supervised machine learning method based on a local support vector machine (SVM) approach, to infer complex GRNs representing gene-regulator interactions in developing Arabidopsis embryos from gene expression data and known regulatory relationships used as prior knowledge.
The local SVM approach with a radial basis function (RBF) kernel was chosen based on a performance comparison with the global SVM approach and the unsupervised context likelihood of relatedness (CLR) algorithm.
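The difference between local and global models can be sketched in terms of how the training data are assembled. The sketch below follows the way these terms are commonly used in the supervised network-inference literature and is an assumption about the setup, not a description of Beacon's code: a local model trains one classifier per regulator on target expression profiles, while a global model trains a single classifier on regulator-target pairs. All gene names and expression values are hypothetical, and treating every pair without a known edge as a negative example is a simplification made for illustration. The RBF kernel is written out explicitly as k(x, y) = exp(-gamma * ||x - y||^2); the SVM training itself is omitted.

```python
import math

# Illustrative sketch only: names and values are hypothetical placeholders.


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def local_training_sets(expression, regulators, known_edges):
    """Build one dataset per regulator; features are candidate-target profiles."""
    datasets = {}
    for tf in regulators:
        genes = [g for g in expression if g != tf]
        features = [expression[g] for g in genes]
        # Simplifying assumption: pairs without a known edge are negatives.
        labels = [1 if (tf, g) in known_edges else 0 for g in genes]
        datasets[tf] = (genes, features, labels)
    return datasets


def global_training_set(expression, regulators, known_edges):
    """Build one dataset over all pairs; features concatenate both profiles."""
    pairs, features, labels = [], [], []
    for tf in regulators:
        for g in expression:
            if g == tf:
                continue
            pairs.append((tf, g))
            features.append(expression[tf] + expression[g])
            labels.append(1 if (tf, g) in known_edges else 0)
    return pairs, features, labels


# Hypothetical expression profiles across three developmental stages.
expression = {
    "TF1": [1.0, 2.0, 3.0],
    "TF2": [3.0, 2.0, 1.0],
    "G1": [1.1, 2.1, 2.9],
    "G2": [0.5, 0.5, 0.5],
}
regulators = ["TF1", "TF2"]
known_edges = {("TF1", "G1"), ("TF2", "G2")}

local_sets = local_training_sets(expression, regulators, known_edges)
pairs, global_X, global_y = global_training_set(expression, regulators, known_edges)

print(len(local_sets), "local datasets;", len(pairs), "global examples")
```

In the local setting each regulator gets its own decision boundary, so a classifier can only be trained for regulators that already have known targets. The global setting pools all pairs into one model, at the cost of doubling the feature length.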

Summary

Introduction

Elucidating the topology of gene regulatory networks (GRNs) is fundamental to understanding how transcription factors (TFs) regulate gene expression and the complexity of interdependencies among genes. Potential TF-target relationships can be identified experimentally by using chromatin immunoprecipitation with DNA microarrays (ChIP-chip; Junker et al., 2010), ChIP-sequencing (Park, 2009), or protein-binding microarrays (Berger and Bulyk, 2009). Many computational approaches have been proposed to infer GRNs from gene expression levels. With the advent of high-throughput transcriptome methods such as RNA sequencing (RNA-seq), computational inference of regulatory networks on a genome scale has become more feasible. Inference through computational methods is convenient, and there are various ways to validate the results (Schrynemackers et al., 2014; Patel and Wang, 2015).
