Measurement of Conditional Relatedness Between Genes Using Fully Convolutional Neural Network.

Yan Wang,Shuangquan Zhang,Sen Yang,Lili Yang,Qin Ma,Yuan Tian

doi:10.3389/fgene.2019.01009

Abstract

Measuring conditional relatedness, the degree of relation between a pair of genes in a certain condition, is a basic but difficult task in bioinformatics, as traditional co-expression analysis methods rely on co-expression similarities, well known with high false positive rate. Complement with prior-knowledge similarities is a feasible way to tackle the problem. However, classical combination machine learning algorithms fail in detection and application of the complex mapping relations between similarities and conditional relatedness, so a powerful predictive model will have enormous benefit for measuring this kind of complex mapping relations. To this need, we propose a novel deep learning model of convolutional neural network with a fully connected first layer, named fully convolutional neural network (FCNN), to measure conditional relatedness between genes using both co-expression and prior-knowledge similarities. The results on validation and test datasets show FCNN model yields an average 3.0% and 2.7% higher accuracy values for identifying gene–gene interactions collected from the COXPRESdb, KEGG, and TRRUST databases, and a benchmark dataset of Xiao-Yong et al. research, by grid-search 10-fold cross validation, respectively. In order to estimate the FCNN model, we conduct a further verification on the GeneFriends and DIP datasets, and the FCNN model obtains an average of 1.8% and 7.6% higher accuracy, respectively. Then the FCNN model is applied to construct cancer gene networks, and also calls more practical results than other compared models and methods. A website of the FCNN model and relevant datasets can be accessed from https://bmbl.bmi.osumc.edu/FCNN.

Highlights

Conditional relatedness between a pair of genes is a degree of the relation between two genes in a certain condition, e.g. in cancer tissues or inflammation, implying the probability of these genes jointly involved in a biological process under such cell environment (Wang et al, 2019)
The fully convolutional neural network (FCNN) model is trained by minimizing the Binary Cross Entropy loss (BCEloss) with RMSprop optimizer (Zhao et al, 2019) in the light of the area under the curve (AUC) of validation and test datasets
The FCNN model obtains the best performance among machine learning models, which proves deep-learning-based models can more effectively detect the complex map relations between similarities and conditional relatedness than traditional algorithms, such as fully connected neural network (FNN), Multi-Features Relatedness (MFR), logistic regression (LR), linear discriminant analysis (LDA), support vector machine (SVM), and so on

Summary

Introduction

Conditional relatedness between a pair of genes is a degree of the relation between two genes in a certain condition, e.g. in cancer tissues or inflammation, implying the probability of these genes jointly involved in a biological process under such cell environment (Wang et al, 2019). It is different from gene–gene interaction meaning a 0/1 (non-interacting/interacting) binary relation between a pair of genes. Expression similarities have been successfully applied in measuring conditional relatedness for constructing gene networks, on which Poliakov et al identify disease-related metabolic pathways (Poliakov et al, 2014). When acquiring gene expression data, it often contains some inevitable noise, which causes errors in the calculation of conditional relatedness, well known as high false positive rate

Methods

Results

Discussion

Conclusion