Biomarker Prioritisation and Power Estimation Using Ensemble Gene Regulatory Network Inference.

Furqan Aziz, Laura Bravo-Merodio, John A Williams, Dominic Russ, Georgios V Gkoutos, Animesh Acharjee

doi:10.3390/ijms21217886

Abstract

Inferring the topology of a gene regulatory network (GRN) from gene expression data is a challenging but important step towards a better understanding of gene regulation. Key challenges include working with noisy data and dealing with more genes than samples. Although a number of methods have been proposed to infer the structure of a GRN, there are large discrepancies among the inference algorithms they adopt, which makes meaningful comparison difficult. In this study, we used two methods, MIDER (Mutual Information Distance and Entropy Reduction) and PLSNET (Partial Least Squares based feature selection), to infer the structure of a GRN directly from data, and we computationally validated our results. Both methods were applied to gene expression datasets from inflammatory bowel disease (IBD), pancreatic ductal adenocarcinoma (PDAC), and acute myeloid leukaemia (AML) studies. In each case, gene regulators were identified. For example, in the IBD dataset the UGT1A family genes were identified as key regulators, whereas in the PDAC dataset the SULF1 and THBS2 genes were highlighted. We further demonstrate that an ensemble-based approach that combines the output of the MIDER and PLSNET algorithms can infer the structure of a GRN from data with higher accuracy. We have also estimated the number of samples required for potential future validation studies. Here, we present an analysis framework that supports not only the prediction of candidate regulator genes for potential validation experiments but also the estimation of the number of samples required for these experiments.
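The abstract does not say how the required number of samples was estimated. As a rough illustration of what such an estimate involves, the sketch below uses the standard normal-approximation formula for a two-group comparison of means, n = 2 (z_{1-alpha/2} + z_{power})^2 / d^2 per group. The function name `samples_per_group`, the default significance level and power, and the choice of a two-group design are all assumptions made for this example, not the authors' procedure.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided, two-group
    comparison of means (normal approximation).

    effect_size is the standardised mean difference (Cohen's d).
    This is an illustrative formula, not the method used in the paper.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power

    # n = 2 * ((z_alpha + z_power) / d)^2, rounded up to a whole sample
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical effect sizes, chosen only to show how n grows as d shrinks.
    for d in (1.0, 0.5, 0.25):
        print(d, samples_per_group(d))
```

Because the formula depends on the inverse square of the effect size, halving the expected effect roughly quadruples the number of samples needed per group.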

Highlights

Network reverse engineering is the process of inferring the structure of a network from gene expression data through computational techniques
In order to determine whether the output of PLSNET and MIDER can be combined to produce a more accurate gene regulatory network (GRN), we identified and removed those network edges where the target gene is a gene that has been identified as a potential regulatory gene by PLSNET
We have explored the structure of a GRN inferred from the gene expression profiles of three different datasets: inflammatory bowel disease (IBD), pancreatic ductal adenocarcinoma (PDAC), and acute myeloid leukaemia (AML)
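The edge-removal step described in the highlights can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the inferred network is represented as a list of `(source, target, weight)` tuples, the candidate regulators are given as a set of gene names, and the function name `remove_edges_into_regulators` and the gene names and weights in the example are hypothetical. It is not the authors' implementation.

```python
def remove_edges_into_regulators(edges, regulators):
    """Drop every edge whose target gene is a candidate regulator.

    edges: iterable of (source, target, weight) tuples (assumed layout
           for an inferred network, e.g. an edge list with scores).
    regulators: iterable of gene names flagged as potential regulators
                (in the text, these come from PLSNET).
    Returns a new list; the input is left unchanged.
    """
    regulator_set = set(regulators)
    return [
        (source, target, weight)
        for source, target, weight in edges
        if target not in regulator_set
    ]


if __name__ == "__main__":
    # Hypothetical edge list and regulator set, for illustration only.
    inferred_edges = [
        ("geneA", "geneB", 0.9),
        ("geneB", "geneA", 0.4),  # removed: target geneA is a regulator
        ("geneA", "geneC", 0.7),
    ]
    candidate_regulators = {"geneA"}
    for edge in remove_edges_into_regulators(inferred_edges, candidate_regulators):
        print(edge)
```

Edges that leave a candidate regulator are kept; only edges pointing into one are discarded, so the filtered network treats the flagged genes purely as sources.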

Summary

Introduction

Network reverse engineering is the process of inferring the structure of a network from gene expression data through computational techniques. Inferring the structure of a network is challenging for several reasons. The main challenge is that the number of genes in a given dataset is high, whereas the number of available samples is typically low. In addition, a gene regulatory network (GRN) is usually inferred directly from expression data that are, more often than not, noisy. For these reasons, it is highly unlikely that a single best method exists for every case [1,2]. Different methods highlight different interactions, and even state-of-the-art methods generally achieve very low prediction accuracy [3].


Journal: International Journal of Molecular Sciences
Publication Date: Oct 23, 2020
Citations: 4
License type: CC BY 4.0
