An Ensemble Learning Framework for Detecting Protein Complexes From PPI Networks.

Rongquan Wang,Caixia Wang,Huimin Ma

doi:10.3389/fgene.2022.839949

Open Access

Abstract

Detecting protein complexes is one of the keys to understanding the principles of cellular organization and processes. With the development of high-throughput experiments and computing science, it has become possible to detect protein complexes by computational methods. However, most computational methods are based on either unsupervised learning or supervised learning. Unsupervised learning-based methods do not need training datasets, but they can only detect protein complexes with one or a few topological structures. Supervised learning-based methods can detect protein complexes with different topological structures. However, they usually rely on a single type of training model, and a single model generalizes poorly. Therefore, we propose an Ensemble Learning Framework for Detecting Protein Complexes (ELF-DPC) within protein-protein interaction (PPI) networks to address these challenges. ELF-DPC first constructs a weighted PPI network by combining topological and biological information. Second, it mines protein complex cores using a protein complex core mining strategy that we designed. Third, it obtains an ensemble learning model by integrating structural modularity and a trained voting regressor model. Finally, it extends the protein complex cores and forms protein complexes using a graph heuristic search strategy. The experimental results demonstrate that ELF-DPC performs better than twelve state-of-the-art approaches. Moreover, functional enrichment analysis shows that ELF-DPC can detect biologically meaningful protein complexes. The code and dataset are freely available at https://github.com/RongquanWang/ELF-DPC.
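
To illustrate the voting-regressor idea in the third step, here is a minimal sketch in plain Python, assuming the common formulation in which the ensemble prediction is the unweighted average of the base regressors' predictions. The two base models, the single "density" feature, and the training scores are invented for illustration; they are not the models or features used by ELF-DPC.

```python
from statistics import mean
from typing import Callable, List, Sequence

Regressor = Callable[[float], float]

def fit_mean(xs: Sequence[float], ys: Sequence[float]) -> Regressor:
    """Baseline model: always predict the mean training target."""
    y_bar = mean(ys)
    return lambda x: y_bar

def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Regressor:
    """Ordinary least-squares line through the (x, y) pairs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: y_bar + slope * (x - x_bar)

def voting_regressor(models: List[Regressor]) -> Regressor:
    """Average the predictions of all base models (unweighted voting)."""
    return lambda x: mean([m(x) for m in models])

# Hypothetical training data: subgraph density -> complex-likeness score.
densities = [0.0, 0.5, 1.0]
scores = [0.0, 0.5, 1.0]
ensemble = voting_regressor(
    [fit_mean(densities, scores), fit_line(densities, scores)]
)
print(ensemble(1.0))
```

Averaging several differently biased models is what lets an ensemble smooth over the weaknesses of any single model, which is the motivation the abstract gives for moving beyond one training model.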

Highlights

Most complex systems in the real world, such as biological systems and human society, can be represented as complex networks
A protein-protein interaction (PPI) network is generally described as a weighted graph G = (V, E, W), where V is a set of proteins, E is a set of interactions, and W is an n × n (n = |V|) matrix that represents the reliability of protein pairs in PPI networks
The Structural Modularity of Protein Complexes: Based on the within-module and between-module edges of subgraphs and the size of the subgraph, we present a new formal definition of protein complexes in PPI networks (Wu et al., 2009; Yu et al., 2011; Nepusz et al., 2012; Wang et al., 2019)
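
As a minimal sketch of the graph notation above (not the authors' implementation), the following Python builds the weight matrix W for a toy network and sums the weights of the edges inside a candidate subgraph and of the edges leaving it. The protein names, the weights, and the helper names `build_weight_matrix` and `edge_weight_split` are hypothetical, and the paper's actual structural modularity definition is not reproduced here.

```python
from typing import Dict, List, Set, Tuple

def build_weight_matrix(
    proteins: List[str],
    interactions: Dict[Tuple[str, str], float],
) -> List[List[float]]:
    """Return the symmetric n x n matrix W for G = (V, E, W)."""
    index = {p: i for i, p in enumerate(proteins)}
    n = len(proteins)
    w = [[0.0] * n for _ in range(n)]
    for (a, b), weight in interactions.items():
        i, j = index[a], index[b]
        w[i][j] = weight
        w[j][i] = weight  # interactions are treated as undirected
    return w

def edge_weight_split(
    proteins: List[str],
    w: List[List[float]],
    module: Set[str],
) -> Tuple[float, float]:
    """Sum edge weights within `module` and between it and the rest."""
    inside = [i for i, p in enumerate(proteins) if p in module]
    outside = [i for i, p in enumerate(proteins) if p not in module]
    within = sum(w[i][j] for i in inside for j in inside if i < j)
    between = sum(w[i][j] for i in inside for j in outside)
    return within, between

# Toy data, invented for illustration only.
proteins = ["P1", "P2", "P3", "P4"]
interactions = {
    ("P1", "P2"): 0.5,
    ("P2", "P3"): 0.25,
    ("P1", "P3"): 0.75,
    ("P3", "P4"): 0.5,
}
W = build_weight_matrix(proteins, interactions)
within, between = edge_weight_split(proteins, W, {"P1", "P2", "P3"})
print(within, between)  # 1.5 0.5
```

A subgraph with a high within-module weight relative to its between-module weight is densely connected internally and weakly connected to the rest of the network, which is the intuition behind modularity-style scores.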

Summary

INTRODUCTION

Most complex systems in the real world, such as biological systems and human society, can be represented as complex networks. Community detection in complex networks is essential in many fields; it aims to identify clusters that have high internal connectivity and are well separated from the rest of the network. A protein complex is a group of proteins that interact at the same time and place. With the development of high-throughput experimental methods, many PPI networks have been produced, and these networks usually have small-world, scale-free, and modular characteristics. They can be formulated as graphs in which the nodes represent proteins and the edges represent interactions. More details are given in the Related Work section.

Related Work

Observations and Contributions

Datasets

Terminologies

Methods

Neighborhood Affinity

F-Measure

Evaluation Metrics

Jaccard

Parameter Selection

Comparison With State-of-the-art Algorithms

Comparison With Functional Enrichment Analysis

Case Study

CONCLUSION

DATA AVAILABILITY STATEMENT

Journal: Frontiers in genetics	Publication Date: Feb 24, 2022
Citations: 7	License type: CC BY 4.0

Similar Papers

Protein complex detection via weighted ensemble clustering based on Bayesian nonnegative matrix factorization.
Le Ou-Yang ... Vladimir N Uversky
PloS one | VOL. 8
02 May 2013

Protein complex detection in PPI networks based on data integration and supervised learning method.
Feng Ying Yu ... Zhi Hao Yang
BMC Bioinformatics | VOL. 16 Suppl 12
25 Aug 2015

A Novel Core-Attachment-Based Method to Identify Dynamic Protein Complexes Based on Gene Expression Profiles and PPI Networks.
Qianghua Xiao ... Fang‐Xiang Wu
PROTEOMICS | VOL. 19
20 Feb 2019

DPCMNE: Detecting Protein Complexes From Protein-Protein Interaction Networks Via Multi-Level Network Embedding.
Xiangmao Meng ... Ju Xiang
IEEE/ACM Transactions on Computational Biology and Bioinformatics | VOL. 19
08 Jan 2021
