Markov Clustering Research Articles

Many pragmatic clustering methods have been developed to group data vectors or objects into clusters so that the objects in one cluster are very similar and objects in different clusters are distinct based on some similarity measure. The availability of time course data has motivated researchers to develop methods, such as mixture and mixed-effects modelling approaches, that incorporate the temporal information contained in the shape of the trajectory of the data. However, there is still a need for time-course clustering methods that can adequately deal with inhomogeneous clusters (some clusters are quite large and others are quite small). Here we propose two such methods, iterative hierarchical clustering (IHC) and iterative pairwise-correlation clustering (IPC). We evaluate and compare the proposed methods to the Markov Cluster Algorithm (MCL) and the generalised mixed-effects model (GMM) using simulation studies and an application to a time course gene expression data set from a study of human subjects who were challenged with a live influenza virus. We identify four types of temporal gene response modules to influenza infection in humans, i.e., single-gene modules (SGM), small-size modules (SSM), medium-size modules (MSM) and large-size modules (LSM). The LSM contain genes that perform various fundamental biological functions that are consistent across subjects. The SSM and SGM contain genes that perform either different or similar biological functions, that have complex temporal responses to the virus, and that are unique to each subject. We show that the temporal responses of the genes in the LSM have either simple patterns with a single peak or trough (a consequence of transient stimuli) or sustained or state-transitioning patterns pertaining to developmental cues, and that these modules can differentiate the severity of disease outcomes. Additionally, the size of gene response modules follows a power-law distribution with a consistent exponent across all subjects, which reveals the presence of universality in the underlying biological principles that generated these modules.
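For readers unfamiliar with the MCL baseline named in the abstract above, the following is a minimal sketch of the generic Markov clustering loop (alternating expansion and inflation of a column-stochastic matrix). It is not the implementation used in the study: the function names, the inflation value of 2, the added self-loops, the convergence tolerance and the membership threshold are all illustrative assumptions.

```python
def normalize_columns(matrix):
    """Rescale every column so that it sums to 1 (column-stochastic)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Bare-bones MCL on a small dense adjacency matrix (list of lists).

    All parameter defaults are assumptions chosen for illustration.
    """
    n = len(adjacency)
    # Add self-loops, a common MCL convention, then normalise the columns.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        # Expansion: square the matrix (two steps of the random walk).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: element-wise power, then renormalise each column.
        inflated = normalize_columns(
            [[value ** inflation for value in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Rows with a non-zero diagonal are attractors; the non-zero columns of
    # an attractor row are the members of its cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6)
                for i in range(n) if m[i][i] > 1e-6}
    return sorted(sorted(cluster) for cluster in clusters)


# Toy graph (hypothetical): two triangles joined by the single edge 2-3.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_cluster(graph))
```

On this toy graph one would expect the two triangles to come out as separate clusters. Raising the inflation parameter generally yields finer clusters, and lowering it yields coarser ones.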

Proteins are vital biological molecules driving many fundamental cellular processes. They rarely act alone, but form interacting groups called protein complexes. The study of protein complexes is a key goal in systems biology. Recently, large protein-protein interaction (PPI) datasets have been published and a plethora of computational methods that provide new ideas for the prediction of protein complexes have been implemented. However, most of the methods suffer from two major limitations: first, they do not account for proteins participating in multiple functions, and second, they are unable to handle weighted PPI graphs. Moreover, the problem remains open, as existing algorithms and tools are insufficient in terms of predictive metrics. In the present paper, we propose gradually expanding neighborhoods with adjustment (GENA), a new algorithm that gradually expands neighborhoods in a graph starting from highly informative "seed" nodes. GENA considers proteins as multifunctional molecules, allowing them to participate in more than one protein complex. In addition, GENA accepts weighted PPI graphs by using a weighted evaluation function for each cluster. In experiments with datasets from Saccharomyces cerevisiae and human, GENA outperformed Markov clustering, restricted neighborhood search, and clustering with overlapping neighborhood expansion, three state-of-the-art methods for computationally predicting protein complexes. Seven PPI networks and seven evaluation datasets were used in total. GENA outperformed existing methods in 16 out of 18 experiments, achieving an average improvement of 5.5% when the maximum matching ratio metric was used. Our method was able to discover functionally homogeneous protein clusters and uncover important network modules in a Parkinson expression dataset. When used on the human networks, around 47% of the detected clusters were enriched in gene ontology (GO) terms with depth higher than five in the GO hierarchy. In the present manuscript, we introduce a new method for the computational prediction of protein complexes by making the realistic assumption that proteins participate in multiple protein complexes and cellular functions. Our method can detect accurate and functionally homogeneous clusters.

Articles published on Markov Clustering

HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks.

Hybrid Approach To Abstractive Summarization

A proximity-based graph clustering method for the identification and application of transcription factor clusters

MOCASSIN-prot: a multi-objective clustering approach for protein similarity networks.

Multilevel Flow-Based Markov Clustering for Design Structure Matrices

An effective approach to detecting both small and large complexes from protein-protein interaction networks

Cyber pirating and Detection of malicious activities p2p botnets using Markov cluster algorithm

An optimal parallel implementation of Markov Clustering based on the coordination of CPU and GPU

Investigating eLearning Research Trends in Iran via Automatic Semantic Network Generation

Detection of protein complex from protein-protein interaction network using Markov clustering

DECIPHERING THE ACTION MECHANISM OF INDONESIA HERBAL DECOCTION IN THE TREATMENT OF TYPE II DIABETES USING A NETWORK PHARMACOLOGY APPROACH

Mining Major Transitions of Chronic Conditions in Patients with Multiple Chronic Conditions.

Sentence Clustering: A Comparative study

Network Community Detection Based on the Physarum-Inspired Computational Framework.

A network-pathway based module identification for predicting the prognosis of ovarian cancer patients.

Evidence of extensive positive selection acting on cherry (Prunus avium L.) resistance gene analogs (RGAs)

Correlation-based iterative clustering methods for time course data: The identification of temporal gene response modules for influenza infection in humans

Eco-modular product architecture identification and assessment for product recovery

A modified two-stage Markov clustering algorithm for large and sparse networks

Predicting overlapping protein complexes from weighted protein interaction graphs by gradually expanding dense neighborhoods.
