Abstract

The analysis of biological networks is an important task in the life sciences. Most biological interactions can be modeled using graphical networks in which arcs represent probabilistic relationships between nodes, or variables. Such models help scientists analyze complex data sets, test candidate interaction networks, and understand the relationships under study. These studies face two major problems: selecting the most probable interaction topologies and clustering the associated data. In this paper, we model biological interactions with a mixture of multivariate Gaussian distributions. We then introduce a new algorithm for parameter estimation and data clustering. This algorithm, called Graphical Expectation Maximization (GEM), extends the EM algorithm by taking into account several decomposable graph structures and by using an original initialization technique. Using this algorithm, we propose a model selection procedure based on the Bayesian Information Criterion. The accuracy of the proposed method is demonstrated in a simulation study of a signal transduction network of the epidermal growth factor receptor (EGFR) protein. Moreover, we apply the proposed model selection procedure to a real data set to choose the most appropriate interaction graphs for the microbial community in the infant gut.
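To make the combination of EM fitting and BIC-based selection concrete, the following is a minimal sketch, not the GEM algorithm itself. It runs plain EM on an unconstrained Gaussian mixture and compares candidate models by BIC. Everything in it is an assumption made for illustration: the function names (`log_gauss`, `fit_gmm`, `bic`), the quantile-based initialization, the synthetic two-cluster data, and the choice of the number of components as the quantity being selected. The paper instead selects among decomposable graph structures and uses its own initialization technique, neither of which is reproduced here.

```python
import numpy as np


def log_gauss(X, mean, cov):
    """Log density of N(mean, cov) evaluated at each row of X."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)            # shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z ** 2, axis=0))


def mixture_log_terms(X, weights, means, covs):
    """Per-point, per-component log(weight * density), shape (n, k)."""
    return np.stack(
        [np.log(w) + log_gauss(X, m, c) for w, m, c in zip(weights, means, covs)],
        axis=1,
    )


def fit_gmm(X, k, n_iter=200, reg=1e-6):
    """Plain EM for a k-component Gaussian mixture; returns the log-likelihood."""
    n, d = X.shape
    # Illustrative deterministic start: points at evenly spaced quantiles
    # of the first coordinate (an assumption, not the paper's initialization).
    order = np.argsort(X[:, 0])
    idx = order[(np.linspace(0.0, 1.0, k + 2)[1:-1] * (n - 1)).astype(int)]
    means = X[idx].copy()
    base_cov = np.atleast_2d(np.cov(X.T)) + reg * np.eye(d)
    covs = np.array([base_cov for _ in range(k)])
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        log_p = mixture_log_terms(X, weights, means, covs)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: responsibility-weighted updates of weights, means, covariances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)

    log_p = mixture_log_terms(X, weights, means, covs)
    return float(np.sum(np.logaddexp.reduce(log_p, axis=1)))


def bic(X, k):
    """BIC = -2 log L + p log n, with p the number of free parameters."""
    n, d = X.shape
    n_params = k * d + k * d * (d + 1) // 2 + (k - 1)
    return -2.0 * fit_gmm(X, k) + n_params * np.log(n)


# Synthetic data: two well-separated Gaussian clusters in two dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 2)),
    rng.normal(6.0, 1.0, size=(200, 2)),
])

bics = {k: bic(X, k) for k in (1, 2, 3)}
best_k = min(bics, key=bics.get)   # lower BIC is preferred under this sign convention
print(bics, best_k)
```

In this sketch the candidate models differ only in the number of components. In a graph-based setting such as the one described above, the same comparison would presumably be made across candidate graph structures, with the parameter count adjusted to reflect the constraints each graph imposes on the covariance matrices.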
