Abstract

Many complex processes, from protein folding to neuronal network dynamics, can be described as stochastic exploration of a high-dimensional energy landscape. Although efficient algorithms for cluster detection in high-dimensional spaces have been developed over the last two decades, considerably less is known about the reliable inference of state transition dynamics in such settings. Here we introduce a flexible and robust numerical framework to infer Markovian transition networks directly from time-independent data sampled from stationary equilibrium distributions. We demonstrate the practical potential of the inference scheme by reconstructing the network dynamics for several protein-folding transitions, gene-regulatory network motifs, and HIV evolution pathways. The predicted network topologies and relative transition time scales agree well with direct estimates from time-dependent molecular dynamics data, stochastic simulations, and phylogenetic trees, respectively. Owing to its generic structure, the framework introduced here will be applicable to high-throughput RNA and protein-sequencing datasets, and future cryo-electron microscopy (cryo-EM) data.

Highlights

  • Many complex processes, from protein folding to neuronal network dynamics, can be described as stochastic exploration of a high-dimensional energy landscape

  • We approximate the empirical probability density function (PDF) by using the expectation maximization algorithm to fit a Gaussian mixture model (GMM) in a space of sufficiently large dimension d (Methods and Fig. 1a)

  • Although some proteins can be described through effective one-dimensional reaction coordinates[5,7,52,53], the accurate description of their diffusive dynamics over the full microscopic energy landscape requires many degrees of freedom[54,55]
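
The Gaussian-mixture step named in the second highlight can be illustrated with a minimal sketch. It is not the authors' implementation: it works in one dimension rather than the d-dimensional space used in the paper, it initializes the component means at evenly spaced data quantiles (a choice made here for simplicity), and it assumes a Boltzmann-type relation E(x) = -log p(x) with the temperature scale set to 1 to turn the fitted density into an energy landscape. All function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1D normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=200):
    """Fit a k-component 1D Gaussian mixture with expectation maximization.

    Returns (weights, means, variances). Initialization (assumption of this
    sketch): means at evenly spaced quantiles, shared variance, equal weights.
    """
    n = len(data)
    ordered = sorted(data)
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    var0 = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [var0] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])

        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor keeps the variance positive

    return weights, means, variances


def energy(x, weights, means, variances):
    """Effective energy E(x) = -log p(x), assuming a Boltzmann density with kT = 1."""
    p = sum(w * gaussian_pdf(x, m, v)
            for w, m, v in zip(weights, means, variances))
    return -math.log(p)


def sample_two_wells(n, seed=0):
    """Synthetic test data: samples alternate between wells at -2 and +2."""
    import random
    rng = random.Random(seed)
    return [rng.gauss(-2.0 if i % 2 == 0 else 2.0, 0.5) for i in range(n)]


if __name__ == "__main__":
    samples = sample_two_wells(2000)
    w, m, v = fit_gmm_1d(samples, k=2)
    print("weights:", w)
    print("means:", m)
    print("barrier height near x=0:", energy(0.0, w, m, v) - energy(m[0], w, m, v))
```

In this toy setting the two fitted components sit at the minima of the effective landscape, and the energy difference between a minimum and the region between the wells gives a rough barrier height. How the paper converts such a landscape into transition rates is described in its Methods and is not reproduced here.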


Summary

Introduction

Many complex processes, from protein folding to neuronal network dynamics, can be described as stochastic exploration of a high-dimensional energy landscape. Although much progress has been made in dimensionality reduction[25,26,27] and the reconstruction of effective energy landscapes in these settings[3,13,16,17,28], the problem of inferring dynamical information such as protein-folding or mutation pathways and rates from instantaneous ensemble data remains a major challenge. The agreement of our inferred results with two separate sets of time-dependent measurements suggests that the inference of complex transition networks via reconstructed energy landscapes can provide a viable and often more efficient alternative to traditional time-series estimates, especially as new experimental techniques will offer unprecedented access to high-dimensional ensemble data.

