Noisy time-series data from experiments such as Förster resonance energy transfer, patch clamp, and force spectroscopy are commonly analyzed with either hidden Markov models or step-finding algorithms, both of which detect discrete transitions. Hidden Markov models, including their extensions to infinite state spaces, inherently assume exponential (technically, geometric) holding time distributions, which biases inferred step locations toward those consistent with geometric holding times, especially in sparse and/or noisy data. In contrast, existing step-finding algorithms, while free of this constraint, often rely on ad hoc metrics, such as various information criteria, to penalize the number of steps recovered in time traces, and otherwise rely on approximate greedy algorithms to identify putative global optima. Here, instead, we devise a robust and general probabilistic (Bayesian) step-finding tool that neither relies on ad hoc metrics to penalize step numbers nor assumes geometric holding times in each state. Because the number of steps in a time series is a priori unknown, we treat it within a Bayesian nonparametric (BNP) paradigm. We find that the method developed, BNP-Step, accurately determines the number and location of transitions between discrete states without any assumed kinetic model and learns the emission distribution characteristic of each state. In doing so, we verify that BNP-Step can analyze sparser data sets containing higher noise and more closely spaced states than can otherwise be resolved by current state-of-the-art methods. What is more, BNP-Step rigorously propagates measurement uncertainty into uncertainty over state transition locations, numbers, and emission levels, as characterized by the posterior. We demonstrate the performance of BNP-Step on both synthetic data and data drawn from force spectroscopy experiments.
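To make the geometric holding time assumption concrete, the following is a minimal illustrative sketch, not part of BNP-Step. It assumes a discrete-time Markov state with a fixed per-frame probability of remaining in that state; the names `stay_prob`, `geometric_dwell_pmf`, and `simulate_dwell_times`, and the value 0.9, are hypothetical choices made for illustration only. Under that assumption, the number of frames spent in the state before leaving follows a geometric distribution, which the simulation below compares against the closed-form mean.

```python
import random


def geometric_dwell_pmf(stay_prob, k):
    """Probability that a dwell lasts exactly k frames (k >= 1) when the
    state is retained with probability stay_prob at every frame."""
    return stay_prob ** (k - 1) * (1.0 - stay_prob)


def simulate_dwell_times(stay_prob, n_dwells, seed=0):
    """Simulate dwell lengths (in frames) for a single Markov state."""
    rng = random.Random(seed)
    dwells = []
    for _ in range(n_dwells):
        k = 1  # the state is occupied for at least one frame
        while rng.random() < stay_prob:  # remain in the state for another frame
            k += 1
        dwells.append(k)
    return dwells


stay_prob = 0.9  # hypothetical self-transition probability (assumption)
dwells = simulate_dwell_times(stay_prob, n_dwells=20000)

empirical_mean = sum(dwells) / len(dwells)
theoretical_mean = 1.0 / (1.0 - stay_prob)  # mean of a geometric law on {1, 2, ...}

print(f"empirical mean dwell:   {empirical_mean:.2f} frames")
print(f"theoretical mean dwell: {theoretical_mean:.2f} frames")
```

Because the per-frame exit probability is constant, the most probable dwell length is always a single frame, and longer dwells become less likely at a fixed rate. Holding times that are, for example, peaked around some typical duration cannot be represented under this assumption.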