Modeling complex systems with large numbers of degrees of freedom has become a grand challenge over the past decades. In many situations, only a few variables are actually observed as measured time series, while the majority of variables, which potentially interact with the observed ones, remain hidden. A typical approach is then to focus on the comparatively few observed, macroscopic variables, assuming that they determine the key dynamics of the system, while the remaining ones are represented by noise. This naturally leads to an approximate, inverse modeling of such systems in terms of stochastic differential equations (SDEs), with great potential for applications ranging from biology to finance and Earth system dynamics. A well-known approach to retrieving such SDEs from small sets of observed time series is to reconstruct the drift and diffusion terms of a Langevin equation from the data-derived Kramers–Moyal (KM) coefficients. For systems where interactions between the observed and the unobserved variables are crucial, the Mori–Zwanzig (MZ) formalism allows one to derive generalized Langevin equations that contain non-Markovian terms representing these interactions. In a similar spirit, the empirical model reduction (EMR) approach has more recently been introduced. In this work, we attempt to reconstruct the dynamical equations of motion of both synthetic and real-world processes, and we compare these three approaches in terms of their capability to reconstruct the dynamics and statistics of the underlying systems. Through rigorous investigation of several synthetic and real-world systems, we confirm that the performance of the three methods strongly depends on the intrinsic dynamics of the system at hand. For instance, the statistical properties of systems exhibiting weak history-dependence but strong state-dependence of the noise forcing can be approximated better by the KM method than by the MZ and EMR approaches.
In such situations, the KM method has a considerable advantage, since it can directly approximate the state-dependent noise. However, limitations of the KM approximation arise in cases where non-Markovian effects are crucial to the dynamics of the system. In these situations, our numerical results indicate that methods that account for interactions between observed and unobserved variables through non-Markovian closure terms (i.e., the MZ and EMR approaches) perform comparatively better.
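
To make the KM reconstruction step concrete, the following is a minimal sketch of how drift and diffusion terms might be estimated from a single time series using binned conditional moments of the increments. It is an illustration under stated assumptions, not the procedure used in this work: the function names `simulate_ou` and `km_coefficients`, the bin count, the minimum sample count per bin, and the Ornstein–Uhlenbeck test process are all hypothetical choices. The second coefficient is taken here as the conditional second moment divided by the time step, so that it approximates the squared noise amplitude; other conventions include an additional factor of 1/2.

```python
import numpy as np


def simulate_ou(theta, sigma, dt, n, seed=0):
    """Euler-Maruyama path of dx = -theta * x dt + sigma dW (illustrative test data)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, np.sqrt(dt), n - 1)
    x = np.empty(n)
    x[0] = 0.0
    for i in range(n - 1):
        x[i + 1] = x[i] - theta * x[i] * dt + sigma * noise[i]
    return x


def km_coefficients(x, dt, n_bins=20, min_count=500):
    """Estimate the first two KM coefficients by binned conditional moments.

    Returns bin centers, the drift estimate <dx | x> / dt and the diffusion
    estimate <dx^2 | x> / dt (assumed convention, see lead-in). Bins holding
    fewer than `min_count` samples are left as NaN.
    """
    x = np.asarray(x, dtype=float)
    x0 = x[:-1]          # state at the start of each step
    dx = np.diff(x)      # increment over one time step
    edges = np.linspace(x0.min(), x0.max(), n_bins + 1)
    # Map each state to a bin index in [0, n_bins - 1].
    idx = np.clip(np.digitize(x0, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    drift = np.full(n_bins, np.nan)
    diffusion = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = idx == b
        if mask.sum() >= min_count:
            drift[b] = dx[mask].mean() / dt
            diffusion[b] = (dx[mask] ** 2).mean() / dt
    return centers, drift, diffusion


# Usage: recover the linear drift and constant noise level of the test process.
theta, sigma, dt = 1.0, 0.5, 0.01
x = simulate_ou(theta, sigma, dt, n=200_000)
centers, drift, diffusion = km_coefficients(x, dt)
ok = ~np.isnan(drift)
slope = np.polyfit(centers[ok], drift[ok], 1)[0]
print("fitted drift slope (should be close to -theta):", slope)
print("median diffusion (should be close to sigma**2):", np.nanmedian(diffusion))
```

Because the estimate conditions only on the current state, a sketch of this kind can capture state-dependent noise directly but carries no memory of past states, which is the limitation that the MZ and EMR approaches address.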