A Toolkit for Data-Driven Discovery of Governing Equations in High-Noise Regimes

Charles B Delahunt, J Nathan Kutz

doi:10.1109/access.2022.3159335

Abstract

We consider the data-driven discovery of governing equations from time-series data in the limit of high noise. The algorithms developed form an extensive toolkit of methods for circumventing the deleterious effects of noise in the context of the sparse identification of nonlinear dynamics (SINDy) framework. We offer two primary contributions, both focused on noisy data acquired from a system $\dot{\boldsymbol{x}} = \boldsymbol{f}(\boldsymbol{x})$. First, we propose, for use in high-noise settings, an extensive toolkit of critically enabling extensions for the SINDy regression method, to progressively cull functionals from an over-complete library and yield a set of sparse equations that regress to the derivative $\dot{\boldsymbol{x}}$. This toolkit includes: (regression step) weight timepoints based on estimated noise, use ensembles to estimate coefficients, and regress using FFTs; (culling step) leverage linear dependence of functionals, and restore and protect culled functionals based on Figures of Merit (FoMs). In a novel Assessment step, we define FoMs that compare model predictions to the original time-series (i.e., $\boldsymbol{x}(t)$ rather than $\dot{\boldsymbol{x}}(t)$).
These innovations can extract sparse governing equations and coefficients from high-noise time-series data (e.g., 300% added noise). For example, the method discovers the correct sparse libraries in the Lorenz system, with median coefficient estimate errors of 1%–3% (for 50% noise), 6%–8% (for 100% noise), and 23%–25% (for 300% noise). The enabling modules in the toolkit are combined into a single method, but the individual modules can be tactically applied in other equation discovery methods (SINDy or not) to improve results on high-noise data. Second, we propose a technique, applicable to any model discovery method based on $\dot{\boldsymbol{x}} = \boldsymbol{f}(\boldsymbol{x})$, to assess the accuracy of a discovered model in the context of non-unique solutions due to noisy data. Currently, this non-uniqueness can obscure a discovered model’s accuracy and thus a discovery method’s effectiveness. We describe a technique that uses linear dependencies among functionals to transform a discovered model into an equivalent form that is closest to the true model, enabling more accurate assessment of a discovered model’s correctness.
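
For orientation, the baseline that the toolkit extends can be sketched as a plain SINDy-style regression: build an over-complete library of candidate functionals, regress the derivative onto it, and progressively cull small coefficients. The sketch below is a minimal illustration on noise-free Lorenz data and is not the authors' method. It includes none of the high-noise extensions listed in the abstract (timepoint weighting, ensembles, FFT regression, FoM-based restoration). The function names, the quadratic library, the threshold of 0.1, and the use of exact derivatives are all assumptions made for the example.

```python
import numpy as np

def lorenz_rhs(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side f(x) of the Lorenz system with the standard parameters."""
    return np.array([sigma * (x[1] - x[0]),
                     x[0] * (rho - x[2]) - x[1],
                     x[0] * x[1] - beta * x[2]])

def simulate(x0, dt, n_steps):
    """Integrate the Lorenz system with a fixed-step fourth-order Runge-Kutta scheme."""
    traj = np.empty((n_steps, 3))
    x = np.asarray(x0, dtype=float)
    for i in range(n_steps):
        traj[i] = x
        k1 = lorenz_rhs(x)
        k2 = lorenz_rhs(x + 0.5 * dt * k1)
        k3 = lorenz_rhs(x + 0.5 * dt * k2)
        k4 = lorenz_rhs(x + dt * k3)
        x = x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return traj

def library(X):
    """Hypothetical over-complete library: all monomials up to degree 2."""
    x, y, z = X.T
    names = ["1", "x", "y", "z", "x^2", "xy", "xz", "y^2", "yz", "z^2"]
    theta = np.column_stack([np.ones_like(x), x, y, z,
                             x * x, x * y, x * z, y * y, y * z, z * z])
    return theta, names

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, cull small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):
            keep = ~small[:, k]
            if keep.any():
                xi[keep, k] = np.linalg.lstsq(theta[:, keep], dxdt[:, k], rcond=None)[0]
    return xi

# Noise-free demonstration (assumption: exact derivatives are available).
traj = simulate([-8.0, 8.0, 27.0], dt=0.002, n_steps=5000)
dxdt = np.array([lorenz_rhs(x) for x in traj])
theta, names = library(traj)
xi = stlsq(theta, dxdt)

for k, lhs in enumerate(["x'", "y'", "z'"]):
    terms = [f"{xi[j, k]:+.3f} {names[j]}" for j in range(len(names)) if xi[j, k] != 0.0]
    print(lhs, "=", " ".join(terms))
```

With heavy measurement noise the derivative estimates degrade and a fixed threshold of this kind tends to cull the wrong functionals, which is the failure mode the abstract's regression, culling, and assessment extensions are aimed at.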

Highlights

We apply an engineering lens to sparse identification of nonlinear dynamics (SINDy) to address the exigencies of noisy data, and describe a toolkit of novel, practically-based techniques, including, in the Regression step, weighting timepoints based on estimated noise, using ensembles to estimate coefficients, and regressing on the fast Fourier transforms (FFTs) of the derivatives and library functionals.
We offer two main contributions, both applicable to discovery methods generally, not just to SINDy.
We present a toolkit of methods to address noisy data, for data-driven discovery of governing equations.
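
The idea of regressing on FFTs of the derivatives and library functionals can be illustrated with a small sketch. Because the Fourier transform is linear, a relation $\dot{\boldsymbol{x}} = \Theta \xi$ in the time domain also holds bin by bin between the FFTs of $\dot{\boldsymbol{x}}$ and the columns of $\Theta$. The source does not spell out how the regression is set up, so everything specific here is an assumption: the low-frequency cutoff `f_cut`, the stacking of real and imaginary parts, the harmonic-oscillator test system, the noise level, and the function names are all hypothetical choices for illustration.

```python
import numpy as np

def fft_design(a, keep):
    """FFT each column along time, keep the selected bins, stack real and imaginary parts."""
    spec = np.fft.fft(a, axis=0)[keep]
    return np.vstack([spec.real, spec.imag])

def regress_time(theta, dxdt):
    """Ordinary least squares in the time domain."""
    return np.linalg.lstsq(theta, dxdt, rcond=None)[0]

def regress_fft(theta, dxdt, dt, f_cut=None):
    """Least squares on FFTs of derivatives and library columns.

    With f_cut=None all bins are used, which reproduces the time-domain fit.
    With a cutoff (an assumption of this sketch), only bins with |f| <= f_cut are used.
    """
    n = theta.shape[0]
    freqs = np.fft.fftfreq(n, d=dt)
    keep = np.ones(n, dtype=bool) if f_cut is None else np.abs(freqs) <= f_cut
    return np.linalg.lstsq(fft_design(theta, keep), fft_design(dxdt, keep), rcond=None)[0]

# Test system (assumption): harmonic oscillator x' = y, y' = -x.
dt = 0.01
t = np.arange(0.0, 100.0, dt)
clean = np.column_stack([np.cos(t), -np.sin(t)])
xi_true = np.array([[0.0, -1.0],   # row: library term x; columns: x', y'
                    [1.0, 0.0]])   # row: library term y

rng = np.random.default_rng(0)
noisy = clean + 0.5 * rng.standard_normal(clean.shape)  # additive measurement noise
theta = noisy                                           # linear library [x, y]
dxdt = np.gradient(noisy, dt, axis=0)                   # finite-difference derivative

xi_time = regress_time(theta, dxdt)
xi_full = regress_fft(theta, dxdt, dt)                  # all bins
xi_fft = regress_fft(theta, dxdt, dt, f_cut=1.0)        # low-frequency bins only

err_time = np.abs(xi_time - xi_true).max()
err_fft = np.abs(xi_fft - xi_true).max()
print("max coefficient error, time domain:", err_time)
print("max coefficient error, low-band FFT:", err_fft)
```

Using every frequency bin gives the same coefficients as the time-domain fit, so any benefit in this sketch comes from restricting the regression to the band where the signal lives, which discards most of the broadband noise in both the library and the differentiated data.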

Summary

Introduction

The derivation of governing equations for physical systems has dominated the physical and engineering sciences for centuries. It is the dominant paradigm for the modeling and characterization of physical processes, engendering rapid and diverse technological developments in every application area of the sciences. Since the mid-20th century, governing equations have become even more influential due to the rise of computers and scientific computing. The rapid evolution of sensor technologies and data-acquisition software/hardware, broadly defined, has opened new fields of exploration where governing equations are difficult to generate and/or produce. For instance, come to mind as application areas where first-principle derivations are difficult to achieve, yet data is becoming abundant and of

Journal: IEEE Access	Publication Date: Jan 1, 2022
Citations: 17	License type: CC BY 4.0
