Inspired by the attractive features of least-squares theory in many practical applications, this contribution introduces least-squares-based deep learning (LSBDL). Least-squares theory connects explanatory variables to predicted variables, called observations, through a linear(ized) model whose unknown parameters are estimated using the least-squares principle. Deep learning (DL) methods, in contrast, establish nonlinear relationships for applications where the predicted variables are unknown (nonlinear) functions of the explanatory variables. This contribution presents a DL formulation based on least-squares theory in linear models. As a data-driven method, a network is trained to construct an appropriate design matrix, whose entries are estimated using two descent optimization methods: steepest descent and Gauss–Newton. In line with interpretable and explainable artificial intelligence, LSBDL leverages well-established least-squares theory for DL applications through the following threefold objectives: (i) quality-control measures, such as the covariance matrix of the predicted outcomes, can be determined directly; (ii) established least-squares reliability theory and hypothesis testing can be applied to identify model misspecification and outlying observations; (iii) the covariance matrix of the observations can be exploited to train a network on inconsistent, heterogeneous, and statistically correlated data. Three examples demonstrate the theory. The first uses LSBDL to train coordinate basis functions for a surface-fitting problem. The second applies LSBDL to time series forecasting. The third showcases a real-world application of LSBDL to downscaling groundwater storage anomaly data. LSBDL offers opportunities in many fields, including geoscience, aviation, time series analysis, data assimilation, and data fusion of multiple sensors.
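For context, a minimal sketch of the standard weighted least-squares relations that objectives (i) and (iii) appeal to, written in notation not fixed by the abstract itself ($y$ the observations with covariance matrix $Q_y$, $A$ the design matrix constructed by the network, $x$ the unknown parameters):

```latex
% Linear(ized) observation model: observations y are linked to the
% unknown parameters x through the design matrix A, with noise e.
\[
  y = A x + e, \qquad \operatorname{E}(e) = 0, \qquad \operatorname{D}(y) = Q_y
\]
% Weighted least-squares estimate of x; using Q_y as the weight matrix
% lets heterogeneous and statistically correlated observations be
% handled consistently, as in objective (iii):
\[
  \hat{x} = \left(A^{\mathsf{T}} Q_y^{-1} A\right)^{-1} A^{\mathsf{T}} Q_y^{-1}\, y
\]
% Covariance of the predicted outcome \hat{y} = A \hat{x}, obtained
% directly by error propagation -- the quality-control measure of (i):
\[
  Q_{\hat{y}} = A \left(A^{\mathsf{T}} Q_y^{-1} A\right)^{-1} A^{\mathsf{T}}
\]
```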