Abstract
Regression of experimental or simulated data has important implications in sensitivity studies, uncertainty analysis, and prediction accuracy. The fitness of a model is highly dependent on the number of data points and the locations of the chosen points on the curve. The objective of the research is to find the best scheme for a nonlinear regression model using a fraction of total data points without losing any features or trends in the data. Six different schemes are developed by setting criteria such as equal spacing along axes, equal distance between two consecutive points, constraint in the angle of curvature, etc. A workflow is provided to summarize the entire protocol of data preprocessing, training and testing nonlinear regression models with various schemes using a simulated temperature profile from an enhanced geothermal system. It is shown that only 5% of data points are sufficient to represent the entire curve using a regression model with a proper scheme.
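Two of the criteria named above, equal spacing along an axis and equal distance between consecutive points, can be illustrated with a minimal sketch. The curve below is a synthetic exponential decline, not the simulated geothermal profile, and the function names, the axis normalization, and the nearest-point selection rule are assumptions made for illustration; the paper's own scheme definitions may differ.

```python
import numpy as np

def nearest_indices(values, targets):
    """Index of the entry in strictly increasing `values` closest to each target."""
    idx = np.clip(np.searchsorted(values, targets), 1, len(values) - 1)
    left_closer = (targets - values[idx - 1]) <= (values[idx] - targets)
    return np.unique(np.where(left_closer, idx - 1, idx))

def equal_x_indices(x, k):
    """Pick about k points equally spaced along the x axis."""
    return nearest_indices(x, np.linspace(x[0], x[-1], k))

def equal_distance_indices(x, y, k):
    """Pick about k points equally spaced along the curve (arc length).

    Both axes are rescaled to [0, 1] first (an assumption) so that
    distance is not dominated by the axis with the larger units.
    """
    xn = (x - x.min()) / (x.max() - x.min())
    yn = (y - y.min()) / (y.max() - y.min())
    arc = np.concatenate(([0.0], np.cumsum(np.hypot(np.diff(xn), np.diff(yn)))))
    return nearest_indices(arc, np.linspace(0.0, arc[-1], k))

# Synthetic decline curve (hypothetical values, for illustration only).
t = np.linspace(0.0, 30.0, 1000)          # time
temp = 150.0 + 50.0 * np.exp(-t / 5.0)    # temperature

k = int(0.05 * len(t))                     # keep 5% of the points
idx_x = equal_x_indices(t, k)
idx_d = equal_distance_indices(t, temp, k)

# Count how many selected points fall in the steep early part of the curve.
print("equal-x points with t < 5:       ", np.sum(t[idx_x] < 5.0))
print("equal-distance points with t < 5:", np.sum(t[idx_d] < 5.0))
```

On a curve like this one, the equal-distance selection places more of its points in the steep early region, while the equal-x selection spreads them uniformly in time. Either subset could then be passed to a nonlinear regression routine as training data.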
Highlights
Data extraction, processing and interpretation have become pivotal tools for making informed, risk-evaluated decisions in every industry
Time series data are widely used to forecast weather, disease outbreaks, stocks, production and more [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
We have considered data obtained from a commercial simulator for a temperature decline curve in an enhanced geothermal system [19]
Summary
Data extraction, processing and interpretation have become pivotal tools for making informed, risk-evaluated decisions in every industry. One major problem is that the data points are not evenly distributed over the time period. This is caused by the different convergence techniques used by most numerical simulators. With an adaptive time-step, the simulator initially generates data points at very small time intervals and, as the equations begin to converge, gradually increases the time-step and hence the interval between generated data points. This yields fewer data points than a fixed time-step, but even then the result may contain an unnecessarily large number of data points.
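The effect of adaptive time-stepping on data spacing can be sketched as follows. The geometric growth rule and every parameter value here are assumptions chosen for illustration; a real simulator adjusts its step from its own convergence behaviour, which this sketch does not model.

```python
import numpy as np

def adaptive_times(t_end, dt0, growth, dt_max):
    """Output times for a step that starts at dt0 and grows by `growth`
    after every step, capped at dt_max (hypothetical step-control rule)."""
    times = [0.0]
    dt = dt0
    while times[-1] < t_end:
        times.append(min(times[-1] + dt, t_end))
        dt = min(dt * growth, dt_max)
    return np.array(times)

t_end, dt0 = 30.0, 0.001
t_adapt = adaptive_times(t_end, dt0, growth=1.2, dt_max=0.5)
steps = np.diff(t_adapt)

# Number of points a fixed step of dt0 would produce over the same period.
n_fixed = int(round(t_end / dt0)) + 1

print("adaptive points:", len(t_adapt), " fixed-step points:", n_fixed)
print("smallest step:", steps.min(), " largest step:", steps.max())
```

The adaptive schedule yields far fewer points than the fixed step, but they are clustered at early times and sparse later, which is the uneven distribution described above.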