Correlation analysis is a crucial step before regression modeling for data prediction, because it reveals whether the relationship between predictors and responses is linear or nonlinear. This step is often needed to select the most suitable regression model. The major challenge is that linear correlation measures are appropriate only for linear relationships, and few measures exist for assessing nonlinearity. A further and more demanding issue is the influence of unknown or unmeasured predictors on regression model selection, which can make the outputs of both linear and nonlinear correlation measures unrealistic and inaccurate. To tackle these challenges, this paper proposes a hierarchical correlation analysis method that first indicates the influence of unknown predictors and then assists in selecting the best regressor for modeling and forecasting. The proposed method uses a linear measure, canonical correlation analysis, and a nonlinear measure, the maximal information coefficient. The correlation values derived from these measures are classified into three levels: low, moderate, and high. From these levels, three scenarios are distinguished. In the first and second scenarios, it is inferred that the response data are linearly or nonlinearly influenced by the measured predictors, respectively. In the third scenario, three conditions derived from the correlation levels suggest that the response data are not influenced by the measured predictors. To validate the applicability and effectiveness of the proposed method, it is tested on predictor and response data from real-world long-span bridges. The only measured predictor is air temperature, while the response data comprise a limited set of bridge displacements obtained from synthetic aperture radar images via remote sensing. The results indicate that the proposed method is highly effective and applicable for selecting the best regression model for prediction.
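
The limitation of linear measures noted above can be illustrated with a minimal sketch; it is not the paper's implementation. It assumes a single predictor and a single response, in which case the first canonical correlation equals the absolute Pearson correlation. The function names and the level thresholds (0.3 and 0.7) are hypothetical choices made for illustration only, and the nonlinear measure is not computed here.

```python
import math


def linear_correlation(x, y):
    """Absolute Pearson correlation of two equal-length sequences.

    For one predictor and one response this equals the first
    canonical correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return abs(sxy) / math.sqrt(sxx * syy)


def correlation_level(value, low=0.3, high=0.7):
    """Map a correlation value in [0, 1] to low, moderate, or high.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    if value < low:
        return "low"
    if value < high:
        return "moderate"
    return "high"


# Made-up data: a response that depends linearly on temperature.
temperature = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
linear_response = [1.0 - 2.0 * t for t in temperature]
linear_level = correlation_level(linear_correlation(temperature, linear_response))

# Made-up data: a symmetric quadratic dependence that a linear measure misses.
centered = [-2.0, -1.0, 0.0, 1.0, 2.0]
quadratic_response = [c ** 2 for c in centered]
quadratic_level = correlation_level(linear_correlation(centered, quadratic_response))
```

In this sketch the quadratic response is fully determined by its predictor, yet the linear measure assigns it to the lowest level, which is why a nonlinear measure is needed alongside the linear one.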