For the case in which the predictor is a time series and the response is a continuous scalar, we propose a time series regression model based on an improved PCA and the Bagging algorithm. Compared with standard PCA dimension reduction, the proposed method replaces the Pearson correlation coefficient matrix with a distance correlation coefficient matrix, which relaxes the distributional assumptions on the original variables. Because PCA is an unsupervised dimension reduction technique and the functional relationship between the principal components and the response variable is unknown, we use the Bagging algorithm to capture the information in the principal components that is related to the response. In the real-data analysis, the proposed method is compared with LASSO and PCA-based linear models, and the empirical results show that it is competitive with these baselines. Finally, because Bagging places no restriction on its base model, more accurate and flexible machine learning methods can serve as the base model for data tasks of varying complexity.
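To illustrate the substitution described above, the following is a minimal sketch of a sample distance correlation matrix that could take the place of the Pearson correlation matrix before the eigendecomposition step of PCA. It assumes the standard double-centered sample estimator of Székely, Rizzo and Bakirov; the abstract does not state which estimator is used, and the function names (`distance_correlation`, `distance_correlation_matrix`) are hypothetical and chosen only for illustration.

```python
import math


def _centered_distances(x):
    """Double-centered matrix of pairwise absolute distances for a 1-D sample."""
    n = len(x)
    d = [[abs(x[i] - x[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in d]
    grand_mean = sum(row_mean) / n
    # d is symmetric, so its column means equal its row means.
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length 1-D samples."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(v * v for row in a for v in row) / n**2
    dvar_y = sum(v * v for row in b for v in row) / n**2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        # Convention: a constant sample has zero distance correlation.
        return 0.0
    # max() guards against tiny negative values from floating-point rounding.
    return math.sqrt(max(dcov2, 0.0) / denom)


def distance_correlation_matrix(columns):
    """Symmetric matrix of pairwise distance correlations.

    `columns` is a list of equal-length 1-D samples, one per variable.
    The diagonal is set to 1.0 by convention (assumption of this sketch).
    """
    p = len(columns)
    r = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            r[i][j] = r[j][i] = distance_correlation(columns[i], columns[j])
    return r


if __name__ == "__main__":
    x = [-2.0, -1.0, 0.0, 1.0, 2.0]
    y = [v * v for v in x]  # Pearson correlation with x is zero here
    print(distance_correlation(x, y))
```

In this toy example the Pearson correlation between `x` and `y` is exactly zero, whereas the distance correlation is strictly positive, which is the kind of nonlinear dependence the modified matrix is intended to retain. The principal components would then be obtained from the eigenvectors of this matrix and passed to a Bagging ensemble with any chosen base model.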