Time series data have characteristics such as high dimensionality, excessive noise, data imbalance, etc. In the data preprocessing process, feature selection plays an important role in the quantitative analysis of multidimensional time series data. Aiming at the problem of feature selection of multidimensional time series data, a feature selection method for time series based on mutual information (MI) is proposed. One of the difficulties of traditional MI methods is in searching for a suitable target variable. To address this issue, the main innovation of this paper is the hybridization of principal component analysis (PCA) and kernel regression (KR) methods based on MI. Firstly, based on historical operational data, quantifiable system operability is constructed using PCA and KR. The next step is to use the constructed system operability as the target variable for MI analysis to extract the most useful features for the system data analysis. In order to verify the effectiveness of the method, an experiment is conducted on the CMAPSS engine dataset, and the effectiveness of condition recognition is tested based on the extracted features. The results indicate that the proposed method can effectively achieve feature extraction of high-dimensional monitoring data.
Read full abstract