Abstract

In many applications, numerous data sources can be collected and numerous features can be calculated from them. The era of big data has led many to believe that the larger the data, the better the results. However, as the dimensionality of the data increases, the effects of the curse of dimensionality become more prevalent. A large feature set also increases the computational cost of data collection and feature calculation. In this study, we evaluated four dimensionality reduction techniques as part of a system for condition monitoring of a hydraulic actuator. Two feature selection techniques, ReliefF and variable importance, and two feature extraction techniques, principal component analysis and autoencoders, are used to reduce the input into three classification algorithms. We conclude that variable importance in conjunction with the random forest algorithm outperforms the other dimensionality reduction techniques. Feature selection has the added advantage of being able to remove data sources and features that are not present in the relevant feature subset from the data collection and feature calculation process.
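The contrast drawn in the abstract, between feature selection (which lets unselected features go uncollected) and feature extraction (which still needs every original feature to form the projection), can be sketched as follows. This is an illustrative sketch on synthetic data, not the paper's pipeline: the dataset, the number of features, and the choice of keeping the top 10 features/components are all assumptions for demonstration.

```python
# Illustrative sketch (not the paper's experiment): random-forest variable
# importance for feature selection vs. PCA for feature extraction.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a condition-monitoring feature set (hypothetical).
X, y = make_classification(n_samples=500, n_features=50, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Feature selection: rank features by random-forest variable importance and
# keep the top 10. Features outside this subset need never be computed.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
top_k = np.argsort(rf.feature_importances_)[::-1][:10]
sel_score = RandomForestClassifier(random_state=0).fit(
    X_tr[:, top_k], y_tr).score(X_te[:, top_k], y_te)

# Feature extraction: PCA projects onto 10 components, but every original
# feature must still be collected to compute the projection.
pca = PCA(n_components=10).fit(X_tr)
pca_score = RandomForestClassifier(random_state=0).fit(
    pca.transform(X_tr), y_tr).score(pca.transform(X_te), y_te)

print(f"selected-feature accuracy: {sel_score:.3f}")
print(f"PCA accuracy:              {pca_score:.3f}")
```

Note how the selected-feature model consumes only the 10 columns in `top_k`; in a deployed monitoring system the remaining 40 features (and any data sources used only by them) could be dropped entirely.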
