Abstract

ABSTRACT Supervised feature selection aims to find the signals that best predict a target variable. Typical approaches use measures of correlation or similarity, as seen in filter methods, or predictive power in learned models, as seen in wrapper methods. In both approaches, the selected features often have high entropies and are not suitable for compression. This is a particular drawback in the automotive domain where fast communication and archival of vehicle telemetry data is increasingly important, especially with technologies such as V2V and V2X (vehicle-to-vehicle and vehicle-to-everything communication). This paper aims to select features with good predictive performances and good compression by introducing a compressibility factor into several existing feature selection approaches. Where appropriate, performance guarantees are provided for greedy searches based on monotonicity and submodularity. Using the language of entropy, the relationship between relevance, redundancy, and compressibility is discussed from the perspective of signal selection. The approaches are then demonstrated in selecting features from a vehicle Controller Area Network for use in SVMs in a regression task, namely predicting fuel consumption, and a classification task, namely identifying Points of Interest. We show that while predictive performance is slightly lower when compression is considered, the compressibility of the selected features is significantly improved.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call