Abstract

Feature selection is a technique used to reduce complexity and improve generalization, fit, and predictive accuracy of machine learning models. A central challenge in this selection process is that searching the space of features for the optimal subset of k features is a known NP-hard problem. In this work, we study metrics for encoding the combinatorial search as a binary quadratic model, such as the Generalized Mean Information Coefficient and the Pearson correlation coefficient, applied to the underlying regression problem of price prediction. We compare predictive model performance when leveraging quantum-assisted versus classical subroutines for the search, using minimum redundancy maximum relevance (mRMR) as the heuristic for our approach. We cross-validate predictive models on a real-world price-prediction problem and show an improvement in mean absolute error for our quantum-assisted method (1471.02 ± 135.6) over similar methodologies such as greedy selection (1707.2 ± 168), recursive feature elimination (1678.3 ± 143.7), and using all features (1546.2 ± 154). Our findings show that by leveraging quantum-assisted routines we find solutions that increase the quality of predictive model output while reducing the input dimensionality to the learning algorithm on synthetic and real-world data.
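The mRMR-to-binary-quadratic-model encoding described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the choice of absolute Pearson correlation for both relevance and redundancy, and the `alpha` trade-off weight are assumptions for demonstration. Diagonal QUBO terms reward relevance to the target; off-diagonal terms penalize pairwise redundancy; a classical exhaustive search plays the role that a quantum annealer would for larger feature counts.

```python
# Hypothetical sketch of an mRMR-style QUBO for feature selection.
import numpy as np
from itertools import combinations

def mrmr_qubo(X, y, alpha=1.0):
    """Build a QUBO matrix Q: minimizing x^T Q x over binary x trades
    off relevance (diagonal, negative) against redundancy (off-diagonal,
    positive). alpha weights the redundancy penalty (an assumption)."""
    n = X.shape[1]
    Q = np.zeros((n, n))
    for i in range(n):
        # Relevance: absolute Pearson correlation of feature i with y.
        Q[i, i] = -abs(np.corrcoef(X[:, i], y)[0, 1])
        for j in range(i + 1, n):
            # Redundancy: absolute correlation between features i and j.
            r = abs(np.corrcoef(X[:, i], X[:, j])[0, 1])
            Q[i, j] = Q[j, i] = alpha * r / 2
    return Q

def best_subset(Q, k):
    """Classical exhaustive search over all k-subsets -- the NP-hard
    step a quantum-assisted routine would approximate at scale."""
    n = Q.shape[0]
    def energy(subset):
        x = np.zeros(n)
        x[list(subset)] = 1
        return x @ Q @ x
    return min(combinations(range(n), k), key=energy)

# Synthetic regression data: only features 0 and 1 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.1, size=200)
print(best_subset(mrmr_qubo(X, y), 2))  # → (0, 1)
```

The exhaustive search is tractable only for small n; the point of the quantum-assisted approach is to hand this minimization to an annealer once the problem is in QUBO form.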

