Abstract

Feature engineering is one of the major challenges of machine learning. While multiple automation solutions have been proposed in recent years, the vast majority focus on extracting features from the analyzed dataset itself rather than from other (external) sources. In this study we present FGSES, a general framework for automatic feature engineering, and its application to DBpedia. Our framework automatically matches the entities in the analyzed dataset to those of the external data source, and then generates a large and diverse set of candidate features from both structured and unstructured content. To efficiently process the large number of generated features, FGSES uses a meta-learning-based ranking approach. Our evaluation, conducted on 18 tabular datasets with diverse characteristics, shows that FGSES achieves an average error reduction of 16.5%, significantly outperforming the evaluated baselines.
