Special issue on feature engineering editorial

Tim Verdonck, Bart Baesens, María Óskarsdóttir, Seppe Vanden Broucke

doi:10.1007/s10994-021-06042-2

Open Access

Abstract

To improve the performance of a machine learning model, it is important to focus more on the data itself than on continuously developing new algorithms. This is exactly the aim of feature engineering. It can be defined as the clever engineering of data, thereby exploiting the intrinsic bias of the machine learning technique to our benefit, ideally in terms of both accuracy and interpretability. It is often applied in combination with simple machine learning techniques such as regression models or decision trees to boost their performance while maintaining the interpretability that is so often needed in analytical modeling, but it may also improve complex techniques such as XGBoost and neural networks. Feature engineering aims at designing smart features in one of two ways: either by adjusting existing features using various transformations, or by extracting or creating new meaningful features (a process often called “featurization”) from different sources (e.g., transactional data, network data, time series data, text data).
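
The two routes named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the editorial: the toy transaction records, the function names `log_transform` and `rfm_features`, and the choice of recency, frequency and monetary aggregates as the derived features. The first function adjusts an existing feature with a transformation; the second creates new per-customer features from raw transactional data.

```python
import math
from collections import defaultdict
from datetime import date

# Hypothetical toy data: (customer_id, transaction_date, amount).
TRANSACTIONS = [
    ("c1", date(2021, 1, 5), 120.0),
    ("c1", date(2021, 3, 1), 80.0),
    ("c2", date(2021, 2, 10), 1500.0),
]


def log_transform(value):
    """Route 1: adjust an existing feature.

    log1p compresses large values, which can help with right-skewed amounts.
    """
    return math.log1p(value)


def rfm_features(transactions, reference_date):
    """Route 2: create new per-customer features from raw transactions."""
    grouped = defaultdict(list)
    for customer_id, day, amount in transactions:
        grouped[customer_id].append((day, amount))

    features = {}
    for customer_id, rows in grouped.items():
        last_day = max(day for day, _ in rows)
        total = sum(amount for _, amount in rows)
        features[customer_id] = {
            # Days since the most recent transaction.
            "recency_days": (reference_date - last_day).days,
            # Number of transactions observed.
            "frequency": len(rows),
            # Total amount spent, raw and log-transformed.
            "monetary_total": total,
            "log_monetary_total": log_transform(total),
        }
    return features


FEATURES = rfm_features(TRANSACTIONS, reference_date=date(2021, 3, 31))
```

The resulting dictionary holds one row of engineered features per customer, which could then be passed to a simple model such as a regression or a decision tree. Whether these particular features help depends on the task and the data.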