An adaptive Physics-based feature engineering approach for Machine Learning-assisted alloy discovery

Yasaman J Soofi, Yijia Gu, Jinling Liu

doi:10.1016/j.commatsci.2023.112248

Abstract

This study investigates the importance of integrating a physics-based perspective into feature engineering for machine learning applications in materials science. Specifically, we study the encoding of the temper designation variable, which carries critical alloy manufacturing information and is commonly included as an important feature in machine learning models that predict alloy properties. Popular encoding methods such as one-hot encoding and ordinal encoding neglect the physical mechanism behind temper designations: one-hot encoding treats the designations as entirely independent categories, while ordinal encoding treats them as a single ordered sequence. Following the underlying physical mechanism of the temper designation variable, we propose an adaptive encoding method that first decomposes each temper designation into categorical and numerical subunits, which are then encoded by one-hot encoding and ordinal encoding, respectively. The proposed method is evaluated on two independent aluminum alloy datasets covering various mechanical and technological properties. Our results show that the adaptive encoding method outperforms the traditional encoding methods in predicting both mechanical and technological properties. As a general encoding method, it can be applied to a variety of decomposable variables to help advance machine-learning-assisted alloy design.
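The decomposition idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a temper designation consists of one basic letter (F, O, H, W, or T, as in the standard aluminum temper system) followed by zero or more digits, that the letter is the categorical subunit, and that each digit position is a separate numerical subunit padded with zeros. The names `decompose`, `encode`, `BASIC_TEMPERS`, and `max_digits` are hypothetical.

```python
import re

# Assumed set of basic temper letters for aluminum alloys.
BASIC_TEMPERS = ("F", "O", "H", "W", "T")

# Assumed layout: one letter followed by zero or more digits, e.g. "T6", "H32", "O".
TEMPER_PATTERN = re.compile(r"^([A-Z])(\d*)$")


def decompose(temper):
    """Split a temper designation into (letter, digit string)."""
    match = TEMPER_PATTERN.match(temper.strip().upper())
    if match is None or match.group(1) not in BASIC_TEMPERS:
        raise ValueError(f"Unrecognized temper designation: {temper!r}")
    return match.group(1), match.group(2)


def encode(tempers, max_digits=3):
    """Encode each designation as one-hot letter columns plus ordinal digit columns.

    Assumption: missing digit positions are filled with 0, and digits beyond
    max_digits are dropped.
    """
    rows = []
    for temper in tempers:
        letter, digits = decompose(temper)
        # Categorical subunit: one-hot over the basic temper letters.
        one_hot = [1 if letter == basic else 0 for basic in BASIC_TEMPERS]
        # Numerical subunits: each digit position kept as an ordered integer.
        ordinal = [int(d) for d in digits[:max_digits]]
        ordinal += [0] * (max_digits - len(ordinal))
        rows.append(one_hot + ordinal)
    return rows


if __name__ == "__main__":
    for temper, row in zip(["T6", "H32", "O"], encode(["T6", "H32", "O"])):
        print(temper, row)
```

In this sketch, "T6" and "T4" share the same one-hot columns and differ only in an ordered digit column, whereas plain one-hot encoding would treat them as unrelated categories. How the paper itself pads or scales the numerical subunits is not stated in the abstract.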

Full Text