Prevention of Leakage in Machine Learning Prediction for Polymer Composite Properties.

Hajime Shimakawa, Masahiro Sato, Akiko Kumada

doi:10.1021/acs.jcim.3c01894

Abstract

Machine learning (ML) has facilitated property prediction for complex materials by combining material features with experimental features such as processing and measurement conditions. However, ML models for material properties have often disregarded the common issue of "leakage," which leads to overestimated model performance and reduced model transferability. Leakage can arise from biases shared by multiple data points obtained from the same experimental group. We critically examine leakage in property prediction for polymer composites and propose a method to prevent it. The proposed method partitions data by experimental group so that data from the same group never appear in both the training and test sets. Evaluation results show that conventional random partitioning unintentionally inflates ML performance: the model misuses experimental features to pick up data bias within an experimental group rather than to explain physical causality. In contrast, the proposed method allows experimental features to be used without leakage, improving prediction accuracy while preserving model transferability. Specifically, when experimental features are added to polymer and filler features for predicting electrical conductivity, conventional random partitioning shows an apparent RMSE reduction of 26% that depends on leakage, whereas the proposed method achieves an RMSE reduction of 5% without leakage. These findings offer guidance for the effective use of experimental features in data-driven materials science.
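The group-based partitioning described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how experimental groups are identified or which tooling was used, so the function name `group_train_test_split` and the group labels (`exp_A`, ...) are hypothetical. The sketch uses only the Python standard library and holds out whole groups, so no group contributes samples to both sides of the split.

```python
import random
from collections import defaultdict


def group_train_test_split(groups, test_fraction=0.2, seed=0):
    """Split sample indices so each group lands entirely in train or test.

    `groups` holds one group id per sample (assumed here to identify the
    experimental group a data point came from).
    """
    # Collect the sample indices that belong to each experimental group.
    by_group = defaultdict(list)
    for idx, group_id in enumerate(groups):
        by_group[group_id].append(idx)

    if len(by_group) < 2:
        raise ValueError("need at least two groups to split by group")

    # Shuffle the group ids (not the individual samples), reproducibly.
    group_ids = sorted(by_group)
    random.Random(seed).shuffle(group_ids)

    # Hold out whole groups, keeping at least one group on each side.
    n_test = max(1, round(test_fraction * len(group_ids)))
    n_test = min(n_test, len(group_ids) - 1)

    test_idx = sorted(i for g in group_ids[:n_test] for i in by_group[g])
    train_idx = sorted(i for g in group_ids[n_test:] for i in by_group[g])
    return train_idx, test_idx


# Hypothetical labels: one id per source experiment.
groups = ["exp_A", "exp_A", "exp_B", "exp_B", "exp_B", "exp_C", "exp_D", "exp_D"]
train_idx, test_idx = group_train_test_split(groups, test_fraction=0.25, seed=0)

train_groups = {groups[i] for i in train_idx}
test_groups = {groups[i] for i in test_idx}
print("train groups:", sorted(train_groups))
print("test groups:", sorted(test_groups))
```

Note that `test_fraction` here is a fraction of groups, not of samples, so the test set size varies with group size. A plain random split over samples would instead be free to place points from the same experiment on both sides, which is the situation the abstract identifies as leakage. In practice, scikit-learn's `GroupShuffleSplit` and `GroupKFold` offer the same kind of split through a `groups` argument.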

Journal: Journal of Chemical Information and Modeling
