Abstract

Small data sets are an extremely challenging problem in machine learning (ML), particularly in regression scenarios, because the lack of relevant data can lead to ML models with large bias. However, there are many applications for which a purely data‐driven procedure would be advantageous but a large amount of data is not available. This article proposes a novel regression‐based transfer learning (TL) model to address this challenge, where TL is defined as knowledge transfer from a large, relevant data set (source domain data) to a small data set (target domain data). The proposed TL model is termed double‐weighted support vector transfer regression (DW‐SVTR), which couples least squares support vector machines for regression (LS‐SVMR) with two weight functions. The first weight function uses kernel mean matching (KMM) to reweight the source domain data such that the mean values of the source and target domain data in a reproducing kernel Hilbert space (RKHS) are close. In this way, source domain points that are relevant to the target domain points receive a larger weight than irrelevant source domain points. The second weight is a function of estimated residuals and aims to further reduce the negative interference of irrelevant source domain points. The proposed approach is assessed and validated on simulated data and on shear strength prediction for nonductile columns, for which only limited data are available. Specifically, the results for the latter show that the proposed DW‐SVTR can reduce the root mean square error (RMSE) by 34% and enhance the coefficient of determination (R²) by 229%. These numerical results demonstrate that DW‐SVTR significantly reduces the effect of small sample bias and improves prediction performance compared to standard ML methods.
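To make the role of per-sample weights in LS‐SVMR concrete, the following is a minimal sketch of a weighted least squares support vector regression in which each training point i carries a positive weight v_i on its squared error. It is not the authors' implementation: the function names, the Gaussian kernel, and the hyperparameters `gamma` and `sigma` are illustrative assumptions, and the weights are taken as given, since the abstract does not specify how the KMM weights and the residual-based weights are computed or combined.

```python
import numpy as np


def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def fit_weighted_lssvr(X, y, v, gamma=10.0, sigma=1.0):
    """Solve the weighted LS-SVM regression linear system.

    Minimizes 0.5*||w||^2 + 0.5*gamma*sum_i v_i*e_i^2
    subject to y_i = w^T phi(x_i) + b + e_i.
    v holds positive per-sample weights (assumed given, e.g. from a
    reweighting step); gamma and sigma are illustrative hyperparameters.
    Returns the bias b and the dual coefficients alpha.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                              # constraint: sum(alpha) = 0
    A[1:, 0] = 1.0                              # bias column
    A[1:, 1:] = K + np.diag(1.0 / (gamma * v))  # per-sample ridge term
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]


def predict(X_train, b, alpha, X_new, sigma=1.0):
    """Evaluate f(x) = sum_i alpha_i * k(x, x_i) + b at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b


# Toy usage: one corrupted sample stands in for an "irrelevant" point.
X = np.linspace(0.0, 3.0, 20)[:, None]
y = np.sin(X).ravel()
y[10] += 5.0                                    # corrupt a single observation

v_uniform = np.ones(20)                         # ordinary LS-SVM regression
v_down = np.ones(20)
v_down[10] = 1e-3                               # downweight the corrupted point

b_u, a_u = fit_weighted_lssvr(X, y, v_uniform)
b_d, a_d = fit_weighted_lssvr(X, y, v_down)

truth = np.sin(X[10, 0])
err_uniform = abs(predict(X, b_u, a_u, X[10:11])[0] - truth)
err_down = abs(predict(X, b_d, a_d, X[10:11])[0] - truth)
print(err_uniform, err_down)
```

In this toy setting, shrinking the weight of the corrupted sample reduces its pull on the fitted curve, which is the same mechanism by which small weights on irrelevant source domain points would limit their influence on the target domain prediction.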
