Abstract

The number of reliable samples available in early decision-making activities is usually small. Because small datasets have unstable distributions and incomplete structure, it is difficult to build reliable and robust predictive models with classic statistical and machine learning methods in small-sample settings. The virtual sample generation (VSG) technique can improve model learning accuracy on small datasets across diverse applications. Established VSG methods generate virtual samples for the independent variables by assuming a probability distribution or a membership function to fill data gaps. In practice, however, non-linear functional relationships between variables are common. To address this issue, this paper proposes a novel VSG method called Dual-VSG, which generates non-linear interpolation virtual samples using a self-supervised learning (SSL) framework to improve learning performance on small datasets. Using the proposed dual-net model, we generate unlabeled non-linear interpolation virtual samples by estimating non-linear functions and transforming them into a high-dimensional space. The weights of the dual-net model are then transferred to a downstream task to generate the virtual sample labels. To demonstrate the effectiveness of the proposed method, we used five datasets and compared its prediction performance against two state-of-the-art VSG approaches, with a Backpropagation Neural Network (BPNN) as the predictive model. Prediction performance on regression datasets is assessed with the Mean Absolute Percentage Error (MAPE) and the Root Mean Square Error (RMSE); performance on classification datasets is assessed with classification accuracy (ACC) and the F1 measure. In addition, a paired t-test was used to determine whether the proposed Dual-VSG approach differed significantly from the other VSG methods in terms of RMSE, MAPE, ACC, or F1 score. According to our experimental results, the proposed Dual-VSG method outperforms both compared VSG methods on small datasets.
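For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how RMSE, MAPE, and a paired t statistic are commonly computed. It is not taken from the paper: the function names (`rmse`, `mape`, `paired_t_statistic`) are illustrative, MAPE is assumed to be reported in percent, and the paired test is assumed to be applied to per-run error values of two methods.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Square Error between true and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent (assumes no true value is zero)."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

def paired_t_statistic(errors_a, errors_b):
    """t statistic of a paired t-test on matched error values.

    Hypothetical usage: errors_a and errors_b hold the same metric (e.g. RMSE)
    for two methods over the same runs. Degrees of freedom are n - 1.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased sample variance of the paired differences.
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / math.sqrt(var_diff / n)
```

The t statistic would then be compared against the t distribution with n - 1 degrees of freedom to obtain a p-value; that lookup is omitted here to keep the sketch dependency-free.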
