Abstract

Small data volumes and data imbalance often cause statistical methods to fail and seriously restrict the accuracy of data-driven models, which makes them a key bottleneck in small-sample modeling. Data expansion has become the main approach to small-sample modeling. However, the randomness involved in generating and combining virtual samples produces many invalid data points, so the expanded data are poorly consistent with the original data. To address this, this paper proposes a virtual sample generation method based on acceptable area and joint probability distribution sampling (APS-VSG). The method limits the randomness of data expansion, reduces the proportion of invalid data, improves the consistency of the expanded data, and improves the accuracy of data-driven models trained on small samples. First, the concept of a "compact range of interaction" (CRI) is proposed, which further narrows the estimated domain range of the data to approximate the valid area of the data. Second, prior knowledge is used to improve mega-trend-diffusion (MTD), and the CRI is delineated according to the trend dispersion to obtain the acceptable area of the virtual data. Finally, a joint probability distribution is constructed from the true values of the small samples in the acceptable area, and virtual data are generated by sampling from this distribution. Experimental results on standard function datasets show that more than 85% of the virtual samples generated by the proposed method are valid. Experimental results on the NASA Li-ion battery dataset show that, compared with the Interpolation, Noise, MD-MTD, GAN, and GMM-VSG methods, the error of the data-driven model trained with virtual data generated by the proposed method is significantly reduced. Compared with GAN and GMM-VSG, the MSE, RMSE, MAE, and MAPE are reduced by at least 19.3%, 10.6%, 15.4%, and 16.7%, respectively.
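The three steps above can be illustrated with a minimal sketch. It is not the APS-VSG implementation: the abstract does not specify the CRI construction or the prior-knowledge improvement to MTD, so the code below assumes the commonly cited form of the original mega-trend-diffusion bounds as the "acceptable area" and a single multivariate Gaussian as a stand-in for the joint probability distribution. The function names `mtd_bounds` and `sample_virtual`, and the parameters `eps` and `max_rounds`, are hypothetical.

```python
import numpy as np

def mtd_bounds(x, eps=1e-20):
    """Per-feature bounds in the commonly cited form of basic MTD
    (assumed here; not the improved MTD or CRI from the paper)."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    center = (lo + hi) / 2.0
    n_low = int(np.sum(x < center))   # samples below the center
    n_up = int(np.sum(x > center))    # samples above the center
    total = n_low + n_up
    if total == 0:                    # all values identical
        return lo, hi
    spread = -2.0 * x.var(ddof=1) * np.log(eps)
    lower = center - (n_low / total) * np.sqrt(spread / n_low) if n_low else lo
    upper = center + (n_up / total) * np.sqrt(spread / n_up) if n_up else hi
    # The diffused range must at least cover the observed range.
    return min(lower, lo), max(upper, hi)

def sample_virtual(X, n_virtual, seed=0, max_rounds=100):
    """Draw virtual samples from a Gaussian fitted to X (an assumed
    stand-in for the joint distribution) and keep only draws that fall
    inside the per-feature bounds (rejection sampling)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    bounds = np.array([mtd_bounds(col) for col in X.T])
    lower, upper = bounds[:, 0], bounds[:, 1]
    mean = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    accepted, count = [], 0
    for _ in range(max_rounds):
        draws = rng.multivariate_normal(mean, cov, size=n_virtual)
        inside = np.all((draws >= lower) & (draws <= upper), axis=1)
        accepted.append(draws[inside])
        count += int(inside.sum())
        if count >= n_virtual:
            break
    return np.vstack(accepted)[:n_virtual], lower, upper

# Toy small-sample dataset: 8 samples, 2 correlated features.
X_small = np.array([[1.0, 2.1], [1.5, 2.9], [2.0, 4.2], [2.4, 4.8],
                    [3.1, 6.0], [3.3, 6.9], [4.0, 8.1], [4.6, 9.0]])
virtual, lower, upper = sample_virtual(X_small, n_virtual=50)
```

Sampling from a joint distribution rather than from each feature independently keeps the correlation between features, and rejecting draws outside the bounds is one simple way to limit the randomness that the abstract identifies as the source of invalid virtual samples.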
