Abstract

Synthetic data augmentation has substantial research and practical value in settings with few samples and high-dimensional data, where it can improve the analytical capability and efficiency of spectral analysis models. This paper proposes Autoencoder-Combined Boundary Equilibrium Generative Adversarial Networks (AE-BEGAN), a new method for synthetic data augmentation in such settings, with a specific focus on near-infrared (NIR) spectral data. The spectral data are first preprocessed with noise reduction algorithms and abnormal-sample removal to eliminate unwanted disturbances and outliers. The preprocessed data are then used to train the AE-BEGAN model, which generates augmented synthetic samples. Finally, real NIR spectral data obtained from lubricant samples with different water contents are used to validate and test the model. The experimental results show that AE-BEGAN outperforms other GANs in generating synthetic data of high quality and diversity, as quantified by two evaluation metrics, α-Precision and β-Recall, with scores of approximately 0.86 and 0.28, respectively. The application case study confirms that AE-BEGAN can generate derived NIR spectra and expand the number of available spectra when samples are limited and dimensionality is high.
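For readers unfamiliar with the "boundary equilibrium" part of the name, the following is a minimal sketch of the balancing step used in the standard BEGAN formulation, in which the discriminator is an autoencoder and losses are reconstruction errors. It is not taken from this paper: the function name `began_step`, the default values of `gamma` and `lambda_k`, and the use of plain scalar losses are assumptions made for illustration, and the AE-BEGAN variant described here may differ in its details.

```python
def began_step(loss_real, loss_fake, k, gamma=0.5, lambda_k=0.001):
    """One balancing step of standard BEGAN on scalar reconstruction losses.

    loss_real: autoencoder reconstruction error on real spectra, L(x)
    loss_fake: reconstruction error on generated spectra, L(G(z))
    k:         current balance term k_t, kept in [0, 1]
    gamma:     target ratio L(G(z)) / L(x) (assumed default)
    lambda_k:  learning rate for k (assumed default)
    """
    # Discriminator minimizes L(x) - k_t * L(G(z)).
    d_loss = loss_real - k * loss_fake
    # Generator minimizes L(G(z)).
    g_loss = loss_fake
    # Proportional control update for k, clipped to [0, 1].
    balance = gamma * loss_real - loss_fake
    k_next = min(max(k + lambda_k * balance, 0.0), 1.0)
    # Global convergence measure: L(x) + |gamma * L(x) - L(G(z))|.
    convergence = loss_real + abs(balance)
    return d_loss, g_loss, k_next, convergence


if __name__ == "__main__":
    # Hypothetical loss values, for illustration only.
    print(began_step(loss_real=1.0, loss_fake=0.2, k=0.0))
```

In a real training loop the two losses would come from mini-batches of preprocessed spectra and generator outputs, and `k_next` would be carried into the following iteration.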
