Grouped-sampling technique to deal with unbalance in Raman spectral data modeling

Haitao Song, Hongyong Leng, Zhuoya Hou, Rui Gao, Cheng Chen, Chunzhi Meng, Jinshan Sun, Chenxi Li, Binlin Ma

doi:10.1016/j.pdpdt.2022.103059

Abstract

Owing to limitations imposed by disease prevalence and hospital specificity, spectral data are often collected with unbalanced sample sizes across classes. To address this problem, this study proposes a new sampling method, grouped-sampling, which is shown to be effective for unbalanced data. It avoids the over-fitting associated with over-sampling and the under-utilization of samples associated with under-sampling. We applied grouped-sampling to two unbalanced datasets with sample proportions of 199:40 and 75:225, and verified it with two classic models: PCA-SVM (Principal Component Analysis-Support Vector Machine) and the deep learning algorithm GoogLeNet. The accuracies on the two datasets were 85.11% and 96.15% with PCA-SVM and 85.10% and 84.61% with GoogLeNet. The F1-score was also evaluated to measure how balanced the classification results of each sampling method are, and grouped-sampling always achieved the highest F1-score compared with over-sampling and under-sampling. In summary, compared with traditional sampling methods, grouped-sampling predicts classes with smaller sample sizes better, which improves the balance of the classification results and the potential for practical application. Grouped-sampling is therefore a sampling method distinct from both under-sampling and over-sampling that greatly improves the accuracy and balance of predictions for unbalanced samples.
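
The reason for reporting the F1-score alongside accuracy can be illustrated with a small sketch. It is not taken from the paper: the labels `major` and `minor`, the helper functions, and the degenerate "always predict the majority class" classifier are all assumptions made for illustration. Only the 199:40 proportion comes from the abstract. With that proportion, such a classifier reaches roughly 83% accuracy while its F1-score on the smaller class is zero, which is the kind of imbalance in the results that accuracy alone hides.

```python
# Illustrative sketch only: labels, helpers, and the degenerate classifier
# are hypothetical; only the 199:40 class proportion comes from the abstract.

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 for one class: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical ground truth with the 199:40 proportion.
y_true = ["major"] * 199 + ["minor"] * 40

# Degenerate classifier that ignores the smaller class entirely.
y_pred = ["major"] * len(y_true)

acc = accuracy(y_true, y_pred)
f1_major = f1_score(y_true, y_pred, "major")
f1_minor = f1_score(y_true, y_pred, "minor")
macro_f1 = (f1_major + f1_minor) / 2  # unweighted mean over both classes

print(f"accuracy={acc:.3f} f1_major={f1_major:.3f} "
      f"f1_minor={f1_minor:.3f} macro_f1={macro_f1:.3f}")
```

A sampling method that improves predictions for the smaller class raises `f1_minor`, and with it the macro average, even when overall accuracy changes little.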

Journal: Photodiagnosis and Photodynamic Therapy
Publication Date: Aug 6, 2022
Citations: 3

Similar Papers

Hybrid Machine Learning Model for Face Recognition Using SVM
Anil Kumar Yadav ... Nirmal Kumar Gupta
Computers, Materials & Continua | VOL. 72
01 Jan 2021

Research on College Students’ Academic Early Warning System Based on PCA-SVM
Xiang Chen ... Geng Lin
07 Nov 2019

Transmission Condition Monitoring of 3D Printers Based on the Echo State Network
Shaohui Zhang ... Kun He
Applied Sciences | VOL. 9
29 Jul 2019

Microbial and epidemiological factors in early detection of esophageal squamous cell carcinoma and precancerous lesions.
Minjuan Li ... Jianhua Gu
Chinese Medical Journal | VOL. 136
06 Apr 2023
