Hepatitis C is a chronic infectious disease, and early detection and diagnosis are key to curing it. In this study, human serum Raman spectroscopy combined with a support vector machine (SVM) classification algorithm was used to identify multiple types of hepatitis C. The HCV genome is highly variable, with nucleic acid sequence diversity of up to 30%. On the basis of nucleotide sequence homology, the virus strains are divided into seven genotypes and more than 90 subtypes. The distribution of HCV genotypes varies geographically: genotypes 1, 2 and 3 are prevalent worldwide, and the main genotypes circulating in China include 1b, 2a, 3a, 3b and 6a. Considering that Urumqi, Xinjiang, China is a multi-ethnic region, and given the distribution of HCV genotypes in Urumqi reported in the literature, HCV1, HCV2, HCV3a and HCV3b were selected as groups in this paper (Messina et al., 2015; Chen et al., 2017; Ohno et al., 1997). The serum Raman spectra of 55 healthy people, 55 hepatitis C virus cluster 1 (HCV1) patients, and 55 hepatitis C virus cluster 2 (HCV2) patients were collected. The normalized average Raman spectra of the three serum groups and the differences between the group average spectra were plotted and analyzed. The assignments, similarities and differences of the main characteristic peaks in the three types of serum Raman spectra were described. The SVM algorithm was combined with the normalized Raman spectral data to identify the three groups of serum with 91.1% accuracy. Furthermore, serum Raman spectroscopy data from 17 hepatitis C virus genotype 3a (HCV3a) patients, 7 hepatitis C virus genotype 3b (HCV3b) patients, and 6 hepatitis C virus cluster 4 (HCV4) patients were also collected. Because of the small number of serum samples, the HCV3b and HCV4 patient sera were combined into one group to discriminate them from HCV3a patients.
A model for detecting HCV3a hepatitis was then established. As with the groups described above, the normalized mean Raman spectra of the HCV3a patients and of the combined HCV3b + HCV4 patients, together with the difference between the two group average spectra, were plotted and analyzed; the assignments, similarities and differences of the main characteristic peaks in these two groups of serum Raman spectra were described. The SVM algorithm was combined with the normalized Raman spectroscopy data to identify the two groups of patient sera with 90% accuracy. This study shows that serum Raman spectroscopy combined with an SVM algorithm can be used for multiclass identification of hepatitis C.
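
The workflow summarized above (normalize each spectrum, compare group average spectra, classify with an SVM) can be sketched roughly in Python as follows. This is only an illustration under stated assumptions, not the study's implementation: the spectra are synthetic, and the min-max normalization, the linear kernel, the 5-fold cross-validation, the use of scikit-learn, and all function names are choices made for this example that the abstract does not specify.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def normalize_spectra(spectra):
    """Min-max scale each spectrum (one row per sample) to the range [0, 1].

    Assumption: the abstract only says "normalized"; min-max scaling is one
    common choice. A completely flat spectrum would divide by zero here.
    """
    spectra = np.asarray(spectra, dtype=float)
    lo = spectra.min(axis=1, keepdims=True)
    hi = spectra.max(axis=1, keepdims=True)
    return (spectra - lo) / (hi - lo)


def make_synthetic_spectra(n_per_class=55, n_points=300, seed=0):
    """Build toy 'spectra' for three groups (hypothetical data, not real serum)."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_points)
    shared_peak = np.exp(-((axis - 0.2) / 0.03) ** 2)  # peak common to all groups
    class_peak_positions = [0.3, 0.5, 0.7]             # hypothetical group-specific peaks
    X, y = [], []
    for label, center in enumerate(class_peak_positions):
        class_peak = 0.5 * np.exp(-((axis - center) / 0.03) ** 2)
        noise = rng.normal(0.0, 0.05, size=(n_per_class, n_points))
        X.append(shared_peak + class_peak + noise)
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)


# 55 samples per group, mirroring the group sizes reported in the abstract.
X_raw, y = make_synthetic_spectra()
X = normalize_spectra(X_raw)

# Group average spectra and one example difference spectrum (group 1 minus group 0).
mean_spectra = np.stack([X[y == k].mean(axis=0) for k in np.unique(y)])
diff_1_vs_0 = mean_spectra[1] - mean_spectra[0]

# Multiclass SVM evaluated with stratified 5-fold cross-validation (assumed protocol).
model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)
mean_accuracy = scores.mean()
print(f"Mean cross-validated accuracy on synthetic data: {mean_accuracy:.3f}")
```

The accuracy printed here reflects only how separable the toy data are and says nothing about the 91.1% and 90% figures reported in the study. The same pipeline applies to the two-group case (HCV3a versus HCV3b + HCV4) by supplying two labels instead of three.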