Abstract

In this paper, we present an alternative to the standard methods for classifying Raman spectra. Its grouping process is based on a clustering phenomenon observed in nature at the atomic level and correctly described by the statistical physics model known as the Potts model, which represents interacting spins on a crystalline lattice. This clustering method, known as superparamagnetic clustering (SPC), allows hierarchical structures to be identified in data banks. In this novel method, we assigned a Potts spin to each data point (Raman spectrum) and introduced an interaction between neighboring points whose coupling strength is a decreasing function of the distance between the neighboring sites. We found a hierarchical tree structure in our data bank of Raman spectra that allowed us to discriminate between the spectra from control and diabetes patients. The sensitivity and specificity of the diabetes detection technique by Raman spectroscopy were calculated directly because the SPC method accurately determines the members of each cluster. As a cross-check, SPC results were compared with published results of multivariate analysis and showed excellent agreement; in addition, the SPC method allows the members of all identified clusters to be determined explicitly.

Highlights

  • In recent years, spectroscopic techniques such as Raman spectroscopy, Fourier-transform infrared spectroscopy, X-ray spectroscopy, and mass spectroscopy have become fundamental tools in the fields of chemistry, drugs, the agrofood sector, life sciences, and environmental analysis for studying different biological systems based on the chemical and structural composition of biological samples [1, 2, 3]. In these techniques, once spectra are captured, mathematical tools are required to classify them. Spectra of biological samples are usually highly complex because they contain a large number of peaks of different intensities and shapes, unlike spectra of nonbiological samples, where discriminating between a pair of samples is relatively simple.

  • In order to compare the control and diabetes Raman spectra, the spectra were processed as described in the previous section; a 2330 × 182 data matrix was constructed in which the first 102 columns correspond to the spectra from control patients and the last 80 columns correspond to the spectra from diabetes patients. The 182 × 182 distance matrix was constructed using the data matrix.

  • We proposed the superparamagnetic clustering method as an alternative to the standard methods for identifying patterns in large banks of spectra based on the similarity of spectral bands. This method, which uses the Potts spin model from statistical physics, successfully discriminated diabetes spectra from control spectra with high sensitivity and specificity through a hierarchical structure of clusters.
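The step from the data matrix to the distance matrix and on to the spin couplings can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the text only says that the coupling is a decreasing function of the distance between neighboring points, so the Euclidean metric, the mutual k-nearest-neighbor rule, the Gaussian form of the coupling, and the names `distance_matrix`, `potts_couplings`, and `k` are all choices made here for illustration. The input is a synthetic random matrix with the same shape as the data matrix described above, not real spectra.

```python
import numpy as np


def distance_matrix(data):
    """Euclidean distances between the columns of `data` (one spectrum per column)."""
    spectra = np.asarray(data, dtype=float).T        # shape (n_spectra, n_points)
    sq = np.sum(spectra ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (spectra @ spectra.T)
    d2 = np.maximum(d2, 0.0)                          # clip tiny negative round-off
    d2 = 0.5 * (d2 + d2.T)                            # enforce exact symmetry
    np.fill_diagonal(d2, 0.0)
    return np.sqrt(d2)


def potts_couplings(dist, k=10):
    """Couplings J_ij that decay with distance, nonzero only for mutual k-nearest neighbors.

    Assumed form (not stated in the text): J_ij = exp(-d_ij^2 / (2 a^2)) / K,
    where a is the mean distance between neighbors and K the mean number of
    neighbors per point.
    """
    n = dist.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of spectra")
    masked = dist.copy()
    np.fill_diagonal(masked, np.inf)                  # a point is not its own neighbor
    nearest = np.argsort(masked, axis=1)[:, :k]       # indices of the k closest spectra
    is_neighbor = np.zeros((n, n), dtype=bool)
    is_neighbor[np.arange(n)[:, None], nearest] = True
    mutual = is_neighbor & is_neighbor.T              # keep pairs that choose each other
    if not mutual.any():
        raise ValueError("no mutual neighbors found; try a larger k")
    a = dist[mutual].mean()                           # characteristic neighbor distance
    k_hat = mutual.sum() / n                          # mean number of neighbors per point
    return np.where(mutual, np.exp(-dist ** 2 / (2.0 * a ** 2)) / k_hat, 0.0)


# Synthetic stand-in with the shape given in the text (2330 points x 182 spectra).
rng = np.random.default_rng(0)
data = rng.random((2330, 182))
dist = distance_matrix(data)
J = potts_couplings(dist, k=10)
print(dist.shape, J.shape)
```

Restricting the interaction to mutual neighbors keeps the coupling matrix sparse and symmetric, so each spin interacts only with the spectra most similar to it.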

Summary

Introduction

Spectroscopic techniques such as Raman spectroscopy, Fourier-transform infrared spectroscopy, X-ray spectroscopy, and mass spectroscopy have become fundamental tools in the fields of chemistry, drugs, the agrofood sector, life sciences, and environmental analysis for studying different biological systems based on the chemical and structural composition of biological samples [1, 2, 3]. In these techniques, once spectra are captured, mathematical tools are required to classify them. Spectra of biological samples are usually highly complex because they contain a large number of peaks of different intensities and shapes, unlike spectra of nonbiological samples, where discriminating between a pair of samples is relatively simple. The main techniques applied in the analysis of spectra include multivariate analysis (principal component analysis and linear discriminant analysis) [4, 5] and clustering analysis (K-means and spectral norm methods) [6]. Among the clustering methods, those of particular interest allow the exploration of hierarchical structures in data banks, which facilitates the study of diseases that are classified into different types or that show various stages of progression [4].

