Abstract

Surface-enhanced Raman scattering (SERS) is a valuable analytical technique for the analysis of biological samples. However, due to the nature of SERS, it is often challenging to extract the desired information from the generated data when no reporter or label molecules are used. Here, the suitability of random forest-based approaches is evaluated using SERS data generated by a simulation framework that is also presented. More specifically, it is demonstrated that important SERS signals can be identified, the relevance of predefined spectral groups can be evaluated, and the relations between different SERS signals can be analyzed. The results indicate that Boruta and surrogate minimal depth (SMD) should be applied for the selection of important SERS signals, and the competing method Learner of Functional Enrichment (LeFE) for the analysis of spectral groups. In general, this investigation demonstrates that the combination of random forest approaches and SERS data is very promising for the sophisticated analysis of complex biological samples.

Highlights

  • Surface-enhanced Raman scattering (SERS) is an analytical approach capable of studying small structures in biological materials[1] and even of detecting single molecules[2,3]

  • The SERS spectra obtained show realistic characteristics, e.g. diverse signals that are partially characteristic of the respective group

  • This study shows that the combination of random forest (RF) based approaches and surface-enhanced Raman scattering (SERS) is very promising for the analysis of complex biological samples


Summary

Introduction

Surface-enhanced Raman scattering (SERS) is an analytical approach capable of studying small structures in biological materials[1] and even of detecting single molecules[2,3]. One possible solution to the difficulty of exploiting SERS data is the application of SERS labels: nanoparticles combined with functionalized reporter molecules for specific binding and for obtaining more reproducible and specific SERS spectra[7]. In this case, usually only the signals of the reporter molecules are detected. In label-free SERS experiments, it has been shown that variation due to the nature of SERS can hamper analysis with principal component analysis (PCA) and hierarchical cluster analysis (HCA) when biological samples containing multiple molecules are analyzed[15]. This can be circumvented when supervised methods like artificial neural networks (ANN) are applied. To comprehensively evaluate the validity and power of the different methods, the true properties of the data need to be known, which is why a framework for the simulation of SERS data was established for this study.

www.nature.com/scientificreports
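
To illustrate why simulated data with known properties is useful for method evaluation, the following is a minimal sketch of how label-free SERS-like spectra for two groups might be generated. It is not the framework established in the study: the peak positions, appearance probabilities, peak widths, noise level, and all function names are hypothetical values chosen for illustration only. The sketch only mimics the idea that signals fluctuate from spectrum to spectrum and are partially characteristic of the respective group.

```python
import math
import random


def gaussian_peak(x, center, width, height):
    """Intensity of a Gaussian band at position x."""
    return height * math.exp(-0.5 * ((x - center) / width) ** 2)


def simulate_spectrum(shifts, peaks, rng, noise_sd=0.02):
    """Simulate one spectrum.

    peaks is a list of (center, probability) pairs. Each band appears only
    with the given probability and with a random height, which mimics the
    spectrum-to-spectrum variation of label-free SERS (assumed values).
    """
    spectrum = [rng.gauss(0.0, noise_sd) for _ in shifts]
    for center, probability in peaks:
        if rng.random() < probability:
            height = rng.uniform(0.5, 1.5)
            for i, x in enumerate(shifts):
                spectrum[i] += gaussian_peak(x, center, 8.0, height)
    return spectrum


def simulate_dataset(n_per_group=50, seed=0):
    """Simulate two groups that share some bands and differ in one band."""
    rng = random.Random(seed)
    shifts = list(range(400, 1800, 10))  # hypothetical Raman shift axis
    shared = [(1000, 0.6), (1450, 0.6)]  # bands present in both groups
    group_peaks = {
        "A": shared + [(730, 0.7)],   # band characteristic of group A
        "B": shared + [(1330, 0.7)],  # band characteristic of group B
    }
    spectra, labels = [], []
    for label, peaks in group_peaks.items():
        for _ in range(n_per_group):
            spectra.append(simulate_spectrum(shifts, peaks, rng))
            labels.append(label)
    return shifts, spectra, labels


def mean_intensity(spectra, labels, shifts, group, shift):
    """Mean intensity of one group at a given shift position."""
    idx = shifts.index(shift)
    values = [s[idx] for s, lab in zip(spectra, labels) if lab == group]
    return sum(values) / len(values)
```

Because the group-specific bands are known by construction, the variables selected by a method such as Boruta or SMD could be compared against them directly. Fitting the random forest itself is left out here and would require a separate implementation.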

