Abstract

Applications of machine learning in chemistry are many and varied, from prediction of structure–property relationships to modeling of potential energy surfaces for large-scale atomistic simulations. We describe a generalized approach for applying machine learning to the classification of spectra, which can serve as the basis for a wide variety of undergraduate projects. While our examples use FTIR and mass spectra, the approach could equally well be used with UV–visible, Raman, NMR, or indeed any other type of spectra. We summarize a number of unsupervised and supervised machine learning algorithms that can be used to classify spectra into groups, and illustrate their application using data from three projects carried out by fourth-year chemistry undergraduates. The three projects investigated the ability of the various machine learning approaches to correctly classify spectra of a variety of fruits, whiskies, and teas, respectively. In all cases the algorithms were able to differentiate between the samples used in each study, and the trained machine learning models could then be used to classify unknown samples with a high degree of accuracy (>98% in many cases). Depending on the extent to which students are expected to write their own code for the data analysis, the general model adopted in this work can be adapted for a variety of purposes, from short (one- to two-day) practical exercises and workshops to much longer independent student projects.
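
To make the idea of supervised spectral classification concrete, the following is a minimal sketch rather than the method used in the projects. The abstract does not name specific algorithms, so a simple nearest-centroid classifier is assumed here purely for illustration. The spectra are synthetic (Gaussian peaks plus noise), and the class names, peak positions, and function names are all hypothetical.

```python
import math
import random


def synthetic_spectrum(peak_centers, rng, n_points=200, noise=0.02):
    """Build a toy spectrum: a sum of Gaussian peaks plus random noise."""
    width = 4.0  # assumed peak width, in channel units
    spectrum = []
    for i in range(n_points):
        signal = sum(math.exp(-((i - c) ** 2) / (2 * width ** 2))
                     for c in peak_centers)
        spectrum.append(signal + rng.gauss(0.0, noise))
    return spectrum


def normalize(spectrum):
    """Scale a spectrum to unit Euclidean length so overall intensity is ignored."""
    norm = math.sqrt(sum(x * x for x in spectrum))
    return [x / norm for x in spectrum]


def fit_centroids(spectra, labels):
    """Average the training spectra of each class into one centroid per class."""
    grouped = {}
    for spectrum, label in zip(spectra, labels):
        grouped.setdefault(label, []).append(spectrum)
    return {label: [sum(column) / len(group) for column in zip(*group)]
            for label, group in grouped.items()}


def predict(centroids, spectrum):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids,
               key=lambda label: math.dist(centroids[label], spectrum))


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical classes and peak positions, chosen only for illustration.
classes = {"sample_A": [40, 120], "sample_B": [60, 150], "sample_C": [90, 170]}

train_X, train_y, test_X, test_y = [], [], [], []
for label, peaks in classes.items():
    for _ in range(20):  # training spectra per class
        train_X.append(normalize(synthetic_spectrum(peaks, rng)))
        train_y.append(label)
    for _ in range(10):  # held-out spectra per class
        test_X.append(normalize(synthetic_spectrum(peaks, rng)))
        test_y.append(label)

centroids = fit_centroids(train_X, train_y)
predictions = [predict(centroids, s) for s in test_X]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"Held-out accuracy on synthetic data: {accuracy:.2f}")
```

With real FTIR or mass spectra, the synthetic generator would be replaced by measured intensity vectors on a common axis. Evaluating on spectra held out from training, as above, is what allows a claim about classifying unknown samples.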

