Infrared (IR) spectroscopy is a powerful and versatile tool for analyzing functional groups in organic compounds. Interpreting large numbers of unknown spectra is complex and time-consuming, and it usually requires expertise in both chemistry and spectroscopy. This paper presents a new deep learning method that transforms IR spectral features into intuitive image-like feature maps and predicts major functional groups. We obtained 8272 gas-phase IR spectra from the NIST Chemistry WebBook. Feature maps are constructed from the intrinsic correlation of the spectral data, and prediction models are built on convolutional neural networks. Twenty-one major functional groups are successfully identified for each molecule using binary and multilabel models, without expert guidance or feature selection. The multilabel classification model produces all prediction results simultaneously, enabling rapid characterization. Further analysis of detailed substructures indicates that the model can extract abundant structural information from IR spectra for comprehensive investigation. Interpretation of the model reveals that the peaks of greatest interest to it are similar to those that spectroscopists typically consider. In addition to demonstrating great potential for spectral identification, our method may contribute to the development of automated analysis in many fields.
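To make the two central ideas concrete, the following minimal sketch shows one way a 1D spectrum could be turned into a square, image-like map, and how a multilabel model can report every functional group at once. The abstract does not specify the exact construction, so everything here is an assumption for illustration: the outer product of mean-centered, scaled intensities stands in for the "intrinsic correlation", the function names `correlation_map` and `multilabel_predict` are hypothetical, the logits are made up rather than produced by a trained convolutional network, and the toy spectrum is not NIST data.

```python
import math


def correlation_map(spectrum):
    """Turn a 1D spectrum into a square, image-like map.

    Assumption (not taken from the paper): entry (i, j) is the product of
    the mean-centered, unit-norm intensities at spectral points i and j.
    """
    n = len(spectrum)
    mean = sum(spectrum) / n
    centered = [x - mean for x in spectrum]
    # Fall back to 1.0 for a perfectly flat spectrum to avoid dividing by zero.
    norm = math.sqrt(sum(c * c for c in centered)) or 1.0
    scaled = [c / norm for c in centered]
    return [[a * b for b in scaled] for a in scaled]


def multilabel_predict(logits, threshold=0.5):
    """Apply an independent sigmoid to each group's logit.

    Returns one 0/1 flag per functional group, so all groups are
    decided in a single pass.
    """
    flags = []
    for z in logits:
        probability = 1.0 / (1.0 + math.exp(-z))
        flags.append(1 if probability >= threshold else 0)
    return flags


# Toy absorbance values at five wavenumber points (illustrative only).
spectrum = [0.02, 0.10, 0.85, 0.30, 0.05]
feature_map = correlation_map(spectrum)

# Hypothetical logits for three functional groups from some classifier.
labels = multilabel_predict([2.1, -1.3, 0.0])
```

In a full pipeline, a map such as `feature_map` would be the input to a convolutional network with one output per functional group. The independent sigmoid per output is what distinguishes a multilabel model from a set of separate binary classifiers: a single forward pass yields all labels.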