Abstract

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is a rapid and reliable method for bacterial identification. Classification algorithms, a critical part of MALDI-TOF MS analysis, have been developed using both traditional and machine learning approaches. In this study, a method combining helix matrix transformation with a convolutional neural network (CNN) algorithm is presented for bacterial identification. A total of 58 strains from 14 bacterial species were selected to create an in-house MALDI-TOF MS spectrum dataset. The 1D array-type MALDI-TOF MS spectrum data were converted by helix matrix transformation into matrix-type data, which were then used to train the CNN. After parameter optimization, the binarization threshold was set to 16 and the final matrix size to 25 × 25, yielding a clean and compact dataset. A CNN model with three convolutional layers was trained on this dataset to predict bacterial species. The filter sizes of the three convolutional layers were 4, 8, and 16, the kernel size was 3, and the activation function was the rectified linear unit (ReLU). A back propagation neural network (BPNN) model, built without helix matrix transformation or convolutional layers, served as a baseline to test whether helix matrix transformation combined with the CNN algorithm performs better. The areas under the receiver operating characteristic (ROC) curve of the CNN and BPNN models were 0.98 and 0.87, respectively. The accuracies of the CNN and BPNN models were 97.78 ± 0.08 and 86.50 ± 0.01, respectively, a statistically significant difference (p < 0.001). These results suggest that helix matrix transformation combined with the CNN algorithm enables feature extraction from bacterial MALDI-TOF MS spectra and may offer a solution for identifying bacterial species.
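The transformation from a 1D spectrum to a 25 × 25 matrix can be illustrated with a minimal sketch. The abstract does not specify the spiral path, the padding rule, or whether binarization precedes the transformation, so the choices below are assumptions for illustration only: the spiral winds clockwise from the top-left corner inward, short inputs are zero-padded, and intensities strictly above the threshold map to 1. The function names `binarize` and `helix_matrix` are hypothetical and are not taken from the study.

```python
def binarize(intensities, threshold=16):
    """Return 1 for intensities strictly above the threshold, else 0.

    Assumption: the comparison is strict; the abstract only gives the value 16.
    """
    return [1 if value > threshold else 0 for value in intensities]


def helix_matrix(values, size=25):
    """Lay a 1D sequence along a clockwise inward spiral in a size x size grid.

    Assumption: the path starts at the top-left corner. Inputs shorter than
    size * size are zero-padded and longer inputs are truncated.
    """
    padded = list(values)[: size * size]
    padded += [0] * (size * size - len(padded))
    grid = [[0] * size for _ in range(size)]
    top, bottom, left, right = 0, size - 1, 0, size - 1
    k = 0  # index of the next value to place
    while top <= bottom and left <= right:
        for col in range(left, right + 1):          # top edge, left to right
            grid[top][col] = padded[k]
            k += 1
        top += 1
        for row in range(top, bottom + 1):          # right edge, downward
            grid[row][right] = padded[k]
            k += 1
        right -= 1
        if top <= bottom:
            for col in range(right, left - 1, -1):  # bottom edge, right to left
                grid[bottom][col] = padded[k]
                k += 1
            bottom -= 1
        if left <= right:
            for row in range(bottom, top - 1, -1):  # left edge, upward
                grid[row][left] = padded[k]
                k += 1
            left += 1
    return grid
```

With a spiral layout, points that are adjacent in the 1D spectrum remain adjacent in the matrix, which is one plausible reason such a layout would suit convolutional kernels; the abstract itself does not state the rationale.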

Highlights

  • Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is a fast, inexpensive and reliable tool for the identification of bacteria, and it has become a gold standard for microbial identification in clinical microbiology laboratories in recent decades (Lasch et al, 2009; Bryson et al, 2019; Hou et al, 2019; Welker et al, 2019)

  • The accuracies of the convolutional neural network (CNN) and back propagation neural network (BPNN) models were 97.78 ± 0.08 and 86.50 ± 0.01, respectively, a statistically significant difference (p < 0.001). These results suggest that helix matrix transformation combined with the CNN algorithm achieves better classification performance in bacterial identification based on MALDI-TOF MS

  • Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry is a rapid, high-throughput method for bacterial identification that has been successfully applied in clinical microbiology laboratories (Schubert and Kostrzewa, 2017; Cordovana et al, 2018)

Summary

Introduction

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is a fast, inexpensive and reliable tool for the identification of bacteria, and it has become a gold standard for microbial identification in clinical microbiology laboratories in recent decades (Lasch et al, 2009; Bryson et al, 2019; Hou et al, 2019; Welker et al, 2019). In routine analysis, bacterial MALDI-TOF MS spectra are commonly identified with a similarity evaluation system. Sample spectra are compared with a standard spectrum library by calculating similarity across multiple parameters, such as peak positions, intensities and frequencies, to ensure the highest possible accuracy and reproducibility across a complete range of microorganisms (Wang et al, 2018; Rotcheewaphan et al, 2019). Users can extend the standard spectrum library to identify more bacterial species. However, a similarity evaluation system analyzes only a small number of attributes in MALDI-TOF MS spectra, such as peak height and peak area, and links them empirically to microbial species (Weis et al, 2020). Some challenging species with similar MS peaks, such as Shigella and E. coli, are difficult to identify with traditional algorithms (Ling et al, 2019).
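The library-matching idea described above can be sketched in a few lines. This is not the scoring scheme of any commercial system, which the text does not describe; it is a generic cosine similarity over binned peak intensities, shown only to make the comparison step concrete. The m/z range, the bin width, the function names, and the example library entries are all hypothetical.

```python
import math


def bin_peaks(peaks, mz_min=2000.0, mz_max=20000.0, bin_width=10.0):
    """Sum (m/z, intensity) peaks into fixed-width bins (assumed range and width)."""
    n_bins = int((mz_max - mz_min) / bin_width)
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            vector[int((mz - mz_min) / bin_width)] += intensity
    return vector


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(sample_peaks, library):
    """Return the library entry most similar to the sample, plus all scores."""
    sample = bin_peaks(sample_peaks)
    scores = {
        name: cosine_similarity(sample, bin_peaks(reference))
        for name, reference in library.items()
    }
    return max(scores, key=scores.get), scores


# Made-up reference peak lists, for illustration only.
library = {
    "species_A": [(3125.0, 80.0), (6250.0, 100.0), (9060.0, 40.0)],
    "species_B": [(4365.0, 90.0), (6250.0, 30.0), (7270.0, 100.0)],
}
name, scores = best_match([(3126.0, 75.0), (6251.0, 95.0), (9062.0, 35.0)], library)
```

Two species whose peak lists fall into nearly the same bins would receive nearly identical scores under a scheme like this, which illustrates why closely related organisms are hard to separate by peak similarity alone.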

