Abstract

High-throughput analysis of biomass is necessary to ensure consistent and uniform feedstocks for agricultural and bioenergy applications and to inform genomics and systems biology models. Pyrolysis followed by mass spectrometry, for example molecular beam mass spectrometry (py-MBMS), is becoming increasingly popular for the rapid analysis of biomass cell wall composition and typically requires different data analysis tools depending on the need and application. Here, the authors report the py-MBMS analysis of several types of lignocellulosic biomass to gain an understanding of spectral patterns and their variation with biomass composition, and use machine learning approaches to classify, differentiate, and predict biomass types on the basis of py-MBMS spectra. Py-MBMS spectra were also corrected for instrumental variance using generalized linear modeling (GLM), with the relative abundances of select ions serving as spike-in controls. Machine learning classification algorithms (e.g., random forest, k-nearest neighbors, decision tree, Gaussian Naïve Bayes, gradient boosting, and multilayer perceptron classifiers) were used. The k-nearest neighbors (k-NN) classifier generally performed the best for classifications using raw spectral data, and the decision tree classifier performed the worst. After normalization of spectra to account for instrumental variance, all the classifiers had comparable and generally acceptable performance for predicting the biomass types, although the k-NN and decision tree classifiers were less accurate for prediction of specific sample types. Gaussian Naïve Bayes (GNB) and extreme gradient boosting (XGB) classifiers performed better than the k-NN and decision tree classifiers for the prediction of biomass mixtures.
The data analysis workflow reported here could be applied and extended to compare biomass samples of varying types, species, phenotypes, and/or genotypes, or samples subjected to different treatments or environments, to further elucidate the sources of spectral variance and spectral patterns and to infer compositional information from the spectra, particularly when the feedstock composition or identity is not known a priori.
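To make the two main steps of the workflow concrete (correcting spectra for instrumental variance, then classifying biomass type), the following is a minimal sketch in plain Python. It is not the authors' implementation. It substitutes a simple rescaling by the summed control-ion intensity for the GLM-based correction described above, and it uses a hand-written k-NN vote. The six-channel profiles, the class names, the choice of control channels, and the helper names (`normalize_to_controls`, `knn_predict`, `simulate`) are all hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def normalize_to_controls(spectrum, control_idx):
    """Scale a spectrum so that its control-ion intensities sum to 1.

    Assumption: a plain ratio to the control ions, standing in for the
    GLM-based correction described in the abstract.
    """
    ref = sum(spectrum[i] for i in control_idx)
    if ref <= 0:
        raise ValueError("control ions have zero total intensity")
    return [x / ref for x in spectrum]


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k nearest training spectra."""
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def simulate(profile, rng):
    """Hypothetical spectrum: profile times a run-to-run gain, plus 2% noise."""
    gain = rng.uniform(0.5, 2.0)  # mimics instrumental drift between runs
    return [gain * p * (1 + rng.gauss(0, 0.02)) for p in profile]


# Made-up ion-abundance profiles; the last two channels play the role of
# control ions with the same expected abundance in every sample type.
PROFILES = {
    "hardwood": [10, 4, 8, 2, 5, 5],
    "grass": [4, 10, 3, 7, 5, 5],
}
CONTROL_IDX = [4, 5]

rng = random.Random(0)

# Build a normalized training set of 20 simulated spectra per class.
train_X, train_y = [], []
for label, profile in PROFILES.items():
    for _ in range(20):
        train_X.append(normalize_to_controls(simulate(profile, rng), CONTROL_IDX))
        train_y.append(label)

# Classify 10 fresh simulated spectra per class and score the predictions.
test_set = [
    (label, normalize_to_controls(simulate(profile, rng), CONTROL_IDX))
    for label, profile in PROFILES.items()
    for _ in range(10)
]
correct = sum(knn_predict(train_X, train_y, x, k=3) == label for label, x in test_set)
accuracy = correct / len(test_set)
print(f"accuracy after control-ion normalization: {accuracy:.2f}")
```

Because the simulated gain multiplies every channel equally, dividing by the control-ion total cancels it, which is the intuition behind using spike-in controls. Real instrumental variance is unlikely to be a single multiplicative factor, which is presumably one reason a model-based correction such as a GLM is used instead.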

Highlights

  • Pyrolysis coupled with molecular beam mass spectrometry can be used to rapidly analyze pyrolysates and corresponding ions generated from the decomposition of different biopolymers present in biomass based on the abundance of various ions in the resulting spectra

  • We examined six machine learning classifiers for these three classification problems: the random forest classifier, the decision tree classifier, the k-nearest neighbors (k-NN) classifier, the Gaussian Naïve Bayes (GNB) classifier, the multilayer perceptron (MLP) classifier, and the extreme gradient boosting (XGB) classifier

  • We have reported the analysis of different types of biomass and mixtures of biomass by py-MBMS and demonstrated the application of machine learning classification algorithms for the prediction of the biomass types


Summary

Introduction

The thermal decomposition of lignocellulosic biomass in the absence of oxygen, known as pyrolysis, has been used to study the composition and structure of cell walls at an analytical scale and to convert biomass to liquids and gases for larger scale chemical production. Pyrolysis coupled with molecular beam mass spectrometry (py-MBMS) can be used to rapidly analyze the pyrolysates, and the corresponding ions, generated from the decomposition of the different biopolymers present in biomass, based on the abundance of various ions in the resulting spectra. The sources of many pyrolysates and their corresponding ions and fragmentation patterns have been thoroughly investigated using various pyrolysis-mass spectrometry systems [1,2,8,10,11,12,13].

