Abstract

Mass spectral classifiers that give discrete present/absent answers have been computed for 15 substructures. The classifiers were developed with linear discriminant analysis (LDA) and discriminant partial least squares (DPLS). The low-resolution mass spectra were transformed into a set of 400 spectral features. Because each spectrum is described by so many features, some may be unnecessary and others may contribute only noise. Therefore, the effect of feature selection has been investigated. Two methods were used: selection by Fisher ratios and selection by a genetic algorithm (GA). The first method is univariate and the second is multivariate; advantages and disadvantages of both are discussed. On average, feature selection did not significantly change the classification performance compared with the results obtained with all features. However, it was possible to reduce the number of features considerably without a loss of classification performance. For a few substructures, GA combined with LDA resulted in much better classifiers than DPLS with all features. The features selected for classification of a benzyl substructure and of the presence of chlorine have been interpreted in terms of mass spectrometric fragmentation rules.
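To illustrate what univariate selection by Fisher ratios can look like, the following is a minimal sketch and not the authors' implementation. It assumes one common two-class definition of the Fisher ratio, the squared difference of the class means divided by the sum of the class variances; the abstract does not state the exact formula used. The function names, the synthetic data, and the choice of k are hypothetical.

```python
import numpy as np


def fisher_ratios(X, y):
    """Per-feature Fisher ratio for a two-class (present/absent) problem.

    Assumed definition: (mean_present - mean_absent)^2 / (var_present + var_absent).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=bool)
    present, absent = X[y], X[~y]
    mean_diff_sq = (present.mean(axis=0) - absent.mean(axis=0)) ** 2
    var_sum = present.var(axis=0, ddof=1) + absent.var(axis=0, ddof=1)
    # Guard against division by zero for constant features.
    return mean_diff_sq / np.maximum(var_sum, 1e-12)


def select_top_features(X, y, k):
    """Return the indices of the k features with the largest Fisher ratios."""
    ratios = fisher_ratios(X, y)
    return np.argsort(ratios)[::-1][:k]


# Synthetic stand-in for 400 spectral features (not real mass spectra).
rng = np.random.default_rng(0)
n_spectra = 200
y = rng.integers(0, 2, size=n_spectra).astype(bool)  # substructure present/absent
X = rng.normal(size=(n_spectra, 400))
X[y, 7] += 3.0   # feature 7 separates the classes strongly
X[y, 42] += 2.0  # feature 42 separates them less strongly

top = select_top_features(X, y, k=2)
```

Each feature is scored independently of all others, which is what makes the method univariate: it is fast, but it cannot detect features that discriminate only in combination. A multivariate search such as a GA evaluates whole feature subsets instead.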
