Abstract

Using linear support vector machines, we investigated feature selection for all-against-all classification of a set of 20 chemicals using two types of sensors: classical doped tin oxide sensors and zeolite-coated chromium titanium oxide sensors. We defined a simple set of possible features, namely the identity of the sensors and the sampling times, and tested all possible combinations of these features in a wrapper approach. Exhaustive comparison of these feature sets confirmed that performance improves relative to previous results on this data set. Using the maximal number of different sensors and all available data points for each sensor does not necessarily yield the best results, even for the large number of classes in this problem. We contrast this exhaustive screening of simple feature sets with a number of more complex feature choices and find that subsampled sets of simple features can perform better. Analysis of potential predictors of classification performance revealed some relevance of the clustering properties of the data and of correlations among sensor responses, but failed to identify a single measure that predicts classification success, reinforcing the relevance of the wrapper approach used. Comparison of the two sensor technologies showed that, in isolation, the doped tin oxide sensors performed better than the zeolite-coated chromium titanium oxide sensors, but that mixed arrays combining both technologies performed best.
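
The exhaustive wrapper screening described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the array sizes (3 sensors, 3 sampling times, 4 classes) are arbitrary, all function names are hypothetical, and a nearest-centroid classifier stands in for the linear support vector machine so that the example needs only the Python standard library. Each sample is assumed to be a flat list in which the reading of sensor s at sampling time t sits at index s * n_times + t.

```python
import itertools
import random


def nonempty_subsets(items):
    """Yield every non-empty subset of items as a tuple."""
    items = list(items)
    for k in range(1, len(items) + 1):
        yield from itertools.combinations(items, k)


def nearest_centroid_accuracy(train, test, feature_idx):
    """Stand-in scorer (NOT the linear SVM of the paper).

    Fits one centroid per class on the selected features and
    returns the fraction of test samples assigned to their true class.
    """
    sums, counts = {}, {}
    for x, y in train:
        vec = [x[i] for i in feature_idx]
        if y not in sums:
            sums[y] = [0.0] * len(vec)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], vec)]
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    correct = 0
    for x, y in test:
        vec = [x[i] for i in feature_idx]
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, centroids[c])),
        )
        correct += pred == y
    return correct / len(test)


def exhaustive_wrapper_search(train, test, n_sensors, n_times, score_fn):
    """Score every (sensor subset, sampling-time subset) pair."""
    results = []
    for sensors in nonempty_subsets(range(n_sensors)):
        for times in nonempty_subsets(range(n_times)):
            # Map the chosen sensors and times to flat feature indices.
            feature_idx = [s * n_times + t for s in sensors for t in times]
            results.append((score_fn(train, test, feature_idx), sensors, times))
    best = max(results, key=lambda r: r[0])
    return best, results


def make_toy_data(rng, n_classes, n_sensors, n_times, n_per_class):
    """Synthetic data: one random class centre plus Gaussian noise."""
    data = []
    for label in range(n_classes):
        center = [rng.uniform(-1, 1) for _ in range(n_sensors * n_times)]
        for _ in range(n_per_class):
            data.append(([c + rng.gauss(0, 0.5) for c in center], label))
    rng.shuffle(data)
    return data


rng = random.Random(0)
data = make_toy_data(rng, n_classes=4, n_sensors=3, n_times=3, n_per_class=20)
train, test = data[:60], data[60:]

# (2**3 - 1) sensor subsets times (2**3 - 1) time subsets = 49 candidates.
best, results = exhaustive_wrapper_search(
    train, test, 3, 3, nearest_centroid_accuracy
)
print("candidates scored:", len(results))
print("best (accuracy, sensors, times):", best)
```

Because the full array with all sampling times is itself one of the candidates, the best-scoring subset can never do worse than it on the data used for scoring. The number of candidates grows exponentially with the number of sensors and sampling times, and selecting on a single held-out split, as this sketch does, risks an optimistic estimate; a cross-validated score would be the more cautious choice.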
