Abstract

Tile-based variance rank initiated-unsupervised sample indexing (VRI-USI) analysis is introduced for comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry (GC×GC-TOFMS). VRI-USI analysis addresses the challenge that irrelevant variables often obscure true chemical variation when other unsupervised chemometric tools are used. Implementation of VRI-USI analysis with GC×GC-TOFMS data builds on the tile-based Fisher ratio (F-ratio) analysis software platform, which mitigates the effects of retention shifting in both separation dimensions, but uses an unsupervised variance metric (instead of the F-ratio metric) as the initial step to rank the hitlist. Next, k-means clustering is applied to each hit, with the number of clusters, k, selected using the silhouette metric, Smax, to reveal the extent to which recurring indexed sample clusters are uncovered. Finally, a probability-based evaluation of how the individual samples cluster throughout the hitlist reveals an unsupervised class membership. For a JP8 jet fuel dataset spiked with a sulfur-containing analyte mix at 30 ppm, 15 ppm, and neat, clustering by spike level at k = 3 was the most frequently recurring set of index assignments, occurring for 11 of the 14 spiked analytes. When these k-means index assignments were applied to the entire hitlist, all 14 spiked hits had one-way ANOVA p-values < 0.05, validating the presumed classes. Next, application of VRI-USI to a comparison of 3-ppm spiked and neat JP8 jet fuel exhibited performance similar to F-ratio analysis for analyte discovery. In the last study, for a dataset of J1800A, JP4, and JP8 jet fuel, each spiked with the sulfur-containing analyte mix at 30 ppm and neat, 453 of the 520 hits in the hitlist exhibited index assignments indicative of fuel type clustering, with the remaining 67 hits having contradictory assignments. Scrutiny of these 67 hits revealed nine hits with “split combinations” in index assignments, whereby the spiked and neat samples for a given fuel were in separate clusters. Eight of these hits were identified as spiked sulfur analytes. Interestingly, these hits also had large Smax values, indicative of a true sub-cluster. Thus, tile-based VRI-USI analysis appears to be a promising tool for unsupervised multi-class classification studies using GC×GC-TOFMS data.
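
The workflow summarized above (rank hits by variance, cluster each hit with k-means, choose k by the largest silhouette value, then look for the most frequently recurring set of sample index assignments) can be illustrated with a minimal sketch. It is not the authors' software. It assumes each hit has already been reduced to one signal value per sample, uses plain population variance in place of the paper's variance metric, and replaces the probability-based evaluation with a simple frequency count. All function names, parameters, and demo values are hypothetical.

```python
from collections import Counter
from statistics import mean, pvariance


def kmeans_1d(values, k, iters=50):
    """Lloyd's k-means on 1-D data with deterministic, quantile-based starting centers."""
    srt = sorted(values)
    n = len(srt)
    centers = [srt[int((i + 0.5) * n / k)] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assign every sample to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(mean(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels


def silhouette(values, labels):
    """Mean silhouette width; -1.0 when fewer than two clusters are populated."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        b = min(mean([abs(v - u) for u in other])
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


def canonical(labels):
    """Relabel clusters by order of first appearance so equal partitions compare equal."""
    seen = {}
    return tuple(seen.setdefault(lab, len(seen)) for lab in labels)


def vri_usi_sketch(hits, top_n, k_max=4):
    """hits: dict mapping a hit name to a list of per-sample signals (hypothetical layout)."""
    # Step 1: unsupervised ranking. Plain variance is an assumption made here.
    ranked = sorted(hits, key=lambda h: pvariance(hits[h]), reverse=True)[:top_n]
    per_hit = {}
    for name in ranked:
        vals = hits[name]
        best_s, best_labels = -2.0, None
        # Step 2: k-means for each candidate k, keeping the largest silhouette value.
        for k in range(2, k_max + 1):
            labels = kmeans_1d(vals, k)
            s = silhouette(vals, labels)
            if s > best_s:
                best_s, best_labels = s, labels
        per_hit[name] = (canonical(best_labels), best_s)
    # Step 3: most frequently recurring index assignment (a stand-in for the
    # probability-based evaluation described in the abstract).
    counts = Counter(p for p, _ in per_hit.values())
    partition, count = counts.most_common(1)[0]
    return {"partition": partition, "count": count,
            "n_hits": len(ranked), "per_hit": per_hit}


# Made-up demo data: six samples, four hits.
demo_hits = {
    "a": [1.0, 1.2, 0.9, 5.0, 5.3, 4.8],
    "b": [10.0, 10.5, 9.5, 2.0, 2.2, 1.9],
    "c": [3.0, 7.0, 3.2, 7.3, 2.9, 6.8],
    "flat": [4.0, 4.01, 3.99, 4.0, 4.02, 3.98],
}
result = vri_usi_sketch(demo_hits, top_n=3, k_max=3)
print(result["partition"], result["count"], "of", result["n_hits"])
```

In this toy input, the low-variance hit is dropped by the ranking step, and hits whose clusters contradict the dominant assignment remain visible in `per_hit`, loosely mirroring how the abstract treats contradictory hits as candidates for closer inspection.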
