Abstract
Chromatographic fingerprinting of complex biological and environmental samples is an active research area with a large and growing literature. Multivariate statistical and pattern recognition techniques can be effective methods for the analysis of such complex data. However, the classification of complex samples on the basis of their chromatographic profiles is complicated by two factors: (1) confounding of the desired group information by experimental variables or other systematic variations, and (2) random or chance classification effects with linear discriminants. Several projects involving these effects, and methods for dealing with them, are discussed. Complex chromatographic data sets often contain information dependent on experimental variables as well as information that differentiates classes. Complicating relationships of this kind are an innate part of fingerprint-type data. ADAPT, an interactive computer software system, has the clustering, mapping, and statistical tools necessary to identify and study these effects in realistically large data sets. In one study, pattern recognition analysis of 144 pyrochromatograms from cultured skin fibroblasts was used to differentiate cystic fibrosis carriers from presumed normal donors. Several experimental variables (donor gender, chromatographic column, etc.) were observed to contribute to the overall classification process. Notwithstanding these effects, discriminants were developed from the chromatographic peaks that assigned a given pyrochromatogram to its respective class (cystic fibrosis carrier versus normal) largely on the basis of the desired pathological difference. In another study, gas chromatographic profiles of cuticular hydrocarbon extracts obtained from 170 red fire ant samples were analyzed using pattern recognition methods. Clustering according to the biological variables of social caste and colony was observed.
Previously, Monte Carlo simulation studies have been carried out to assess the probability of chance classification for nonparametric linear discriminants. The level of expected chance classification was examined as a function of the number of observations, the dimensionality, the class membership distribution, and the covariance structure of the data. These simulation studies established limits on the approaches that can be taken with real data sets so that chance classifications are improbable.
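The idea behind such a simulation can be illustrated with a minimal sketch. It is not the procedure used in the studies summarized above: it assumes a simple perceptron as a stand-in for a nonparametric linear discriminant, uncorrelated Gaussian data, and two classes of equal size assigned at random, so any perfect separation it finds is due to chance alone. The function names and parameters (`perceptron_separates`, `chance_separation_rate`, `max_epochs`) are hypothetical and chosen only for illustration.

```python
import random


def perceptron_separates(points, labels, max_epochs=200):
    """Return True if a perceptron reaches zero training errors within max_epochs."""
    n_dim = len(points[0])
    weights = [0.0] * (n_dim + 1)  # last entry is the bias term
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(points, labels):
            # zip stops at len(x), so the bias entry is added separately
            activation = weights[-1] + sum(w * xi for w, xi in zip(weights, x))
            if y * activation <= 0:  # misclassified or on the boundary
                for j in range(n_dim):
                    weights[j] += y * x[j]
                weights[-1] += y
                errors += 1
        if errors == 0:
            return True
    return False


def chance_separation_rate(n_obs, n_dim, n_trials=50, max_epochs=200, seed=0):
    """Fraction of random data sets whose random class labels are perfectly separated.

    Assumptions of this sketch: independent standard-normal features and
    two classes of (nearly) equal size. A trial that does not converge within
    max_epochs counts as not separated, so the result is a lower bound.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        points = [[rng.gauss(0.0, 1.0) for _ in range(n_dim)] for _ in range(n_obs)]
        labels = [1] * (n_obs // 2) + [-1] * (n_obs - n_obs // 2)
        rng.shuffle(labels)  # labels carry no real class information
        if perceptron_separates(points, labels, max_epochs):
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    # Vary the ratio of observations to dimensionality and watch the rate change.
    for n_obs, n_dim in [(4, 10), (10, 5), (20, 5), (40, 2)]:
        rate = chance_separation_rate(n_obs, n_dim)
        print(f"n_obs={n_obs:3d}  n_dim={n_dim:3d}  chance separation rate={rate:.2f}")
```

Sweeping `n_obs` and `n_dim` in this way shows how the observation-to-descriptor ratio governs the risk of a spurious discriminant. Varying class proportions or introducing correlated features would be the natural next step toward the factors named above.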