Abstract
A mixed-type data matrix consisting of 11 quantitative carbohydrate variables and 23 binary secondary metabolite variables, measured in 5–8 isolates of 7 species of Penicillium, was analyzed using different multivariate statistical methods. This kind of data matrix is common in numerical taxonomy and has formerly been analyzed by consensus methods based on separate analyses of the quantitative and qualitative data matrices, by using Gower's general similarity coefficient for mixed data, or by location models. For the initial data treatment, the χ², Bray-Curtis and Canberra distance coefficients were useful for cluster analysis and for minimum spanning trees (MSTs) combined with principal coordinate analysis (PCO). The multivariate ordination methods hitherto recommended for chemotaxonomic data, principal component analysis (PCA) and its constrained ordination equivalent, partial least squares (PLS) analysis (using dummy variables for each species), gave seven quite diffuse clusters with some overlap in two-dimensional ordination plots, while correspondence analysis (CA) gave seven very clear clusters. The results indicate that qualitative data strongly dominate quantitative data and that these qualitative data are best represented in plots by correspondence analysis. However, in physiological studies, where the quantitative data may be considered the most important, PCA and CA are preferred for the analysis of mixed data. Dummy-constrained PLS may be used to select quantitative variables that are species specific rather than related to climatic conditions. In classification studies at the species level, it is recommended to use correspondence analysis on mixed chemotaxonomical data. In the latter studies, variables based on differentiation, such as the biosynthetic families of secondary metabolites used here, give clear species separations and can be used for further cladistic analyses.
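As an illustration of the coefficients named above, the following minimal sketch computes the Bray-Curtis and Canberra coefficients on quantitative profiles, and Gower's general similarity coefficient on a mixed quantitative and binary profile. It is not the authors' implementation: the function names, the two toy isolates and the variable ranges are hypothetical, and the binary part assumes Gower's convention for dichotomous characters, in which joint absences are ignored.

```python
from typing import Sequence


def bray_curtis(x: Sequence[float], y: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity for non-negative data: sum|x - y| / sum(x + y)."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den else 0.0


def canberra(x: Sequence[float], y: Sequence[float]) -> float:
    """Canberra distance: sum of |x - y| / (|x| + |y|), skipping 0/0 terms."""
    total = 0.0
    for a, b in zip(x, y):
        den = abs(a) + abs(b)
        if den:
            total += abs(a - b) / den
    return total


def gower_similarity(
    quant_i: Sequence[float],
    quant_j: Sequence[float],
    ranges: Sequence[float],
    bin_i: Sequence[int],
    bin_j: Sequence[int],
) -> float:
    """Gower similarity for one pair of objects described by mixed variables.

    Quantitative variables score 1 - |difference| / range.
    Binary variables score 1 for a joint presence and 0 for a mismatch;
    joint absences are given zero weight (an assumption of this sketch).
    """
    score = 0.0
    weight = 0
    for a, b, r in zip(quant_i, quant_j, ranges):
        score += (1.0 - abs(a - b) / r) if r else 1.0
        weight += 1
    for a, b in zip(bin_i, bin_j):
        if a == 0 and b == 0:
            continue  # joint absence carries no information
        score += 1.0 if a == b else 0.0
        weight += 1
    return score / weight if weight else 0.0


# Hypothetical toy isolates: three quantitative and four binary variables.
quant_a, quant_b = [2.0, 0.0, 5.0], [1.0, 1.0, 5.0]
ranges = [4.0, 2.0, 5.0]  # assumed range of each quantitative variable
bin_a, bin_b = [1, 0, 1, 0], [1, 0, 0, 0]

print(bray_curtis(quant_a, quant_b))
print(canberra(quant_a, quant_b))
print(gower_similarity(quant_a, quant_b, ranges, bin_a, bin_b))
```

In this toy example the binary variables contribute two of the five weighted terms in the Gower coefficient. With 23 binary against 11 quantitative variables, as in the data set described above, the qualitative part would carry correspondingly more weight.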