Abstract
Unlike mixtures consisting solely of non-Gaussian sources, mixtures including two or more Gaussian components cannot be separated using standard independent components analysis methods that are based on higher order statistics and independent observations. The mixed Independent Components Analysis/Principal Components Analysis (mixed ICA/PCA) model described here accommodates one or more Gaussian components in the independent components analysis model and uses principal components analysis to characterize contributions from this inseparable Gaussian subspace. Information theory can then be used to select from among potential model categories with differing numbers of Gaussian components. Simulation studies show that the assumptions and approximations underlying the Akaike Information Criterion do not hold in this setting, even with a very large number of observations. Cross-validation is a suitable, though computationally intensive, alternative for model selection. Application of the algorithm is illustrated using Fisher's iris data set and Howells' craniometric data set. Mixed ICA/PCA is of potential interest in any field of scientific investigation where the authenticity of blindly separated non-Gaussian sources might otherwise be questionable. Failure of the Akaike Information Criterion in model selection is also relevant to traditional independent components analysis, where all sources are assumed non-Gaussian.
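The inseparability of a Gaussian subspace can be illustrated with a small numerical sketch. This is not the mixed ICA/PCA algorithm itself, only a demonstration of the underlying fact, and it assumes excess kurtosis as a stand-in for the higher order statistics that ICA methods exploit. All function and variable names are hypothetical. Two independent unit-variance sources are rotated through a grid of angles: for uniform (sub-Gaussian) sources the kurtosis of the rotated coordinate changes with the angle, so a contrast function has something to optimize, whereas for Gaussian sources it stays near zero at every angle.

```python
import math
import random


def excess_kurtosis(xs):
    """Sample excess kurtosis (fourth standardized moment minus 3)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0


def first_rotated_coordinate(s1, s2, theta):
    """First coordinate of the two sources after rotation by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * a - s * b for a, b in zip(s1, s2)]


def kurtosis_spread(s1, s2, n_angles=13):
    """Max minus min excess kurtosis over a grid of angles in [0, pi/2]."""
    values = []
    for k in range(n_angles):
        theta = (math.pi / 2) * k / (n_angles - 1)
        values.append(excess_kurtosis(first_rotated_coordinate(s1, s2, theta)))
    return max(values) - min(values)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000               # illustrative sample size, chosen arbitrarily
half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has unit variance

gauss_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
gauss_b = [rng.gauss(0.0, 1.0) for _ in range(n)]
unif_a = [rng.uniform(-half_width, half_width) for _ in range(n)]
unif_b = [rng.uniform(-half_width, half_width) for _ in range(n)]

gauss_spread = kurtosis_spread(gauss_a, gauss_b)
unif_spread = kurtosis_spread(unif_a, unif_b)

print("kurtosis spread over rotations, Gaussian sources:", gauss_spread)
print("kurtosis spread over rotations, uniform sources: ", unif_spread)
```

In this sketch the Gaussian spread reflects only sampling noise, which is why no rotation of a subspace spanned by two or more Gaussian components can be preferred on the basis of higher order statistics.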
Highlights
Independent components analysis (ICA) has recently emerged as a valuable tool for the analysis of multivariate data sets and is increasingly used in a broad array of scientific contexts [1], [2], [3]
Using a linear discriminant function computed from the sepal and petal widths and lengths of I. setosa and I. versicolor samples, Fisher demonstrated that I. versicolor specimens had mean linear discriminant function scores intermediate between those of the other two species
The mixed ICA/PCA model described here can be viewed as a special instance of a class of ICA techniques that explicitly model more than one source distribution as part of the ICA optimization [10], [11], [35], [36]
Summary
Independent components analysis (ICA) has recently emerged as a valuable tool for the analysis of multivariate data sets and is increasingly used in a broad array of scientific contexts [1], [2], [3]. ICA techniques utilizing higher order statistics can separate mixtures of sub-Gaussian and/or super-Gaussian signals into their source components, thereby achieving blind source separation. When each individual multivariate observation represents an independent sample, ...
PLOS ONE | DOI:10.1371/journal.pone.0118877
... Stroke Foundation of Ontario through the Centre for Stroke Recovery, CIHR grant MOP84483, and a Bridging Brain, Mind and Behaviour grant from the James S. ... No individuals employed or contracted by the funders played any role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.