Principal subspace theorems deal with the problem of finding subspaces supporting optimal approximations of multivariate distributions. The optimality criterion considered in this paper is the minimization of the mean squared distance between the given distribution and an approximating distribution, subject to some constraints. Statistical applications include, but are not limited to, cluster analysis, principal components analysis and projection pursuit. Most principal subspace theorems deal with elliptical distributions or with mixtures of spherical distributions. We generalize these results using the notion of self-consistency and show their connections with the skew-normal distribution and projection pursuit techniques. We also discuss their implications, with special focus on principal points and self-consistent points. Finally, we assess the practical relevance of the theoretical results by means of several simulation studies.