We propose a novel multivariate method to analyse biodiversity data based on the Latent Dirichlet Allocation (LDA) model. LDA, a probabilistic model, reduces assemblages to sets of distinct component communities. It produces easily interpretable results, can represent abrupt and gradual changes in composition, accommodates missing data and allows for coherent estimates of uncertainty. We illustrate our method using tree data for the eastern United States and from a tropical successional chronosequence. The model is able to detect pervasive declines in the oak community in Minnesota and Indiana, potentially due to fire suppression, increased growing season precipitation and herbivory. The chronosequence analysis is able to delineate clear successional trends in species composition, while also revealing that site-specific factors significantly impact these successional trajectories. The proposed method provides a means to decompose and track the dynamics of species assemblages along temporal and spatial gradients, including effects of global change and forest disturbances.