Multimorbidity is common in older adults and complicates diagnosis and care in this population. We investigated co-occurrence patterns (clustering) of medical conditions in persons with Alzheimer's disease (AD) and their matched controls. The register-based Medication use and Alzheimer's disease study (MEDALZ) includes 70,718 community-dwelling persons with incident AD diagnosed during 2005-2011 in Finland and a matched comparison cohort. Latent Dirichlet Allocation was used to cluster the comorbidities (ICD-10 diagnosis codes). Modeling was performed separately for the AD and control cohorts. We experimented with different numbers of clusters (known as topics in the field of natural language processing), ranging from 5 to 20. Of the 20 most frequent diagnoses, 17 were the same in both cohorts. Based on a qualitative assessment by medical experts, the cluster patterns were not affected by the number of clusters, but the best interpretability was observed in the 10-cluster model. Quantitative assessment by log-likelihood estimate did not point to a specific optimal number of clusters. Multidimensional scaling visualized the variability in cluster size and the (dis)similarity between clusters, with more cluster overlap and variation in cluster size seen in the AD cohort. Early signs and symptoms of AD were more commonly clustered together in the AD cohort than in the comparison cohort. This study explored the use of natural language processing techniques to identify clustering patterns in data from an epidemiological study. From the computed clusters, it was possible to qualitatively identify multimorbidity patterns that differentiate AD cases and controls.
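
To illustrate the clustering approach described above, the following is a minimal sketch of Latent Dirichlet Allocation applied to diagnosis codes, where each person is treated as a "document" and each ICD-10 code as a "word". It is not the study's implementation: the abstract does not state which software or inference method was used, so collapsed Gibbs sampling is assumed here, and the function names (`fit_lda_gibbs`, `log_likelihood`), the hyperparameters (`alpha`, `beta`), and the toy records are all hypothetical. The toy data are invented and far smaller than the study cohorts, so the sketch tries 2 to 4 clusters rather than 5 to 20.

```python
import math
import random


def fit_lda_gibbs(records, n_clusters, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (an assumed inference method).

    records: one list of diagnosis codes per person.
    Returns (vocab, phi, theta): the code vocabulary, per-cluster code
    probabilities, and per-person cluster proportions.
    """
    rng = random.Random(seed)
    vocab = sorted({code for rec in records for code in rec})
    index = {code: i for i, code in enumerate(vocab)}
    n_codes = len(vocab)

    person_cluster = [[0] * n_clusters for _ in records]
    cluster_code = [[0] * n_codes for _ in range(n_clusters)]
    cluster_total = [0] * n_clusters

    # Random initial cluster assignment for every recorded diagnosis.
    assign = []
    for p, rec in enumerate(records):
        row = []
        for code in rec:
            k = rng.randrange(n_clusters)
            person_cluster[p][k] += 1
            cluster_code[k][index[code]] += 1
            cluster_total[k] += 1
            row.append(k)
        assign.append(row)

    for _ in range(n_iter):
        for p, rec in enumerate(records):
            for i, code in enumerate(rec):
                w, k = index[code], assign[p][i]
                # Remove the current assignment from the counts.
                person_cluster[p][k] -= 1
                cluster_code[k][w] -= 1
                cluster_total[k] -= 1
                # Resample the cluster from its full conditional distribution.
                weights = [
                    (person_cluster[p][t] + alpha)
                    * (cluster_code[t][w] + beta)
                    / (cluster_total[t] + n_codes * beta)
                    for t in range(n_clusters)
                ]
                k = rng.choices(range(n_clusters), weights=weights)[0]
                assign[p][i] = k
                person_cluster[p][k] += 1
                cluster_code[k][w] += 1
                cluster_total[k] += 1

    phi = [[(cluster_code[k][w] + beta) / (cluster_total[k] + n_codes * beta)
            for w in range(n_codes)] for k in range(n_clusters)]
    theta = [[(row[k] + alpha) / (sum(row) + n_clusters * alpha)
              for k in range(n_clusters)] for row in person_cluster]
    return vocab, phi, theta


def log_likelihood(records, vocab, phi, theta):
    """Plug-in log-likelihood of the recorded codes under the fitted model."""
    index = {code: i for i, code in enumerate(vocab)}
    total = 0.0
    for p, rec in enumerate(records):
        for code in rec:
            w = index[code]
            total += math.log(sum(t * row[w] for t, row in zip(theta[p], phi)))
    return total


# Invented toy records (NOT MEDALZ data): two loose groups of ICD-10 codes.
toy_rng = random.Random(1)
cardio = ["I10", "I25", "I48", "E11"]
neuro = ["F32", "R41", "F05", "R55"]
records = [toy_rng.sample(cardio if i % 2 else neuro, 3) for i in range(40)]

# Fit one model per candidate number of clusters and list the top codes.
for n_clusters in (2, 3, 4):
    vocab, phi, theta = fit_lda_gibbs(records, n_clusters)
    print(n_clusters, round(log_likelihood(records, vocab, phi, theta), 1))
    for k, row in enumerate(phi):
        top = sorted(zip(row, vocab), reverse=True)[:3]
        print("  cluster", k, [code for _, code in top])
```

In this sketch, the printed top codes per cluster correspond to what experts would inspect qualitatively, and the log-likelihood per candidate number of clusters corresponds to the quantitative comparison. Fitting the model separately on each cohort, as the study did, would amount to calling the fitting function once per cohort's records.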