A major goal of psycholinguistic theory is to account for the cognitive constraints limiting the speed and ease of language comprehension and production. Wide-ranging evidence demonstrates a key role for linguistic expectations: A word's predictability, as measured by the information-theoretic quantity of surprisal, is a major determinant of processing difficulty. But surprisal, under standard theories, fails to predict the difficulty profile of an important class of linguistic patterns: the nested hierarchical structures made possible by recursion in human language. These nested structures are better accounted for by psycholinguistic theories of constrained working memory capacity. However, progress on theory unifying expectation-based and memory-based accounts has been limited. Here we present a unified theory of a rational trade-off between precision of memory representations with ease of prediction, a scaled-up computational implementation using contemporary machine learning methods, and experimental evidence in support of the theory's distinctive predictions. We show that the theory makes nuanced and distinctive predictions for difficulty patterns in nested recursive structures predicted by neither expectation-based nor memory-based theories alone. These predictions are confirmed 1) in two language comprehension experiments in English, and 2) in sentence completions in English, Spanish, and German. More generally, our framework offers computationally explicit theory and methods for understanding how memory constraints and prediction interact in human language comprehension and production.