Abstract: In this study, we compared three models of reading comprehension using a large dataset of more than 6,500 students. By applying multidimensional item response models to the newly developed Bavarian reading test (BYLET), we compared a one-factor model with two models that represent four comprehension processes as well as the influence of text difficulty. Cross-validation indicated the best generalizability for the one-factor model, but factor loadings and global model fit provided some evidence for the process structure and for the influence of text difficulty, measured by word and sentence length. All psychometric models tested fit well, as indicated by global fit indices and loading patterns. The general factor scores showed evidence of reliability and validity. We conclude that theories of reading comprehension processes also apply, to some extent, to the measurement of reading comprehension as a trainable skill, and that the general factor score of the BYLET is suitable for reading comprehension screening between grades two and six. The study is preregistered. The analysis code is available at https://osf.io/xw9bv/?view_only=21549993ef79426eb4006ef82415f25c . Test materials are available to interested researchers on request.