Four pairs of competing hypotheses for phosphorus (P) retention in lakes are assessed, and 39 static models of three types, namely mechanistic, semi-mechanistic, and strictly-empirical models, are evaluated. Mechanistic models are based only on the physical representation of lakes and on basic hypotheses; hence, we used pairwise comparisons of these models to determine which hypothesis in each pair performs better. The results showed that (i) simulating lakes as mixed-flow reactors is superior to the plug-flow reactor hypothesis; (ii) modeling P loss as a second-order reaction outperforms modeling it as a first-order reaction; (iii) P loss is better explained as a removal process throughout the lake volume than as a settling process across the sediments; and (iv) assuming that a fraction of the P loading is associated with fast-settling particles improves lake total phosphorus (TP) predictions. The best mechanistic model combines, for the first time, the second-order reaction hypothesis with the hypothesis that a specific proportion of the P loading settles rapidly at the lake entrance. The comparison of the three model types showed that semi-mechanistic models outperform both mechanistic and strictly-empirical models, since they take the form of a mechanistic model based on the physical representation of lakes while using statistically derived equations for unknown parameters. The best-fit model is a semi-mechanistic model that adopts the mixed-flow reactor hypothesis with a second-order volumetric reaction rate calculated as a non-linear function of inflow TP concentration, lake average depth, and water retention time. This model explains 77.8% of the variability of log10-transformed lake TP concentration, which is 4.2% higher than the best mechanistic model and 0.8% higher than the best strictly-empirical model.
These findings not only improve the understanding of P retention in lakes but can also support the assessment of data-limited lakes and the simulation of the P cycle in large-scale hydrological models.
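
As an illustration of the competing reactor and reaction-order hypotheses, the sketch below evaluates textbook steady-state mass balances for a lake treated as a mixed-flow or plug-flow reactor with first-order or second-order volumetric P loss. It is a minimal sketch under stated assumptions, not the formulation or calibration used in this study: the function names, the rate constant `k`, the fast-settling fraction `f_fast`, and all numerical values are hypothetical.

```python
import math

# Assumed units (illustrative only): p_in in ug/L, tau (water retention time)
# in years, k in 1/yr for first-order loss and L/(ug*yr) for second-order loss.

def effective_inflow(p_in, f_fast=0.0):
    """Inflow TP remaining after a hypothetical fraction f_fast settles at the entrance."""
    return (1.0 - f_fast) * p_in

def mixed_first_order(p_in, k, tau):
    """Mixed-flow reactor, first order: 0 = (p_in - p)/tau - k*p."""
    return p_in / (1.0 + k * tau)

def mixed_second_order(p_in, k, tau):
    """Mixed-flow reactor, second order: 0 = (p_in - p)/tau - k*p**2 (positive root)."""
    a = k * tau
    if a == 0.0:
        return p_in
    return (-1.0 + math.sqrt(1.0 + 4.0 * a * p_in)) / (2.0 * a)

def plug_first_order(p_in, k, tau):
    """Plug-flow reactor, first order: dp/dt = -k*p integrated over travel time tau."""
    return p_in * math.exp(-k * tau)

def plug_second_order(p_in, k, tau):
    """Plug-flow reactor, second order: dp/dt = -k*p**2 integrated over travel time tau."""
    return p_in / (1.0 + k * tau * p_in)

if __name__ == "__main__":
    # Hypothetical lake: 100 ug/L inflow TP, 20% fast-settling, 2-year retention time.
    p_eff = effective_inflow(100.0, f_fast=0.2)
    print("mixed-flow, first order :", mixed_first_order(p_eff, 0.5, 2.0))
    print("mixed-flow, second order:", mixed_second_order(p_eff, 0.01, 2.0))
    print("plug-flow, first order  :", plug_first_order(p_eff, 0.5, 2.0))
    print("plug-flow, second order :", plug_second_order(p_eff, 0.01, 2.0))
```

In a semi-mechanistic variant of this sketch, `k` would not be a constant but would be supplied by a statistically fitted function of inflow TP concentration, lake average depth, and water retention time; that function is not reproduced here.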