Abstract Accurate representation of the planetary boundary layer (PBL) in numerical weather prediction is crucial to air quality, weather forecasting, and climate change research and operations. Here, we evaluate the estimates of PBL height (PBLH) from a set of convective-permitting simulations conducted using the Weather Research and Forecasting (WRF) Model during the Plains Elevated Convection at Night (PECAN) campaign (from June to mid-July 2015). Our analysis aims to quantify the variability in PBLH, along with its associated surface variables and atmospheric profiles, to gauge the influence of model physics on the accuracy of simulated PBL properties. Specifically, we conducted twenty-four 45-day retrospective simulations encompassing different combinations of three PBL and two microphysics schemes, two initial and boundary condition (ICBC) datasets, and two model vertical grid spacings. We compare these simulations with measurements from surface sites and radiosonde launches across four sites in the southern Great Plains during PECAN. Our findings indicate that deriving PBLH from modeled atmospheric profiles yields a narrower spread (200–300 m) compared to PBLH calculated from the model’s PBL scheme (400–500 m) relative to radiosonde-derived PBLH under fair weather conditions. The choice of PBL scheme and ICBCs exerts the most significant impact (by up to 400 m) on the spread of modeled PBLH and associated atmospheric profiles, primarily influencing the representation of surface heating, transport, and mixing processes in the model. The model has difficulty reproducing the correct thermodynamic profile under cloud-intense conditions. These results underscore the benefit of using the physically consistent approach in retrieving, evaluating, and assimilating PBLH estimates, as well as the utility of multiphysics to better address model uncertainties. Significance Statement Studies on weather forecast models typically involve running the model multiple times and adjusting model inputs slightly each time to produce a range of predicted variables. This range is crucial for evaluating forecast probabilities and integrating observations into the model. We found that using different model physics schemes can result in an equal or greater spread for predicted variables (here, we focus on boundary layer height and related variables), compared to input tweaks alone. Additionally, a particular scheme can introduce biases. These findings underscore the importance of employing multiple physics schemes in the model, as they widen the coverage (potentially doubling) of outputs, and thus better address forecast uncertainties.