Physician performance is critical to caring for patients admitted to the intensive care unit (ICU), who are in life-threatening situations and require high level medical care and interventions. Evaluating physicians is crucial for ensuring a high standard of medical care and fostering continuous performance improvement. The non-randomized nature of ICU data often results in imbalance in patient covariates across physician groups, making direct comparisons of the patients’ survival probabilities for each physician misleading. In this article, we utilize the propensity weighting method to address confounding, achieve covariates balance, and assess physician effects. Due to possible model misspecification, we compare the performance of the propensity weighting methods using both parametric models and super learning methods. When the generalized propensity or the quality function is not correctly specified within the parametric propensity weighting framework, super learning-based propensity weighting methods yield more efficient estimators. We demonstrate that utilizing propensity weighting offers an effective way to assess physician performance, a topic of considerable interest to hospital administrators.