Medical schools are required to evaluate their curricula and to develop exam questions supported by strong reliability and validity evidence, often from data collected on small samples of medical students. Achieving a sample large enough to reliably and validly evaluate courses, assessments, and exam questions would demand extensive data collection over many years, which is inefficient, especially in the fast-changing educational environment of medical schools. This article demonstrates how advanced quantitative methods such as bootstrapping can yield reliable estimates by resampling a single dataset to create many simulated samples. This economical approach, among others, allows the construction of confidence intervals and, consequently, more accurate evaluation of exam questions as well as broader course and curriculum assessments. Bootstrapping thus offers a robust alternative to traditional methods, improving the psychometric quality of exam questions and contributing to fair and valid assessments in medical education.
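To make the resampling idea concrete, the sketch below illustrates a percentile bootstrap confidence interval for a single exam question's difficulty index (proportion of correct answers). The response data, the number of resamples, and the percentile method are illustrative assumptions, not details taken from the article:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 1 = correct, 0 = incorrect answers from a small
# cohort of 30 students on one exam question (illustrative values only).
responses = np.array([1] * 21 + [0] * 9)

n_boot = 10_000  # number of simulated samples drawn from the one dataset

# Resample the observed responses with replacement and compute the
# difficulty index (proportion correct) for each simulated sample.
boot_difficulty = np.array([
    rng.choice(responses, size=responses.size, replace=True).mean()
    for _ in range(n_boot)
])

# Percentile-method 95% bootstrap confidence interval for the difficulty.
lo, hi = np.percentile(boot_difficulty, [2.5, 97.5])
print(f"observed difficulty: {responses.mean():.2f}")
print(f"95% bootstrap CI: [{lo:.2f}, {hi:.2f}]")
```

The same resampling loop can be applied to other item or course statistics (e.g., discrimination indices or mean course ratings) by swapping out the statistic computed on each simulated sample.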