As residency programs transition from time- to performance-based competency standards, validated tools are needed to measure performance-based learning outcomes, and studies are required to characterize the learning experience for residents. Since pediatric musculoskeletal (MSK) radiograph interpretation can be challenging for emergency medicine trainees, we introduced a Web-based pediatric MSK radiograph learning system with performance endpoints into pediatric emergency medicine (PEM) fellowships and determined the feasibility and effectiveness of implementing this intervention. This was a multicenter prospective cohort study conducted over 12 months. The course offered 2,100 pediatric MSK radiographs organized into seven body regions. PEM fellows diagnosed each case and received feedback after each interpretation. Participants completed cases until they achieved a performance benchmark of at least 80% accuracy, sensitivity, and specificity. The main outcome measure was the median number of cases completed by participants to achieve the performance benchmark. Fifty PEM fellows from nine programs in the United States and Canada participated. Of 350 possible modules, 301 (86%) were started and 250 (71%) were completed to the predefined performance benchmark during the study period. The median (interquartile range [IQR]) number of cases to performance benchmark per participant was 78 (60-104; min = 56, max = 1,333). The median number of cases needed to achieve the performance benchmark was higher for the ankle module than for the other modules (ankle 366 vs. other 76; difference = 290, 95% confidence interval [CI] = 245 to 335). The performance benchmark was achieved by 90.7% of participants in all modules except the ankle/foot, where 34.9% achieved this goal (difference = 55.8%, 95% CI = 45.3 to 66.3). 
The mean (95% CI) change in accuracy, sensitivity, and specificity from baseline to performance benchmark was +14.6% (13.4 to 15.8), +16.5% (14.8 to 18.1), and +12.6% (10.7 to 14.5), respectively. Median (IQR) time on each case was 31.0 (21.0-45.3) seconds. Most participants completed the modules to the performance benchmark within 1 hour and demonstrated significant skill improvement. Further, there was large variation in the number of cases completed to achieve the performance endpoint in any given module, and this affected the feasibility of completing specific modules.
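The performance benchmark described above (at least 80% accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch. It assumes each case is scored as a binary pair (whether the radiograph is truly abnormal, and whether the participant called it abnormal); the function names and the way cases are pooled are hypothetical, since the abstract does not state how the learning system computed these values.

```python
from typing import Iterable, Tuple


def performance_metrics(cases: Iterable[Tuple[bool, bool]]) -> Tuple[float, float, float]:
    """Return (accuracy, sensitivity, specificity) for scored cases.

    Each case is (truly_abnormal, called_abnormal). This binary scoring
    is an assumption made for illustration only.
    """
    tp = fp = tn = fn = 0
    for truly_abnormal, called_abnormal in cases:
        if truly_abnormal and called_abnormal:
            tp += 1  # abnormal case correctly identified
        elif truly_abnormal and not called_abnormal:
            fn += 1  # abnormal case missed
        elif not truly_abnormal and called_abnormal:
            fp += 1  # normal case overcalled
        else:
            tn += 1  # normal case correctly identified

    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity


def meets_benchmark(cases: Iterable[Tuple[bool, bool]], threshold: float = 0.80) -> bool:
    """True when accuracy, sensitivity, and specificity all reach the threshold."""
    return all(metric >= threshold for metric in performance_metrics(cases))


# Hypothetical example: 10 abnormal cases (9 detected), 10 normal cases (9 cleared).
example = (
    [(True, True)] * 9 + [(True, False)] * 1
    + [(False, False)] * 9 + [(False, True)] * 1
)
print(performance_metrics(example))
print(meets_benchmark(example))
```

Requiring all three measures to reach the threshold means a participant cannot meet the benchmark by overcalling or undercalling abnormalities, which is one plausible reason some modules required many more cases than others.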