Surgical training has undergone remarkable changes over the past two decades. No longer is the operating room the sole training ground for medical students and residents learning surgery. Basic skills, such as knot tying and laparoscopic suturing, can be practiced in the confines of a surgical skills laboratory.1 Studies have shown that ex-vivo surgical skills training can lead to significant improvement in actual intra-operative performance.2 Models used to teach learners have varied in fidelity, in terms of how realistic the model looks, from simple bench models to complex virtual reality (VR) models. However, a high-fidelity or virtual reality model is not synonymous with an excellent teaching tool.3

The authors of this study have developed a VR model that captures the critical constructs of performing a transrectal ultrasound and biopsy (TRUS-BX).4 Targeting is the foundation of TRUS-BX, and teaching a learner how to target is the focus of this simulator. The VR simulator developed by the University of Western Ontario group incorporates real patient 3D TRUS data to train and test a learner’s ability to target 12 virtual targets. The simulator calculates the accuracy of each biopsy taken and also records time. Face and content validity, assessed through a self-developed questionnaire, showed this model to provide a very realistic feel and simulation of a TRUS-BX. The VR TRUS-BX simulator was also able to discriminate performance between experts and novices. Furthermore, improvements in performance were demonstrated with use of the model. As the authors surmised, this VR simulator shows promise as a tool to train residents and to serve a role in continuing professional development. We have seen great progress in surgical education research and advances in the science of technical skills assessment. However, much more remains to be desired.
As technology advances and more high-fidelity VR simulators become available, there is a need for standardized methods of evaluating them. Measuring a simulator’s face and content validity needs to be more than experts’ opinion that “yes, this looks and feels like the real thing.” A stringent, reliable, and accurate tool to measure face and content validity would allow meaningful comparison between simulator offerings from different companies. Measuring construct validity using subjects of varying experience has become a standard and accepted method of simulator validation in the surgical education field. What would be highly desirable is a simulator with high predictive validity: can performance on the simulator predict performance in the real world? This would have significant implications for residency training and could introduce high-stakes technical skills examinations that use simulators to attest to one’s competence. To achieve this goal, one needs to be able to measure technical performance in the operating room, which remains the “holy grail” of surgical education. Ethical issues involving live patients and the lack of unobtrusive, practical methods of intra-operative assessment remain impeding factors. Surgical education research is an evolving field, and it is encouraging to see well-designed simulators being evaluated in a thorough manner.