Surgical approaches that access the posterior temporal bone require careful drilling motions to achieve adequate exposure while avoiding injury to critical structures. We assessed a deep-learning hand motion detector as a potential tool for refining hand motion and precision during power drill use in a cadaveric mastoidectomy procedure. The detector tracked the movement of a surgeon's hands during three cadaveric mastoidectomy procedures, providing horizontal and vertical coordinates of 21 landmarks on each hand, which were used to create horizontal- and vertical-plane tracking plots. Preliminary surgical performance metrics were calculated from the motion detections. In total, 1,948,837 landmark detections were collected, with an overall detection performance of 85.9%. Detections were distributed similarly between the dominant hand (48.2% of detections) and the non-dominant hand (51.7%). Tracking was lost when the microscope light increased the brightness at the center of the field and when a hand moved outside the camera's field of view. The mean (SD) time spent was 21.5 (12.4) seconds during instrument changes and 4.4 (5.7) seconds during microscope adjustments. A deep-learning hand motion detector can measure surgical motion during cadaveric mastoidectomy simulations without physical sensors attached to the hands. Although preliminary metrics were developed to assess hand motion during mastoidectomy, further studies are needed to expand and validate these metrics for potential use in guiding and evaluating surgical training.
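The 21-landmark, two-hand output described above matches the format produced by common open-source hand-tracking models such as MediaPipe Hands. The sketch below is not the authors' implementation; it is a minimal illustration, assuming MediaPipe Hands as the detector, a placeholder video file name, and the wrist landmark as a representative point per hand, of how per-frame landmark coordinates could be extracted from a recorded procedure and turned into horizontal- and vertical-plane tracking plots.

```python
"""Minimal sketch: per-frame hand-landmark extraction and tracking plots.

Assumptions not taken from the study: MediaPipe Hands is used as the
21-landmark, two-hand detector, the recording is an ordinary video file
("mastoidectomy.mp4" is a placeholder name), and the wrist landmark
(index 0) stands in for each hand's position.
"""
import cv2
import matplotlib.pyplot as plt
import mediapipe as mp

WRIST = 0  # index 0 is the wrist in the 21-point hand-landmark model


def extract_tracks(video_path, max_hands=2):
    """Return per-hand lists of (frame, x, y) wrist positions in normalized image coordinates."""
    tracks = {"Left": [], "Right": []}
    frames = detected_hands = 0
    hands = mp.solutions.hands.Hands(
        static_image_mode=False,
        max_num_hands=max_hands,
        min_detection_confidence=0.5,
        min_tracking_confidence=0.5,
    )
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames += 1
        result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if not result.multi_hand_landmarks:
            continue  # tracking lost, e.g. glare at the field center or hand out of frame
        for landmarks, handedness in zip(result.multi_hand_landmarks,
                                         result.multi_handedness):
            label = handedness.classification[0].label  # "Left" or "Right"
            wrist = landmarks.landmark[WRIST]
            tracks[label].append((frames, wrist.x, wrist.y))
            detected_hands += 1  # each detected hand yields 21 landmarks
    cap.release()
    hands.close()
    if frames:
        # Crude per-frame detection rate; not the study's performance metric.
        print(f"{21 * detected_hands} landmark detections over {frames} frames "
              f"({100 * detected_hands / (frames * max_hands):.1f}% of possible hand detections)")
    return tracks


def plot_tracks(tracks):
    """Plot horizontal and vertical wrist position against frame number for each hand."""
    fig, (ax_x, ax_y) = plt.subplots(2, 1, sharex=True)
    for label, points in tracks.items():
        if not points:
            continue
        frame_idx, xs, ys = zip(*points)
        ax_x.plot(frame_idx, xs, label=label)
        ax_y.plot(frame_idx, ys, label=label)
    ax_x.set_ylabel("horizontal position (normalized)")
    ax_y.set_ylabel("vertical position (normalized)")
    ax_y.set_xlabel("frame")
    ax_x.legend()
    plt.show()


if __name__ == "__main__":
    plot_tracks(extract_tracks("mastoidectomy.mp4"))  # placeholder path
```

From the same per-hand (frame, x, y) tracks, summary measures such as detection rate or time spent around instrument changes and microscope adjustments could in principle be derived, although the study's specific metric definitions are not reproduced here.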