Abstract

We concentrate our research on using VR technology for the automatic transcription of basic human motions in the Methods-Time Measurement (MTM) system. MTM is a predetermined motion time system that consists of a list of predefined basic motions and the mean time values corresponding to those motions; it is used to analyze manual workplaces. Currently, MTM analysis is conducted manually: the working process to be analyzed is captured on video and then divided into a sequence of basic MTM motions. Various MTM systems differ in their level of granularity, such as MTM-1, MTM-2, and MTM-UAS. We propose and evaluate an approach for the automatic transcription of the MTM-1 basic motions.

The MTM-1 basic motions our algorithm transcribes, grouped by the body parts involved, are:

Hand motions: Grasp, Release, Position, Disengage, Apply pressure.
Arm motions: Reach, Move, Crank, Turn.
Body motions: Sit, Bend, Kneel one knee, Kneel both knees.
Leg motions: Step, Leg gestures.

Methodology: We use Unity software to create the virtual environment (VE) and the interactions within it. For body and hand tracking, we use the HTC Vive tracking system and the Sensoryx VRfree data glove. Because the MTM-1 system distinguishes between different grasping types, using hand-tracking technology instead of controllers is essential. An HTC Vive Pro headset is used to visualize the VE. Our automatic transcription algorithm employs four decision trees that run simultaneously, each dedicated to transcribing hand, arm, body, or leg motions in real time.

Hand motions occur when objects are grasped or released; the trigger for the hand motion decision tree is therefore the beginning or end of contact with an object. Hand motions were observed to commonly occur in the following order: Grasp -> Disengage -> Position -> Apply pressure -> Release. The Grasp and Release motions are always present in a cycle; Disengage, Position, and Apply pressure are optional and are performed while holding an object, i.e., between grasping and releasing (see the first sketch after the abstract).

The arm motion decision tree is triggered by the output of the hand motion decision tree; arm motions are transcribed backward, using the recorded tracking information.

The body motion category consists of bending, sitting, and kneeling motions. The assumed initial state of the user is the standing position. These motions occur sequentially, so to transcribe a kneeling motion, the threshold values for the sit (T_sit) and bend (T_bend) motions must also be reached (see the second sketch below).

The leg motions consist of step motions and leg gestures. A leg gesture is a motion in which only the leg and/or foot moves, for example, pressing a pedal. The metric chosen to detect these motions is the velocity of the foot (see the third sketch below).

Results: We conducted a user study with 33 participants, who performed a total of 2738 motions. Of these, 2670 were correctly transcribed by the algorithm (true positives, TP) and 68 were not captured (false negatives, FN); the algorithm additionally produced 176 incorrect transcriptions (false positives, FP). From these counts we computed precision and recall:

Precision = TP / (TP + FP) = 2670 / (2670 + 176) ≈ 0.938
Recall = TP / (TP + FN) = 2670 / (2670 + 68) ≈ 0.975

Besides the overall statistics, we also calculated statistics separately for every basic motion type.
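
The hand-motion cycle described above can be illustrated as a small event-driven state machine. The following is a minimal sketch, not the authors' implementation; the event handlers and the `HandMotionCycle` interface are assumptions introduced for the example.

```python
# Sketch of the hand-motion cycle: Grasp -> Disengage -> Position ->
# Apply pressure -> Release, triggered by the start/end of object contact.
# Class and method names are illustrative assumptions.

OPTIONAL_IN_CYCLE = {"Disengage", "Position", "Apply pressure"}

class HandMotionCycle:
    def __init__(self):
        self.holding = False   # True between Grasp and Release
        self.transcript = []   # ordered list of transcribed motions

    def on_touch_begin(self):
        """Start of contact with an object triggers a Grasp."""
        self.holding = True
        self.transcript.append("Grasp")

    def on_touch_end(self):
        """End of contact triggers a Release, closing the cycle."""
        self.holding = False
        self.transcript.append("Release")

    def on_optional_motion(self, motion):
        """Disengage/Position/Apply pressure only occur while holding."""
        if self.holding and motion in OPTIONAL_IN_CYCLE:
            self.transcript.append(motion)

cycle = HandMotionCycle()
cycle.on_touch_begin()                  # Grasp
cycle.on_optional_motion("Position")    # performed while holding
cycle.on_touch_end()                    # Release
print(cycle.transcript)                 # ['Grasp', 'Position', 'Release']
```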
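
The sequential threshold logic for body motions can be sketched as below. The abstract only states that the bend and sit thresholds must also be reached before a kneel is transcribed, starting from a standing pose; the hip-height metric and all threshold values here are assumptions for illustration.

```python
# Sketch of sequential body-motion thresholds (T_bend, T_sit, T_kneel).
# The normalized hip-height metric and the numeric values are
# illustrative assumptions; the abstract specifies only the ordering.

T_BEND = 0.85    # hypothetical fraction of standing hip height
T_SIT = 0.60
T_KNEEL = 0.35

def classify_body_motion(hip_height_ratio: float) -> str:
    """Classify posture from normalized hip height (1.0 = standing).

    Checking the deepest threshold first mirrors the sequential logic:
    a kneel implies the sit and bend thresholds were also passed.
    """
    if hip_height_ratio <= T_KNEEL:
        return "Kneel"
    if hip_height_ratio <= T_SIT:
        return "Sit"
    if hip_height_ratio <= T_BEND:
        return "Bend"
    return "Stand"

print(classify_body_motion(0.9))   # Stand
print(classify_body_motion(0.5))   # Sit
```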
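
Foot velocity as the detection metric for leg motions could look like this sketch. The velocity threshold and the rule for separating a step from a leg gesture (whether the hip also moves) are assumptions; the abstract names only the metric itself.

```python
# Sketch of leg-motion detection from foot velocity. The threshold value
# and the step-vs-gesture rule are illustrative assumptions.

import math

V_FOOT = 0.3  # hypothetical velocity threshold in m/s

def foot_speed(prev_pos, curr_pos, dt):
    """Finite-difference speed of the tracked foot between two frames."""
    return math.dist(prev_pos, curr_pos) / dt

def detect_leg_motion(foot_v, hip_v):
    """A moving foot with a stationary hip suggests a leg gesture
    (e.g., pressing a pedal); if the hip moves too, it is a step."""
    if foot_v < V_FOOT:
        return None
    return "Step" if hip_v >= V_FOOT else "Leg gesture"

v = foot_speed((0.0, 0.0, 0.0), (0.02, 0.0, 0.01), dt=1 / 90)  # ~2.0 m/s
print(detect_leg_motion(foot_v=v, hip_v=0.05))  # Leg gesture
```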
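
The reported precision and recall follow directly from the stated counts; the short computation below reproduces them.

```python
# Reproducing the reported precision and recall from the study's counts.
tp, fp, fn = 2670, 176, 68

precision = tp / (tp + fp)   # 2670 / 2846
recall = tp / (tp + fn)      # 2670 / 2738

print(f"Precision = {precision:.3f}")  # 0.938
print(f"Recall    = {recall:.3f}")     # 0.975
```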

