Simultaneous localization and mapping (SLAM) concurrently estimates the motion of an agent and a map of its environment. SLAM has progressed rapidly in recent years and is now used in fields such as unmanned aerial vehicles (UAVs), medical surgery, and endoscopic procedures. The aim of this article is to devise a more accurate physiotherapy exercise monitoring device, based on an analysis of eight different SLAM algorithms with criteria including power and memory consumption, CPU heat, and CPU utilization. This article provides a comprehensive evaluation on an embedded platform and is the first of its kind, particularly given that the ego-motion estimation of SLAM systems has never been evaluated so explicitly before. Based on the results of this analysis, we propose stereo visual-inertial tracking (S-VIT) for lower limb tracking in physiotherapy applications. The proposed algorithm significantly improves on the results of state-of-the-art algorithms. Data sets of various physiotherapy rehabilitation exercises for the leg are also collected for detailed validation, with the ground truth acquired using a state-of-the-art motion tracking system, Vicon.
Note to Practitioners: Real-time tracking in an unknown environment is challenging, especially when high accuracy is required. In physiotherapy, analyzing daily recorded data (acquired from a patient’s body movement) benefits the rehabilitation process. However, keeping a daily record of a patient’s body motion during exercise is difficult in most circumstances because precise portable devices for accurate motion tracking are not available. In this article, a solution for tracking the patient’s motion during physiotherapy using a small hand-held device is presented.
For this purpose, simultaneous localization and mapping (SLAM) is used, which, to the best of our knowledge, has not been used in the field of physiotherapy before. In the first part of this article, we present an extensive analysis of eight SLAM algorithms based on power and memory consumption, CPU heat, and CPU utilization. Based on these results, we select the best-performing algorithm, EMoVI-SLAM, and modify it to obtain a new SLAM algorithm called stereo visual-inertial tracking (S-VIT). We collect a data set of various lower limb movements and compare the proposed algorithm with EMoVI-SLAM on this data set. The results show that S-VIT outperforms EMoVI-SLAM.