Purpose: Exercise prescription plays a fundamental role in the treatment of knee pathologies such as knee osteoarthritis; in the UK, over 90,000 total knee replacement patients receive regular post-surgery physiotherapy each year. Physiotherapists rely on home-based exercise prescription, yet they have limited knowledge of patient engagement at home and find it difficult to objectively monitor patient progress, attribute functional improvement (or lack thereof) to adherence or non-adherence, and prescribe personalised interventions. The research vision is to facilitate unobtrusive, sensor-driven home monitoring of, and feedback on, knee rehabilitation exercises. As a first step, this study sought to fine-tune a machine learning algorithm to classify different knee exercises and to understand the impact of different feature selection parameters on classification performance. Methods: Eight volunteers (4 healthy, 4 with a self-reported history of knee pain/pathology but not receiving treatment) performed 15 repetitions of 4 knee rehabilitation exercises (sit to stand (STS), knee flexion (KFL), knee extension (KEX) and weight shifting (WSH)) whilst wearing lower limb Xsens inertial sensors sampling at 60 Hz. A total of 85 features were extracted per exercise repetition from tri-axial accelerometer data provided by sensors placed on the foot, shank, thigh and pelvis. For each sensor, these were the 90th percentile spectral edge frequency (SEF) and the signal mean, maximum, minimum, variance, skewness and kurtosis, each computed for the X, Y and Z axes (21 features per sensor, 84 across the 4 sensors), plus the repetition length. Participants were split into training and testing datasets using Leave-One-Group-Out cross-validation (cv), where each participant's set of repetitions represented one group.
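The per-repetition feature set described above could be computed along the following lines. This is a minimal sketch rather than the study's implementation: the function names, the dictionary layout, the removal of the signal mean before the spectrum is taken, the use of population variance and excess kurtosis, and the expression of repetition length in seconds are all assumptions made for illustration.

```python
import numpy as np

SENSORS = ("foot", "shank", "thigh", "pelvis")
AXES = ("x", "y", "z")


def spectral_edge_frequency(signal, fs=60.0, edge=0.90):
    """Frequency (Hz) below which `edge` of the total spectral power lies."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # assumption: drop the DC offset (e.g. gravity) first
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    cumulative = np.cumsum(power)
    idx = np.searchsorted(cumulative, edge * cumulative[-1])
    return float(freqs[idx])


def axis_features(signal, fs=60.0):
    """The seven per-axis features named in the abstract."""
    x = np.asarray(signal, dtype=float)
    centred = x - x.mean()
    var = float(np.mean(centred ** 2))  # population variance (assumption)
    std = var ** 0.5
    skew = float(np.mean(centred ** 3)) / std ** 3 if std > 0 else 0.0
    # Excess kurtosis (normal distribution = 0); the convention is an assumption.
    kurt = float(np.mean(centred ** 4)) / std ** 4 - 3.0 if std > 0 else 0.0
    return {
        "sef90": spectral_edge_frequency(x, fs),
        "mean": float(x.mean()),
        "max": float(x.max()),
        "min": float(x.min()),
        "variance": var,
        "skewness": skew,
        "kurtosis": kurt,
    }


def repetition_features(acc_by_sensor, fs=60.0):
    """Build the 85-feature vector for one repetition.

    `acc_by_sensor` maps each sensor name to an array of shape (n_samples, 3)
    holding X, Y, Z acceleration for that repetition (hypothetical layout).
    """
    features = {}
    n_samples = None
    for sensor in SENSORS:
        acc = np.asarray(acc_by_sensor[sensor], dtype=float)
        n_samples = acc.shape[0]
        for col, axis in enumerate(AXES):
            for name, value in axis_features(acc[:, col], fs).items():
                features[f"{sensor}_{axis}_{name}"] = value
    # Repetition length in seconds (the unit is an assumption).
    features["repetition_length_s"] = n_samples / fs
    return features
```

With four sensors, three axes and seven features per axis, the dictionary holds 84 entries plus the repetition length, matching the 85 features reported.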
Within each of the 8 cv-folds, features were scaled, a univariate feature selection method was applied to reduce the feature set to the top-ranking features, and a linear support vector machine (SVM) classifier was trained to determine how well the 4 knee rehabilitation exercises could be distinguished from each other. This process was repeated using 3 different score functions (Mutual Information Classification (MIC), F-value Classification (F-Class) and Chi2), with the number of selected features ranging from 1 to 10, to determine the score function and the number of features that optimised classification performance. The F1 score (harmonic mean of precision and recall, with a best score of 1) was computed within each cv-fold as the average F1 score across the exercise classes and then summarised across cv-folds as a median and interquartile range (1st to 3rd quartile). Results: Feature selection using the MIC and F-Class score functions consistently outperformed selection using Chi2 (Figure 1). MIC was chosen as the optimal score function because it marginally outperformed F-Class when fewer features were used; the optimal number of features (i.e. the fewest that retained classifier performance) was 3. All 8 cv-folds using this method selected the Z-axis acceleration 90th percentile SEF of the foot and shank sensors, whilst the third feature varied between cv-folds: mean Z-axis acceleration of the foot sensor (n=3/8 cv-folds), mean Y-axis acceleration of the thigh sensor (n=2/8 cv-folds), Z-axis acceleration variance of the foot (n=1/8 cv-folds) or shank (n=1/8 cv-folds) sensor, and mean Z-axis acceleration of the shank sensor (n=1/8 cv-folds). Inspection of the individual F1 scores for each exercise using the optimal feature selection method revealed a reduced ability to classify KEX and KFL compared with STS and WSH (Figure 2).
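The evaluation procedure described above (scaling, univariate selection and a linear SVM inside each Leave-One-Group-Out fold, summarised as a median and interquartile range of F1) can be sketched with scikit-learn as follows. The abstract does not name the software used, so the library and several choices here are assumptions: `MinMaxScaler` is used because the Chi2 score function requires non-negative inputs, `SVC(kernel="linear")` stands in for the linear SVM, the per-fold F1 is taken as the unweighted (macro) average across classes, and `make_synthetic` generates placeholder data that is not the study's dataset.

```python
from functools import partial

import numpy as np
from sklearn.feature_selection import SelectKBest, chi2, f_classif, mutual_info_classif
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# The three score functions compared in the abstract, as scikit-learn callables.
SCORE_FUNCS = {
    "MIC": partial(mutual_info_classif, random_state=0),  # fixed seed for repeatability
    "F-Class": f_classif,
    "Chi2": chi2,
}


def evaluate(X, y, groups, score_func, k):
    """Median and (Q1, Q3) of per-fold macro F1 for one score function and k."""
    fold_scores = []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
        # Scaler and selector are fitted on the training participants only.
        model = make_pipeline(
            MinMaxScaler(),
            SelectKBest(score_func=score_func, k=k),
            SVC(kernel="linear"),
        )
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        fold_scores.append(f1_score(y[test_idx], pred, average="macro"))
    q1, median, q3 = np.percentile(fold_scores, [25, 50, 75])
    return float(median), (float(q1), float(q3))


def make_synthetic(seed=0, n_participants=8, n_classes=4, n_reps=15, n_features=85):
    """Placeholder data with the study's shape; NOT the study's measurements."""
    rng = np.random.default_rng(seed)
    y = np.tile(np.repeat(np.arange(n_classes), n_reps), n_participants)
    groups = np.repeat(np.arange(n_participants), n_classes * n_reps)
    X = rng.normal(size=(y.size, n_features))
    X[:, :3] += 3.0 * y[:, None]  # make the first three features informative
    return X, y, groups
```

Looping `evaluate` over the entries of `SCORE_FUNCS` and `k` from 1 to 10 would reproduce the shape of the comparison reported in the Results. Keeping the scaler and selector inside the pipeline ensures the held-out participant never influences which features are selected.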
Conclusions: Changing the parameter settings of a univariate feature selection algorithm that identifies top-ranking features altered the performance of a linear SVM classifying 4 knee rehabilitation exercises. This finding confirms that inappropriate feature selection parameter settings may affect classification performance. Of the parameters and feature selection methods considered, the combination of the MIC score function and selection of the top 3 features appears optimal for this dataset. Whilst the 3 selected features classified STS and WSH exercises consistently well, this was not the case for KEX and KFL, whose classification varied considerably between cv-folds and warrants further investigation. Future work will consider additional features that may improve discrimination between KEX and KFL exercises. Additional feature selection methods, such as recursive feature elimination and the removal of low-variance features, will also be investigated to see whether SVM classification performance can be further enhanced.