Abstract

Purpose: Knee osteoarthritis (KOA) is a heterogeneous condition characterized by changes in a variety of joint tissues and driven by a number of different potential mechanisms. We explored machine learning approaches to phenotyping in KOA in order to better define the progression phenotype(s) that may be more responsive to interventions. This analysis focuses on clusters identified on the basis of genetic data and their association with progression (as defined below).

Methods: We used data from the Johnston County OA Project at T1 (baseline predictors) and at a mean 6-year follow-up (defining outcomes), including 7 genotyped SNPs associated with OA in a prior review: rs10947262, rs10948172, rs11177, rs11842874, rs4730250, rs6976, and rs7775228 (Table). We compared individuals who progressed radiographically (any increase in Kellgren-Lawrence grade [KLG], “rKOA”) to those who progressed by both radiographs and pain (increase of at least 9 points on WOMAC pain, “rKOA+pain”). From the full dataset of 741 participants, 291 had rKOA progression and 86 had rKOA+pain progression. K-means and hierarchical clustering methods were applied, and SigClust was used to determine the statistical significance of the clustering. DiProPerm, a projection and permutation-based hypothesis test, was used to compare groups (i.e., rKOA vs. rKOA+pain) by testing equality of distributions jointly over all variables (including demographic, clinical, and functional domains). A DiProPerm z-score of at least 2 (p<0.05) was considered statistically significant.

Results: When considering all observations and all variables simultaneously, a significant difference was noted for the comparison of rKOA to rKOA+pain progressors, with a z-score of 2.7. However, when the observations were divided into 2 clusters based on genetic data, with the 2-cluster solution supported by SigClust (p=0.004 by k-means, p=0.007 by hierarchical clustering), the DiProPerm z-score of 1.8 for Cluster 1 was non-significant, while the z-score of 3.1 for Cluster 2 was significant, suggesting a difference in risk for this type of progression based on genetic background. As shown in the Table, in Cluster 1, rs11177 and rs10948172 were contributors to rKOA+pain progression, while rs10947262 was associated with rKOA-only progression. In contrast, in Cluster 2, rs6976, rs10947262, rs7775228, and rs11842874 contributed to rKOA+pain progression, and only rs4730250 was associated with rKOA progression. The SNP rs10947262 contributed in opposite directions in the two clusters, while rs11177 was associated only in Cluster 1, and rs6976 (notably in perfect linkage disequilibrium [r²=1] with rs11177) was associated only in Cluster 2. Additionally, the pattern of loadings (i.e., the relative contribution to either rKOA or rKOA+pain progression) of other, non-genetic variables differed between the 2 clusters. In Cluster 2, similar to the overall dataset, the main variables driving the difference were height, race, and marital status (contributing to rKOA), while function (HAQ, semi-tandem stand), baseline KLG, and sex contributed more to rKOA+pain progression. In contrast, for Cluster 1, the variables contributing most to rKOA were chair stand time, age, WOMAC pain, education, and high cholesterol; those contributing most to rKOA+pain progression were hip circumference, CESD score, marital status, gender, and presence of baseline heart disease.

Conclusions: Application of novel machine learning-based methodologies resulted in identification of a statistically significant subgroup of progressors characterized by differing genetic background and risk factor profiles. Improved characterization of individuals into such phenotypes may provide new insights into mechanisms of disease and potential interventions in OA.
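
To make the DiProPerm step concrete, the following is a minimal sketch of a direction-projection-permutation test. The abstract does not state which separating direction or summary statistic was used, so this sketch assumes the simplest choice (the mean-difference direction, with the difference of projected means as the statistic); the function names `mean_vector`, `projection_statistic`, and `diproperm_z` are hypothetical, and the data are synthetic rather than from the Johnston County OA Project.

```python
import random
import statistics


def mean_vector(rows):
    """Column-wise mean of a list of equal-length numeric rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def projection_statistic(group_a, group_b):
    """Project both groups onto the unit mean-difference direction
    (an assumed choice) and return the difference of projected means."""
    mean_a, mean_b = mean_vector(group_a), mean_vector(group_b)
    direction = [a - b for a, b in zip(mean_a, mean_b)]
    norm = sum(d * d for d in direction) ** 0.5
    if norm == 0.0:
        return 0.0  # identical group means: no separating direction
    unit = [d / norm for d in direction]
    proj_a = [sum(x * u for x, u in zip(row, unit)) for row in group_a]
    proj_b = [sum(x * u for x, u in zip(row, unit)) for row in group_b]
    return statistics.fmean(proj_a) - statistics.fmean(proj_b)


def diproperm_z(group_a, group_b, n_perm=500, seed=0):
    """Return (z-score, empirical p-value) for the observed statistic
    against a null built by permuting group labels."""
    rng = random.Random(seed)
    observed = projection_statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of all observations
        # The direction is re-estimated for every permutation.
        null_stats.append(projection_statistic(pooled[:n_a], pooled[n_a:]))
    z = (observed - statistics.fmean(null_stats)) / statistics.stdev(null_stats)
    p = sum(s >= observed for s in null_stats) / n_perm
    return z, p


if __name__ == "__main__":
    # Synthetic example: two groups of 40 observations on 5 variables,
    # with the second group shifted by 3 units on every variable.
    data_rng = random.Random(1)
    group_a = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(40)]
    group_b = [[data_rng.gauss(3, 1) for _ in range(5)] for _ in range(40)]
    z, p = diproperm_z(group_a, group_b, n_perm=200)
    print(f"z-score: {z:.2f}, empirical p-value: {p:.3f}")
```

Under the threshold given in the Methods, a z-score of at least 2 from such a test would be read as a significant difference between groups. Re-estimating the direction inside each permutation matters: it lets the null distribution reflect how much separation a direction-finding step can produce by chance alone.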
