This study aimed to develop a non-invasive diagnostic model, based on machine learning (ML), for identifying patients at high risk of IgG4 Hashimoto's thyroiditis (HT). A retrospective cohort of 93 HT patients and a prospective cohort of 179 HT patients were enrolled. Based on immunohistochemical and pathological results, patients were divided into an IgG4 HT group and a non-IgG4 HT group. Serum TgAb IgG4 and TPOAb IgG4 were measured by ELISA. Logistic regression, support vector machine (SVM), and random forest (RF) models were used to build a clinical diagnostic model for IgG4 HT. Of the 272 patients, 40 (14.7%) were diagnosed with IgG4 HT. Patients with IgG4 HT were younger than those with non-IgG4 HT (P < 0.05). Serum levels of TgAb IgG4 and TPOAb IgG4 were significantly higher in the IgG4 HT group than in the non-IgG4 HT group (P < 0.05). There were no significant differences between the two groups in gender, disease duration, goiter, preoperative thyroid function status, preoperative TgAb or TPOAb levels, or thyroid ultrasound characteristics (all P > 0.05). Accuracy, sensitivity, and specificity were 57%, 78%, and 79% for the logistic regression model; 80% ± 7%, 84.7% ± 2.6%, and 75.4% ± 9.6% for the RF model; and 78% ± 5%, 89.8% ± 5.7%, and 64.7% ± 5.7% for the SVM model. The RF model outperformed the SVM model in accuracy and specificity, although the SVM model had the higher sensitivity. The area under the ROC curve of the RF model ranged from 0.87 to 0.92. A clinical diagnostic model for IgG4 HT built with RF might help in the early recognition of patients at high risk of IgG4 HT.
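For readers less familiar with the three metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, and specificity are derived from a binary confusion matrix. The class name `ConfusionCounts`, the function names, and the counts in `example` are hypothetical and chosen purely for illustration; they are not data from this study and do not reproduce the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion matrix; 'positive' means IgG4 HT in this illustration."""
    tp: int  # positives correctly predicted as positive
    fn: int  # positives missed by the model
    tn: int  # negatives correctly predicted as negative
    fp: int  # negatives wrongly predicted as positive


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true positives among all actual positives."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of true negatives among all actual negatives."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts for illustration only (NOT taken from the study).
example = ConfusionCounts(tp=8, fn=2, tn=45, fp=15)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.1%}")
    print(f"specificity = {specificity(example):.1%}")
    print(f"accuracy    = {accuracy(example):.1%}")
```

On a single evaluation set, accuracy is the prevalence-weighted average of sensitivity and specificity, so in an imbalanced cohort it is dominated by performance on the majority class.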