Childhood Sjögren's disease is a rare, underdiagnosed, and poorly understood condition. By integrating machine learning models on a paediatric cohort in the USA, we aimed to develop a novel system (the Florida Scoring System) for stratifying symptomatic paediatric patients with suspected Sjögren's disease.

This cross-sectional study was done in symptomatic patients who visited the Department of Pediatric Rheumatology at the University of Florida, FL, USA. Eligible patients were younger than 18 years or had symptom onset before 18 years of age. Patients with a confirmed diagnosis of another autoimmune condition or an infection with a clear aetiological microorganism were excluded. Eligible patients underwent comprehensive examinations to rule out or diagnose childhood Sjögren's disease. We used latent class analysis with clinical and laboratory variables to detect heterogeneous patient classes. Machine learning models, including random forest, gradient-boosted decision tree, partial least squares discriminant analysis, least absolute shrinkage and selection operator-penalised ordinal regression, artificial neural network, and super learner, were used to predict patient classes and rank the importance of variables. Causal graph learning selected key features to build the final Florida Scoring System. The predictors for all models were the clinical and laboratory variables, and the outcome was the patient class assigned by the latent class analysis.

Between Jan 16, 2018, and April 28, 2022, we screened 448 patients for inclusion. After excluding 205 patients whose symptom onset was later than 18 years of age, we recruited 243 patients into our cohort. 26 patients were excluded because of a confirmed diagnosis of a disorder other than Sjögren's disease, and 217 patients were included in the final analysis. Median age at diagnosis was 15 years (IQR 11–17). 155 (72%) of 216 patients were female and 61 (28%) were male; 167 (79%) of 212 were White; and 20 (9%) of 213 were Hispanic, Latino, or Spanish.
The latent class analysis identified three distinct patient classes: class I (dryness dominant with positive tests, n=27), class II (high symptoms with negative tests, n=98), and class III (low symptoms with negative tests, n=92). Machine learning models accurately predicted patient class and ranked variable importance consistently. The causal graphical model identified key features for constructing the Florida Scoring System.

The Florida Scoring System is a paediatrician-friendly tool that can be used to assist classification and long-term monitoring of suspected childhood Sjögren's disease. The resulting stratification has important implications for clinical management, trial design, and pathobiological research. We found a highly symptomatic patient group with negative serology and diagnostic profiles, which warrants clinical attention. We also found that salivary gland ultrasonography can be a non-invasive alternative to minor salivary gland biopsy in children. The Florida Scoring System requires validation in larger prospective paediatric cohorts.

This study was funded by the National Institute of Dental and Craniofacial Research, the National Institute of Arthritis and Musculoskeletal and Skin Diseases, the National Heart, Lung, and Blood Institute, and the Sjögren's Foundation.
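The cohort flow and class sizes reported above can be cross-checked with simple arithmetic. The sketch below is illustrative only: all counts are copied from the abstract, the variable and function names are hypothetical, and the per-class shares are derived here rather than reported by the authors.

```python
# Counts copied from the abstract; names are illustrative, not the authors'.
screened = 448
excluded_late_onset = 205        # symptom onset later than 18 years of age
excluded_other_diagnosis = 26    # confirmed disorder other than Sjögren's disease

recruited = screened - excluded_late_onset
analysed = recruited - excluded_other_diagnosis

# Latent class sizes as reported (class I, II, III).
class_sizes = {"I": 27, "II": 98, "III": 92}


def percent(numerator, denominator):
    """Return a percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# The three classes should account for every analysed patient.
assert recruited == 243
assert analysed == 217
assert sum(class_sizes.values()) == analysed

# Share of the analysed cohort in each class (derived, not reported).
class_shares = {name: percent(n, analysed) for name, n in class_sizes.items()}
for name, n in class_sizes.items():
    print(f"Class {name}: n={n} ({class_shares[name]}% of {analysed})")

# Demographic percentages use each variable's own non-missing denominator,
# which is why the denominators (216, 212, 213) differ from 217.
female_pct = percent(155, 216)
male_pct = percent(61, 216)
white_pct = percent(167, 212)
hispanic_pct = percent(20, 213)
```

Because each share is rounded independently, the three class percentages need not sum to exactly 100.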