Summary

Background
Juvenile-onset systemic lupus erythematosus (SLE) is a rare autoimmune rheumatic disease characterised by more severe disease manifestations, earlier damage accrual, and higher mortality than in adult-onset SLE. We aimed to use machine-learning approaches to characterise the immune cell profile of patients with juvenile-onset SLE and investigate links with the disease trajectory over time.

Methods
This study included patients who attended the University College London Hospital (London, UK) adolescent rheumatology service, had juvenile-onset SLE according to the 1997 American College of Rheumatology revised classification criteria for lupus or the 2012 Systemic Lupus International Collaborating Clinics criteria, and were diagnosed before 18 years of age. Blood donated by healthy age-matched and sex-matched volunteers who were taking part in educational events in the Centre for Adolescent Rheumatology Versus Arthritis at University College London (London, UK) was used as a control. Immunophenotyping profiles (28 immune cell subsets) of peripheral blood mononuclear cells from patients with juvenile-onset SLE and healthy controls were determined by flow cytometry. We used balanced random forest (BRF) and sparse partial least squares-discriminant analysis (sPLS-DA) to assess classification and parameter selection, and validated the models by ten-fold cross-validation. We used logistic regression to test the association between immune phenotypes, and used k-means clustering to determine patient stratification. Retrospective longitudinal clinical data, including disease activity and medication, were related to the immunological features identified.

Findings
Between Sept 5, 2012, and March 7, 2018, peripheral blood was collected from 67 patients with juvenile-onset SLE and 39 healthy controls. The median age was 19 years (IQR 13–25) for patients with juvenile-onset SLE and 18 years (16–25) for healthy controls.
The BRF model discriminated patients with juvenile-onset SLE from healthy controls with 90·9% prediction accuracy. The top-ranked immunological features from the BRF model were confirmed using sPLS-DA and logistic regression, and included total CD4, total CD8, CD8 effector memory, and CD8 naive T cells; Bm1 and unswitched memory B cells; total CD14 monocytes; and invariant natural killer T cells. Using these markers, patients were clustered into four distinct groups. Notably, CD8 T-cell subsets were important in driving patient stratification, whereas B-cell markers were similarly expressed across the cohort of patients with juvenile-onset SLE. Patients with juvenile-onset SLE and elevated CD8 effector memory T-cell frequencies had more persistently active disease over time, as assessed by the SLE disease activity index 2000, and this was associated with increased treatment with mycophenolate mofetil and an increased prevalence of lupus nephritis. Finally, network analysis confirmed the strong association between immune phenotype and differential clinical features.

Interpretation
Machine-learning models can define potential disease-associated and patient-specific immune characteristics in rare disease patient populations. Immunological association studies are warranted to develop data-driven personalised medicine approaches for treatment of patients with juvenile-onset SLE.

Funding
Lupus UK, The Rosetrees Trust, Versus Arthritis, and UK National Institute for Health Research University College London Hospital Biomedical Research Centre.
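
Illustrative sketch
The Methods combine a balanced random forest with ten-fold cross-validation on an imbalanced cohort (67 patients, 39 controls). The following is a minimal sketch of the two sampling ideas behind that design, not the authors' implementation: stratified splitting into ten folds, and the class-balanced bootstrap that a balanced random forest typically draws for each tree. Only the cohort sizes come from the text. The function names `stratified_folds` and `balanced_bootstrap`, the seeds, and the label strings are hypothetical, and no immunophenotyping data or decision trees are involved.

```python
import random
from collections import Counter

# Cohort sizes are taken from the Findings; everything else is illustrative.
N_PATIENTS, N_CONTROLS = 67, 39
labels = ["SLE"] * N_PATIENTS + ["HC"] * N_CONTROLS


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)
        # Deal indices round-robin, continuing where the previous class
        # stopped, so that overall fold sizes differ by at most one.
        for j, idx in enumerate(idxs):
            folds[(offset + j) % k].append(idx)
        offset += len(idxs)
    return folds


def balanced_bootstrap(train_idx, labels, rng):
    """Draw one bootstrap sample with equal numbers from each class.

    Each class is sampled with replacement down to the size of the
    smallest class in the training indices.
    """
    by_class = {}
    for idx in train_idx:
        by_class.setdefault(labels[idx], []).append(idx)
    n_min = min(len(members) for members in by_class.values())
    sample = []
    for lab in sorted(by_class):
        sample.extend(rng.choices(by_class[lab], k=n_min))
    return sample


folds = stratified_folds(labels, k=10, seed=0)
rng = random.Random(1)
for i, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [j for j in range(len(labels)) if j not in held_out]
    boot = balanced_bootstrap(train_idx, labels, rng)
    counts = Counter(labels[j] for j in boot)
    print(f"fold {i}: test={len(test_idx)} train={len(train_idx)} "
          f"bootstrap={dict(counts)}")
```

In practice a library implementation (for example the `BalancedRandomForestClassifier` in imbalanced-learn) would normally replace this hand-rolled sampling. The abstract does not state which software was used.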