Model-based Clustering Research Articles

Aims: Application of latent class analysis to acute heart failure with preserved ejection fraction (HFpEF) showed that heterogeneous acute HFpEF patients can be classified into four distinct phenotypes with different clinical outcomes. This model‐based clustering required a total of 32 variables. Such a large number of variables impairs the clinical application of the classification algorithm. This study aimed to identify the minimal number of variables needed for an optimal subphenotyping model.

Methods and results: This study is a post hoc analysis of the PURSUIT‐HFpEF study (N = 1095), a prospective, multi‐referral centre, observational study of acute HFpEF [UMIN000021831]. We previously applied latent class analysis to the PURSUIT‐HFpEF dataset and established the full 32‐variable model for subphenotyping. In this study, we used Cohen's kappa statistic to investigate the minimal number of discriminatory variables needed to accurately classify the phenogroups in comparison with the full 32‐variable model. Comparing models built from the top‐X discriminatory variables against the full 32‐variable derivation model, models with ≥16 discriminatory variables had kappa values >0.8, suggesting that the minimal number of discriminatory variables for the optimal phenotyping model was 16. The 16‐variable model consists of C‐reactive protein, creatinine, gamma‐glutamyl transferase, brain natriuretic peptide, white blood cells, systolic blood pressure, fasting blood sugar, triglyceride, clinical scenario classification, infection‐triggered acute decompensated HF, estimated glomerular filtration rate, platelets, neutrophils, GWTG‐HF (Get With The Guidelines‐Heart Failure) risk score, chronic kidney disease, and CONUT (Controlling Nutritional Status) score. Characteristics and clinical outcomes of the four phenotypes subclassified by the minimal 16‐variable model were consistent with those by the full 32‐variable model. The four phenotypes were labelled based on their characteristics as ‘rhythm trouble’, ‘ventricular‐arterial uncoupling’, ‘low output and systemic congestion’, and ‘systemic failure’, respectively.

Conclusions: The phenotyping model with the top 16 variables showed almost perfect agreement with the full 32‐variable model. The minimal model may enhance the future clinical application of this clustering algorithm.
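
The agreement check described in this abstract can be illustrated with a short sketch of Cohen's kappa. This is not the study's code: the function name and the toy phenotype assignments are hypothetical, and it assumes the cluster labels of the reduced model have already been matched to those of the full model.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two label assignments for the same subjects."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of subjects given the same label by both models.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both assignments use one identical label")
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: phenotype (1-4) per patient under each model.
full_model = [1, 1, 2, 2, 3, 3, 4, 4]
reduced_model = [1, 1, 2, 2, 3, 3, 4, 1]  # one patient reassigned

kappa = cohens_kappa(full_model, reduced_model)
print(round(kappa, 3))
```

In this toy case seven of eight patients keep their phenotype, and the resulting kappa falls just above the 0.8 threshold the abstract uses.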

Background: Most phenotyping paradigms in sarcoidosis are based on expert opinion; however, no paradigm has been widely adopted because of the subjectivity in classification. We hypothesized that cluster analysis could be performed on common clinical variables to define more objective sarcoidosis phenotypes.

Methods: We performed a retrospective cohort study of 554 sarcoidosis cases to identify distinct phenotypes of sarcoidosis based on 29 clinical features. Model-based clustering was performed using the VarSelLCM R package, and the Integrated Completed Likelihood (ICL) criterion was used to estimate the number of clusters. To identify features associated with cluster membership, features were ranked based on variable importance scores from the VarSelLCM model, and additional univariate tests (Fisher’s exact test and one-way ANOVA) were performed using q-values to correct for multiple testing. The Wasfi severity score was also compared between clusters.

Results: Cluster analysis resulted in 6 sarcoidosis phenotypes. Salient characteristics for each cluster are as follows: phenotype (1) supranormal lung function and majority Scadding stage 2/3; phenotype (2) supranormal lung function and majority Scadding stage 0/1; phenotype (3) normal lung function and Scadding stages split between 0/1 and 2/3; phenotype (4) obstructive lung function and majority Scadding stage 2/3; phenotype (5) restrictive lung function and majority Scadding stage 2/3; phenotype (6) mixed obstructive and restrictive lung function and mostly Scadding stage 4. Although the percentages differed, every Scadding stage was represented in every phenotype, except for phenotype 1, which had no Scadding stage 4 cases. Clusters 4, 5, and 6 were significantly more likely to have ever been on immunosuppressive treatment and had higher Wasfi disease severity scores.

Conclusions: Cluster analysis produced 6 sarcoidosis phenotypes that separated into less severe and more severe groups. Phenotypes 1, 2, and 3 had fewer lung function abnormalities, a lower percentage on immunosuppressive treatment, and lower Wasfi severity scores. Phenotypes 4, 5, and 6 were characterized by lung function abnormalities, more parenchymal abnormalities, an increased percentage on immunosuppressive treatment, and higher Wasfi severity scores. These data support using cluster analysis as an objective and clinically useful way to phenotype sarcoidosis subjects and to empower clinicians to distinguish those with more severe disease from those with less severe disease, independent of Scadding stage.
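
For readers unfamiliar with ICL-based selection of the number of clusters, the sketch below shows the common BIC-style approximation: BIC plus twice the classification entropy of the posterior membership probabilities, in the lower-is-better convention. It is an assumption that this approximation is a useful illustration here; it is not necessarily the computation VarSelLCM performs, some software reports the negated value (higher is better), and all function names and numbers are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion, lower-is-better convention."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def classification_entropy(posteriors):
    """Entropy of the soft cluster assignments; zero when every row is 0/1."""
    return -sum(p * math.log(p) for row in posteriors for p in row if p > 0.0)


def icl_bic(log_likelihood, n_params, posteriors):
    """BIC-style ICL approximation: BIC penalised by assignment uncertainty."""
    n_obs = len(posteriors)
    return bic(log_likelihood, n_params, n_obs) + 2.0 * classification_entropy(posteriors)


# Hypothetical fits for three subjects: (log-likelihood, parameter count, posteriors).
candidates = {
    2: (-12.0, 5, [[0.9, 0.1], [0.2, 0.8], [0.95, 0.05]]),
    3: (-11.5, 8, [[0.5, 0.4, 0.1], [0.3, 0.3, 0.4], [0.6, 0.2, 0.2]]),
}
scores = {k: icl_bic(ll, p, post) for k, (ll, p, post) in candidates.items()}
best_k = min(scores, key=scores.get)  # number of clusters with the lowest score
print(best_k)
```

The entropy term is what separates ICL from BIC: a model whose clusters overlap heavily is penalised even if its likelihood is slightly higher.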

Articles published on Model-based Clustering

Bridging the Gap Between Qualitative and Quantitative Assessment in Science Education Research with Machine Learning — A Case for Pretrained Language Models-Based Clustering

Clustering trends of melanoma incidence and mortality: A worldwide assessment from 1995 to 2019.

Temperament and character differences in psychopathic and non-psychopathic antisocial adolescents

Rapid Detection and Quantification of Adulterants in Fruit Juices Using Machine Learning Tools and Spectroscopy Data.

Clustering compositional data using Dirichlet mixture model.

A model-based clustering of expectation–maximization and K-means algorithms in crime hotspot analysis

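KLASTERISASI PROVINSI DI INDONESIA BERDASARKAN FAKTOR PENYEBARAN COVID-19 MENGGUNAKAN MODEL-BASED CLUSTERING t-MULTIVARIAT (Clustering of Provinces in Indonesia Based on COVID-19 Spread Factors Using Multivariate t Model-Based Clustering)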

Discovering Thematically Coherent Biomedical Documents Using Contextualized Bidirectional Encoder Representations from Transformers-Based Clustering.

Spinal Cord Stimulation–Naïve Patients vs Patients With Failed Previous Experiences With Standard Spinal Cord Stimulation: Two Distinct Entities or One Population?

Identification and characterization of chickpea genotypes for early flowering and higher seed germination through molecular markers.

Multiple change point clustering of count processes with application to California COVID data

Correction of multiple-blinking artifacts in photoactivated localization microscopy.

Anderson relaxation test for intrinsic dimension selection in model-based clustering

Comprehensive analysis of the associations between clinical factors and outcomes by machine learning, using post marketing surveillance data of cabazitaxel in patients with castration-resistant prostate cancer

Minimal subphenotyping model for acute heart failure with preserved ejection fraction.

Genome-wide survey on three local horse populations with a focus on runs of homozygosity pattern.

Understanding tourist behaviour towards destination selection based on social media information: an evaluation using unsupervised clustering algorithms

Model-based clustering of high-dimensional longitudinal data via regularization.

Global Patterns of Contemporary Welfare States

Clinical phenotyping in sarcoidosis using cluster analysis
