Abstract

Background: Both depolarization and repolarization abnormalities contribute to ventricular arrhythmogenesis in Brugada syndrome (BrS). In this study, we tested the hypothesis that an interpretable machine learning (IML) prediction model incorporating latent features extracted by various nonnegative matrix factorization (NMF) techniques outperforms both IML models without latent variables and a logistic regression model.

Methods: This study was based on a published anonymised dataset of BrS patients from Hong Kong, China. XGBoost was selected as the IML model, with and without incorporating latent features extracted using 11 different NMF techniques: Bayesian nonnegative matrix factorization (BNMF), Iterated Conditional Modes nonnegative matrix factorization (ICM), Fisher Nonnegative Matrix Factorization for learning Local features (LFNMF), Alternating Nonnegative Least Squares Matrix Factorization Using Projected Gradient (bound constrained optimization) method for each subproblem (LSNMF), Non-smooth Nonnegative Matrix Factorization (NSNMF), Probabilistic Nonnegative Matrix Factorization (PMF), Probabilistic Sparse Matrix Factorization (PSMF), Sparse Nonnegative Matrix Factorization (SNMF) based on alternating nonnegativity constrained least squares, Sparse Network-Regularized Multiple Nonnegative Matrix Factorization (SNMNMF), Penalized Matrix Factorization for Constrained Clustering (PMFCC) and Separable Nonnegative Matrix Factorization (SepNMF).

Results: A total of 548 patients were included (7.3% females, age at diagnosis: 51.0 [38.0-61.0] years). Of these, 66 suffered from spontaneous ventricular tachyarrhythmias over a follow-up of 84±55 months. The baseline model using multivariable logistic regression achieved an area under the curve (AUC) of 0.78 [0.72-0.85], which improved to 0.88 [0.83-0.93] for the IML model without NMF. The AUC increased further when additional latent variables extracted using NSNMF (0.95 [0.92-0.98]) or PMFCC (0.94 [0.88-1.00]) were incorporated.

Conclusion: Incorporation of latent variables extracted by different NMF techniques into an IML prediction model significantly improved the accuracy of risk prediction in BrS.
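
The idea of augmenting a feature matrix with NMF-derived latent variables can be illustrated with a minimal sketch. It is not the study's implementation: it uses plain multiplicative-update NMF (Lee and Seung's rule for the Frobenius loss) rather than any of the 11 variants listed above, it omits the XGBoost classifier entirely, and the function names and toy matrix are hypothetical. It only shows how a nonnegative patient-by-feature matrix V is factorized as V ≈ WH and how the rows of W are appended to the original features.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorize nonnegative v (n x m) as w (n x rank) times h (rank x m)
    using multiplicative updates for the Frobenius loss."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    """Frobenius norm of v minus the reconstruction w times h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2
               for row_v, row_a in zip(v, approx)
               for a, b in zip(row_v, row_a)) ** 0.5

def augment_with_latent(v, rank):
    """Append each sample's latent coordinates (its row of W) to its features."""
    w, _ = nmf(v, rank)
    return [row + latent for row, latent in zip(v, w)]

# Hypothetical toy data: 4 samples x 3 nonnegative features (not study data).
toy = [[1.0, 0.5, 0.0],
       [2.0, 1.0, 0.0],
       [0.0, 0.2, 3.0],
       [0.0, 0.1, 1.5]]
augmented = augment_with_latent(toy, rank=2)  # 4 samples x (3 + 2) columns
```

In a pipeline like the one described, the augmented matrix would then be passed to the classifier in place of the original features. The factorization would also need to be fitted on training data only, so that the latent variables do not leak information from the evaluation set.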