Abstract
Diffuse large B-cell lymphoma (DLBCL) is a clinically and molecularly heterogeneous disease. The increasing recognition and targeting of genetically defined DLBCLs highlights the need for robust classification algorithms. We previously characterized recurrent genetic alterations in DLBCL and identified five discrete subtypes, Clusters 1-5 (C1-C5), with unique mechanisms of transformation, immune evasion, candidate treatment targets and different outcomes following standard first-line therapy. Herein, we validate the C1-C5 DLBCL taxonomy in an independent dataset and use the expanded series of 699 primary DLBCLs to develop a probabilistic molecular classifier and confirm its performance in an independent test set. Using our previously assigned cluster labels as a reference, we systematically compared multiple machine learning models and strategies for input feature dimensionality reduction with a newly developed performance metric that captured the relationship between accuracy and confidence of class assignments. The winning neural network model, DLBclass, assigned all cases in the training/validation and independent test sets with 91% and 89% accuracies, respectively. In the 75% of cases with confidence >0.7, DLBclass assignments were accurate in 97% of the training/validation set and 98% of the test set. DLBclass enables robust prospective classification of single cases for inclusion in genetically guided clinical trials or practice and represents a framework for the development of genomic-based classification methods in other cancers.
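The relationship between accuracy and confidence reported above (accuracy over all cases versus accuracy among the cases with confidence >0.7) can be illustrated with a minimal sketch. It is not the DLBclass implementation or the performance metric developed in the paper. It assumes that each case receives a probability for each of C1-C5 and that "confidence" is the largest of those probabilities. The function name, the dictionary keys, and the toy data are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

CLUSTERS = ["C1", "C2", "C3", "C4", "C5"]


def summarize_assignments(
    probs: Sequence[Sequence[float]],
    labels: Sequence[str],
    threshold: float = 0.7,
) -> Dict[str, Optional[float]]:
    """Overall accuracy, fraction of confident cases, and accuracy among them.

    Assumption (not from the paper): confidence is the maximum class probability.
    """
    correct: List[bool] = []
    confidences: List[float] = []
    for row, label in zip(probs, labels):
        # Assign the cluster with the highest probability.
        best = max(range(len(row)), key=lambda i: row[i])
        correct.append(CLUSTERS[best] == label)
        confidences.append(row[best])

    n = len(correct)
    # Keep only the cases whose confidence exceeds the threshold.
    confident = [ok for ok, conf in zip(correct, confidences) if conf > threshold]
    return {
        "overall_accuracy": sum(correct) / n,
        "confident_fraction": len(confident) / n,
        "confident_accuracy": sum(confident) / len(confident) if confident else None,
    }


# Toy data for illustration only; these are not values from the study.
toy_probs = [
    [0.90, 0.025, 0.025, 0.025, 0.025],  # confident, assigned C1
    [0.10, 0.80, 0.05, 0.03, 0.02],      # confident, assigned C2
    [0.40, 0.35, 0.10, 0.10, 0.05],      # low confidence, assigned C1
    [0.05, 0.05, 0.75, 0.10, 0.05],      # confident, assigned C3
]
toy_labels = ["C1", "C2", "C2", "C3"]

result = summarize_assignments(toy_probs, toy_labels)
print(result)
```

In this toy example the single misassigned case is also the only low-confidence one, so restricting to confident cases raises the accuracy, which is the pattern the abstract describes for the real data.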