Abstract

Although the identification of inherent structure in chronic lymphocytic leukemia (CLL) gene expression data using class discovery approaches has not been extensively explored, the natural clustering of patient samples can reveal molecular subdivisions that have biological and clinical implications. To explore this, we preprocessed raw gene expression data from two published studies, combined the data to increase the statistical power, and performed unsupervised clustering analysis. The clustering analysis was replicated in four independent cohorts. To assess the biological significance of the resultant clusters, we evaluated their prognostic value and identified cluster-specific markers. The clustering analysis revealed two robust and stable subgroups of CLL patients in the pooled dataset. The subgroups were confirmed by different methodological approaches (non-negative matrix factorization (NMF) clustering and hierarchical clustering) and validated in different cohorts. The subdivisions were associated with differential clinical outcomes and with markers linked to the microenvironment and to the MAPK and BCR signaling pathways. The cluster markers were also found to be independent of the mutational status of the immunoglobulin heavy chain variable (IGVH) genes. These findings suggest that the microenvironment can influence the clinical behavior of CLL, contributing to prognostic differences. The workflow followed here provides a new perspective on differences in prognosis and highlights new markers that should be explored in this context.
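The abstract names two clustering approaches, NMF and hierarchical clustering, and reports that both recover the same two subgroups. The sketch below is only an illustration of that cross-check on synthetic data and is not the authors' pipeline. The toy matrix, the function names (`nmf`, `hierarchical`, `agreement`), the multiplicative-update NMF variant, the average-linkage Euclidean clustering, and all parameter values are assumptions chosen to keep the example dependency-free.

```python
import random

def nmf(V, k, iters=300, seed=0):
    """Approximate V (genes x samples, non-negative) as W @ H
    using Lee-Seung multiplicative updates (Frobenius loss)."""
    rng = random.Random(seed)
    n, m, eps = len(V), len(V[0]), 1e-9
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    K = range(k)
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        WtV = [[sum(W[i][a] * V[i][j] for i in range(n)) for j in range(m)] for a in K]
        WtW = [[sum(W[i][a] * W[i][b] for i in range(n)) for b in K] for a in K]
        H = [[H[a][j] * WtV[a][j] / (sum(WtW[a][b] * H[b][j] for b in K) + eps)
              for j in range(m)] for a in K]
        # W <- W * (V H^T) / (W H H^T)
        VHt = [[sum(V[i][j] * H[a][j] for j in range(m)) for a in K] for i in range(n)]
        HHt = [[sum(H[a][j] * H[b][j] for j in range(m)) for b in K] for a in K]
        W = [[W[i][a] * VHt[i][a] / (sum(W[i][b] * HHt[b][a] for b in K) + eps)
              for a in K] for i in range(n)]
    return W, H

def hierarchical(samples, n_clusters):
    """Agglomerative clustering with average linkage on Euclidean distance."""
    def dist(x, y):
        return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                total = sum(dist(samples[i], samples[j])
                            for i in clusters[p] for j in clusters[q])
                d = total / (len(clusters[p]) * len(clusters[q]))
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        merged = clusters.pop(q)      # q > p, so index p stays valid
        clusters[p].extend(merged)
    labels = [0] * len(samples)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def agreement(a, b):
    """Fraction of samples labelled alike, allowing the two labels to be swapped."""
    same = sum(x == y for x, y in zip(a, b)) / len(a)
    return max(same, 1 - same)

# Synthetic data (assumption): 20 genes, two groups of 6 samples.
# Genes 0-9 are high in group 0 and genes 10-19 are high in group 1.
rng = random.Random(42)
n_genes = 20
truth = [0] * 6 + [1] * 6
V = [[(10.0 if (g < n_genes // 2) == (truth[s] == 0) else 1.0) + rng.random()
      for s in range(len(truth))] for g in range(n_genes)]

# NMF: assign each sample to the metagene contributing most to its profile.
W, H = nmf(V, k=2)
weight = [sum(row[a] for row in W) for a in range(2)]
nmf_labels = [max(range(2), key=lambda a: H[a][s] * weight[a])
              for s in range(len(truth))]

# Hierarchical clustering on the sample columns, cut at two clusters.
hc_labels = hierarchical([list(col) for col in zip(*V)], n_clusters=2)

print("NMF vs hierarchical agreement:", agreement(nmf_labels, hc_labels))
```

On real expression data one would more plausibly rely on an established library implementation and repeat the NMF over many random starts; the hand-written versions here only make the logic of the comparison visible.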

Highlights

  • Chronic lymphocytic leukemia (CLL) is one of the most frequently occurring leukemias in adults in Western countries and is characterized by mature B cell accumulation in the blood, bone marrow and secondary lymphoid organs

  • We conclude that the similarity of differential expression patterns across cohorts reflects the robustness of the group structure, and we suggest that important genes such as TCL1A and ZNF331 account for the biological subdivision

  • In this paper, using a robust methodology and several cohorts of CLL patients reflecting a broad spectrum of molecular events in the disease, it was possible to distinguish two different patient subgroups and identify subgroup-specific genes

Introduction

Chronic lymphocytic leukemia (CLL) is one of the most frequently occurring leukemias in adults in Western countries and is characterized by mature B cell accumulation in the blood, bone marrow and secondary lymphoid organs. CLL patients can be divided into two major groups based on whether their immunoglobulin heavy chain variable region (IGVH) genes are mutated or unmutated. Patients with an unmutated IGVH gene have a less favorable prognosis than patients with a mutated IGVH gene [1, 2]. Different chromosomal aberrations, such as deletions in 11q, 13q, or 17p and trisomy 12, have been found in CLL patients, with varied prognostic implications [3]. Common genetic causes have not yet been identified [4], but recurrent mutations in TP53 and ATM and new mutations in NOTCH1, SF3B1, MYD88, BIRC3 and FBXW7 have been identified in recent years by next-generation sequencing [5].

PLOS ONE | DOI:10.1371/journal.pone.0137132. Clustering of Expression Data in Chronic Lymphocytic Leukemia
