Acute myeloid leukemia (AML) is a heterogeneous disease with variable responses to treatment, characterized by specific chromosomal breakpoint products such as AML-ETO and PML-RAR, or by mutations in genes such as c/EBPα or FLT3. These abnormalities are not sufficient to cause AML, and approximately 40% of AML cases present without these defects or have more complex abnormalities. To find genes involved in AML, we recently isolated common integration sites (CIS) from Graffi-1.4 MuLV (Gr1.4)-induced mouse myeloid leukemia and identified the genes affected by these integrations (Erkeland S.J., et al., J. Virol., 2004). To assess the significance of these genes for human AML, we determined their contribution to 16 distinct classes of patients recently defined by gene expression profiling of 285 AML samples using Affymetrix U133A microarrays (Valk P.J., et al., NEJM, 2004), plus 11 additional groups based on known chromosomal aberrations or gene mutations. Using the significance analysis of microarrays (SAM) method, we linked specific gene sets to these 27 patient classes. A gene was considered to contribute to the signature of a class when the following criteria were met: an RNA expression fold change above 1.5 or below 0.67, a score above 4 or below −4, and a q-value below 5%. Three gene lists were derived from the Gr1.4-induced leukemia samples: (I) genes (n=51, represented by 116 probe sets) within or most proximal to a CIS; (II) genes (n=53, 81 probe sets) located next to the CIS flanking genes; (III) genes (n=279, 468 probe sets) located more distantly from the CIS (up to 1 megabase), with a maximum of 5 genes in each direction. Each of these gene lists was compared with the 27 outcomes of the SAM analyses to count the number of cases in which a probe set from one of the three lists was significantly deregulated.
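The three signature criteria stated above can be sketched as a simple filter. This is only an illustration, not the SAM software itself: the function name and argument names are hypothetical, the q-value is assumed to be expressed as a fraction (5% = 0.05), and the three conditions are combined literally as written, without requiring the direction of the fold change to match the sign of the score.

```python
def contributes_to_signature(fold_change, score, q_value):
    """Return True when a probe set meets all three criteria from the text.

    fold_change: expression ratio of the class versus the remaining samples
    score:       test score reported by the SAM analysis
    q_value:     false discovery estimate as a fraction (assumption: 5% = 0.05)
    """
    fold_change_ok = fold_change > 1.5 or fold_change < 0.67
    score_ok = score > 4 or score < -4
    q_value_ok = q_value < 0.05
    return fold_change_ok and score_ok and q_value_ok


# Hypothetical probe sets, for illustration only (not data from the study).
examples = [
    ("probe_a", 2.0, 5.2, 0.01),   # up-regulated, passes all criteria
    ("probe_b", 0.5, -4.5, 0.04),  # down-regulated, passes all criteria
    ("probe_c", 1.2, 6.0, 0.01),   # fold change too small
]
for name, fc, sc, q in examples:
    print(name, contributes_to_signature(fc, sc, q))
```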
To establish whether the fraction of deregulated genes in each of the three lists was significantly higher than the fraction expected from the total list, a chi-square test was performed. The list of CIS genes shows a significant overrepresentation of deregulated genes (χ² = 13.4, p=0.0002). In contrast, the list of the two closest flanking genes does not show overrepresentation (χ² = 0.4, p=0.54), and neither does the list of 10 neighboring genes (χ² = 1.7, p=0.19). We repeated the analysis on a profiling dataset from 130 childhood AML samples (Ross M.E., et al., Blood, 2004). Using this dataset, we performed 5 SAM analyses based on known chromosomal aberrations or specific FAB type. Again, this showed an overrepresentation of deregulated genes for the CIS genes (χ² = 6.3, p=0.012) but not for the more distant genes (2 closest genes: χ² = 0.006, p=0.936; 10 neighboring genes: χ² = 1.67, p=0.196). We conclude from these analyses that the closest CIS flanking genes have the highest probability of being involved in human AML. We next performed a similar analysis using the list of virus integrations from the BXH2 myeloid leukemia model (http://genome2.ncifcrf.gov/RTCGD/). This list, consisting of 53 genes represented by 111 probe sets, also showed an overrepresentation of deregulated genes in adult AML (χ² = 112.2, p=0.000) as well as in pediatric AML samples (χ² = 14.0, p=0.0001). Examples of CIS genes found to be associated with specific subgroups of clinical AML are CTNNA1, IRS-2, VDUP1, DUSP10 and PRDXII. These results demonstrate the power of combining retroviral tagging of cancer genes with expression array analysis of clinical leukemia samples for the identification of pathogenetic mechanisms in human AML.
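The overrepresentation test can be illustrated with a short sketch. The abstract does not specify how the chi-square test was set up, so the 2×2 layout below (deregulated versus not deregulated, inside versus outside a gene list), the absence of a continuity correction, and all function names and counts are assumptions made for illustration. The reported statistics and p-values are consistent with one degree of freedom, which is what the p-value helper assumes.

```python
import math


def chi_square_2x2(hit_in, miss_in, hit_out, miss_out):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction).

    hit_in / miss_in:   deregulated / not deregulated cases inside the gene list
    hit_out / miss_out: deregulated / not deregulated cases outside the gene list
    """
    n = hit_in + miss_in + hit_out + miss_out
    row_in, row_out = hit_in + miss_in, hit_out + miss_out
    col_hit, col_miss = hit_in + hit_out, miss_in + miss_out
    if 0 in (row_in, row_out, col_hit, col_miss):
        raise ValueError("every row and column total must be non-zero")
    diff = hit_in * miss_out - miss_in * hit_out
    return n * diff * diff / (row_in * row_out * col_hit * col_miss)


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts, for illustration only (not data from the study).
stat = chi_square_2x2(20, 10, 10, 20)
print(round(stat, 2), round(p_value_1df(stat), 4))

# Converting a statistic quoted in the text back to a p-value.
print(round(p_value_1df(6.3), 3))
```

With one degree of freedom, a statistic of about 3.84 corresponds to p = 0.05, which gives a quick way to read the values quoted above.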