The therapeutic efficacy of tamoxifen is predominantly mediated by its active metabolites 4-hydroxy-tamoxifen and endoxifen, whose formation is catalyzed by the polymorphic cytochrome P450 2D6 (CYP2D6). Yet, known CYP2D6 polymorphisms only partially determine metabolite concentrations in vivo. We performed the first cross-ancestry genome-wide association study with well-characterized patients of European, Middle Eastern, and Asian descent (n=497) to identify genetic factors impacting active and parent metabolite formation. Genome-wide significant variants were functionally evaluated in an independent liver cohort (n=149) and in silico. Metabolite prediction models were validated in two independent European breast cancer cohorts (n=287, n=189). Within a single 1-megabase (Mb) region of chromosome 22q13 encompassing the CYP2D6 gene, 589 variants were significantly associated with tamoxifen metabolite concentrations, particularly endoxifen and the metabolic ratio (MR) endoxifen/N-desmethyltamoxifen (minimal P=5.4E-35 and 2.5E-65, respectively). Other previously suggested loci were not confirmed. Functional analyses revealed that 66% of the associated, mostly intergenic variants were significantly correlated with hepatic CYP2D6 activity or expression (ρ=0.35 to -0.52), and identified six hotspot regions in the extended 22q13 locus impacting gene regulatory function. Machine learning models based on hotspot variants (n=12) plus the CYP2D6 activity score (AS) increased the explained variability by ~9% compared with AS alone, explaining up to 49% (median R²) and 72% of the variability in endoxifen and MR endoxifen/N-desmethyltamoxifen, respectively. Our findings suggest that the extended CYP2D6 locus at 22q13 is the principal genetic determinant of endoxifen plasma concentration. Long-distance haplotypes connecting CYP2D6 with adjacent regulatory sites, together with nongenetic factors, may account for the unexplained portion of variability.