Tamoxifen is widely used in patients with hormone receptor-positive breast cancer. The polymorphic enzyme CYP2D6 is primarily responsible for metabolic activation of tamoxifen, resulting in substantial interindividual variability in plasma concentrations of its most important metabolite, Z-endoxifen. Z-endoxifen concentration thresholds below which tamoxifen treatment is less efficacious have been proposed but not validated, and prospective trials of individualized tamoxifen treatment to achieve these thresholds are considered infeasible. Therefore, we aim to validate the association between Z-endoxifen concentration and tamoxifen treatment outcomes, and to identify a Z-endoxifen concentration threshold of tamoxifen efficacy, using pharmacometric modeling and simulation. As a first step, the CYP2D6 Endoxifen Percentage Activity Model (CEPAM) cohort was created by pooling data from 28 clinical studies (>7,000 patients) with measured endoxifen plasma concentrations. After cleaning, data from 6,083 patients were used to develop a nonlinear mixed-effects (NLME) model for tamoxifen and Z-endoxifen pharmacokinetics that includes a conversion factor to allow inclusion of studies that measured total endoxifen but not Z-endoxifen. The final parent-metabolite NLME model confirmed the primary role of CYP2D6 in Z-endoxifen pharmacokinetics, with additional contributions from body weight, CYP2C9 phenotype, and co-medication with CYP2D6 inhibitors. Future work will use the model to simulate Z-endoxifen concentrations in patients who received single-agent tamoxifen treatment within large prospective clinical trials with long-term survival data, in order to identify the Z-endoxifen concentration threshold below which tamoxifen is less efficacious. Identification of this concentration threshold would allow personalized tamoxifen treatment to improve outcomes in patients with hormone receptor-positive breast cancer.