Abstract A mediation effect is the effect conveyed by a third variable, the mediator, in an observed relationship between an exposure and a response variable of interest. Mediation analysis is widely used in fields such as social science, prevention research, behavioral research, and epidemiology. For example, it is well established that low socioeconomic status is associated with poor health status. To reduce this health disparity, investigators need to quantify the mediation effects of the different risk factors that intervene in the relationship between socioeconomic status and health outcomes, so that efficient interventions can be carried out. The general multiple mediation analysis method proposed by Yu et al. improved on traditional methods (e.g., estimation of natural and controlled direct effects) by allowing multiple mediators to be considered simultaneously and by allowing both linear and nonlinear predictive models to be used for estimating mediation effects. A big data set includes a large number of variables, so in big data mediation analysis the number of potential mediators is large. For example, explaining the racial disparity in breast cancer survival requires considering many risk factors, such as individual behaviors, environmental factors, socioeconomic status, treatment effects, diagnostic characteristics, and even gene expression variables. In this study, we propose a regularized mediation analysis method that identifies significant mediators and estimates their indirect effects simultaneously. Breast cancer is the most common cancer and the second leading cause of cancer death among women of all races. Despite improved breast cancer survival rates in the U.S., a significant difference between white and black women remains.
Because of the limitations of current analytic methods and the lack of comprehensive data sets, researchers have not been able to separate the relative contribution of each factor to the overall racial disparity. We use data from the CDC-funded Patterns of Care study and the Enhancing Cancer Registry Data for Comparative Effectiveness Research study to examine the determinants of racial disparities in breast cancer survival with the proposed regularized multiple mediation analysis. Using the proposed method, we identified important factors that explain the racial disparity in breast cancer survival: cancer stage-related variables (stage, tumor size, lymph node involvement, and extension), molecular subtype, treatment methods such as surgery and hormonal therapy, and tumor grade. Age also contributed to the racial difference, but in the opposite direction: black patients are more likely to be diagnosed at a younger age, which is associated with better survival outcomes. Overall, we found that the factors included in the study explained all of the racial disparity in survival among Louisiana breast cancer patients.

Citation Format: Qingzhao Yu, Lu Zhang, Meichin Hsieh, Xiaocheng Wu, Richard A Scribner, Bin Li. Regularized multiple mediation analysis for big data set—With an application to explore racial disparity in breast cancer survival [abstract]. In: Proceedings of the Eleventh AACR Conference on the Science of Cancer Health Disparities in Racial/Ethnic Minorities and the Medically Underserved; 2018 Nov 2-5; New Orleans, LA. Philadelphia (PA): AACR; Cancer Epidemiol Biomarkers Prev 2020;29(6 Suppl):Abstract nr C020.