Abstract
Background: Many practical applications produce huge data sets containing several thousand objects, each described by hundreds (or even thousands) of features. Automatically modeling, organizing, and interpreting these large-scale data sets is increasingly part of disciplines such as health, medicine, and biology. This study proposes a new framework for constrained learning of predictive models, in which additional constraints force the models to be as similar as possible.

Methods: We define a new task, differential predictive modeling, which addresses differences in data distributions by developing a continuum of predictive models. To achieve this, we learn the models under additional constraints that keep them as similar as possible. The major distinction from existing methods is that none of them explores a model distance based on building similar trees. We compute the distance between two datasets from the structural information about the induced trees, without referring back to the original data statistics or the model statistics. Our data source is a 29-state Medicaid dataset that contains 100% of the claims for 90% of all African American and 90% of all Hispanic and Latino Medicaid enrollees in the entire U.S., geo-coded at the zip code and county levels.

Principal Findings: The goal of the disparities work is to capture the differences across population groups and to provide a better understanding that can explain these differences. Using this model, we can measure the difference within a particular minority group and compare these differences across the other subgroups. This is useful for ranking the groups from the highest disparity to the lowest. The novelty of our approach is that it captures the disparity of groups through a systematic comparison of multivariate models, which cannot be done with any existing machine learning method.

Conclusions: This model has the potential to dictate more aggressive new therapeutic treatments for specific subsets of high-risk patients. This project proposes new research directions in the exploratory tools that domain experts need to analyze their data. It will provide significant insights for data analysis that can be utilized in various domains of science. One of the ultimate goals of the proposed research is the development of clinician-friendly software and a web-based tool for analyzing multivariate health disparity data using the prediction models.

Citation Format: Wonsuk Yoo. Differential predictive modeling to grasp disparities among sub-populations [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr LB-154.
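The idea in the Methods of measuring distance from the structure of induced trees, and then ranking groups by that distance, can be illustrated with a minimal sketch. The abstract does not specify the distance itself, so everything here is an assumption made for illustration: the nested-tuple tree representation, the Jaccard distance over root-to-leaf split paths, the function names `paths` and `tree_distance`, and the hand-written toy trees with hypothetical feature names.

```python
# Illustrative only: a tree is either a leaf label (str) or a tuple
# (feature, left_subtree, right_subtree). This representation is an
# assumption, not the one used in the study.

def paths(tree, prefix=()):
    """Return the set of root-to-leaf paths as tuples of (feature, branch) steps.

    Leaf labels are ignored, so only the split structure is compared.
    """
    if isinstance(tree, str):  # leaf: the path ends here
        return {prefix}
    feature, left, right = tree
    return (paths(left, prefix + ((feature, "L"),))
            | paths(right, prefix + ((feature, "R"),)))


def tree_distance(a, b):
    """Jaccard distance between two trees' path sets (0.0 means same structure)."""
    pa, pb = paths(a), paths(b)
    return 1.0 - len(pa & pb) / len(pa | pb)


# Hypothetical toy trees, standing in for models induced on different groups.
reference = ("age", ("income", "low", "high"), "high")
groups = {
    "group_a": ("age", ("income", "low", "high"), "high"),
    "group_b": ("age", ("region", "low", "high"), "high"),
    "group_c": ("region", "low", ("age", "low", "high")),
}

# Distance of each group's tree from the reference tree, using structure only.
distances = {name: tree_distance(reference, tree) for name, tree in groups.items()}

# Rank groups from the largest structural difference to the smallest.
ranking = sorted(distances, key=distances.get, reverse=True)
print(ranking)
```

Because the distance uses only the split paths, it never touches the underlying records, which mirrors the stated aim of not referring back to the original data statistics. A real analysis would need a distance that also accounts for split thresholds and tree depth.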