Abstract

e13581 Background: Medical research on immunotherapies and targeted therapies has drawn heavy attention and investment as the number of approved therapies continues to rise. Advanced findings on cancer treatment derived from rich healthcare records help doctors understand patients' characteristics. The more accurately a mathematical model predicts the likelihood of disease appearance, progression, or treatment initiation, the more proactive and effective a targeting solution can be. Methods: Positive/negative labels were created for patients according to whether they started treatment within a time-sensitive window. Based on electronic health record (EHR) data on Procedures, Prescriptions, and Diagnoses (PPD), the interrelationships among patient features were captured in a conditional probability matrix. The PPD data were mapped to well-designed patient-image profiles using a modified genetic algorithm to optimize and measure closeness. A convolutional neural network (CNN) model was then used to extract characteristics from each patient's feature image and learn local patterns. The CNN model was trained together with other upgraded AI/ML models to enhance overall prediction precision. Results: A total of 144 aggregated PPD features for patients with chronic lymphocytic leukemia (CLL) were selected from the databases. Overall, positive patients, those who received treatment due to disease progression, made up around 3% of the 60K cases in each cohort, an extremely imbalanced dataset that makes model training challenging. In practice, we care about model precision at k, the percentage of truly positive patients among the k patients with the highest prediction scores. Compared with common machine learning models such as Random Forest, XGBoost, and CatBoost, our proposed CNN model boosts the performance of the baseline model ensemble, raising average prediction precision at 1000 from 14.9% to 17.1%, a relative increase of about 15%.
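The precision-at-k metric and the relative increase quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the function names `precision_at_k` and `relative_increase` and the toy scores are assumptions made here, and only the 14.9% and 17.1% figures come from the abstract.

```python
def precision_at_k(scores, labels, k):
    """Fraction of true positives among the k highest-scoring patients.

    scores: predicted scores, one per patient (hypothetical values below).
    labels: 1 if the patient actually started treatment, else 0.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank patient indices by descending prediction score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = ranked[:k]
    # If k exceeds the cohort size, this divides by the cohort size instead.
    return sum(labels[i] for i in top) / len(top)


def relative_increase(baseline, improved):
    """Relative change from baseline to improved, as a fraction."""
    return (improved - baseline) / baseline


# Toy data for illustration only (not study data).
scores = [0.9, 0.1, 0.8, 0.4, 0.7]
labels = [1, 0, 0, 0, 1]
print(precision_at_k(scores, labels, 3))           # 2 of the top 3 are positive
print(round(relative_increase(0.149, 0.171), 3))   # figures from the abstract
```

With the reported values, (17.1 - 14.9) / 14.9 is roughly 0.148, which matches the "about 15%" relative increase stated in the Results.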
Using this approach, many more correctly identified patients could potentially receive targeted treatment on time. Patients' top features and key feature interactions can also be identified as important references. Conclusions: The novel CNN boosting algorithm considers aggregated PPD feature pairs through the graphical structure, significantly improves prediction model performance, and increases model interpretability. In summary, the proposed model can help identify more potential patient candidates and determine precise treatment options.
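The Methods mention a conditional probability matrix over PPD features without defining it. One plausible reading, offered only as an assumption, is that entry (i, j) estimates the probability that feature j is present given that feature i is present, computed from per-patient presence flags. The sketch below implements that reading; `conditional_probability_matrix` and the toy records are hypothetical, and the study's actual construction may differ.

```python
def conditional_probability_matrix(records, n_features):
    """Estimate M[i][j] = P(feature j present | feature i present).

    records: one set of feature indices per patient (assumed binary presence).
    Rows for features that never occur are left as zeros.
    """
    count = [0] * n_features                             # patients with feature i
    joint = [[0] * n_features for _ in range(n_features)]  # patients with i and j
    for feats in records:
        for i in feats:
            count[i] += 1
            for j in feats:
                joint[i][j] += 1
    return [
        [joint[i][j] / count[i] if count[i] else 0.0 for j in range(n_features)]
        for i in range(n_features)
    ]


# Toy records for illustration only (not study data).
records = [{0, 1}, {0, 2}, {0, 1, 2}, {1}]
m = conditional_probability_matrix(records, 3)
print(m[0][1])  # share of patients with feature 0 who also have feature 1
print(m[1][2])  # share of patients with feature 1 who also have feature 2
```

Under this reading the matrix is generally asymmetric, and feature pairs with high conditional probability in either direction would be natural candidates for placement near each other in the patient image.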
