Abstract

We combined clinical practice changes and standardizations with informatics technologies to automate the aggregation, integration and harmonization of comprehensive patient data from the multiple source systems used in clinical practice into a big data analytics resource system (BDARS). This system supports the development of artificial intelligence (AI) algorithms for clinically actionable analytics to guide decision frameworks. Our purpose was to combine the BDARS with AI to identify additional default planning constraints from our historic data to further reduce dysphagia risk. The BDARS harmonized data for over 22,000 patients, from which we identified 135 of 496 recent patients treated for head and neck cancer whose physician-graded dysphagia toxicity scores worsened from baseline to a maximum ≥ 2. Our method used both physical and biologically corrected (α/β = 2.5) dose-volume histogram (DVH) curves from the BDARS to examine both absolute and percentage volume-based DVH metrics for evidence predictive of dysphagia. The method combined a statistical categorization algorithm (SCA) with machine learning (ML) to identify DVH metrics with stronger mutual evidence and more extensive detailing of evidence than either approach alone. SCA-ML ranking was used to cull the large set of DVH metric candidates to a reduced set for input to a predictive ML model (PMLM), reducing the risk of overfitting. The PMLM was constructed iteratively, minimizing the number of DVH metric inputs according to their ability to improve the model sensitivity. Seven swallowing structures producing 738 candidate DVH metrics were examined with the SCA-ML. Structures included the superior constrictor muscle (SCM), inferior constrictor muscle (ICM), larynx (L), esophagus (E), ipsilateral and contralateral parotids (IP, CP) and ipsilateral and contralateral submandibular glands (ISG, CSG). Structures not contoured on at least 90% of the plans were noted for evidence but excluded from the predictive model.
High SCA-ML scores were identified for SCM: D20%[EQD2Gy] ≤ 47.7 and D25%[Gy] ≤ 50.4, ISG: D35%[Gy] ≤ 61.7 and E: D2cc[Gy] ≤ 22.6. ICM: D55%[EQD2Gy] ≤ 10.8 had a low SCA-ML score. High SCA-ML scores were also obtained for L: D25%[Gy] ≤ 21.2 and CSG: D45%[Gy] ≤ 28.8, but these metrics were excluded from the model because of missing contours. Sensitivity (0.88 ± 0.13) and AUC (0.74 ± 0.17) of the baseline PMLM were not significantly different (p > 0.05) from those of the PMLM using SCM: D20%[EQD2Gy] and ISG: D35%[Gy], with SCM: D20%[EQD2Gy] being the dominant factor. Using the physical dose metric SCM: D25%[Gy] instead of the bio-corrected alternative did not significantly degrade the model. The SCA-ML based AI approach combined with the BDARS identified SCM: D25%[Gy] ≤ 50.4 as an additional mid-priority metric that, together with the historic constraints ICM: Mean[Gy] ≤ 20, L: Mean[Gy] ≤ 20 and SCM: Mean[Gy] ≤ 50, could improve outcomes. This study provided a practical demonstration of how big data evidence combined with AI could advance iterative clinical learning paradigms.
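
For readers unfamiliar with the notation, the following is a minimal Python sketch of how a percentage-volume DVH metric such as SCM: D25%[Gy] or SCM: D20%[EQD2Gy] might be evaluated for one structure. It is illustrative only and is not the BDARS or SCA-ML implementation. The abstract does not state how the biological correction was computed; the sketch assumes the standard linear-quadratic EQD2 conversion applied voxel by voxel, equal voxel volumes, and hypothetical function names (`eqd2`, `dose_at_volume_percent`, `meets_scm_constraints`). Only the α/β value and the two thresholds are taken from the text.

```python
import numpy as np

ALPHA_BETA_GY = 2.5  # alpha/beta ratio quoted in the abstract


def eqd2(total_dose_gy, n_fractions, alpha_beta=ALPHA_BETA_GY):
    """Equivalent dose in 2 Gy fractions (assumed linear-quadratic form)."""
    total_dose_gy = np.asarray(total_dose_gy, dtype=float)
    dose_per_fraction = total_dose_gy / n_fractions
    return total_dose_gy * (dose_per_fraction + alpha_beta) / (2.0 + alpha_beta)


def dose_at_volume_percent(voxel_doses_gy, percent):
    """D_x%: the dose that the hottest x% of the structure receives at least.

    Assumes every voxel has the same volume, so D_x% is simply the
    (100 - x)th percentile of the voxel doses.
    """
    return float(np.percentile(voxel_doses_gy, 100.0 - percent))


def meets_scm_constraints(voxel_doses_gy, n_fractions):
    """Check the two SCM thresholds reported in the abstract."""
    d25_physical = dose_at_volume_percent(voxel_doses_gy, 25)
    d20_eqd2 = dose_at_volume_percent(eqd2(voxel_doses_gy, n_fractions), 20)
    return {
        "SCM: D25%[Gy] <= 50.4": bool(d25_physical <= 50.4),
        "SCM: D20%[EQD2Gy] <= 47.7": bool(d20_eqd2 <= 47.7),
    }
```

With 2 Gy per fraction the conversion leaves the dose unchanged, so the physical and EQD2 metrics differ only for voxels receiving other doses per fraction. That is consistent with, though not proof of, the abstract's finding that the physical metric could substitute for the bio-corrected one.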
