Abstract

Colorectal cancer (CRC) ranks third in both prevalence and mortality among all cancers in the US. Currently, the United States Preventive Services Task Force (USPSTF) recommends CRC screening for anyone aged 50–75 and/or with a family history. To improve screening specificity and sensitivity, we built an artificial neural network (ANN) trained on 12 to 14 categories of personal health data from the National Health Interview Survey (NHIS). The 1997–2016 NHIS years contain 583,770 respondents who had never received a diagnosis of any cancer and 1,409 who had received a diagnosis of CRC within 4 years of taking the survey. Evaluated per the guidelines of Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) level 2a, the trained ANN has a sensitivity of 0.57 ± 0.03, a specificity of 0.89 ± 0.02, a positive predictive value of 0.0075 ± 0.0003, a negative predictive value of 0.999 ± 0.001, and a concordance of 0.80 ± 0.05, comparable to current risk-scoring methods. To demonstrate clinical applicability, both the USPSTF guidelines and the trained ANN are used to stratify respondents to the 2017 NHIS into low-, medium-, and high-risk categories (TRIPOD levels 4 and 2b, respectively). The proportion of CRC respondents misclassified as low risk decreases from 35% under the screening guidelines to 5% with the ANN (out of 60 cases). The proportion of non-CRC respondents misclassified as high risk decreases from 53% under the screening guidelines to 6% with the ANN (out of 25,457 cases). Our results demonstrate a robustly tested method of stratifying CRC risk that is non-invasive, cost-effective, and easy to implement publicly.
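For readers less familiar with the four screening metrics quoted above, the following is a minimal sketch of how they are derived from a binary confusion matrix. The counts are hypothetical and chosen only to mimic a rare condition; they are not NHIS data and are not meant to reproduce the reported values. The names `ConfusionCounts` and `screening_metrics` are illustrative assumptions, not part of the study's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary screening test (hypothetical values, not NHIS data)."""
    tp: int  # CRC respondents flagged as positive
    fn: int  # CRC respondents missed
    fp: int  # non-CRC respondents flagged as positive
    tn: int  # non-CRC respondents correctly cleared


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # fraction of cases detected
        "specificity": c.tn / (c.tn + c.fp),  # fraction of non-cases cleared
        "ppv": c.tp / (c.tp + c.fp),          # P(case | flagged positive)
        "npv": c.tn / (c.tn + c.fn),          # P(non-case | flagged negative)
    }


# Illustrative counts for a rare condition: 100 cases among 100,000 respondents.
counts = ConfusionCounts(tp=60, fn=40, fp=9_990, tn=89_910)
metrics = screening_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With a condition this rare, false positives far outnumber true positives even at 90% specificity, so the positive predictive value stays small while the negative predictive value stays close to 1. This is the general pattern behind a low PPV alongside a high NPV in population screening.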

Highlights

  • Colorectal adenocarcinomas are the result of unregulated growth in the colon mucosa that commonly starts with polypoid lesions progressing into advanced cancers [1]

  • While the National Health Interview Survey (NHIS) dataset records the age at which the respondent was professionally diagnosed with colorectal cancer (CRC), it does not record when diagnoses of other predictors were given

  • The crossed error bars on each receiver operating characteristic (ROC) curve show the standard deviation across the ten folds of cross-testing
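One way to read the crossed error bars described in the last highlight is sketched below: at each decision threshold, the false positive rate and true positive rate are computed separately for every fold, and the standard deviation of each across folds gives the horizontal and vertical bar. This is an assumed interpretation, not the authors' code. The scores are synthetic, the helper names `roc_point` and `make_fold` are hypothetical, and the sample standard deviation is an arbitrary choice, since the source does not say which estimator was used.

```python
import random
from statistics import mean, stdev


def roc_point(scores, labels, threshold):
    """Return (false positive rate, true positive rate) at one threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives


def make_fold(rng, n_pos=50, n_neg=500):
    """Synthetic fold (not NHIS data): positives score higher on average."""
    scores = [rng.gauss(0.65, 0.15) for _ in range(n_pos)]
    scores += [rng.gauss(0.35, 0.15) for _ in range(n_neg)]
    labels = [1] * n_pos + [0] * n_neg
    return scores, labels


rng = random.Random(0)  # fixed seed so the sketch is repeatable
folds = [make_fold(rng) for _ in range(10)]  # ten folds of cross-testing
thresholds = [i / 10 for i in range(11)]

roc_summary = []
for t in thresholds:
    points = [roc_point(scores, labels, t) for scores, labels in folds]
    fprs = [p[0] for p in points]
    tprs = [p[1] for p in points]
    roc_summary.append({
        "threshold": t,
        "fpr_mean": mean(fprs), "fpr_sd": stdev(fprs),  # horizontal error bar
        "tpr_mean": mean(tprs), "tpr_sd": stdev(tprs),  # vertical error bar
    })

for row in roc_summary:
    print(row)
```

Each entry in `roc_summary` corresponds to one point on the averaged curve, with `fpr_sd` and `tpr_sd` as the half-lengths of the two crossed bars.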

Introduction

Colorectal adenocarcinomas are the result of unregulated growth in the colon mucosa that commonly starts with polypoid lesions progressing into advanced cancers [1]. Colorectal cancer (CRC) accounts for 8.4% of all cancer deaths, and the overall 5-year survival rate is 66% [2].

BJN et al. took over and continued this study with funding from R01EB022589, while DAR went on to be employed and supported with a salary by Sun Nuclear Corporation. Sun Nuclear Corporation provided support in the form of a salary for author DAR but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific role of DAR is articulated in the ‘author contributions’ section.

