Colorectal cancer (CRC) is the unregulated growth of one or more adenomatous polyps (most frequently adenocarcinomas) occurring in the colon and/or the rectum. Of all new cancer cases in the US, 8.0% are colorectal. Colorectal cancer accounts for 8.4% of all cancer deaths and thus has above-average mortality relative to other cancers. To detect CRC at early stages, the U.S. Preventive Services Task Force (USPSTF) recommends screening by colonoscopy or sigmoidoscopy for ages 50-75. However, the high false-positive rates of these procedures lead to substantial unnecessary screening. The goal of this study is to develop an artificial neural network (ANN) for colorectal cancer risk prediction that can be used to triage people for screening based on their personal health data. The ANN is trained and validated on the 1997-2016 responses (excepting 2004) to the National Health Interview Survey (NHIS) personal health questionnaire, in which colon and/or rectal cancer occurring 4 years or less from the survey date is counted as one instance of CRC. Respondents with one or more null entries were discarded. Training uses 70% of the NHIS data: gradient descent in parameter space adjusts the parameters of a softmax function to minimize a logistic cost function (regression). The size of each parameter adjustment is calculated using the chain rule of calculus (backpropagation). Validation tests the regression-fitted function on the remaining 30% of the data. As a binary test, sensitivity (TPR), specificity (SPC), and positive/negative predictive values (PPV/NPV) are calculated and compared with those of the USPSTF guidelines. As a trinary test, three levels of risk stratification are defined by requiring that no more than 1% of CRC cases be misclassified as low-risk and no more than 1% of non-CRC cases be misclassified as high-risk.
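The training and validation steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: it assumes a single softmax layer with no hidden layers, NumPy, and synthetic stand-in data rather than NHIS responses. The names `train_softmax`, `cross_entropy`, `lr`, and `epochs` are hypothetical.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum so the exponentials cannot overflow.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_softmax(X, y, n_classes=2, lr=0.1, epochs=500):
    """Fit weights W and bias b by batch gradient descent on the logistic
    (cross-entropy) cost. Assumption: one softmax layer, no hidden layers."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]              # one-hot encoding of the labels
    for _ in range(epochs):
        P = softmax(X @ W + b)            # predicted class probabilities
        grad_z = (P - Y) / n              # d(cost)/d(logits), from the chain rule
        W -= lr * (X.T @ grad_z)          # propagate the gradient back to W
        b -= lr * grad_z.sum(axis=0)      # and to b
    return W, b

def cross_entropy(X, y, W, b):
    P = softmax(X @ W + b)
    return -np.mean(np.log(P[np.arange(len(y)), y] + 1e-12))

# Synthetic stand-in for the questionnaire data (hypothetical, not NHIS).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.array([1.5, -2.0, 0.5, 0.0, 1.0])
y = (X @ true_w + rng.normal(scale=0.5, size=1000) > 0).astype(int)

# 70% of the rows for training, the remaining 30% for validation.
idx = rng.permutation(len(X))
split = int(0.7 * len(X))
train_idx, val_idx = idx[:split], idx[split:]

W, b = train_softmax(X[train_idx], y[train_idx])
val_pred = softmax(X[val_idx] @ W + b).argmax(axis=1)
val_accuracy = (val_pred == y[val_idx]).mean()
```

With two classes, the softmax output reduces to the familiar logistic function of the difference between the two logits, which is why the cost can be described as logistic.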
If a medium-risk person with CRC is counted as half of a false negative, and a medium-risk person without CRC as half of a false positive (e.g., annual screening for those at high risk and biennial screening for those at medium risk), the trinary test of risk stratification also has an associated TPR, SPC, PPV, and NPV. As a binary test, the ANN has a sensitivity of 0.670, specificity of 0.687, PPV of 0.00945, and NPV of 0.997 (Table 1), all (except for NPV) exceeding those of the USPSTF guidelines and independent of tumor advancement. As a trinary test, lowered SPC and PPV are exchanged for higher TPR and NPV.

Table 1 (Abstract MO_30_2748). Performance of the USPSTF guidelines (entire data set) compared with the ANN (validation portion of the data set only).

| Test | CRC, # screened | CRC, # not screened | No CRC, # screened | No CRC, # not screened |
| --- | --- | --- | --- | --- |
| ANN (bi.) | 517 | 255 | 106,864 | 235,029 |
| ANN (tri.) | 497 | 145 | 142,603.5 | 207,004 |
| USPSTF | 762 | 490 | 232,545 | 297,065 |

| Test | TPR | SPC | PPV | NPV |
| --- | --- | --- | --- | --- |
| ANN (bi.) | 0.670 | 0.687 | 0.00945 | 0.997 |
| ANN (tri.) | 0.774 | 0.408 | 0.00347 | 0.9993 |
| USPSTF | 0.609 | 0.439 | 0.00327 | 0.998 |

CRC risk calculated by the ANN from personal health data is noninvasive, insensitive to tumor advancement, and outperforms the USPSTF screening guidelines as both a binary and a trinary test. In addition, the ANN offers the prospect of CRC risk assessment in real time and on the world map.
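The risk stratification and the half-weighting of medium-risk cases can be illustrated with a short sketch. It is one possible reading of the abstract's description, not the authors' code. The scores and labels are toy values invented for illustration, and the function names, the `max_rate` parameter, and the use of strict inequalities at the thresholds are all assumptions.

```python
def choose_thresholds(scores, labels, max_rate=0.01):
    """Pick (t_low, t_high) so that at most `max_rate` of CRC cases score
    below t_low and at most `max_rate` of non-CRC cases score above t_high."""
    crc = sorted(s for s, y in zip(scores, labels) if y)
    non = sorted(s for s, y in zip(scores, labels) if not y)
    k = int(max_rate * len(crc))   # CRC cases allowed in the low-risk group
    m = int(max_rate * len(non))   # non-CRC cases allowed in the high-risk group
    return crc[k], non[len(non) - 1 - m]

def trinary_counts(scores, labels, t_low, t_high):
    """Weighted confusion counts: high risk counts as screened (weight 1),
    low risk as not screened (weight 0), medium risk as half of each."""
    tp = fn = fp = tn = 0.0
    for s, has_crc in zip(scores, labels):
        if s > t_high:
            w = 1.0        # high risk
        elif s < t_low:
            w = 0.0        # low risk
        else:
            w = 0.5        # medium risk: half screened, half not
        if has_crc:
            tp += w
            fn += 1.0 - w
        else:
            fp += w
            tn += 1.0 - w
    return tp, fn, fp, tn

def screening_metrics(tp, fn, fp, tn):
    """Standard definitions of the four reported quantities."""
    return {
        "TPR": tp / (tp + fn),   # sensitivity
        "SPC": tn / (tn + fp),   # specificity
        "PPV": tp / (tp + fp),   # positive predictive value
        "NPV": tn / (tn + fn),   # negative predictive value
    }

# Toy risk scores and CRC labels (hypothetical, not from the study).
scores = [0.9, 0.8, 0.5, 0.2, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

t_low, t_high = choose_thresholds(scores, labels)
counts = trinary_counts(scores, labels, t_low, t_high)
metrics = screening_metrics(*counts)
```

Weighting a medium-risk person by one half is what produces fractional counts in the trinary row of a table such as Table 1.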