Abstract

Colorectal cancer (CRC), which arises through a multistep process driven by multiple factors, is one of the most common life-threatening cancers worldwide. Identifying “high-risk” populations is critical for early diagnosis and for improving the overall survival rate. Among the many genetic and environmental factors involved, which group contributes most to colorectal carcinogenesis remains contentious. This study therefore collects relatively complete information on genetic variations and environmental exposures for both CRC patients and cancer-free controls, and uses these big data to train and test a multimethod ensemble model for CRC-risk prediction. Our results demonstrate that (1) the genetic and environmental biomarkers explored are linked to CRC by biological function-based or population-based evidence, (2) the model can efficiently predict CRC risk after its parameters are optimized on the large CRC-related data set, and (3) our novel heterogeneous ensemble learning model (HELM) and generalized kernel recursive maximum correntropy (GKRMC) algorithm have high predictive power. Finally, we discuss why HELM and GKRMC can outperform classical regression algorithms, and we outline related subjects for future study.
