Abstract

Machine learning is an important method for predicting the extent of cancer. In recent years, there has been considerable progress in detecting microscopic features, such as the specific surface area, nuclear volume, and perimeter of cancer cells. There has been much less development in measuring macro symptoms, such as coughing of blood and chest pain. Because detailed data are difficult to obtain in some remote and underdeveloped areas, macro features need to be considered. In this paper, we use relevant patient characteristics and environmental factors to predict the degree of cancer. During data preprocessing, irrelevant content is deleted, text data are converted to numerical values, and the data are divided into a training set and a test set. Naive Bayes, Decision Tree, Random Forest, and Support Vector Machine are then applied in turn, and their respective results are recorded. All four classifiers achieve a fairly high prediction rate. The more relevant features are then identified using the Pearson correlation coefficient and feature importance. Predictions based on these selected features not only maintain a high prediction rate but also greatly reduce memory consumption and training time.
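The correlation-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the column names, the toy values, the Low/Medium/High label encoding, and the 0.5 threshold are all hypothetical and chosen only to show how the Pearson correlation coefficient can rank macro features against a digitized cancer level.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(columns, target):
    """Return (name, r) pairs sorted by absolute correlation, strongest first."""
    scores = [(name, pearson(values, target)) for name, values in columns.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


def select_features(columns, target, threshold):
    """Keep the features whose absolute correlation reaches the threshold."""
    return [name for name, r in rank_features(columns, target) if abs(r) >= threshold]


# Hypothetical label encoding: text levels are digitized before correlation.
LEVELS = {"Low": 0, "Medium": 1, "High": 2}
labels = ["Low", "Low", "Medium", "Medium", "High", "High"]
target = [LEVELS[label] for label in labels]

# Hypothetical toy columns; patient_id stands in for an irrelevant field.
columns = {
    "coughing_of_blood": [1, 2, 4, 5, 7, 8],
    "chest_pain": [2, 1, 4, 3, 6, 7],
    "patient_id": [3, 6, 1, 4, 2, 5],
}

selected = select_features(columns, target, threshold=0.5)
print(selected)
```

On this toy data the two symptom columns pass the threshold and the identifier column is dropped, which is the kind of reduction that would shrink the training input for the four classifiers. A tree-based feature importance score would be a separate, complementary ranking and is not shown here.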
