Abstract

Advances in medical science and technology have provided more methods for diagnosing and treating malignant tumors, and the survival of cancer patients has been significantly extended. However, many patients with malignant tumors still experience recurrence and metastasis after effective treatment. Understanding the mechanisms of tumor recurrence and metastasis in order to predict them is a major clinical problem. At the same time, the rapid development of the Human Genome Project and gene microarray technology has made it possible to measure the activity of many genes in a patient directly on a chip, and the rapid development of machine learning has supported the mining of DNA microarray data for medical research. This project therefore addresses the above problem by predicting the location and time of recurrence from a large amount of clinical and genetic data. First, we perform simple data cleaning and normalization on the clinical and genetic data. Second, we screen for differentially expressed genes. Next, we apply principal component analysis, sparse principal component analysis, kernel principal component analysis, and multidimensional scaling to reduce the dimensionality of the data. The genetic data are then modeled with random forest, support vector machine, linear support vector machine, bootstrap aggregating (bagging), gradient boosting, and ensemble learning; the best parameters and methods are found through grid search, and an appropriate model evaluation method is selected. For the clinical data, features are selected manually and then classified using machine learning. Finally, the results from the clinical and genetic data are combined to predict the site of recurrence. This approach achieved good results in predicting both recurrence and its location. Taking the prediction of whether recurrence occurs as an example, on the validation set the accuracy reached 0.825, the recall reached 0.801, and the F1 score reached 0.800. Through this retrospective study and prediction of gastric cancer recurrence, the model proposed in this paper shows potential clinical value.
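
As a rough illustration of the kind of workflow the abstract describes (normalization, dimensionality reduction, a classifier tuned by grid search, then accuracy, recall, and F1 on a held-out validation set), the following is a minimal sketch using scikit-learn. It is not the authors' implementation: the data are synthetic, the differential gene screening step is omitted, and the choice of PCA with a random forest, the parameter grid, and the split sizes are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an expression matrix (samples x genes).
# This is NOT the study's data; labels mimic "recurrence: yes/no".
X, y = make_classification(
    n_samples=200, n_features=500, n_informative=20, random_state=0
)

# Hold out a validation set (the 75/25 split is an assumption).
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Normalization -> dimensionality reduction -> classifier.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("reduce", PCA(random_state=0)),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid; the abstract does not state the values searched.
param_grid = {
    "reduce__n_components": [10, 20],
    "clf__n_estimators": [50, 100],
    "clf__max_depth": [None, 5],
}

# Grid search with 3-fold cross-validation on the training split only.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="f1")
search.fit(X_train, y_train)

# Evaluate the best configuration on the held-out validation set.
y_pred = search.predict(X_val)
scores = {
    "accuracy": accuracy_score(y_val, y_pred),
    "recall": recall_score(y_val, y_pred),
    "f1": f1_score(y_val, y_pred),
}
print(search.best_params_)
print(scores)
```

Keeping the scaler and the reduction step inside the `Pipeline` means they are refit on each cross-validation fold, so the validation folds do not leak into the preprocessing. Swapping `PCA` for `SparsePCA` or `KernelPCA`, or the random forest for one of the other classifiers named above, would only change the corresponding pipeline step and its grid entries.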
