Abstract

Early detection of disease is crucial, particularly where proper medical facilities are unavailable. Cancer is one of the critical diseases for which early detection is essential to survival. A cancer tumor can involve thousands of genetic mutations, and understanding them is a tedious and time-consuming task, because a molecular pathologist must analyse the list of genetic variations manually. The clinical indications fall into nine classes, but the classification itself is still unknown. The objective of this work is to propose a multiclass classifier that assigns genetic mutations to these classes on the basis of the clinical evidence. The clinical evidence consists of text describing the gene mutations and is analysed with Natural Language Processing (NLP). Several machine learning models, namely Naive Bayes, Logistic Regression, Linear Support Vector Machine, and Random Forest Classifier, are applied to the collected dataset, which contains the evidence on genetic mutations and the other clinical evidence that pathologists or specialists use to classify gene mutations. The models are implemented using gene, variation, and text features, and their performance is compared to identify the best-performing model. Based on the variants of a gene mutation, the risk of cancer can be assessed and medication can be prescribed accordingly.
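As an illustration of the approach described above, the following is a minimal sketch of a multiclass text classifier over gene, variation, and text features. It is not the authors' implementation: it uses a plain multinomial Naive Bayes with Laplace smoothing written against the Python standard library only, and the helper names (tokenize, MultinomialNB), the toy rows, the gene and variation identifiers, and the three class labels are all hypothetical placeholders rather than data from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(gene, variation, text):
    """Turn one record into a bag of tokens (hypothetical feature scheme).

    Gene and variation are prefixed so they cannot collide with words
    from the clinical text.
    """
    return [f"gene={gene.lower()}", f"var={variation.lower()}"] + text.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.token_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, tokens):
        """Return the unnormalised log posterior for every class."""
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_docs / self.total_docs)  # log prior
            for tok in tokens:
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((counts[tok] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Toy, made-up training rows: (gene, variation, clinical text, class label).
# Only three classes are used here to keep the example short.
train = [
    ("GENE_A", "V1", "loss of function truncating mutation", 1),
    ("GENE_A", "V2", "truncating mutation reduces protein function", 1),
    ("GENE_B", "V3", "gain of function activating mutation", 2),
    ("GENE_B", "V4", "activating mutation increases kinase activity", 2),
    ("GENE_C", "V5", "neutral variant with no functional effect", 3),
    ("GENE_C", "V6", "variant shows no effect on activity", 3),
]

docs = [tokenize(g, v, t) for g, v, t, _ in train]
labels = [label for *_, label in train]
model = MultinomialNB(alpha=1.0).fit(docs, labels)

# Classify an unseen variation of a known gene.
predicted = model.predict(
    tokenize("GENE_B", "V9", "activating mutation increases activity")
)
print(predicted)
```

In practice the same feature pipeline could feed each of the other models named above (Logistic Regression, Linear Support Vector Machine, Random Forest) so that their performance can be compared on held-out data; the Naive Bayes model is shown here only because it is the simplest to write out in full.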
