Abstract

According to the CDC, ovarian cancer is the second most prevalent gynecologic cancer in the United States and the fifth leading cause of cancer death in women. The only reliable screening method for this cancer is transvaginal sonography (TVS), which is both invasive and costly. The goal of this project was to use the mRMR (Maximum Relevance Minimum Redundancy) feature selection algorithm to select a panel of biomarkers from an ovarian cancer dataset, and to create a non-invasive, inexpensive software tool that could help validate the panel and assist with the early detection of ovarian cancer with a reasonable level of sensitivity.
This project uses an ovarian cancer dataset with 49 features. The mRMR filter method of feature selection [9, 10, 12] eliminates redundant features while keeping the relevant features that affect the target class. The project met its final goal of creating a working web application that asks a clinician for a few basic blood test results and generates a prediction. The machine learning model [7] used by the application is a Random Forest built on the K best features picked by the mRMR algorithm. It is used to predict the disease and treatment targets, thereby helping to reduce the mortality rate from ovarian cancer.
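The greedy idea behind mRMR can be illustrated with a minimal sketch. It is not the project's implementation: it assumes absolute Pearson correlation as a simple stand-in for the mutual-information scores that mRMR usually relies on, uses the difference form (relevance minus mean redundancy), and runs on made-up toy columns. The names `pearson`, `mrmr_select`, `marker_a`, `marker_a_scaled`, and `marker_b` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def mrmr_select(columns, target, k):
    """Greedily pick k feature names, maximising relevance minus mean redundancy.

    columns: dict mapping feature name -> list of values
    target:  list of class labels (0/1)
    Assumption: |Pearson r| stands in for mutual information.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in columns.items()}
    selected = []
    remaining = set(columns)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            abs(pearson(columns[name], columns[s])) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (invented for illustration, not from the ovarian cancer dataset)
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "marker_a": [1, 1, 1, 2, 2, 3, 3, 3],
    "marker_a_scaled": [2, 2, 2, 4, 4, 6, 6, 6],  # perfectly redundant with marker_a
    "marker_b": [2, 1, 2, 1, 3, 2, 3, 2],  # less relevant, but adds new information
}
selected = mrmr_select(columns, target, k=2)
print(selected)
```

In this toy run the redundant copy is passed over in favour of `marker_b`, even though the copy is more strongly correlated with the target on its own. The selected columns would then be the inputs to the classifier.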
This project used a Random Forest classifier, which has been shown to work well with smaller datasets such as the one used here. The model achieved a sensitivity score of 0.96.
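For reference, sensitivity (also called recall or the true positive rate) is the fraction of actual positive cases that the model flags as positive: TP / (TP + FN). The sketch below computes it from label lists. The function name `sensitivity` and the toy labels are illustrative assumptions, not the project's data or code.

```python
def sensitivity(y_true, y_pred, positive=1):
    """Return TP / (TP + FN) for the given positive class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive cases")
    return tp / (tp + fn)


# Toy labels (invented): four true positives in y_true, three of them detected
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
toy_sensitivity = sensitivity(y_true, y_pred)
print(toy_sensitivity)
```

A high sensitivity matters for a screening aid because a false negative (a missed cancer) is typically more costly than a false positive that leads to follow-up testing.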
