Breast cancer stands as the most frequently diagnosed life-threatening cancer among women worldwide. Understanding patients' drug experiences is essential to improving treatment strategies and outcomes. In this research, we conduct knowledge discovery on breast cancer drugs using patients' reviews. A new machine learning approach is developed by employing clustering, text mining and regression techniques. We first use Latent Dirichlet Allocation (LDA) technique to discover the main aspects of patients' experiences from the patients' reviews on breast cancer drugs. We also use Expectation-Maximization (EM) algorithm to segment the data based on patients' overall satisfaction. We then use the Forward Entry Regression technique to find the relationship between aspects of patients' experiences and drug's effectiveness in each segment. The textual reviews analysis on breast cancer drugs found 8 main side effects: Musculoskeletal Effects, Menopausal Effects, Dermatological Effects, Metabolic Effects, Gastrointestinal Effects, Neurological and Cognitive Effects, Respiratory Effects and Cardiovascular. The results are provided and discussed. The findings of this study are expected to offer valuable insights and practical guidance for prospective patients, aiding them in making informed decisions regarding breast cancer drug consumption.
Read full abstract