Abstract

Breast cancer diagnosis is a critical area of medical research in which the challenge lies not only in accurate identification but also in managing the complexity of high-dimensional datasets. This paper addresses that challenge by applying dimensionality reduction techniques to refine breast cancer diagnosis, with a focus on improving accuracy and interpretability. The study investigates the impact of preprocessing techniques on a high-dimensional dataset, aiming to uncover meaningful patterns for effective diagnostic models. The dataset comprises 569 observations and 30 attributes, and examination reveals a class imbalance (63% benign, 37% malignant). To address multicollinearity, we use Pearson correlation coefficients to identify and eliminate highly correlated features. The remaining features are then rescaled with min-max normalization so that they are weighted uniformly. Principal Component Analysis (PCA) is then applied for further dimensionality reduction. Scree plots and biplots show that the early principal components are effective in distinguishing benign from malignant cases. Our results demonstrate a 24% reduction in data dimensionality, affirming the efficiency of the process. These findings emphasize the pivotal role of dimensionality reduction in refining breast cancer diagnosis for more accurate, efficient, and interpretable models.
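The three preprocessing steps named in the abstract (Pearson correlation filtering, min-max normalization, and PCA) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 0.9 correlation threshold, the choice of two components, and the synthetic data are all assumptions made for illustration, since the abstract does not state them.

```python
import numpy as np


def drop_correlated(X, threshold=0.9):
    """Return indices of columns kept after a greedy Pearson-correlation filter.

    A column is dropped when its absolute correlation with any column
    already kept reaches `threshold` (an assumed value, not from the paper).
    Constant columns are not handled here; they would yield NaN correlations.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep


def min_max(X):
    """Rescale every column to the range [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span


def pca(X, n_components):
    """Project X onto its leading principal components via SVD."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)              # explained variance ratios
    scores = Xc @ Vt[:n_components].T
    return scores, explained[:n_components]


# Synthetic stand-in data (NOT the study's dataset): column 1 is almost a
# multiple of column 0, and column 4 is on a much larger scale.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 4))
X = np.column_stack([
    base[:, 0],
    2.0 * base[:, 0] + 0.01 * rng.normal(size=200),
    base[:, 1],
    base[:, 2],
    100.0 * base[:, 3],
])

kept = drop_correlated(X, threshold=0.9)    # removes the redundant column
X_scaled = min_max(X[:, kept])              # common [0, 1] range
scores, ratios = pca(X_scaled, n_components=2)
```

The explained variance ratios returned by `pca` are the quantities a scree plot displays, and the first two columns of `scores` are the coordinates typically drawn in a biplot.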
