Abstract
As the volume of medical data grows, diagnostic systems increasingly depend on accurate and relevant data to classify effectively and to generalize. This study introduces a novel gradient-based sample selection method, the first of its kind in the literature, designed to improve classification accuracy by removing redundant and non-informative samples. Unlike traditional methods that focus solely on feature selection, this approach also optimizes the input data through sample selection, leading to more accurate and efficient diagnostics. The method is validated on multiple disease datasets, including the Wisconsin Diagnostic Breast Cancer (WDBC) dataset and the Cleveland Coronary Artery Disease (CAD) dataset, demonstrating its broad applicability and effectiveness. To address dataset imbalance, the Adaptive Synthetic Sampling (ADASYN) method is employed, followed by Particle Swarm Optimization (PSO) for feature selection. The refined datasets are then classified using a Support Vector Machine (SVM), showing that even a traditional classifier can achieve substantial improvements when combined with sample selection. The results underscore the importance of precise sample selection for classification performance in computer-aided diagnostics and point toward future work on large and complex medical datasets.
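The abstract names ADASYN as the imbalance-handling step without describing it. The following is a minimal plain-Python sketch of the general ADASYN idea, not the authors' implementation: minority samples surrounded by more majority-class neighbours receive more synthetic samples, each generated by interpolating toward a nearby minority sample. The function name `adasyn_sketch`, its parameters `k`, `beta`, and `seed`, and the toy data are all hypothetical and chosen for illustration only. Drawing interpolation partners from the nearest minority neighbours is a simplifying assumption.

```python
import math
import random


def adasyn_sketch(X, y, minority=1, k=3, beta=1.0, seed=0):
    """Return synthetic minority samples (simplified ADASYN, illustration only)."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_major = len(y) - len(min_idx)
    # Total number of synthetic samples; beta=1.0 aims for full balance.
    total_needed = int((n_major - len(min_idx)) * beta)
    if total_needed <= 0 or len(min_idx) < 2:
        return []

    # r_i: share of majority-class points among the k nearest neighbours
    # of each minority sample (neighbours searched in the whole dataset).
    ratios = []
    for i in min_idx:
        others = [j for j in range(len(X)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]
        ratios.append(sum(y[j] != minority for j in nearest) / k)

    # Normalise the ratios into a distribution; fall back to uniform weights
    # when no minority sample has a majority-class neighbour.
    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        weights = [1.0 / len(min_idx)] * len(min_idx)
    else:
        weights = [r / ratio_sum for r in ratios]

    synthetic = []
    for i, w in zip(min_idx, weights):
        # Harder samples (larger weight) get more synthetic neighbours.
        # Rounding means the final count can differ slightly from total_needed.
        n_new = round(w * total_needed)
        mates = sorted(
            (j for j in min_idx if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )[:k]
        for _ in range(n_new):
            j = rng.choice(mates)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy data (hypothetical): six majority points (label 0), two minority (label 1).
X_toy = [[2, 2], [2, 3], [3, 2], [3, 3], [2.5, 2.5], [1.5, 1.5], [0, 0], [1, 1]]
y_toy = [0, 0, 0, 0, 0, 0, 1, 1]
synthetic = adasyn_sketch(X_toy, y_toy)
```

In the pipeline described above, the synthetic samples would be appended to the training data with the minority label before feature selection and classification. A practical workflow would more likely rely on an established library implementation than on a hand-written routine like this one.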