Abstract
In data mining and knowledge discovery, hidden and valuable knowledge is discovered from data sources. Traditional knowledge discovery algorithms are bottlenecked by the wide range of data sources now available. Class imbalance is one of the problems that arises when a data source provides unequal class distributions, i.e., examples of one class in a training data set vastly outnumber examples of the other class(es). This paper proposes an under-sampling method that uses OPTICS, one of the best visualization clustering techniques, to handle the class imbalance problem. In the proposed approach, new data are then classified using C4.5 as the base algorithm. The method is optimized by selecting, based on visualization algorithms, the most suitable clusters of the majority data set for deletion. An experimental analysis is carried out over a wide range of highly imbalanced data sets, using the statistical tests suggested in the specialized literature. The results show that our proposal outperforms other classic and recent models in terms of area under the ROC curve, F-measure, precision, TP rate, and TN rate.
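For reference, the threshold-based measures named above can be computed from the four confusion-matrix counts. The following is a minimal sketch using the standard textbook definitions, with the minority class treated as positive; the function name `imbalance_metrics` and the example counts are illustrative assumptions and are not taken from the paper.

```python
def imbalance_metrics(tp, fn, tn, fp):
    """Compute TP rate, TN rate, precision, and F-measure from confusion counts.

    Assumption: the minority class is the positive class. Names are illustrative.
    """
    # TP rate (recall/sensitivity): fraction of positives correctly identified.
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    # TN rate (specificity): fraction of negatives correctly identified.
    tn_rate = tn / (tn + fp) if (tn + fp) else 0.0
    # Precision: fraction of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F-measure: harmonic mean of precision and TP rate.
    denom = precision + tp_rate
    f_measure = 2 * precision * tp_rate / denom if denom else 0.0
    return {
        "tp_rate": tp_rate,
        "tn_rate": tn_rate,
        "precision": precision,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical counts: 10 minority (positive) and 100 majority (negative) examples.
    for name, value in imbalance_metrics(tp=8, fn=2, tn=90, fp=10).items():
        print(f"{name}: {value:.3f}")
```

Reporting TP rate and TN rate separately matters on imbalanced data: a classifier that always predicts the majority class has a TN rate of 1 but a TP rate of 0, which these measures expose.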