Abstract
Big data plays a major role in the learning, manipulation, and forecasting of information intelligence. Due to the imbalance of data delivery, the learning and retrieval of information from such large datasets can result in limited classification outcomes and wrong decisions. Traditional machine learning classifiers successfully handling the imbalanced datasets still there is inadequacy in overfitting problems, training cost, and sample hardness in classification. In order to forecast a better classification, the research work proposed the novel “Self-Boosted with Dynamic Semi-Supervised Clustering Method”. The method is initially preprocessed by constructing sample blocks using Hybrid Associated Nearest Neighbor heuristic over-sampling to replicate the minority samples and merge each copy with every sub-set of majority samples to remove the overfitting issue thus slightly reduce noise with the imbalanced data. After preprocessing the data, massive data classification requires big data space which leads to large training costs.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.