Abstract

This paper develops a methodology for enhancing regression models using cluster-based sampling techniques (CST) to achieve high predictive accuracy; the methodology can also be used to handle large datasets. Hard clustering (KMeans) or soft clustering (Fuzzy C-Means) is used to generate samples, called clusters, which in turn are used to build Local Regression Models (LRMs) for the given dataset. These LRMs are then used to create a Global Regression Model. This methodology is called the Enhanced Regression Model (ERM). The performance of the proposed approach is tested on 5 different datasets. The experimental results show that the proposed methodology yields better predictive accuracy than the non-hybrid MLR model, and that Fuzzy C-Means performs better than KMeans clustering for sample selection. Thus, ERM has the potential to handle data with uncertainty and complex patterns and to produce high prediction accuracy.
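The cluster-then-regress idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only the hard-clustering (KMeans) variant, uses ordinary least squares for each local model, and assumes that the global model simply routes each input to the local model of its nearest centroid. The abstract does not state how the LRMs are combined, so that routing rule, along with the function names `kmeans`, `fit_local_models`, and `predict`, is a hypothetical choice made for illustration.

```python
import numpy as np


def _distances(X, centroids):
    # Euclidean distance from every row of X to every centroid: shape (n, k).
    return np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = _distances(X, centroids).argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    labels = _distances(X, centroids).argmin(axis=1)
    return centroids, labels


def fit_local_models(X, y, labels, k):
    """Fit one least-squares model per cluster (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    # Assumption: an empty cluster falls back to the fit on all data.
    fallback, *_ = np.linalg.lstsq(A, y, rcond=None)
    models = []
    for j in range(k):
        mask = labels == j
        if not np.any(mask):
            models.append(fallback)
            continue
        coef, *_ = np.linalg.lstsq(A[mask], y[mask], rcond=None)
        models.append(coef)
    return models


def predict(X, centroids, models):
    """Assumed combination rule: use the local model of the nearest centroid."""
    labels = _distances(X, centroids).argmin(axis=1)
    A = np.column_stack([np.ones(len(X)), X])
    W = np.array(models)[labels]          # one coefficient row per sample
    return np.einsum("ij,ij->i", A, W)    # row-wise dot product
```

A typical call sequence would be `centroids, labels = kmeans(X, k)`, then `models = fit_local_models(X, y, labels, k)`, then `predict(X_new, centroids, models)`, where `X` is an `(n, p)` array and `y` an `(n,)` array. A soft-clustering variant would instead weight the local predictions by Fuzzy C-Means membership degrees; that is not shown here.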
