Abstract

This paper uses a case based study – “product sales estimation” on real-time data to help us understand the applicability of linear and non-linear models in machine learning and data mining. A systematic approach has been used here to address the given problem statement of sales estimation for a particular set of products in multiple categories by applying both linear and non-linear machine learning techniques on a data set of selected features from the original data set. Feature selection is a process that reduces the dimensionality of the data set by excluding those features which contribute minimal to the prediction of the dependent variable. The next step in this process is training the model that is done using multiple techniques from linear & non-linear domains, one of the best ones in their respective areas. Data Remodeling has then been done to extract new features from the data set by changing the structure of the dataset & the performance of the models is checked again. Data Remodeling often plays a very crucial and important role in boosting classifier accuracies by changing the properties of the given dataset. We then try to explore and analyze the various reasons due to which one model performs better than the other & hence try and develop an understanding about the applicability of linear & non-linear machine learning models. The target mentioned above being our primary goal, we also aim to find the classifier with the best possible accuracy for product sales estimation in the given scenario.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call