Abstract

This study examines how data mining can be used for sales forecasting and demand prediction in retail. Sales prediction is a crucial task that shapes the long-term success of any organization. Various techniques are available for predicting supermarket sales, such as time series algorithms, regression techniques, and association rules. This paper presents a comparative analysis of several supervised machine learning techniques (Multiple Linear Regression, Random Forest Regression, K-NN, Support Vector Machine (SVM), and Extra Tree Regression) used to build a prediction model and estimate the likely sales of 45 Walmart retail outlets in different geographical locations. Walmart is one of the foremost retail chains in the world, and the authors therefore aim to predict its sales accurately. Certain events and holidays affect sales periodically, sometimes even on a daily basis. The forecast is based on a combination of features: previous sales data, promotional events, holiday weeks, temperature, fuel price, the Consumer Price Index (CPI), and the unemployment rate in the state. The data were collected from these 45 outlets, and sales were predicted with each of the supervised techniques listed above. The contribution of this paper is to help business owners decide which approach to follow when predicting the sales of their supermarkets under different scenarios, including temperature, holidays, and fuel price. This in turn helps them decide on promotional and marketing strategies for their products.
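The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn and NumPy are available, uses synthetic data in place of the Walmart dataset, and the feature names, hyperparameters, train/test split, and the scaling step for K-NN and SVM are all assumptions introduced here. Each model is scored with R² on a held-out split, which is one common way to evaluate regressors and may differ from the metric the paper reports.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical feature names, loosely following the abstract's feature list.
FEATURES = ["holiday_week", "temperature", "fuel_price", "cpi", "unemployment"]


def make_synthetic_sales(n_rows=400, seed=0):
    """Generate made-up weekly sales data (a stand-in, NOT the Walmart data)."""
    rng = np.random.default_rng(seed)
    holiday = rng.integers(0, 2, n_rows)
    temperature = rng.uniform(20, 100, n_rows)
    fuel_price = rng.uniform(2.5, 4.5, n_rows)
    cpi = rng.uniform(120, 230, n_rows)
    unemployment = rng.uniform(4, 14, n_rows)
    X = np.column_stack([holiday, temperature, fuel_price, cpi, unemployment])
    # Arbitrary relationship chosen only so the models have something to fit.
    y = (
        1_000_000
        + 150_000 * holiday
        - 2_000 * np.abs(temperature - 65)
        - 40_000 * fuel_price
        - 15_000 * unemployment
        + rng.normal(0, 20_000, n_rows)
    )
    return X, y


def compare_models(X, y, seed=0):
    """Fit the five regressors named in the abstract; return held-out R^2 per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    models = {
        "multiple_linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
        # K-NN and SVM are distance/kernel based, so features are standardized first.
        "knn": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
        "svm": make_pipeline(StandardScaler(), SVR(kernel="rbf")),
        "extra_trees": ExtraTreesRegressor(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # R^2 on the held-out split
    return scores


if __name__ == "__main__":
    X, y = make_synthetic_sales()
    for name, r2 in sorted(compare_models(X, y).items(), key=lambda kv: -kv[1]):
        print(f"{name:28s} R^2 = {r2:.3f}")
```

Because the data here are synthetic, the ranking this sketch prints says nothing about the paper's findings; it only shows the shape of a like-for-like comparison on a shared split. With default settings and an unscaled sales target, the SVM regressor in particular tends to score poorly, so tuning would be needed before drawing any conclusion from it.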

Highlights

  • Retail is considered one of the most significant and fast-growing business domains in the data science field because of its high-volume data and abundant optimization challenges, for example ideal prices, recommendations, discounts, and stock levels, which can be addressed with different data analysis methods

  • The Extra Tree Regression model performs best on the data for all three years compared with the other supervised machine learning techniques. It predicts sales with 98% accuracy and can be relied upon for sales forecasting when the parameters considered are Fuel Price, Unemployment, Holiday and CPI

  • The experiments indicate that simple regression techniques may not be the best choice for building sales prediction models when management wants to predict sales over a shorter duration and has only a few years of historical data


Summary

Introduction

Retail is considered one of the most significant and fast-growing business domains in the data science field because of its high-volume data and abundant optimization challenges, for example ideal prices, recommendations, discounts, and stock levels, which can be addressed with different data analysis methods. Predicting the sales of commodities is quite challenging in today's dynamic and ever-changing business environment. The problem may also involve internal actions, for example promotions, discounts, and pricing, which add to its intricacy.

