Abstract

The cumulative growth of data from various sources has led to the era of Big Data. Big Data analytics creates opportunities to design competitive offer packages for customers and to provide reliable services, but the analysis must be accurate and timely for successful decision making. Various statistical methods have been developed for testing and analyzing Big Data. Traditional statistical analysis relies on sampling to generate a predictive model. To overcome this limitation, Big Data is partitioned into sub data sets, and statistical analysis is applied to each subset. Because the structure of the data sets must be studied first, we go through the steps of statistical modeling up to Exploratory Data Analysis (EDA). The dependent and independent variables are identified, and a suitable parametric model is suggested. Regression techniques are used to describe the relation between the dependent and independent variables. Here we focus on different linear regression techniques. Their performance is evaluated through simulation on experimental data sets from the UCI Machine Learning Repository, and it is seen that multivariate linear regression shows better performance in parametric modeling.
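
As a rough illustration of the partition-then-regress idea described in the abstract, the following minimal sketch splits a data set into equal subsets, fits an ordinary least squares model with several independent variables on each subset, and averages the per-subset coefficients. It is not the authors' implementation: the function name `fit_partitioned_ols`, the synthetic data, the number of partitions, and the choice to combine subsets by simple averaging are all assumptions made for this example.

```python
import numpy as np


def fit_partitioned_ols(X, y, n_parts):
    """Fit least squares on each partition and average the coefficients.

    Hypothetical helper for illustration; averaging is one simple way
    to combine per-subset fits, not necessarily the paper's method.
    """
    # Prepend an intercept column so each fit estimates [b0, b1, ..., bp].
    X1 = np.column_stack([np.ones(len(X)), X])
    coefs = []
    # Split rows into n_parts contiguous subsets and fit each separately.
    for X_part, y_part in zip(np.array_split(X1, n_parts),
                              np.array_split(y, n_parts)):
        beta, *_ = np.linalg.lstsq(X_part, y_part, rcond=None)
        coefs.append(beta)
    coefs = np.array(coefs)
    return coefs.mean(axis=0), coefs


# Synthetic data (assumed for the example, not taken from the paper).
rng = np.random.default_rng(0)
n, p = 10_000, 3
X = rng.normal(size=(n, p))
true_beta = np.array([1.0, 2.0, -0.5, 0.3])  # intercept followed by slopes
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=0.1, size=n)

avg_beta, per_part = fit_partitioned_ols(X, y, n_parts=10)
print(np.round(avg_beta, 2))
```

Comparing the rows of `per_part` against `avg_beta` gives a quick sense of how much the estimates vary from one subset to another, which is one way to check whether the subsets are large enough for stable fits.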

