Abstract

A traditional RDBMS is not sufficient to manage highly scalable web log data; the Hadoop framework overcomes these limitations. Hadoop is a sophisticated framework for managing gigantic volumes of scalable data. It includes MapReduce, a programming model for writing applications that process enormous volumes of data in parallel, reliably, on large clusters of commodity hardware. Weblog data must be processed in a distributed environment because of its huge volume and because it is generated as online streaming data. In this work, the Hadoop framework with Pig scripting is used to extract dynamic patterns from weblog data in a distributed environment. Diverse situations in the data set are identified using HTTP status codes, and the frequency of each status code is analyzed. Managing a huge volume of data with distributed processing considerably reduces execution time.
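The status-code frequency analysis the abstract describes can be illustrated with a minimal local sketch. In the paper's setup this aggregation would be expressed as a Pig script running on a Hadoop cluster; the Python analogue below (the log format, sample lines, and function name are assumptions, not taken from the paper) shows the same counting logic on Apache Common Log Format records:

```python
import re
from collections import Counter

# Assumption: weblog entries follow the Apache Common Log Format.
# The regex captures the client IP, the three-digit HTTP status code,
# and the response size from each log line.
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) (\S+)'
)

def status_code_frequency(log_lines):
    """Count how often each HTTP status code appears in the log."""
    counts = Counter()
    for line in log_lines:
        m = LOG_PATTERN.match(line)
        if m:
            counts[m.group(2)] += 1  # group 2 is the status code
    return counts

# Hypothetical sample log lines for illustration only.
sample = [
    '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2023:13:55:40 +0000] "GET /missing HTTP/1.1" 404 209',
    '127.0.0.1 - - [10/Oct/2023:13:55:41 +0000] "GET /index.html HTTP/1.1" 200 2326',
]
print(status_code_frequency(sample))  # → Counter({'200': 2, '404': 1})
```

On a real cluster the same per-line extraction would run as the map step and the counting as the reduce step, which is what allows the work to scale across many machines.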
