Apache Pig Research Articles

Background: An enormous amount of data is generated every minute across different domains; this is referred to as Big Data. Over the last few years, the volume of data has grown daily across the world. This research focuses on a year-wise analysis of crime rates in five different states, with all analysis done using Apache Pig. Methods: The goal of the work is to analyze large-scale crime data and estimate the number of crimes occurring in various states. This is done in the Apache Pig environment using Pig Latin as the language. A short script is written in Pig Latin to load the data and process it in the MapReduce environment; the results are then obtained along with the minimum and maximum mapper and reducer timings. Results: The data is visualized in graphs to analyze the variation of crime rates across states. In the analysis of crimes against women, murder cases are very high in 2006-2010 compared with other year groups, whereas kidnapping and rape cases increased continuously from 2001 to 2014. Similarly, the reports on the other crime rates are visualized using graphs. Conclusion: Various results are obtained with different queries, and everything is represented graphically for better understanding and comparison. This helps identify which state is affected by which crime. The speed of Apache Pig is also evident, as this very large crime dataset was processed in a short time with precision.
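
The load, group, and aggregate flow described in the Methods can be sketched in plain Python. This is only an illustration of the idea behind Pig Latin's LOAD, GROUP ... BY, and SUM operators, not the authors' script: the column names (state, year, crime, cases), the helper names, and the sample rows are all hypothetical, and the rows are made-up toy values rather than data from the study.

```python
import csv
import io
from collections import defaultdict

# Made-up toy rows for illustration only; NOT data from the study.
# The column names are assumptions, not the paper's schema.
SAMPLE_CSV = """state,year,crime,cases
StateA,2001,murder,10
StateA,2002,murder,12
StateB,2001,murder,7
StateA,2001,kidnapping,3
StateB,2002,kidnapping,5
"""


def load_records(text):
    """Parse CSV text into typed rows (roughly what LOAD ... AS (schema) does)."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "state": row["state"],
            "year": int(row["year"]),
            "crime": row["crime"],
            "cases": int(row["cases"]),
        })
    return rows


def total_cases_by(records, *keys):
    """Group rows by the given keys and sum 'cases' (GROUP ... BY plus SUM)."""
    totals = defaultdict(int)
    for rec in records:
        group_key = tuple(rec[k] for k in keys)
        totals[group_key] += rec["cases"]
    return dict(totals)


records = load_records(SAMPLE_CSV)
by_state_crime = total_cases_by(records, "state", "crime")
by_state = total_cases_by(records, "state")

for key, total in sorted(by_state_crime.items()):
    print(key, total)
```

In Pig the same grouping would be compiled into MapReduce jobs and run across the cluster; the sketch runs in a single process and says nothing about mapper or reducer timings.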

People's day-to-day lives depend not only on what they think but are also influenced by what others think. Advertisements and campaigns featuring favourite celebrities and charismatic personalities influence the way people think and see the world. People get news and information faster than ever before. Textual data on the internet is growing very quickly. People express themselves in various ways on the web every minute, using various platforms to share their views and opinions, and a huge amount of data is generated in the process. Twitter is one of the most important and well-known social media platforms of the present time, and millions of tweets are posted on it every day. These tweets are a source of important information that can be used by businesses and small industries, in creating government policies, and in various studies. This paper focuses on the location from which tweets are posted and the language in which they are written. These details can be analysed effectively using Hadoop, a tool used to analyze distributed big data, streaming data, timestamp data and text data. With the help of Apache Flume, tweets can be collected from Twitter and then sunk into HDFS (Hadoop Distributed File System). The raw data is then analyzed using Apache Pig, and the resulting information can be used for social and commercial purposes. The results are visualized using Apache Zeppelin.
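
The analysis step of this pipeline, counting tweets by language and by location, can be sketched in plain Python. This is a single-process illustration under stated assumptions, not the paper's Pig script: it assumes the collected tweets are stored one JSON object per line, with a top-level "lang" field and a "user" object carrying a "location" field. The function name and the sample tweets are hypothetical and made up for the example.

```python
import json
from collections import Counter

# Made-up sample tweets, one JSON object per line; NOT data from the paper.
# The field names ("lang", "user", "location") are assumptions about the
# layout of the collected tweets.
SAMPLE_LINES = [
    '{"lang": "en", "user": {"location": "Delhi"}}',
    '{"lang": "en", "user": {"location": "London"}}',
    '{"lang": "hi", "user": {"location": "Delhi"}}',
    '{"lang": "en", "user": {"location": null}}',
    'not valid json',
]


def count_by_language_and_location(lines):
    """Return (language counts, location counts) for JSON-lines tweets."""
    lang_counts = Counter()
    location_counts = Counter()
    for line in lines:
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records instead of failing the run
        if not isinstance(tweet, dict):
            continue  # skip JSON values that are not tweet objects
        lang_counts[tweet.get("lang") or "unknown"] += 1
        user = tweet.get("user") or {}
        location_counts[user.get("location") or "unknown"] += 1
    return lang_counts, location_counts


lang_counts, location_counts = count_by_language_and_location(SAMPLE_LINES)
print(lang_counts.most_common())
print(location_counts.most_common())
```

Missing or empty fields are counted under "unknown" so that the totals still add up to the number of valid tweets; whether the paper handles such records this way is not stated.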

Articles published on Apache Pig

Analysis of data processing efficiency with use of Apache Hive and Apache Pig in Hadoop environment

Apache Hadoop based effective sentiment analysis on demonetization and covid-19 tweets

Performance testing in lexical analysis on latest Twitter trends for enterprise network using PIG

Hadoop Based Generic Template for Performing Sentiment Analysis Using Apache PIG

Analysis of Crime Rates of Different States in India Using Apache Pig in HDFS Environment

The research of social processes at the university using big data

Bigdata Analysis on Airline Delay and Cancellation

Equi-Depth Histogram Construction Methodology for Big Data Tools

Twitter data analysis using hadoop ecosystems and apache zeppelin

Pipeline provenance for cloud‐based big data analytics

Simulation of Performance Analysis of MongoDB, PIG, HIVE Storage, Map Reduce, Spark and Yarn

Web Server log Analysis for Unstructured data Using Apache Flume and Pig

An Overview of Apache Pig and Apache Hive

Using the Big Data in the human resources management systems

Integración de herramientas para la toma de decisiones en la congestión vehicular (Integration of tools for decision-making in traffic congestion)

Health data analytics using scalable logistic regression with stochastic gradient descent

A new architecture of Internet of Things and big data ecosystem for secured smart healthcare monitoring and alerting system

PERFORMANCE COMPARISON OF HADOOP MAPREDUCE AND APACHE PIG

An Efficient Storage and Retrieval of DICOM Objects using Big Data Technologies

Opinion Mining of Twitter Data using Hadoop and Apache Pig
