Abstract

The amount of data has risen significantly over the last few years, due to the popularity of some of the data generation sources like social media, electronic health records, sensors and online shopping sites. Analyzing, processing and storing this data is very prominent since it helps to uncover hidden patterns and unknown correlations. A big data analysis and prediction System is proposed in this context, which combines weather observations, health data and social media content in order to forecast the outbreaks of infectious diseases in a locality. Finding information about the determinants of disease outbreaks are required to reduce its effects on populations. An In-mapper combiner based MapReduce algorithm is used to calculate the mean of daily measurements of various climate parameters like temperature, atmospheric pressure, relative humidity, solar and wind. The climatic parameter that may leads to the outbreak of a disease is identified by finding the correlation between the parameters and disease incidence count. To evaluate how user’s tweeting patterns and sentiments matched with the outbreak of diseases, all tweets containing keywords related to diseases are collected using twitter streaming APIs and are analyzed and processed using Spark framework. The performance of proposed model is improved due to the presence of tweet processing. This indicates that the real-time analysis of social media data can provide more effective result rather than working on the historical data.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call