Abstract

Big data analytics and machine learning are promising fields that play an important role in the healthcare sector. Big data analytical techniques help analyze huge volumes of data, which may be structured, semistructured, or unstructured, and extract meaningful information for effective decision-making. Machine learning techniques use trained models to make predictions on input datasets and to classify and cluster data. In this chapter, the author performs data analysis on a categorical dataset of diabetic patients using big data analytical techniques, namely MapReduce, Apache Pig, Apache Hive, and Apache Spark, and discusses their architectures. In addition to big data analytics, machine learning techniques, namely K-Nearest Neighbor (KNN), Decision Trees, and Bagged Trees, are applied to a dataset of female diabetic patients containing both categorical and numerical attributes, to make predictions based on attributes such as Age, Body Mass Index, Glucose, and Blood Pressure. The sensitivity achieved by the decision tree is 61.2%, which is higher than that of KNN and the bagged tree, whereas the specificity achieved by KNN is 89.2%, which is higher than that of the other two algorithms.
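
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how sensitivity and specificity are conventionally computed from binary predictions. It is not the chapter's code: the function name `sensitivity_specificity` and the toy labels are hypothetical, chosen only for illustration, and they do not reproduce the reported 61.2% or 89.2%.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): share of actual positives that were detected
    specificity = TN / (TN + FP): share of actual negatives correctly rejected
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against classes that are absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels (1 = diabetic, 0 = non-diabetic); NOT data from the chapter.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

A model can score well on one metric and poorly on the other, which is consistent with the abstract reporting the best sensitivity and the best specificity for different algorithms.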

