Abstract

Background: Internet of Things (IoT) edge analytics makes data computation and storage available adjacent to the source of data generation in the IoT system. This approach improves sensor data handling and speeds up analysis, prediction, and action. Using machine learning for analytics and task offloading in edge servers could minimise latency and energy usage. However, one of the key challenges in using machine learning in edge analytics is finding a real-world dataset with which to implement a more representative predictive model. This challenge has slowed the adoption of machine learning methods in IoT edge analytics. Generating realistic synthetic datasets can therefore help accelerate the methodological use of machine learning in edge analytics.

Methods: We create synthetic data with features that resemble data from IoT devices. We use an existing air quality dataset that includes temperature and gas sensor measurements. This real-time dataset includes component values for the Air Quality Index (AQI) and ppm concentrations for various polluting gases. We build a JavaScript Object Notation (JSON) model that captures the distribution of variables and the structure of this real dataset, and use it to generate the synthetic data. We then build comparative predictive models on the synthetic dataset and the original dataset.

Results: Analysis of the predictive model trained on the synthetic dataset shows that the synthetic data can be used successfully for edge analytics purposes in place of real-world datasets. There is no significant difference between the real-world dataset and the synthetic dataset. The generated synthetic data requires no modification to suit edge computing requirements.

Conclusions: The framework can generate representative synthetic datasets based on JSON schema attributes. The accuracy, precision, and recall values for the real and synthetic datasets indicate that the logistic regression model is capable of successfully classifying the data.
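The generation step described under Methods, in which a JSON model records each variable's distribution and then drives sampling, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the field names (`temperature_c`, `co_ppm`, `aqi_category`), the `dist` keyword, the ranges, and the category labels and weights do not come from the paper's dataset, and the paper's actual JSON model may be structured differently.

```python
import json
import random

# Hypothetical JSON model. Field names, distributions, ranges and category
# weights are illustrative assumptions, not values from the paper's dataset.
MODEL_JSON = """
{
  "temperature_c": {"dist": "normal", "mean": 27.0, "std": 3.0,
                    "min": 10.0, "max": 45.0},
  "co_ppm": {"dist": "uniform", "min": 0.0, "max": 12.0},
  "aqi_category": {"dist": "categorical",
                   "values": {"good": 0.4, "moderate": 0.3,
                              "unhealthy": 0.2, "hazardous": 0.1}}
}
"""


def sample_field(spec, rng):
    """Draw one value for a single variable according to its spec."""
    if spec["dist"] == "normal":
        value = rng.gauss(spec["mean"], spec["std"])
        # Clamp to the declared range so values stay physically plausible.
        return min(max(value, spec["min"]), spec["max"])
    if spec["dist"] == "uniform":
        return rng.uniform(spec["min"], spec["max"])
    if spec["dist"] == "categorical":
        labels = list(spec["values"])
        weights = list(spec["values"].values())
        return rng.choices(labels, weights=weights, k=1)[0]
    raise ValueError(f"unsupported distribution: {spec['dist']}")


def generate_records(model, n, seed=0):
    """Generate n synthetic records with the same fields as the model."""
    rng = random.Random(seed)
    return [
        {name: sample_field(spec, rng) for name, spec in model.items()}
        for _ in range(n)
    ]


model = json.loads(MODEL_JSON)
records = generate_records(model, 5)
print(json.dumps(records[0], indent=2))
```

Because each variable is sampled independently, this sketch preserves only per-variable distributions and the record structure; it does not reproduce correlations between sensors, which a fuller generator would need to model.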

Highlights

  • The widespread adoption of the Internet of Things (IoT) in business and industry has resulted in significant investment in advanced applications development (Brous et al., 2020)

  • The first visible comparison is between the JavaScript Object Notation (JSON) Schema for the original dataset and the JSON Schema for the synthetic dataset, which appear to match closely in structure and variable components

  • We further validated our experimental approach by training a machine learning model to predict the four air quality categories specified in the dataset, using both the actual Air Quality Index (AQI) dataset and the synthetic dataset


Summary

Introduction

The widespread adoption of the Internet of Things (IoT) in business and industry has resulted in significant investment in advanced applications development (Brous et al., 2020). These applications focus on increasing efficiency and reducing cost while speeding up the analytic process at the receiving end. Integrating Machine Learning (ML) capabilities into edge computing (EC) has enabled sensor-based, application-specific analytics at the IoT network edge. IoT edge analytics makes data computation and storage available adjacent to the source of data generation in the IoT system. This approach improves sensor data handling and speeds up analysis, prediction, and action. One of the key challenges in using machine learning in edge analytics is finding a real-world dataset with which to implement a more representative predictive model. The accuracy, precision, and recall values for the real and synthetic datasets indicate that the logistic regression model is capable of successfully classifying the data.
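To make the three reported metrics concrete, the sketch below computes accuracy, precision, and recall for a multi-class classifier from lists of true and predicted labels. It makes several assumptions: precision and recall are macro-averaged (the source does not say which averaging was used), the four category names are hypothetical stand-ins for the dataset's air quality categories, and the toy labels are invented for illustration rather than taken from the paper's results.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision and recall."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls = [], []
    for label in labels:
        true_pos = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)  # TP + FP
        actual = sum(t == label for t in y_true)     # TP + FN
        # A class that is never predicted (or never present) scores 0.0.
        precisions.append(true_pos / predicted if predicted else 0.0)
        recalls.append(true_pos / actual if actual else 0.0)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
    }


# Toy labels with hypothetical category names, for illustration only.
y_true = ["good", "good", "moderate", "moderate",
          "unhealthy", "unhealthy", "hazardous", "hazardous"]
y_pred = ["good", "moderate", "moderate", "moderate",
          "unhealthy", "unhealthy", "hazardous", "good"]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Running the same function on predictions from a model trained on the real dataset and from one trained on the synthetic dataset would give the side-by-side comparison that the text describes.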

