Abstract
One of the main challenges in modern Internet of Things (IoT) systems is the efficient collection, routing and management of data streams from heterogeneous sources, including sources with high ingestion rates. Despite the existence of various IoT data streaming frameworks, there is still no easy way for collecting and routing IoT streams in efficient and configurable ways that are easy to be implemented and deployed in realistic environments. In this paper, we introduce a programmable engine for Distributed Data Analytics (DDA), which eases the task of collecting IoT streams from different sources and accordingly, routing them to appropriate consumers. The engine provides also the means for preprocessing and analysis of data streams, which are two of the most important tasks in Big Data analytics applications. At the heart of the engine lies a Domain Specific Language (DSL) that enables the zero-programming definition of data routing and preprocessing tasks. This DSL is outlined in the paper, along with the middleware that supports its runtime execution. As part of the paper, we present the architecture of the engine, as well as the digital models that it uses for modelling data streams in the digital world. We also discuss the validation of the DDA in several data intensive IoT use cases in industrial environments, including use cases in pilot productions lines and in several real-life manufacturing environments. The latter manifest the configurability, programmability and flexibility of the DDA engine, as well as its ability to support practical applications.
Highlights
We are currently witnessing the rise of the Internet of Things (IoT) paradigm as a result of the proliferation of the number and type of internet-connected devices
The vast majority of Industrial IoT (IIoT) use cases involve collection and processing of data from a variety of distributed data sources, including processing and analytics over data streams with very high ingestion rates
In the remaining of the paper we introduce a configurable infrastructure for distributed data analytics environments, which caters for the flexible and almost zero-programming configuration of data routing and preprocessing functions
Summary
We are currently witnessing the rise of the Internet of Things (IoT) paradigm as a result of the proliferation of the number and type of internet-connected devices. Zero-programming options functionalities like the Kafka Streams DSL are very versatile when there is a need to dynamically select data sources from the shop floor and to define queries over them They are not very effective when there is a need to define how data should be routed across many different data consumers (e.g., predictive analytics algorithms, digital twins) in industrial environments. The FAR-EDGE DDA is configurable in the scope of edge computing environments, as it defines an Edge Analytics Engine (EAE) that handles analytics within the scope of an edge node, while at the same time providing mechanisms for combining and synchronizing analytics functionalities across multiple edge nodes The latter functionalities are very common in IIoT environments, where several processes (e.g., production scheduling, supply chain management) are likely to involve multiple stations or entire plants i.e., multiple edge nodes using different streaming middleware toolkits (e.g., Kafka or Spark).
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.