Abstract

The massive onsite data produced by the Internet of Things (IoT) can yield valuable information and immense potential, empowering a new wave of emerging applications. However, with the rapid increase of onsite IoT data streams, it has become extremely challenging to develop a scalable computing platform and provide a comprehensive workflow for processing IoT data streams with low latency and more intelligence. To this end, we present a Kubernetes-based scalable fog computing platform (KFIML) that integrates big data stream processing with machine learning (ML)-based applications. We also provide a comprehensive IoT data processing workflow, covering data access and transfer, big data processing, online ML, long-term storage, and monitoring. The platform is validated on a clustered testbed comprising a master node, IoT broker servers, worker nodes, and a local database server. By leveraging the lightweight orchestration system Kubernetes, we can readily scale and manage the containerized software frameworks on our testbed. The big data processing layer utilizes advanced dataflow frameworks such as Apache Flink to support both stream processing and statistical analysis with low latency. In addition, dedicated long short-term memory (LSTM)-based ML pipelines are employed in the online ML layer to enable real-time predictive analysis of IoT data streams. Experiments on a real-world smart grid use case demonstrate that the container-based KFIML platform scales well with Kubernetes to efficiently process increasing onsite IoT data streams with low latency and to conduct ML-based applications.
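
As a concrete illustration of the online ML layer mentioned above, the following is a minimal sketch of an LSTM-based prediction step applied to a window of IoT readings. It assumes TensorFlow/Keras; the window size, feature count, and model layout are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal sketch of an online LSTM-based prediction step over an IoT data
# stream, in the spirit of the abstract's online ML layer. All names
# (window size, feature count, model layout) are illustrative assumptions.
import numpy as np
import tensorflow as tf

WINDOW = 60      # number of past measurements fed to the model (assumed)
FEATURES = 1     # e.g. a single smart-meter load reading (assumed)

def build_lstm_model():
    """A small LSTM regressor mapping a window of readings to the next value."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(WINDOW, FEATURES)),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

def predict_next(model, recent_readings):
    """Predict the next reading from the most recent WINDOW measurements."""
    window = np.asarray(recent_readings[-WINDOW:], dtype="float32")
    window = window.reshape(1, WINDOW, FEATURES)   # batch of one window
    return float(model.predict(window, verbose=0)[0, 0])

if __name__ == "__main__":
    model = build_lstm_model()
    # Stand-in for readings arriving from an IoT broker (synthetic data).
    stream = np.sin(np.linspace(0, 20, 500)).tolist()
    print("next-value forecast:", predict_next(model, stream))
```

In the platform described by the abstract, such a prediction step would be containerized and scheduled by Kubernetes alongside the stream-processing jobs, rather than run as a standalone script.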
