Data stream clustering for low-cost machines

Christophe Cérin, Keiji Kimura, Mamadou Sow

doi:10.1016/j.jpdc.2022.04.009

Abstract

The operations performed by Internet of Things (IoT) systems are no longer trivial, since they rely on more sophisticated devices than in the past. An IoT system is physically composed of connected computing, digital, and mechanical devices such as sensors or actuators. Most of the time, each of them incorporates an arithmetic logic unit that can pre-compute or compute on the device. To extract value from the data produced at the edge, the processing power offered by cloud computing is still used. However, streaming data to the cloud has limitations: the increased communication and data transfer introduce delays and consume network bandwidth. Data clustering is one example of processing that can be executed in the cloud. In this paper, we propose a methodology for solving the data stream clustering problem at the edge. Data stream clustering is defined as the clustering of data that arrive continuously, such as telephone records, multimedia data, sensor data, and financial transactions. Since we use low-cost and low-capacity devices, the objective is, given a sequence of points, to construct a good clustering of the stream using a small amount of memory and time. To meet this objective, we propose a ‘windowing’ scheme coupled with a sampling scheme. Under our experimental conditions, the results show that the clustering solutions can be controlled, with difficulties for time-stamped data but not for random data or data with well-delimited clusters. The main advantage of our scheme is that it clusters data “on the fly”, with no prior knowledge of or assumptions about the available data. We do not assume that all the data are known before they are processed batch by batch. Our scheme also has the potential to be adapted to other classes of machine learning algorithms.
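
The abstract names a ‘windowing’ scheme coupled with a sampling scheme but gives no further detail. The Python sketch below is only one way such a combination could look, not the authors' algorithm. It assumes fixed-size windows, a uniform random sample drawn from each window, and plain k-means run on that sample. All function names (`windows`, `kmeans`, `cluster_stream`) and default parameter values are hypothetical and chosen for illustration.

```python
import random
from itertools import islice


def windows(stream, size):
    """Yield consecutive lists of at most `size` points from an iterator."""
    it = iter(stream)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def kmeans(points, k, iters=10, rng=random):
    """Plain Lloyd's k-means on a small list of equal-length tuples."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # Index of the center with the smallest squared distance to p.
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            groups[nearest].append(p)
        # Recompute each center as the mean of its group;
        # keep the old center when a group ends up empty.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def cluster_stream(stream, k, window_size=200, sample_size=50, seed=0):
    """Yield one list of k centers per window (assumed scheme, see lead-in).

    Only one window and its sample are held in memory at a time.
    """
    rng = random.Random(seed)
    for window in windows(stream, window_size):
        sample = rng.sample(window, min(sample_size, len(window)))
        if len(sample) >= k:  # skip windows too small to give k centers
            yield kmeans(sample, k, rng=rng)
```

In this sketch, memory use is bounded by `window_size` and clustering time per window by `sample_size`, which mirrors the stated goal of using a small amount of memory and time. How the paper actually sizes its windows and samples, or combines results across windows, is not stated in the abstract.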

Journal: Journal of Parallel and Distributed Computing	Publication Date: Apr 20, 2022
Citations: 3	License type: publisher-specific-oa
