The Internet of Things (IoT) generates substantial volumes of sensor data for diverse applications, such as healthcare services. This article addresses the challenge of using the limited resources of IoT-enabled sensors efficiently during data collection, transmission, and storage. Sensors covering overlapping areas transmit redundant data, which incurs additional communication and storage costs. Existing chunking schemes, namely Asymmetric Extremum (AE) and Rapid Asymmetric Maximum (RAM), combine a fixed-sized window with a variable-sized window. However, the index value that these schemes use to determine the variable window size may remain zero or very low, resulting in poor deduplication. This article resolves the issue with the proposed Controlled Cut-point Identification Algorithm (CCIA), which restricts the variable-sized window to a certain threshold. The index value that determines the threshold is always larger than half the size of the fixed window, which helps to find more duplicates. An upper-limit offset is also applied to avoid unnecessarily large windows, which would incur extensive computation costs. Extensive simulations are performed by deploying Windows Communication Foundation services in the Azure cloud. The results demonstrate the superiority of CCIA across several metrics, including chunk number, average chunk size, minimum and maximum chunk number, variable chunking size, and probability of failure in cut-point identification. Compared with its competitors, RAM and AE, CCIA performs better on key parameters: total number of chunks (6.81%, 14.17%), average number of chunks (4.39%, 18.45%), and minimum chunk size (153%, 190%). These results highlight the effectiveness of CCIA in optimizing data transmission and storage within IoT systems and its potential to improve resource utilization and reduce operational costs.
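To make the windowing idea concrete, the following is a minimal sketch of RAM-style content-defined chunking with an added upper bound on the variable-sized window. It is not the authors' CCIA: the function name `ram_style_chunks`, the parameters `window` and `max_extra`, and their default values are assumptions chosen for illustration, and the sketch does not reproduce CCIA's index-selection rule. It only shows how a fixed window followed by a capped variable window yields chunks of bounded size.

```python
def ram_style_chunks(data: bytes, window: int = 64, max_extra: int = 192) -> list[bytes]:
    """Split data into chunks using a fixed window plus a capped variable window.

    Illustrative assumption, not the published CCIA:
    - `window` is the fixed-sized window at the start of each chunk.
    - `max_extra` caps how far the variable-sized window may extend.
    """
    chunks = []
    start = 0
    n = len(data)
    while start < n:
        # Fixed window: find the maximum byte value in the first `window` bytes.
        fixed_end = min(start + window, n)
        max_val = max(data[start:fixed_end])

        # Variable window: scan forward until a byte >= max_val appears,
        # but never beyond the upper limit.
        limit = min(fixed_end + max_extra, n)
        cut = fixed_end
        while cut < limit and data[cut] < max_val:
            cut += 1

        # If a qualifying byte was found, include it as the last byte of the chunk.
        if cut < limit:
            cut += 1

        chunks.append(data[start:cut])
        start = cut
    return chunks


if __name__ == "__main__":
    # Deterministic sample input, hypothetical and for demonstration only.
    sample = bytes((i * 37 + 11) % 256 for i in range(5000))
    parts = ram_style_chunks(sample)
    print(len(parts), "chunks; sizes between",
          min(len(p) for p in parts), "and", max(len(p) for p in parts))
```

Under these assumptions, every chunk except possibly the last is at least `window` bytes long and no chunk exceeds `window + max_extra` bytes, which mirrors the abstract's point that bounding the variable window from both sides avoids both very small and unnecessarily large chunks.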