Abstract

High-quality data are always desirable for superior decision-making in smart buildings. However, latency issues, communication failures, meter glitches, etc., create data anomalies. Especially, the redundant/duplicate records captured at the same time instants are critical anomalies. Two such cases are the same timestamps with the same energy consumption reading and the same timestamps with different energy consumption readings. This causes data inconsistency that deludes decision-making and analytics. Thus, such anomalies must be properly identified. So, this paper performs an enumeration of redundant data anomalies in smart building energy consumption readings using an analytical approach with 4-phases (sub-dataset extraction, quantification, visualization, and analysis). This provides the count, distribution, type, and correlation of redundancies. Smart buildings’ energy consumption dataset of Darmstadt city, Germany, was used in this study. From this study, the highest count of redundancies is observed as 5060 on 26 January 2012 with the average count of redundancies at the hour level being 211 and the minute level being 7. Similarly, the lowest count of redundancies is observed as 89 on 24 January 2012. Further, out of these 5060 redundancies, 1453 redundancies are found with the same readings and 3607 redundancies are found with different readings. Additionally, it is identified that there are only 14 min out of 1440 min on 26 January 2012 without having any redundancy. This means that almost 99% of the minutes in the day possess some kind of redundancies, where the energy consumption readings were recorded mostly with two occurrences, moderately with three occurrences, and very few with four and five occurrences. Thus, these findings help in enhancing the quality of data for better analytics.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call