Abstract

An IoT (Internet of Things) based Smart Grid (SG) is a power grid integrated with a large network of smart objects and characterized by information and communication technology. The data sources of an IoT-based SG, as well as their correlations, are usually complex. Efficiently exploiting the rich information in SG data for characteristic analysis and fault prediction therefore requires indexing techniques that support complex queries over the SG dataset. As part of a popular big data platform, HBase is replacing classic relational databases for hosting huge volumes of heterogeneous data records as key-value storage. However, most existing secondary index schemes on HBase are organized and retrieved by data column rather than by query, which makes answering a complex data query inefficient. In this paper, we propose an adaptive indexing technique to speed up complex data queries on HBase for IoT-based SG big data. Our method is based on the observation that most analyses of big power grid data focus on data subsets related to specific power grid events or monitoring data rather than on the whole dataset. Theoretical analysis and experimental tests show that the proposed query-oriented secondary indexing scheme is feasible and improves query performance. For a join operation, our indexing scheme achieves speedups ranging from 6.54× to 860× over a query scheme without secondary indexing, and from 1.20× to 8.68× over a classic secondary indexing scheme implemented on HBase. Our indexing technique may serve as a useful reference for other industrial big data practices.
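The contrast between the three query schemes compared above can be illustrated with a minimal in-memory sketch. It is not the paper's implementation, and the abstract gives no design details: the plain dictionary standing in for an HBase table, the sample records, and the names `full_scan`, `build_column_index`, and `QueryIndex` are all hypothetical. The sketch only assumes that a query-oriented index stores the row keys that answer a given query, while a column-oriented index maps each column value to row keys.

```python
from collections import defaultdict

# Toy stand-in for an HBase table: row key -> {column: value}.
# Records and column names are hypothetical, for illustration only.
table = {
    "r1": {"device": "meter-1", "event": "overvoltage", "value": 251},
    "r2": {"device": "meter-2", "event": "normal", "value": 229},
    "r3": {"device": "meter-1", "event": "normal", "value": 230},
    "r4": {"device": "meter-3", "event": "overvoltage", "value": 255},
}


def full_scan(predicate):
    """No secondary index: test every row against the predicate."""
    return sorted(key for key, row in table.items() if predicate(row))


def build_column_index(column):
    """Column-oriented secondary index: column value -> set of row keys."""
    index = defaultdict(set)
    for key, row in table.items():
        index[row[column]].add(key)
    return index


class QueryIndex:
    """Query-oriented index (assumed design): query name -> row keys."""

    def __init__(self):
        self._entries = {}

    def lookup(self, name, predicate):
        # Scan once on first use, then answer from the stored row keys.
        if name not in self._entries:
            self._entries[name] = full_scan(predicate)
        return self._entries[name]


def is_meter1_overvoltage(row):
    return row["device"] == "meter-1" and row["event"] == "overvoltage"


# Scheme 1: scan the whole table.
scan_result = full_scan(is_meter1_overvoltage)

# Scheme 2: one index per column, intersected at query time.
device_index = build_column_index("device")
event_index = build_column_index("event")
column_result = sorted(device_index["meter-1"] & event_index["overvoltage"])

# Scheme 3: one index entry per query, reused on later lookups.
query_index = QueryIndex()
query_result = query_index.lookup("meter1_overvoltage", is_meter1_overvoltage)

print(scan_result, column_result, query_result)
```

All three schemes return the same row keys. They differ in how much work each lookup repeats: the column-oriented scheme intersects two key sets on every query, while the query-oriented entry is read back directly after the first scan. The sketch ignores index maintenance when the table changes.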
