Abstract

The rapid development of Internet of Things (IoT) technology and the widespread deployment of sensors around the world have produced a large number of data streams, and current computing systems face the challenge of receiving and managing these large-scale streaming data quickly. This study builds an efficient distributed database based on Greenplum (GP) and addresses the low efficiency of structured data queries on observed ecological data collected from ecologically fragile areas in the desert oasis of Northwest China. First, a distributed database is designed and deployed at the physical storage structure level. A database table structure is then established based on the characteristics of the streaming data, and on this basis the data storage strategy is optimized at the data table level. The query efficiency of the distributed database is then compared with that of traditional standalone databases. The results show that the distributed database significantly improves data query efficiency, and the improvement grows with the amount of data stored. Finally, based on the optimized distributed database, we develop a data sharing system for streaming data from ecologically fragile areas in the desert oasis in Northwest China, which provides a new approach for the efficient sharing of massive amounts of IoT streaming data for ecological monitoring. The storage system remains in normal operation, which is important to both data managers and users.

Highlights

  • Ecosystems are important parts of the Earth’s framework and form the core of its most active biosphere

  • We introduce GP into ecological monitoring Internet of Things (IoT) applications and optimize query efficiency at the levels of the database physical structure and the database table structure, according to the characteristics of ecological monitoring streaming data

  • Li et al. designed a distributed storage system based on a massively parallel processing (MPP) architecture to overcome low concurrency and poor scalability and to accelerate queries across the entire project, thereby improving project performance


Summary

INTRODUCTION

Ecosystems are important parts of the Earth’s framework and form the core of its most active biosphere. We adopt a geoscience field (IoT ecological monitoring) as an example to study a new, open source data management architecture for massive data sharing and services in the context of earth science big data. We introduce GP into ecological monitoring IoT applications and optimize query efficiency at the levels of the database physical structure and the database table structure, according to the characteristics of ecological monitoring streaming data. This strategy solves the problem of data query bottlenecks when the number of servers is limited. Li et al. designed a distributed storage system based on a massively parallel processing (MPP) architecture to overcome low concurrency and poor scalability and to accelerate queries across the entire project, thereby improving project performance. They concluded that, with its good scalability and MPP advantage, the system can be used to solve the mass data storage problem [36]. No GP database application cases in the field of ecological monitoring IoT are available.
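
The scatter-gather idea behind an MPP database can be illustrated with a toy sketch. The Python snippet below is not Greenplum code and not the authors' implementation. The segment count, the `station_id` distribution key, the row layout, and all function names are assumptions made purely for illustration. Rows are hash-distributed across segments, each segment scans only its own share, and the partial results are merged.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_SEGMENTS = 4  # hypothetical number of segments


def segment_for(station_id: str) -> int:
    """Choose a segment by hashing the (hypothetical) distribution key."""
    return zlib.crc32(station_id.encode("utf-8")) % NUM_SEGMENTS


def distribute(rows):
    """Scatter rows across segments according to their distribution key."""
    segments = defaultdict(list)
    for row in rows:
        segments[segment_for(row["station_id"])].append(row)
    return segments


def scan_segment(rows, lo, hi):
    """Local scan on one segment: keep rows with lo <= ts < hi."""
    return [r for r in rows if lo <= r["ts"] < hi]


def parallel_query(segments, lo, hi):
    """Run the same scan on every segment concurrently, then merge."""
    with ThreadPoolExecutor(max_workers=NUM_SEGMENTS) as pool:
        parts = list(
            pool.map(lambda rows: scan_segment(rows, lo, hi), segments.values())
        )
    merged = [r for part in parts for r in part]
    # Sort so the merged result does not depend on segment order.
    return sorted(merged, key=lambda r: (r["ts"], r["station_id"]))
```

Because each segment holds only a fraction of the rows, the per-segment scan shrinks as segments are added, which is the general reason an MPP layout can help when a single node becomes the query bottleneck. A real deployment would also have to account for data skew and network transfer, which this sketch ignores.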

SYSTEM OVERVIEW
DATABASE TABLE DESIGN
CONCLUSION