Databases are an integral part of almost every application nowadays. For example, applications using Internet of Things (IoT) sensory data, such as in Industry 4.0, are a classic example of an organized storage system. Due to its enormous size, it may be stored in the cloud. This paper presents the authors’ proposition for cloudcentric sensory measurements and measurements acquisition. Then, it focuses on evaluating industrial cloud storage engines for sensory functions, experimenting with three open-source types of distributed Database Management Systems (DBMS); MongoDB and PostgreSQL, with two forms of PostgreSQL schemes (Javascript Object Notation (JSON)-based and relational), against their respective horizontal scaling strategies. Several experimental cases have been performed to measure database queries’ response time, achieved throughput, and corresponding failures. Three distinct scenarios have been thoroughly tested, the most common but widely used: (i) data insertions, (ii) select/find queries, and (iii) queries related to aggregate correlation functions. The experimental results concluded that PostgreSQL with JSON achieves a 5–57% better response than MongoDB for the insert queries (cases of native, two, and four shards implementations), while, on the contrary, MongoDB achieved 56–91% higher throughput than PostgreSQL for the same set up. Furthermore, for the data insertion experimental cases of six and eight shards, MongoDB performed 13–20% more than Postgres in response time, achieving × 2 times higher throughput. Relational PostgreSQL was × 2 times faster than MongoDB in its standalone implementation for selection queries. At the same time, MongoDB achieved 19–31% faster responses and 44–63% higher throughput than PostgreSQL in the four tested sharding subcases (two, four, six, eight shards), accordingly. Finally, the relational PostgreSQL outperformed MongoDB and PostgreSQL JSON significantly in all correlation function experiments, with performance improvements from MongoDB, closing the gap with PostgreSQL towards minimizing response time to 26% and 3% for six and eight shards, respectively, and achieving significant gains towards average achieved throughput.
Read full abstract