Abstract
The recent great technological advance has led to a broad proliferation of Monitoring Infrastructures, which typically keep track of specific assets along time, ranging from factory machinery, device location, or even people. Gathering this data has become crucial for a wide number of applications, like exploration dashboards or Machine Learning techniques, such as Anomaly Detection. Time-Series Databases, designed to handle these data, grew in popularity, becoming the fastest-growing database type from 2019. In consequence, keeping track and mastering those rapidly evolving technologies became increasingly difficult. This paper introduces the holistic design approach followed for building NagareDB, a Time-Series database built on top of MongoDB—the most popular NoSQL Database, typically discouraged in the Time-Series scenario. The goal of NagareDB is to ease the access to three of the essential resources needed to building time-dependent systems: Hardware, since it is able to work in commodity machines; Software, as it is built on top of an open-source solution; and Expert Personnel, as its foundation database is considered the most popular NoSQL DB, lowering its learning curve. Concretely, NagareDB is able to outperform MongoDB recommended implementation up to 4.7 times, when retrieving data, while also offering a stream-ingestion up to 35% faster than InfluxDB, the most popular Time-Series database. Moreover, by relaxing some requirements, NagareDB is able to reduce the disk space usage up to 40%.
Highlights
The great progress in the technological field has led to a dramatic increase in deployed monitoring devices
We demonstrate and benchmark the novel approach followed for implementing NagareDB, a resource-efficient Time-Series database built on top of MongoDB—a database typically discouraged in the Time-Series scenario [7,8,9]
In order to address this problem, and to lower the barriers to building Monitoring Infrastructures, we introduced the novel approach followed to create NagareDB, a resource-compromised and efficient Time-series database built on top of MongoDB, the most popular NoSQL open-source database
Summary
The great progress in the technological field has led to a dramatic increase in deployed monitoring devices Those devices, commonly called sensors, are employed in a broad number of scenarios, ranging from traditional factories and commercial malls to the largest experiment on Earth [1]. GDB are designed to be independent with regard to the nature of the data to be handled Thanks to this Swiss army knife behavior, GDB are typically the most popular DBMS, which makes it easier to find expert personnel in their usage [4]. This flexibility is gained at the expense of efficiency, since GDB are not tailored to benefit from the specifics of any particular scenario [11]. System performance is limited, and strongly attached to the design decisions made by the database engineers, while fitting the particular scenario into the GDB
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have