The issues surrounding the formation of large data collections are not fully resolved. The amount of information in the world is constantly increasing, which creates the problem of storing it. The term "big data", coined to describe such data, covers six characteristics: volume, processing speed, variety, reliability, variability and value. Environmental characteristics are data of this type: they describe the distribution of relevant indicators across the Earth and make it possible to forecast their changes in time and space, which is important for economic management and the sustainable development of humanity. However, there is not enough information on how to organise the storage and processing of such data effectively, and further research is needed. Thus, the object of the study is the data obtained at environmental monitoring stations, and the subject of the study is the storage of those data. The purpose of the study is to develop criteria for evaluating and comparing different types of data repositories, taking into account the specific storage requirements of such data; to determine the types of information to be stored in the database; and to create an ER diagram of a particular database. The received data are classified according to the state of the environment, its location and its pollution. Because the data are obtained from an extended system of observations, they pass in stages from the place of registration through the city, regional, state and global networks to the place of storage. Accordingly, the storage of the received information must meet the following criteria: the ability to store data of various types, quick access and processing, and scalability. There are two main database models, relational and non-relational, and each has its advantages and disadvantages. 
For example, relational (SQL) data storage systems have rigid schemas that ensure reliable information storage, but they are inefficient at processing a large number of queries and do not scale well. Non-relational (NoSQL) systems store data in unstructured form, scale easily and process queries at high speed. Conclusions. The research has shown that non-relational databases are more appropriate for storing data obtained from environmental monitoring stations. A scheme for processing the data was created. The groups of parameters to be stored in the database are outlined. The main criteria for data storage were developed, allowing for more efficient data organisation. An ER diagram for the database was constructed.
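
As a rough illustration of the non-relational approach described above, the following Python sketch represents a single station reading as a schema-flexible document grouped by the three classes named in the abstract (state of the environment, location, pollution). The class name `StationReading`, every field name and all sample values are hypothetical and are not taken from the study; the sketch only shows the general shape such a record could take.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict


@dataclass
class StationReading:
    """One observation from a monitoring station (all names are hypothetical)."""
    station_id: str
    recorded_at: str                      # ISO 8601 timestamp, stored as text
    location: Dict[str, float]            # e.g. latitude and longitude
    # Free-form groups: different stations may report different parameters,
    # which is where a schema-flexible document store is convenient.
    environment_state: Dict[str, Any] = field(default_factory=dict)
    pollution: Dict[str, Any] = field(default_factory=dict)

    def to_document(self) -> str:
        """Serialise the reading to a JSON document with sorted keys."""
        return json.dumps(asdict(self), sort_keys=True)


# Illustrative values only, not measurements from the study.
reading = StationReading(
    station_id="ST-001",
    recorded_at="2024-01-01T00:00:00Z",
    location={"lat": 50.45, "lon": 30.52},
    pollution={"pm25": 12.5},
)
document = reading.to_document()
```

A document of this kind can gain new parameter groups without a schema migration, which is the flexibility the abstract attributes to NoSQL systems; a relational design would instead fix these fields as table columns in advance.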