Abstract

Data volumes are currently growing exponentially, and geospatial data is one of the main elements of the Big Data concept. A very large number of tools exist for analyzing Big Data, but not all of them account for the specific characteristics of geospatial data or are able to process it. This article examines three popular open-source analytical tools for working with very large volumes of geospatial data: SpatialHadoop, GeoSpark, and GeoFlink. Their architectures, advantages, and disadvantages are considered as a function of execution time and the volume of data used. Processing was also evaluated for both streaming and batch data. The experiments were carried out on raster and vector data sets: satellite imagery in the visible range, NDVI and NDWI indices, climate indicators (snow cover, precipitation intensity, surface temperature), and OpenStreetMap data for the Novosibirsk and Irkutsk Regions.
