The article presents methods for caching and displaying data from spectral satellite images using libraries from the Apache Hadoop ecosystem of distributed computing systems, together with GeoServer extensions. The authors give a brief overview of existing tools that can present remote sensing data using distributed information technologies, and carry out a comparative analysis of tools such as GeoMesa and GeoWave. A distinctive feature of the approach is that remote sensing data is stored in Apache Parquet files and converted from them for display. This makes it possible to interact with the distributed file system through the Kite SDK libraries and to connect additional Apache Hadoop based data processors as external services. The following steps are described: extracting data from Apache Parquet via the Kite SDK, converting this data to a GDAL Dataset, iterating over the resulting data, and saving it to the file system in BIL format. In this article, the BIL format is used for the GeoServer cache. The extension was implemented and published on GitHub under the Apache License. The article concludes with instructions for installing and using the extension.
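To illustrate the final step of the pipeline described above, the following is a minimal Python sketch of the BIL (band interleaved by line) layout. It is not the extension's code, which is built on Java libraries (Kite SDK, GDAL, GeoServer). The function names `encode_bil` and `bil_header` are hypothetical, the sample type is assumed to be 16-bit signed little-endian integers, and the header uses the commonly seen ESRI-style `.hdr` keywords.

```python
import struct


def encode_bil(bands):
    """Encode bands as BIL bytes (assumed: 16-bit signed, little-endian).

    `bands` is a list of 2-D row-major lists of ints, all of the same shape.
    BIL stores row 0 of every band, then row 1 of every band, and so on.
    """
    if not bands:
        raise ValueError("at least one band is required")
    n_rows = len(bands[0])
    n_cols = len(bands[0][0]) if n_rows else 0
    for band in bands:
        if len(band) != n_rows or any(len(row) != n_cols for row in band):
            raise ValueError("all bands must share the same shape")

    out = bytearray()
    for r in range(n_rows):          # outer loop: image lines
        for band in bands:           # inner loop: the same line in each band
            out += struct.pack("<%dh" % n_cols, *band[r])
    return bytes(out)


def bil_header(n_rows, n_cols, n_bands):
    """Build the text of a companion .hdr file describing the raw BIL data."""
    lines = [
        "NROWS %d" % n_rows,
        "NCOLS %d" % n_cols,
        "NBANDS %d" % n_bands,
        "NBITS 16",
        "PIXELTYPE SIGNEDINT",
        "BYTEORDER I",               # I = little-endian (Intel) byte order
        "LAYOUT BIL",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two 2x2 bands standing in for data read from Parquet records.
    demo = [[[1, 2], [3, 4]], [[10, 20], [30, 40]]]
    with open("tile.bil", "wb") as f:
        f.write(encode_bil(demo))
    with open("tile.hdr", "w") as f:
        f.write(bil_header(2, 2, len(demo)))
```

Because each image line is written for all bands before moving to the next line, a reader can fetch every band of a given row with one contiguous read, which is a convenient property for a tile cache.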