Abstract

In the big data era, high-resolution raster-based geocomputation is widely employed in geospatial studies. The algorithms used in local map algebra operations are data-intensive and require large amounts of memory and computing power. Simply employing a distributed computing framework such as Hadoop to serve such applications incurs storage and performance issues. In this paper, we present a two-level storage strategy designed specifically for MapReduce implementations of local map algebra algorithms on Hadoop. This approach implements efficient storage and manipulation of large raster data sets through three processes: (1) partitioning a raster file into square tile sets, (2) compressing and reorganizing these tile sets to prevent tile overlap across data divisions, and (3) improving MapReduce’s I/O interfaces for data exchange in the parallel computation of map algebra. Experiments with real-world datasets show that the proposed strategy can achieve high speedup and efficiency for raster-based spatial analysis applications. The results also show that the strategy scales satisfactorily as either the number of data nodes in the cluster or the raster data volume increases.
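The first two processes and a local map algebra operation can be illustrated with a minimal single-machine sketch. It is not the paper's implementation: the function names (`partition_into_tiles`, `pack_tile`, `unpack_tile`, `local_add`, `assemble`), the float64 serialization, and the use of zlib are all assumptions chosen for illustration, and the Hadoop storage layout and I/O interfaces are not modeled.

```python
import zlib
from array import array


def partition_into_tiles(raster, tile_size):
    """Split a 2D raster (list of equal-length rows) into square tiles.

    Returns a dict keyed by (tile_row, tile_col). Tiles on the right and
    bottom edges may be smaller than tile_size.
    """
    rows, cols = len(raster), len(raster[0])
    tiles = {}
    for r0 in range(0, rows, tile_size):
        for c0 in range(0, cols, tile_size):
            key = (r0 // tile_size, c0 // tile_size)
            tiles[key] = [row[c0:c0 + tile_size]
                          for row in raster[r0:r0 + tile_size]]
    return tiles


def pack_tile(tile):
    """Serialize a tile as float64 values and compress it (assumed format)."""
    flat = array("d", (v for row in tile for v in row))
    return len(tile), len(tile[0]), zlib.compress(flat.tobytes())


def unpack_tile(packed):
    """Invert pack_tile, returning the tile as a list of rows."""
    n_rows, n_cols, blob = packed
    flat = array("d")
    flat.frombytes(zlib.decompress(blob))
    return [list(flat[i * n_cols:(i + 1) * n_cols]) for i in range(n_rows)]


def local_add(tile_a, tile_b):
    """Local map algebra: each output cell depends only on the same cell."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(tile_a, tile_b)]


def assemble(tiles, tile_size, rows, cols):
    """Write tiles back into a full raster of the given shape."""
    out = [[0.0] * cols for _ in range(rows)]
    for (tr, tc), tile in tiles.items():
        for i, row in enumerate(tile):
            c0 = tc * tile_size
            out[tr * tile_size + i][c0:c0 + len(row)] = row
    return out


# Usage: add two small rasters tile by tile, as a map task might per record.
rows, cols, tile_size = 5, 7, 3
raster_a = [[float(r * cols + c) for c in range(cols)] for r in range(rows)]
raster_b = [[1.0] * cols for _ in range(rows)]

packed_a = {k: pack_tile(t)
            for k, t in partition_into_tiles(raster_a, tile_size).items()}
packed_b = {k: pack_tile(t)
            for k, t in partition_into_tiles(raster_b, tile_size).items()}

result_tiles = {k: local_add(unpack_tile(packed_a[k]), unpack_tile(packed_b[k]))
                for k in packed_a}
result = assemble(result_tiles, tile_size, rows, cols)
```

Because a local operation reads only the matching cell of each input, every tile can be processed independently, which is what makes the tile a natural unit of work for a map task. In this sketch each packed tile stands in for one record; how tiles are grouped into data divisions on HDFS is left out.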
