Reading vector data more efficiently: Assessing performance of the OGR simple feature library

Anran Yang, Zhinong Zhong, Qingren Jia, Ning Jing

doi:10.1111/tgis.12840

Abstract

Reading vector data files into in-memory data models efficiently is a crucial step in handling the ever-growing volume of geographical data. Although advanced IO solutions such as distributed file systems or Message Passing Interface (MPI) parallel IO work well in some high-end computing environments, the 20-year-old OGR simple features library is still the de facto tool for loading vector files when developing GIS algorithms in most scenarios, and it becomes inefficient as data grow larger. In this article, we analyze the bottleneck of the OGR library and find that the creation of an excessive number of small objects is the main source of slowness. We then offer advice for improving efficiency when using OGR. To further verify our findings and to provide an alternative to the OGR library in performance-sensitive scenarios, we develop a library that uses continuous memory pools to avoid small objects. Experiments show that our advice is effective and that our library can be several times faster than the OGR library for IO-intensive programs.
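The idea of replacing many small objects with a continuous memory pool can be illustrated with a minimal sketch. It is not the authors' library or the OGR API: the class `CoordinatePool` and its methods are hypothetical names chosen for illustration, and Python's standard `array` module stands in for a contiguous buffer. All vertices of all geometries live in one flat buffer, and a second buffer of offsets records where each geometry starts, so no per-point or per-geometry object is allocated.

```python
from array import array


class CoordinatePool:
    """Hypothetical pool: vertices of all line strings share one flat buffer."""

    def __init__(self):
        # x0, y0, x1, y1, ... stored contiguously as doubles
        self.coords = array("d")
        # offsets[i] is the index of the first vertex of geometry i;
        # the last entry is the total vertex count so far
        self.offsets = array("q", [0])

    def add_linestring(self, points):
        """Append the vertices of one line string and return its id."""
        for x, y in points:
            self.coords.append(x)
            self.coords.append(y)
        self.offsets.append(len(self.coords) // 2)
        return len(self.offsets) - 2

    def vertex_count(self, geom_id):
        return self.offsets[geom_id + 1] - self.offsets[geom_id]

    def vertices(self, geom_id):
        """Yield (x, y) pairs of one geometry directly from the shared buffer."""
        start, end = self.offsets[geom_id], self.offsets[geom_id + 1]
        for i in range(start, end):
            yield self.coords[2 * i], self.coords[2 * i + 1]


pool = CoordinatePool()
a = pool.add_linestring([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)])
b = pool.add_linestring([(5.0, 5.0), (6.0, 7.0)])
```

Here the two line strings occupy a single buffer of ten doubles plus three offsets, whereas an object-per-point model would allocate a separate object for each of the five vertices and each geometry.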

Full Text