3D analytical pathology imaging examines high resolution 3D image volumes of human tissues to facilitate biomedical research and provide potential effective diagnostic assistance. Such approach - quantitative analysis of large-scale 3D pathology image volumes - generates tremendous amounts of spatially derived 3D micro-anatomic objects, such as 3D blood vessels and nuclei. Spatial exploration of such massive 3D spatial data requires effective and efficient querying methods. In this paper, we present a scalable and efficient 3D spatial query system for querying massive 3D spatial data based on MapReduce. The system provides an on-demand spatial querying engine which can be executed with as many instances as needed on MapReduce at runtime. Our system supports multiple types of spatial queries on MapReduce through 3D spatial data partitioning, customizable 3D spatial query engine, and implicit parallel spatial query execution. We utilize multi-level spatial indexing to achieve efficient query processing, including global partition indexing for data retrieval and on-demand local spatial indexing for spatial query processing. We evaluate our system with two representative queries: 3D spatial joins and 3D k-nearest neighbor query. Our experiments demonstrate that our system scales to large number of computing nodes, and efficiently handles data-intensive 3D spatial queries that are challenging in analytical pathology imaging.
Read full abstract