The analysis of high-resolution whole slide tissue images is computationally expensive, which hinders the effective use of pathology imaging data in research. We propose runtime solutions that enable efficient execution of pathology image analysis applications on modern distributed-memory hybrid platforms equipped with both CPUs and GPUs. Hybrid systems offer significant computing capacity, but exploiting it is complex. An application developer may have to implement multiple versions of the data processing code, each targeting a different computing device. The developer must also distribute the computational load efficiently among the nodes of a distributed-memory machine and among the computing devices within a node. This is particularly difficult in the analysis of high-resolution images because the cost of processing different image regions is irregular. To address these problems, we have integrated a high-level image processing language (Halide) into our runtime system, Region Templates (RT). The language simplifies application development while generating code for multiple devices, such as CPUs and GPUs. The integration with RT enables efficient multi-node hybrid execution. We also developed a novel cost-aware data partitioning (CADP) strategy that accounts for workload irregularity to minimize load imbalance. Our experimental evaluation shows significant performance improvements on hybrid CPU-GPU machines compared with using a single processor (CPU or GPU), as well as on multi-GPU systems. CADP achieved 1.7× better performance than other workload partitioning approaches (e.g., KD-Trees) on a hybrid machine and was up to 2.24× faster in multi-node settings.
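The abstract does not spell out how CADP works, so the following is only a minimal sketch of the general idea of cost-aware partitioning, not the authors' algorithm. It assumes a hypothetical per-tile cost map (for example, an estimate of the tissue content in each tile) and repeatedly splits the most expensive region at the cut that best balances the estimated cost of the two halves, rather than their area. All names (`region_cost`, `split_balanced`, `partition`) are illustrative.

```python
from typing import List, Optional, Tuple

# A region is (row0, col0, row1, col1) with end-exclusive bounds.
Rect = Tuple[int, int, int, int]
CostMap = List[List[float]]


def region_cost(cost: CostMap, rect: Rect) -> float:
    """Sum the estimated per-tile cost inside a region (hypothetical cost model)."""
    r0, c0, r1, c1 = rect
    return sum(cost[r][c] for r in range(r0, r1) for c in range(c0, c1))


def split_balanced(cost: CostMap, rect: Rect) -> Tuple[Rect, Rect]:
    """Cut a region along its longer side where the two halves' costs are closest."""
    r0, c0, r1, c1 = rect
    by_rows = (r1 - r0) >= (c1 - c0)
    lo, hi = (r0, r1) if by_rows else (c0, c1)
    best: Optional[Tuple[float, Rect, Rect]] = None
    for cut in range(lo + 1, hi):
        if by_rows:
            a, b = (r0, c0, cut, c1), (cut, c0, r1, c1)
        else:
            a, b = (r0, c0, r1, cut), (r0, cut, r1, c1)
        gap = abs(region_cost(cost, a) - region_cost(cost, b))
        if best is None or gap < best[0]:
            best = (gap, a, b)
    assert best is not None, "region must contain more than one tile"
    return best[1], best[2]


def partition(cost: CostMap, n_parts: int) -> List[Rect]:
    """Greedily split the most expensive region until n_parts regions exist."""
    parts: List[Rect] = [(0, 0, len(cost), len(cost[0]))]
    while len(parts) < n_parts:
        # Only regions with more than one tile can be split further.
        splittable = [p for p in parts if (p[2] - p[0]) * (p[3] - p[1]) > 1]
        if not splittable:
            break
        target = max(splittable, key=lambda p: region_cost(cost, p))
        parts.remove(target)
        parts.extend(split_balanced(cost, target))
    return parts
```

On a cost map whose expensive tiles are clustered in one corner, this kind of split yields regions of unequal area but more similar estimated cost than an equal-area grid would, which is the property a cost-aware strategy is after. A real system would also need a cost model calibrated per device (CPU versus GPU), which this sketch omits.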