Ray tracing is a well known technique to generate life-like images based on models of light shading, reflection, and refraction. The massive computation and memory demands of ray tracing complex scenes have long motivated researchers to use parallel processing in reducing the ray tracing time. This paper gives a study of parallel implementation of a ray tracing algorithm on a distributed memory parallel computer. The computational cost of rendering pixels and patterns of data access can not be predicted until runtime. To efficiently parallelize such an application, the issues of database partition, data management and load balancing must be addressed. In this paper, we discuss the ways of database partition and propose a dynamic data management scheme which can exploit image coherence to reduce data communication time. A global load balancing mechanism is presented to ensure a good load balance among processors during ray tracing time. The success of our implementation depends crucially on a number of parameters which are experimentally evaluated.