In this work, a hybrid parallel Monte Carlo neutron transport simulation program has been developed using Message Passing Interface (MPI) and Compute Unified Device Architecture (CUDA) technologies. The program is intended to run on a GPU cluster, that is, a computer cluster whose nodes are equipped with programmable Graphics Processing Units (GPUs). A quite simple but very time-consuming Monte Carlo simulation has been considered in order to show that, by making use of an uncomplicated and low-cost computer architecture, it is possible to achieve great gains in computational performance. As an example, in the best case, a parallel simulation running on an 8-GPU cluster (4 multi-core PCs with 2 GPUs each) was more than 2000 times faster than the sequential program running on a single processor. Here, the physical model, the hardware and software architecture, and the results obtained in comparative experiments are described and discussed.