Computational modeling and simulation of cellular blood flow are highly desirable for understanding blood microcirculation and blood-related diseases such as thrombosis and tumors, but they remain challenging, primarily because blood in microvessels must be described as a dense suspension of different types of deformable cells. The present work develops a particle-based, GPU-accelerated numerical method that can quickly simulate the various behaviors of deformable cells in arbitrarily complex three-dimensional geometries. We employ a two-fluid model to describe blood flow, incorporating the deformation and aggregation of cells. Smoothed dissipative particle dynamics is used to solve the two-fluid model, a discrete microstructure model describes cell deformation, and a Morse potential model describes cell aggregation. A heterogeneous CPU-GPU environment is established in which each GPU thread is dedicated to one particle and the CPU is mainly responsible for loading and exporting data. Five test cases, covering pure fluid, cell deformation, cell aggregation, cell suspension, and cellular flow in a complex network, are validated against analytical theory, experimental data, and previous numerical results. The results show that the methodology accurately predicts various behaviors of cells and that the GPU is well suited to particle-based modeling. For cellular blood flow in particular, where calculating cellular forces is compute-intensive and time-consuming, the GPU's parallel capabilities significantly enhance simulation efficiency. The GPU implementation is about 3.5 times faster than CPU parallelization with 96 cores for the pure fluid, and the speedup reaches nearly 20 times when cells are included in the simulations. In particular, the calculations of deformation and aggregation forces achieve speedups of up to 120 and 640 times, respectively, compared with their serial counterparts. The present methodology effectively integrates various behaviors of cells and has the potential to simulate very large microvascular networks at the organ level.
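
To illustrate the kind of pairwise interaction a Morse potential model implies for cell aggregation, the following minimal sketch evaluates the standard textbook form U(r) = D_e [exp(2β(r0 − r)) − 2 exp(β(r0 − r))] and the corresponding radial force f(r) = −dU/dr. The function names and the parameter values (`D_E`, `BETA`, `R0`) are hypothetical and chosen only for illustration; they are not taken from the present work, whose exact formulation and cutoff treatment may differ.

```python
import math

# Illustrative (hypothetical) parameters; NOT values from the paper.
D_E = 1.0   # depth of the potential well
BETA = 2.0  # scaling factor controlling the interaction range
R0 = 1.0    # zero-force (equilibrium) separation


def morse_energy(r, d_e=D_E, beta=BETA, r0=R0):
    """Morse potential energy at separation r (standard textbook form)."""
    x = math.exp(beta * (r0 - r))
    return d_e * (x * x - 2.0 * x)


def morse_force(r, d_e=D_E, beta=BETA, r0=R0):
    """Radial force f = -dU/dr; positive is repulsive, negative is attractive."""
    x = math.exp(beta * (r0 - r))
    return 2.0 * d_e * beta * (x * x - x)
```

The force vanishes at r = r0, is repulsive at shorter separations, and is attractive at longer ones. In a one-thread-per-particle GPU layout such as the one described above, each thread would presumably evaluate such a pairwise force over its own particle's neighbors, which is why this term parallelizes well.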