The Cell Broadband Engine (BE) is a heterogeneous 9-core microprocessor that first appeared in the Sony PlayStation 3. This paper describes the parallelization of a video processing application on the Cell BE and the programming model chosen for it. We describe serial implementations that run on the Power Processor Element (PPE) only and parallel implementations that run on the PPE together with 8 Synergistic Processor Elements (SPEs). We then compare the speedups obtained with and without the DMA and thread creation overhead times. The results demonstrate that the Cell BE processor achieves a speedup of 10x on this application and scales well with the number of SPEs. For input sizes of at least 512x512, we show that the speedup becomes limited by the number of memory transfers.
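As a minimal sketch of how a speedup comparison with and without DMA and thread creation overhead could be computed, the following Python snippet divides a serial runtime by a parallel runtime that optionally includes an overhead term. The function name `speedup`, its parameters, and all timing values are hypothetical and chosen only for illustration; they are not measurements from this paper.

```python
def speedup(serial_time, compute_time, overhead_time=0.0):
    """Return serial runtime divided by parallel runtime.

    The parallel runtime is modeled as compute time plus an optional
    overhead term (e.g. DMA transfers and thread creation). This additive
    model is an assumption made for the sketch.
    """
    parallel_time = compute_time + overhead_time
    if parallel_time <= 0:
        raise ValueError("parallel time must be positive")
    return serial_time / parallel_time


# Hypothetical timings in milliseconds (illustrative, not measured data).
serial_ms = 120.0          # PPE-only run
spe_compute_ms = 20.0      # parallel compute time on the SPEs
dma_and_thread_ms = 4.0    # DMA plus thread creation overhead

without_overhead = speedup(serial_ms, spe_compute_ms)
with_overhead = speedup(serial_ms, spe_compute_ms, dma_and_thread_ms)

print(f"speedup without overhead: {without_overhead:.1f}x")
print(f"speedup with overhead:    {with_overhead:.1f}x")
```

Reporting both figures separates the speedup of the computation itself from the cost of moving data and starting threads, which matters most when the overhead is large relative to the compute time.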