Abstract
This paper addresses the problem of efficiently mapping an H.264 decoder onto an embedded quad-core platform. For this purpose, a new partitioning method called `hybrid partitioning' is proposed. Partitioning is a central issue when mapping application software onto multi-core systems. For H.264 video decoders, functional partitioning and data partitioning have been proposed and are commonly used. Hybrid partitioning combines the two methods: each module is partitioned by either functional partitioning or data partitioning, depending on the module's characteristics. Compared with purely functional or purely data partitioning, hybrid partitioning balances load between cores as well as data partitioning does, and is as efficient as functional partitioning in terms of memory requirements. Hybrid partitioning is also free from the macroblock-level dependency problem that data partitioning usually faces in video decoding. In experiments, hybrid partitioning reduces waiting overhead by 86.0% compared with functional partitioning. Regarding memory usage, hybrid partitioning requires 51.2% less VLIW (Very Long Instruction Word) program memory and 62.0% less CGRA (Coarse-Grained Reconfigurable Array) program memory than data partitioning. As for SDRAM (Synchronous Dynamic Random-Access Memory) bandwidth, hybrid partitioning saves 38.6 MHz of bandwidth compared with data partitioning, which is 11.6% of the total bandwidth budget of the 333 MHz SDRAM used in the experiments. A parallelized decoder using hybrid partitioning on an embedded quad-core system runs 3.5 times faster than the same decoder on a single core.
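The two ratios quoted above can be checked with simple arithmetic. The sketch below assumes that the bandwidth share is the saved bandwidth divided by the 333 MHz budget. It also introduces parallel efficiency $E$ (speedup $S$ divided by core count $N$), a standard measure that the abstract itself does not state.

```latex
\[
\frac{38.6\ \text{MHz}}{333\ \text{MHz}} \approx 0.116 = 11.6\%,
\qquad
E = \frac{S}{N} = \frac{3.5}{4} = 0.875
\]
```

Under that definition, the reported 3.5 times speedup on four cores corresponds to a parallel efficiency of about 87.5%.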