Abstract
Video algorithms (e.g. H.264, MPEG2/4) require a tremendous amount of computation power and data bandwidth. The complexity depends on whether the codec is encoding or decoding, and on the video standard, resolution, frame rate and visual quality constraints. Many video architectures therefore use multiple processing elements (e.g. multiple DSPs or MCUs, a DSP/MCU with dedicated accelerators, or an FPGA) to meet the high computation requirements of video algorithms. These architectures pose new challenges for video software, which is typically designed to run on a single processor. This paper presents a software design for a video architecture that uses parallel processing elements. It explains the following aspects in detail: a) software partitioning, b) algorithm-specific optimizations, c) processor-specific optimizations, d) efficient DMA/cache usage, and e) concurrent scheduling of all parallel processing elements. The approach is illustrated with an MPEG4 encoder on the TMS320DM6446, a device in the DaVinci™ family from Texas Instruments Ltd. The software architecture scales to various video standards (e.g. H.264, MPEG2/4) as well as to various parallel processing hardware solutions. The software achieves D1@30 fps on the given device at less than 50% DSP load.