Abstract
Many commercial applications use dedicated hardware such as ASICs to obtain an efficient, low-power implementation. In the design process, software and hardware are typically designed separately, which leads to a long time-to-market: many iterations are usually needed to tune the software so that it suits hardware implementation, and to tune the hardware so that it has regular data flow, high throughput, low power, and low cost. In our work, we perform software-hardware (SW/HW) co-design so that software-level and hardware-level performance are considered jointly, in order to achieve superior performance with a short time-to-market. We further develop this approach into Algorithm and Hardware (A/H) co-design, which yields hardware-friendly algorithms and corresponding hardware designs with regular data flow, high throughput, low power, and low cost. We apply this method to four problems in video coding.

The first problem is A/H co-design for motion estimation. In video encoders, motion estimation (ME) is an important step for reducing temporal redundancy. While there are many existing hardware designs for ME based on the sum of absolute differences (SAD), there are few hardware designs for the superior rate-distortion (RD) based ME, because RD-based ME requires floating-point arithmetic and has highly irregular data flow, making it impossible to parallelize. We propose a method involving software and hardware innovations that, when combined, achieve close-to-optimal RD performance, regular data flow, high data throughput, and high data re-use.

The second problem is A/H co-design for video filtering, which is needed in video coding and other applications to achieve various objectives. While there are many existing hardware designs for dedicated filters, it is challenging to design a reconfigurable hardware architecture for a general filter with user-selectable filter shapes and sizes; supporting all of them is close to impossible for any single hardware design. In our work, we assume that small modifications of the filter coefficients are acceptable and perform SW/HW co-design on the filter coefficients. Our solution involves software innovations that achieve general and reconfigurable filtering at the cost of a slight performance degradation, and it can be implemented by our proposed hardware, which has regular data flow, high data throughput, and high data re-use.

The third problem is A/H co-design for filtering in video coding. We first propose a reconfigurable, hardware-friendly re-design scheme that re-designs various kinds of filters so that they can be implemented easily in hardware while keeping performance similar to that of the original filters. Based on this scheme, we propose a low-complexity hardware architecture that can implement various filtering methods and achieves a very high, if not the highest, data re-use ratio. The design is implemented in TSMC 0.18 µm CMOS technology and costs 73k gates. At a clock frequency of 63 MHz, the architecture allows real-time processing of 1920 × 1080 (1080p) video at 30 fps. The maximum frequency of the proposed architecture is around 200 MHz.

The fourth problem is A/H co-design for the transform process in video coding. We propose a reconfigurable, hardware-friendly re-design method for the transform, based on a minimized difference function, which greatly reduces the number of operations and makes the data flow regular. The difference criterion proposed in this thesis can also be helpful for other matrix-related, hardware-oriented designs.
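The real-time figure quoted for the filtering architecture (1080p at 30 fps with a 63 MHz clock) can be sanity-checked with simple arithmetic. The sketch below is only an illustration: it assumes one sample per pixel and ignores blanking intervals and chroma planes, neither of which the abstract specifies, and the function names are hypothetical.

```python
# Rough throughput check for the figures quoted in the abstract.
# Assumptions (not stated in the source): one sample per pixel,
# no blanking intervals, no separate chroma planes.

WIDTH, HEIGHT, FPS = 1920, 1080, 30   # 1080p at 30 fps, from the abstract
CLOCK_HZ = 63_000_000                 # reported operating clock
MAX_CLOCK_HZ = 200_000_000            # reported approximate maximum clock


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels that must be processed per second for real-time video."""
    return width * height * fps


def cycles_per_pixel(clock_hz: int, width: int, height: int, fps: int) -> float:
    """Clock cycles available per pixel at the given clock frequency."""
    return clock_hz / pixel_rate(width, height, fps)


rate = pixel_rate(WIDTH, HEIGHT, FPS)
budget = cycles_per_pixel(CLOCK_HZ, WIDTH, HEIGHT, FPS)
headroom = MAX_CLOCK_HZ / rate

print(f"required pixel rate: {rate} pixels/s")
print(f"cycle budget at 63 MHz: {budget:.3f} cycles/pixel")
print(f"cycle budget at 200 MHz: {headroom:.3f} cycles/pixel")
```

Under these assumptions the required rate is 62,208,000 pixels per second, so a 63 MHz clock leaves a budget of roughly one cycle per pixel. That is consistent with a design that sustains about one output pixel per clock, although the abstract does not state the per-cycle throughput explicitly.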